How to pass through form filling with web scraping?

April 22, 2019, at 10:00 PM

How to get past a website's login form and see the HTML of the related pages on the site.

I am trying to get past the login form on a website and then parse the HTML page that holds my account info, but I can't get it to work. Here is my code.

const express = require('express');
const fs = require('fs'); //access to file system
const request = require('request');
const cheerio = require('cheerio');
const rp = require('request-promise');
const app = express();
let url = 'url';
request.post({
    url: 'url1',
    form: {
        email: 'email',
        password: 'password'
    }
},
function(error, response, html){
    if(error){
        console.log(error);
    }
    else{
        console.log(html);
    }
});
app.get('/scrape', function(req, res){
    requestToWork(url);
    res.send('Check your console!')
})
function requestToWork(url){
    return rp(url)
    .then(HTMLresponse=>{ 
        const $ = cheerio.load(HTMLresponse);
        console.log($.text());
        $('.ellipsis').each((i, element) => {
            console.log(element);
        });
    })
}
app.listen('8080')
console.log('Listening port 8080');
exports = module.exports = app;

It only logs the HTML of the login page. I want to log a different page.

Answer 1

The problem is that cheerio only parses the HTML it is given; it can't follow the site to a new URL.
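
A likely reason the second request shows the login page again is that it carries no session cookie: after a successful login the server usually answers with Set-Cookie headers, and every later request has to send those values back in a Cookie header. Here is a minimal sketch of that step in plain Node.js. The helper name cookieHeaderFromSetCookie and the cookie values are made up for illustration, and the sketch ignores attributes such as Path, Domain and expiry that a real cookie jar would honor.

```javascript
// Hypothetical helper (not part of any library): turn the Set-Cookie headers
// of a login response into the Cookie header the next request must send.
function cookieHeaderFromSetCookie(setCookieHeaders) {
  return setCookieHeaders
    .map(header => header.split(';')[0].trim()) // keep only "name=value"
    .filter(pair => pair.includes('='))         // skip malformed entries
    .join('; ');
}

// Made-up example values; a real site will use its own cookie names.
const loginResponseHeaders = [
  'sessionid=abc123; Path=/; HttpOnly',
  'csrftoken=xyz789; Path=/; Secure',
];

const cookieHeader = cookieHeaderFromSetCookie(loginResponseHeaders);
console.log(cookieHeader); // sessionid=abc123; csrftoken=xyz789
```

Both solutions listed next are ways of getting such a cookie into the follow-up request without writing this bookkeeping yourself.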

In your particular case, there are two possible solutions:
1. Log in to the site in your browser, read the cookies in the developer tools, and copy them into your request, along the lines of the cookie jar example in the request documentation.
2. Use an automated headless browser, such as puppeteer or selenium, which can follow page redirects and keep your session data.

If you are already using node.js, it would be easier to implement the logic with puppeteer.

Here is more information on puppeteer.

Update


Puppeteer:

const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // Now you have two ways; use one or the other, not both.
  // The first one uses evaluate to access the page DOM.
  await page.evaluate(() => {
     // Here you have access to the DOM, so you can perform any JS DOM operations you wish.
     const form = document.querySelector('form');
     // 'input[name="email"]' is an assumed selector; adjust it to the real form.
     const email = document.querySelector('input[name="email"]');
     // ...some actions
     form.submit();
  });
  // The second one uses puppeteer helper functions.
  const email = await page.$('input[name="email"]');
  // type() types text into the input.
  await email.type('some text');
  // press() emulates pressing the Enter key.
  await email.press('Enter');
  // Give the site some time to process the login.
  await new Promise(resolve => setTimeout(resolve, 1500));
  // Here you have the result of your auth procedure.
  // After all your operations, just close the browser.
  await browser.close();
})();

See the puppeteer documentation for more about type.



If you want an implementation based on request instead:
First, we have to get the cookie.
You can extract cookies with a Chrome extension, or open the developer tools, go to the Network tab, click the first record, and look for the Cookie header in the Request Headers section.
Just copy it.
Then, in code, make the request like this example from the official documentation:

const request = require('request');
const j = request.jar();
// Replace 'key1=value1' with your cookie from the browser
const cookie = request.cookie('key1=value1');
const url = 'http://www.google.com';
j.setCookie(cookie, url);
request({url: url, jar: j}, function () {
    request('http://images.google.com')
})
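
A Cookie header copied from the browser usually holds several cookies separated by semicolons, while the example above sets a single 'key1=value1' pair. A small sketch for splitting the copied header into separate pairs is shown here; splitCookieHeader is a hypothetical helper name and the header value is made up.

```javascript
// Hypothetical helper (not part of request): split a Cookie header copied
// from the browser into individual "name=value" strings.
function splitCookieHeader(cookieHeader) {
  return cookieHeader
    .split(';')
    .map(pair => pair.trim())
    .filter(pair => pair.length > 0); // drop empty pieces, e.g. after a trailing ';'
}

// Made-up example value; paste your own copied header here.
const copied = 'sessionid=abc123; csrftoken=xyz789';
const pairs = splitCookieHeader(copied);
console.log(pairs); // [ 'sessionid=abc123', 'csrftoken=xyz789' ]
```

Each pair can then go through request.cookie(pair) and j.setCookie(cookie, url) in a loop, on the assumption that request.cookie reads one cookie per call.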