Parse an HTML Page With Node.js

Posted By : Sakshi Gadia | 30-May-2018

To make an HTTP request, you can use the "request" npm package.

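const request = require('request');

request('http://www.google.com', (err, response, html) => {
  // Stop early if the request itself failed (DNS error, timeout, etc.)
  if (err) { return console.log(err); }
  console.log(response.statusCode); // HTTP status code of the response
  console.log(html);                // the page's HTML as a string
});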

 

The code above sends an HTTP request to http://www.google.com and passes the returned HTML to the callback.
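The "request" package has since been deprecated, so the same step can also be done with Node's built-in modules. The following is only a minimal sketch, not a drop-in replacement: the helper name fetchHtml is made up for this example, and it does not follow redirects or handle timeouts.

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: fetch a page and resolve with its HTML body as a string.
function fetchHtml(url) {
  return new Promise((resolve, reject) => {
    // Pick the built-in client that matches the URL scheme.
    const client = url.startsWith('https:') ? https : http;

    client.get(url, (res) => {
      if (res.statusCode < 200 || res.statusCode >= 300) {
        res.resume(); // discard the body so the socket is released
        reject(new Error(`Request failed with status ${res.statusCode}`));
        return;
      }

      res.setEncoding('utf8');
      let body = '';
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}
```

A call such as fetchHtml('http://www.google.com').then((html) => console.log(html.length)) would then hand the HTML string to whatever parses it next.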

Cheerio parses markup and lets you work with the result. Parsing, manipulating, and rendering with Cheerio are very efficient, because it works with a very simple, consistent DOM model.

 

Install it with:

npm install cheerio

 
const cheerio = require('cheerio');

const $ = cheerio.load(html);

 

To parse an HTML table, you can use cheerio-tableparser. It parses HTML tables and groups the cells by column, with colspan and rowspan respected.

 

Install it with:

npm install --save cheerio cheerio-tableparser

 

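const cheerio = require('cheerio');
const cheerioTableparser = require('cheerio-tableparser');

const $ = cheerio.load(html);
cheerioTableparser($); // adds parsetable() to the cheerio instance
// Arguments: dupCols, dupRows, textMode (return cell text instead of HTML)
const data = $("table").parsetable(true, true, true);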

Output:

data =>
[ [ 'A', '1a', '1a', '1a', '1a', '1a' ],
  [ 'B', '2a', '2b', '2b', '2d', '2d' ],
  [ 'C', '3a', '2b', '2b', '3d', '3e' ],
  [ 'D', '4a', '4b', '4c', '4c', '4e' ],
  [ 'E', '5a', '5b', '5c', '5d', '5e' ] ]
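Because the result is grouped by column, it often needs to be turned back into rows before further use. The sketch below shows one way to do that in plain JavaScript, using the output above as input. The helper names columnsToRows and rowsToRecords are made up for this example, and treating the first entry of each column as a header is an assumption about the table.

```javascript
// Column-grouped data, copied from the output above.
const columns = [
  ['A', '1a', '1a', '1a', '1a', '1a'],
  ['B', '2a', '2b', '2b', '2d', '2d'],
  ['C', '3a', '2b', '2b', '3d', '3e'],
  ['D', '4a', '4b', '4c', '4c', '4e'],
  ['E', '5a', '5b', '5c', '5d', '5e'],
];

// Hypothetical helper: transpose columns into rows.
function columnsToRows(cols) {
  const rowCount = Math.max(0, ...cols.map((col) => col.length));
  return Array.from({ length: rowCount }, (_, r) => cols.map((col) => col[r]));
}

// Hypothetical helper: use the first row as keys for the remaining rows.
function rowsToRecords(rows) {
  const [header, ...body] = rows;
  return body.map((row) =>
    Object.fromEntries(header.map((key, i) => [key, row[i]]))
  );
}

const rows = columnsToRows(columns);
const records = rowsToRecords(rows);

console.log(rows[0]);    // [ 'A', 'B', 'C', 'D', 'E' ]
console.log(records[0]); // { A: '1a', B: '2a', C: '3a', D: '4a', E: '5a' }
```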

 

About Author

Sakshi Gadia

An experienced MEAN Stack developer with good knowledge of Node.js and MongoDB. In my spare time, I enjoy playing chess, and I am always ready to learn new technologies.
