Puppeteer is a Node library that provides a high-level API to control Chrome over the DevTools Protocol. Most things you can do manually in the browser can also be done with Puppeteer. For example, you can:
This article takes a simple step-by-step approach to getting started with Puppeteer and creating a short HTTP trace that logs the requests the browser makes.
Before we start scripting, a couple of preconditions need to be in place:
To start scripting, you need a code editor. Visual Studio Code is a great option if you don’t have one installed on your device. It is free and available on Linux, macOS, and Windows.
After installing a code editor, it’s time to install Node.js. This is necessary because Puppeteer is a Node library. Installing Node.js also installs npm, the package manager used in the steps below.
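To confirm that Node.js is installed and reasonably recent, a small script like the one below can be saved to a file and run with node. This is only a sketch: the minimum version in MIN_MAJOR is an assumption for illustration, and the helper name nodeMajorVersion is hypothetical. Check the Puppeteer documentation for the version required by the release you install.

```javascript
// Assumed minimum major version, for illustration only.
const MIN_MAJOR = 18;

// Hypothetical helper: extract the major version from a string like "v20.11.1".
function nodeMajorVersion(versionString) {
  return Number(versionString.replace(/^v/, '').split('.')[0]);
}

const major = nodeMajorVersion(process.version);
console.log(`Node.js ${process.version} detected`);
if (major < MIN_MAJOR) {
  console.warn(`Node.js ${MIN_MAJOR} or newer is assumed here; consider upgrading.`);
}
```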
Before we install Puppeteer, create a project directory:
C:\>mkdir iovio_test
C:\>cd iovio_test
Install Puppeteer in this project directory with the following command:
C:\iovio_test>npm i puppeteer
When the installation is done, it’s time to code!
First, let’s create an empty JavaScript file. When this is done, import the Puppeteer library by adding the following line:
const puppeteer = require('puppeteer');
Now you can use the functions Puppeteer offers. You’ll be able to start a Chrome browser, navigate to a specific page, and close the browser again:
const puppeteer = require('puppeteer');
(async () => {
// Open a browser with the given arguments
const browser = await puppeteer.launch({
headless: false,
devtools: false,
ignoreHTTPSErrors: true,
args: [
'--start-fullscreen',
'--window-size=1920,1040',
'--no-sandbox'
]
});
// Opening a new page
const page = await browser.newPage();
// Set the viewport width and height
await page.setViewport({ width: 1600, height: 900 });
// Navigate to specific url and wait till network traffic is idle
await page.goto('https://qualibrate.com', {waitUntil: "networkidle0"});
// Close the browser
await browser.close();
})();
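If a step in the script throws, for example because the page fails to load, browser.close() is never reached and the browser window stays open. One way to guard against that is a try/finally wrapper. The helper below is a minimal sketch: withBrowser is a hypothetical name, and the first argument is assumed to be an object with a launch() method, such as the module returned by require('puppeteer').

```javascript
// Hypothetical helper: run a task with a browser and always close it afterwards.
// `puppeteerLike` is assumed to expose launch(options), resolving to an object
// with a close() method, as the puppeteer module does.
async function withBrowser(puppeteerLike, launchOptions, task) {
  const browser = await puppeteerLike.launch(launchOptions);
  try {
    return await task(browser);
  } finally {
    // Runs whether the task succeeded or threw, so no browser is left open.
    await browser.close();
  }
}
```

With Puppeteer installed, it could be called as withBrowser(require('puppeteer'), { headless: false }, async (browser) => { ... }), with the page steps placed inside the callback.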
Check that this works by running the script from your terminal. Use the name of the file you created; in this case it is test.js:
C:\iovio_test>node test.js
If your script runs successfully, you can add some extra steps and do a bit of navigation to the blog section of this web page. Your script will then look like the following:
const puppeteer = require('puppeteer');
(async () => {
// Open a browser with the given arguments
const browser = await puppeteer.launch({headless: false, devtools: false, ignoreHTTPSErrors: true, args: [
'--start-fullscreen',
'--window-size=1920,1040',
'--no-sandbox'
]});
// Opening a new page
const page = await browser.newPage();
// Set the viewport width and height
await page.setViewport({ width: 1600, height: 900 });
// Navigate to specific url and wait till network traffic is idle
await page.goto('https://qualibrate.com', {waitUntil: "networkidle0"});
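// Click on menu item Blog
// Note: page.$x() is not available in recent Puppeteer releases; there, pass a "::-p-xpath(...)" selector to page.$$() instead
const menuElement = await page.$x("//a[contains(text(), 'Blog')]");
// Start waiting for the navigation before clicking, so a fast navigation is not missed
await Promise.all([
page.waitForNavigation({waitUntil: "networkidle0"}),
menuElement[0].click()
]);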
// Close the browser
await browser.close();
})();
To test your script, you can use the following command in your terminal:
C:\iovio_test>node test.js
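A click that triggers a page load is usually paired with the navigation wait, so that the wait is already registered when the click happens. The helper below sketches that pattern. The name clickAndWaitForNavigation is hypothetical; it only assumes that page has a waitForNavigation(options) method and element has a click() method, as Puppeteer's Page and ElementHandle do.

```javascript
// Hypothetical helper: click an element and wait for the navigation it triggers.
async function clickAndWaitForNavigation(
  page,
  element,
  options = { waitUntil: 'networkidle0' }
) {
  // waitForNavigation() is called first, so the wait is in place before the click.
  const [response] = await Promise.all([
    page.waitForNavigation(options),
    element.click(),
  ]);
  return response;
}
```

In the script above it could be used as await clickAndWaitForNavigation(page, menuElement[0]).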
If your script ran successfully, it’s time for the next step: recording an HTTP trace while the script runs. First, install an extra package named puppeteer-har via npm:
C:\iovio_test>npm i puppeteer-har
When the package has finished downloading, import it at the top of your script. The first two lines will then look like this:
const puppeteer = require('puppeteer');
const PuppeteerHar = require('puppeteer-har');
Next, place the code below between the line that opens a new page and the line that sets the viewport size. These lines create a timestamp for a unique file name and start the tracing:
// Open a new page
const page = await browser.newPage();
// Create a timestamp (seconds since the Unix epoch) for a unique file name
const timestamp = Math.floor(Date.now() / 1000);
const har = new PuppeteerHar(page);
// Start the HTTP tracing
await har.start({ path: `./${timestamp}results.har` });
console.log('HAR recording started');
// Set the viewport width and height
await page.setViewport({ width: 1600, height: 900 });
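A numeric timestamp makes the file name unique but not easy to read. As an optional variation, the file name could be built from an ISO date instead. The sketch below is an illustration only; harFileName and its suffix parameter are hypothetical names, not part of puppeteer-har.

```javascript
// Hypothetical helper: build a HAR file name from a Date.
function harFileName(date = new Date(), suffix = 'results') {
  // toISOString() gives e.g. "2024-01-02T03:04:05.000Z"; drop the milliseconds
  // and replace ":" because it is not allowed in Windows file names.
  const stamp = date.toISOString().replace(/\.\d+Z$/, 'Z').replace(/:/g, '-');
  return `./${stamp}-${suffix}.har`;
}

console.log(harFileName(new Date(Date.UTC(2024, 0, 2, 3, 4, 5))));
// prints "./2024-01-02T03-04-05Z-results.har"
```

The result can be passed as the path option when starting the trace.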
To stop the HTTP tracing, add an extra line of code at the end of the script:
// Stop the HTTP tracing and write the HAR file
await har.stop();
console.log('HAR recording stopped and file saved');
await browser.close();
After you run your script, a HAR file is created. HAR (HTTP Archive) is a JSON format with a defined structure: a top-level log object that contains an entries array, with one entry per request. This example uses an online HAR analyzer, but be careful: a HAR file can contain sensitive data, such as cookies, headers, and any information submitted in forms. For this example, we can use the HAR Analyzer from Google.
Choose your file and open it. All the requests that happened during your recording will be visible in this analysis.
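Because of that sensitive data, it can be worth blanking out obvious secrets before uploading a HAR file anywhere. The sketch below works on a parsed HAR object and uses the standard HAR field names (log.entries, request.headers, request.cookies, request.postData, response.headers, response.cookies). The list of sensitive headers and the function names are assumptions for illustration, and the sample data is hand-written, not output from the script above. It is not a complete scrubber: URLs and response bodies can also carry secrets.

```javascript
// Header names treated as sensitive here; this list is an assumption, extend as needed.
const SENSITIVE_HEADERS = new Set(['authorization', 'cookie', 'set-cookie']);

function redactHeaders(headers = []) {
  return headers.map((header) =>
    SENSITIVE_HEADERS.has(header.name.toLowerCase())
      ? { ...header, value: 'REDACTED' }
      : header
  );
}

// Return a copy of a parsed HAR object with sensitive values blanked out.
function redactHar(har) {
  const copy = JSON.parse(JSON.stringify(har)); // deep copy; HAR is plain JSON
  for (const entry of copy.log.entries) {
    entry.request.headers = redactHeaders(entry.request.headers);
    entry.request.cookies = [];
    if (entry.request.postData) {
      entry.request.postData.text = 'REDACTED';
      delete entry.request.postData.params; // submitted form fields
    }
    entry.response.headers = redactHeaders(entry.response.headers);
    entry.response.cookies = [];
  }
  return copy;
}

// Tiny hand-written sample in HAR shape.
const sample = {
  log: {
    entries: [
      {
        request: {
          method: 'POST',
          url: 'https://example.com/login',
          headers: [
            { name: 'Cookie', value: 'session=abc123' },
            { name: 'Accept', value: 'text/html' },
          ],
          cookies: [{ name: 'session', value: 'abc123' }],
          postData: { mimeType: 'application/x-www-form-urlencoded', text: 'user=me&password=secret' },
        },
        response: {
          status: 200,
          headers: [{ name: 'Set-Cookie', value: 'session=def456' }],
          cookies: [{ name: 'session', value: 'def456' }],
        },
      },
    ],
  },
};

const cleaned = redactHar(sample);
console.log(JSON.stringify(cleaned.log.entries[0].request.headers));
```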
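The same overview can also be produced locally, without uploading the file. The sketch below lists status, method, URL, and total time for each entry of a parsed HAR object, using the standard HAR fields request.method, request.url, response.status, and time (in milliseconds). The function name summarizeHar and the sample data are hypothetical, for illustration only.

```javascript
// Hypothetical helper: reduce each HAR entry to the fields of interest.
function summarizeHar(har) {
  return har.log.entries.map((entry) => ({
    method: entry.request.method,
    url: entry.request.url,
    status: entry.response.status,
    timeMs: Math.round(entry.time),
  }));
}

// Tiny hand-written sample in HAR shape.
const sampleHar = {
  log: {
    entries: [
      { time: 120.4, request: { method: 'GET', url: 'https://example.com/' }, response: { status: 200 } },
      { time: 35.9, request: { method: 'GET', url: 'https://example.com/missing.css' }, response: { status: 404 } },
    ],
  },
};

for (const row of summarizeHar(sampleHar)) {
  console.log(`${row.status} ${row.method} ${row.url} (${row.timeMs} ms)`);
}
```

For a real recording, the object would come from reading the generated file, for example JSON.parse(fs.readFileSync(path, 'utf8')) with Node's fs module.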
After following the steps above you are able to