@vickyRathee
Created September 19, 2021 11:33
Block ads and other resources using Puppeteer's network request interception
const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    timeout: 30000,
    ignoreHTTPSErrors: true,
    args: ["--no-sandbox", "--disable-setuid-sandbox"],
  });
  const page = await browser.newPage();
  await page.setRequestInterception(true);
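  // Substrings of ad-serving hostnames; any request whose URL contains one is aborted.
  const rejectRequestPattern = [
    "googlesyndication.com",
    "doubleclick.net",
    "amazon-adsystem.com",
    "adnxs.com",
  ];
  const blockList = [];
  page.on("request", (request) => {
    const url = request.url();
    // String.prototype.match() would compile each string as a regex, so use a plain substring test.
    if (rejectRequestPattern.some((pattern) => url.includes(pattern))) {
      blockList.push(url);
      request.abort();
    } else {
      request.continue();
    }
  });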
  // Set the viewport and user agent before navigating so they apply to the initial page load.
  await page.setViewport({ width: 1440, height: 900 });
  await page.setUserAgent(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36"
  );
  await page.goto("https://www.nytimes.com/");
  await page.screenshot({ path: "nytimes.png", fullPage: true });
  console.log(
    `Blocked ${blockList.length} requests with urls: ${JSON.stringify(
      blockList
    )}`
  );
  await browser.close();
})();
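
The description mentions "other resources", but the script above filters only by ad domain. One possible extension, sketched here under assumptions, also drops requests by resource type and matches domains against the parsed hostname instead of the whole URL. The names `BLOCKED_DOMAINS`, `BLOCKED_RESOURCE_TYPES`, `isBlockedHost`, `shouldBlock`, and `blockRequests` are hypothetical and not part of the gist, and the choice of resource types is only illustrative.

```javascript
// Hypothetical helpers; not part of the original gist.
const BLOCKED_DOMAINS = [
  "googlesyndication.com",
  "doubleclick.net",
  "amazon-adsystem.com",
  "adnxs.com",
];

// Illustrative choice of resource types to drop (values as returned by request.resourceType()).
const BLOCKED_RESOURCE_TYPES = new Set(["image", "media", "font"]);

// True when the URL's hostname equals a blocked domain or is a subdomain of one.
function isBlockedHost(url, domains = BLOCKED_DOMAINS) {
  let hostname;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false; // not a parseable absolute URL
  }
  return domains.some(
    (domain) => hostname === domain || hostname.endsWith("." + domain)
  );
}

// Combines the resource-type check with the hostname check.
function shouldBlock(url, resourceType) {
  return BLOCKED_RESOURCE_TYPES.has(resourceType) || isBlockedHost(url);
}

// Enables interception on a Puppeteer page and returns the array that
// collects every blocked URL.
async function blockRequests(page) {
  const blocked = [];
  await page.setRequestInterception(true);
  page.on("request", (request) => {
    if (shouldBlock(request.url(), request.resourceType())) {
      blocked.push(request.url());
      request.abort();
    } else {
      request.continue();
    }
  });
  return blocked;
}
```

With these helpers, calling `const blockList = await blockRequests(page);` before `page.goto(...)` would stand in for the inline `setRequestInterception` call and `request` handler. Matching on the hostname avoids blocking a page merely because an ad domain appears in its query string, at the cost of missing ad URLs that are proxied through another host.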