Scraping JavaScript Sites: Puppeteer, Node.js & Proxies





Michael Chen
Scraping Techniques
Tackling Dynamic Websites: Scraping with Puppeteer, Node.js, and Proxies
When you need to scrape data from simple, static websites, libraries like Axios or tools like Cheerio often do the trick. They are great for parsing plain HTML. However, the modern web is dynamic; many sites rely heavily on JavaScript to load and display content. Basic HTML scrapers fall short here because they can't execute this client-side code.
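To see the limitation in practice, here's a rough sketch of the static approach with Axios and Cheerio (assuming both packages are installed); on a JavaScript-heavy page it only receives the initial HTML shell, so selectors for content rendered later typically match nothing:
const axios = require('axios');
const cheerio = require('cheerio');
(async () => {
  // Fetch the raw HTML only; no JavaScript runs on our side
  const response = await axios.get('https://www.youtube.com/');
  const $ = cheerio.load(response.data);
  // On a dynamic site this will usually report 0 matches
  console.log('Video title links found:', $('a#video-title').length);
})();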
To effectively scrape dynamic websites, you need a tool capable of running a full browser environment, executing JavaScript just like a real user's browser would.
Enter Puppeteer: a powerful Node.js library developed by Google. It provides a clean, high-level API to control Chrome or Chromium browsers programmatically.
This guide will walk you through Puppeteer, demonstrating how to build a simple scraper to extract top video results for specific keywords from YouTube.
So, What Exactly is Puppeteer?
Puppeteer is essentially a browser automation framework. While often used for automated testing of web applications, its ability to mimic browser actions makes it invaluable for web scraping tasks involving JavaScript-rendered content.
It operates using the Chrome DevTools Protocol, giving you programmatic access to the inner workings of the browser.
With Puppeteer, you can automate nearly anything you'd do manually in a browser: navigating pages, clicking buttons, filling out forms, scrolling, typing text, and even executing custom JavaScript snippets within the page's context.
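As a quick taste of what that looks like in code, here's a mini-sketch of a few such actions (assuming page is an already-open Puppeteer page and the selectors are placeholders):
await page.click('#some-button'); // click an element
await page.type('#some-input', 'hello world', { delay: 50 }); // type like a user
await page.keyboard.press('Enter'); // send a key press
await page.evaluate(() => window.scrollBy(0, 1000)); // scroll via in-page JavaScript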
By default, Puppeteer uses Chromium (the open-source base for Google Chrome), but it's flexible enough to be configured with full Chrome or even Firefox (though support might vary).
Tutorial: Scraping YouTube Search Results with Puppeteer
In this hands-on section, we'll build a Node.js script using Puppeteer. The goal is to take a search term, query YouTube, and extract the titles and URLs of the top video results.
Later, we'll enhance the script by integrating a proxy service to help manage our digital footprint during scraping.
Setting Up Your Environment
Before we start coding, ensure you have Node.js installed. If not, head over to the official Node.js website for download and installation instructions.
First, create a dedicated directory for our project, navigate into it, and initialize a new Node.js project using npm (Node Package Manager):
mkdir evomi-puppeteer-demo
cd evomi-puppeteer-demo
npm init -y
Next, install the core Puppeteer package. This command also downloads a compatible version of Chromium for Puppeteer to control:
npm install puppeteer
Great! Now create a file named scraper.js (or any name you prefer) and open it in your text editor. Let's start coding.
Launching Puppeteer and Opening a Page
Let's begin with a fundamental Puppeteer script:
const puppeteer = require('puppeteer');
(async () => {
// Launch the browser
const browserInstance = await puppeteer.launch({
headless: false, // Show the browser window
defaultViewport: null // Use the browser's default viewport size
});
// Open a new tab
const page = await browserInstance.newPage();
// Navigate to YouTube
await page.goto('https://www.youtube.com/', { waitUntil: 'networkidle2' });
console.log('YouTube loaded!');
// We'll add more logic here later...
// await browserInstance.close(); // Keep it open for now
})();
This script initializes Puppeteer, launches a visible browser instance, opens a new page (tab), and navigates to YouTube's homepage.
The headless: false option is crucial during development. It lets you visually observe the browser performing the automated actions, making debugging much easier. For production or unattended scripts, setting this to true (or omitting it, as true is the default) conserves server resources by not rendering the UI. The waitUntil: 'networkidle2' option tells Puppeteer to consider navigation successful when there are no more than 2 network connections for at least 500 ms, which is often a good indicator that the main content has loaded.
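For reference, 'networkidle2' is just one of the load heuristics Puppeteer accepts; the same option works in page.waitForNavigation. A quick sketch of the alternatives (same page object as above):
// 'load'             - wait for the load event
// 'domcontentloaded' - wait for the DOMContentLoaded event
// 'networkidle0'     - no network connections for at least 500 ms
// 'networkidle2'     - no more than 2 network connections for at least 500 ms
await page.goto('https://www.youtube.com/', { waitUntil: 'domcontentloaded' });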
To execute this script, save your changes to scraper.js and run it from your terminal:
node scraper.js
You should see a Chromium browser window pop up and load YouTube.
If you encounter issues launching the browser, the official Puppeteer Troubleshooting guide is an excellent resource.
Next, we'll automate interaction with the page elements.
Handling the Cookie Consent Banner
Most websites, including YouTube, present a cookie consent banner on the first visit. We need to accept it to proceed.
Puppeteer allows element selection using standard CSS selectors and also supports XPath selectors, which are sometimes more convenient for finding elements based on their text content.
Let's use an XPath selector to find the button containing the text "Accept all". The page.waitForXPath method waits for the element to appear in the DOM. (Note: recent Puppeteer releases deprecate waitForXPath; if you're on a newer version, page.waitForSelector('xpath/...') is the equivalent.)
// Wait for the cookie consent button using XPath and click it
try {
const acceptButtonXPath = '//span[contains(text(), "Accept all")]/ancestor::button';
const cookieAcceptButton = await page.waitForXPath(acceptButtonXPath, { timeout: 5000 }); // Wait max 5 seconds
if (cookieAcceptButton) {
await cookieAcceptButton.click();
console.log('Accepted cookies.');
// Wait for potential page reload or navigation after accepting
await page.waitForNavigation({ waitUntil: 'networkidle0', timeout: 10000 });
console.log('Page reloaded after cookie acceptance.');
} else {
console.log('Cookie banner accept button not found or timed out.');
}
} catch (error) {
console.log('Cookie banner not found or already accepted.', error.message);
// Continue script execution even if the banner isn't found
}
This snippet looks for a button associated with the "Accept all" text. We wrap it in a try-catch block because the banner might not always appear (e.g., on subsequent runs). After clicking, we use page.waitForNavigation to pause the script until the page finishes loading again, as clicking the consent button often triggers a reload or state change. The waitUntil: 'networkidle0' option waits until there have been no network connections for at least 500 ms.
Here’s the updated script incorporating the cookie handling:
const puppeteer = require('puppeteer');
(async () => {
const browserInstance = await puppeteer.launch({
headless: false,
defaultViewport: null
});
const page = await browserInstance.newPage();
await page.goto('https://www.youtube.com/', {
waitUntil: 'networkidle2'
});
console.log('YouTube loaded!');
// Handle Cookie Banner
try {
const acceptButtonXPath = '//span[contains(text(), "Accept all")]/ancestor::button';
const cookieAcceptButton = await page.waitForXPath(acceptButtonXPath, {
timeout: 5000
});
if (cookieAcceptButton) {
await cookieAcceptButton.click();
console.log('Accepted cookies.');
await page.waitForNavigation({
waitUntil: 'networkidle0',
timeout: 10000
});
console.log('Page reloaded after cookie acceptance.');
} else {
console.log('Cookie banner accept button not found or timed out.');
}
} catch (error) {
console.log('Cookie banner not found or already accepted.', error.message);
}
console.log('Ready for next steps...');
// await browserInstance.close();
})();
Automating the Search
With the cookie banner out of the way, let's focus on the search bar. We need to locate it, type our search query, and then click the search button.
We can use CSS selectors here. YouTube's search input often has an ID like search within a specific form or container. We'll use page.waitForSelector to ensure the element exists before interacting with it.
// Find and interact with the search bar
const searchInputSelector = 'input#search';
const searchButtonSelector = 'button#search-icon-legacy';
const searchQuery = 'web scraping best practices'; // Our search term
try {
await page.waitForSelector(searchInputSelector, { visible: true, timeout: 10000 });
await page.type(searchInputSelector, searchQuery, { delay: 50 }); // Type slowly like a human
console.log(`Typed "${searchQuery}" into search bar.`);
await page.waitForSelector(searchButtonSelector, { visible: true });
await page.click(searchButtonSelector);
console.log('Clicked search button.');
// Wait for search results page to load
await page.waitForNavigation({ waitUntil: 'networkidle2', timeout: 15000 });
console.log('Search results page loaded.');
} catch (error) {
console.error('Error during search:', error.message);
await browserInstance.close();
return; // Stop script if search fails
}
This code waits for the search input (input#search) to be visible, types our query ("web scraping best practices") with a slight delay between keystrokes (making it look less robotic), waits for the search button (button#search-icon-legacy), clicks it, and finally waits for the results page to load using waitForNavigation again.
Let's integrate this into our script:
const puppeteer = require('puppeteer');
(async () => {
const browserInstance = await puppeteer.launch({
headless: false,
defaultViewport: null
});
const page = await browserInstance.newPage();
await page.goto('https://www.youtube.com/', { waitUntil: 'networkidle2' });
console.log('YouTube loaded!');
// Handle Cookie Banner (same code as before)
try {
const acceptButtonXPath = '//span[contains(text(), "Accept all")]/ancestor::button';
const cookieAcceptButton = await page.waitForXPath(acceptButtonXPath, { timeout: 5000 });
if (cookieAcceptButton) {
await cookieAcceptButton.click();
console.log('Accepted cookies.');
await page.waitForNavigation({ waitUntil: 'networkidle0', timeout: 10000 });
console.log('Page reloaded after cookie acceptance.');
} else {
console.log('Cookie banner accept button not found or timed out.');
}
} catch (error) {
console.log('Cookie banner not found or already accepted.', error.message);
}
// Perform Search
const searchInputSelector = 'input#search';
const searchButtonSelector = 'button#search-icon-legacy';
const searchQuery = 'web scraping best practices';
try {
await page.waitForSelector(searchInputSelector, { visible: true, timeout: 10000 });
await page.type(searchInputSelector, searchQuery, { delay: 50 });
console.log(`Typed "${searchQuery}" into search bar.`);
await page.waitForSelector(searchButtonSelector, { visible: true });
await page.click(searchButtonSelector);
console.log('Clicked search button.');
await page.waitForNavigation({ waitUntil: 'networkidle2', timeout: 15000 });
console.log('Search results page loaded.');
} catch (error) {
console.error('Error during search:', error.message);
await browserInstance.close();
return;
}
console.log('Ready to scrape results...');
// await browserInstance.close();
})();
Running this should now navigate to YouTube, accept cookies (if present), perform the search, and land you on the results page.
Extracting Video Information
Now for the core scraping logic. We need to identify the video entries on the results page and extract their titles and links.
The page.evaluate() method is perfect for this. It allows us to run JavaScript code within the context of the browser page, effectively giving us access to the page's DOM as if we were using the browser's developer console.
Here's a function using page.evaluate() to grab the video titles and links. The selectors target specific elements commonly used by YouTube for video results.
// Scrape video data
const videoData = await page.evaluate(() => {
const results = [];
// Selector for video links/titles - targets the title link within a result item
const videoElements = document.querySelectorAll('ytd-video-renderer h3 a#video-title');
videoElements.forEach(element => {
const title = element.innerText.trim();
const link = element.href;
if (title && link) { // Ensure we have both title and link
results.push({ title, link });
}
});
return results.slice(0, 5); // Return top 5 results
});
console.log('Scraped Video Data:');
console.log(videoData);
await browserInstance.close(); // Close the browser now
Inside page.evaluate:
- We initialize an empty array, results.
- We use document.querySelectorAll with the selector ytd-video-renderer h3 a#video-title. This targets the anchor (a) tag with the ID video-title, which typically sits inside an h3 tag within the main container (ytd-video-renderer) for each video result. This is generally more specific than just selecting all h3 tags.
- We iterate through the found elements (videoElements).
- For each element, we extract its visible text content (innerText) and trim whitespace to get the title.
- We read the href attribute for the video link.
- We create an object with the title and link and push it to our results array.
- Finally, we return the first 5 results using slice(0, 5). The value returned from the function inside evaluate is assigned to the videoData variable in our Node.js script.
After extracting the data, we log it to the console and close the browser instance using browserInstance.close().
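A side note: if you'd rather not hard-code the result limit inside the browser context, page.evaluate accepts extra arguments that are serialized and passed to the page function. A minimal sketch (maxResults is a hypothetical variable defined in the Node.js script):
const maxResults = 5; // lives in Node.js, not in the page
const videoData = await page.evaluate((limit) => {
  const results = [];
  document.querySelectorAll('ytd-video-renderer h3 a#video-title').forEach(element => {
    const title = element.innerText.trim();
    const link = element.href;
    if (title && link) {
      results.push({ title, link });
    }
  });
  return results.slice(0, limit); // use the value passed in from Node.js
}, maxResults);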
Here is the complete script:
const puppeteer = require('puppeteer');
(async () => {
console.log('Launching browser...');
const browserInstance = await puppeteer.launch({
headless: false, // Keep false for debugging, set true for production
defaultViewport: null
});
const page = await browserInstance.newPage();
console.log('Navigating to YouTube...');
await page.goto('https://www.youtube.com/', { waitUntil: 'networkidle2' });
console.log('YouTube loaded!');
// Handle Cookie Banner
try {
const acceptButtonXPath = '//span[contains(text(), "Accept all")]/ancestor::button';
const cookieAcceptButton = await page.waitForXPath(acceptButtonXPath, { timeout: 5000 });
if (cookieAcceptButton) {
await cookieAcceptButton.click();
console.log('Accepted cookies.');
await page.waitForNavigation({ waitUntil: 'networkidle0', timeout: 10000 });
console.log('Page reloaded after cookie acceptance.');
} else {
console.log('Cookie banner accept button not found or timed out.');
}
} catch (error) {
console.log('Cookie banner not found or already accepted.', error.message);
}
// Perform Search
const searchInputSelector = 'input#search';
const searchButtonSelector = 'button#search-icon-legacy';
const searchQuery = 'web scraping best practices';
try {
console.log('Waiting for search input...');
await page.waitForSelector(searchInputSelector, { visible: true, timeout: 10000 });
console.log(`Typing "${searchQuery}"...`);
await page.type(searchInputSelector, searchQuery, { delay: 50 });
console.log('Waiting for search button...');
await page.waitForSelector(searchButtonSelector, { visible: true });
console.log('Clicking search button...');
await page.click(searchButtonSelector);
console.log('Waiting for search results page navigation...');
await page.waitForNavigation({ waitUntil: 'networkidle2', timeout: 15000 });
console.log('Search results page loaded.');
} catch (error) {
console.error('Error during search:', error.message);
await browserInstance.close();
return;
}
// Scrape video data
console.log('Scraping video results...');
const videoData = await page.evaluate(() => {
const results = [];
const videoElements = document.querySelectorAll('ytd-video-renderer h3 a#video-title');
videoElements.forEach(element => {
const title = element.innerText.trim();
const link = element.href;
if (title && link) {
results.push({ title, link });
}
});
return results.slice(0, 5); // Return top 5 results
});
console.log('--- Scraped Video Data ---');
console.log(videoData);
console.log('--------------------------');
console.log('Closing browser...');
await browserInstance.close();
console.log('Script finished.');
})();
Running this script should now print an array of objects, each containing the title and link of the top search results, similar to this (titles/links will vary):
// --- Scraped Video Data ---
[
{
title: 'Web Scraping Tutorial For Beginners | Scrape Anything!',
link: 'https://www.youtube.com/watch?v=someVideoId1'
},
{
title: 'Ethical Web Scraping: Best Practices & Legal Considerations',
link: 'https://www.youtube.com/watch?v=someVideoId2'
},
{
title: 'How to Avoid Getting Blocked While Web Scraping',
link: 'https://www.youtube.com/watch?v=someVideoId3'
},
{
title: 'Advanced Web Scraping Techniques in Python',
link: 'https://www.youtube.com/watch?v=someVideoId4'
},
{
title: 'Web Scraping with Node.js and Puppeteer - Full Course',
link: 'https://www.youtube.com/watch?v=someVideoId5'
}
]
// --------------------------
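If you'd like to keep these results instead of just printing them, one simple option is to write them to disk with Node's built-in fs module before closing the browser; a small sketch (the results.json filename is arbitrary):
const fs = require('fs');
// Persist the scraped data as pretty-printed JSON
fs.writeFileSync('results.json', JSON.stringify(videoData, null, 2));
console.log('Saved results to results.json');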
Integrating Proxies for Reliable Scraping
While occasional light scraping might go unnoticed, performing extensive or frequent scraping from a single IP address is a surefire way to get rate-limited or even permanently blocked by the target website. Imagine losing access to YouTube entirely from your home network!
This is where proxies come in. A proxy acts as an intermediary between your script and the website. Your requests are routed through the proxy server, masking your real IP address. If the proxy IP gets flagged or blocked, your own IP remains safe.
Using a pool of proxies, especially residential or mobile ones, further enhances reliability and stealth. Services like Evomi offer access to vast pools of ethically sourced IPs. By rotating through different IPs (often automatically handled by the proxy service endpoint), your scraping activity appears as traffic from many different, legitimate users rather than a single automated bot.
For scraping dynamic sites like YouTube, Evomi's Residential Proxies are an excellent choice, offering IPs from real user devices worldwide, making requests look highly authentic. They start at a competitive price point of just $0.49 per GB.
Configuring Puppeteer to use a proxy is straightforward. You'll need your proxy provider's details: the server address (endpoint) and port, plus username/password credentials if authentication is required.
Let's modify the puppeteer.launch() options to include the proxy server details via the args array:
// Example using Evomi Residential Proxy endpoint (replace with your details)
const proxyServer = 'rp.evomi.com:1000'; // Example: HTTP endpoint for Evomi Residential
const proxyUsername = 'your-evomi-username'; // Replace with your Evomi username
const proxyPassword = 'your-evomi-password'; // Replace with your Evomi password
console.log(`Launching browser with proxy: ${proxyServer}...`);
const browserInstance = await puppeteer.launch({
headless: false, // Keep false for debugging
defaultViewport: null,
args: [
`--proxy-server=${proxyServer}`
]
});
Replace rp.evomi.com:1000, your-evomi-username, and your-evomi-password with your actual Evomi credentials and the appropriate endpoint/port (e.g., 1001 for HTTPS, 1002 for SOCKS5 with residential proxies).
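Rather than hard-coding credentials in the script, it's generally safer to read them from environment variables; a minimal sketch (the variable names below are just an example convention, not something the proxy service requires):
// Read proxy settings from the environment, with placeholders as fallbacks
const proxyServer = process.env.PROXY_SERVER || 'rp.evomi.com:1000';
const proxyUsername = process.env.PROXY_USERNAME || 'your-evomi-username';
const proxyPassword = process.env.PROXY_PASSWORD || 'your-evomi-password';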
Since most quality proxy services require authentication, we need to tell Puppeteer how to authenticate with the proxy. This is done using the page.authenticate() method, called right after creating the new page:
const page = await browserInstance.newPage();
// Authenticate with the proxy server
await page.authenticate({
username: proxyUsername,
password: proxyPassword
});
console.log('Proxy authentication set.');
Here's how the initial part of the script looks with proxy integration:
const puppeteer = require('puppeteer');
(async () => {
// --- Proxy Configuration ---
// Example using Evomi Residential Proxy endpoint (replace with your details)
const proxyServer = 'rp.evomi.com:1000'; // Example: HTTP endpoint for Evomi Residential
const proxyUsername = 'your-evomi-username'; // Replace with your Evomi username
const proxyPassword = 'your-evomi-password'; // Replace with your Evomi password
// -------------------------
console.log(`Launching browser with proxy: ${proxyServer}...`);
const browserInstance = await puppeteer.launch({
headless: false, // Keep false for debugging, set true for production
defaultViewport: null,
args: [
`--proxy-server=${proxyServer}`
]
});
const page = await browserInstance.newPage();
// Authenticate with the proxy server
await page.authenticate({
username: proxyUsername,
password: proxyPassword
});
console.log('Proxy authentication set.');
console.log('Navigating to YouTube via proxy...');
await page.goto('https://www.youtube.com/', { waitUntil: 'networkidle2' });
console.log('YouTube loaded via proxy!');
// ... (rest of the script: cookie handling, search, scraping) ...
console.log('Closing browser...');
await browserInstance.close();
console.log('Script finished.');
})();
Now, when you run the script, all traffic to YouTube will be routed through your configured Evomi proxy, significantly reducing the risk of your personal IP being flagged or blocked.
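To confirm the proxy is actually in use, you can point the page at an IP-echo service before heading to YouTube; a quick sketch (api.ipify.org is used here only as an example endpoint, and page is the proxied page from the script above):
// Check which IP target sites will see
await page.goto('https://api.ipify.org', { waitUntil: 'networkidle2' });
const exitIp = await page.evaluate(() => document.body.innerText);
console.log('Current exit IP:', exitIp);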
Wrapping Up
In this guide, we explored Puppeteer, a versatile library for browser automation that shines when scraping JavaScript-heavy websites. You learned how to launch Puppeteer, navigate pages, interact with elements like buttons and input fields, execute JavaScript within the page context to extract data, and critically, how to integrate proxies like those from Evomi to protect your IP and improve scraping reliability.
Puppeteer opens up possibilities for interacting with almost any website, no matter how dynamic. Try applying these techniques to other complex sites to further hone your skills!

Author
Michael Chen
AI & Network Infrastructure Analyst
About Author
Michael bridges the gap between artificial intelligence and network security, analyzing how AI-driven technologies enhance proxy performance and security. His work focuses on AI-powered anti-detection techniques, predictive traffic routing, and how proxies integrate with machine learning applications for smarter data access.