Scrape in Python with Undetected ChromeDriver & Proxies
Stealthy Web Scraping: Combining Python, Undetected ChromeDriver, and Proxies
Selenium is a heavyweight champion in the world of browser automation and web scraping. Typically, it works hand-in-glove with standard browser drivers, like the official ChromeDriver for Google Chrome. While effective, these standard drivers aren't exactly invisible. Savvy websites have developed methods to detect automation scripts using Selenium, often leading to blocks or CAPTCHAs.
The main giveaway? Standard drivers tend to leak specific automation-related properties. Enter Undetected ChromeDriver, a clever Python library designed specifically to patch these leaks. By modifying ChromeDriver, it significantly lowers the chances of your script being flagged as a bot.
Integrating Undetected ChromeDriver into your Selenium web scraping projects can be a game-changer. It often leads to more successful data gathering and can even cut down on costs by reducing the need for aggressive IP rotation (though proxies remain crucial!). Since it's an open-source library, adding Undetected ChromeDriver is a smart move for anyone serious about web scraping with Selenium.
Getting Started: Installation
Like any Python package, you'll need to install Undetected ChromeDriver first. Fire up your terminal or command prompt and run:
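pip install undetected-chromedriver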
One handy thing: you don't need to install Selenium separately. Undetected ChromeDriver pulls it in as a dependency, along with other necessary bits.
Another neat feature: Undetected ChromeDriver automatically downloads a compatible ChromeDriver binary for you. This saves you the hassle of manually finding and matching driver versions, a common task in older Selenium setups.
With the installation complete, you can import the library into your Python script:
import undetected_chromedriver as uc
Using Undetected ChromeDriver: A Practical Guide
Making Basic Web Requests
Fetching a webpage (making a GET request) is fundamental to web scraping. With Undetected ChromeDriver, the process is very similar to standard Selenium, but under the hood, it's doing more to stay hidden.
import undetected_chromedriver as uc
import time

def fetch_webpage(target_url):
    # Initialize the Undetected ChromeDriver
    # Use the 'options' argument if you need custom configurations later
    browser = uc.Chrome()
    try:
        # Navigate to the desired URL
        print(f"Attempting to load: {target_url}")
        browser.get(target_url)
        # Pause briefly to observe the browser (optional)
        print("Page loaded. Pausing for 5 seconds...")
        time.sleep(5)  # Keep the browser open for a few seconds
        print("Finished.")
    finally:
        # Ensure the browser is closed even if errors occur
        browser.quit()

# Example: try accessing a site known for bot detection
fetch_webpage('https://nowsecure.nl')  # A site that tests browser fingerprinting
In this script, we define a function fetch_webpage. It initializes an instance of the modified Chrome browser using uc.Chrome(), then navigates to the specified URL with the familiar get() method.
We've added a time.sleep(5) just so you can see the browser window open and load the page before it automatically closes. The try...finally block ensures the browser closes properly, even if something goes wrong during the page load.
We're using nowsecure.nl as an example target because it actively checks for bot-like browser properties. Standard Selenium might struggle here, making it a decent test for Undetected ChromeDriver's capabilities.
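Fixed sleeps are fine for demos, but they waste time on fast pages and time out on slow ones. Selenium's explicit waits, available through the Selenium dependency that Undetected ChromeDriver pulls in, are a sturdier option. Here is a minimal sketch, assuming that the presence of the <body> element is an adequate readiness signal for your target (the function name and timeout are our own illustrative choices):

import undetected_chromedriver as uc
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def fetch_when_ready(target_url, timeout=15):
    browser = uc.Chrome()
    try:
        browser.get(target_url)
        # Block until the <body> element exists, up to 'timeout' seconds
        WebDriverWait(browser, timeout).until(
            EC.presence_of_element_located((By.TAG_NAME, "body"))
        )
        print("Page is ready.")
    finally:
        browser.quit()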
Capturing Website Source Code
Simply visiting a page isn't enough for scraping; you need the underlying HTML content. You can then feed this HTML to a parsing library like Beautiful Soup 4 to extract the data you need.
import undetected_chromedriver as uc
import time

def get_page_html(target_url):
    # Initialize the Undetected ChromeDriver
    browser = uc.Chrome()
    html_source = None  # Variable to store the HTML
    try:
        # Navigate to the URL
        print(f"Fetching HTML from: {target_url}")
        browser.get(target_url)
        # Wait a moment for potential dynamic content loading (adjust as needed)
        time.sleep(2)
        # Get the page source HTML
        html_source = browser.page_source
        print("Successfully retrieved HTML source.")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # Close the browser
        browser.quit()
    # Return the captured HTML
    return html_source

# Example usage:
page_content = get_page_html('https://nowsecure.nl')
if page_content:
    # Print the first 500 characters as a sample
    print("\n--- HTML Source (First 500 chars) ---")
    print(page_content[:500])
    print("...")
else:
    print("Failed to retrieve page content.")
This version builds on the previous one. The key addition is html_source = browser.page_source, which grabs the full HTML source of the currently loaded page and stores it in the html_source variable. The function then returns this HTML after closing the browser.
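From here you can hand the returned HTML to a parser. A minimal sketch using Beautiful Soup 4 (installed separately with pip install beautifulsoup4; the title and link extraction are just illustrations):

from bs4 import BeautifulSoup

html = get_page_html('https://nowsecure.nl')
if html:
    soup = BeautifulSoup(html, 'html.parser')
    # Example: print the page title and count the links
    print(soup.title.string if soup.title else "No <title> found")
    links = [a.get('href') for a in soup.find_all('a', href=True)]
    print(f"Found {len(links)} links.")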
Customizing Browser Behavior
Undetected ChromeDriver aims for stealth out-of-the-box, but sometimes you need to tweak its settings for specific scraping tasks or further optimization.
Keep in mind that altering default settings requires careful testing. Some options might inadvertently make your browser *more* detectable, while others might be necessary for specific scenarios (like running without a visible browser window).
import undetected_chromedriver as uc
import time

def fetch_with_options(target_url):
    # Create a ChromeOptions object
    options = uc.ChromeOptions()
    # Example: run in headless mode (no visible browser window)
    # Note: headless detection is sophisticated; test thoroughly!
    # '--headless=new' is the flag for modern Chrome versions
    options.add_argument('--headless=new')
    # Example: disable loading images (can speed up scraping)
    # options.add_argument('--blink-settings=imagesEnabled=false')
    # Initialize the driver with the specified options
    browser = uc.Chrome(options=options)
    html_source = None
    try:
        print(f"Fetching (with options) from: {target_url}")
        browser.get(target_url)
        time.sleep(2)  # Allow time for page load in headless mode
        html_source = browser.page_source
        print("Successfully retrieved HTML source in headless mode.")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        browser.quit()
    return html_source

# Example usage:
page_content = fetch_with_options('https://nowsecure.nl')
if page_content:
    print("\nRetrieved content successfully (headless).")
    # print(page_content[:500])  # Optionally print content
else:
    print("Failed to retrieve page content (headless).")
Here, we introduce uc.ChromeOptions(). We create an options object, add arguments like --headless=new to run the browser without a GUI, and pass the options when creating the uc.Chrome instance.
Many other options exist, like disabling image loading or setting window sizes. One particularly powerful option is customizing the User-Agent string.
The User-Agent is a piece of information your browser sends with every request, identifying itself (browser type, version, OS, etc.). Changing the User-Agent can sometimes help blend in better, especially if the default doesn't match common user profiles.
import undetected_chromedriver as uc
import time
from selenium.webdriver.common.by import By

def fetch_with_custom_ua(target_url):
    options = uc.ChromeOptions()
    options.add_argument('--headless=new')
    # Define a custom User-Agent string
    custom_user_agent = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/119.0.0.0 Safari/537.36')
    options.add_argument(f'--user-agent={custom_user_agent}')
    browser = uc.Chrome(options=options)
    html_source = None
    try:
        # Optional: verify the User-Agent the browser instance is actually sending
        browser.get('https://httpbin.org/user-agent')  # A site that echoes the UA
        time.sleep(1)
        ua_info = browser.find_element(By.TAG_NAME, "pre").text
        print(f"User-Agent seen by httpbin: {ua_info}")
        # Now navigate to the actual target
        print(f"Fetching (with custom UA) from: {target_url}")
        browser.get(target_url)
        time.sleep(2)
        html_source = browser.page_source
        print("Successfully retrieved HTML source with custom User-Agent.")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        browser.quit()
    return html_source

# Example usage:
page_content = fetch_with_custom_ua('https://nowsecure.nl')
# Check results...
We added another argument to set the --user-agent. To confirm it's working, we first visit https://httpbin.org/user-agent, which simply reports back the headers it received, letting us see the User-Agent our browser actually sends before proceeding to the main target.
You can experiment with different user agents or even rotate through a list of common, up-to-date ones. Other useful options include specifying browser versions or even providing a path to a specific ChromeDriver binary (though usually unnecessary).
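To rotate user agents in practice, one simple approach is to pick a string at random per browser session. A short sketch; the USER_AGENTS list below is illustrative, so fill it with current, realistic strings:

import random
import undetected_chromedriver as uc

# Illustrative list; keep these strings current and consistent with real browsers
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
]

def browser_with_random_ua():
    options = uc.ChromeOptions()
    options.add_argument(f'--user-agent={random.choice(USER_AGENTS)}')
    return uc.Chrome(options=options)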
Integrating Proxies
While Undetected ChromeDriver helps mimic a real user's browser, aggressive anti-bot systems also monitor IP address behavior. Sending too many requests from a single IP is a classic red flag. This is where proxies become essential.
Proxies act as intermediaries, routing your requests through different IP addresses. This makes it much harder for websites to track and block your scraping activity based solely on your IP. Fortunately, adding a proxy to Undetected ChromeDriver is straightforward.
import undetected_chromedriver as uc
import time
from selenium.webdriver.common.by import By

def fetch_via_proxy(target_url, proxy_string):
    # proxy_string format example: "http://username:password@host:port" or "http://host:port"
    options = uc.ChromeOptions()
    options.add_argument('--headless=new')
    options.add_argument('--ignore-certificate-errors')  # Often needed with proxies
    # Add the proxy server argument
    options.add_argument(f'--proxy-server={proxy_string}')
    print(f"Configured proxy: {proxy_string}")
    browser = uc.Chrome(options=options)
    html_source = None
    try:
        # Check our apparent IP address via the proxy
        print("Checking IP address via proxy...")
        browser.get('https://httpbin.org/ip')
        time.sleep(2)
        ip_info = browser.find_element(By.TAG_NAME, "pre").text
        print(f"IP address seen by httpbin: {ip_info}")
        # Now navigate to the actual target via the proxy
        print(f"Fetching (via proxy) from: {target_url}")
        browser.get(target_url)
        time.sleep(3)  # Allow a bit more time for the proxy connection
        html_source = browser.page_source
        print("Successfully retrieved HTML source via proxy.")
    except Exception as e:
        print(f"An error occurred while using proxy: {e}")
    finally:
        browser.quit()
    return html_source

# --- Evomi Proxy Example ---
# Replace with your actual Evomi credentials and desired endpoint/port.
# Format: protocol://username:password@host:port
# Residential proxies (HTTP): "http://YOUR_USERNAME:YOUR_PASSWORD@rp.evomi.com:1000"
# Datacenter proxies (HTTP):  "http://YOUR_USERNAME:YOUR_PASSWORD@dc.evomi.com:2000"
# The placeholder below must be replaced with a real, working proxy string.
proxy_details = "http://username:password@rp.evomi.com:1000"  # <-- REPLACE THIS

page_content = fetch_via_proxy('https://nowsecure.nl', proxy_details)
# Check results...
# Evomi's Free Proxy Tester (https://proxy-tester.evomi.com/) can verify proxy status separately.
We add another argument, --proxy-server=YOUR_PROXY_STRING, passing our proxy details to the options. The example shows a common format including protocol, username, password, host, and port, referencing Evomi's residential proxy endpoint structure. Remember to replace the placeholder with the actual credentials and endpoint Evomi provides. One caveat worth testing: Chrome tends to ignore credentials embedded in the --proxy-server flag and may show an authentication prompt instead, so if username/password authentication fails for you, consider whitelisting your IP with your provider or handling proxy auth through a helper such as Selenium Wire.
Using different proxy types like residential, mobile, or datacenter proxies can significantly impact success rates. Residential and mobile proxies often perform best against strict anti-bot measures as they use IPs associated with real devices. Evomi offers a range of these options, starting from competitive prices like $0.49/GB for residential proxies. You can explore the specifics on the Evomi pricing page.
Note: If you use an invalid proxy string, the script might run without errors but fail to load the page correctly, often capturing an error page from the proxy itself instead of the target website's content.
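A rough way to catch this silently-wrong case is to scan the captured source for Chrome's network-error markers. This is a heuristic sketch only; the 'ERR_' strings are an assumption based on Chrome's built-in error pages and worth verifying against your own failure cases:

def looks_like_proxy_failure(html):
    # Chrome's built-in error pages usually embed codes such as
    # ERR_PROXY_CONNECTION_FAILED or ERR_TUNNEL_CONNECTION_FAILED.
    # Heuristic only: confirm against the failures you actually see.
    return html is None or 'ERR_PROXY' in html or 'ERR_TUNNEL' in html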
Beyond Undetected ChromeDriver: Other Strategies
Undetected ChromeDriver is a powerful tool, but it's part of a larger scraping ecosystem. If you're still facing blocks, consider these points:
Alternative Libraries: Frameworks like Playwright and Pyppeteer (a Python port of Puppeteer) offer similar browser automation capabilities and have their own communities developing anti-detection techniques. Switching would require code adaptation.
Headless vs. Headful: Experiment with running your browser visibly (headful) versus invisibly (headless). Some sites are better at detecting headless browsers, even modified ones.
Proxy Strategy: Don't just use *a* proxy, use proxies *strategically*. Rotate IPs regularly, use high-quality residential or mobile proxies for tough targets, and match proxy geolocation to the target site if necessary (see the sketch after this list).
User Agent Rotation: Cycle through a list of current, common user agents instead of using just one static string.
Behavioral Analysis: Mimic human behavior. Add random delays between actions, avoid predictable scraping patterns, navigate through login pages naturally instead of hitting internal pages directly.
Anti-Detect Browsers: For maximum stealth, specialized browsers like Evomium (free for Evomi customers) are designed to manage and spoof many browser fingerprint parameters automatically, often simplifying the setup compared to manual configuration in code.
Fingerprint Checking: Use tools like Evomi's Browser Fingerprint Checker to see what information your configured browser might be leaking.
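To make the proxy-rotation and random-delay ideas concrete, here is a hedged sketch that starts a fresh browser per URL with a proxy drawn from a pool. The pool contents and delay range are illustrative, and credentials are omitted per the authentication caveat above:

import random
import time
import undetected_chromedriver as uc

# Illustrative pool; fill with real endpoints from your provider
PROXY_POOL = [
    "http://rp.evomi.com:1000",
    "http://dc.evomi.com:2000",
]

def scrape_rotating(urls):
    for url in urls:
        options = uc.ChromeOptions()
        options.add_argument(f'--proxy-server={random.choice(PROXY_POOL)}')
        browser = uc.Chrome(options=options)
        try:
            browser.get(url)
            print(f"{url}: {len(browser.page_source)} characters retrieved")
        finally:
            browser.quit()
        # Random pause between targets to avoid a machine-gun request pattern
        time.sleep(random.uniform(2.0, 6.0))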
Quick Recap
Let's summarize the core steps for using Undetected ChromeDriver with options and proxies:
Install the library:
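pip install undetected-chromedriver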
Import it:
import undetected_chromedriver as uc
Adapt the Python code below, configuring options and replacing the placeholder proxy details with your actual credentials (e.g., from Evomi):
import undetected_chromedriver as uc
import time

def scrape_with_proxy(target_url, proxy_string):
    options = uc.ChromeOptions()
    options.add_argument('--headless=new')  # Or remove for a visible browser
    # Example: use a realistic user agent
    options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36')
    options.add_argument('--ignore-certificate-errors')
    # Configure the proxy
    options.add_argument(f'--proxy-server={proxy_string}')
    browser = uc.Chrome(options=options)
    html_source = None
    try:
        # Mask user/pass in the log by keeping only the part after '@'
        print(f"Attempting to fetch {target_url} via proxy {proxy_string.split('@')[-1]}...")
        # Optional: check the IP first
        # browser.get('https://httpbin.org/ip')
        # print(f"Current IP: {browser.find_element('tag name', 'pre').text}")
        browser.get(target_url)
        time.sleep(3)  # Adjust wait time as needed
        html_source = browser.page_source
        print("Successfully retrieved page source.")
    except Exception as e:
        print(f"Scraping failed: {e}")
    finally:
        browser.quit()
    return html_source

# --- Configuration ---
# Replace with your actual proxy details from Evomi or another provider
# Format: protocol://username:password@host:port
proxy_config = "http://YOUR_USERNAME:YOUR_PASSWORD@rp.evomi.com:1000"  # <-- REPLACE THIS
target_website = 'https://httpbin.org/headers'  # Example site that shows request headers

# --- Execution ---
content = scrape_with_proxy(target_website, proxy_config)
if content:
    print("\n--- Received Content Sample ---")
    print(content[:600])
    print("...")
By combining Undetected ChromeDriver's stealth capabilities with the IP anonymity provided by quality proxies like those from Evomi, you significantly increase your chances of successful and sustainable web scraping in Python.

About the Author
Sarah Whitmore
Digital Privacy & Cybersecurity Consultant
Sarah is a cybersecurity strategist with a passion for online privacy and digital security. She explores how proxies, VPNs, and encryption tools protect users from tracking, cyber threats, and data breaches. With years of experience in cybersecurity consulting, she provides practical insights into safeguarding sensitive data in an increasingly digital world.