How to Fix Failed Python Requests: Retry & Proxy Strategies
Sarah Whitmore
Tackling Failed Python Requests: Smart Retries and Proxy Power
Python's `requests` library is a fantastic tool in any developer's arsenal, particularly favored for tasks like web scraping. It simplifies the process of sending HTTP requests compared to the standard `urllib` library, making interactions with web servers and APIs much more straightforward.

When you're building web scrapers, you'll likely lean heavily on `requests` because it's relatively easy to use and debug. This is crucial because, let's face it, requests fail. Understanding *why* they fail and how to handle those failures gracefully is key to building robust scraping applications.
Getting Your Feet Wet with Python Requests
Before we dive in, we'll assume you've got Python set up and are comfortable using an Integrated Development Environment (IDE) like VS Code or PyCharm. If you haven't already, you'll need to install the `requests` library. Open your terminal or command prompt and type:
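```bash
pip install requests
```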
This command fetches and installs the library, making it available in your Python environment.
Like any Python library, you need to import `requests` before using it in your script:

```python
import requests
```
Sending a simple GET request is quite intuitive. You call the `get()` method with the URL you want to access:
```python
import requests

def fetch_data(url):
    try:
        response = requests.get(url)
        # We'll check the status code to see if it worked
        print(f"Request to {url} returned status code: {response.status_code}")
        # You might want to return the response object for further processing
        # return response
    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")

# Let's try fetching the Evomi homepage
target_url = 'https://evomi.com'
fetch_data(target_url)
```
Checking the `status_code` of the response object is fundamental. It tells you whether your request was successful or if something went wrong. We'll use these codes to decide how to handle failures.
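As a quick illustration (a minimal sketch using an httpbin test endpoint), you can branch on the code directly, or let `requests` raise an exception for error responses via `raise_for_status()`:

```python
import requests

response = requests.get('https://httpbin.org/status/404')

if response.status_code == 200:
    print("Success!")
elif response.status_code == 404:
    print("Resource not found.")
else:
    # raise_for_status() turns any other 4xx/5xx code into an HTTPError
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as e:
        print(f"HTTP error: {e}")
```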
Decoding Failed Request Responses
Every HTTP request results in a status code. While codes like 200 OK mean success, we're more interested in the ones that signal trouble, especially during web scraping.
403 Forbidden
This code means the server understood your request but refuses to authorize it. You might lack the necessary permissions or credentials to access the resource, or perhaps your IP address has been flagged or banned. Overcoming a 403 often requires valid authentication credentials.
If the site uses basic authentication, you can include credentials directly in your request:
```python
import requests

def fetch_protected_data(url, username, password):
    try:
        credentials = (username, password)
        response = requests.get(url, auth=credentials)
        print(f"Status Code: {response.status_code}")
    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")

# Example usage (replace with actual URL and credentials)
# fetch_protected_data('https://some-protected-resource.com/data', 'my_user', 'my_secret_pass')
```
Keep in mind that many websites use more complex authentication methods (like login forms with sessions and CSRF tokens) which require more sophisticated handling, often involving session objects and posting data.
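As a rough illustration of that pattern (a sketch only; the login URL, form field names, and CSRF token location are assumptions that vary from site to site), a session-based login might look like this:

```python
import requests
from bs4 import BeautifulSoup  # assumes BeautifulSoup is installed for HTML parsing

# Hypothetical login URL and field names -- adjust to the actual site
LOGIN_URL = 'https://example.com/login'

session = requests.Session()

# 1. Load the login page and extract the CSRF token from a hidden form field
login_page = session.get(LOGIN_URL)
soup = BeautifulSoup(login_page.text, 'html.parser')
csrf_token = soup.find('input', {'name': 'csrf_token'})['value']

# 2. Post the credentials along with the token; the session keeps the cookies
payload = {
    'username': 'my_user',
    'password': 'my_secret_pass',
    'csrf_token': csrf_token,
}
response = session.post(LOGIN_URL, data=payload)
print(f"Login attempt returned status code: {response.status_code}")

# 3. Later requests through the same session carry the authenticated cookies
# protected = session.get('https://example.com/protected-page')
```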
429 Too Many Requests
Ah, the bane of many scrapers. A 429 error indicates you've hit a rate limit – you're sending requests too frequently from the same IP address. The server is asking you to slow down. This is where retry strategies and proxies become essential.
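One practical detail worth knowing: many servers that return a 429 also send a `Retry-After` header indicating how long to wait. A minimal sketch of honoring it (assuming the header carries a number of seconds rather than a date) could look like this:

```python
import time
import requests

response = requests.get('https://httpbin.org/status/429')

if response.status_code == 429:
    # Retry-After may be missing; fall back to a default delay of 10 seconds
    wait_seconds = int(response.headers.get('Retry-After', 10))
    print(f"Rate limited. Waiting {wait_seconds}s before retrying...")
    time.sleep(wait_seconds)
    response = requests.get('https://httpbin.org/status/429')
```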
500 Internal Server Error
This is a generic "something broke" error on the server side. It's not your fault, but the server encountered an unexpected condition. Retrying the request after a short delay often resolves this, as it might be a temporary glitch.
502 Bad Gateway
Similar to a 500 error, a 502 usually means one server acting as a gateway or proxy received an invalid response from an upstream server it was trying to access. Again, this is a server-side issue, and a retry might succeed if the upstream issue is resolved quickly.
503 Service Unavailable
This typically means the server is temporarily overloaded or down for maintenance. It's unable to handle the request right now. While you can retry, success depends entirely on the server becoming available again.
504 Gateway Timeout
Like 502, this involves a gateway or proxy server. However, this time, the gateway didn't receive a *timely* response from the upstream server. It could be due to network congestion or the upstream server being slow. Retrying, possibly with increasing delays (backoff), is a common approach.
Building a Resilient Retry Strategy
The `requests` library, combined with Python's standard libraries, provides the necessary building blocks to handle most transient errors automatically. For errors like 403 (Forbidden), you might need specific credentials, and for 429 (Too Many Requests), proxies offer a great solution (more on that later). But for the 5xx server errors and temporary glitches, retries are your best friend.
Let's explore two common ways to implement retries.
Method 1: The Simple Loop with Fixed Delay
A straightforward approach is to wrap your request in a loop that retries a fixed number of times with a pause between attempts.
```python
import requests
import time

def fetch_with_simple_retry(url, max_retries=3, delay=5):
    """
    Attempts to fetch a URL, retrying on non-200/404 status codes.
    """
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=10)  # Added a timeout
            # Success or 'Not Found' are considered final
            if response.status_code == 200 or response.status_code == 404:
                print(f"Attempt {attempt + 1}: Success or Not Found ({response.status_code}).")
                return response
            else:
                print(f"Attempt {attempt + 1}: Failed with status {response.status_code}. Retrying in {delay}s...")
        except requests.exceptions.Timeout:
            print(f"Attempt {attempt + 1}: Request timed out. Retrying in {delay}s...")
        except requests.exceptions.ConnectionError:
            print(f"Attempt {attempt + 1}: Connection error. Retrying in {delay}s...")
        except requests.exceptions.RequestException as e:
            print(f"Attempt {attempt + 1}: An unexpected error occurred: {e}. Retrying in {delay}s...")
        # Wait before the next attempt, unless it's the last one
        if attempt < max_retries - 1:
            time.sleep(delay)
        else:
            print(f"Attempt {attempt + 1}: Max retries reached. Giving up.")
    return None  # Or raise an exception

# Example usage
target_url = 'https://httpbin.org/delay/3'  # A site that delays its response
fetch_with_simple_retry(target_url)
```
Here, we import the `time` library for the `sleep()` function. The function takes the URL, the maximum number of retries, and the delay between retries. It loops, making the request inside a `try...except` block to catch potential network issues or timeouts. If the status code isn't 200 (OK) or 404 (Not Found, often treated as a final state), it waits and tries again. This is simple, but it might hammer a struggling server if the delay is too short.
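One common refinement, shown here as a sketch rather than part of the original snippet, is to add random jitter to the fixed delay so that multiple clients don't all retry at the same instant:

```python
import random
import time

def wait_with_jitter(base_delay):
    # Sleep for the base delay plus up to 50% random jitter
    jitter = random.uniform(0, base_delay * 0.5)
    time.sleep(base_delay + jitter)

# Inside the retry loop above, you could replace time.sleep(delay) with:
# wait_with_jitter(delay)
```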
Method 2: Sophisticated Retries with HTTPAdapter
For more fine-grained control, especially for implementing exponential backoff (increasing delays between retries), you can use the `HTTPAdapter` and `Retry` utilities from `requests` and its underlying library, `urllib3`.
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def fetch_with_adapter_retry(url):
    """
    Fetches a URL using a session with a configured retry strategy.
    """
    session = requests.Session()

    # Configure the retry strategy
    retry_strategy = Retry(
        total=5,  # Total number of retries
        backoff_factor=1,  # Multiplier for delay: {backoff factor} * (2 ** ({number of total retries} - 1))
        status_forcelist=[429, 500, 502, 503, 504],  # Status codes to force a retry on
        allowed_methods=["HEAD", "GET", "OPTIONS"]  # Methods to retry (important for idempotency)
    )

    # Mount the strategy to the session for HTTP and HTTPS
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("https://", adapter)
    session.mount("http://", adapter)

    try:
        response = session.get(url, timeout=15)  # Use the session to make the request
        print(f"Request to {url} returned status code: {response.status_code}")
        return response
    except requests.exceptions.RequestException as e:
        print(f"Failed to fetch {url} after multiple retries: {e}")
        return None

# Example usage
target_url = 'https://httpbin.org/status/503'  # A site that returns 503
fetch_with_adapter_retry(target_url)
```
First, we import `HTTPAdapter` and `Retry`. We create a `requests.Session` object, which persists certain parameters across requests. Then, we define a `Retry` object, specifying:
- `total`: The maximum number of retry attempts.
- `backoff_factor`: Controls the delay between attempts. A factor of 1 means delays grow roughly as 0s, 2s, 4s, 8s, 16s... for subsequent retries (see the quick calculation after this list).
- `status_forcelist`: A list of HTTP status codes that should trigger a retry.
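To make that schedule concrete, here is a quick back-of-the-envelope calculation based on urllib3's documented formula (a sketch; exact timings can differ slightly between urllib3 versions, and recent versions skip the first sleep entirely):

```python
# Approximate sleep before each retry with backoff_factor=1
backoff_factor = 1

for retry_number in range(1, 6):
    delay = 0 if retry_number <= 1 else backoff_factor * (2 ** (retry_number - 1))
    print(f"Retry {retry_number}: sleep ~{delay}s")

# Retry 1: sleep ~0s
# Retry 2: sleep ~2s
# Retry 3: sleep ~4s
# Retry 4: sleep ~8s
# Retry 5: sleep ~16s
```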
We create an `HTTPAdapter` with this retry strategy and `mount` it to our session for both `http://` and `https://` prefixes. Any request made using this `session` object will automatically use the configured retry logic. This method is generally preferred for production code, as it's more robust and respects server load better via backoff.
Using Proxies to Navigate Request Limits (Especially 429s)
The `429 Too Many Requests` error is a direct challenge to your IP address's reputation with the target server. While retrying might eventually work once the rate limit window passes, a more effective strategy, especially for large-scale scraping, is using proxies.
By routing your requests through different proxy servers, you change the source IP address seen by the target server. If you hit a rate limit on one IP, you can simply switch to another. High-quality proxy providers like Evomi offer vast pools of IPs (Residential, Mobile, Datacenter) allowing you to distribute your requests and avoid hitting those limits.
Let's adapt the `HTTPAdapter` example to incorporate rotating proxies for handling 429s proactively.
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def fetch_with_proxies_and_retry(url):
    """
    Fetches a URL using a session with retries and rotating proxies.
    Note: This example assumes a rotating proxy endpoint.
    """
    session = requests.Session()

    # Proxy configuration - replace with your actual Evomi credentials and endpoint
    # Example for Evomi rotating residential proxies (HTTP)
    proxy_user = "YOUR_USERNAME"
    proxy_pass = "YOUR_PASSWORD"
    proxy_host = "rp.evomi.com"
    proxy_port = 1000  # Use 1001 for HTTPS, 1002 for SOCKS5
    proxy_url = f"http://{proxy_user}:{proxy_pass}@{proxy_host}:{proxy_port}"

    proxies = {
        "http": proxy_url,
        "https": proxy_url,  # Use the same proxy for HTTPS traffic
    }
    session.proxies = proxies  # Set proxies for the session

    # Configure retry strategy - Note: We might remove 429 if proxies handle it
    retry_strategy = Retry(
        total=3,  # Fewer retries, as proxies help avoid some issues
        backoff_factor=0.5,
        status_forcelist=[500, 502, 503, 504],  # Retrying mainly server-side issues
        allowed_methods=["HEAD", "GET", "OPTIONS"]
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("https://", adapter)
    session.mount("http://", adapter)

    try:
        # Verify IP change (optional, good for testing)
        # ip_check_url = 'https://geo.evomi.com/'
        # pre_response = session.get(ip_check_url, timeout=10)
        # print(f"Current IP via proxy: {pre_response.json().get('ip')}")

        response = session.get(url, timeout=20)  # Increased timeout for proxy latency
        print(f"Request to {url} via proxy returned status code: {response.status_code}")

        # Specific handling for 429 if needed (though rotating proxies should mitigate it)
        if response.status_code == 429:
            print("Received 429 despite proxy. Rotating proxy might need time or pool exhausted.")
            # Depending on proxy type, manual rotation logic might be needed here
            # For Evomi rotating residential, each new request *should* use a new IP
        return response
    except requests.exceptions.ProxyError as e:
        print(f"Proxy error occurred: {e}")
        return None
    except requests.exceptions.RequestException as e:
        print(f"Failed to fetch {url} via proxy after retries: {e}")
        return None

# Example usage
target_url = 'https://httpbin.org/ip'  # Check the IP the target server sees
fetch_with_proxies_and_retry(target_url)
```
In this setup, we configure the session to use proxies, with placeholders for Evomi's rotating residential proxy endpoint (`rp.evomi.com:1000`). Each request sent through this session will automatically be routed via a proxy. If you're using a provider like Evomi with rotating residential or mobile proxies, each connection (or session, depending on setup) typically gets a new IP address automatically. This drastically reduces the chance of hitting 429 errors tied to a single IP.
Notice we adjusted the `Retry` strategy, removing `429` from the `status_forcelist`, because the primary strategy for handling it is now IP rotation via the proxy. We still keep retries for server-side errors (5xx). If you're using sticky sessions (where you keep the same proxy IP for a while), you'd need more complex logic to manually switch proxies upon receiving a 429, as sketched below.
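Here's a rough idea of what that manual switch could look like (a sketch only; the pool of proxy endpoints and the rotation approach are assumptions that depend on your provider's setup):

```python
import requests

# Hypothetical pool of sticky-session proxy URLs (placeholders, not real endpoints)
PROXY_POOL = [
    "http://user:pass@proxy-1.example.com:8000",
    "http://user:pass@proxy-2.example.com:8000",
    "http://user:pass@proxy-3.example.com:8000",
]

def fetch_rotating_on_429(url, max_attempts=3):
    """Try each proxy in turn, switching whenever the server answers 429."""
    for attempt, proxy_url in enumerate(PROXY_POOL[:max_attempts], start=1):
        proxies = {"http": proxy_url, "https": proxy_url}
        try:
            response = requests.get(url, proxies=proxies, timeout=20)
            if response.status_code != 429:
                print(f"Attempt {attempt}: got {response.status_code} via this proxy")
                return response
            print(f"Attempt {attempt}: rate limited (429), switching proxy...")
        except requests.exceptions.ProxyError as e:
            print(f"Attempt {attempt}: proxy error ({e}), switching proxy...")
    return None

# Example usage
# fetch_rotating_on_429('https://httpbin.org/ip')
```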
Considering proxies? Evomi offers ethically sourced Residential, Mobile, and fast Datacenter proxies. We even offer a free trial to test them out!
Wrapping Up
Handling failed requests is non-negotiable for reliable web scraping or API interaction in Python. You've learned two primary ways to implement automatic retries:
- A simple `for` loop with `time.sleep()` for basic retry needs.
- Using `requests.Session`, `HTTPAdapter`, and `urllib3.util.Retry` for more control, including exponential backoff.
Furthermore, you saw how proxies, particularly rotating ones like those offered by Evomi, are incredibly effective at overcoming rate limits (429 errors) by changing your source IP address. Combining smart retry logic with a robust proxy infrastructure gives your Python scripts the resilience needed to navigate the unpredictable nature of the web.

Author
Sarah Whitmore
Digital Privacy & Cybersecurity Consultant
About Author
Sarah is a cybersecurity strategist with a passion for online privacy and digital security. She explores how proxies, VPNs, and encryption tools protect users from tracking, cyber threats, and data breaches. With years of experience in cybersecurity consulting, she provides practical insights into safeguarding sensitive data in an increasingly digital world.