Scrape with Node.js & Axios: Safeguard Your IP with Proxies
Sarah Whitmore
Scraping Techniques
When it comes to web scraping with Node.js, Axios is often the go-to library. It's a popular choice for developers needing to fetch website content, which can then be processed using a parsing library like the versatile Cheerio.
However, making requests directly with Axios reveals your IP address to the target website. For simple, one-off requests, this isn't usually a problem. But if you're scraping data at scale, repeated requests from the same IP can quickly lead to blocks. That's where proxies come in – they act as intermediaries, masking your real IP and keeping your scraping operations running smoothly.
This guide will walk you through integrating proxies into your Node.js scraping workflow using Axios, helping you protect your identity and avoid interruptions.
Why Bother With Proxies?
Every time your script connects to a web server, it announces its origin IP address. A few requests might fly under the radar, but try scraping hundreds or thousands of pages, and you'll likely trigger security systems. Automated tools or even vigilant administrators can flag your IP for suspicious activity, leading to temporary or permanent bans. Once blacklisted, accessing the site becomes impossible without changing your IP.
Proxies function as shields, routing your requests through their own servers. The target website only sees the proxy's IP address, keeping your original IP hidden. This fundamental feature allows for gathering large datasets without constantly hitting IP limits. High-quality proxy services, like Evomi, offer vast pools of IP addresses accessible via a single endpoint, enabling seamless IP rotation with each request, making your scraping appear more like organic traffic.
Getting Started with Proxies in Node.js
Let's dive into how you can implement proxies using Node.js and Axios.
Project Setup
First things first, ensure you have Node.js installed. If not, grab it from the official Node.js website following their instructions.
Next, install Axios for making HTTP requests and Cheerio for parsing the HTML content. Open your terminal and run:
npm install axios cheerio
Now, create a directory for your project, let's call it `axios_proxy_scraper`, and navigate into it. Initialize a new Node.js project by running `npm init -y` (the `-y` flag accepts default settings). Finally, create a file named `scraper.js` and open it in your preferred code editor.
Making a Basic Request with Axios
Here’s a simple example of fetching web content using Axios:
const axios = require('axios');

// Target URL for scraping
const targetUrl = 'http://books.toscrape.com/';

axios.get(targetUrl)
  .then(response => {
    console.log('Successfully fetched HTML content!');
    // console.log(response.data); // Outputting raw HTML
  })
  .catch(error => {
    console.error(`Error fetching the URL: ${error.message}`);
  });
This script simply fetches the HTML from the specified URL (Books to Scrape, a site designed for practice) and logs a success message. You could uncomment the `console.log(response.data)` line to see the raw HTML.
To extract specific data, you'd typically use Cheerio:
const axios = require('axios');
const cheerio = require('cheerio');

const targetUrl = 'http://books.toscrape.com/';

axios.get(targetUrl)
  .then(response => {
    const html = response.data;
    const $ = cheerio.load(html);
    const books = [];
    $('article.product_pod').each((index, element) => {
      const title = $(element).find('h3 > a').attr('title');
      const price = $(element).find('.price_color').text();
      books.push({ title, price });
    });
    console.log('Extracted Books:');
    console.log(books);
  })
  .catch(error => {
    console.error(`Error during scraping: ${error.message}`);
  });
While "Books to Scrape" welcomes scrapers, many real-world sites aren't so accommodating. Aggressive scraping without hiding your IP can lead to CAPTCHAs, login walls, or outright bans. This makes proxies essential for serious scraping.
Implementing a Proxy in Axios
Using a proxy with Axios is straightforward. You define a proxy configuration object containing the protocol, host IP, and port.
Here's a basic structure:
const proxyConfig = {
  protocol: 'http', // Or 'https' (Axios's built-in proxy option doesn't handle SOCKS; that requires a separate agent such as socks-proxy-agent)
  host: '192.168.1.100', // Example proxy IP address
  port: 8080 // Example proxy port
};
You then pass this configuration object within the Axios request options:
axios.get(targetUrl, { proxy: proxyConfig })
  .then(response => {
    // Process response...
  })
  .catch(error => {
    // Handle errors, potentially related to proxy connection
    console.error(`Error with proxy: ${error.message}`);
  });
Now, Axios will route the request through the specified proxy server.
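A quick way to confirm the request really is going through the proxy is to hit an IP-echo service and check which address it reports. Here's a minimal sketch, assuming the example proxy details above and using httpbin.org/ip purely as a convenient echo endpoint:

const axios = require('axios');

// Example proxy details; substitute a proxy you actually control
const proxyConfig = {
  protocol: 'http',
  host: '192.168.1.100',
  port: 8080
};

// httpbin.org/ip echoes back the IP it sees the request coming from
axios.get('http://httpbin.org/ip', { proxy: proxyConfig })
  .then(response => {
    // If the proxy is working, this prints the proxy's IP, not yours
    console.log(`Origin IP seen by the server: ${response.data.origin}`);
  })
  .catch(error => {
    console.error(`Proxy check failed: ${error.message}`);
  });

If the logged address matches the proxy's IP rather than your own, the routing is working as expected.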
Finding and Choosing Proxies
Okay, so you need a proxy. Where do you get one? Broadly, you have two paths:
Free Proxy Lists: Scouring the web yields numerous lists of free proxies. However, be warned: these are often slow, unreliable, insecure, and usually offer only a single, potentially already flagged, IP address. While technically usable, managing their instability requires significant effort and technical know-how.
Paid Proxy Services: Professional providers offer reliable, secure access to large pools of proxies. This is where Evomi comes in. We provide ethically sourced residential, mobile, datacenter, and static ISP proxies.
Why choose a service like Evomi?
Reliability & Speed: Paid proxies offer significantly better performance and uptime compared to free ones.
Security & Ethics: Evomi prides itself on ethically sourced proxies and the security that comes with a professional service. Being based in Switzerland, we uphold high standards of quality and data privacy.
IP Rotation: Our residential and mobile proxy pools automatically rotate IPs, drastically reducing the chance of blocks and simplifying your scraping setup.
Affordability: Professional proxies are more accessible than you might think. Evomi offers competitive pricing, with options like Residential proxies starting at just $0.49/GB and Datacenter proxies at $0.30/GB.
Support: We back our services with excellent customer support.
Free Trial: Unsure? Evomi offers a completely free trial for our Residential, Mobile, and Datacenter proxies, letting you test the waters risk-free.
While you can try managing free proxies, investing in a reliable service like Evomi saves time, frustration, and ultimately leads to better scraping results.
Using Authenticated Proxies (Like Evomi's)
Most professional proxy services, including Evomi, require authentication to ensure only paying customers access the network. This typically involves a username and password.
When you sign up with Evomi, you'll find your credentials and proxy endpoint details (host, ports for different protocols) in your user dashboard.
To use these with Axios, you extend the proxy configuration object with an `auth` field:
// Example using Evomi Residential Proxies (HTTP)
const evomiResidentialProxy = {
  protocol: 'http',
  host: 'rp.evomi.com', // Evomi residential endpoint
  port: 1000, // Evomi HTTP port for residential
  auth: {
    username: 'YOUR_EVOMI_USERNAME', // Replace with your actual username
    password: 'YOUR_EVOMI_PASSWORD' // Replace with your actual password
  }
};
// Remember to replace placeholders with your real credentials!
You then use this `evomiResidentialProxy` object in your Axios request, just like the basic proxy example:
axios.get(targetUrl, { proxy: evomiResidentialProxy })
  .then(response => {
    // ... process data
  })
  .catch(error => {
    console.error(`Proxy authentication or connection error: ${error.message}`);
    // Common issues: incorrect credentials, firewall blocks, wrong port/protocol
  });
Here’s how the full scraping script might look using an authenticated Evomi proxy:
const axios = require('axios');
const cheerio = require('cheerio');

const targetUrl = 'http://books.toscrape.com/catalogue/category/books/mystery_3/index.html'; // A specific category page

// Evomi Residential Proxy Configuration
const evomiProxyConfig = {
  protocol: 'http',
  host: 'rp.evomi.com',
  port: 1000,
  auth: {
    username: 'YOUR_EVOMI_USERNAME', // Replace!
    password: 'YOUR_EVOMI_PASSWORD' // Replace!
  }
};

console.log(`Scraping ${targetUrl} using Evomi proxy...`);

axios.get(targetUrl, { proxy: evomiProxyConfig })
  .then(response => {
    const html = response.data;
    const $ = cheerio.load(html);
    const mysteryBooks = [];
    $('article.product_pod').each((index, element) => {
      const title = $(element).find('h3 > a').attr('title');
      const availability = $(element).find('.instock.availability').text().trim();
      mysteryBooks.push({ title, availability });
    });
    console.log('Extracted Mystery Books:');
    console.log(mysteryBooks);
  })
  .catch(error => {
    console.error(`Scraping failed: ${error.message}`);
    if (error.response) {
      // Log additional info if available (like status code 407 Proxy Authentication Required)
      console.error(`Status: ${error.response.status}, Data: ${error.response.data}`);
    }
  });
Using Evomi's rotating residential or mobile proxies means each request likely goes through a different IP address, significantly enhancing your ability to scrape large volumes of data without detection.
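If you'd like to see that rotation for yourself, you can send a few sequential requests through the rotating endpoint and log the exit IP each time. A small sketch, assuming the Evomi credentials above and again using httpbin.org/ip as an example echo endpoint:

const axios = require('axios');

const evomiProxyConfig = {
  protocol: 'http',
  host: 'rp.evomi.com',
  port: 1000,
  auth: {
    username: 'YOUR_EVOMI_USERNAME', // Replace!
    password: 'YOUR_EVOMI_PASSWORD' // Replace!
  }
};

// With a rotating pool, the reported address should change between requests
async function checkRotation(attempts = 3) {
  for (let i = 1; i <= attempts; i++) {
    try {
      const response = await axios.get('http://httpbin.org/ip', { proxy: evomiProxyConfig });
      console.log(`Request ${i}: exit IP ${response.data.origin}`);
    } catch (error) {
      console.error(`Request ${i} failed: ${error.message}`);
    }
  }
}

checkRotation();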
Setting Proxies via Environment Variables
Hardcoding credentials directly into your script is a security risk. If you accidentally share the code or commit it to a public repository, your credentials are exposed. A safer approach is using environment variables.
You can define `HTTP_PROXY` and `HTTPS_PROXY` variables in your terminal before running the script. Axios automatically detects and uses these variables if they are set.
The format typically looks like this: `protocol://username:password@host:port`
For Evomi's residential proxies (HTTP example):
Windows (Command Prompt):
set HTTP_PROXY=http://YOUR_EVOMI_USERNAME:YOUR_EVOMI_PASSWORD@rp.evomi.com:1000
set HTTPS_PROXY=http://YOUR_EVOMI_USERNAME:YOUR_EVOMI_PASSWORD@rp.evomi.com:1001
Linux / macOS (Bash/Zsh):
export HTTP_PROXY="http://YOUR_EVOMI_USERNAME:YOUR_EVOMI_PASSWORD@rp.evomi.com:1000"
export HTTPS_PROXY="http://YOUR_EVOMI_USERNAME:YOUR_EVOMI_PASSWORD@rp.evomi.com:1001"
After setting these variables in your terminal session, you can run your Axios script *without* the explicit `proxy` configuration object in the code. Axios will handle it automatically.
const axios = require('axios');

const targetUrl = 'http://books.toscrape.com/';

// No proxy config needed here if environment variables are set!
axios.get(targetUrl)
  .then(response => {
    console.log('Request successful using environment proxy settings.');
    // process data...
  })
  .catch(error => {
    console.error(`Request failed: ${error.message}`);
  });
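If you prefer keeping the explicit `proxy` object (say, for per-request control) while still keeping credentials out of your source code, you can parse the proxy URL from the environment yourself. Here's a rough sketch, assuming `HTTP_PROXY` is set in the `protocol://username:password@host:port` format shown above:

const axios = require('axios');

// Build an Axios proxy config from the HTTP_PROXY environment variable
function proxyFromEnv() {
  const proxyUrl = process.env.HTTP_PROXY;
  if (!proxyUrl) return false; // `proxy: false` disables proxying in Axios
  const parsed = new URL(proxyUrl);
  return {
    protocol: parsed.protocol.replace(':', ''),
    host: parsed.hostname,
    port: Number(parsed.port),
    auth: {
      username: decodeURIComponent(parsed.username),
      password: decodeURIComponent(parsed.password)
    }
  };
}

axios.get('http://books.toscrape.com/', { proxy: proxyFromEnv() })
  .then(() => console.log('Fetched page via env-configured proxy.'))
  .catch(error => console.error(`Request failed: ${error.message}`));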
Manually Rotating Proxies
If you're not using a service like Evomi's residential or mobile proxies (which handle rotation for you) and instead have a list of individual proxies (perhaps datacenter or static IPs), you might need to implement rotation yourself.
First, define an array containing your proxy configurations:
const myProxyList = [
  { protocol: 'http', host: '203.0.113.1', port: 8080 },
  { protocol: 'http', host: '198.51.100.5', port: 3128 },
  { protocol: 'http', host: '192.0.2.10', port: 9000 },
  // Add more proxies as needed
];
Then, create a function to select a random proxy from this list for each request:
function getRandomProxy(proxyList) {
  if (!proxyList || proxyList.length === 0) {
    return null; // Or handle error appropriately
  }
  const randomIndex = Math.floor(Math.random() * proxyList.length);
  return proxyList[randomIndex];
}
Now, call this function within your Axios request options:
axios.get(targetUrl, { proxy: getRandomProxy(myProxyList) })
  .then(response => {
    // ... process response
  })
  .catch(error => {
    console.error(`Error with randomly selected proxy: ${error.message}`);
    // Consider implementing retry logic here if a proxy fails
  });
Important Note: This simple rotation doesn't handle proxy failures. If a chosen proxy is dead or slow, the request will fail. A robust implementation would require error handling and retry mechanisms, potentially removing failing proxies from the list – complexities that managed services like Evomi abstract away for you.
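To give you a feel for what that involves, here's a rough sketch of a retry wrapper that picks a fresh random proxy on each attempt, reusing `getRandomProxy` and `myProxyList` from above; the attempt count and timeout are illustrative choices, not fixed requirements:

// Retry a GET through a different random proxy on each attempt
async function fetchWithRetry(url, proxyList, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const proxy = getRandomProxy(proxyList);
    try {
      // A timeout stops a dead proxy from hanging the whole run
      return await axios.get(url, { proxy, timeout: 10000 });
    } catch (error) {
      lastError = error;
      console.warn(`Attempt ${attempt} via ${proxy ? proxy.host : 'no proxy'} failed: ${error.message}`);
    }
  }
  throw lastError;
}

fetchWithRetry(targetUrl, myProxyList)
  .then(() => console.log('Fetched successfully after retries.'))
  .catch(error => console.error(`All attempts failed: ${error.message}`));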
Wrapping Up
Integrating proxies into your Axios-based web scraping projects is crucial for large-scale data collection. It protects your IP address and helps bypass restrictions. As shown, Axios makes configuring both basic and authenticated proxies relatively simple.
While manual proxy management is possible, leveraging a reliable provider like Evomi offers significant advantages in terms of performance, reliability, IP rotation, and ease of use, letting you focus on gathering the data you need. Remember, sometimes even a good proxy isn't enough if website defenses are sophisticated. For mimicking human behavior more closely, consider browser automation tools. You can learn more in our guide to scraping with Puppeteer and proxies.

Author
Sarah Whitmore
Digital Privacy & Cybersecurity Consultant
About Author
Sarah is a cybersecurity strategist with a passion for online privacy and digital security. She explores how proxies, VPNs, and encryption tools protect users from tracking, cyber threats, and data breaches. With years of experience in cybersecurity consulting, she provides practical insights into safeguarding sensitive data in an increasingly digital world.