Rust & Selenium: High-Speed Web Scraping with Proxies





David Foster
Scraping Techniques
Diving into Web Scraping with Rust and Selenium
When you think about web scraping, languages like Python or JavaScript usually come to mind first, thanks to their dynamic nature and extensive libraries. However, diving into a statically typed, compiled language like Rust for scraping offers some compelling advantages, particularly in terms of raw performance and memory safety. Plus, let's be honest, tackling a familiar task with a different tool is a great way to learn and flex those coding muscles!
This guide will walk you through scraping dynamic web pages using Rust, powered by the thirtyfour crate – a robust library for browser automation based on the well-known Selenium WebDriver standard.
Why Rust for Scraping? More Than Just Speed
Rust has carved out a niche for itself as a language built for performance and reliability, often finding use in systems programming where efficiency is paramount. But what makes it interesting for web scraping?
Performance: Being a compiled language, Rust often significantly outperforms interpreted languages, which can be beneficial when scraping large amounts of data or dealing with complex processing.
Memory Safety: Rust's famous borrow checker guarantees memory safety without needing a garbage collector, preventing common bugs that can plague concurrent scraping tasks.
Strong Type System: Rust's rich type system, featuring structs, enums, and traits, allows you to model the data you're scraping precisely. This means the compiler becomes your ally, catching potential errors related to data structure and types *before* you even run your scraper, leading to more reliable results.
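For instance, here's a minimal sketch of how the quotes we'll scrape later on could be modeled; the Quote struct and its fields are purely illustrative, not part of any library:

#[derive(Debug, Clone)]
struct Quote {
    text: String,
    author: String,
}

fn main() {
    // Every quote we collect must carry both fields; forgetting one,
    // or misspelling a field name, is a compile-time error rather than
    // a runtime surprise halfway through a long scrape.
    let quote = Quote {
        text: String::from("A day without sunshine is like, you know, night."),
        author: String::from("Steve Martin"),
    };
    println!("{:?}", quote);
}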
What Exactly is Selenium?
Selenium isn't just one thing; it's a suite of open-source tools centered around browser automation. Its core component, the WebDriver API, provides a standardized way to programmatically control web browsers like Chrome, Firefox, or Safari. Think of it as giving your code the ability to perform almost any action a human user could: clicking buttons, filling forms, scrolling, and navigating pages.
While Selenium's primary domain is automated web application testing, its ability to interact with browsers makes it indispensable for scraping dynamic websites. These are sites that rely heavily on JavaScript to load or display content after the initial HTML is received – content that simpler HTTP request libraries often miss.
Tutorial: Scraping Dynamic Content with Rust & Selenium
Let's get hands-on! We'll build a Rust scraper using thirtyfour to extract data from the Scraping Sandbox, specifically their infinitely scrolling quote application. This will cover launching a browser, finding elements, interacting with them, and handling dynamic loading.
Setting Up Your Environment
1. Install Rust
If you haven't already, you'll need Rust installed on your system. Head over to the official Rust website and follow the instructions for your operating system.
2. Get ChromeDriver
Selenium communicates with browsers via a WebDriver executable. For Chrome, you need ChromeDriver. Download the version that matches your installed Chrome version from the ChromeDriver downloads page. Once downloaded, run the executable from your terminal (for example, ./chromedriver on macOS/Linux or chromedriver.exe on Windows). It will start a server, typically on port 9515, which thirtyfour will connect to.
Keep this terminal window open while you run your scraper.
3. Create Your Rust Project
Use Cargo, Rust's package manager, to create a new project:
cargo new rust_selenium_scraper
cd rust_selenium_scraper
Now, open the Cargo.toml file in the project directory and add thirtyfour and tokio (for asynchronous operations) to your dependencies:
[dependencies]
thirtyfour = "0.31.0" # Check for the latest compatible version
tokio = { version = "1", features = ["full"] } # Use the latest 1.x version with full features
With the setup complete, open src/main.rs in your favorite code editor.
Your First Selenium Script in Rust
Let's start with a basic script to ensure everything is connected. This code will launch Chrome, navigate to the target website, pause briefly, and then close the browser.
Paste this into src/main.rs and run cargo run in your terminal:
use thirtyfour::prelude::*;
use tokio::time::{sleep, Duration};
#[tokio::main]
async fn main() -> WebDriverResult<()> {
// Specify Chrome browser capabilities
let caps = DesiredCapabilities::chrome();
// Connect to the running ChromeDriver server
let driver = WebDriver::new("http://localhost:9515", caps).await?;
// Navigate to the target page
driver.goto("http://quotes.toscrape.com/scroll").await?;
println!("Successfully navigated to the page!");
// Pause for a few seconds to observe
sleep(Duration::from_secs(3)).await;
// Close the browser session
driver.quit().await?;
println!("Browser session closed.");
Ok(())
}
Let's break it down:
DesiredCapabilities::chrome() configures the session for Chrome.
WebDriver::new(...) establishes the connection to the ChromeDriver process we started earlier.
driver.goto(...) tells the browser to load the specified URL.
sleep(...) introduces a simple pause.
driver.quit() terminates the browser session cleanly.
If this runs without errors and you see a Chrome window open and close, you're ready for the next step!
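One optional tweak before moving on: if you'd rather not have a Chrome window pop up on every run, you can try running headless. This is a hedged sketch – it assumes your thirtyfour version exposes set_headless() on the Chrome capabilities (check the crate docs if the compiler disagrees):

use thirtyfour::prelude::*;

#[tokio::main]
async fn main() -> WebDriverResult<()> {
    // Assumption: ChromeCapabilities::set_headless() exists in your thirtyfour version.
    let mut caps = DesiredCapabilities::chrome();
    caps.set_headless()?; // run Chrome without opening a visible window

    let driver = WebDriver::new("http://localhost:9515", caps).await?;
    driver.goto("http://quotes.toscrape.com/scroll").await?;
    println!("Page title: {}", driver.title().await?);

    driver.quit().await?;
    Ok(())
}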
Finding and Extracting Data
Now, let's extract the quotes. First, remove the sleep and println! lines related to the pause from the previous step. We'll replace them with logic to find and process the quote elements.
Looking at the target page (http://quotes.toscrape.com/scroll), each quote is contained within a `div` element having the class `quote`.

We can use driver.find_all() with a CSS selector to grab all these elements. Then, we can iterate through them, finding the text and author elements within each quote box using element.find().
Add this code block after the driver.goto(...) line:
// Find all elements with the CSS class "quote"
let quote_elements = driver.find_all(By::Css(".quote")).await?;
// Create a vector to store the results (Quote Text, Author)
let mut collected_quotes: Vec<(String, String)> = Vec::new();
// Loop through each quote element found
for quote_element in quote_elements {
// Find the text span inside the current quote element
let text = quote_element.find(By::Css(".text")).await?.text().await?;
// Find the author span inside the current quote element
let author = quote_element.find(By::Css(".author")).await?.text().await?;
// Add the extracted data as a tuple to our vector
collected_quotes.push((text, author));
}
// Print the collected quotes
println!("\n--- Collected Quotes ---");
for (quote_text, author_name) in &collected_quotes {
println!("\"{}\" - {}", quote_text, author_name);
}
println!("----------------------\n");
Here's the updated `main` function:
use thirtyfour::prelude::*; // Note: Tokio sleep is not needed for this part
// use tokio::time::{sleep, Duration};
#[tokio::main]
async fn main() -> WebDriverResult<()> {
let caps = DesiredCapabilities::chrome();
let driver = WebDriver::new("http://localhost:9515", caps).await?;
driver.goto("http://quotes.toscrape.com/scroll").await?;
println!("Navigated to quotes page.");
// Find all elements with the CSS class "quote"
let quote_elements = driver.find_all(By::Css(".quote")).await?;
println!("Found {} quote elements initially.", quote_elements.len());
// Create a vector to store the results (Quote Text, Author)
let mut collected_quotes: Vec<(String, String)> = Vec::new();
// Loop through each quote element found
for quote_element in quote_elements {
let text = quote_element.find(By::Css(".text")).await?.text().await?;
let author = quote_element.find(By::Css(".author")).await?.text().await?;
collected_quotes.push((text, author));
}
// Print the collected quotes
println!("\n--- Collected Quotes (Initial Load) ---");
for (quote_text, author_name) in &collected_quotes {
println!("\"{}\" - {}", quote_text, author_name);
}
println!("-------------------------------------\n");
driver.quit().await?;
println!("Browser session closed.");
Ok(())
}
Run cargo run again. You should see the first batch of quotes printed to your console.
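If you also want to keep the results around, here's a minimal, standard-library-only sketch for dumping the collected tuples to a text file (the quotes.txt name is arbitrary):

use std::fs;

// Writes each (text, author) pair on its own line.
fn save_quotes(quotes: &[(String, String)]) -> std::io::Result<()> {
    let lines: Vec<String> = quotes
        .iter()
        .map(|(text, author)| format!("{} - {}", text, author))
        .collect();
    fs::write("quotes.txt", lines.join("\n"))
}

You could call save_quotes(&collected_quotes) right before driver.quit(); since main returns a WebDriverResult, prefer .expect("failed to write quotes.txt") over ? to avoid mixing error types.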
Handling Infinite Scroll
The quotes page uses "infinite scrolling" – new quotes load as you scroll down. Our current script only gets the initially loaded quotes. Let's simulate scrolling to load more.
The thirtyfour library provides the element.scroll_into_view().await? method. We can repeatedly find the last quote element on the page and scroll it into view, pausing briefly after each scroll to allow new content to load via JavaScript.
Replace the quote collection block and the print block from the previous step with this loop structure. We'll scroll down a few times and *then* collect all the quotes.
use thirtyfour::prelude::*;
use tokio::time::{sleep, Duration}; // Make sure sleep is imported
#[tokio::main]
async fn main() -> WebDriverResult<()> {
let caps = DesiredCapabilities::chrome();
let driver = WebDriver::new("http://localhost:9515", caps).await?;
driver.goto("http://quotes.toscrape.com/scroll").await?;
println!("Navigated to quotes page.");
let scroll_attempts = 3; // How many times to scroll down
println!("Attempting to scroll down {} times...", scroll_attempts);
for i in 0..scroll_attempts {
// Find all current quote elements
let current_quotes = driver.find_all(By::Css(".quote")).await?;
// Get the last element, if any exist
if let Some(last_quote) = current_quotes.last() {
println!("Scrolling attempt {}: Bringing last quote into view...", i + 1);
last_quote.scroll_into_view().await?;
// Wait a moment for new content to potentially load
sleep(Duration::from_secs(2)).await;
} else {
println!("No quote elements found to scroll to.");
break; // Exit loop if no quotes are found
}
}
println!("Scrolling finished.");
// Now, collect ALL quotes present after scrolling
let all_quote_elements = driver.find_all(By::Css(".quote")).await?;
println!("Found {} quote elements after scrolling.", all_quote_elements.len());
let mut final_quotes: Vec<(String, String)> = Vec::new();
for quote_element in all_quote_elements {
// Use a helper function or inline error handling for potentially missing elements
let text_result = quote_element.find(By::Css(".text")).await;
let author_result = quote_element.find(By::Css(".author")).await;
if let (Ok(text_elem), Ok(author_elem)) = (text_result, author_result) {
let text = text_elem.text().await?;
let author = author_elem.text().await?;
final_quotes.push((text, author));
} else {
println!("Warning: Could not find text or author for a quote element.");
}
}
// Print the final collected quotes
println!("\n--- Collected Quotes (After Scrolling) ---");
for (quote_text, author_name) in &final_quotes {
println!("\"{}\" - {}", quote_text, author_name);
}
println!("Total quotes collected: {}", final_quotes.len());
println!("----------------------------------------\n");
driver.quit().await?;
println!("Browser session closed.");
Ok(())
}
This code first finds the current quote elements, scrolls the last one into view, waits, and repeats. After the loop, it collects all the quotes now present on the page.
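A fixed number of scroll attempts is fine for a demo, but on a real target you often don't know how many scrolls it takes to exhaust the page. One common refinement, sketched below using only the calls already introduced, is to keep scrolling until the element count stops growing (the two-round idle cutoff is an arbitrary choice):

// Scroll until no new quotes have appeared for two consecutive rounds.
let mut previous_count = 0;
let mut idle_rounds = 0;
let max_idle_rounds = 2; // arbitrary cutoff; tune for the target site

while idle_rounds < max_idle_rounds {
    let quotes = driver.find_all(By::Css(".quote")).await?;
    if quotes.len() > previous_count {
        previous_count = quotes.len();
        idle_rounds = 0;
    } else {
        idle_rounds += 1;
    }
    match quotes.last() {
        Some(last_quote) => last_quote.scroll_into_view().await?,
        None => break, // nothing on the page to scroll to
    }
    sleep(Duration::from_secs(2)).await;
}
println!("Stopped scrolling with {} quotes loaded.", previous_count);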
Interacting with Page Elements
Selenium isn't limited to just reading data; it can interact with elements like buttons and forms. Let's try logging into the site's Login page.
We'll need to:
Navigate to the initial page.
Find the "Login" link and click it.
On the login page, find the username and password fields.
Fill them using the element.send_keys("...").await? method.
Find the submit button and click it.
Here's a script demonstrating this (replace the previous `main` function):
use thirtyfour::prelude::*;
use tokio::time::{sleep, Duration};
#[tokio::main]
async fn main() -> WebDriverResult<()> {
let caps = DesiredCapabilities::chrome();
let driver = WebDriver::new("http://localhost:9515", caps).await?;
// Start at the main scroll page (or any page with a login link)
driver.goto("http://quotes.toscrape.com/scroll").await?;
println!("Navigated to initial page.");
// Find the login link - using XPath to find link by its text content
println!("Looking for login link...");
let login_link = driver.find(By::XPath("//a[contains(text(), 'Login')]")).await?;
println!("Found login link, clicking...");
login_link.click().await?;
// Wait briefly for the login page to load
sleep(Duration::from_secs(2)).await;
println!("On login page.");
// Find username field by its ID
println!("Finding username field...");
let username_input = driver.find(By::Css("#username")).await?;
println!("Entering username...");
username_input.send_keys("user").await?; // Example username
// Find password field by its ID
println!("Finding password field...");
let password_input = driver.find(By::Css("#password")).await?;
println!("Entering password...");
password_input.send_keys("password").await?; // Example password
// Find the submit button (input type=submit)
println!("Finding submit button...");
let submit_button = driver.find(By::Css("input[type='submit']")).await?;
println!("Clicking submit...");
submit_button.click().await?;
// Wait to see the result (logged-in page or error)
println!("Submitted login form. Waiting...");
sleep(Duration::from_secs(5)).await;
// Check the current URL or look for an element indicating success/failure (optional)
let current_url = driver.current_url().await?;
println!("Current URL after login attempt: {}", current_url);
driver.quit().await?;
println!("Browser session closed.");
Ok(())
}

This demonstrates how you can automate form submissions, a common task in more complex scraping scenarios.
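To confirm whether the login actually worked, you can look for a change in the page rather than relying on the URL alone. On quotes.toscrape.com a successful login normally replaces the "Login" link with a "Logout" link, so a small hedged check like this (verify the link text yourself by inspecting the page) could go right before driver.quit():

// Treat the presence of a "Logout" link as a rough success signal.
match driver.find(By::XPath("//a[contains(text(), 'Logout')]")).await {
    Ok(_) => println!("Login appears to have succeeded."),
    Err(_) => println!("No 'Logout' link found; the login may have failed."),
}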
Integrating Proxies with Thirtyfour
While web scraping itself is generally legal for publicly accessible data, websites often implement measures to detect and block automated traffic. Repeated, rapid requests from the same IP address are a dead giveaway. Getting blocked can range from a temporary nuisance to a permanent ban of your IP.
This is where proxies come in. A proxy server acts as an intermediary between your scraper and the target website. Requests appear to originate from the proxy's IP, masking your own. If a proxy IP gets blocked, you can simply switch to another one.
You'll find various proxy types available:
Datacenter Proxies: Fast and affordable, good for sites with basic protection.
Residential Proxies: IPs from real home internet connections, harder to detect. Ideal for stricter sites.
Mobile Proxies: IPs from mobile carriers, excellent for mimicking mobile users.
Static ISP Proxies: Residential IPs assigned for your exclusive use.
While free proxies exist, they often come with significant drawbacks: slow speeds, unreliability, limited locations, and potential security risks (some log or even inject data). Paid providers like Evomi offer large pools of ethically sourced proxies with high uptime, speed, and better support. Evomi provides various types, including residential proxies starting from just $0.49/GB, ensuring reliable access.
To configure a proxy in thirtyfour, you modify the DesiredCapabilities before creating the WebDriver instance.
First, ensure you import CapabilitiesHelper:
use thirtyfour::{
common::capabilities::proxy::Proxy, prelude::*, CapabilitiesHelper,
};
Then, set the proxy configuration. Here's how to set an HTTP and HTTPS proxy:
let proxy_url = "http://your-proxy-endpoint:port"; // Replace with your actual proxy address
let mut caps = DesiredCapabilities::chrome();
let mut proxy_config = Proxy::manual();
proxy_config.with_http_proxy(proxy_url)?;
proxy_config.with_ssl_proxy(proxy_url)?; // Often the same as HTTP proxy
caps.set_proxy(proxy_config)?;
// Now create the driver with these capabilities
let driver = WebDriver::new("http://localhost:9515", caps).await?;
Important Note on Authenticated Proxies: Many proxy providers require username/password authentication. Selenium WebDriver itself doesn't have a straightforward way to handle the authentication pop-ups that might appear. The standard proxy configuration shown above works best with proxies authenticated via IP whitelisting. Most providers, including Evomi, allow you to register your machine's IP address in their dashboard. Once whitelisted, you can connect to the proxy endpoint without needing to embed credentials in the URL, bypassing the authentication challenge for Selenium.
If you must use username/password authentication, you might need to format the `proxy_url` like `http://username:password@your-proxy-endpoint:port`. However, be aware this might not work reliably with all WebDriver/browser combinations due to the limitations mentioned.
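If you do decide to try embedding credentials, here's a small sketch of assembling such a URL; the username, password, and endpoint are placeholders, and, as noted above, some WebDriver/browser combinations may simply ignore or reject this form:

// Placeholders only - substitute your real credentials and endpoint.
let username = "your-username";
let password = "your-password";
let endpoint = "your-proxy-endpoint:port";

// Produces e.g. "http://your-username:your-password@your-proxy-endpoint:port".
// Special characters in the credentials would additionally need percent-encoding.
let proxy_url = format!("http://{}:{}@{}", username, password, endpoint);
println!("Proxy URL to pass to with_http_proxy: {}", proxy_url);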
Here's a full example script that sets up a proxy (using a placeholder - replace it with your Evomi endpoint) and visits an IP checking site like Evomi's free IP Geolocation Checker to verify the proxy is working:
use thirtyfour::{
common::capabilities::proxy::Proxy, prelude::*, CapabilitiesHelper,
};
use tokio::time::{sleep, Duration};
#[tokio::main]
async fn main() -> WebDriverResult<()> {
// --- Proxy Configuration ---
// Replace with your actual proxy endpoint (authentication not yet set up!)
let proxy_address = "http://rp.evomi.com:1000"; // Example Evomi HTTP endpoint for residential
let mut caps = DesiredCapabilities::chrome();
let mut proxy_settings = Proxy::manual();
// Configure for HTTP and HTTPS traffic
proxy_settings.with_http_proxy(proxy_address)?;
proxy_settings.with_ssl_proxy(proxy_address)?; // Use same proxy for HTTPS
// Apply proxy settings to capabilities
caps.set_proxy(proxy_settings)?;
println!("Proxy configured: {}", proxy_address);
// --- End Proxy Configuration ---
println!("Connecting to WebDriver...");
let driver = WebDriver::new("http://localhost:9515", caps).await?;
println!("WebDriver connection established.");
// Navigate to an IP checking service
let check_url = "https://geo.evomi.com/";
println!("Navigating to IP checker: {}", check_url);
driver.goto(check_url).await?;
// Wait for page elements to load (adjust timing as needed)
sleep(Duration::from_secs(5)).await;
// Try to find the displayed IP address (Selector might vary depending on the site)
// For geo.evomi.com, the IP is usually in a specific element. Let's try a common pattern.
// NOTE: Inspect geo.evomi.com manually to find the correct selector if this fails.
let ip_element_result = driver.find(By::Css(".ip-address-val")).await; // Adjust selector as needed!
match ip_element_result {
Ok(ip_element) => {
let displayed_ip = ip_element.text().await?;
println!("IP Address displayed on page: {}", displayed_ip);
}
Err(_) => {
println!(
"Could not find the IP address element automatically. Please check the page manually."
);
// Optional: Grab the whole body text as a fallback
let body_text = driver.find(By::Css("body")).await?.text().await?;
println!("Page body content:\n{}", body_text);
}
}
driver.quit().await?;
println!("Browser session closed.");
Ok(())
}
Run this code after replacing the placeholder with your proxy details. The output should show the IP address of your proxy server, confirming it's routing the browser's traffic.
Wrapping Up
While maybe not the absolute mainstream choice for every scraping project, Rust paired with `thirtyfour` offers a potent combination for building fast, reliable, and type-safe web scrapers, especially when dealing with dynamic content. The learning curve might be steeper than Python or JavaScript initially, particularly around async and the borrow checker, but the performance and safety benefits can be significant for demanding tasks.
The proxy integration challenge highlights one area where the ecosystem might feel less mature than, say, Python's rich set of specialized libraries. However, using IP whitelisting provides a solid workaround.
If you enjoyed this, consider tackling a more complex dynamic site. Our guide on dynamic scraping with Python and Selenium covers techniques that can often be adapted, even if the language specifics differ. Happy scraping!

Author
David Foster
Proxy & Network Security Analyst
About Author
David is an expert in network security, web scraping, and proxy technologies, helping businesses optimize data extraction while maintaining privacy and efficiency. With a deep understanding of residential, datacenter, and rotating proxies, he explores how proxies enhance cybersecurity, bypass geo-restrictions, and power large-scale web scraping. David’s insights help businesses and developers choose the right proxy solutions for SEO monitoring, competitive intelligence, and anonymous browsing.