Build a Golang Web Scraper & Avoid Blocks Using Proxies

Sarah Whitmore

Last edited on May 15, 2025

Scraping Techniques

Harnessing Go for Web Scraping: Building Efficient Scrapers and Bypassing Blocks

Combining Go's straightforward syntax with its impressive performance makes it a compelling choice for web scraping tasks.

Often referred to as Golang, this programming language is celebrated for its efficiency and ease of use. In the realm of web scraping, it can even offer performance advantages over established languages like Python, JavaScript (Node.js), or Ruby.

This guide will walk you through creating a Golang web scraper step-by-step, focusing on how to prevent getting blocked. We'll utilize established tools like the Go implementation of Playwright alongside reliable proxy solutions, such as Evomi's residential proxies.

Let's dive in!

What's the Point of a Web Scraper?

Web scrapers are tools designed for large-scale data collection from websites. Common applications include tracking competitor pricing, aggregating news articles, monitoring job boards, checking product availability, analyzing customer sentiment from reviews, and much more.

Essentially, web scrapers automate actions a human would perform manually in a browser. This allows you to not only extract data but also interact with websites by clicking elements, submitting forms, navigating pages, and even capturing screenshots for visual records.

Understanding Web Scraping with Golang

Web scraping in Golang involves writing Go programs to automatically retrieve data from websites, particularly when that data isn't offered through a formal API. Your Go application mimics how a user interacts with a site, allowing you to programmatically access and process information displayed in a browser.

Is Go a Good Fit for Web Scraping?

Absolutely. Go presents a strong case for web scraping projects. Benchmarks often show it outperforming Python or Ruby on comparable workloads, and its relatively simple syntax lowers the barrier to entry for developers.

Ultimately, the "best" language for web scraping depends on the specific project requirements and the developer's familiarity with the ecosystem. However, Go's combination of speed and simplicity makes it a worthy contender.

Building Your First Golang Web Scraper: The Basics

Creating a web scraper in Go involves using libraries that handle connecting to websites, fetching their HTML content (or rendering the page like a browser), and then parsing that content to extract the desired information.

A powerful approach involves using a library like Playwright for Go. This allows your scraper to control a headless browser – a real web browser running without a graphical interface. Your code drives the browser, letting it load pages completely, including JavaScript-rendered content, just as a human visitor would. This makes scraping more robust and harder to detect compared to simple HTTP request libraries.

What About Gocolly?

Gocolly is another well-known web scraping framework specifically for Go. It's recognized for its speed and ease of use, allowing developers to quickly build web crawlers and scrapers. However, Gocolly primarily focuses on static HTML content. If the target website relies heavily on JavaScript to load data dynamically, you might need more complex workarounds, like analyzing network requests manually.
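
For a quick sense of the contrast, here is a minimal Gocolly sketch (the target URL is just an illustration) that prints every h1 heading on a static page. It fetches raw HTML only, so JavaScript-rendered content never appears:

package main

import (
	"fmt"
	"log"

	"github.com/gocolly/colly/v2"
)

func main() {
	// Create a collector for fetching and parsing static HTML
	c := colly.NewCollector()

	// Register a callback that fires for every matching element
	c.OnHTML("h1", func(e *colly.HTMLElement) {
		fmt.Println("Heading:", e.Text)
	})

	// Visit downloads the page and triggers the callbacks.
	// No browser is involved, so no JavaScript executes.
	if err := c.Visit("https://example.com/"); err != nil {
		log.Fatalf("Visit failed: %v", err)
	}
}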

Enter Golang Playwright

Playwright for Go is a community-maintained port of Microsoft's powerful Playwright browser automation library. It provides a Go API to control browsers like Chromium, Firefox, and WebKit programmatically.

This enables you to write Go code that interacts with web pages just like a user: navigating, clicking, typing, and extracting data from the fully rendered page. This method is highly flexible and significantly reduces the chances of being blocked, as the interactions closely mimic real user behavior.

Strategies to Avoid Web Scraping Blocks

Combining a headless browser with high-quality residential proxies is a very effective strategy to avoid getting blocked during web scraping.

While web scraping itself is generally legal for publicly accessible data, website operators often implement measures to detect and block automated traffic. They look for patterns that distinguish bots from humans, such as unusual request headers, inability to render JavaScript correctly, or an abnormally high request rate from a single IP address.

Using a headless browser like Playwright helps your scraper appear more human because it *is* a real browser, sending standard headers and executing JavaScript. Websites have a harder time differentiating this traffic from genuine users.

However, sending many requests from the same IP address is still a major red flag. This is where proxies come in. Using a service like Evomi's residential proxies allows you to route your requests through a vast pool of IP addresses belonging to real devices worldwide. Each request can appear to come from a different user, making it extremely difficult for websites to track your scraping activity based on IP address alone. Evomi prides itself on ethically sourced proxies and competitive pricing (starting at just $0.49/GB for residential), offering a reliable way to scale your scraping operations responsibly.

Golang Web Scraper: A Practical Walkthrough

Let's build a basic Golang web scraper using Playwright.

Here’s the plan:

  • Install Go and set up your development environment.

  • Initialize a Go module.

  • Install the Playwright for Go library.

  • Write the basic Go code to control the browser.

  • Take a simple screenshot.

  • Integrate proxies for anonymity.

  • Extract specific data points.

  • Simulate clicking and typing.

Setting Up Your Go Environment

First, you need Go installed. On macOS via Homebrew:
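
brew install go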

On Windows via Chocolatey:
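
choco install golang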

For Linux or other systems, download the appropriate package from the official Go downloads page.

Verify the installation by opening your terminal and running:
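
go version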

To run a Go file (e.g., `main.go`):
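
go run main.go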

You'll also need a code editor. Visual Studio Code (VS Code) is a popular choice. Install the official Go extension from the Extensions marketplace:

Installing the Go extension in VS Code

Now we're ready to code!

Initializing Your Go Module

Go projects are organized into modules. You need a `go.mod` file to define your module and manage dependencies.

Navigate to your Go workspace (often `$HOME/go`), create a new directory for this project (e.g., `myscraper`), change into that directory in your terminal, and run:
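
go mod init evomi.com/myscraper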

You can replace `evomi.com/myscraper` with your preferred module path. This creates the `go.mod` file.

Installing Playwright for Go

Next, add the Playwright library to your module:

go get github.com/playwright-community/playwright-go

Playwright also needs browser binaries. Install them (including dependencies) with this command:

go run github.com/playwright-community/playwright-go/cmd/playwright install --with-deps

Taking a Website Screenshot

Create a file named `main.go` in your `myscraper` directory.

Add the following code:

package main

import (
	"fmt" // Import fmt for printing output later
	"log"

	"github.com/playwright-community/playwright-go"
)

func main() {
	// Start Playwright
	pw, err := playwright.Run()
	if err != nil {
		log.Fatalf("Could not start Playwright: %v", err)
	}

	// Launch the Chromium browser (headless by default)
	// You could also use pw.Firefox.Launch() or pw.WebKit.Launch()
	browser, err := pw.Chromium.Launch()
	if err != nil {
		log.Fatalf("Could not launch browser: %v", err)
	}

	// Open a new page (tab)
	page, err := browser.NewPage()
	if err != nil {
		log.Fatalf("Could not create page: %v", err)
	}

	// Navigate to a target URL (e.g., Evomi's IP checker)
	targetURL := "https://geo.evomi.com/"
	fmt.Printf("Navigating to %s...\n", targetURL)
	if _, err = page.Goto(targetURL, playwright.PageGotoOptions{
		WaitUntil: playwright.WaitUntilStateNetworkidle, // Wait for network activity to settle
	}); err != nil {
		log.Fatalf("Could not navigate to %s: %v", targetURL, err)
	}
	fmt.Println("Navigation successful!")

	// Take a screenshot
	screenshotPath := "page_screenshot.png"
	fmt.Printf("Taking screenshot and saving to %s...\n", screenshotPath)
	if _, err = page.Screenshot(playwright.PageScreenshotOptions{
		Path: playwright.String(screenshotPath), // Specify the file path
	}); err != nil {
		log.Fatalf("Could not take screenshot: %v", err)
	}
	fmt.Println("Screenshot saved!")

	// Close the browser
	if err = browser.Close(); err != nil {
		log.Fatalf("Could not close browser: %v", err)
	}

	// Stop Playwright
	if err = pw.Stop(); err != nil {
		log.Fatalf("Could not stop Playwright: %v", err)
	}
	fmt.Println("Scraper finished successfully.")
}

Code Breakdown:

  • package main: Declares the package as the main executable.

  • import (...): Imports necessary libraries (logging, fmt for printing, and Playwright).

  • func main() { ... }: The main entry point of the program.

  • pw, err := playwright.Run(): Initializes Playwright. Go functions often return a result and an error. The := syntax declares and initializes variables.

  • if err != nil { ... }: Standard Go error handling pattern. Checks if the preceding operation failed.

  • browser, err := pw.Chromium.Launch(): Starts a Chromium browser instance.

  • page, err := browser.NewPage(): Creates a new browser tab.

  • page.Goto(...): Navigates the page to the specified URL. WaitUntilStateNetworkidle is often useful for pages with dynamic loading.

  • page.Screenshot(...): Captures the current view of the page and saves it to a file. Note the use of playwright.String() to pass string options.

  • browser.Close(): Closes the browser instance.

  • pw.Stop(): Shuts down the Playwright process.

Run the scraper:
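
go run main.go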

If successful, you'll see console output and a `page_screenshot.png` file in your directory showing the loaded page.

Example screenshot of a web page showing an IP address


Integrating Evomi Proxies with Playwright

To route your scraper's traffic through proxies, you configure them when launching the browser.

First, you'll need proxy credentials from your provider. With Evomi, you can easily configure and access your proxy details (like endpoint, port, username, password) in your client dashboard. Evomi offers various proxy types, including residential proxies which are excellent for mimicking real users.


Modify the browser launch part of your `main.go` file. Replace:

browser, err := pw.Chromium.Launch()

With this code block, inserting your specific Evomi proxy details:

// Configure proxy settings (Replace with your actual Evomi credentials)
proxyServer := "rp.evomi.com:1000" // Example: Evomi Residential HTTP endpoint
proxyUsername := "YOUR_EVOMI_USERNAME"
proxyPassword := "YOUR_EVOMI_PASSWORD"
proxySettings := playwright.BrowserTypeLaunchOptionsProxy{
	Server:   playwright.String(proxyServer),
	Username: playwright.String(proxyUsername),
	Password: playwright.String(proxyPassword),
}

// Launch the browser with proxy settings
browser, err := pw.Chromium.Launch(playwright.BrowserTypeLaunchOptions{
	Proxy: &proxySettings, // Pass the proxy config as a pointer
})
if err != nil {
	log.Fatalf("Could not launch browser with proxy: %v", err)
}

Key changes:

  • We define a proxySettings struct holding the server address, username, and password. Remember to replace the placeholders with your actual Evomi details. Use the correct endpoint and port for your chosen proxy type (e.g., rp.evomi.com:1000 for Residential HTTP).

  • We pass a pointer to this struct (&proxySettings) to the Proxy field within playwright.BrowserTypeLaunchOptions when calling Launch().

Run `go run main.go` again. If you used an IP checking site like `https://geo.evomi.com/` or `https://ipv4.icanhazip.com/` as the target URL, the new screenshot should now display the IP address assigned by the Evomi proxy, not your own.

Example screenshot showing a proxy IP address

Playwright offers many other launch options for customizing browser behavior (e.g., user agent, viewport size, geolocation) and screenshot options (e.g., full page, specific element captures).
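
As a rough sketch of those options (the field names assume a recent playwright-go release; check your version's documentation), you could create a browser context with a custom user agent and viewport, then capture a full-page screenshot:

// Create a context with a custom user agent and viewport size
context, err := browser.NewContext(playwright.BrowserNewContextOptions{
	UserAgent: playwright.String("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"),
	Viewport:  &playwright.Size{Width: 1366, Height: 768},
})
if err != nil {
	log.Fatalf("Could not create context: %v", err)
}
page, err := context.NewPage()
if err != nil {
	log.Fatalf("Could not create page: %v", err)
}
// Navigate with page.Goto(...) before the screenshot, as earlier.
// FullPage captures the whole scrollable page, not just the viewport.
if _, err = page.Screenshot(playwright.PageScreenshotOptions{
	Path:     playwright.String("full_page.png"),
	FullPage: playwright.Bool(true),
}); err != nil {
	log.Fatalf("Could not take screenshot: %v", err)
}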

Extracting Data from Web Elements

Scraping usually involves extracting text or other data from specific HTML elements.

Playwright lets you select elements using CSS selectors (or other methods like XPath) and then retrieve their properties, such as text content.

Let's modify the code to extract the main heading from the Playwright documentation site. Replace the screenshot block with the following, keeping the `browser.Close()` and `pw.Stop()` calls after it:

// Navigate to Playwright's site for data extraction example
dataURL := "https://playwright.dev/docs/intro"
fmt.Printf("Navigating to %s for data extraction...\n", dataURL)
if _, err = page.Goto(dataURL, playwright.PageGotoOptions{
	WaitUntil: playwright.WaitUntilStateNetworkidle,
}); err != nil {
	log.Fatalf("Could not navigate to %s: %v", dataURL, err)
}

// Select the main heading element with a tag selector
headingSelector := "h1" // Simple selector for the main heading
fmt.Printf("Selecting element with selector: '%s'\n", headingSelector)
headingElement, err := page.QuerySelector(headingSelector)
if err != nil {
	log.Fatalf("Could not find element with selector '%s': %v", headingSelector, err)
}
if headingElement == nil {
	log.Fatalf("Element with selector '%s' not found.", headingSelector)
}

// Extract the text content
textContent, err := headingElement.TextContent()
if err != nil {
	log.Fatalf("Could not get text content: %v", err)
}

// Print the extracted text
fmt.Printf("Extracted Heading Text: %s\n", textContent)

// Remember to add back the browser closing and Playwright stopping code:
// if err = browser.Close(); err != nil { ... }
// if err = pw.Stop(); err != nil { ... }

This snippet navigates to a page, uses `page.QuerySelector()` to find the first element matching the `h1` CSS selector, retrieves its text using `TextContent()`, and prints it. For multiple elements, you'd use `page.QuerySelectorAll()`.

Console output showing extracted text from a web page

You can then store this extracted data (`textContent`) in variables, write it to files, or send it to a database.
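
If you need every match rather than just the first, here's a quick sketch using `page.QuerySelectorAll()` (the `a` selector is just an illustration):

// Collect the text of every link on the page
links, err := page.QuerySelectorAll("a")
if err != nil {
	log.Fatalf("Could not query links: %v", err)
}
for i, link := range links {
	linkText, err := link.TextContent()
	if err != nil {
		continue // skip elements whose text can't be read
	}
	fmt.Printf("Link %d: %s\n", i, linkText)
}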

Interacting with Pages: Clicks and Form Input

Playwright allows comprehensive interaction, simulating almost anything a user can do: clicking buttons, filling forms, scrolling, keyboard input, etc.

Similar to data extraction, you first select the element and then call an interaction method like `Click()` or `Fill()`.

Here's an example demonstrating clicking the search icon on the Playwright site and typing into the search box (add this before the browser closing calls):

// Example: Clicking the search button and typing into the input

// Selector for the search button
searchButtonSelector := ".DocSearch-Button"
fmt.Printf("Clicking element: '%s'\n", searchButtonSelector)
searchButton, err := page.QuerySelector(searchButtonSelector)
if err != nil {
	log.Fatalf("Error finding search button: %v", err)
}
if searchButton == nil {
	log.Fatalf("Search button element '%s' not found", searchButtonSelector)
}

// Click the button
if err = searchButton.Click(); err != nil {
	log.Fatalf("Error clicking search button: %v", err)
}
fmt.Println("Search button clicked.")

// Short pause to allow search modal to appear (adjust as needed)
page.WaitForTimeout(500) // Wait 500 milliseconds

// Selector for the search input field that appears
searchInputSelector := "#docsearch-input"
fmt.Printf("Typing into element: '%s'\n", searchInputSelector)
searchInput, err := page.QuerySelector(searchInputSelector)
if err != nil {
	log.Fatalf("Error finding search input: %v", err)
}
if searchInput == nil {
	log.Fatalf("Search input element '%s' not found", searchInputSelector)
}

// Fill the input field
searchText := "page object model"
if err = searchInput.Fill(searchText); err != nil {
	log.Fatalf("Error filling search input: %v", err)
}
fmt.Printf("Filled search input with: '%s'\n", searchText)

// Take a final screenshot to see the result
finalScreenshotPath := "interaction_screenshot.png"
fmt.Printf("Taking final screenshot: %s\n", finalScreenshotPath)
if _, err = page.Screenshot(playwright.PageScreenshotOptions{
	Path: playwright.String(finalScreenshotPath),
}); err != nil {
	log.Printf("Warning: could not take final screenshot: %v", err) // Log warning instead of fatal error
} else {
	fmt.Println("Final screenshot saved.")
}

// Remember to include browser.Close() and pw.Stop() after this block

This code selects the search button, clicks it, waits briefly, selects the search input field (which becomes visible after the click), and types text into it using `Fill()`. A final screenshot confirms the action.

Screenshot showing text typed into a search field by the Go scraper

Wrapping Up

You've now seen how to construct a Golang web scraper using the Playwright library. We covered setting up Go, installing dependencies, navigating pages, taking screenshots, integrating Evomi proxies to avoid blocks, extracting data, and simulating user interactions like clicks and typing.

This foundation opens the door to more complex scraping tasks: handling pagination, managing multiple concurrent scraping jobs (where Go's concurrency excels), saving data systematically, recording videos of scraping sessions, and executing custom JavaScript within the page context. Combined with reliable, ethically sourced proxies like those from Evomi, you can build powerful and sustainable data gathering solutions.
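
To illustrate that concurrency point, here is a rough sketch (not production code) that scrapes a couple of illustrative URLs in parallel, one goroutine and one page each, assuming an already-launched `browser` and a `sync` import added to the file:

// Scrape several URLs concurrently, one page per goroutine
urls := []string{"https://example.com/a", "https://example.com/b"}
var wg sync.WaitGroup
for _, u := range urls {
	wg.Add(1)
	go func(target string) {
		defer wg.Done()
		page, err := browser.NewPage() // each goroutine gets its own page
		if err != nil {
			log.Printf("page for %s: %v", target, err)
			return
		}
		defer page.Close()
		if _, err := page.Goto(target); err != nil {
			log.Printf("goto %s: %v", target, err)
			return
		}
		title, _ := page.Title()
		log.Printf("%s -> %s", target, title)
	}(u)
}
wg.Wait()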

Happy scraping!

Author

Sarah Whitmore

Digital Privacy & Cybersecurity Consultant

Sarah is a cybersecurity strategist with a passion for online privacy and digital security. She explores how proxies, VPNs, and encryption tools protect users from tracking, cyber threats, and data breaches. With years of experience in cybersecurity consulting, she provides practical insights into safeguarding sensitive data in an increasingly digital world.
