Scraping Google Places: Discover Suppliers at Scale Without Getting Banned


Google Maps is the most comprehensive public database of local businesses on the planet. For supplier discovery, market research, lead generation, or competitive intelligence, the ability to systematically extract business data from Google Places is genuinely valuable.
It's also technically and legally nuanced. Google's Terms of Service prohibit automated scraping of Maps data. Their official Places API charges per request and has quota limits. And their anti-bot infrastructure is among the best in the industry.
This guide covers the practical approaches, using the official API where appropriate, and understanding the trade-offs of other methods.
Before proceeding: Review Google's Terms of Service and your use case's legal context. For internal business intelligence and non-commercial research, the risk profile is different from building a commercial product that redistributes Google's data. This guide is for engineers making informed decisions, not a recommendation to violate any platform's terms.
The Three Paths
Path 1: The Official Places API
Google's Places API is the legitimate path. For use cases where the volume and cost work, it's the cleanest option. No legal exposure, no detection risk, stable schema, SLA-backed.
Pricing (as of 2026):
Nearby Search: $0.032 per request (first 100K/month)
Place Details: $0.017 per request
Text Search: $0.032 per request
For discovering 10,000 suppliers: Nearby Search returns at most 20 results per page and 60 per query across three pages, so expect roughly 500 paginated requests (~$16) plus 10,000 Place Details requests (~$170), about $185 in total. Reasonable for a one-time intelligence project.
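If you want to re-run that estimate for other volumes, a back-of-envelope helper is enough. This is a rough sketch using the per-request prices listed above and the 60-results-per-query Nearby Search ceiling; the 10,000 target is just an example input.

import math

NEARBY_SEARCH_COST = 0.032   # per request, first 100K/month
PLACE_DETAILS_COST = 0.017   # per request
RESULTS_PER_QUERY = 60       # Nearby Search cap: 20 results per page, 3 pages

def estimate_cost(target_suppliers: int) -> float:
    queries = math.ceil(target_suppliers / RESULTS_PER_QUERY)
    nearby_requests = queries * 3          # each query is paginated three times
    details_requests = target_suppliers    # one Place Details call per supplier
    return nearby_requests * NEARBY_SEARCH_COST + details_requests * PLACE_DETAILS_COST

print(f"${estimate_cost(10_000):.2f}")  # roughly $186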
import asyncio
import math

import httpx

API_KEY = "YOUR_GOOGLE_PLACES_API_KEY"
BASE_URL = "https://maps.googleapis.com/maps/api"


async def nearby_search(
    lat: float,
    lng: float,
    radius_meters: int,
    keyword: str,
    next_page_token: str | None = None,
) -> dict:
    """Search for places near a location."""
    params = {
        "location": f"{lat},{lng}",
        "radius": radius_meters,
        "keyword": keyword,
        "key": API_KEY,
    }
    if next_page_token:
        # A page token replaces all other search parameters.
        params = {"pagetoken": next_page_token, "key": API_KEY}
    async with httpx.AsyncClient() as client:
        response = await client.get(f"{BASE_URL}/place/nearbysearch/json", params=params)
        response.raise_for_status()
        return response.json()


async def get_place_details(place_id: str) -> dict:
    """Get full details for a specific place."""
    params = {
        "place_id": place_id,
        "fields": "name,formatted_address,formatted_phone_number,website,"
                  "rating,user_ratings_total,business_status,types",
        "key": API_KEY,
    }
    async with httpx.AsyncClient() as client:
        response = await client.get(f"{BASE_URL}/place/details/json", params=params)
        response.raise_for_status()
        return response.json()


async def discover_suppliers(
    locations: list[tuple[float, float]],
    keyword: str,
    radius_km: int = 10,
) -> list[dict]:
    """Discover suppliers across multiple geographic points."""
    all_places = []
    for lat, lng in locations:
        data = await nearby_search(lat, lng, radius_km * 1000, keyword)
        places = data.get('results', [])
        all_places.extend(places)

        # Handle pagination
        next_token = data.get('next_page_token')
        while next_token:
            await asyncio.sleep(2)  # Required: token needs time to activate
            data = await nearby_search(lat, lng, radius_km * 1000, keyword, next_token)
            all_places.extend(data.get('results', []))
            next_token = data.get('next_page_token')

    # Deduplicate by place_id
    seen = set()
    unique_places = []
    for place in all_places:
        if place['place_id'] not in seen:
            seen.add(place['place_id'])
            unique_places.append(place)
    return unique_places


# Grid search: cover a metro area systematically
def generate_grid(
    center_lat: float,
    center_lng: float,
    radius_km: float,
    grid_spacing_km: float,
) -> list[tuple[float, float]]:
    """Generate a grid of lat/lng points to cover an area."""
    points = []
    lat_step = grid_spacing_km / 111.0  # ~111 km per degree of latitude
    lng_step = grid_spacing_km / (111.0 * math.cos(math.radians(center_lat)))
    steps = int(radius_km / grid_spacing_km)
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            points.append((
                center_lat + i * lat_step,
                center_lng + j * lng_step,
            ))
    return points
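A minimal driver for the functions above might look like the following sketch. The Chicago coordinates, keyword, and grid spacing are illustrative values, and attaching place_id to each detail record is a convention introduced here so the merge step later has a join key.

async def main():
    # Cover greater Chicago with a 5 km grid and look for metal fabricators
    grid = generate_grid(center_lat=41.88, center_lng=-87.63,
                         radius_km=25, grid_spacing_km=5)
    places = await discover_suppliers(grid, keyword="metal fabrication", radius_km=5)

    details = []
    for place in places:
        detail = await get_place_details(place['place_id'])
        # Keep place_id alongside the detail payload for the merge step later
        record = detail.get('result', {})
        record['place_id'] = place['place_id']
        details.append(record)

    print(f"Found {len(places)} unique places, fetched {len(details)} detail records")
    return places, details

if __name__ == "__main__":
    asyncio.run(main())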
Path 2: The Places API New (Updated 2024)
Google updated their Places API in 2024 with a new endpoint structure and expanded data fields. The new API supports richer queries:
async def places_text_search_new(query: str, location_bias: dict) -> dict:
    """Use the new Places API Text Search endpoint."""
    headers = {
        "Content-Type": "application/json",
        "X-Goog-Api-Key": API_KEY,
        "X-Goog-FieldMask": (
            "places.displayName,places.formattedAddress,"
            "places.nationalPhoneNumber,places.websiteUri,"
            "places.rating,places.userRatingCount"
        ),
    }
    body = {
        "textQuery": query,
        "locationBias": location_bias,
        "maxResultCount": 20,
    }
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://places.googleapis.com/v1/places:searchText",
            json=body,
            headers=headers,
        )
        response.raise_for_status()
        return response.json()
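Calling it with a circular location bias looks roughly like this sketch; the Rotterdam coordinates and 20 km radius are placeholder values, and the circle structure follows the request format of the new API.

async def find_packaging_suppliers():
    # Bias results toward a 20 km circle around Rotterdam (illustrative values)
    location_bias = {
        "circle": {
            "center": {"latitude": 51.92, "longitude": 4.48},
            "radius": 20000.0,
        }
    }
    data = await places_text_search_new("packaging suppliers", location_bias)
    for place in data.get("places", []):
        print(place["displayName"]["text"], "|", place.get("formattedAddress"))

asyncio.run(find_packaging_suppliers())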
Path 3: Browser-Based Extraction (When API Limits or Cost Are Constraints)
When the official API's cost becomes prohibitive at scale, many data teams fall back to browser-based extraction of publicly visible Google Maps data. Here is the technical approach, with the legal caveats noted above:
import asyncio

from playwright.async_api import async_playwright

EVOMI_PROXY = {
    "server": "http://rp.evomi.com:1000",
    "username": "USERNAME",
    "password": "PASSWORD",
}


async def extract_places_from_search(
    search_query: str,
    location: str,
    max_results: int = 50,
) -> list[dict]:
    """
    Extract business listings from Google Maps search results.
    Note: This approach is subject to Google's ToS — review before use.
    """
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            args=['--disable-blink-features=AutomationControlled'],
        )
        context = await browser.new_context(
            proxy=EVOMI_PROXY,
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                       "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        )
        page = await context.new_page()

        search_url = f"https://www.google.com/maps/search/{search_query}+{location}"
        await page.goto(search_url)
        await page.wait_for_selector('[role="feed"]', timeout=10000)

        # Scroll the results feed to load more listings
        feed = page.locator('[role="feed"]')
        results = []
        processed_items = 0

        while len(results) < max_results:
            # Extract currently visible results
            items = await page.query_selector_all('[role="feed"] > div[jsaction]')
            for item in items[processed_items:]:
                try:
                    name = await item.query_selector('div.fontHeadlineSmall')
                    rating = await item.query_selector('span[aria-label*="stars"]')
                    address = await item.query_selector('div.W4Efsd:last-child')
                    if name:
                        results.append({
                            'name': await name.inner_text(),
                            'rating': await rating.get_attribute('aria-label') if rating else None,
                            'address_snippet': await address.inner_text() if address else None,
                        })
                except Exception:
                    continue
            # Track how many feed items have been parsed (not how many results
            # were kept), so items are never re-processed or skipped.
            processed_items = len(items)

            # Scroll the feed to load more
            await feed.evaluate('el => el.scrollTop += 800')
            await asyncio.sleep(1.5)

            # Check if we've reached the end of the list
            end_marker = await page.query_selector('div.HlvSq')
            if end_marker:
                break

        await browser.close()
        return results[:max_results]
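Running the extractor is a single asyncio call. In this sketch the query and location strings are example values, and they are URL-encoded with quote_plus before being placed in the search URL so spaces and special characters survive.

from urllib.parse import quote_plus

async def run_extraction():
    listings = await extract_places_from_search(
        search_query=quote_plus("industrial fasteners"),
        location=quote_plus("Detroit MI"),
        max_results=60,
    )
    print(f"Extracted {len(listings)} listings")
    for row in listings[:5]:
        print(row['name'], row.get('rating'))

asyncio.run(run_extraction())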
Building the Supplier Intelligence Database
Once you have raw place data, the enrichment pipeline matters as much as the collection:
import pandas as pd
from google.cloud import bigquery


def build_supplier_dataset(
    raw_places: list[dict],
    detail_records: list[dict],
) -> pd.DataFrame:
    """Merge place summaries with detailed records."""
    details_by_id = {r['place_id']: r for r in detail_records}

    rows = []
    for place in raw_places:
        detail = details_by_id.get(place['place_id'], {})
        rows.append({
            'place_id': place['place_id'],
            'name': place.get('name'),
            'lat': place['geometry']['location']['lat'],
            'lng': place['geometry']['location']['lng'],
            'rating': place.get('rating'),
            'review_count': place.get('user_ratings_total'),
            'phone': detail.get('formatted_phone_number'),
            'website': detail.get('website'),
            'address': detail.get('formatted_address'),
            'types': ','.join(detail.get('types', [])),
            'business_status': detail.get('business_status'),
        })
    return pd.DataFrame(rows)


def load_to_bigquery(df: pd.DataFrame, table_id: str):
    client = bigquery.Client()
    job = client.load_table_from_dataframe(df, table_id)
    job.result()
    print(f"Loaded {len(df)} suppliers to {table_id}")
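Wiring the pieces together might look like this sketch. The BigQuery table ID is a placeholder, the closed-business filter is an optional cleanup step added here, and detail_records is assumed to carry a place_id key as in the earlier Path 1 driver.

# places and details come from the discovery step (see the Path 1 driver above)
df = build_supplier_dataset(raw_places=places, detail_records=details)

# Optional cleanup: drop permanently closed businesses before loading
df = df[df['business_status'] != 'CLOSED_PERMANENTLY']

load_to_bigquery(df, table_id="my_project.supplier_intel.google_places")  # placeholder table ID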
Geo-Targeting Your Proxy Layer
For Google Maps specifically, the results vary significantly by the geographic origin of the request. A search for "industrial suppliers" from a US IP returns US businesses. The same search from a DE IP returns German businesses — even if you specify a US location in the query.
To scrape data for specific regions accurately, your proxy IP should match the target region. Evomi's residential proxies support country and city-level targeting across 195+ countries — meaning you can run parallel collection jobs, each with geo-appropriate IP sourcing, and aggregate results in BigQuery.
# Collect suppliers in Germany using German IPs
GERMAN_PROXY = "http://USERNAME-country-DE:PASSWORD@rp.evomi.com:1000"

# Collect suppliers in Japan using Japanese IPs
JAPAN_PROXY = "http://USERNAME-country-JP:PASSWORD@rp.evomi.com:1000"
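If you run one collection job per target market, a small helper keeps the proxy configuration consistent. This sketch assumes the username-country-XX convention shown above; confirm the exact syntax against your Evomi dashboard before relying on it.

def evomi_proxy_for_country(username: str, password: str, country_code: str) -> dict:
    """Build a Playwright-style proxy config targeting a specific country."""
    return {
        "server": "http://rp.evomi.com:1000",
        "username": f"{username}-country-{country_code.upper()}",
        "password": password,
    }

# One job per market, each with a geo-matched exit IP
MARKETS = ["DE", "JP", "US", "BR"]
proxies = {cc: evomi_proxy_for_country("USERNAME", "PASSWORD", cc) for cc in MARKETS}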
Common Pitfalls
Pitfall 1: Missing the next_page_token sleep. Google's Nearby Search pagination requires a brief delay (2+ seconds) before the next page token is valid. Ignoring this returns an INVALID_REQUEST error. Always asyncio.sleep(2) before using a token.
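A defensive variant waits and retries once more if the token has not activated in time. The extra retry window here is an assumption about activation latency, not documented behaviour; it builds on the nearby_search function from Path 1.

async def fetch_next_page(lat: float, lng: float, radius_m: int, keyword: str, token: str) -> dict:
    """Fetch a paginated Nearby Search page, tolerating a slow-to-activate token."""
    await asyncio.sleep(2)  # minimum wait before the token becomes valid
    data = await nearby_search(lat, lng, radius_m, keyword, token)
    if data.get('status') == 'INVALID_REQUEST':
        # Token likely not active yet: wait a little longer and retry once
        await asyncio.sleep(3)
        data = await nearby_search(lat, lng, radius_m, keyword, token)
    return data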
Pitfall 2: Not deduplicating across grid points. A supplier near the boundary of two grid cells will appear in both. Always deduplicate by place_id after collecting across the grid.
Pitfall 3: Hitting quota limits silently. The official API returns a 200 response with status: "OVER_QUERY_LIMIT" in the body; it doesn't use HTTP status codes for quota errors. Check data['status'] on every response, not just the HTTP status.
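A thin wrapper that treats any non-OK status as an error keeps quota problems from silently truncating the dataset. In this sketch ZERO_RESULTS is allowed through because an empty grid cell is a normal outcome; the exception class is introduced here for illustration.

class PlacesApiError(RuntimeError):
    pass

def check_places_status(data: dict) -> dict:
    """Raise on any API-level error that arrives inside a 200 response."""
    status = data.get('status')
    if status in ('OK', 'ZERO_RESULTS'):
        return data
    raise PlacesApiError(f"Places API returned status={status}: {data.get('error_message')}")

# Usage: data = check_places_status(await nearby_search(lat, lng, 5000, "metal fabrication"))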
Conclusion
The official Google Places API is the right tool for compliant, stable supplier discovery. For the use cases where it works, it provides clean structured data with no extraction complexity. For higher volume or cost-constrained use cases, browser-based extraction with clean residential proxies is the technical path, with the legal consideration clearly in mind.
Either way, Evomi's residential proxies with geo-targeting handle the IP layer. Start with the free trial and run your geo-targeted collection from there.

Author
The Scraper
Engineer and Webscraping Specialist
About Author
The Scraper is a software engineer and web scraping specialist, focused on building production-grade data extraction systems. His work centers on large-scale crawling, anti-bot evasion, proxy infrastructure, and browser automation. He writes about real-world scraping failures, silent data corruption, and systems that operate at scale.



