Google Maps is a resourceful map application that provides comprehensive information about geographical locations around the world. It’s also a crucial source of local business information. You can use it to get a business name, address information, website URL, opening hours, and more.

Scraping the Google Maps data programmatically lets you convert it into an organized format that you can use for various purposes, including generating business leads, organizing mass email campaigns, and getting contact information for cold calling.

This article talks about how to scrape data from Google Maps. You are going to learn how to get useful local business information from the application. We’ll also discuss how ProxyCrawl can help you make the data extraction process smooth, fast, and rewarding.

Let’s start by talking about how you can use ProxyCrawl for pulling online information.

Using ProxyCrawl for Web Scraping

ProxyCrawl is a versatile tool that allows you to crawl online data at scale. It can be used for a wide range of data extraction tasks, including scraping data from Google Maps. It’s what you need to overcome the usual scraping challenges and make the most out of your efforts.

These are some reasons why ProxyCrawl is vital for web scraping:

  • Easy to use: ProxyCrawl provides a user-friendly API, which you can start using within a few minutes. There is also detailed documentation full of code samples on how to integrate the API.
  • Supports anonymous crawling: With ProxyCrawl, you won’t need to worry about revealing your real identity when extracting information from the Internet. It has an extensive pool of proxies and data centers that allows you to remain anonymous.
  • Supports advanced extraction: ProxyCrawl supports all types of crawling projects. Because it supports JavaScript rendering, it gives you real browser capabilities for retrieving data from modern, complicated websites. It also allows you to bypass blockades, CAPTCHAs, and other access restrictions that may keep you from harvesting data quickly.
  • Free testing account: After signing up for an account, you’ll get 1,000 free credits for trying out the tool’s capabilities. Then, you can continue using the service and pay for your usage at the end of each billing cycle.

For this Google Maps web scraping tutorial, we’ll use ProxyCrawl’s Crawling API. Every request to the API starts with the following base part:

https://api.proxycrawl.com

Then, add the following two mandatory query string parameters:

  • Unique authentication token: This authorizes you to use the API. ProxyCrawl provides two types of tokens: a normal token for completing generic web requests and a JavaScript token for scraping dynamic, JavaScript-rendered websites.
  • Target URL: This is the URL whose data you want to extract. It should start with HTTP or HTTPS. You should also encode the URL so that it can be passed safely as a query string value.

This is how to add the mandatory parameters to the API request:

https://api.proxycrawl.com/?token=ADD_TOKEN&url=ADD_URL

That’s all you need to begin using ProxyCrawl for pulling information from the Internet.

It’s that easy!

In fact, if you provide the required parameter information to the above request and execute it on a web browser’s address bar, it’ll return the full HTML code of the target web page.
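
The request format above can be sketched in PHP as follows. This is only an illustration: the `build_api_request` helper is a hypothetical name, not part of ProxyCrawl's tooling, and `ADD_TOKEN` is the same placeholder used earlier.

```php
<?php
// Hypothetical helper (not part of ProxyCrawl): builds a Crawling API
// request URL from a token and a target URL.
function build_api_request(string $token, string $targetUrl): string
{
    // urlencode() makes each value safe to embed in a query string
    return 'https://api.proxycrawl.com/?token=' . urlencode($token)
        . '&url=' . urlencode($targetUrl);
}

// ADD_TOKEN is a placeholder; replace it with your own token
echo build_api_request('ADD_TOKEN', 'https://www.google.com/maps'), PHP_EOL;
```

Encoding the target URL keeps its own `?` and `&` characters from being read as part of the API request's query string.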

Scraping Data From Google Maps Using ProxyCrawl

Let’s now see how you can use ProxyCrawl to harvest Google Maps data fast and efficiently. In this tutorial, we’ll aim to extract the data of New York restaurants listed on Google Maps. We’ll use the PHP programming language for this task.

Step 1: Grab the request URL

Let’s start by going to Google Maps and searching for restaurants in New York.

As seen on the screenshot above, the restaurants’ data is delivered on the page’s left sidebar. This is what we want to scrape, instead of extracting data out of the entire page. The easiest way to scrape it is to inspect the browser’s network traffic and grab the URL that delivers the data.

To inspect the browser’s network traffic, you can right-click anywhere on the page’s left sidebar and select the Inspect option. This will open the developer tools section at the bottom of your browser window. Next, select the Network tab. You’ll start seeing the data that comes across the network.

To load the data we want, search again for “New York restaurants”. Then, you can type “search” on the Network panel search box to filter the search URLs.

This will reveal the URLs associated with your recent search. In this case, the data we want is in the first GET request, which delivers the restaurants’ data in JSON format.

If you click on the row containing the request, a new right-hand pane is displayed. The pane gives more information about the request. Under the Headers tab, highlight and copy the URL. That’s the URL used for acquiring the data displayed on Google Maps. It’s the URL we’ll use for scraping the restaurants’ data.

This is the URL we grabbed:

https://www.google.com/search?tbm=map&authuser=0&hl=en&gl=ke&pb=!4m12!1m3!1d13288.926258283986!2d-74.02334913898135!3d40.73841320805614!2m3!1f0!2f0!3f0!3m2!1i1366!2i211!4f13.1!7i20!10b1!12m8!1m1!18b1!2m3!5m1!6e2!20e3!10b1!16b1!19m4!2m3!1i360!2i120!4i8!20m65!2m2!1i203!2i100!3m2!2i4!5b1!6m6!1m2!1i86!2i86!1m2!1i408!2i240!7m50!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e3!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e3!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!9b0!22m6!1sVgY-YP-eBdqg5NoPuc2f4A8:2!2s1i:0,t:11886,p:VgY-YP-eBdqg5NoPuc2f4A8:2!7e81!12e5!17sVgY-YP-eBdqg5NoPuc2f4A8:92!18e15!24m54!1m16!13m7!2b1!3b1!4b1!6i1!8b1!9b1!20b0!18m7!3b1!4b1!5b1!6b1!9b1!13b0!14b0!2b1!5m5!2b1!3b1!5b1!6b1!7b1!10m1!8e3!14m1!3b1!17b1!20m2!1e3!1e6!24b1!25b1!26b1!29b1!30m1!2b1!36b1!43b1!52b1!54m1!1b1!55b1!56m2!1b1!3b1!65m5!3m4!1m3!1m2!1i224!2i298!89b1!26m4!2m3!1i80!2i92!4i8!30m0!34m16!2b1!3b1!4b1!6b1!8m4!1b1!3b1!4b1!6b1!9b1!12b1!14b1!20b1!23b1!25b1!26b1!37m1!1e81!42b1!47m0!49m1!3b1!50m4!2e2!3m2!1b1!3b1!65m0!69i544&q=New york restaurants&oq=New york restaurants&gs_l=maps.3..38i39i129k1j38i39i129i444k1j38i426k1l3.0.0.2.48644.1.1.0.0.0.0.713.713.6-1.1.0....0...1ac..64.maps..0.1.713....0.&tch=1&ech=2&psi=VgY-YP-eBdqg5NoPuc2f4A8.1614677593039.1

Next, let’s clean up the request URL by removing the following query string parameters that may not be necessary:

  • oq
  • gs_l
  • tch
  • ech
  • psi

This is what the final URL looks like:

https://www.google.com/search?tbm=map&authuser=0&hl=en&gl=ke&pb=!4m12!1m3!1d13288.926258283986!2d-74.02334913898135!3d40.73841320805614!2m3!1f0!2f0!3f0!3m2!1i1366!2i211!4f13.1!7i20!10b1!12m8!1m1!18b1!2m3!5m1!6e2!20e3!10b1!16b1!19m4!2m3!1i360!2i120!4i8!20m65!2m2!1i203!2i100!3m2!2i4!5b1!6m6!1m2!1i86!2i86!1m2!1i408!2i240!7m50!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e3!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e3!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!9b0!22m6!1sVgY-YP-eBdqg5NoPuc2f4A8:2!2s1i:0,t:11886,p:VgY-YP-eBdqg5NoPuc2f4A8:2!7e81!12e5!17sVgY-YP-eBdqg5NoPuc2f4A8:92!18e15!24m54!1m16!13m7!2b1!3b1!4b1!6i1!8b1!9b1!20b0!18m7!3b1!4b1!5b1!6b1!9b1!13b0!14b0!2b1!5m5!2b1!3b1!5b1!6b1!7b1!10m1!8e3!14m1!3b1!17b1!20m2!1e3!1e6!24b1!25b1!26b1!29b1!30m1!2b1!36b1!43b1!52b1!54m1!1b1!55b1!56m2!1b1!3b1!65m5!3m4!1m3!1m2!1i224!2i298!89b1!26m4!2m3!1i80!2i92!4i8!30m0!34m16!2b1!3b1!4b1!6b1!8m4!1b1!3b1!4b1!6b1!9b1!12b1!14b1!20b1!23b1!25b1!26b1!37m1!1e81!42b1!47m0!49m1!3b1!50m4!2e2!3m2!1b1!3b1!65m0!69i544&q=New york restaurants
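
If you would rather not edit the URL by hand, the same clean-up can be done in code. The sketch below assumes a hypothetical helper named `strip_query_params`, and uses a shortened stand-in URL rather than the full one. It splits the query string on `&` and keeps every pair whose name is not in the removal list, so the remaining values stay exactly as they were copied.

```php
<?php
// Hypothetical helper: drops the named parameters from a URL's query string.
// It does not handle URL fragments (#...), which the grabbed URL does not have.
function strip_query_params(string $url, array $namesToRemove): string
{
    $parts = explode('?', $url, 2);
    if (count($parts) < 2) {
        return $url; // no query string to clean
    }
    [$base, $query] = $parts;

    $kept = [];
    foreach (explode('&', $query) as $pair) {
        // The parameter name is everything before the first "="
        $name = explode('=', $pair, 2)[0];
        if (!in_array($name, $namesToRemove, true)) {
            $kept[] = $pair; // keep the pair exactly as it was written
        }
    }

    return $base . '?' . implode('&', $kept);
}

// Shortened stand-in for the grabbed URL (the real pb value is much longer)
$grabbed = 'https://www.google.com/search?tbm=map&hl=en&q=New york restaurants'
    . '&oq=New york restaurants&tch=1&ech=2';

echo strip_query_params($grabbed, ['oq', 'gs_l', 'tch', 'ech', 'psi']), PHP_EOL;
```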

Step 2: Examine the returned data

Let’s now make a request using the grabbed URL and examine what the returned data looks like. This will help us create the scraping logic in the next step.

We’ll use the PHP cURL library to make a GET request and retrieve the Google Maps data. Since ProxyCrawl requires URLs to be encoded, we’ll use the built-in urlencode() function to encode the grabbed URL.

In this case, we’ll use ProxyCrawl’s normal token to make the request.

Here is the code:

<?php
//Encoding the URL
$url = urlencode("https://www.google.com/search?tbm=map&authuser=0&hl=en&gl=ke&pb=!4m12!1m3!1d13288.926258283986!2d-74.02334913898135!3d40.73841320805614!2m3!1f0!2f0!3f0!3m2!1i1366!2i211!4f13.1!7i20!10b1!12m8!1m1!18b1!2m3!5m1!6e2!20e3!10b1!16b1!19m4!2m3!1i360!2i120!4i8!20m65!2m2!1i203!2i100!3m2!2i4!5b1!6m6!1m2!1i86!2i86!1m2!1i408!2i240!7m50!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e3!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e3!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!9b0!22m6!1sVgY-YP-eBdqg5NoPuc2f4A8:2!2s1i:0,t:11886,p:VgY-YP-eBdqg5NoPuc2f4A8:2!7e81!12e5!17sVgY-YP-eBdqg5NoPuc2f4A8:92!18e15!24m54!1m16!13m7!2b1!3b1!4b1!6i1!8b1!9b1!20b0!18m7!3b1!4b1!5b1!6b1!9b1!13b0!14b0!2b1!5m5!2b1!3b1!5b1!6b1!7b1!10m1!8e3!14m1!3b1!17b1!20m2!1e3!1e6!24b1!25b1!26b1!29b1!30m1!2b1!36b1!43b1!52b1!54m1!1b1!55b1!56m2!1b1!3b1!65m5!3m4!1m3!1m2!1i224!2i298!89b1!26m4!2m3!1i80!2i92!4i8!30m0!34m16!2b1!3b1!4b1!6b1!8m4!1b1!3b1!4b1!6b1!9b1!12b1!14b1!20b1!23b1!25b1!26b1!37m1!1e81!42b1!47m0!49m1!3b1!50m4!2e2!3m2!1b1!3b1!65m0!69i544&q=New york restaurants");
//initialize a new cURL session
$curl = curl_init();
//pass the target URL using ProxyCrawl
curl_setopt($curl, CURLOPT_URL, 'https://api.proxycrawl.com/?token=ADD_NORMAL_TOKEN&url=' . $url);
//return page contents
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
//execute the predefined cURL session
$response_data = curl_exec($curl);
//close the cURL session
curl_close($curl);
//output the response
echo $response_data;

?>

If we run the code above, this is the output we get on a browser:

If you paste the returned data into a JSON validator or viewer, you can make some useful observations. For example, you’ll notice that removing the first four characters turns the data into valid JSON.

You’ll also notice that the data is contained in arrays, which are nested inside one another. For example, the first 20 restaurants’ data are contained in the following marked section:

If we expand an array, we can find information about each restaurant. For example, if we expand nested array number 1, we’ll notice that the details of the restaurant are contained in array number 14. This is the pattern throughout the data.

For example, the name of the restaurant is at number 11:

The location of the restaurant is at number 18:

The phone number of the restaurant is at array number 178:
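
To make these positions concrete, here is a minimal sketch of reading them from a decoded entry. The `dig` helper is a hypothetical name, and the `$entry` array is a made-up stand-in with invented values, not real Google Maps output. Only the index positions (14, 11, 18, and 178) come from the observations above.

```php
<?php
// Hypothetical helper: follows a list of keys into a nested array and
// returns $default as soon as a key is missing.
function dig($data, array $path, $default = null)
{
    foreach ($path as $key) {
        if (!is_array($data) || !array_key_exists($key, $data)) {
            return $default;
        }
        $data = $data[$key];
    }
    return $data;
}

// Made-up stand-in for one restaurant entry (values are invented)
$entry = [
    14 => [
        11  => 'Example Diner',
        18  => 'Example Diner, 1 Example St, New York, NY',
        178 => [['+1 212-555-0100']],
    ],
];

echo dig($entry, [14, 11]), PHP_EOL;        // name
echo dig($entry, [14, 18]), PHP_EOL;        // location
echo dig($entry, [14, 178, 0, 0]), PHP_EOL; // phone number
```

A helper like this avoids repeating long chains of index lookups, and it returns a default value instead of raising a warning when an entry has no data at a given position.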

Step 3: Create the scraping logic

As we mentioned earlier, removing the first four characters from the returned data makes it valid JSON. This makes it possible to iterate over the data.

Here is the code for doing that (note that the -1 length argument also drops the last character of the response):

$response_data = substr($response_data, 4, -1);
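
A slightly more defensive variant is sketched below. It assumes that the four leading characters are the literal prefix `)]}'`, and that anything else surrounding the JSON is whitespace. Both are assumptions that you should check against your own response. The `strip_json_prefix` name is hypothetical.

```php
<?php
// Hypothetical helper: removes the leading four-character prefix only when
// it is actually present, then trims surrounding whitespace.
function strip_json_prefix(string $body): string
{
    $prefix = ")]}'"; // assumed prefix; verify it against your own response
    if (strncmp($body, $prefix, strlen($prefix)) === 0) {
        $body = substr($body, strlen($prefix));
    }
    return trim($body);
}

echo strip_json_prefix(")]}'\n[1,2]\n"), PHP_EOL;
```

Unlike a fixed `substr()` call, this leaves the data untouched if the response ever arrives without the prefix.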

Also, replacing the null values with empty strings in the returned response makes it easier to handle the data.

Here is the code for doing that:

$response_data = str_replace("null,", '"",', $response_data);

Next, let’s use the built-in json_decode function to convert the JSON string into a PHP value. We’ll also pass true as the second argument so that any JSON objects are returned as associative arrays.

Here is the code:

$scraped_data = json_decode($response_data, true);
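
If the response is not valid JSON, for example because the request was blocked, json_decode returns null without raising an error by default. The sketch below shows one way to surface that early. The `decode_or_fail` name is hypothetical, and the sample input is made up.

```php
<?php
// Hypothetical helper: decodes a JSON string into PHP arrays, or throws an
// exception carrying the decoder's own error message.
function decode_or_fail(string $json): array
{
    $decoded = json_decode($json, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('JSON decoding failed: ' . json_last_error_msg());
    }
    if (!is_array($decoded)) {
        throw new RuntimeException('Expected a JSON array or object at the top level');
    }
    return $decoded;
}

// Made-up sample input
var_dump(decode_or_fail('[["a", ["b"]]]'));
```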

Next, let’s use the isset function to check if the restaurants’ data exists. We’ll place the rest of the scraping logic inside this if block.

Then, let’s use a foreach loop to go through the array and find the entries where index 14 is set. Remember that the data we need is contained in array number 14.

Lastly, let’s find occurrences of the restaurants’ names, locations, and phone numbers.

Here is the code:

//collects one entry per restaurant
$final_array_data = [];

if (isset($scraped_data[0][1]))
{
    foreach ($scraped_data[0][1] as $value)
    {
        //the restaurant details sit at index 14 of each entry
        if (isset($value[14]))
        {
            $restaurants_data = $value[14];
            $temporary_array = [];
            if (isset($restaurants_data[11])) $temporary_array['Restaurant Name:'] = $restaurants_data[11];
            if (isset($restaurants_data[18])) $temporary_array['Restaurant Location:'] = $restaurants_data[18];
            if (isset($restaurants_data[178][0][0])) $temporary_array['Restaurant Phone Number:'] = $restaurants_data[178][0][0];
            $final_array_data[] = $temporary_array;
        }
    }
}

Summary

Here is the entire code for using ProxyCrawl to scrape data from Google Maps:

<?php
//Encoding the URL
$url = urlencode("https://www.google.com/search?tbm=map&authuser=0&hl=en&gl=ke&pb=!4m12!1m3!1d13288.926258283986!2d-74.02334913898135!3d40.73841320805614!2m3!1f0!2f0!3f0!3m2!1i1366!2i211!4f13.1!7i20!10b1!12m8!1m1!18b1!2m3!5m1!6e2!20e3!10b1!16b1!19m4!2m3!1i360!2i120!4i8!20m65!2m2!1i203!2i100!3m2!2i4!5b1!6m6!1m2!1i86!2i86!1m2!1i408!2i240!7m50!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e3!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e3!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!9b0!22m6!1sVgY-YP-eBdqg5NoPuc2f4A8:2!2s1i:0,t:11886,p:VgY-YP-eBdqg5NoPuc2f4A8:2!7e81!12e5!17sVgY-YP-eBdqg5NoPuc2f4A8:92!18e15!24m54!1m16!13m7!2b1!3b1!4b1!6i1!8b1!9b1!20b0!18m7!3b1!4b1!5b1!6b1!9b1!13b0!14b0!2b1!5m5!2b1!3b1!5b1!6b1!7b1!10m1!8e3!14m1!3b1!17b1!20m2!1e3!1e6!24b1!25b1!26b1!29b1!30m1!2b1!36b1!43b1!52b1!54m1!1b1!55b1!56m2!1b1!3b1!65m5!3m4!1m3!1m2!1i224!2i298!89b1!26m4!2m3!1i80!2i92!4i8!30m0!34m16!2b1!3b1!4b1!6b1!8m4!1b1!3b1!4b1!6b1!9b1!12b1!14b1!20b1!23b1!25b1!26b1!37m1!1e81!42b1!47m0!49m1!3b1!50m4!2e2!3m2!1b1!3b1!65m0!69i544&q=New york restaurants");
//initialize a new cURL session
$curl = curl_init();
//pass the target URL using ProxyCrawl
curl_setopt($curl, CURLOPT_URL, 'https://api.proxycrawl.com/?token=ADD_NORMAL_TOKEN&url=' . $url);
//return page contents
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
//execute the predefined cURL session
$response_data = curl_exec($curl);
//close the cURL session
curl_close($curl);
//remove the first four characters (the -1 also drops the last character)
$response_data = substr($response_data, 4, -1);
//replace null values with empty strings
$response_data = str_replace("null,", '"",', $response_data);
//convert the JSON string into a PHP associative array
$scraped_data = json_decode($response_data, true);

$final_array_data = [];

//create the scraping logic
if (isset($scraped_data[0][1]))
{
    foreach ($scraped_data[0][1] as $value)
    {
        //the restaurant details sit at index 14 of each entry
        if (isset($value[14]))
        {
            $restaurants_data = $value[14];
            $temporary_array = [];
            if (isset($restaurants_data[11])) $temporary_array['Restaurant Name:'] = $restaurants_data[11];
            if (isset($restaurants_data[18])) $temporary_array['Restaurant Location:'] = $restaurants_data[18];
            if (isset($restaurants_data[178][0][0])) $temporary_array['Restaurant Phone Number:'] = $restaurants_data[178][0][0];
            $final_array_data[] = $temporary_array;
        }
    }
}

//output the final data
var_dump($final_array_data);

?>

If we run the code, here is the result we get (it’s truncated for brevity):

We did it!

We managed to extract data from Google Maps.
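
Since the goal is an organized format you can reuse, one option is to save the results as CSV rather than dumping them with var_dump. The sketch below assumes rows shaped like the entries of $final_array_data. The `write_restaurants_csv` name, the column headings, and the sample row are all hypothetical.

```php
<?php
// Hypothetical helper: writes scraped rows to an open stream as CSV.
// Missing fields become empty cells so that every line has three columns.
function write_restaurants_csv($stream, array $rows): void
{
    $columns = ['Restaurant Name:', 'Restaurant Location:', 'Restaurant Phone Number:'];

    // Header line (column names chosen for this sketch)
    fputcsv($stream, ['name', 'location', 'phone'], ',', '"', '\\');

    foreach ($rows as $row) {
        $line = [];
        foreach ($columns as $column) {
            $line[] = $row[$column] ?? '';
        }
        fputcsv($stream, $line, ',', '"', '\\');
    }
}

// Made-up sample row in the same shape as $final_array_data
$rows = [
    ['Restaurant Name:' => 'Example Diner', 'Restaurant Location:' => '1 Example St, New York'],
];

$out = fopen('php://output', 'w');
write_restaurants_csv($out, $rows);
fclose($out);
```

To write to a file instead, open it with fopen('restaurants.csv', 'w') and pass that handle to the helper.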

Conclusion

That’s how to scrape data from Google Maps using ProxyCrawl. With ProxyCrawl, you can extract map data quickly and efficiently while remaining anonymous.

It’s the tool you need to pull online data without worrying about experiencing access restrictions. You can use it to extract unstructured information from any web page and import the data into your work environment easily.
Click here to create a free ProxyCrawl account.

Happy scraping!