This tutorial shows you how to gather public data from the Google Play store, including data points such as title, price, version number, download rates, and reviews. This repository contains a free Google Play scraper tool designed for smaller-scale scraping tasks. If you want to increase your scraping scale, the second part of this guide shows you how to use the far more effective Oxylabs Scraper API. It comes with a free trial, which you can claim by registering a free account on the dashboard.
This is a free tool that you can use to get data for apps, books, or movies from Google Play using a specific search query.
To run this tool, you need to have Python 3.11 or later installed on your system.
Open up a terminal window, navigate to this repository, and run this command:
make install
To scrape data from Google Play, first choose one of the categories available in Google Play:
apps
movies
books
The default category in the tool is apps, so feel free to omit the CATEGORY parameter from the command if that's the category you need.
If you prefer a category other than apps, run this command in your terminal:
make scrape QUERY="<your_query>" CATEGORY="<your_chosen_category>"
Otherwise, the command should look like this:
make scrape QUERY="<your_query>"
Note
Make sure the category name is in lowercase.
For this example, let's try scraping Google Play results for movies about fishing. The command should look like this:
make scrape QUERY="fishing" CATEGORY="movies"
Note
Make sure to enclose your query and category in quotation marks. Otherwise, the tool might have trouble parsing it.
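If you would rather launch the tool from a Python script than type the command by hand, the rules above (three lowercase categories, with apps as the default) can be sketched roughly as follows. The helper names build_scrape_command and run_scrape are hypothetical and not part of this repository; they only assemble the same make scrape invocation shown above, and the script is assumed to run from the repository root.

```python
import subprocess

# Categories listed in this guide; the tool expects them in lowercase.
CATEGORIES = ("apps", "movies", "books")


def build_scrape_command(query: str, category: str = "apps") -> list[str]:
    """Build the argument list for `make scrape` (hypothetical helper)."""
    category = category.strip().lower()
    if category not in CATEGORIES:
        raise ValueError(f"Unknown category: {category!r}")
    if not query.strip():
        raise ValueError("Query must not be empty")

    command = ["make", "scrape", f"QUERY={query}"]
    # "apps" is the default, so CATEGORY can be omitted for it.
    if category != "apps":
        command.append(f"CATEGORY={category}")
    return command


def run_scrape(query: str, category: str = "apps") -> None:
    """Run the scraper; raises CalledProcessError if make exits with an error."""
    subprocess.run(build_scrape_command(query, category), check=True)
```

For the fishing example, build_scrape_command("fishing", "movies") yields the arguments make, scrape, QUERY=fishing, and CATEGORY=movies. Each argument is passed as a separate list item, so no shell quoting is needed at this level. How the Makefile forwards a multi-word query to the scraper is not covered here, so test such queries before relying on this sketch.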
After running the command, you should see a similar output in your terminal:
After the tool finishes running, you can find a file named movies_play_items.csv in your current working directory. This file contains Google Play items for the query and category you entered. The file name will always be in this format: {category}_play_items.csv. The generated CSV file contains these columns of data:
title - The title of the movie.
price - The price of the movie to rent.
rating - The rating of the movie.
cover_url - The URL to the image of the cover for the movie.
url - The URL for the movie.
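To work with the generated file in your own code, a minimal sketch like the one below could load it into a list of dictionaries. It assumes the file has a header row with the column names listed above and is UTF-8 encoded; neither detail is confirmed by this guide, and load_play_items is a hypothetical helper name.

```python
import csv
from pathlib import Path

# Column names as listed in this guide.
EXPECTED_COLUMNS = ["title", "price", "rating", "cover_url", "url"]


def load_play_items(category: str, directory: str = ".") -> list[dict[str, str]]:
    """Read {category}_play_items.csv and return its rows as dictionaries."""
    path = Path(directory) / f"{category}_play_items.csv"
    # Assumption: UTF-8 encoding and a header row in the first line.
    with path.open(newline="", encoding="utf-8") as csv_file:
        reader = csv.DictReader(csv_file)
        missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"Missing columns: {sorted(missing)}")
        return list(reader)
```

After scraping movies, load_play_items("movies") would return one dictionary per row, with every value kept as a string. Convert fields such as rating yourself if you need numbers.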
Here's an example of what the scraped and parsed data should look like:
If the code doesn't work or your project is larger in scale, please refer to the second part of the tutorial. There, we show how to scrape public data with Oxylabs Scraper API.
After purchasing access to the API or claiming your free trial, you'll have to use your API credentials for authentication.
You can retrieve Google Play results by providing your target URLs and forming a payload with job parameters. Scraper API will return the HTML of any public Google Play page you have provided.
The following examples demonstrate how you can get Google Play results in HTML format. To begin, you need to send the request to the API using the Push-Pull method (or other methods):
import requests
from pprint import pprint

# Structure payload.
payload = {
    'source': 'google',
    'url': 'https://play.google.com/store/games?hl=en_GB&gl=UK',
    'user_agent_type': 'desktop_edge',
    'render': 'html',
    'geo_location': 'United Kingdom',
    'locale': 'en-gb'
}

# Get response.
response = requests.request(
    'POST',
    'https://data.oxylabs.io/v1/queries',
    auth=('USERNAME', 'PASSWORD'),  # Your credentials go here
    json=payload
)

# This will return a JSON response with job information and results URLs.
pprint(response.json())
Once the job is finished, you can then send another request to retrieve the Google Play results. Here, you must use the job ID value that’s provided in the response of the above code sample:
import requests
from pprint import pprint

# Get response. Replace {job_id} with the job ID from the previous response.
response = requests.request(
    'GET',
    'https://data.oxylabs.io/v1/queries/{job_id}/results',
    auth=('USERNAME', 'PASSWORD')
)

# This will return a JSON response with scraped results.
pprint(response.json())
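Since the results are only available once the job is finished, you may want to poll the job before requesting them. The sketch below is one possible way to do that, not an official client. It assumes that a GET request to the job URL (the queries endpoint followed by the job ID) returns JSON with a status field whose values include done and faulted; check the documentation to confirm these names. The helpers results_url, wait_for_job, and fetch_status are hypothetical.

```python
import time
from typing import Callable

API_ROOT = "https://data.oxylabs.io/v1/queries"


def results_url(job_id: str) -> str:
    """Fill the job ID into the results endpoint used above."""
    return f"{API_ROOT}/{job_id}/results"


def wait_for_job(
    get_status: Callable[[], str],
    interval: float = 2.0,
    max_attempts: int = 30,
) -> bool:
    """Poll until the job reports 'done'; return False if attempts run out."""
    for _ in range(max_attempts):
        status = get_status()
        if status == "done":
            return True
        if status == "faulted":  # assumed name of the failure status
            raise RuntimeError("The scraping job failed")
        time.sleep(interval)
    return False


def fetch_status(job_id: str, auth: tuple[str, str]) -> str:
    """Ask the API for the job's current status (field name is an assumption)."""
    import requests  # imported here so wait_for_job stays dependency-free

    response = requests.get(f"{API_ROOT}/{job_id}", auth=auth, timeout=30)
    response.raise_for_status()
    return response.json()["status"]
```

A typical call would be wait_for_job(lambda: fetch_status(job_id, ('USERNAME', 'PASSWORD'))), followed by a GET request to results_url(job_id) once it returns True. Passing the status check in as a callable keeps the polling loop easy to test without network access.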
Visit our documentation for more information.
The response will be in JSON format, containing HTML content and details about the job itself:
{
  "results": [
    {
      "content": "<!DOCTYPE html><html lang=\"en\" dir=\"ltr\"><head><meta http-equiv=\"origin-trial\" content=\"Az520Inasey3TAyqLyojQa8MnmCALSEU29yQFW8dePZ7xQTvSt73pHazLFTK5f7SyLUJSo2uKLesEtEa9aUYcgMAAACPeyJvcmlnaW4iOiJodHRw...",
      "created_at": "2023-08-28 14:14:59",
      "updated_at": "2023-08-28 14:15:35",
      "page": 1,
      "url": "https://play.google.com/store/games?hl=en_GB&gl=US",
      "job_id": "7101930169060862977",
      "status_code": 200
    }
  ]
}
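Given a response shaped like the one above, pulling out the HTML takes only a few lines. The following sketch relies solely on the results, content, url, and status_code keys shown in the sample; extract_pages is a hypothetical helper name.

```python
from typing import Any


def extract_pages(response_json: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (url, html) pairs for every result that came back with HTTP 200."""
    pages = []
    for result in response_json.get("results", []):
        # Skip results whose target page did not load successfully.
        if result.get("status_code") == 200:
            pages.append((result["url"], result["content"]))
    return pages
```

You could pass response.json() from the results request straight into extract_pages and then save each HTML string to a file or hand it to an HTML parser of your choice.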
To get parsed results, use the free Custom Parser feature. Check out this in-depth Custom Parser tutorial to learn how to use it.
With Oxylabs’ Google Play Scraper API, the data extraction process is as easy as it gets. Feel free to contact our 24/7 support team via live chat or email if you need assistance.
Read More Google Scraping Related Repositories: Google Sheets for Basic Web Scraping, How to Scrape Google Shopping Results, How To Scrape Google Jobs, Google News Scraper, How to Scrape Google Scholar, How to Scrape Google Flights with Python, How To Scrape Google Images, Scrape Google Search Results, Scrape Google Trends