
Commit a95d727

search volume + advanced serps data
1 parent f28f233 commit a95d727

7 files changed: +330 −21 lines


README.md

Lines changed: 62 additions & 7 deletions
Remember to switch it back when things get real ;)

## Usage

### SERPs

Get results from Google on a list of keywords.

#### 1. Post tasks

Put the list of keywords in a text file (one request per line), then run:

    $ python serps_post.py --input yourfile.txt

The script will send the keywords to the API in batches of 100 requests, and write a report containing the `task_id` for each keyword.

The `input` argument is required. Other options are available:
- `config`: configuration file to use (default: `config.ini`)
- `input`: input file with the list of requests (required)
- `output`: output basename for the report (default: `serps-tasks`)
- `language_code`: language code for the requests (default: `fr`)
- `location_code`: location code for the requests (default: `2250` for France; get other codes on <https://api.dataforseo.com/v3/serp/google/locations>)
- `nb_results`: number of results to ask for (default: `10`)
- `device`: choice between `desktop` (default) and `mobile`
- `pixels`: activate to get the number of pixels from the top of the page to the result (this will double request costs). You'll need to use the `--advanced` option when getting the results.
- `priority`: priority queue, either `low` (default) or `high` (note: `high`-priority requests are charged more by DataForSEO)
- `batch`: number of requests to send in each batch, between 1 and 100 (default: `100`)
- `delay`: delay in seconds between batches of requests (default: `10`)
- `sep`: CSV separator for `output` (default: `;`)
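All of these scripts read DataForSEO credentials from the file passed via `--config`. Judging from how the scripts read it (`conf['general']['user']` and `conf['general']['password']`), a minimal `config.ini` looks like this (placeholder values, obviously):

```ini
[general]
user = your-dataforseo-login
password = your-dataforseo-password
```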

#### 2. See if tasks are ready

This step is not required, but it will show you how many of your requests have been processed and are available for download.
Simply run:

    $ python serps_ready.py

Note that the API will show you at most 1000 available results; there might be more that are not shown.

#### 3. Get results

Once you've posted your requests, it usually takes only a few moments for the API to process them.
To get the results, run:

    $ python serps_get.py

The script will ask the API for available results, then download the data for each completed task, repeating while results remain available.

Available options:
- `config`: configuration file to use (default: `config.ini`)
- `output`: output basename for the results (default: `serps-results`)
- `advanced`: get advanced details, such as the presence of images, videos, ratings, or sitelinks, or the position of the result in pixels (default: `False`)
- `delay`: delay in seconds between batches of requests (default: `10`)
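Both `serps_ready.py` and `keywords_data_ready.py` follow the same pattern: call the matching `tasks_ready` endpoint and read `result_count` from the first task in the response. A minimal sketch of that parsing step, against a hypothetical response dict shaped like the API payload:

```python
def count_ready_tasks(response):
    """Return the number of downloadable tasks reported by a tasks_ready
    response, or None if the API returned an error status."""
    if response["status_code"] != 20000:
        return None  # 20000 is DataForSEO's "ok" status code
    return response["tasks"][0]["result_count"]

# Hypothetical response dict, shaped like the API's tasks_ready payload
sample = {"status_code": 20000, "tasks": [{"result_count": 42}]}
print(count_ready_tasks(sample))  # → 42
```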
### Keywords

Get keywords data from the Google AdWords API: search volume, CPC, competition, categories.

#### 1. Post tasks

Put the list of keywords in a text file (one request per line), then run:

    $ python keywords_data_post.py --input yourfile.txt

The script will send the keywords to the API in batches of up to 700 keywords, and write a report containing the `task_id` for each batch.

The `input` argument is required. Other options are available:
- `config`: configuration file to use (default: `config.ini`)
- `input`: input file with the list of requests (required)
- `output`: output basename for the report (default: `keywords-data-tasks`)
- `language_code`: language code for the requests (default: `fr`)
- `location_code`: location code for the requests (default: `2250` for France; get other codes on <https://api.dataforseo.com/v3/keywords_data/google/locations>)
- `batch`: number of keywords to include in each batch, between 1 and 700 (default: `700`). You are charged the same amount per batch no matter how many keywords it contains, so you might as well stay with 700!
- `delay`: delay in seconds between batches of requests (default: `1`)
- `sep`: CSV separator for `output` (default: `;`)

#### 2. See if tasks are ready

This step is not required, but it will show you how many of your requests have been processed and are available for download.
Simply run:

    $ python keywords_data_ready.py

#### 3. Get results

Once you've posted your requests, it usually takes only a few moments for the API to process them.
To get the results, run:

    $ python keywords_data_get.py

The script will ask the API for available results, then download the data for each completed task, repeating while results remain available.

Available options:
- `config`: configuration file to use (default: `config.ini`)
- `output`: output basename for the results (default: `keywords-data-results`)
- `delay`: delay in seconds between batches of requests (default: `10`)
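In the results file, `categories` and `monthly_searches` are stored as JSON strings inside CSV cells. A sketch of reading them back, using a hypothetical three-column excerpt of such a file (real files contain the full field set):

```python
import csv
import io
import json

# Hypothetical excerpt of a keywords-data-results CSV (delimiter ";")
raw = (
    "keyword;search_volume;monthly_searches\n"
    'seo;1000;"[720, 880, 1300]"\n'
)

for row in csv.DictReader(io.StringIO(raw), delimiter=";"):
    volumes = json.loads(row["monthly_searches"])  # JSON list in a CSV cell
    print(row["keyword"], max(volumes))  # → seo 1300
```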
## Contributing

If you wish to contribute to this repository or to report an issue, please do this [on GitLab](https://gitlab.com/databulle/dataforseo-serp-api-python-client).

keywords_data_get.py

Lines changed: 86 additions & 0 deletions
```python
####
## DATAFORSEO GOOGLE KEYWORDS API
##
## Fetches available results from API.
####

import csv
import configparser
import argparse
import time
import json

from client import RestClient

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--config', default="config.ini",
                        type=str, help='Global config file (default: "config.ini")')
    parser.add_argument('--output', default="keywords-data-results",
                        type=str, help='Output basename (default: "keywords-data-results")')
    parser.add_argument('--delay', default=1,
                        type=float, help='Delay in seconds between batches of requests (default: 1)')
    args = parser.parse_args()

    conf = configparser.ConfigParser()
    conf.read(args.config)
    user = conf['general']['user']
    password = conf['general']['password']

    # Output headers
    fields = ['keyword', 'location_code', 'language_code', 'search_partners',
              'search_volume', 'cpc', 'competition', 'categories', 'monthly_searches']
    # Output name
    timestr = time.strftime("%Y%m%d-%H%M%S")
    tag = args.output + "-" + timestr
    filename = tag + ".csv"

    with open(filename, 'w', newline='') as file:
        writer = csv.DictWriter(file, fieldnames=fields, delimiter=";")
        writer.writeheader()

    client = RestClient(user, password)

    # While there are results, request the next batch
    next_batch = True
    while next_batch:
        response = client.get("/v3/keywords_data/google/search_volume/tasks_ready")
        if response['status_code'] == 20000:
            tasks_available = response["tasks"][0]["result_count"]
            print("{} tasks available".format(tasks_available))
            if tasks_available < 1:
                next_batch = False

            for task in response["tasks"]:
                if task['result'] and len(task['result']) > 0:
                    for result_task_info in task['result']:
                        if result_task_info['endpoint']:
                            res = client.get(result_task_info['endpoint'])

                            for t in res["tasks"]:
                                if t['result'] and len(t['result']) > 0:
                                    for k in t['result']:
                                        data = dict()
                                        data["keyword"] = k["keyword"]
                                        data["location_code"] = k["location_code"]
                                        data["language_code"] = k["language_code"]
                                        data["search_partners"] = k["search_partners"]
                                        data["search_volume"] = k["search_volume"]
                                        data["cpc"] = k["cpc"]
                                        data["competition"] = k["competition"]
                                        data["categories"] = json.dumps(k["categories"])
                                        # Sort monthly volumes chronologically
                                        monthly_searches = []
                                        for m in sorted(k["monthly_searches"], key=lambda x: (x['year'], x['month'])):
                                            monthly_searches.append(m["search_volume"])
                                        data["monthly_searches"] = json.dumps(monthly_searches)

                                        with open(filename, 'a', newline='') as file:
                                            writer = csv.DictWriter(file, fieldnames=fields, delimiter=";")
                                            writer.writerow(data)

            print("Batch done.")
            time.sleep(args.delay)
        else:
            next_batch = False
            print("Error. Code: %d Message: %s" % (response["status_code"], response["status_message"]))
```

keywords_data_post.py

Lines changed: 105 additions & 0 deletions
```python
####
## DATAFORSEO GOOGLE KEYWORDS API
##
## Posts keywords search volume requests.
####

import csv
import configparser
import argparse
import time

from client import RestClient


def range_limited_int_type(arg):
    """Type function for argparse - an int within some predefined bounds."""
    try:
        f = int(arg)
    except ValueError:
        raise argparse.ArgumentTypeError("Must be an integer")
    if f < 1 or f > 700:
        raise argparse.ArgumentTypeError("Argument must be between 1 and 700")
    return f


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--config', default="config.ini",
                        type=str, help='Global config file (default: "config.ini").')
    parser.add_argument('--input', required=True,
                        type=str, help='List of keywords to request.')
    parser.add_argument('--output', default="keywords-data-tasks",
                        type=str, help='Output basename (default: "keywords-data-tasks").')
    parser.add_argument('--language_code', default="fr",
                        type=str, help='Language code for requests (default: "fr")')
    parser.add_argument('--location_code', default=2250,
                        type=int, help='Location code for requests (default: 2250 for France, get other codes on <https://api.dataforseo.com/v3/keywords_data/google/locations>)')
    parser.add_argument('--batch', default=700,
                        type=range_limited_int_type, help='Max number of keywords per batch. Max 700. Each batch costs $0.05 (default: 700).')
    parser.add_argument('--delay', default=1,
                        type=float, help='Delay in seconds between batches of requests (default: 1).')
    parser.add_argument('--sep', default=";",
                        type=str, help='CSV file separator (default: ";").')
    args = parser.parse_args()

    # Read list of requests
    with open(args.input, 'r') as file:
        kws = [line.strip() for line in file]

    # Output headers
    fields = ['id', 'status', 'tag', 'nb_requests', 'first_kw', 'last_kw']
    # Output name
    timestr = time.strftime("%Y%m%d-%H%M%S")
    tag = args.output + "-" + timestr
    filename = tag + ".csv"

    conf = configparser.ConfigParser()
    conf.read(args.config)
    user = conf['general']['user']
    password = conf['general']['password']

    with open(filename, 'w', newline='') as file:
        writer = csv.DictWriter(file, fieldnames=fields, delimiter=args.sep)
        writer.writeheader()

    client = RestClient(user, password)
    i = 0
    j = args.batch

    # Cut the kws list into batches
    while j < len(kws) + args.batch:
        post_data = {}
        post_data[len(post_data)] = dict(
            language_code=args.language_code,
            location_code=args.location_code,
            keywords=kws[i:j],
            tag=tag,
        )

        response = client.post("/v3/keywords_data/google/search_volume/task_post", post_data)
        if response["status_code"] == 20000:
            for task in response["tasks"]:
                data = dict()
                data["nb_requests"] = len(task["data"]["keywords"])
                data["first_kw"] = task["data"]["keywords"][0]
                data["last_kw"] = task["data"]["keywords"][-1]
                data["status"] = task["status_message"]
                data["id"] = task["id"]
                data["tag"] = task["data"]["tag"]
                with open(filename, 'a', newline='') as file:
                    writer = csv.DictWriter(file, fieldnames=fields, delimiter=args.sep)
                    writer.writerow(data)
            print("Batch {} done.".format(i // args.batch + 1))
        else:
            print("Error. Code: %d Message: %s" % (response["status_code"], response["status_message"]))

        i = j
        j += args.batch
        time.sleep(args.delay)
```
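The `while j < len(kws) + args.batch` loop above slices the keyword list into consecutive batches, the last one possibly shorter. The same logic, isolated as a standalone helper (hypothetical name, for illustration):

```python
def batches(kws, size=700):
    """Split kws into consecutive slices of at most `size` items,
    mirroring the i/j loop in keywords_data_post.py."""
    out = []
    i, j = 0, size
    while j < len(kws) + size:
        out.append(kws[i:j])
        i, j = j, j + size
    return out

print(batches(list(range(5)), size=2))  # → [[0, 1], [2, 3], [4]]
```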

keywords_data_ready.py

Lines changed: 30 additions & 0 deletions
```python
####
## DATAFORSEO GOOGLE KEYWORDS API
##
## Shows number of tasks ready for download.
####

import configparser
import argparse

from client import RestClient


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--config', default="config.ini",
                        type=str, help='Global config file (default: "config.ini")')
    args = parser.parse_args()

    conf = configparser.ConfigParser()
    conf.read(args.config)
    user = conf['general']['user']
    password = conf['general']['password']

    client = RestClient(user, password)

    response = client.get("/v3/keywords_data/google/search_volume/tasks_ready")
    if response["status_code"] == 20000:
        tasks_available = response["tasks"][0]["result_count"]
        print("{} tasks available".format(tasks_available))
    else:
        print("Error. Code: %d Message: %s" % (response["status_code"], response["status_message"]))
```
