This repository was archived by the owner on Nov 2, 2023. It is now read-only.

Commit 52d6121

add an example about how to use the class
1 parent c6eba15 commit 52d6121

4 files changed

+86
-15
lines changed

doc/source/references/api.rst

Lines changed: 80 additions & 3 deletions
@@ -1,6 +1,83 @@
1-
**************
1+
*************
2 2
wikipedia api
3-
**************
3+
*************
4+
5+
The wekeypedia toolkit includes a class for issuing finer-grained queries
6+
tailored to specific information lookups that do not generalize well.
7+
For example, most of the implemented classes handle objects one at a time,
8+
whereas, for optimization reasons, it is sometimes necessary to refine
9+
the queries in order to reduce their number.
4 10

5 11
.. automodule:: wekeypedia.wikipedia.api
6-
:members:
12+
:members:
13+
14+
Examples
15+
--------
16+
17+
Here is a piece of code that retrieves all the links on the `Wisdom` page and
18+
checks whether each of them (n=184) has an equivalent on the French Wikipedia. It does so
19+
by requesting the langlinks of 50 pages at once instead of building one query per
20+
link. In this case, the number of queries drops from 184 to 4. #win
21+
22+
::
23+
24+
from __future__ import division
25+
from math import ceil
26+
from collections import defaultdict
27+
28+
import wekeypedia
29+
from wekeypedia.wikipedia.api import api as api
30+
31+
def api_bunch(page_titles, lang, req):
32+
    results = defaultdict(list)
33+
    batch = 50  # number of titles sent per query
34+
35+
    w = api(lang)
36+
37+
    for i in range(0, int(ceil(len(page_titles) / batch))):
38+
        param = dict(req, titles="|".join(page_titles[i * batch:(i + 1) * batch]))  # fresh copy per batch; slice end is exclusive
39+
40+
        while True:
41+
            r = w.get(param)
42+
            results.update({p["title"]: p["langlinks"] for p in r["query"]["pages"].values() if "langlinks" in p})
43+
44+
            if "continue" in r:
45+
                param.update(r["continue"])
46+
            else:
47+
                break
48+
49+
    return results
50+
51+
def get_lang_projection(pages, source, target):
52+
    """
53+
    Retrieve, for a set of pages on the `source` wiki, the corresponding pages on the `target` wiki
54+
55+
    Parameters
56+
    ----------
57+
    pages : list
58+
        List of page titles
59+
60+
    Returns
61+
    -------
62+
    correspondences : list
63+
        List of `(resolved title of the initial page, corresponding page title)` tuples
64+
    """
65+
66+
    params = {
67+
        "redirects": "",
68+
        "format": "json",
69+
        "action": "query",
70+
        "prop": "info|langlinks",
71+
        "lllimit": 500,
72+
        "lllang": target,
73+
        "continue": ""
74+
    }
75+
76+
    r = api_bunch(pages, source, params)
77+
78+
    return [(page, t["*"]) for page, tt in r.items() for t in tt if t["lang"] == target]
79+
80+
u = wekeypedia.WikipediaPage("Wisdom")
81+
pages = list(set([ x["title"] for x in u.get_links() ]))
82+
83+
get_lang_projection(pages, "en", "fr")
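
The batching arithmetic behind the example above can be checked offline, without touching the network. The following is a minimal sketch, not part of the wekeypedia toolkit: `chunked` is a hypothetical helper introduced only for illustration, and the 184 titles are placeholders standing in for the links of the `Wisdom` page.

```python
def chunked(titles, size=50):
    """Yield consecutive slices of at most `size` titles (hypothetical helper)."""
    for start in range(0, len(titles), size):
        # A slice end is exclusive, so `start + size` keeps all `size` items.
        yield titles[start:start + size]


# Placeholder titles; the real list would come from the page's links.
titles = ["Page %d" % n for n in range(184)]
batches = list(chunked(titles))

# Each batch becomes one pipe-separated `titles` parameter, i.e. one query.
title_params = ["|".join(batch) for batch in batches]

print(len(batches))               # 4
print([len(b) for b in batches])  # [50, 50, 50, 34]
```

Because a slice end is exclusive, a batch written as `titles[i*50:i*50+50-1]` would carry only 49 titles and silently drop every 50th one, which is why the sketch slices up to `start + size`.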

doc/source/references/generated/wekeypedia.lsm.compare.rst

Lines changed: 0 additions & 6 deletions
This file was deleted.
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
get_categories
2+
======================================================
3+
4+
.. currentmodule:: wekeypedia.wikipedia.page
5+
6+
.. automethod:: WikipediaPage.get_categories

doc/source/references/generated/wekeypedia.wikipedia.page.WikipediaPage.get_links_title.rst

Lines changed: 0 additions & 6 deletions
This file was deleted.
