Query for set information #2

@kevinlul

Collecting set information is the last piece for YAML Yugi to exceed parity with other solutions. Unlike the other data collected so far, which lives in flat categories, sets are indexed on Yugipedia in hierarchical categories. An article about a set is therefore not necessarily a direct member of the target category; it may sit in a subcategory nested any number of levels below it. A MediaWiki API query for a category returns only its immediate members. The names of child categories are included, but the members of those child categories are not. New code is therefore required to download entire category hierarchies and to subscribe to updates on them. Category hierarchies are allowed to contain cycles. Cycles are not expected in the categories for sets, but our code should still be correct if one is encountered and must not fall into an infinite loop.

Design

Either create a new script or extend the current full download script to recursively download a targeted category without falling into infinite loops. For example, after fetching https://yugipedia.com/api.php?action=query&redirects=true&generator=categorymembers&prop=revisions&rvprop=content&format=json&formatversion=2&gcmlimit=50&gcmtitle=Category:Yu-Gi-Oh!_Master_Duel_sets, the ns=14 (category) items in the response should be stored in an ordered set and followed up with additional requests once the current category is completely downloaded. Categories already in the set must not be queued again, which is what keeps a cycle from causing an infinite loop.
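
A minimal sketch of this traversal, not a definitive implementation. The query parameters and the API URL are taken from the example above; the function names `fetch_members` and `crawl_category`, the User-Agent string, and the in-memory return values are assumptions made for illustration and do not reflect the existing script. Pagination follows the standard MediaWiki pattern of merging the response's `continue` object into the next request.

```python
import json
import urllib.parse
import urllib.request
from collections import deque
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

API_URL = "https://yugipedia.com/api.php"
NS_CATEGORY = 14  # MediaWiki namespace number for category pages


def fetch_members(category: str, api: str = API_URL) -> Iterator[dict]:
    """Yield every immediate member of one category, following continuation."""
    params = {
        "action": "query", "redirects": "true",
        "generator": "categorymembers", "prop": "revisions",
        "rvprop": "content", "format": "json", "formatversion": "2",
        "gcmlimit": "50", "gcmtitle": category,
    }
    while True:
        url = api + "?" + urllib.parse.urlencode(params)
        # Hypothetical User-Agent; substitute whatever the project already sends.
        request = urllib.request.Request(url, headers={"User-Agent": "category-crawl-sketch"})
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        yield from data.get("query", {}).get("pages", [])
        if "continue" not in data:
            break
        params.update(data["continue"])  # carries gcmcontinue and friends


def crawl_category(
    root: str, fetch: Callable[[str], Iterable[dict]] = fetch_members
) -> Tuple[List[str], Dict[str, dict]]:
    """Breadth-first download of a category hierarchy, safe against cycles."""
    seen = {root: None}  # dict keys keep insertion order: an ordered set
    queue = deque([root])
    articles: Dict[str, dict] = {}
    while queue:
        category = queue.popleft()
        for page in fetch(category):
            title = page["title"]
            if page.get("ns") == NS_CATEGORY:
                if title not in seen:  # already-seen categories are never requeued
                    seen[title] = None
                    queue.append(title)
            else:
                # A page can be listed in several categories or batches; merge by title.
                articles.setdefault(title, {}).update(page)
    return list(seen), articles
```

`crawl_category("Category:Yu-Gi-Oh! Master Duel sets")` would return the list of all categories visited, in discovery order, along with the articles found. The category list is what a later incremental update needs, so it would make sense to cache it alongside the downloaded articles.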

To subscribe to incremental updates, the existing script can be used, but each time, it should be called with all the known descendant categories cached from the last full download, in addition to the top-level category itself. This is because the MediaWiki API only provides the immediate parent categories of an article, not all ancestor categories.
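
A rough sketch of how the incremental run could assemble its category list, assuming the last full download wrote the descendant categories to a JSON file as a list of titles. The cache format and the names `categories_to_poll`, `poll_updates`, and `update_one` are hypothetical; `update_one` stands in for whatever entry point the existing update script exposes.

```python
import json
from pathlib import Path
from typing import Callable, List


def categories_to_poll(root: str, cache_path: Path) -> List[str]:
    """Return the top-level category followed by all cached descendants."""
    cached = json.loads(cache_path.read_text()) if cache_path.exists() else []
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys([root, *cached]))


def poll_updates(root: str, cache_path: Path, update_one: Callable[[str], None]) -> None:
    """Run the existing per-category update once for every known category."""
    for category in categories_to_poll(root, cache_path):
        update_one(category)
```

One caveat with this approach: a subcategory created after the last full download is not in the cache, so its members would only be picked up after the next full download refreshes the list.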
