Gracefully handle PRs that contain over 100 changed files #1

Open
@YuriSizov

Description

The GitHub GraphQL API limits how many items can be fetched at once when requesting groups of objects. Each group can contain at most 100 items, and the rest must be fetched through the pagination mechanism (which is why we fetch the PRs themselves through several requests rather than one big request).

This means that when fetching the list of files affected by each PR, we only get the first 100 of them. That is fine for most PRs, but some mega PRs touch more than that. The solution seems simple enough:

  • Fetch all PRs normally.
  • Record all PRs that report exactly 100 files (the API never returns more than that in one response, and PRs with fewer files are already complete).
  • For each of those PRs, which should be just a handful, make a series of requests to get the complete list of changed files.
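The steps above could be sketched roughly as follows. This is a minimal illustration and not the project's actual code: `find_truncated_prs`, `fetch_all_files`, and the `fetch_page` callback are hypothetical names, and the query assumes the paginated `files` connection on `PullRequest` with its usual `pageInfo` fields. The network call is injected as a callback so that the pagination loop can be read (and tested) on its own.

```python
from typing import Callable, Dict, List, Optional, Tuple

PAGE_SIZE = 100  # maximum number of items GitHub returns per connection page

# Assumed query shape: pages through the changed files of a single PR.
FILES_QUERY = """
query($owner: String!, $name: String!, $number: Int!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $number) {
      files(first: 100, after: $cursor) {
        nodes { path }
        pageInfo { hasNextPage endCursor }
      }
    }
  }
}
"""

# Hypothetical callback: takes (pr_number, cursor) and returns
# (paths_on_this_page, next_cursor), where next_cursor is None on the last page.
FetchPage = Callable[[int, Optional[str]], Tuple[List[str], Optional[str]]]


def find_truncated_prs(files_by_pr: Dict[int, List[str]]) -> List[int]:
    """Return PR numbers whose file list hit the page limit and may be cut off."""
    return [
        number
        for number, files in files_by_pr.items()
        if len(files) >= PAGE_SIZE
    ]


def fetch_all_files(number: int, fetch_page: FetchPage) -> List[str]:
    """Follow pagination cursors until the complete file list is collected."""
    paths: List[str] = []
    cursor: Optional[str] = None
    while True:
        page, cursor = fetch_page(number, cursor)
        paths.extend(page)
        if cursor is None:
            return paths
```

In a real run, `fetch_page` would send `FILES_QUERY` with the current cursor and return `endCursor` only while `hasNextPage` is true. A PR with exactly 100 files would then cost a single extra request, which confirms that nothing was cut off.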

This should keep the number of requests relatively low, so we should stay well within our API budget. Of course, if some PR affects several thousand files, it will take a while to gather that information, but that should be a rare occurrence (such PRs are hard to rebase and are typically made only by core maintainers doing big passes on something).
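As a rough check on that budget, the number of requests per large PR can be estimated from the page size. The sketch below assumes the complete list is fetched from the first page at 100 files per request; `requests_needed` is a hypothetical helper, not part of any existing code.

```python
import math

PAGE_SIZE = 100  # files returned per request


def requests_needed(file_count: int) -> int:
    """Estimate how many paginated requests one PR's file list requires."""
    return max(1, math.ceil(file_count / PAGE_SIZE))


print(requests_needed(3000))  # 30
```

Under this assumption, even a PR touching 3000 files costs 30 requests.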

Labels: enhancement
