This repository was archived by the owner on Nov 16, 2023. It is now read-only.

Add support for crawling a GitHub User’s various contributions, independent of a specific org or repo #146

@danisyellis


Our goal is to track contributions by our employees to any open-source project on GitHub. So we'll need to look at each employee’s commits, pull requests, issues, etc. We can do this through the User’s Events.

I have some questions about how to do this:

  1. Is there anything in the current constraints of ghcrawler that will make this an exceptionally difficult task?

  2. How do I say “traverse the Events for a given User”? Is there existing code that does something similar that I could use as an example?

    • Based on the discussion in "Add support for traversing Releases #94", I thought it would be in the GitHub processor. My understanding is that this code in user(), this._addCollection(request, 'repos', 'repo'), should tell the crawler to look at a user’s repos and add those repos to the MongoDB repo collection. But as far as I can tell, it currently processes the user without ever hitting the repo function. Because I care most about events right now, I also tried this._addCollection(request, 'events', 'null'); and this._addCollection(request, 'events', 'events'); but neither seemed to do anything.
  3. Will this require an advanced traversal policy? I think I can use the default traversal policy for now and refine it later with an advanced one that grabs fewer things from the user, if desired (for example, by using a GraphQL query). Is that right?
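
For reference, here is a minimal sketch of what “traverse the Events for a given User” could look like outside of ghcrawler, using the public GitHub REST endpoint for a user's public events. This is not ghcrawler code and does not use _addCollection or the traversal policy machinery. The helper names (userEventsUrl, summarizeContributions, fetchUserEvents) are hypothetical, the choice of event types is an assumption about which contributions matter, and the global fetch call assumes Node 18 or later.

```javascript
// Sketch only: standalone illustration, not part of ghcrawler.
// All function names below are hypothetical.
const GITHUB_API = 'https://api.github.com';

// Build the REST URL for one page of a user's public events.
function userEventsUrl(login, page = 1, perPage = 100) {
  const user = encodeURIComponent(login);
  return `${GITHUB_API}/users/${user}/events/public?per_page=${perPage}&page=${page}`;
}

// Event types assumed to map to the contributions named above
// (commits, pull requests, issues). Adjust as needed.
const CONTRIBUTION_TYPES = new Set(['PushEvent', 'PullRequestEvent', 'IssuesEvent']);

// Count contribution events per repository, keyed by "owner/name".
function summarizeContributions(events) {
  const counts = {};
  for (const event of events) {
    if (!CONTRIBUTION_TYPES.has(event.type)) continue;
    const repo = event.repo.name;
    counts[repo] = counts[repo] || {};
    counts[repo][event.type] = (counts[repo][event.type] || 0) + 1;
  }
  return counts;
}

// Fetch one page of events. Assumes a runtime with global fetch (Node 18+).
async function fetchUserEvents(login, page = 1) {
  const response = await fetch(userEventsUrl(login, page), {
    headers: { Accept: 'application/vnd.github+json' },
  });
  if (!response.ok) {
    throw new Error(`GitHub API returned ${response.status}`);
  }
  return response.json();
}
```

A caller would pass the array returned by fetchUserEvents('some-login') to summarizeContributions to get per-repository counts. Note that this endpoint only returns a user's recent public activity, so a crawler would need to poll it regularly to build up a history.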
