GitHub as event stream proof-of-concept #44

bryanburgers · 2019-06-03T21:10:27Z

We currently have two sources-of-truth when syncing on startup: the
database and issue comments.

This proof of concept tries to bring us down to a single source-of-truth
by effectively elimating the local database (at least for pull request
state; other aspects like treeclosed are not addressed).

In order to do this, we use GitHub's "timeline"
(v3,
v4)
feature to get a fuller view of what has happened to a pull request so
we can reify state in a more complete manner than using issue comments
alone. For example, the timeline API interleaves issue comments with
things like AssignedEvent, HeadRefForcePushedEvent, and
RenamedTitleEvent in the order they happened so we can reliably build
the full state from this historical record.

The general concept is that in both synchronizing and realtime mode,
we have the pull request object respond to the event to update the
state. Side effects like comments and label updates are returned, and
the code can use or drop those depending on which mode Homu in.

PR state --------\
                 |     +---------------+
                 \---> |               | -----> Modified PR state
                       | process_event |
                 /---> |               | --\
                 |     +---------------+   +--> Realtime comments
GitHub Event ----/                         +--> Label updates
                                           \--> Database update

The good thing is that we can model these timeline items and GitHub's
event webhooks (which we use when processing in realtime mode) in much
the same way, so that we can unify them in a way that makes it possible
to effectively test the effect of events on a pull request.

This particular pull request does not attempt to integrate these changes
into the codebase yet, because I wanted to get feedback that this is
indeed a viable path to head down.

Instead, this pull request (which should not be merged) provides a new
executable that demonstrates this ability on existing pull requests in a
read-only mode. The output is a series of relevant timeline items, and
the state of the pull request as it changes.

pip install -e homu
homu-sync-test --repo rust-lang/rust 61419

Start adding some tests to Homu by testing the issue comment command-parsing logic. Take `parse_commands` and break it apart into two phases * Parsing phase * Execution phase The parsing phase returns a list of commands with action names (ideally, this would be a Rust enum, but to simulate that, we use action names on a single class) that are returned regardless of the current state. So for example, `@bors retry` will return a `retry` command regardless of the current state of `realtime`. The execution phase then inspects the list of commands and decides what to do with them. So for example, the `retry` command will be skipped if `realtime == False`. This has the positive result of having a parsing phase that has no side-effects, which makes it much easier to test. This can lead to higher confidence that the code works as expected without the high cost of testing in production and possibly disrupting the build flow. This has the negative result of adding a lot of lines of code to achieve command parsing, which we already do successfully without the tests.

Use `python setup.py test` to run the tests, and a few minor changes required to support that.

Fix merge conflict in homu/main.py

We currently have two sources-of-truth when syncing on startup: the database and issue comments. This proof of concept tries to bring us down to a single source-of-truth by effectively elimating the local database (at least for pull request state; other aspects like `treeclosed` are not addressed). In order to do this, we use GitHub's "timeline" ([v3](https://developer.github.com/v3/issues/timeline/), [v4](https://developer.github.com/v4/object/pullrequest/#timelineitems)) feature to get a fuller view of what has happened to a pull request so we can reify state in a more complete manner than using issue comments alone. For example, the timeline API interleaves issue comments with things like `AssignedEvent`, `HeadRefForcePushedEvent`, and `RenamedTitleEvent` in the order they happened so we can reliably build the full state from this historical record. The general concept is that in both `synchronizing` and `realtime` mode, we have the pull request object respond to the event to update the state. Side effects like comments and label updates are returned, and the code can use or drop those depending on which mode Homu in. PR state --------\ | +---------------+ \---> | | -----> Modified PR state | process_event | /---> | | --\ | +---------------+ +--> Realtime comments GitHub Event ----/ +--> Label updates \--> Database update The good thing is that we can model these timeline items and GitHub's event webhooks (which we use when processing in `realtime` mode) in much the same way, so that we can unify them in a way that makes it possible to effectively test the effect of events on a pull request. This particular pull request does not attempt to integrate these changes into the codebase yet, because I wanted to get feedback that this is indeed a viable path to head down. Instead, this pull request (which should not be merged) provides a new executable that demonstrates this ability on existing pull requests in a read-only mode. The output is a series of relevant timeline items, and the state of the pull request as it changes. pip install -e homu homu-sync-test --repo rust-lang/rust 61419

bryanburgers · 2019-06-03T21:12:33Z

@kennytm @pietroalbini @Manishearth (not sure who else) I'd love to get some 👀 on this, and your thoughts on this approach.

Might be some significant cutting to get there so I wanted to get feedback first, but I think it could have some real tangible benefits in the future, too.

kennytm · 2019-06-04T04:22:44Z

+1 for reducing the SQLite dependency

pietroalbini · 2019-06-04T17:20:40Z

This approach seems awesome!

bryanburgers added 5 commits May 22, 2019 10:21

Use a more python-standard test setup

2f9b34c

Use `python setup.py test` to run the tests, and a few minor changes required to support that.

Merge remote-tracking branch 'origin/master' into test-commands

baedbc1

Fix merge conflict in homu/main.py

Merge remote-tracking branch 'bryan/test-commands' into poc-sync

1aff314

bryanburgers mentioned this pull request Jun 7, 2019

Refactor to improve synchronization and testability #45

Draft

7 tasks

bryanburgers closed this Jul 10, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

GitHub as event stream proof-of-concept #44

GitHub as event stream proof-of-concept #44

Uh oh!

bryanburgers commented Jun 3, 2019

Uh oh!

bryanburgers commented Jun 3, 2019

Uh oh!

kennytm commented Jun 4, 2019

Uh oh!

pietroalbini commented Jun 4, 2019

Uh oh!

Uh oh!

GitHub as event stream proof-of-concept #44

GitHub as event stream proof-of-concept #44

Uh oh!

Conversation

bryanburgers commented Jun 3, 2019

Uh oh!

bryanburgers commented Jun 3, 2019

Uh oh!

kennytm commented Jun 4, 2019

Uh oh!

pietroalbini commented Jun 4, 2019

Uh oh!

Uh oh!