Source PostHog: incremental streams read only relevant pages #4001

Merged
merged 9 commits into from
Jul 27, 2021

@keu keu commented Jun 9, 2021

What

The initial version had a bug that caused incremental streams to read every page of the response.
The issue surfaced in the Annotation stream: although the filtering logic worked correctly, the stream still read all pages because the initial state was never set, so the next_page_token method had nothing to check against.

How

Initialize _initial_state during the first parse_response call, taking it from the stream state or, if that is empty, from the start date.
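
To illustrate the idea, here is a minimal, self-contained sketch. It is not the connector's code: `ToyIncrementalStream`, the in-memory `PAGES` list, the `pages_read` counter, and the simplified method signatures are all assumptions made for the example. It assumes pages arrive newest first and that paging can stop once a page reaches records at or before the initial state.

```python
from typing import Any, Iterable, List, Mapping, Optional

# Hypothetical in-memory "API": pages of records, newest first.
# ISO dates are used so that plain string comparison orders them correctly.
PAGES: List[List[Mapping[str, Any]]] = [
    [{"id": 6, "updated_at": "2021-06-06"}, {"id": 5, "updated_at": "2021-06-05"}],
    [{"id": 4, "updated_at": "2021-06-04"}, {"id": 3, "updated_at": "2021-06-03"}],
    [{"id": 2, "updated_at": "2021-06-02"}, {"id": 1, "updated_at": "2021-06-01"}],
]


class ToyIncrementalStream:
    """Toy stand-in for an incremental stream (not the Airbyte CDK class)."""

    cursor_field = "updated_at"

    def __init__(self, start_date: str) -> None:
        self._start_date = start_date
        self._initial_state: Optional[str] = None
        self.pages_read = 0  # bookkeeping for the example only

    def parse_response(
        self, page: List[Mapping[str, Any]], stream_state: Mapping[str, Any]
    ) -> Iterable[Mapping[str, Any]]:
        # Same expression as in the diff: set once, on the first call.
        self._initial_state = (
            self._initial_state
            or stream_state.get(self.cursor_field)
            or self._start_date
        )
        for record in page:
            if record[self.cursor_field] > self._initial_state:
                yield record

    def next_page_token(
        self, page_index: int, page: List[Mapping[str, Any]]
    ) -> Optional[int]:
        if not page or page_index + 1 >= len(PAGES):
            return None  # no more pages
        oldest = min(r[self.cursor_field] for r in page)
        # Without an initialized state this check never fires,
        # and every page gets read.
        if self._initial_state and oldest <= self._initial_state:
            return None
        return page_index + 1

    def read_records(
        self, stream_state: Mapping[str, Any]
    ) -> Iterable[Mapping[str, Any]]:
        token: Optional[int] = 0
        while token is not None:
            page = PAGES[token]
            self.pages_read += 1
            yield from self.parse_response(page, stream_state)
            token = self.next_page_token(token, page)
```

With a saved state of `{"updated_at": "2021-06-04"}`, this sketch yields records 6 and 5 and stops after the second page; with an empty state it falls back to the start date and reads all three pages.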

Recommended reading order

  1. x.java
  2. y.python

Pre-merge Checklist

Expand the checklist which is relevant for this PR.

Connector checklist

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • Secrets are annotated with airbyte_secret in output spec
  • Unit & integration tests added as appropriate (and are passing)
    • Community members: please provide proof of this succeeding locally e.g: screenshot or copy-paste acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • /test connector=connectors/<name> command as documented here is passing.
    • Community members can skip this, Airbyters will run this for you.
  • Code reviews completed
  • Credentials added to Github CI if needed and not already present. instructions for injecting secrets into CI.
  • Documentation updated
    • README
    • CHANGELOG.md
    • Reference docs in the docs/integrations/ directory.
  • Build is successful
  • Connector version bumped like described here
  • New Connector version released on Dockerhub by running the /publish command described here
  • No major blockers
  • PR merged into master branch
  • Follow up tickets have been created
  • Associated tickets have been closed & stakeholders notified

Connector Generator checklist

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed.

keu commented Jun 9, 2021

/test connector=source-posthog

🕑 source-posthog https://github.com/airbytehq/airbyte/actions/runs/922736500
✅ source-posthog https://github.com/airbytehq/airbyte/actions/runs/922736500

@keu keu requested a review from sherifnada June 9, 2021 18:47

@sherifnada sherifnada left a comment

one suggestion but lgtm

Also why did SAT not catch this?

@@ -104,9 +103,10 @@ def get_updated_state(self, current_stream_state: MutableMapping[str, Any], late
        return {self.cursor_field: max(latest_state, current_state)}

    def parse_response(self, response: requests.Response, stream_state: Mapping[str, Any], **kwargs) -> Iterable[Mapping]:
        self._initial_state = self._initial_state or stream_state.get(self.cursor_field) or self._start_date

it seems like it would be better to set this variable in read_records instead, something like:

self._initial_state = ...
yield from super().read_records(**kwargs)

wdyt?
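
For illustration, the suggestion above could look roughly like the sketch below. `SketchBaseStream` and `SketchIncrementalStream` are hypothetical stand-ins, not the Airbyte CDK classes, and the signatures are simplified assumptions. One detail worth noting: because `read_records` is a generator, the assignment runs only when iteration begins, not when the method is called.

```python
from typing import Any, Iterable, List, Mapping, Optional


class SketchBaseStream:
    """Hypothetical base class that simply yields stored records."""

    def __init__(self, records: List[Mapping[str, Any]]) -> None:
        self._records = records

    def read_records(
        self, stream_state: Optional[Mapping[str, Any]] = None, **kwargs: Any
    ) -> Iterable[Mapping[str, Any]]:
        yield from self._records


class SketchIncrementalStream(SketchBaseStream):
    cursor_field = "updated_at"

    def __init__(self, records: List[Mapping[str, Any]], start_date: str) -> None:
        super().__init__(records)
        self._start_date = start_date
        self._initial_state: Optional[str] = None

    def read_records(
        self, stream_state: Optional[Mapping[str, Any]] = None, **kwargs: Any
    ) -> Iterable[Mapping[str, Any]]:
        stream_state = stream_state or {}
        # Set the initial state once, before any page is requested or parsed.
        self._initial_state = (
            self._initial_state
            or stream_state.get(self.cursor_field)
            or self._start_date
        )
        yield from super().read_records(stream_state=stream_state, **kwargs)
```

Setting the value here keeps `parse_response` free of side effects, at the cost of overriding one more method.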

Contributor Author

nice catch, unfortunately haven't seen this comment

@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation labels Jul 20, 2021

keu commented Jul 22, 2021

/publish connector=source-posthog

❌ source-posthog https://github.com/airbytehq/airbyte/actions/runs/1057609531

keu commented Jul 22, 2021

/publish connector=connectors/source-posthog

🕑 connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1057658261
❌ connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1057658261

keu commented Jul 23, 2021

/publish connector=connectors/source-posthog

🕑 connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1058241790
❌ connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1058241790

keu commented Jul 26, 2021

/publish connector=connectors/source-posthog

🕑 connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1068189413
❌ connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1068189413

keu commented Jul 27, 2021

/publish connector=connectors/source-posthog

🕑 connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1069510171
❌ connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1069510171

keu commented Jul 27, 2021

/publish connector=connectors/source-posthog

🕑 connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1069539746
✅ connectors/source-posthog https://github.com/airbytehq/airbyte/actions/runs/1069539746

@keu keu merged commit 76508b8 into master Jul 27, 2021
@keu keu deleted the keu/fix-posthog-state branch July 27, 2021 01:52
@keu keu added zazmic and removed area/documentation Improvements or additions to documentation labels Jul 27, 2021
Labels
area/connectors Connector related issues
3 participants