-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source PostHog: incremental streams read only relevant pages #4001
Conversation
/test connector=source-posthog
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one suggestion but lgtm
Also why did SAT not catch this?
@@ -104,9 +103,10 @@ def get_updated_state(self, current_stream_state: MutableMapping[str, Any], late | |||
return {self.cursor_field: max(latest_state, current_state)} | |||
|
|||
def parse_response(self, response: requests.Response, stream_state: Mapping[str, Any], **kwargs) -> Iterable[Mapping]: | |||
self._initial_state = self._initial_state or stream_state.get(self.cursor_field) or self._start_date |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like it would be better to set this variable in read_records
instead something like:
self._initial_state = ...
yield from super().read_records(**kwargs)
wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch, unfortunately haven't seen this comment
/publish connector=source-posthog
|
/publish connector=connectors/source-posthog
|
/publish connector=connectors/source-posthog
|
/publish connector=connectors/source-posthog
|
/publish connector=connectors/source-posthog
|
/publish connector=connectors/source-posthog
|
What
The initial version had a bug that causes the reading of all pages from the response.
The issue exposed in Annotation stream, even though filtration logic was working correctly, the stream was reading all pages because we never initialized the state to check it in the next_page_token method.
How
Initialize init_state during the first parse_response call.
Recommended reading order
x.java
y.python
Pre-merge Checklist
Expand the checklist which is relevant for this PR.
Connector checklist
airbyte_secret
in output spec./gradlew :airbyte-integrations:connectors:<name>:integrationTest
./test connector=connectors/<name>
command as documented here is passing.docs/integrations/
directory./publish
command described hereConnector Generator checklist
-scaffold
in their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates
then checking in your changes