-
Notifications
You must be signed in to change notification settings - Fork 24
fix: (CDK) (AsyncRetriever) - fix availability strategy
issues, during check connection
#419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: (CDK) (AsyncRetriever) - fix availability strategy
issues, during check connection
#419
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes issues with the AsyncRetriever's availability strategy during a check connection by exposing a dedicated property for handling rate limits.
- Introduces a new property and setter for exit_on_rate_limit in AsyncRetriever.
- Updates declarative_stream to branch its behavior based on whether the retriever is an AsyncRetriever.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
airbyte_cdk/sources/declarative/retrievers/async_retriever.py | Adds new exit_on_rate_limit property and setter to support async logic. |
airbyte_cdk/sources/declarative/declarative_stream.py | Integrates AsyncRetriever exit_on_rate_limit in stream configuration. |
📝 WalkthroughWalkthroughThis pull request introduces an Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Would you like to proceed with these reviewers, wdyt? Tip ⚡🧪 Multi-step agentic review comment chat (experimental)
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms (9)
🔇 Additional comments (2)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/retrievers/async_retriever.py (1)
39-62
: Consider using a more resilient property chain access patternThe current implementation has a deep property access chain that's fragile to changes and causing type checking issues. Consider implementing a more resilient pattern:
@property def exit_on_rate_limit(self) -> bool: """ Whether to exit on rate limit. This is a property of the job repository and not the stream slicer. The stream slicer is responsible for creating the jobs, but the job repository is responsible for managing the rate limits and other job-related properties. Note: - If the `creation_requester` cannot place / create the job - it might be the case of the RateLimits - If the `creation_requester` can place / create the job - it means all other requesters should successfully manage to complete the results. """ - return self.stream_slicer._job_orchestrator._job_repository.creation_requester.exit_on_rate_limit # type: ignore[return-value] + # Use a helper method to safely access nested properties + def get_creation_requester(): + if not hasattr(self.stream_slicer, "_job_orchestrator") or self.stream_slicer._job_orchestrator is None: + return None + job_orchestrator = self.stream_slicer._job_orchestrator + if not hasattr(job_orchestrator, "_job_repository") or job_orchestrator._job_repository is None: + return None + job_repository = job_orchestrator._job_repository + if not hasattr(job_repository, "creation_requester") or job_repository.creation_requester is None: + return None + return job_repository.creation_requester + + requester = get_creation_requester() + return requester.exit_on_rate_limit if requester and hasattr(requester, "exit_on_rate_limit") else False @exit_on_rate_limit.setter def exit_on_rate_limit(self, value: bool) -> None: """ Sets the `exit_on_rate_limit` property of the job repository > creation_requester, meaning that the Job cannot be placed / created if the rate limit is reached. Thus no futher work on managing jobs is expected to be done. """ - self.stream_slicer._job_orchestrator._job_repository.creation_requester.exit_on_rate_limit = value # type: ignore[assignment] + # Use a helper method to safely access nested properties + def get_creation_requester(): + if not hasattr(self.stream_slicer, "_job_orchestrator") or self.stream_slicer._job_orchestrator is None: + return None + job_orchestrator = self.stream_slicer._job_orchestrator + if not hasattr(job_orchestrator, "_job_repository") or job_orchestrator._job_repository is None: + return None + job_repository = job_orchestrator._job_repository + if not hasattr(job_repository, "creation_requester") or job_repository.creation_requester is None: + return None + return job_repository.creation_requester + + requester = get_creation_requester() + if requester and hasattr(requester, "exit_on_rate_limit"): + requester.exit_on_rate_limit = valueThis approach:
- Makes the code more resilient to structural changes
- Handles missing attributes gracefully
- Improves code readability
- Addresses type checking issues
What do you think?
🧰 Tools
🪛 GitHub Actions: Linters
[error] 52-52: Returning Any from function declared to return 'bool' [no-any-return]
[error] 52-52: Item 'None' of 'AsyncJobOrchestrator | None' has no attribute '_job_repository' [union-attr]
[error] 52-52: Item 'AsyncJobRepository' of 'AsyncJobRepository | Any' has no attribute 'creation_requester' [union-attr]
[error] 61-61: Item 'None' of 'AsyncJobOrchestrator | None' has no attribute '_job_repository' [union-attr]
[error] 61-61: Item 'AsyncJobRepository' of 'AsyncJobRepository | Any' has no attribute 'creation_requester' [union-attr]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/sources/declarative/declarative_stream.py
(2 hunks)airbyte_cdk/sources/declarative/retrievers/async_retriever.py
(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/retrievers/async_retriever.py
[error] 52-52: Returning Any from function declared to return 'bool' [no-any-return]
[error] 52-52: Item 'None' of 'AsyncJobOrchestrator | None' has no attribute '_job_repository' [union-attr]
[error] 52-52: Item 'AsyncJobRepository' of 'AsyncJobRepository | Any' has no attribute 'creation_requester' [union-attr]
[error] 61-61: Item 'None' of 'AsyncJobOrchestrator | None' has no attribute '_job_repository' [union-attr]
[error] 61-61: Item 'AsyncJobRepository' of 'AsyncJobRepository | Any' has no attribute 'creation_requester' [union-attr]
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: Check: 'source-pokeapi' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-hardcoded-records' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Analyze (python)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/declarative_stream.py (3)
17-17
: Import addition looks goodAdding the AsyncRetriever import enables the type checking that follows. Nice job keeping the imports organized alphabetically!
79-82
: Good handling of different retriever typesThe property now correctly handles both AsyncRetriever and standard retrievers. This is a clean way to support both retrieval mechanisms.
86-90
: Clean implementation of the setterThe setter mirrors the getter logic nicely, maintaining symmetry in the codebase. The implementation correctly delegates to the appropriate object based on the retriever type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I've started a brainstorm here to try to find a way for us to avoid to do this for every retrievers we add
What
Original Slack conversation: https://airbytehq-team.slack.com/archives/C07JMAAE620/p1741825218622419
Problem Statement:
availability strategy
that searches for theretriever.requester.exit_on_rate_limits
, which is not available for theAsyncRetriever
.How
creation_requester.exit_on_rate_limits
and re-use itUser Impact
With this update, customers who utilize the
AsyncRetriever
will now have the capability to specify the asynchronous stream as the one for checking the connection. This was not the case previously.Summary by CodeRabbit
New Features
Documentation