-
Notifications
You must be signed in to change notification settings - Fork 31
feat: Add unprivileged and config-free discover for declarative static schemas #559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Add unprivileged and config-free discover for declarative static schemas #559
Conversation
…cSchemaLoader Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…ation control Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…class Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…ver property Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…discover-for-declarative-static-schemas
📝 WalkthroughWalkthroughAllows discover to run without an explicit config when the source can load schemas dynamically. Adds a check_config_during_discover flag, detects DynamicSchemaLoader usage in declarative sources, updates CLI/entrypoint parsing/behavior for discover, and adds tests for the new flows. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Entrypoint
participant Source
rect rgb(220,240,255)
Note over Entrypoint: Old flow — discover required config
User->>Entrypoint: discover (no config)
Entrypoint->>Entrypoint: error: config required
end
rect rgb(240,255,240)
Note over Entrypoint,Source: New flow — conditional requirement
User->>Entrypoint: discover (no config)
Entrypoint->>Source: query check_config_during_discover?
alt Source says config not required (False)
Source-->>Entrypoint: False
Entrypoint->>Source: discover(config = {})
Source-->>Entrypoint: AirbyteCatalog
else Source requires config (True)
Source-->>Entrypoint: True
Entrypoint->>Entrypoint: raise ValueError (helpful message)
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Areas to inspect closely:
Possibly related PRs
Suggested reviewers
Question for the author: the BaseConnector default for Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (3)
7-14: Trim unused imports to avoid linter noise, wdyt?
pytestandcheck_config_against_spec_or_exitare imported but never referenced in this module. Removing them keeps the test file lean and prevents futureflake8/ruffwarnings.-from unittest.mock import MagicMock, patch - -import pytest - -from airbyte_cdk.models import AirbyteCatalog -from airbyte_cdk.sources.declarative.manifest_declarative_source import ManifestDeclarativeSource -from airbyte_cdk.sources.utils.schema_helpers import check_config_against_spec_or_exit +from unittest.mock import MagicMock, patch + +from airbyte_cdk.models import AirbyteCatalog +from airbyte_cdk.sources.declarative.manifest_declarative_source import ManifestDeclarativeSource
16-43: Collapse almost-identical configs with@pytest.mark.parametrize?Both tests build large, nearly identical
source_configdicts that differ only by theschema_loadertype and the expected boolean flag. Switching to a single parametrized test would reduce duplication and make the intent clearer:@pytest.mark.parametrize( "schema_loader, expected_flag", [ ({"type": "DynamicSchemaLoader", ...}, True), ({"type": "InlineSchemaLoader", "schema": {}}, False), ], ) def test_check_config_during_discover(schema_loader, expected_flag): source_config = {... "schema_loader": schema_loader, ...} source = ManifestDeclarativeSource(source_config=source_config) assert source.check_config_during_discover is expected_flag assert source.check_config_against_spec is TrueThis keeps the focus on the behavior being exercised rather than on the boilerplate fixture, wdyt?
Also applies to: 52-68
84-87: Simplify mock stream name assignment for claritySetting the name via
type(mock_airbyte_stream).name = "test_dynamic_stream"works but is slightly cryptic. Assigning the attribute directly on the instance is more obvious:-mock_airbyte_stream = MagicMock() -type(mock_airbyte_stream).name = "test_dynamic_stream" +mock_airbyte_stream = MagicMock() +mock_airbyte_stream.name = "test_dynamic_stream"Unless there’s a strict need for the attribute to live on the class, the direct instance assignment is easier to read, wdyt?
airbyte_cdk/entrypoint.py (1)
106-110: CLI UX: optional flag still listed under “required” group
--configis no longer required fordiscover, but it’s still added to the “required named arguments” group. This can mislead users reading--help. Would moving it to the parent parser (or renaming the group) be clearer?-required_discover_parser = discover_parser.add_argument_group("required named arguments") -required_discover_parser.add_argument( +discover_parser.add_argument( "--config", type=str, required=False, help="path to the json configuration file" )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (6)
airbyte_cdk/cli/source_declarative_manifest/_run.py(1 hunks)airbyte_cdk/connector.py(1 hunks)airbyte_cdk/entrypoint.py(3 hunks)airbyte_cdk/sources/declarative/manifest_declarative_source.py(3 hunks)unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py(1 hunks)unit_tests/test_entrypoint.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Check: 'source-shopify' (skip=false)
- GitHub Check: Check: 'source-amplitude' (skip=false)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (6)
airbyte_cdk/connector.py (1)
35-42: Documentation for config validation control flags is clear and well-placed.The docstrings for both
check_config_against_specandcheck_config_during_discoverclearly explain their purpose and behavior. The new flag allows sources to provide catalog information without requiring authentication, which is a useful extension.airbyte_cdk/cli/source_declarative_manifest/_run.py (1)
238-239: Good change to avoid None for config.This change ensures that
configis always a dictionary (possibly empty) rather thanNonewhen no valid config argument is provided, which makes downstream processing more consistent. This supports the new unprivileged discovery flow.unit_tests/test_entrypoint.py (1)
246-246: Test updated correctly to reflect optional config for discover.The test definition has been properly updated to reflect the fact that the "discover" command no longer requires a config parameter. This aligns with the changes in the entrypoint implementation.
airbyte_cdk/sources/declarative/manifest_declarative_source.py (2)
143-144: Good pattern for setting check_config_during_discover based on schema loader type.Setting this flag based on the presence of a
DynamicSchemaLoaderis a clean approach that automatically enables the appropriate behavior without requiring manual configuration for each source.
549-579: Well-implemented schema loader detection method.The
_uses_dynamic_schema_loadermethod is thorough, checking both static streams and dynamic stream templates. The documentation is clear about its purpose and return value.I particularly appreciate the detailed checks that handle nested configurations and the various ways streams can be defined in the manifest.
airbyte_cdk/entrypoint.py (1)
278-282: Double-check backwards compatibility of the new gate
discover()now validates config only whencheck_config_during_discoverisTrue. For connectors that never define the flag (older versions), the default value onBaseConnectorwill govern behavior. Could you confirm that default isTrueto preserve existing semantics? If not, adding an explicit default here might avoid accidental skips, wdyt?
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
- Fixed TypeError in ManifestDeclarativeSource._stream_configs() missing required positional argument 'config' - Added empty_config parameter when calling _stream_configs during schema detection - This enables config-free discover to work for static schema connectors Manual testing shows this fixes the immediate TypeError but additional work needed for datetime parsing issues. Co-Authored-By: AJ Steers <aj@airbyte.io>
- Fix ConnectorTestScenario.connector_root attribute error in source_base.py - Update manual test plan with complete CLI testing results - Document successful config-free discover for source-pokeapi (static schema) - Confirm dynamic schema connectors correctly require config as expected - All local quality checks (MyPy, ruff format, ruff check) pass Key findings: - PR #559 core functionality is working for static schema connectors - source-pokeapi successfully returns catalog without config - source-datascope still has datetime parsing issues (separate fix needed) - Dynamic schema connectors correctly fail without config as expected Co-Authored-By: AJ Steers <aj@airbyte.io>
- Add blank line after AssertionError in connector_base.py - Resolves ruff format check CI failure Co-Authored-By: AJ Steers <aj@airbyte.io>
- Convert relative paths to absolute paths before creating frozen ConnectorTestScenario models - Fixes PyTest failures in CI by preventing attempts to modify frozen Pydantic instances - Local tests now pass: 7 passed, 1 skipped Co-Authored-By: AJ Steers <aj@airbyte.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
airbyte_cdk/test/standard_tests/connector_base.py (1)
123-138: Good fixes addressing the previous review comments!The null check for
result._messagesand the correct use ofscenario.expected_outcome.expect_exception()both address the issues flagged in previous reviews.However, there's a formatting issue that needs attention, wdyt?
if ( - scenario.expected_outcome.expect_exception() - and conn_status.status == Status.SUCCEEDED + scenario.expected_outcome.expect_exception() + and conn_status.status == Status.SUCCEEDED and not result.errors ):
🧹 Nitpick comments (1)
airbyte_cdk/test/standard_tests/connector_base.py (1)
153-193: Solid implementation of scenario loading with good error handling!The method properly loads YAML config, validates scenarios, and handles path resolution. The filtering logic for "iam_role" tests and path conversion to absolute paths is well thought out.
One small suggestion: consider making the categories configurable rather than hardcoded, wdyt?
@classmethod def get_scenarios( cls, + categories: list[str] | None = None, ) -> list[ConnectorTestScenario]: """Get acceptance tests for a given category.""" - categories = ["connection", "spec"] + if categories is None: + categories = ["connection", "spec"]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (2)
airbyte_cdk/test/standard_tests/connector_base.py(2 hunks)airbyte_cdk/test/standard_tests/source_base.py(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/test/standard_tests/source_base.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
airbyte_cdk/test/standard_tests/connector_base.py (4)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
🪛 GitHub Actions: Linters
airbyte_cdk/test/standard_tests/connector_base.py
[error] 136-137: ruff formatting check failed. File requires reformatting as indicated by 'poetry run ruff format --diff'.
🔇 Additional comments (3)
airbyte_cdk/test/standard_tests/connector_base.py (3)
11-15: Import changes look good!The addition of
yamlandAirbyteConnectionStatusimports align well with the new YAML config loading functionality and explicit type casting in the test methods.
139-142: Clean implementation of connector root directory discovery!The method properly leverages the existing
find_connector_rootutility and provides sensible search locations.
144-151: Well-implemented class property with proper error handling!The implementation correctly uses the connector root directory and provides clear error messaging when the config file is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/test/standard_tests/connector_base.py (2)
154-194: Consider adding error handling for YAML parsing and path resolution.The scenario loading logic is comprehensive, but there are a few areas where additional error handling might be helpful:
- YAML parsing could fail with malformed files
- Path resolution for config and catalog paths could fail if the files don't exist
Would you like to add try-catch blocks around the YAML loading and path resolution operations, wdyt?
- all_tests_config = yaml.safe_load(cls.acceptance_test_config_path.read_text()) + try: + all_tests_config = yaml.safe_load(cls.acceptance_test_config_path.read_text()) + except yaml.YAMLError as e: + raise ValueError(f"Failed to parse YAML config at {cls.acceptance_test_config_path}: {e}") from eAlso, should we validate that the resolved paths actually exist before returning the scenarios, wdyt?
179-185: Consider extracting the filtering logic for better maintainability.The list comprehension with multiple conditions is functional but could be more readable. Would you consider extracting this into a helper method or breaking it down for clarity, wdyt?
- test_scenarios.extend( - [ - ConnectorTestScenario.model_validate(test) - for test in all_tests_config["acceptance_tests"][category]["tests"] - if "config_path" in test and "iam_role" not in test["config_path"] - ] - ) + for test in all_tests_config["acceptance_tests"][category]["tests"]: + if "config_path" not in test: + continue + if "iam_role" in test["config_path"]: + # Skip iam_role tests as they are not supported in the test suite + continue + test_scenarios.append(ConnectorTestScenario.model_validate(test))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
airbyte_cdk/sources/declarative/manifest_declarative_source.py(4 hunks)airbyte_cdk/test/standard_tests/connector_base.py(2 hunks)airbyte_cdk/test/standard_tests/source_base.py(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- airbyte_cdk/test/standard_tests/source_base.py
- airbyte_cdk/sources/declarative/manifest_declarative_source.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, ignore all `__init__.py` files when providing a recommended reviewing order.
airbyte_cdk/test/standard_tests/connector_base.py (4)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
🧬 Code Graph Analysis (1)
airbyte_cdk/test/standard_tests/connector_base.py (5)
airbyte_cdk/test/entrypoint_wrapper.py (2)
connection_status_messages(167-168)errors(200-201)airbyte_cdk/test/models/scenario.py (2)
expected_outcome(93-99)ConnectorTestScenario(27-179)airbyte_cdk/test/models/outcome.py (1)
expect_exception(55-57)airbyte_cdk/test/standard_tests/docker_base.py (4)
get_connector_root_dir(53-55)get_test_class_dir(46-50)acceptance_test_config_path(69-75)get_scenarios(78-132)airbyte_cdk/utils/connector_paths.py (1)
find_connector_root(180-198)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-shopify
- GitHub Check: SDM Docker Image Build
- GitHub Check: Analyze (python)
🔇 Additional comments (4)
airbyte_cdk/test/standard_tests/connector_base.py (4)
11-15: Import changes look good.The addition of
yamlandAirbyteConnectionStatusimports align with the new functionality, and the cleanup of unused imports (AirbyteMessage,Type) is appropriate.
123-139: Great improvements to error handling and validation!The enhanced error message formatting and explicit connection status validation significantly improve debugging capabilities. I notice the past review comments have been addressed with the null check for
result._messagesand the proper attribute access forscenario.expected_outcome.expect_exception().
140-143: Clean delegation to utility function.The
get_connector_root_dirmethod properly delegates tofind_connector_rootwith appropriate search paths. This aligns with the existing pattern inDockerConnectorTestSuite.
145-152: Solid implementation of the acceptance test config path property.The
classpropertydecorator and error handling provide clear feedback when the config file is missing. Good use of the existingACCEPTANCE_TEST_CONFIGconstant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/test/entrypoint_wrapper.py(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, ignore all `__init__.py` files when providing a recommended reviewing order.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Analyze (python)
…schemas Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
|
|
||
| from unittest.mock import MagicMock, patch | ||
|
|
||
| import pytest |
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
Fixed
Show fixed
Hide fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (1)
1-174: Comprehensive unit test coverage!The tests effectively verify:
check_config_during_discoveris True with DynamicSchemaLoadercheck_config_during_discoveris False with InlineSchemaLoader- Discovery works without config in both scenarios
- The flag states are consistent before and after discovery
The tests are well-structured and have clear docstrings explaining the intent.
One observation: the tests mock
ManifestDeclarativeSource.streams, which means they don't fully exercise the_uses_dynamic_schema_loader()detection logic or the full discovery flow. This is fine for unit tests, but have you also added integration tests that verify the end-to-end behavior without mocks? Wdyt about adding a test that doesn't mock streams to ensure_uses_dynamic_schema_loader()correctly inspects the actual manifest?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
airbyte_cdk/entrypoint.py(3 hunks)airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py(3 hunks)airbyte_cdk/test/entrypoint_wrapper.py(0 hunks)airbyte_cdk/test/standard_tests/source_base.py(4 hunks)unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py(1 hunks)
💤 Files with no reviewable changes (1)
- airbyte_cdk/test/entrypoint_wrapper.py
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte_cdk/test/standard_tests/source_base.py
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Applied to files:
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.pyunit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Applied to files:
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.pyunit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.pyairbyte_cdk/entrypoint.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Applied to files:
airbyte_cdk/entrypoint.py
🧬 Code graph analysis (3)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (1)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)
_stream_configs(517-536)dynamic_streams(511-515)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (4)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (1)
ManifestDeclarativeSource(98-653)airbyte_cdk/sources/utils/schema_helpers.py (1)
check_config_against_spec_or_exit(188-206)airbyte_cdk/entrypoint.py (1)
discover(278-287)airbyte_cdk/sources/abstract_source.py (2)
discover(85-90)streams(74-79)
airbyte_cdk/entrypoint.py (4)
airbyte_cdk/sources/abstract_source.py (2)
_emit_queued_messages(291-294)discover(85-90)airbyte_cdk/test/entrypoint_wrapper.py (1)
discover(397-419)airbyte_cdk/sources/source.py (1)
discover(48-52)airbyte_cdk/connector.py (3)
read_config(51-58)configure(45-48)configure(122-127)
🪛 GitHub Actions: Linters
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
[warning] 1-1: ruff format would reformat 2 files (code style changes available).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-intercom
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Check: source-shopify
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (python)
🔇 Additional comments (9)
airbyte_cdk/entrypoint.py (6)
108-108: LGTM! Making --config optional enables unprivileged discovery.This change aligns perfectly with the PR's objective to support discovery without configuration for declarative static schemas.
180-184: Nice refactor to generator expressions for consistency.This change makes the spec emission flow consistent with the new patterns introduced in the discover and read branches.
185-199: Unprivileged discover flow looks solid!The logic correctly handles the case where:
- Command is "discover"
- No config is provided
- Source doesn't require config during discover
The empty config dictionary approach should work for sources that don't need authentication during discovery. Just want to verify: have you tested that the
discovermethod handles empty configs gracefully across different source types, particularly those with conditional streams? Wdyt about adding a defensive check or comment about this assumption?
200-202: Excellent error message for missing config!This addresses the previous review feedback about providing a clear, actionable error when config is required but not provided. Much better UX!
207-210: Consistent message emission pattern.Good to see the generator expression pattern applied consistently across all branches.
282-282: Correct flag for controlling config validation during discovery.Switching from
check_config_against_spectocheck_config_during_discoverproperly separates concerns: the former controls validation during read operations, while the latter specifically governs discovery behavior. This gives sources fine-grained control over when config is required.airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (3)
101-102: Good default for declarative sources.Setting
check_config_during_discover = Falseas the default for declarative sources makes sense, as many declarative connectors can provide schema information without authentication. The docstring clearly explains the intent.
155-155: Smart runtime detection of DynamicSchemaLoader.The approach of dynamically setting
check_config_during_discoverbased on whether any stream usesDynamicSchemaLoaderis elegant. This means sources automatically adapt their behavior based on their schema loading strategy.One question: does
_uses_dynamic_schema_loader()account forConditionalStreamswhere streams might be conditionally included based on config? If a DynamicSchemaLoader is nested inside a conditional stream that's excluded with an empty config, should we still require config during discovery? Wdyt about the current behavior in that edge case?
622-653: ConditionalStreams edge case confirmed and conservatively handled.Verification confirms the implementation is sound. ConditionalStreams conditions properly evaluate via InterpolatedBoolean when
_stream_configsis called with empty config. Any streams filtered out by unmet conditions won't be detected as having DynamicSchemaLoader, so config validation remains required during discovery—a reasonable and conservative approach. Would documenting this behavior be helpful for future maintainers, wdyt?
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
Show resolved
Hide resolved
…static schemas - Move check_config_during_discover logic to ConcurrentDeclarativeSource (the non-deprecated class) - Add _uses_dynamic_schema_loader() method to detect DynamicSchemaLoader usage - Set check_config_during_discover=False for sources without DynamicSchemaLoader - Fix docstring in ManifestDeclarativeSource to correctly describe DynamicSchemaLoader behavior - Add entrypoint unit tests for discover with/without config - Improve error message when config is required but not provided - Update test imports to use legacy module location This enables declarative sources with static schemas (InlineSchemaLoader) to run discover without credentials, while sources with DynamicSchemaLoader still require config for authentication. Co-Authored-By: AJ Steers <aj@airbyte.io>
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
Fixed
Show fixed
Hide fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
airbyte_cdk/entrypoint.py(3 hunks)airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py(3 hunks)airbyte_cdk/sources/declarative/concurrent_declarative_source.py(2 hunks)unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py(1 hunks)unit_tests/test_entrypoint.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Applied to files:
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.pyairbyte_cdk/sources/declarative/concurrent_declarative_source.py
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Applied to files:
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.pyairbyte_cdk/entrypoint.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Applied to files:
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.pyairbyte_cdk/entrypoint.py
📚 Learning: 2025-01-14T00:20:32.310Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
Applied to files:
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py
🧬 Code graph analysis (4)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (1)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (3)
_uses_dynamic_schema_loader(657-688)_stream_configs(519-538)dynamic_streams(513-517)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (1)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (2)
_uses_dynamic_schema_loader(622-654)_stream_configs(498-519)
airbyte_cdk/entrypoint.py (3)
airbyte_cdk/sources/abstract_source.py (2)
_emit_queued_messages(291-294)discover(85-90)airbyte_cdk/test/entrypoint_wrapper.py (1)
discover(397-419)airbyte_cdk/connector.py (3)
read_config(51-58)configure(45-48)configure(122-127)
unit_tests/test_entrypoint.py (3)
unit_tests/sources/test_abstract_source.py (7)
MockSource(62-99)message_repository(98-99)message_repository(150-155)streams(84-87)name(189-190)name(255-256)name(282-283)airbyte_cdk/sources/message/repository.py (5)
consume_queue(59-60)consume_queue(70-71)consume_queue(93-95)consume_queue(114-115)consume_queue(138-139)airbyte_cdk/entrypoint.py (3)
AirbyteEntrypoint(53-390)run(156-239)discover(282-291)
🪛 GitHub Actions: Linters
unit_tests/test_entrypoint.py
[error] 856-856: Ruff formatting: 2 files would be reformatted. Run 'poetry run ruff format' to fix formatting.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (python)
| MockSource.discover.assert_called_once() | ||
| call_args = MockSource.discover.call_args | ||
| assert call_args[0][1] == {} # config argument should be empty dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the config assertion so the test passes
MockSource.discover is patched on the class, so the bound call records (mock_source, logger, config) in call_args[0]. The current assertion grabs index 1, which is the logger, so the check will always fail before your change can even land. Could we unpack the tuple and assert against the third positional element (the config) instead, like below? wdyt?
- call_args = MockSource.discover.call_args
- assert call_args[0][1] == {} # config argument should be empty dict
+ _, _, passed_config = MockSource.discover.call_args[0]
+ assert passed_config == {} # config argument should be empty dict📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| MockSource.discover.assert_called_once() | |
| call_args = MockSource.discover.call_args | |
| assert call_args[0][1] == {} # config argument should be empty dict | |
| MockSource.discover.assert_called_once() | |
| _, _, passed_config = MockSource.discover.call_args[0] | |
| assert passed_config == {} # config argument should be empty dict |
🤖 Prompt for AI Agents
In unit_tests/test_entrypoint.py around lines 898 to 900, the test currently
checks call_args[0][1] which is the logger because MockSource.discover was
patched on the class and the bound call records (mock_source, logger, config);
update the assertion to inspect the third positional argument (the config)
instead — either unpack call_args[0] and assert the third element is {} or
directly assert call_args[0][2] == {}.
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (1)
9-9: Remove unused pytest import.The
pytestimport isn't used anywhere in this test file, wdyt about removing it?-import pytest -
🧹 Nitpick comments (2)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (2)
77-129: Consider testing actual behavior without mockingstreams().A few observations about this test:
Lines 120-121 duplicate the flag assertions from
test_check_config_during_discover_with_dynamic_schema_loader– are these redundant?By mocking
streams(), the test bypasses the actual validation logic that would normally run inManifestDeclarativeSource.streams(). According to the code inmanifest_declarative_source.py, thestreams()method validates config viaself._spec_component.validate_config(config). This test doesn't verify whether that validation is correctly triggered or skipped based on thecheck_config_during_discoverflag.The test verifies that
discover()can be called with an empty config whenstreams()is mocked, but doesn't confirm the real end-to-end flow works as intended.Would it be more valuable to test the actual discover flow without mocking, perhaps in an integration test? Or is the mock intentional here to isolate the discover method from streams behavior, wdyt?
131-175: Misleading docstring and redundant assertions.A few issues here:
The docstring (line 135) says "Test that discovery validates config when DynamicSchemaLoader is not used," but since
streams()is mocked, no actual config validation occurs. The test just verifies thatdiscover()returns a catalog when called with an empty config and mocked streams. Maybe update the docstring to reflect what's actually being tested?Lines 164-165 and 174-175 repeat the flag assertions already covered in
test_check_config_during_discover_without_dynamic_schema_loader– these seem redundant, wdyt?Similar to my comment on the previous test, mocking
streams()prevents testing the actual validation behavior. Is this intentional?This test has significant overlap with
test_discover_with_dynamic_schema_loader_no_config– could these be parameterized to reduce duplication (e.g.,@pytest.mark.parametrizewith schema_loader type and expected flag value)?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py(1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Applied to files:
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Applied to files:
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
📚 Learning: 2024-11-10T04:50:11.914Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 13
File: airbyte_cdk/connector.py:99-99
Timestamp: 2024-11-10T04:50:11.914Z
Learning: When a PR's goal is to run the autoformat task from `ruff`, avoid suggesting code changes beyond formatting to prevent potential negative side effects.
Applied to files:
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Applied to files:
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
🧬 Code graph analysis (1)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (4)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (2)
ManifestDeclarativeSource(98-654)streams(305-342)airbyte_cdk/entrypoint.py (1)
discover(282-291)airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)
discover(385-388)streams(391-423)airbyte_cdk/sources/abstract_source.py (2)
discover(85-90)streams(74-79)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Check: source-shopify
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (python)
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (2)
17-50: LGTM! Clean flag verification test.This test clearly verifies that
check_config_during_discoveris correctly set toTruewhen using DynamicSchemaLoader.
52-75: LGTM! Good coverage of the static schema case.This test nicely complements the previous test by verifying that
check_config_during_discoverisFalsewhen using InlineSchemaLoader.
feat: Add unprivileged discover support for declarative sources with static schemas
Summary
This PR enables declarative sources with static schemas (using
InlineSchemaLoader) to rundiscoverwithout providing credentials. This is useful when the API schema is known statically and doesn't require authentication to determine available streams.Key changes:
check_config_during_discoverattribute toBaseConnector(defaults toTruefor backward compatibility)_uses_dynamic_schema_loader()method in bothConcurrentDeclarativeSourceandManifestDeclarativeSourceto detect when DynamicSchemaLoader is present--configoptional for the discover command in the entrypointcheck_config_during_discover=Falseand no config is provided, discover runs with an empty configScope & Limitations:
InlineSchemaLoaderor other static schema loadersDynamicSchemaLoaderstill require config (as they make separate HTTP calls to retrieve schema)streams()is fully invokedRelated work:
Review & Testing Checklist for Human
Critical items to verify (4):
_uses_dynamic_schema_loader()in bothConcurrentDeclarativeSourceandManifestDeclarativeSource. Does it correctly check all manifest locations where DynamicSchemaLoader could be defined? Are there edge cases it misses?ManifestDeclarativeSource(legacy) toConcurrentDeclarativeSourceduring this PR. Verify thesource-declarative-manifestCLI command works correctly with unprivileged discover.Test Plan
Notes
Test changes rationale:
test_entrypoint.pybecause discover no longer requires--configin all cases. This is intentional and matches the new behavior where config is optional.Known limitation:
The current implementation enables unprivileged discover at the entrypoint/command level, but the
streams()method is still called, which fully initializes Stream objects. Connectors that access config during stream initialization may still fail even whencheck_config_during_discover=False. This is acceptable for the current scope and can be addressed in a follow-up PR if needed.Files most critical to review:
airbyte_cdk/sources/declarative/concurrent_declarative_source.py- Main implementationairbyte_cdk/entrypoint.py- Entrypoint logic for handling unprivileged discoverunit_tests/test_entrypoint.py- New tests for discover with/without configSummary by CodeRabbit
New Features
Bug Fixes
Tests