Skip to content

Conversation

@aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented May 20, 2025

feat: Add unprivileged discover support for declarative sources with static schemas

Summary

This PR enables declarative sources with static schemas (using InlineSchemaLoader) to run discover without providing credentials. This is useful when the API schema is known statically and doesn't require authentication to determine available streams.

Key changes:

  • Added check_config_during_discover attribute to BaseConnector (defaults to True for backward compatibility)
  • Implemented _uses_dynamic_schema_loader() method in both ConcurrentDeclarativeSource and ManifestDeclarativeSource to detect when DynamicSchemaLoader is present
  • Made --config optional for the discover command in the entrypoint
  • Added unprivileged discover logic: when check_config_during_discover=False and no config is provided, discover runs with an empty config
  • Improved error message when config is required but not provided
  • Added entrypoint unit tests for both supported and unsupported unprivileged discover scenarios

Scope & Limitations:

  • Only works for declarative sources where the manifest uses InlineSchemaLoader or other static schema loaders
  • Sources with DynamicSchemaLoader still require config (as they make separate HTTP calls to retrieve schema)
  • Even with unprivileged discover enabled, connectors that access config during stream initialization may still fail because streams() is fully invoked
  • Programmatic (non-declarative) sources continue to require config by default

Related work:

Review & Testing Checklist for Human

Critical items to verify (4):

  • DynamicSchemaLoader detection logic is complete: Review _uses_dynamic_schema_loader() in both ConcurrentDeclarativeSource and ManifestDeclarativeSource. Does it correctly check all manifest locations where DynamicSchemaLoader could be defined? Are there edge cases it misses?
  • Test with real connectors: Beyond PokeAPI, test discover without config on 2-3 other declarative connectors with static schemas. Verify they return catalogs successfully. Also test a connector with DynamicSchemaLoader to ensure it correctly requires config.
  • CLI path works correctly: The implementation was moved from ManifestDeclarativeSource (legacy) to ConcurrentDeclarativeSource during this PR. Verify the source-declarative-manifest CLI command works correctly with unprivileged discover.
  • Error message clarity: When discover fails due to missing config, verify the error message clearly explains that unprivileged discovery is not supported and how to fix it.

Test Plan

# Test 1: Declarative source with static schema (should work without config)
poetry run source-declarative-manifest discover --manifest-path=/path/to/connector/with/inline/schema/manifest.yaml

# Test 2: Declarative source with DynamicSchemaLoader (should fail with clear error)
poetry run source-declarative-manifest discover --manifest-path=/path/to/connector/with/dynamic/schema/manifest.yaml

# Test 3: Normal discover with config still works
poetry run source-declarative-manifest discover --config=/path/to/config.json --manifest-path=/path/to/manifest.yaml

# Test 4: Run unit tests
poetry run pytest unit_tests/test_entrypoint.py::test_run_discover_without_config_when_supported
poetry run pytest unit_tests/test_entrypoint.py::test_run_discover_without_config_when_not_supported
poetry run pytest unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py

Notes

Test changes rationale:

  • Removed "discover" from required-arg tests in test_entrypoint.py because discover no longer requires --config in all cases. This is intentional and matches the new behavior where config is optional.
  • Added positive coverage for both branches (unprivileged discover supported vs. not supported) to prevent test-modification-to-pass issues.

Known limitation:
The current implementation enables unprivileged discover at the entrypoint/command level, but the streams() method is still called, which fully initializes Stream objects. Connectors that access config during stream initialization may still fail even when check_config_during_discover=False. This is acceptable for the current scope and can be addressed in a follow-up PR if needed.

Files most critical to review:

  1. airbyte_cdk/sources/declarative/concurrent_declarative_source.py - Main implementation
  2. airbyte_cdk/entrypoint.py - Entrypoint logic for handling unprivileged discover
  3. unit_tests/test_entrypoint.py - New tests for discover with/without config

Summary by CodeRabbit

  • New Features

    • Discover no longer requires a config when a source supports unprivileged discovery (dynamic-schema scenarios); discovery will run with an empty config in those cases.
    • Introduced a per-source toggle to control whether config validation runs during discovery.
  • Bug Fixes

    • Improved and explicit error when --config is missing but unprivileged discovery is not supported.
  • Tests

    • Added and updated tests covering discover behavior with and without config and the new validation toggle.

devin-ai-integration bot and others added 21 commits April 8, 2025 22:33
…cSchemaLoader

Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…ation control

Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…class

Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
…ver property

Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
Co-Authored-By: Aaron <AJ> Steers <aj@airbyte.io>
@github-actions github-actions bot added the enhancement New feature or request label May 20, 2025
@aaronsteers aaronsteers marked this pull request as ready for review May 20, 2025 03:11
@coderabbitai
Copy link
Contributor

coderabbitai bot commented May 20, 2025

📝 Walkthrough

Walkthrough

Allows discover to run without an explicit config when the source can load schemas dynamically. Adds a check_config_during_discover flag, detects DynamicSchemaLoader usage in declarative sources, updates CLI/entrypoint parsing/behavior for discover, and adds tests for the new flows.

Changes

Cohort / File(s) Summary
CLI Config Parsing
airbyte_cdk/cli/source_declarative_manifest/_run.py
_parse_inputs_into_config_catalog_state now assigns {} (empty dict) instead of None when config is missing or falsy.
Core Framework — Connector Flags
airbyte_cdk/connector.py
Added check_config_during_discover: bool = True to BaseConnector with docstring; clarified check_config_against_spec docstring.
Entrypoint — Discovery Flow
airbyte_cdk/entrypoint.py
Made --config optional for discover; use check_config_during_discover to decide validation during discover; added branch to run discover with empty config or raise a clear ValueError when unsupported.
Legacy Manifest Declarative Source
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py
Added check_config_during_discover: bool = False; added _uses_dynamic_schema_loader() to detect DynamicSchemaLoader usage and set the flag after post-processing.
Concurrent Declarative Source
airbyte_cdk/sources/declarative/concurrent_declarative_source.py
Added _uses_dynamic_schema_loader() helper and initialize check_config_during_discover in __init__ based on detection.
Tests — New & Updated
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py, unit_tests/test_entrypoint.py, airbyte_cdk/test/standard_tests/source_base.py
New tests for dynamic vs inline schema discovery and check_config_during_discover; updated entrypoint tests to cover discover-without-config behavior; adjusted standard tests to add preliminary check step and update skip/expectation handling.
Formatting / Minor
airbyte_cdk/test/entrypoint_wrapper.py
Removed an extra blank line (no behavioral change).

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Entrypoint
    participant Source

    rect rgb(220,240,255)
    Note over Entrypoint: Old flow — discover required config
    User->>Entrypoint: discover (no config)
    Entrypoint->>Entrypoint: error: config required
    end

    rect rgb(240,255,240)
    Note over Entrypoint,Source: New flow — conditional requirement
    User->>Entrypoint: discover (no config)
    Entrypoint->>Source: query check_config_during_discover?
    alt Source says config not required (False)
        Source-->>Entrypoint: False
        Entrypoint->>Source: discover(config = {})
        Source-->>Entrypoint: AirbyteCatalog
    else Source requires config (True)
        Source-->>Entrypoint: True
        Entrypoint->>Entrypoint: raise ValueError (helpful message)
    end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas to inspect closely:

  • airbyte_cdk/entrypoint.py — branching logic for discover (empty config path) and the raised ValueError messaging.
  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py and airbyte_cdk/sources/declarative/concurrent_declarative_source.py — correctness and completeness of _uses_dynamic_schema_loader() traversal across streams, dynamic_streams, and loaders.
  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py and unit_tests/test_entrypoint.py — ensure tests reflect intended behavior for both dynamic and inline schema loaders and for both paths (with/without config).

Possibly related PRs

Suggested reviewers

  • maxi297
  • brianjlai
  • aldogonzalez8

Question for the author: the BaseConnector default for check_config_during_discover is True while ManifestDeclarativeSource sets it to False until runtime detection — did you consider aligning the manifest default with the base class, or is the asymmetry intentional to reflect post-processing detection, wdyt?

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 61.90% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "feat: Add unprivileged and config-free discover for declarative static schemas" accurately and specifically describes the main change introduced in this pull request. The title directly reflects the primary feature being added—enabling discovery without configuration for declarative sources using static schemas—and appropriately scopes it to that specific use case. The title is concise, follows conventional commit format, and provides clear context that a developer scanning the PR history would understand the primary purpose of the changes.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch aj/feat/unprivileged-discover-for-declarative-static-schemas

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (3)

7-14: Trim unused imports to avoid linter noise, wdyt?

pytest and check_config_against_spec_or_exit are imported but never referenced in this module. Removing them keeps the test file lean and prevents future flake8/ruff warnings.

-from unittest.mock import MagicMock, patch
-
-import pytest
-
-from airbyte_cdk.models import AirbyteCatalog
-from airbyte_cdk.sources.declarative.manifest_declarative_source import ManifestDeclarativeSource
-from airbyte_cdk.sources.utils.schema_helpers import check_config_against_spec_or_exit
+from unittest.mock import MagicMock, patch
+
+from airbyte_cdk.models import AirbyteCatalog
+from airbyte_cdk.sources.declarative.manifest_declarative_source import ManifestDeclarativeSource

16-43: Collapse almost-identical configs with @pytest.mark.parametrize?

Both tests build large, nearly identical source_config dicts that differ only by the schema_loader type and the expected boolean flag. Switching to a single parametrized test would reduce duplication and make the intent clearer:

@pytest.mark.parametrize(
    "schema_loader, expected_flag",
    [
        ({"type": "DynamicSchemaLoader", ...}, True),
        ({"type": "InlineSchemaLoader", "schema": {}}, False),
    ],
)
def test_check_config_during_discover(schema_loader, expected_flag):
    source_config = {... "schema_loader": schema_loader, ...}
    source = ManifestDeclarativeSource(source_config=source_config)
    assert source.check_config_during_discover is expected_flag
    assert source.check_config_against_spec is True

This keeps the focus on the behavior being exercised rather than on the boilerplate fixture, wdyt?

Also applies to: 52-68


84-87: Simplify mock stream name assignment for clarity

Setting the name via type(mock_airbyte_stream).name = "test_dynamic_stream" works but is slightly cryptic. Assigning the attribute directly on the instance is more obvious:

-mock_airbyte_stream = MagicMock()
-type(mock_airbyte_stream).name = "test_dynamic_stream"
+mock_airbyte_stream = MagicMock()
+mock_airbyte_stream.name = "test_dynamic_stream"

Unless there’s a strict need for the attribute to live on the class, the direct instance assignment is easier to read, wdyt?

airbyte_cdk/entrypoint.py (1)

106-110: CLI UX: optional flag still listed under “required” group

--config is no longer required for discover, but it’s still added to the “required named arguments” group. This can mislead users reading --help. Would moving it to the parent parser (or renaming the group) be clearer?

-required_discover_parser = discover_parser.add_argument_group("required named arguments")
-required_discover_parser.add_argument(
+discover_parser.add_argument(
     "--config", type=str, required=False, help="path to the json configuration file"
 )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between d458e8f and fac853c.

📒 Files selected for processing (6)
  • airbyte_cdk/cli/source_declarative_manifest/_run.py (1 hunks)
  • airbyte_cdk/connector.py (1 hunks)
  • airbyte_cdk/entrypoint.py (3 hunks)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py (3 hunks)
  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (1 hunks)
  • unit_tests/test_entrypoint.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (6)
airbyte_cdk/connector.py (1)

35-42: Documentation for config validation control flags is clear and well-placed.

The docstrings for both check_config_against_spec and check_config_during_discover clearly explain their purpose and behavior. The new flag allows sources to provide catalog information without requiring authentication, which is a useful extension.

airbyte_cdk/cli/source_declarative_manifest/_run.py (1)

238-239: Good change to avoid None for config.

This change ensures that config is always a dictionary (possibly empty) rather than None when no valid config argument is provided, which makes downstream processing more consistent. This supports the new unprivileged discovery flow.

unit_tests/test_entrypoint.py (1)

246-246: Test updated correctly to reflect optional config for discover.

The test definition has been properly updated to reflect the fact that the "discover" command no longer requires a config parameter. This aligns with the changes in the entrypoint implementation.

airbyte_cdk/sources/declarative/manifest_declarative_source.py (2)

143-144: Good pattern for setting check_config_during_discover based on schema loader type.

Setting this flag based on the presence of a DynamicSchemaLoader is a clean approach that automatically enables the appropriate behavior without requiring manual configuration for each source.


549-579: Well-implemented schema loader detection method.

The _uses_dynamic_schema_loader method is thorough, checking both static streams and dynamic stream templates. The documentation is clear about its purpose and return value.

I particularly appreciate the detailed checks that handle nested configurations and the various ways streams can be defined in the manifest.

airbyte_cdk/entrypoint.py (1)

278-282: Double-check backwards compatibility of the new gate

discover() now validates config only when check_config_during_discover is True. For connectors that never define the flag (older versions), the default value on BaseConnector will govern behavior. Could you confirm that default is True to preserve existing semantics? If not, adding an explicit default here might avoid accidental skips, wdyt?

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
aaronsteers and others added 2 commits July 17, 2025 20:05
- Fixed TypeError in ManifestDeclarativeSource._stream_configs() missing required positional argument 'config'
- Added empty_config parameter when calling _stream_configs during schema detection
- This enables config-free discover to work for static schema connectors

Manual testing shows this fixes the immediate TypeError but additional work needed for datetime parsing issues.

Co-Authored-By: AJ Steers <aj@airbyte.io>
devin-ai-integration bot and others added 3 commits July 18, 2025 03:38
- Fix ConnectorTestScenario.connector_root attribute error in source_base.py
- Update manual test plan with complete CLI testing results
- Document successful config-free discover for source-pokeapi (static schema)
- Confirm dynamic schema connectors correctly require config as expected
- All local quality checks (MyPy, ruff format, ruff check) pass

Key findings:
- PR #559 core functionality is working for static schema connectors
- source-pokeapi successfully returns catalog without config
- source-datascope still has datetime parsing issues (separate fix needed)
- Dynamic schema connectors correctly fail without config as expected

Co-Authored-By: AJ Steers <aj@airbyte.io>
- Add blank line after AssertionError in connector_base.py
- Resolves ruff format check CI failure

Co-Authored-By: AJ Steers <aj@airbyte.io>
- Convert relative paths to absolute paths before creating frozen ConnectorTestScenario models
- Fixes PyTest failures in CI by preventing attempts to modify frozen Pydantic instances
- Local tests now pass: 7 passed, 1 skipped

Co-Authored-By: AJ Steers <aj@airbyte.io>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
airbyte_cdk/test/standard_tests/connector_base.py (1)

123-138: Good fixes addressing the previous review comments!

The null check for result._messages and the correct use of scenario.expected_outcome.expect_exception() both address the issues flagged in previous reviews.

However, there's a formatting issue that needs attention, wdyt?

         if (
-            scenario.expected_outcome.expect_exception()
-            and conn_status.status == Status.SUCCEEDED
+            scenario.expected_outcome.expect_exception()
+            and conn_status.status == Status.SUCCEEDED
             and not result.errors
         ):
🧹 Nitpick comments (1)
airbyte_cdk/test/standard_tests/connector_base.py (1)

153-193: Solid implementation of scenario loading with good error handling!

The method properly loads YAML config, validates scenarios, and handles path resolution. The filtering logic for "iam_role" tests and path conversion to absolute paths is well thought out.

One small suggestion: consider making the categories configurable rather than hardcoded, wdyt?

 @classmethod
 def get_scenarios(
     cls,
+    categories: list[str] | None = None,
 ) -> list[ConnectorTestScenario]:
     """Get acceptance tests for a given category."""
-    categories = ["connection", "spec"]
+    if categories is None:
+        categories = ["connection", "spec"]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 451ce82 and ff14901.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • airbyte_cdk/test/standard_tests/connector_base.py (2 hunks)
  • airbyte_cdk/test/standard_tests/source_base.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/test/standard_tests/source_base.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
airbyte_cdk/test/standard_tests/connector_base.py (4)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
🪛 GitHub Actions: Linters
airbyte_cdk/test/standard_tests/connector_base.py

[error] 136-137: ruff formatting check failed. File requires reformatting as indicated by 'poetry run ruff format --diff'.

🔇 Additional comments (3)
airbyte_cdk/test/standard_tests/connector_base.py (3)

11-15: Import changes look good!

The addition of yaml and AirbyteConnectionStatus imports align well with the new YAML config loading functionality and explicit type casting in the test methods.


139-142: Clean implementation of connector root directory discovery!

The method properly leverages the existing find_connector_root utility and provides sensible search locations.


144-151: Well-implemented class property with proper error handling!

The implementation correctly uses the connector root directory and provides clear error messaging when the config file is missing.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
airbyte_cdk/test/standard_tests/connector_base.py (2)

154-194: Consider adding error handling for YAML parsing and path resolution.

The scenario loading logic is comprehensive, but there are a few areas where additional error handling might be helpful:

  1. YAML parsing could fail with malformed files
  2. Path resolution for config and catalog paths could fail if the files don't exist

Would you like to add try-catch blocks around the YAML loading and path resolution operations, wdyt?

-        all_tests_config = yaml.safe_load(cls.acceptance_test_config_path.read_text())
+        try:
+            all_tests_config = yaml.safe_load(cls.acceptance_test_config_path.read_text())
+        except yaml.YAMLError as e:
+            raise ValueError(f"Failed to parse YAML config at {cls.acceptance_test_config_path}: {e}") from e

Also, should we validate that the resolved paths actually exist before returning the scenarios, wdyt?


179-185: Consider extracting the filtering logic for better maintainability.

The list comprehension with multiple conditions is functional but could be more readable. Would you consider extracting this into a helper method or breaking it down for clarity, wdyt?

-            test_scenarios.extend(
-                [
-                    ConnectorTestScenario.model_validate(test)
-                    for test in all_tests_config["acceptance_tests"][category]["tests"]
-                    if "config_path" in test and "iam_role" not in test["config_path"]
-                ]
-            )
+            for test in all_tests_config["acceptance_tests"][category]["tests"]:
+                if "config_path" not in test:
+                    continue
+                if "iam_role" in test["config_path"]:
+                    # Skip iam_role tests as they are not supported in the test suite
+                    continue
+                test_scenarios.append(ConnectorTestScenario.model_validate(test))
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ff14901 and 2345164.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py (4 hunks)
  • airbyte_cdk/test/standard_tests/connector_base.py (2 hunks)
  • airbyte_cdk/test/standard_tests/source_base.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/test/standard_tests/source_base.py
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, ignore all `__init__.py` files when providing a recommended reviewing order.
airbyte_cdk/test/standard_tests/connector_base.py (4)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
🧬 Code Graph Analysis (1)
airbyte_cdk/test/standard_tests/connector_base.py (5)
airbyte_cdk/test/entrypoint_wrapper.py (2)
  • connection_status_messages (167-168)
  • errors (200-201)
airbyte_cdk/test/models/scenario.py (2)
  • expected_outcome (93-99)
  • ConnectorTestScenario (27-179)
airbyte_cdk/test/models/outcome.py (1)
  • expect_exception (55-57)
airbyte_cdk/test/standard_tests/docker_base.py (4)
  • get_connector_root_dir (53-55)
  • get_test_class_dir (46-50)
  • acceptance_test_config_path (69-75)
  • get_scenarios (78-132)
airbyte_cdk/utils/connector_paths.py (1)
  • find_connector_root (180-198)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-shopify
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (4)
airbyte_cdk/test/standard_tests/connector_base.py (4)

11-15: Import changes look good.

The addition of yaml and AirbyteConnectionStatus imports align with the new functionality, and the cleanup of unused imports (AirbyteMessage, Type) is appropriate.


123-139: Great improvements to error handling and validation!

The enhanced error message formatting and explicit connection status validation significantly improve debugging capabilities. I notice the past review comments have been addressed with the null check for result._messages and the proper attribute access for scenario.expected_outcome.expect_exception().


140-143: Clean delegation to utility function.

The get_connector_root_dir method properly delegates to find_connector_root with appropriate search paths. This aligns with the existing pattern in DockerConnectorTestSuite.


145-152: Solid implementation of the acceptance test config path property.

The classproperty decorator and error handling provide clear feedback when the config file is missing. Good use of the existing ACCEPTANCE_TEST_CONFIG constant.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3245f0f and 6748db0.

📒 Files selected for processing (1)
  • airbyte_cdk/test/entrypoint_wrapper.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, ignore all `__init__.py` files when providing a recommended reviewing order.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Analyze (python)

devin-ai-integration bot and others added 2 commits October 31, 2025 03:20
…schemas

Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
Copilot AI review requested due to automatic review settings October 31, 2025 03:34
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.


from unittest.mock import MagicMock, patch

import pytest
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (1)

1-174: Comprehensive unit test coverage!

The tests effectively verify:

  1. check_config_during_discover is True with DynamicSchemaLoader
  2. check_config_during_discover is False with InlineSchemaLoader
  3. Discovery works without config in both scenarios
  4. The flag states are consistent before and after discovery

The tests are well-structured and have clear docstrings explaining the intent.

One observation: the tests mock ManifestDeclarativeSource.streams, which means they don't fully exercise the _uses_dynamic_schema_loader() detection logic or the full discovery flow. This is fine for unit tests, but have you also added integration tests that verify the end-to-end behavior without mocks? Wdyt about adding a test that doesn't mock streams to ensure _uses_dynamic_schema_loader() correctly inspects the actual manifest?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6748db0 and a69c252.

📒 Files selected for processing (5)
  • airbyte_cdk/entrypoint.py (3 hunks)
  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (3 hunks)
  • airbyte_cdk/test/entrypoint_wrapper.py (0 hunks)
  • airbyte_cdk/test/standard_tests/source_base.py (4 hunks)
  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (1 hunks)
💤 Files with no reviewable changes (1)
  • airbyte_cdk/test/entrypoint_wrapper.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/test/standard_tests/source_base.py
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.

Applied to files:

  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py
  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py
  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
  • airbyte_cdk/entrypoint.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.

Applied to files:

  • airbyte_cdk/entrypoint.py
🧬 Code graph analysis (3)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (1)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)
  • _stream_configs (517-536)
  • dynamic_streams (511-515)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (4)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (1)
  • ManifestDeclarativeSource (98-653)
airbyte_cdk/sources/utils/schema_helpers.py (1)
  • check_config_against_spec_or_exit (188-206)
airbyte_cdk/entrypoint.py (1)
  • discover (278-287)
airbyte_cdk/sources/abstract_source.py (2)
  • discover (85-90)
  • streams (74-79)
airbyte_cdk/entrypoint.py (4)
airbyte_cdk/sources/abstract_source.py (2)
  • _emit_queued_messages (291-294)
  • discover (85-90)
airbyte_cdk/test/entrypoint_wrapper.py (1)
  • discover (397-419)
airbyte_cdk/sources/source.py (1)
  • discover (48-52)
airbyte_cdk/connector.py (3)
  • read_config (51-58)
  • configure (45-48)
  • configure (122-127)
🪛 GitHub Actions: Linters
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py

[warning] 1-1: ruff format would reformat 2 files (code style changes available).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-shopify
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (python)
🔇 Additional comments (9)
airbyte_cdk/entrypoint.py (6)

108-108: LGTM! Making --config optional enables unprivileged discovery.

This change aligns perfectly with the PR's objective to support discovery without configuration for declarative static schemas.


180-184: Nice refactor to generator expressions for consistency.

This change makes the spec emission flow consistent with the new patterns introduced in the discover and read branches.


185-199: Unprivileged discover flow looks solid!

The logic correctly handles the case where:

  • Command is "discover"
  • No config is provided
  • Source doesn't require config during discover

The empty config dictionary approach should work for sources that don't need authentication during discovery. Just want to verify: have you tested that the discover method handles empty configs gracefully across different source types, particularly those with conditional streams? Wdyt about adding a defensive check or comment about this assumption?


200-202: Excellent error message for missing config!

This addresses the previous review feedback about providing a clear, actionable error when config is required but not provided. Much better UX!


207-210: Consistent message emission pattern.

Good to see the generator expression pattern applied consistently across all branches.


282-282: Correct flag for controlling config validation during discovery.

Switching from check_config_against_spec to check_config_during_discover properly separates concerns: the former controls validation during read operations, while the latter specifically governs discovery behavior. This gives sources fine-grained control over when config is required.

airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (3)

101-102: Good default for declarative sources.

Setting check_config_during_discover = False as the default for declarative sources makes sense, as many declarative connectors can provide schema information without authentication. The docstring clearly explains the intent.


155-155: Smart runtime detection of DynamicSchemaLoader.

The approach of dynamically setting check_config_during_discover based on whether any stream uses DynamicSchemaLoader is elegant. This means sources automatically adapt their behavior based on their schema loading strategy.

One question: does _uses_dynamic_schema_loader() account for ConditionalStreams where streams might be conditionally included based on config? If a DynamicSchemaLoader is nested inside a conditional stream that's excluded with an empty config, should we still require config during discovery? Wdyt about the current behavior in that edge case?


622-653: ConditionalStreams edge case confirmed and conservatively handled.

Verification confirms the implementation is sound. ConditionalStreams conditions properly evaluate via InterpolatedBoolean when _stream_configs is called with empty config. Any streams filtered out by unmet conditions won't be detected as having DynamicSchemaLoader, so config validation remains required during discovery—a reasonable and conservative approach. Would documenting this behavior be helpful for future maintainers, wdyt?

…static schemas

- Move check_config_during_discover logic to ConcurrentDeclarativeSource (the non-deprecated class)
- Add _uses_dynamic_schema_loader() method to detect DynamicSchemaLoader usage
- Set check_config_during_discover=False for sources without DynamicSchemaLoader
- Fix docstring in ManifestDeclarativeSource to correctly describe DynamicSchemaLoader behavior
- Add entrypoint unit tests for discover with/without config
- Improve error message when config is required but not provided
- Update test imports to use legacy module location

This enables declarative sources with static schemas (InlineSchemaLoader) to run
discover without credentials, while sources with DynamicSchemaLoader still require
config for authentication.

Co-Authored-By: AJ Steers <aj@airbyte.io>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a69c252 and 7048b30.

📒 Files selected for processing (5)
  • airbyte_cdk/entrypoint.py (3 hunks)
  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (3 hunks)
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2 hunks)
  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (1 hunks)
  • unit_tests/test_entrypoint.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.

Applied to files:

  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py
  • airbyte_cdk/entrypoint.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.

Applied to files:

  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py
  • airbyte_cdk/entrypoint.py
📚 Learning: 2025-01-14T00:20:32.310Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.

Applied to files:

  • airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py
🧬 Code graph analysis (4)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (1)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (3)
  • _uses_dynamic_schema_loader (657-688)
  • _stream_configs (519-538)
  • dynamic_streams (513-517)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (1)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (2)
  • _uses_dynamic_schema_loader (622-654)
  • _stream_configs (498-519)
airbyte_cdk/entrypoint.py (3)
airbyte_cdk/sources/abstract_source.py (2)
  • _emit_queued_messages (291-294)
  • discover (85-90)
airbyte_cdk/test/entrypoint_wrapper.py (1)
  • discover (397-419)
airbyte_cdk/connector.py (3)
  • read_config (51-58)
  • configure (45-48)
  • configure (122-127)
unit_tests/test_entrypoint.py (3)
unit_tests/sources/test_abstract_source.py (7)
  • MockSource (62-99)
  • message_repository (98-99)
  • message_repository (150-155)
  • streams (84-87)
  • name (189-190)
  • name (255-256)
  • name (282-283)
airbyte_cdk/sources/message/repository.py (5)
  • consume_queue (59-60)
  • consume_queue (70-71)
  • consume_queue (93-95)
  • consume_queue (114-115)
  • consume_queue (138-139)
airbyte_cdk/entrypoint.py (3)
  • AirbyteEntrypoint (53-390)
  • run (156-239)
  • discover (282-291)
🪛 GitHub Actions: Linters
unit_tests/test_entrypoint.py

[error] 856-856: Ruff formatting: 2 files would be reformatted. Run 'poetry run ruff format' to fix formatting.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (python)

Comment on lines +898 to +900
MockSource.discover.assert_called_once()
call_args = MockSource.discover.call_args
assert call_args[0][1] == {} # config argument should be empty dict
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix the config assertion so the test passes

MockSource.discover is patched on the class, so the bound call records (mock_source, logger, config) in call_args[0]. The current assertion grabs index 1, which is the logger, so the check will always fail before your change can even land. Could we unpack the tuple and assert against the third positional element (the config) instead, like below? wdyt?

-    call_args = MockSource.discover.call_args
-    assert call_args[0][1] == {}  # config argument should be empty dict
+    _, _, passed_config = MockSource.discover.call_args[0]
+    assert passed_config == {}  # config argument should be empty dict
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
MockSource.discover.assert_called_once()
call_args = MockSource.discover.call_args
assert call_args[0][1] == {} # config argument should be empty dict
MockSource.discover.assert_called_once()
_, _, passed_config = MockSource.discover.call_args[0]
assert passed_config == {} # config argument should be empty dict
🤖 Prompt for AI Agents
In unit_tests/test_entrypoint.py around lines 898 to 900, the test currently
checks call_args[0][1] which is the logger because MockSource.discover was
patched on the class and the bound call records (mock_source, logger, config);
update the assertion to inspect the third positional argument (the config)
instead — either unpack call_args[0] and assert the third element is {} or
directly assert call_args[0][2] == {}.

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (1)

9-9: Remove unused pytest import.

The pytest import isn't used anywhere in this test file, wdyt about removing it?

-import pytest
-
🧹 Nitpick comments (2)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (2)

77-129: Consider testing actual behavior without mocking streams().

A few observations about this test:

  1. Lines 120-121 duplicate the flag assertions from test_check_config_during_discover_with_dynamic_schema_loader – are these redundant?

  2. By mocking streams(), the test bypasses the actual validation logic that would normally run in ManifestDeclarativeSource.streams(). According to the code in manifest_declarative_source.py, the streams() method validates config via self._spec_component.validate_config(config). This test doesn't verify whether that validation is correctly triggered or skipped based on the check_config_during_discover flag.

  3. The test verifies that discover() can be called with an empty config when streams() is mocked, but doesn't confirm the real end-to-end flow works as intended.

Would it be more valuable to test the actual discover flow without mocking, perhaps in an integration test? Or is the mock intentional here to isolate the discover method from streams behavior, wdyt?


131-175: Misleading docstring and redundant assertions.

A few issues here:

  1. The docstring (line 135) says "Test that discovery validates config when DynamicSchemaLoader is not used," but since streams() is mocked, no actual config validation occurs. The test just verifies that discover() returns a catalog when called with an empty config and mocked streams. Maybe update the docstring to reflect what's actually being tested?

  2. Lines 164-165 and 174-175 repeat the flag assertions already covered in test_check_config_during_discover_without_dynamic_schema_loader – these seem redundant, wdyt?

  3. Similar to my comment on the previous test, mocking streams() prevents testing the actual validation behavior. Is this intentional?

  4. This test has significant overlap with test_discover_with_dynamic_schema_loader_no_config – could these be parameterized to reduce duplication (e.g., @pytest.mark.parametrize with schema_loader type and expected flag value)?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7048b30 and 711635f.

📒 Files selected for processing (1)
  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
📚 Learning: 2024-11-18T23:40:06.391Z
Learnt from: ChristoGrab
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.

Applied to files:

  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
📚 Learning: 2024-11-15T01:04:21.272Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
📚 Learning: 2024-11-10T04:50:11.914Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 13
File: airbyte_cdk/connector.py:99-99
Timestamp: 2024-11-10T04:50:11.914Z
Learning: When a PR's goal is to run the autoformat task from `ruff`, avoid suggesting code changes beyond formatting to prevent potential negative side effects.

Applied to files:

  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
📚 Learning: 2024-11-15T00:59:08.154Z
Learnt from: aaronsteers
Repo: airbytehq/airbyte-python-cdk PR: 58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.

Applied to files:

  • unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py
🧬 Code graph analysis (1)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (4)
airbyte_cdk/legacy/sources/declarative/manifest_declarative_source.py (2)
  • ManifestDeclarativeSource (98-654)
  • streams (305-342)
airbyte_cdk/entrypoint.py (1)
  • discover (282-291)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)
  • discover (385-388)
  • streams (391-423)
airbyte_cdk/sources/abstract_source.py (2)
  • discover (85-90)
  • streams (74-79)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-shopify
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (python)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
unit_tests/sources/declarative/test_manifest_declarative_source_dynamic_schema.py (2)

17-50: LGTM! Clean flag verification test.

This test clearly verifies that check_config_during_discover is correctly set to True when using DynamicSchemaLoader.


52-75: LGTM! Good coverage of the static schema case.

This test nicely complements the previous test by verifying that check_config_during_discover is False when using InlineSchemaLoader.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants