Skip to content

tests: improve fast tests error logging, skip 'read' on destinations #644

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Jul 4, 2025

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Jul 3, 2025

This PR improves our container-based FAST tests by wrapping the output in an EntrypointOutput object, and adding new methods to that class to streamline error message processing into a single place.

This PR also resolves an issue where destinations would fail the read tests, since they don't have discover and read implementations. This second issue is resolved by (1) adding a proper connector_name class property to the test suite, and (2) adding an is_destination_connector property which uses the name to determine if the connector is a destination. If so, the read test will now be skipped.

Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling and messaging for Docker-based connector tests and image verification.
    • Fixed typo in workflow comments for Python linting.
  • Refactor

    • Streamlined error reporting and exception handling in test suites.
    • Centralized error formatting and exception raising for entrypoint outputs.
    • Updated Docker command execution logic for clarity and improved output processing.
    • Simplified test logic for Docker connector acceptance tests.
    • Enhanced test suite selection logic to explicitly support destination connectors.
  • Chores

    • Adjusted test matrix to ensure consistent testing of specific connectors.
    • Updated documentation for connector test templates to use consistent terminology.

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

devin-ai-integration bot and others added 11 commits June 27, 2025 22:43
- Fix CatalogProvider.get_primary_keys() to fall back to stream.source_defined_primary_key when primary_key is empty/None
- Add comprehensive unit tests covering all fallback scenarios
- Resolves bug where destinations ignore source-defined primary keys when configured primary key is not set
- Affects all destinations using CDK SQL processor base classes

Co-Authored-By: AJ Steers <aj@airbyte.io>
Address @aaronsteers feedback to use more concise approach:
pks = configured_stream.primary_key or configured_stream.stream.source_defined_primary_key or []

This maintains exact same functionality while being more readable.

Co-Authored-By: AJ Steers <aj@airbyte.io>
- Consolidate 7 individual test methods into 1 parametrized test with 6 scenarios
- Use list[str] format for parameters with wrapping logic in test
- Remove case normalization test since CatalogProvider doesn't handle normalization
- Reduce code from ~150 lines to ~50 lines while maintaining full coverage

Co-Authored-By: AJ Steers <aj@airbyte.io>
…behavior

- Add explanation that method uses primary_key if set explicitly in configured catalog
- Otherwise falls back to source_defined_primary_key if set
- Addresses GitHub comment from @aaronsteers

Co-Authored-By: AJ Steers <aj@airbyte.io>
- Fix Ruff Format Check CI failure
- Apply proper formatting to expanded docstring in get_primary_keys method

Co-Authored-By: AJ Steers <aj@airbyte.io>
… unset

- Reverse priority order to favor source_defined_primary_key over primary_key
- Return None when neither primary key field is set
- Update all callers to handle None gracefully with 'pks or []' coalescing
- Update unit tests to reflect new behavior and priority order
- Addresses @aaronsteers feedback on primary key fallback logic

Co-Authored-By: AJ Steers <aj@airbyte.io>
…perations

- Replace 'or []' coalescing with explicit guard statements in merge methods
- Raise AirbyteInternalError when no primary keys available for merge operations
- Addresses @aaronsteers feedback on code clarity and error handling
- Ensures merge operations fail fast when primary keys are missing

Co-Authored-By: AJ Steers <aj@airbyte.io>
- Split long docstring line in get_primary_keys method across multiple lines
- Addresses @aaronsteers feedback on line length issue in PR comment
- Maintains readability while complying with formatting standards

Co-Authored-By: AJ Steers <aj@airbyte.io>
…lues

- Updated docstring to specify 'primary key column names' instead of just 'primary keys'
- Added Returns section explaining the method returns column names or None
- Addresses @dbgold17's clarifying question about terminology in GitHub comment

Co-Authored-By: AJ Steers <aj@airbyte.io>
@aaronsteers aaronsteers changed the title Aj/tests/improve-fast-test-outputs tests: improve fast tests error logging, skip 'read' on destinations Jul 3, 2025
Copy link

github-actions bot commented Jul 3, 2025

PyTest Results (Fast)

3 693 tests  ±0   3 682 ✅ ±0   6m 22s ⏱️ +6s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 0dd3cd9. ± Comparison against base commit acb92dd.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Jul 3, 2025

PyTest Results (Full)

3 696 tests  ±0   3 685 ✅ ±0   17m 56s ⏱️ -6s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 0dd3cd9. ± Comparison against base commit acb92dd.

♻️ This comment has been updated with latest results.

@aaronsteers aaronsteers changed the title tests: improve fast tests error logging, skip 'read' on destinations test: improve fast tests error logging, skip 'read' on destinations Jul 3, 2025
Copy link
Contributor

@dbgold17 dbgold17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. one super small comment.

:shipit:

@aaronsteers aaronsteers changed the title test: improve fast tests error logging, skip 'read' on destinations tests: improve fast tests error logging, skip 'read' on destinations Jul 4, 2025
Base automatically changed from devin/1751064114-fix-primary-key-fallback to main July 4, 2025 01:13
Copy link
Contributor

coderabbitai bot commented Jul 4, 2025

Warning

Rate limit exceeded

@aaronsteers has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 26 minutes and 0 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 0d8c317 and 0dd3cd9.

📒 Files selected for processing (1)
  • airbyte_cdk/test/standard_tests/util.py (3 hunks)
📝 Walkthrough

Walkthrough

This update refactors error handling and Docker command execution in Airbyte's test and utility modules. It introduces structured exception classes, centralizes error formatting, and streamlines Docker-based test logic. Additionally, it updates GitHub workflow files for connector testing and Python linting, including minor fixes and adjustments to testing matrix attributes.

Changes

File(s) Change Summary
.github/workflows/connector-tests.yml Changed cdk_extra for destination-motherduck from "sql" to "n/a"; added clarifying comment.
.github/workflows/python_lint.yml Fixed comment typos; changed ruff format check command from --check to --diff.
airbyte_cdk/test/entrypoint_wrapper.py Added AirbyteEntrypointException; extended EntrypointOutput with command info and error/exception methods.
airbyte_cdk/test/standard_tests/_job_runner.py Removed _errors_to_str; replaced with calls to new error formatting methods on EntrypointOutput.
airbyte_cdk/test/standard_tests/connector_base.py Changed how connector_name is derived in the connector property (now uses class attribute).
airbyte_cdk/test/standard_tests/docker_base.py Refactored to add connector_name property and is_destination_connector; centralized error handling and simplified test logic.
airbyte_cdk/utils/docker.py Refactored Docker command functions; renamed params, introduced run_docker_airbyte_command, improved image verification logic.
airbyte_cdk/cli/airbyte_cdk/_connector.py Updated docstring comments in TEST_FILE_TEMPLATE to refer to "connector" instead of "source" and improved formatting.
airbyte_cdk/test/standard_tests/util.py Removed TEST_CLASS_MAPPING; added explicit language tag parsing and conditional test suite class assignment for source/destination connectors.

Sequence Diagram(s)

sequenceDiagram
    participant TestSuite
    participant DockerUtils
    participant EntrypointOutput

    TestSuite->>DockerUtils: run_docker_airbyte_command(cmd)
    DockerUtils->>EntrypointOutput: Create EntrypointOutput with command, messages, errors
    EntrypointOutput-->>DockerUtils: EntrypointOutput object
    DockerUtils-->>TestSuite: EntrypointOutput object

    TestSuite->>EntrypointOutput: raise_if_errors()
    alt Errors present
        EntrypointOutput-->>TestSuite: Raise AirbyteEntrypointException
    else No errors
        EntrypointOutput-->>TestSuite: Continue
    end
Loading

Possibly related PRs

Suggested reviewers

  • dbgold17

Would you like me to invite dbgold17 to review these changes, or is there anyone else you think would be a good fit, wdyt?

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
airbyte_cdk/test/standard_tests/docker_base.py (2)

35-39: Clean up unused import.

The static analysis correctly flags that run_docker_command is imported but no longer used since you've migrated to run_docker_airbyte_command. WDYT about removing this unused import?

 from airbyte_cdk.utils.docker import (
     build_connector_image,
     run_docker_airbyte_command,
-    run_docker_command,
 )

354-354: Handle unused variable flagged by static analysis.

The read_result variable is assigned but never used. Since this is likely intentional (you're just verifying the command succeeds), would you consider either using the result or making the assignment intention clearer?

-            read_result: EntrypointOutput = run_docker_airbyte_command(
+            _ = run_docker_airbyte_command(
airbyte_cdk/test/entrypoint_wrapper.py (1)

54-68: Consider using a regular class instead of dataclass?

Since AirbyteEntrypointException doesn't define any fields, the @dataclass decorator doesn't provide any benefits here. Would you consider using a regular class definition instead? This might also resolve the static analysis warning. wdyt?

-@dataclass
 class AirbyteEntrypointException(Exception):
     """Exception raised for errors in the AirbyteEntrypoint execution.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between acb92dd and eb0af5d.

📒 Files selected for processing (7)
  • .github/workflows/connector-tests.yml (1 hunks)
  • .github/workflows/python_lint.yml (2 hunks)
  • airbyte_cdk/test/entrypoint_wrapper.py (4 hunks)
  • airbyte_cdk/test/standard_tests/_job_runner.py (3 hunks)
  • airbyte_cdk/test/standard_tests/connector_base.py (1 hunks)
  • airbyte_cdk/test/standard_tests/docker_base.py (11 hunks)
  • airbyte_cdk/utils/docker.py (8 hunks)
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
airbyte_cdk/test/standard_tests/connector_base.py (3)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
.github/workflows/python_lint.yml (2)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#13
File: airbyte_cdk/connector.py:99-99
Timestamp: 2024-11-10T04:50:11.914Z
Learning: When a PR's goal is to run the autoformat task from `ruff`, avoid suggesting code changes beyond formatting to prevent potential negative side effects.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#77
File: .github/workflows/pypi_publish.yml:70-75
Timestamp: 2024-11-20T12:05:50.628Z
Learning: In the file `.github/workflows/pypi_publish.yml`, when setting the `VERSION` variable in a shell script, avoid adding quotes around the variable assignment, as it can include the quotes in the variable's value. This can lead to unintended Docker tags like `airbyte/source-declarative-manifest:'6.7.0dev8'`.
.github/workflows/connector-tests.yml (4)
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
airbyte_cdk/test/standard_tests/_job_runner.py (2)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: pyproject.toml:108-111
Timestamp: 2024-11-15T00:58:15.446Z
Learning: In the project, the `run` function is defined in `airbyte_cdk/cli/source_declarative_manifest/_run.py` and is imported into the module's `__init__.py`.
airbyte_cdk/test/standard_tests/docker_base.py (5)
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: pyproject.toml:108-111
Timestamp: 2024-11-15T00:58:15.446Z
Learning: In the project, the `run` function is defined in `airbyte_cdk/cli/source_declarative_manifest/_run.py` and is imported into the module's `__init__.py`.
airbyte_cdk/utils/docker.py (4)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: pyproject.toml:108-111
Timestamp: 2024-11-15T00:58:15.446Z
Learning: In the project, the `run` function is defined in `airbyte_cdk/cli/source_declarative_manifest/_run.py` and is imported into the module's `__init__.py`.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
airbyte_cdk/test/entrypoint_wrapper.py (1)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
🧬 Code Graph Analysis (4)
airbyte_cdk/test/standard_tests/connector_base.py (1)
airbyte_cdk/test/standard_tests/docker_base.py (1)
  • connector_name (58-61)
airbyte_cdk/test/standard_tests/_job_runner.py (1)
airbyte_cdk/test/entrypoint_wrapper.py (3)
  • as_exception (220-222)
  • get_formatted_error_message (203-218)
  • errors (200-201)
airbyte_cdk/test/standard_tests/docker_base.py (2)
airbyte_cdk/utils/docker.py (2)
  • run_docker_airbyte_command (451-494)
  • run_docker_command (393-448)
airbyte_cdk/test/entrypoint_wrapper.py (3)
  • raise_if_errors (224-234)
  • catalog (237-241)
  • EntrypointOutput (69-343)
airbyte_cdk/utils/docker.py (1)
airbyte_cdk/test/entrypoint_wrapper.py (6)
  • EntrypointOutput (69-343)
  • raise_if_errors (224-234)
  • errors (200-201)
  • get_formatted_error_message (203-218)
  • spec_messages (163-164)
  • logs (188-189)
🪛 Flake8 (7.2.0)
airbyte_cdk/test/standard_tests/docker_base.py

[error] 35-35: 'airbyte_cdk.utils.docker.run_docker_command' imported but unused

(F401)


[error] 354-354: local variable 'read_result' is assigned to but never used

(F841)

airbyte_cdk/utils/docker.py

[error] 20-20: 'airbyte_cdk.sources.declarative.spec' imported but unused

(F401)

🪛 Pylint (3.3.7)
airbyte_cdk/test/standard_tests/docker_base.py

[error] 66-66: Method 'connector_name' has no 'startswith' member

(E1101)

airbyte_cdk/test/entrypoint_wrapper.py

[error] 222-222: Too many positional arguments for constructor call

(E1121)

⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)
🔇 Additional comments (17)
.github/workflows/python_lint.yml (2)

29-29: Nice typo fix!

Good catch on fixing "Job-specifc" to "Job-specific" in both places.

Also applies to: 52-52


54-54: Improved developer experience with --diff flag.

Changing from --check to --diff is a nice improvement - developers can now see exactly what formatting changes would be made instead of just getting a pass/fail result. This makes it easier to understand and fix formatting issues. WDYT about this approach vs the previous check-only behavior?

.github/workflows/connector-tests.yml (1)

78-79: Great alignment with PR objectives for destination connector testing.

This change ensures destination-motherduck is always tested rather than conditionally based on SQL changes. The comment clearly explains the reasoning. This aligns well with the PR's goal to improve destination connector handling in the test suite.

airbyte_cdk/test/standard_tests/connector_base.py (1)

48-48: Good refactoring to use centralized connector_name property.

Using cls.connector_name instead of duplicating the logic (connector_root.absolute().name) is a nice improvement. This centralizes the connector name calculation in the connector_name property from docker_base.py, making the code more maintainable and consistent.

airbyte_cdk/test/standard_tests/_job_runner.py (2)

108-108: Excellent centralization of error handling logic.

Replacing the manual AssertionError construction with result.as_exception() is a great improvement. This delegates the error formatting and exception creation to the EntrypointOutput class where it belongs, making the code cleaner and more maintainable.


118-118: Consistent migration to centralized error formatting.

Good work replacing _errors_to_str(result.errors) with result.get_formatted_error_message() throughout the file. This eliminates code duplication and ensures consistent error message formatting across all usage sites.

Also applies to: 142-144

airbyte_cdk/test/standard_tests/docker_base.py (5)

57-66: Great addition of connector identification methods.

The new connector_name property and is_destination_connector method are excellent additions that centralize connector identification logic. This supports the PR's objective to properly handle destination connectors in tests. The implementation is clean and follows good patterns.


158-167: Excellent refactoring to use run_docker_airbyte_command.

The migration from run_docker_command to run_docker_airbyte_command with raise_if_errors=True significantly simplifies the error handling code. The new approach eliminates manual try-catch blocks and return code checks, making the code much cleaner and more maintainable.

Also applies to: 209-222, 302-315, 354-371


246-247: Perfect solution for destination connector read test handling.

This addresses the exact issue mentioned in the PR objectives where destination connectors were failing read tests because they lack discover and read implementations. The skip logic is clean and well-placed.


317-319: Simplified catalog parsing with EntrypointOutput.

Using discover_result.catalog instead of manually parsing stdout lines is much cleaner and more reliable. The direct access to the catalog message from the EntrypointOutput object eliminates potential parsing errors and makes the intent clearer.


66-66: Please verify at runtime that connector_name.startswith("destination-") works as expected

I wasn’t able to confirm the false-positive fix in our sandbox due to dependency conflicts. Could you run a quick local check to ensure connector_name is truly a str and .startswith behaves correctly? For example:

• File & Line: airbyte_cdk/test/standard_tests/docker_base.py, line 66
• Snippet to run in a clean environment (e.g. a virtualenv):

python3 -m venv venv && source venv/bin/activate
pip install -U pip
pip install -e .[dev]

python3 - << 'EOF'
import sys
from pathlib import Path
sys.path.append('.')
from airbyte_cdk.test.standard_tests.docker_base import DockerConnectorTestSuite

class TestSuite(DockerConnectorTestSuite):
    @classmethod
    def get_connector_root_dir(cls):
        return Path('.')

print('connector_name:', repr(TestSuite.connector_name))
print('startswith destination-:', TestSuite.connector_name.startswith("destination-"))
EOF

WDYT?

airbyte_cdk/test/entrypoint_wrapper.py (2)

86-95: Good addition for error context!

The command parameter is a helpful addition that provides valuable context in error messages.


203-234: Well-implemented error handling methods!

These methods provide a clean API for error handling. The docstrings are clear and follow the suggestions from previous reviews. The formatted error message includes the command context which will be helpful for debugging.

airbyte_cdk/utils/docker.py (4)

396-404: Great parameter rename for clarity!

The rename from check to raise_if_errors better describes the parameter's behavior. The added docstring note about using run_docker_airbyte_command for Airbyte verbs is also helpful guidance.


451-495: Excellent abstraction for Airbyte commands!

This function provides a clean interface for running Airbyte-specific Docker commands and returns the rich EntrypointOutput object. The error handling is well-thought-out, converting subprocess errors into the uncaught_exception parameter of EntrypointOutput.


506-542: Great improvement to error reporting!

The refactored verify_connector_image now provides detailed error messages, which will help with debugging failed verifications. Using run_docker_airbyte_command and the EntrypointOutput methods makes the code much cleaner than parsing raw JSON output.


283-291: Clean handling of verification results!

The updated code properly handles the new tuple return type from verify_connector_image and provides clear error messages to the user.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb0af5d and b0f28b7.

📒 Files selected for processing (3)
  • airbyte_cdk/cli/airbyte_cdk/_connector.py (2 hunks)
  • airbyte_cdk/test/standard_tests/util.py (2 hunks)
  • airbyte_cdk/utils/docker.py (8 hunks)
✅ Files skipped from review due to trivial changes (1)
  • airbyte_cdk/cli/airbyte_cdk/_connector.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/utils/docker.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
airbyte_cdk/test/standard_tests/util.py (4)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#90
File: Dockerfile:16-21
Timestamp: 2024-12-02T18:36:04.346Z
Learning: Copying files from `site-packages` in the Dockerfile maintains compatibility with both the old file structure that manifest-only connectors expect and the new package-based structure where SDM is part of the CDK.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
🧬 Code Graph Analysis (1)
airbyte_cdk/test/standard_tests/util.py (4)
airbyte_cdk/test/standard_tests/destination_base.py (1)
  • DestinationTestSuiteBase (7-16)
airbyte_cdk/test/standard_tests/docker_base.py (2)
  • DockerConnectorTestSuite (42-371)
  • connector_name (58-61)
airbyte_cdk/test/standard_tests/declarative_sources.py (1)
  • DeclarativeSourceTestSuite (27-103)
airbyte_cdk/test/standard_tests/source_base.py (1)
  • SourceTestSuiteBase (30-177)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte_cdk/test/standard_tests/util.py (3)

13-13: LGTM! Import addition looks good.

The import of DestinationTestSuiteBase aligns perfectly with the new logic for handling destination connectors. This supports the broader improvement to distinguish between source and destination connectors in the test framework.


50-57: Nice improvement on language tag extraction!

The new approach of explicitly searching for tags starting with "language:" is much cleaner than the previous dictionary-based mapping. The error handling is comprehensive and provides helpful debugging information. The tag splitting logic is also robust with the split(":", 1) approach.


59-67: Great explicit conditional logic for test suite selection!

The new conditional approach is much more readable and maintainable than the previous dictionary mapping. I particularly appreciate how it properly distinguishes between source and destination connectors for Python language connectors - this aligns well with the is_destination_connector method I see in the docker_base.py file.

@aaronsteers aaronsteers enabled auto-merge (squash) July 4, 2025 02:00
@aaronsteers aaronsteers merged commit 9ec30dc into main Jul 4, 2025
26 checks passed
@aaronsteers aaronsteers deleted the aj/tests/improve-fast-test-outputs branch July 4, 2025 02:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants