Skip to content

fix(low-code): handle ScannerError in ConfigComponentsResolver #614

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

darynaishchenko
Copy link
Contributor

@darynaishchenko darynaishchenko commented Jun 26, 2025

For bing-ads custom reports we need to define cursor format based on report aggregation provided in the config by users.
Manifest will look like this:

 - type: ComponentMappingDefinition
   create_or_update: true
   field_path: [ "incremental_sync", "datetime_format" ]
   value: "{{ '%Y-%m-%dT%H:%M:%S+00:00' if components_values['report_aggregation'] == 'Hourly' else '%Y-%m-%d' }}"

When format string is being parsed in the _parse_yaml_if_possible Scanner error is raised.
Added check for the error message for datetime format case to pass this value to stream template.

Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling for YAML parsing to gracefully handle specific scanner errors involving unexpected '%' characters, preventing unnecessary exceptions for certain invalid YAML strings.
  • Tests

    • Added new test cases to ensure correct handling of YAML scanner errors and to verify that affected strings are processed without raising exceptions.

@darynaishchenko darynaishchenko self-assigned this Jun 26, 2025
@darynaishchenko
Copy link
Contributor Author

darynaishchenko commented Jun 26, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

@github-actions github-actions bot added the bug Something isn't working label Jun 26, 2025
Copy link
Contributor

coderabbitai bot commented Jun 26, 2025

📝 Walkthrough

Walkthrough

The changes introduce additional error handling in the YAML parsing logic within ConfigComponentsResolver, specifically catching and handling ScannerError exceptions related to invalid '%' characters. Corresponding unit tests are updated to include a manifest that triggers this error scenario, ensuring the parser returns the original string rather than raising an exception.

Changes

File(s) Change Summary
airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py Enhanced _parse_yaml_if_possible to handle ScannerError for specific '%' character errors.
unit_tests/sources/declarative/resolvers/test_config_components_resolver.py Added a test manifest and test case for YAML ScannerError scenarios in dynamic stream reading.

Sequence Diagram(s)

sequenceDiagram
    participant Test as Test Case
    participant Resolver as ConfigComponentsResolver
    participant YAML as yaml.safe_load

    Test->>Resolver: Call _parse_yaml_if_possible(value)
    Resolver->>YAML: Try to parse value as YAML
    YAML-->>Resolver: Raise ScannerError (if '%' found)
    alt Error message contains "expected alphabetic or numeric character, but found '%'"
        Resolver-->>Test: Return original string value
    else Other ScannerError
        Resolver-->>Test: Raise ScannerError
    end
    Note over Resolver: If no error, return parsed value
Loading

Would you like to see a comparison diagram of the old vs. new error handling flow as well, wdyt?

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py (1)

189-192: Solid targeted error handling approach!

The implementation correctly catches the specific ScannerError for datetime format strings while preserving the original exception for other scanner errors. The string matching approach is appropriate for this targeted fix, wdyt?

One small consideration: should we log this specific case for debugging purposes, or is silent handling preferred here?

unit_tests/sources/declarative/resolvers/test_config_components_resolver.py (1)

148-160: Excellent test coverage for the new error handling!

The test manifest correctly includes a component mapping with a Jinja expression containing datetime format strings that would trigger the ScannerError. This provides good coverage for the new exception handling logic, wdyt?

One formatting note from static analysis - could we add an extra blank line before line 160 to follow PEP 8 conventions?

    }
)
+
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0afea4a and 41265f6.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py (2 hunks)
  • unit_tests/sources/declarative/resolvers/test_config_components_resolver.py (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#13
File: airbyte_cdk/connector.py:99-99
Timestamp: 2024-11-10T04:50:11.914Z
Learning: When a PR's goal is to run the autoformat task from `ruff`, avoid suggesting code changes beyond formatting to prevent potential negative side effects.
unit_tests/sources/declarative/resolvers/test_config_components_resolver.py (4)
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` file, the strict module name checks in `_get_class_from_fully_qualified_class_name` (requiring `module_name` to be "components" and `module_name_full` to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#174
File: unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py:21-29
Timestamp: 2025-01-13T23:39:15.457Z
Learning: The CustomPageIncrement class in unit_tests/source_declarative_manifest/resources/source_the_guardian_api/components.py is imported from another connector definition and should not be modified in this context.
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py (1)
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
🪛 Flake8 (7.2.0)
unit_tests/sources/declarative/resolvers/test_config_components_resolver.py

[error] 160-160: expected 2 blank lines, found 1

(E302)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-amplitude
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: SDM Docker Image Build
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py (1)

14-14: LGTM on the import addition!

The ScannerError import is correctly added to support the new exception handling below.

unit_tests/sources/declarative/resolvers/test_config_components_resolver.py (1)

174-174: Perfect test case addition!

Adding the scanner error manifest to the parameterized test ensures the new error handling path gets exercised and validates that no exceptions are raised while still producing the expected stream names.

@darynaishchenko darynaishchenko requested a review from tolik0 June 26, 2025 14:20
Copy link

PyTest Results (Fast)

3 670 tests  +1   3 659 ✅ +1   6m 26s ⏱️ +4s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 41265f6. ± Comparison against base commit 0afea4a.

Copy link

PyTest Results (Full)

3 673 tests  +1   3 662 ✅ +1   18m 5s ⏱️ +6s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 41265f6. ± Comparison against base commit 0afea4a.

Copy link
Contributor

@aldogonzalez8 aldogonzalez8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved

@darynaishchenko darynaishchenko merged commit c357b4c into main Jun 26, 2025
27 checks passed
@darynaishchenko darynaishchenko deleted the daryna/low-code/config-components-mapping-parsing-yaml branch June 26, 2025 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants