fix: (CDK) (Declarative) - Add Manifest Migration module #485

Merged: 23 commits, Apr 29, 2025
Commits
57e7bff
add
bazarnov Apr 16, 2025
d36bb01
updated structure
bazarnov Apr 16, 2025
88f9e30
removed __ManifestMigration_ from class names
bazarnov Apr 16, 2025
ebc854d
updated version checks and error messaging
bazarnov Apr 16, 2025
d30ae26
handled missing original key issue for path_to_url migration
bazarnov Apr 16, 2025
5668180
add __should_migrate flag handling
bazarnov Apr 17, 2025
cdb7710
Merge remote-tracking branch 'origin/main' into baz/cdk/add-manifest-…
bazarnov Apr 17, 2025
c494934
formatted
bazarnov Apr 17, 2025
64ee9db
Merge remote-tracking branch 'origin/main' into baz/cdk/add-manifest-…
bazarnov Apr 21, 2025
9e58f5b
correct readme.md
bazarnov Apr 22, 2025
c6b31cc
fixed the imports for readme.md
bazarnov Apr 22, 2025
ab08a07
add request_body_* > request_body migration + unit test
bazarnov Apr 22, 2025
0308817
updated docstring
bazarnov Apr 22, 2025
2f168ba
Merge remote-tracking branch 'origin/main' into baz/cdk/add-manifest-…
bazarnov Apr 23, 2025
38b7362
updated the structure to address the versioning and the order of the …
bazarnov Apr 23, 2025
c3ee514
fixed linters issues
bazarnov Apr 23, 2025
b0cbb09
Merge remote-tracking branch 'origin/main' into baz/cdk/add-manifest-…
bazarnov Apr 25, 2025
b202be8
updated
bazarnov Apr 25, 2025
e2f6fd1
changed the name of the test
bazarnov Apr 28, 2025
c418561
Merge remote-tracking branch 'origin/main' into baz/cdk/add-manifest-…
bazarnov Apr 28, 2025
480f5f7
updated request_body_* migration
bazarnov Apr 28, 2025
5bb675d
Merge remote-tracking branch 'origin/main' into baz/cdk/add-manifest-…
bazarnov Apr 28, 2025
6e72928
updated migrations to the latest CDK version
bazarnov Apr 28, 2025
11 changes: 11 additions & 0 deletions airbyte_cdk/connector_builder/connector_builder_handler.py
@@ -56,6 +56,16 @@ def get_limits(config: Mapping[str, Any]) -> TestLimits:
    return TestLimits(max_records, max_pages_per_slice, max_slices, max_streams)


def should_migrate_manifest(config: Mapping[str, Any]) -> bool:
    """
    Determines whether the manifest should be migrated,
    based on the presence of the "__should_migrate" key in the config.

    This flag is set by the UI.
    """
    return config.get("__should_migrate", False)


def should_normalize_manifest(config: Mapping[str, Any]) -> bool:
    """
    Check if the manifest should be normalized.
@@ -71,6 +81,7 @@ def create_source(config: Mapping[str, Any], limits: TestLimits) -> ManifestDecl
        config=config,
        emit_connector_builder_messages=True,
        source_config=manifest,
        migrate_manifest=should_migrate_manifest(config),
        normalize_manifest=should_normalize_manifest(config),
        component_factory=ModelToComponentFactory(
            emit_connector_builder_messages=True,
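
To illustrate the flag check added in this file, here is a minimal sketch that re-declares the helper locally so it runs without the CDK installed. The two config dictionaries are hypothetical; only the `__should_migrate` key comes from the diff.

```python
from typing import Any, Mapping


def should_migrate_manifest(config: Mapping[str, Any]) -> bool:
    # Local copy of the helper from the diff, so the sketch runs without the CDK.
    return config.get("__should_migrate", False)


# Hypothetical configs: the first mimics a request with the flag set,
# the second omits the flag entirely.
config_with_flag = {"__should_migrate": True}
config_without_flag = {"other_setting": 1}

migrate_requested = should_migrate_manifest(config_with_flag)
migrate_default = should_migrate_manifest(config_without_flag)
```

With the flag absent, the helper falls back to `False`, so migration stays opt-in.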
73 changes: 73 additions & 0 deletions airbyte_cdk/manifest_migrations/README.md
@@ -0,0 +1,73 @@
# Manifest Migrations

This directory contains the logic and registry for manifest migrations in the Airbyte CDK. Migrations are used to update or transform manifest components to newer formats or schemas as the CDK evolves.

## Adding a New Migration

1. **Create a Migration File:**
   - Add a new Python file in the `migrations/` subdirectory.
   - Name the file using the pattern: `<description_of_the_migration>.py`.
     - Example: `http_requester_url_base_to_url.py`
   - The filename should be unique and descriptive.

2. **Define the Migration Class:**
   - The migration class must inherit from `ManifestMigration`.
   - Name the class using a descriptive name (e.g., `HttpRequesterUrlBaseToUrl`).
   - Implement the following methods:
     - `should_migrate(self, manifest: ManifestType) -> bool`
     - `migrate(self, manifest: ManifestType) -> None`
     - `validate(self, manifest: ManifestType) -> bool`

3. **Register the Migration:**
   - Open `migrations/registry.yaml`.
   - Add an entry under the appropriate version, or create a new version section if needed.
   - Each migration entry should include:
     - `name`: The filename (without `.py`)
     - `order`: The order in which this migration should be applied for the version
     - `description`: A short description of the migration

   Example:
   ```yaml
   manifest_migrations:
     - version: 6.45.2
       migrations:
         - name: http_requester_url_base_to_url
           order: 1
           description: |
             This migration updates the `url_base` field in the `HttpRequester` component spec to `url`.
   ```

4. **Testing:**
   - Ensure your migration is covered by unit tests.
   - Tests should verify the `should_migrate`, `migrate`, and `validate` behaviors.

## Migration Discovery

- Migrations are discovered and registered automatically based on the entries in `migrations/registry.yaml`.
- Do not modify the migration registry in code manually.
- If you need to skip certain component types, use the `NON_MIGRATABLE_TYPES` list in `manifest_migration.py`.
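
The README does not show the discovery code itself. As a rough sketch of how the `order` field could determine the sequence within a version, the snippet below sorts registry entries held in a plain dictionary that stands in for the parsed `registry.yaml`. The function name `ordered_migration_names` and the second migration entry are hypothetical, and the real loader may work differently.

```python
from typing import Any, Dict, List

# Stand-in for the parsed contents of `migrations/registry.yaml`.
# `hypothetical_second_migration` is invented for illustration only.
registry: Dict[str, Any] = {
    "manifest_migrations": [
        {
            "version": "6.45.2",
            "migrations": [
                {"name": "hypothetical_second_migration", "order": 2},
                {"name": "http_requester_url_base_to_url", "order": 1},
            ],
        },
    ]
}


def ordered_migration_names(parsed: Dict[str, Any]) -> List[str]:
    """Return migration names per version entry, sorted by `order` within each version."""
    names: List[str] = []
    for version_entry in parsed["manifest_migrations"]:
        for migration in sorted(version_entry["migrations"], key=lambda m: m["order"]):
            names.append(migration["name"])
    return names


names = ordered_migration_names(registry)
```

Sorting on `order` rather than on file position means entries can be listed in any sequence in the registry without changing the order of application.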

## Example Migration Skeleton

```python
from airbyte_cdk.manifest_migrations.manifest_migration import TYPE_TAG, ManifestMigration, ManifestType

class ExampleMigration(ManifestMigration):
    component_type = "ExampleComponent"
    original_key = "old_key"
    replacement_key = "new_key"

    def should_migrate(self, manifest: ManifestType) -> bool:
        return manifest[TYPE_TAG] == self.component_type and self.original_key in manifest

    def migrate(self, manifest: ManifestType) -> None:
        manifest[self.replacement_key] = manifest[self.original_key]
        manifest.pop(self.original_key, None)

    def validate(self, manifest: ManifestType) -> bool:
        return self.replacement_key in manifest and self.original_key not in manifest
```

---

For more details, see the docstrings in `manifest_migration.py` and the examples in the `migrations/` folder.
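
To see the three methods of the skeleton working together, the sketch below applies them to a single component dictionary. It is a stand-in under stated assumptions: the class omits the `ManifestMigration` base class so that it runs without the CDK installed, and the component values are made up.

```python
from typing import Any, Dict

ManifestType = Dict[str, Any]
TYPE_TAG = "type"


class ExampleMigration:
    # Stand-in without the `ManifestMigration` base class, so the sketch runs without the CDK.
    component_type = "ExampleComponent"
    original_key = "old_key"
    replacement_key = "new_key"

    def should_migrate(self, manifest: ManifestType) -> bool:
        return manifest[TYPE_TAG] == self.component_type and self.original_key in manifest

    def migrate(self, manifest: ManifestType) -> None:
        # Copy the value under the new key, then drop the old key (in place).
        manifest[self.replacement_key] = manifest[self.original_key]
        manifest.pop(self.original_key, None)

    def validate(self, manifest: ManifestType) -> bool:
        return self.replacement_key in manifest and self.original_key not in manifest


component = {"type": "ExampleComponent", "old_key": "value"}
migration = ExampleMigration()

if migration.should_migrate(component):
    migration.migrate(component)

is_valid = migration.validate(component)
needs_second_pass = migration.should_migrate(component)
```

After one pass the component no longer matches `should_migrate`, so running the same migration again leaves it unchanged.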
3 changes: 3 additions & 0 deletions airbyte_cdk/manifest_migrations/__init__.py
@@ -0,0 +1,3 @@
#
# Copyright (c) 2025 Airbyte, Inc., all rights reserved.
#
12 changes: 12 additions & 0 deletions airbyte_cdk/manifest_migrations/exceptions.py
@@ -0,0 +1,12 @@
#
# Copyright (c) 2025 Airbyte, Inc., all rights reserved.
#


class ManifestMigrationException(Exception):
    """
    Raised when a migration error occurs in the manifest.
    """

    def __init__(self, message: str) -> None:
        super().__init__(f"Failed to migrate the manifest: {message}")
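
As a small illustration of the message prefix this exception adds, the sketch below re-declares the class locally (so it runs without the CDK) and raises it with a made-up message.

```python
class ManifestMigrationException(Exception):
    # Local copy of the exception from the diff.
    def __init__(self, message: str) -> None:
        super().__init__(f"Failed to migrate the manifest: {message}")


try:
    # Hypothetical failure text, for illustration only.
    raise ManifestMigrationException("validation failed after renaming `url_base`")
except ManifestMigrationException as exc:
    error_text = str(exc)
```

Callers pass only the specific reason; the common prefix is added once in the constructor.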
134 changes: 134 additions & 0 deletions airbyte_cdk/manifest_migrations/manifest_migration.py
@@ -0,0 +1,134 @@
#
# Copyright (c) 2025 Airbyte, Inc., all rights reserved.
#


from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass
from typing import Any, Dict

ManifestType = Dict[str, Any]


TYPE_TAG = "type"

NON_MIGRATABLE_TYPES = [
    # more info here: https://github.com/airbytehq/airbyte-internal-issues/issues/12423
    "DynamicDeclarativeStream",
]


@dataclass
class MigrationTrace:
    """
    This class represents a migration that has been applied to the manifest.
    It contains information about the migration, including the version it was applied from,
    the version it was applied to, and the time it was applied.
    """

    from_version: str
    to_version: str
    migration: str
    migrated_at: str

    def as_dict(self) -> Dict[str, Any]:
        return asdict(self)


class ManifestMigration(ABC):
    """
    Base class for manifest migrations.
    This class provides a framework for migrating manifest components.
    It defines the structure for migration classes, including methods for checking if a migration is needed,
    performing the migration, and validating the migration.
    """

    def __init__(self) -> None:
        self.is_migrated: bool = False

    @abstractmethod
    def should_migrate(self, manifest: ManifestType) -> bool:
        """
        Check if the manifest should be migrated.

        :param manifest: The manifest to potentially migrate

        :return: True if the manifest is of the expected format and should be migrated. False otherwise.
        """

    @abstractmethod
    def migrate(self, manifest: ManifestType) -> None:
        """
        Migrate the manifest. Assumes should_migrate(manifest) returned True.

        :param manifest: The manifest to migrate
        """

    @abstractmethod
    def validate(self, manifest: ManifestType) -> bool:
        """
        Validate the manifest to ensure the migration was successfully applied.

        :param manifest: The manifest to validate
        """

    def _is_component(self, obj: Dict[str, Any]) -> bool:
        """
        Check if the object is a component.

        :param obj: The object to check
        :return: True if the object is a component, False otherwise
        """
        return TYPE_TAG in obj.keys()

    def _is_migratable_type(self, obj: Dict[str, Any]) -> bool:
        """
        Check if the object is a migratable component,
        based on the Type of the component and the migration version.

        :param obj: The object to check
        :return: True if the object is a migratable component, False otherwise
        """
        return obj[TYPE_TAG] not in NON_MIGRATABLE_TYPES

    def _process_manifest(self, obj: Any) -> None:
        """
        Recursively processes a manifest object, migrating components that match the migration criteria.

        This method traverses the entire manifest structure (dictionaries and lists) and applies
        migrations to components that:
        1. Have a type tag
        2. Are not in the list of non-migratable types
        3. Meet the conditions defined in the should_migrate method

        Parameters:
            obj (Any): The object to process, which can be a dictionary, list, or any other type.
                Dictionary objects are checked for component type tags and potentially migrated.
                List objects have each of their items processed recursively.
                Other types are ignored.

        Returns:
            None, since we process the manifest in place.
        """
        if isinstance(obj, dict):
            # Check if the object is a component
            if self._is_component(obj):
                # Check if the object is allowed to be migrated
                if not self._is_migratable_type(obj):
                    return

                # Check if the object should be migrated
                if self.should_migrate(obj):
                    # Perform the migration, if needed
                    self.migrate(obj)
                    # validate the migration
                    self.is_migrated = self.validate(obj)

            # Process all values in the dictionary
            for value in list(obj.values()):
                self._process_manifest(value)

        elif isinstance(obj, list):
            # Process all items in the list
            for item in obj:
                self._process_manifest(item)
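
One consequence of the early `return` for non-migratable types is that the whole subtree under such a component is skipped, including any nested components. The sketch below is a condensed re-statement of the traversal, not the CDK code: the function `rename_url_base`, the hard-coded `url_base` to `url` rename standing in for `should_migrate`/`migrate`, and the sample manifest are all assumptions made for illustration.

```python
from typing import Any, Dict, List

TYPE_TAG = "type"
NON_MIGRATABLE_TYPES = ["DynamicDeclarativeStream"]


def rename_url_base(obj: Any, migrated: List[str]) -> None:
    """Walk dicts and lists in place, mirroring the structure of `_process_manifest`."""
    if isinstance(obj, dict):
        if TYPE_TAG in obj:
            if obj[TYPE_TAG] in NON_MIGRATABLE_TYPES:
                return  # the whole subtree is skipped, nested components included
            if obj[TYPE_TAG] == "HttpRequester" and "url_base" in obj:
                obj["url"] = obj.pop("url_base")
                migrated.append(obj["url"])
        # Copy the values first, since the dict may have been modified above.
        for value in list(obj.values()):
            rename_url_base(value, migrated)
    elif isinstance(obj, list):
        for item in obj:
            rename_url_base(item, migrated)


# Hypothetical manifest with one regular and one dynamic stream.
manifest: Dict[str, Any] = {
    "type": "DeclarativeSource",
    "streams": [
        {
            "type": "DeclarativeStream",
            "requester": {"type": "HttpRequester", "url_base": "https://a.example"},
        },
    ],
    "dynamic_streams": [
        {
            "type": "DynamicDeclarativeStream",
            "requester": {"type": "HttpRequester", "url_base": "https://b.example"},
        },
    ],
}

migrated: List[str] = []
rename_url_base(manifest, migrated)
```

Only the requester under the regular stream is rewritten; the requester nested inside `DynamicDeclarativeStream` keeps its `url_base` key.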