@devin-ai-integration
Contributor

@devin-ai-integration devin-ai-integration bot commented Jun 10, 2025

Add Versioned Storage Types for Database Backward Compatibility

This PR implements versioned storage types for Katana's database to enable backward compatibility for block data across different database versions. The implementation allows newer versions of Katana to read/deserialize databases created with older versions by creating separate versioned storage types in the database layer.

This is the first half of implementing a full database migration from previous database versions. The objective is to make sure that future changes affecting the database format will not introduce compatibility issues with the current or older formats.

Changes Made

Core Implementation

  • Created versioned storage module (crates/storage/db/src/versioned/)
    • VersionedHeader enum supporting the v6 Header and the current (v7) Header
    • VersionedTx enum supporting the v6 Tx and the current (v7) Tx
    • Version-aware Compress/Decompress implementations for the versioned types, with fallback deserialization
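
As an illustration of the versioned-enum idea described above, here is a minimal sketch. The struct fields are simplified, hypothetical stand-ins (the real Katana headers carry many more fields), and defaulting the missing field to zero is an assumption, not something this description confirms.

```rust
/// Hypothetical, simplified v6 header layout (stand-in for the real type).
#[derive(Debug, Clone, PartialEq)]
pub struct HeaderV6 {
    pub number: u64,
    pub l1_gas_price: u128,
}

/// Hypothetical, simplified current (v7) header: v6 plus one extra field.
#[derive(Debug, Clone, PartialEq)]
pub struct Header {
    pub number: u64,
    pub l1_gas_price: u128,
    pub l2_gas_price: u128,
}

/// One variant per supported on-disk layout.
#[derive(Debug, Clone, PartialEq)]
pub enum VersionedHeader {
    V6(HeaderV6),
    V7(Header),
}

/// New data is always stored in the newest layout.
impl From<Header> for VersionedHeader {
    fn from(header: Header) -> Self {
        VersionedHeader::V7(header)
    }
}

/// Reading upgrades any older layout to the current type.
impl From<VersionedHeader> for Header {
    fn from(versioned: VersionedHeader) -> Self {
        match versioned {
            VersionedHeader::V7(header) => header,
            VersionedHeader::V6(old) => Header {
                number: old.number,
                l1_gas_price: old.l1_gas_price,
                // Assumption: the field missing in v6 defaults to zero.
                l2_gas_price: 0,
            },
        }
    }
}
```

With this shape, callers only ever see the current `Header`; the older layout exists solely at the storage boundary.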

@devin-ai-integration
Contributor Author

Original prompt from karii@cartridge.gg:

File: katana/crates/storage/db/src/version.rs
File: katana/crates/storage/db/src/tables.rs

Let's add a database migration tool to Katana. The idea is that if Katana is initialized with an existing database created by an older version of Katana, possibly in an older database format, we can still use that database.

Currently, the Katana database is not backward compatible: each time the database format changes (which usually happens when a change affects the serialization of the database types), the database version has to be increased. The version is tracked in version.rs.

Now we want to make the database format backward compatible, meaning that existing databases created with an earlier database version can still be opened and processed by a Katana version that uses a newer database format.

But we don't need to maintain backward compatibility for every type stored in the database. The database types can be found in tables.rs. Katana is a blockchain node, so we only need to maintain backward compatibility for block data (i.e., block headers and transactions), because we can re-execute the blocks to rebuild the state. We choose re-execution over maintaining compatibility for all database types because some tables may gain new fields that can only be computed through transaction execution.

The plan has two parts: (1) version the necessary database types so that we can deserialize them no matter the database version (backward compatibility), and (2) have a way to re-execute the blocks.

Your job for now is to implement (1). Make sure the block data types are versioned to maintain backward compatibility.

You only need to look in the following repo: dojoengine/katana

@github-actions

github-actions bot commented Jun 10, 2025

Binary size report 📊

Branch Size
main 66.21MiB
devin/1749581445-database-versioning 66.36MiB ( +0.23% )

@codecov

codecov bot commented Jun 10, 2025

Codecov Report

Attention: Patch coverage is 51.45985% with 133 lines in your changes missing coverage. Please review.

Project coverage is 73.47%. Comparing base (71d4bbe) to head (f0da674).

Files with missing lines Patch % Lines
.../storage/db/src/models/versioned/transaction/v6.rs 47.16% 84 Missing ⚠️
crates/storage/db/src/models/versioned/block/v6.rs 0.00% 23 Missing ⚠️
...rates/storage/db/src/models/versioned/block/mod.rs 62.06% 11 Missing ⚠️
...storage/db/src/models/versioned/transaction/mod.rs 61.53% 10 Missing ⚠️
crates/primitives/src/block.rs 0.00% 3 Missing ⚠️
crates/storage/db/src/lib.rs 75.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #118      +/-   ##
==========================================
- Coverage   73.73%   73.47%   -0.26%     
==========================================
  Files         205      209       +4     
  Lines       22689    22944     +255     
==========================================
+ Hits        16730    16859     +129     
- Misses       5959     6085     +126     

@kariy kariy marked this pull request as draft June 10, 2025 23:10
@kariy kariy force-pushed the devin/1749581445-database-versioning branch 2 times, most recently from c15773f to 73a39a2 Compare June 12, 2025 17:14
@kariy
Member

kariy commented Jun 17, 2025

@claude let's cherry-pick commits f1e21e4e, 302ad5a5, and f743e105 from the v1.5.4 tag and rebase this branch on top of them. Some of those commits are actually a subset of this PR, so please make sure to resolve any conflicts appropriately.

@kariy kariy force-pushed the devin/1749581445-database-versioning branch from afa4f68 to afd80f5 Compare June 18, 2025 11:41
kariy and others added 13 commits June 18, 2025 19:53
There has been a mistake in the progression of the database version. For the latest stable release of Katana (v1.5.3), the database version is [6](https://github.com/dojoengine/katana/blob/8cca707d20a52b3d8d559193c77dc720ca3a4fbd/crates/storage/db/src/version.rs#L8), which hasn't changed since Katana [v1.0.10](https://github.com/dojoengine/dojo/blob/v1.0.10/crates/katana/storage/db/src/version.rs). If the rule for bumping the database version had been followed correctly (bump the version whenever the serialization of the database [types](https://github.com/dojoengine/katana/blob/8cca707d20a52b3d8d559193c77dc720ca3a4fbd/crates/storage/db/src/tables.rs#L191-L266) changes), the fact that the version has remained unchanged since v1.0.10 would imply that the database format is compatible from v1.0.10 all the way to v1.5.3. Unfortunately, that is not the case for database version 6.

The correct version for Katana v1.5.* should have been 7, following a change made to the `Header` struct (the addition of `l2_gas_prices`) by this [PR](#19), but I forgot to bump the version after that change.

So, this commit doubles as a patch (for v1.5.*) to maintain backward compatibility across all versions of Katana with database version 6, and as the first part of implementing an actual database migration tool.

- Add versioned Header and Tx storage types supporting versions 5-7
- Implement version-aware codecs with fallback deserialization
- Update database initialization to allow older versions for block data
- Maintain strict versioning for non-block tables
- Update storage provider to handle versioned type conversions
- Fix all test cases to use versioned types properly

This enables backward compatibility for block data across different
database versions while allowing state reconstruction through re-execution.

Co-Authored-By: karii@cartridge.gg <evergreenkary@gmail.com>

- Break long versioned header calls into multiple lines
- Fix formatting issues identified by CI fmt check

Co-Authored-By: karii@cartridge.gg <evergreenkary@gmail.com>

- Add ResourceBoundsMappingV6 with only l1_gas and l2_gas fields
- Create V6-specific transaction types for InvokeTxV3, DeclareTxV3, DeployAccountTxV3
- Implement conversion logic from V6 to V7 with default l1_data_gas
- Update TxV6 enum to use proper V6-specific types
- Fix ChainId import to use katana_primitives::chain::ChainId

Co-Authored-By: karii@cartridge.gg <evergreenkary@gmail.com>

- Fix From<Tx> for VersionedTx to use V7 format for new transactions
- Update fallback deserialization to handle V5, V6, and V7 properly
- Remove useless as_ref() call to fix clippy warning
- Resolve compilation errors in CI

Co-Authored-By: karii@cartridge.gg <evergreenkary@gmail.com>
@kariy kariy force-pushed the devin/1749581445-database-versioning branch from afd80f5 to 8f29fd8 Compare June 18, 2025 12:42
@dojoengine dojoengine deleted a comment from claude bot Jun 18, 2025
@kariy kariy marked this pull request as ready for review June 18, 2025 12:42
@kariy kariy changed the title from "feat(db): add versioned storage types for backward compatibility" to "feat(db): add versioned db types for backward compatibility" Jun 18, 2025
@claude
Contributor

claude bot commented Jun 18, 2025

Claude finished @devin-ai-integration[bot]'s task —— View job


Code Review Complete

I've conducted a comprehensive review of the versioned database types implementation for backward compatibility. Here's my detailed feedback:

Critical Issues ⚠️

1. Infinite Recursion in VersionedHeader Decompress (crates/storage/db/src/models/versioned/block/mod.rs:44-61)

The current fallback logic has a potential infinite recursion issue when deserializing as Self (line 47). If all version-specific attempts fail, it tries to deserialize as VersionedHeader again, which calls the same method.

Fix: Replace the fallback with a specific error message:

Err(CodecError::Decompress("Failed to deserialize header: no compatible version found".to_string()))
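
To make the suggested fix concrete, the following is a rough sketch of a fallback decoder that ends in an explicit error instead of calling itself again. Everything here is a hypothetical stand-in: the fixed-width byte layout replaces whatever serialization format Katana actually uses, and `CodecError`, `decode_v6`, and `decode_v7` are local illustrations, not the crate's real items.

```rust
/// Local stand-in for the codec error type mentioned in the review.
#[derive(Debug, PartialEq)]
pub enum CodecError {
    Decompress(String),
}

#[derive(Debug, PartialEq)]
pub struct HeaderV6 {
    pub number: u64,
}

#[derive(Debug, PartialEq)]
pub struct HeaderV7 {
    pub number: u64,
    pub l2_gas_price: u64,
}

#[derive(Debug, PartialEq)]
pub enum VersionedHeader {
    V6(HeaderV6),
    V7(HeaderV7),
}

/// Reads one little-endian u64; callers always pass exactly 8 bytes.
fn read_u64(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    u64::from_le_bytes(buf)
}

/// Toy v7 layout: two u64 values, 16 bytes in total.
fn decode_v7(bytes: &[u8]) -> Option<HeaderV7> {
    if bytes.len() != 16 {
        return None;
    }
    Some(HeaderV7 { number: read_u64(&bytes[..8]), l2_gas_price: read_u64(&bytes[8..]) })
}

/// Toy v6 layout: a single u64, 8 bytes in total.
fn decode_v6(bytes: &[u8]) -> Option<HeaderV6> {
    if bytes.len() != 8 {
        return None;
    }
    Some(HeaderV6 { number: read_u64(bytes) })
}

/// Tries the newest layout first, then older ones, then gives up with an
/// error. It never re-enters `decompress`, so it cannot recurse.
pub fn decompress(bytes: &[u8]) -> Result<VersionedHeader, CodecError> {
    if let Some(header) = decode_v7(bytes) {
        return Ok(VersionedHeader::V7(header));
    }
    if let Some(header) = decode_v6(bytes) {
        return Ok(VersionedHeader::V6(header));
    }
    Err(CodecError::Decompress(
        "Failed to deserialize header: no compatible version found".to_string(),
    ))
}
```

The key property is that every branch either returns a concrete variant or the final error, so unreadable bytes produce a `CodecError` rather than a stack overflow.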

2. Version Compatibility Mismatch (crates/storage/db/src/version.rs:63)

The is_block_compatible_version() function claims to support versions 5-7, but:

  • VersionedHeader only has V6 and V7 variants (no V5)
  • VersionedTx only has V6 and V7 variants (no V5)

Fix: Either:

  • Update the compatibility check to (6..=CURRENT_DB_VERSION) to match actual implementation
  • Or implement V5 variants in the enums
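
The first option could look roughly like the sketch below. The function name and `CURRENT_DB_VERSION` come from the review text; the `u32` parameter type, the value 7, and the `OLDEST_BLOCK_COMPATIBLE_VERSION` constant are assumptions made for illustration.

```rust
/// Assumed current database version, per the discussion in this PR.
pub const CURRENT_DB_VERSION: u32 = 7;

/// Hypothetical constant: the oldest version that has a matching
/// variant in the versioned enums (V6 in this PR).
pub const OLDEST_BLOCK_COMPATIBLE_VERSION: u32 = 6;

/// Returns true only for versions whose block data can actually be decoded.
pub fn is_block_compatible_version(version: u32) -> bool {
    (OLDEST_BLOCK_COMPATIBLE_VERSION..=CURRENT_DB_VERSION).contains(&version)
}
```

Tying the lower bound to a named constant keeps the check and the set of enum variants easier to update together when a new variant is added.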

3. Version Inconsistency in Transaction Conversion (crates/storage/db/src/models/versioned/transaction/mod.rs:15-19)

The From<Tx> implementation incorrectly uses VersionedTx::V7(tx) when it should match the current version. The header implementation correctly uses V7.

Code Quality Issues

4. Missing From Conversion for VersionedTx (crates/storage/db/src/models/versioned/transaction/mod.rs)

Unlike VersionedHeader, VersionedTx is missing a From<VersionedTx> for Tx implementation, which is needed for seamless conversion back to primitives.

5. Untagged Serde Pattern Complexity (crates/storage/db/src/models/versioned/transaction/mod.rs:28-56)

The VersionedTx decompress method uses a complex untagged enum pattern that's harder to maintain than the straightforward approach used in VersionedHeader.

6. Test Coverage Gaps

  • No integration tests for actual backward compatibility with real V5/V6 database files
  • No tests for the decompression fallback logic
  • The database compatibility test only checks V1.2.2 but not the claimed V5-V7 range

Architecture & Design

7. Good Design Patterns

  • Clean separation between versioned and current types
  • Proper trait implementations for compression/decompression
  • Good use of From/Into conversions for seamless integration
  • Version-aware database initialization logic

8. Resource Bounds Migration

The V6 transaction types properly handle the migration from 2-field to 3-field ResourceBoundsMapping by defaulting l1_data_gas to zero.
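
A minimal sketch of that 2-field to 3-field upgrade follows. `ResourceBoundsMappingV6` and the field names `l1_gas`, `l2_gas`, and `l1_data_gas` appear in this PR; the inner `ResourceBounds` struct and its field types are simplified assumptions.

```rust
/// Simplified stand-in for a single resource bound (field types assumed).
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct ResourceBounds {
    pub max_amount: u64,
    pub max_price_per_unit: u128,
}

/// Older layout: only L1 and L2 gas bounds.
#[derive(Debug, Clone, PartialEq)]
pub struct ResourceBoundsMappingV6 {
    pub l1_gas: ResourceBounds,
    pub l2_gas: ResourceBounds,
}

/// Current layout: adds L1 data gas bounds.
#[derive(Debug, Clone, PartialEq)]
pub struct ResourceBoundsMapping {
    pub l1_gas: ResourceBounds,
    pub l2_gas: ResourceBounds,
    pub l1_data_gas: ResourceBounds,
}

impl From<ResourceBoundsMappingV6> for ResourceBoundsMapping {
    fn from(old: ResourceBoundsMappingV6) -> Self {
        ResourceBoundsMapping {
            l1_gas: old.l1_gas,
            l2_gas: old.l2_gas,
            // The field did not exist in the older layout, so it is zeroed.
            l1_data_gas: ResourceBounds::default(),
        }
    }
}
```

Because the conversion is lossless for the two existing fields, a transaction upgraded this way keeps its original L1 and L2 bounds unchanged.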

Performance Considerations

9. Fallback Deserialization Performance

The fallback deserialization tries multiple versions sequentially, which could impact performance on large datasets. Consider ordering attempts by most likely version first (V7 → V6 → V5).

Security Concerns

10. Input Validation

The deserialization properly handles invalid data by returning appropriate errors rather than panicking.

Test Quality

11. Unit Test Coverage

Good unit tests for individual conversion functions, but missing integration tests for the full backward compatibility workflow.

Recommendations

  1. Fix Critical Issues: Address the infinite recursion and version consistency issues immediately
  2. Add V5 Support: Either implement V5 variants or update compatibility claims
  3. Improve Test Coverage: Add tests with real V5/V6 database files
  4. Documentation: Add clear documentation about which fields are added/removed between versions
  5. Error Handling: Consider more specific error types for different compatibility issues

Summary

The implementation follows good architectural patterns and successfully provides backward compatibility for block data. However, there are critical bugs that need immediate attention, particularly around infinite recursion and version consistency. The approach of using versioned enums with fallback deserialization is sound, but the execution has some flaws that should be addressed.


"The universe is change; our life is what our thoughts make it. 
When you wake up in the morning, tell yourself: 
The people I deal with today will be meddling, ungrateful, arrogant, 
dishonest, jealous, and surly. They are like this because they can't 
tell good from evil. But I have seen the beauty of good, 
and the ugliness of evil, and have recognized that the wrongdoer 
has a nature related to my own." 
- Marcus Aurelius, Meditations, Book II

@claude
Contributor

claude bot commented Jun 18, 2025

Claude finished @devin-ai-integration[bot]'s task —— View job


Comprehensive Code Review Complete

I've conducted a thorough review of the versioned database types implementation for backward compatibility. Here's my detailed analysis:

Critical Issues ⚠️

1. Version Compatibility Mismatch (crates/storage/db/src/version.rs:63)

The is_block_compatible_version() function claims to support versions 5-7, but:

  • VersionedHeader only has V6 and V7 variants (no V5)
  • VersionedTx only has V6 and V7 variants (no V5)

This inconsistency could lead to runtime errors when attempting to read version 5 databases, as the code would claim compatibility but lack the necessary deserialization support.

Fix: Either update the compatibility check to (6..=CURRENT_DB_VERSION) or implement V5 variants.

2. Missing From Conversion for VersionedTx (crates/storage/db/src/models/versioned/transaction/mod.rs)

Unlike VersionedHeader, VersionedTx is missing a From<VersionedTx> for Tx implementation, which is needed for seamless conversion back to primitives throughout the provider layer.

Code Quality Issues

3. Decompression Logic Inconsistency

  • VersionedHeader uses a straightforward fallback approach (lines 52-59)
  • VersionedTx uses a complex untagged enum pattern (lines 37-56)

This inconsistency makes the codebase harder to maintain. Consider standardizing on the simpler approach used by VersionedHeader.

4. Test Coverage Gaps

  • No integration tests for actual backward compatibility with real V5/V6 database files
  • The database compatibility test (tests/db-compat) only checks V1.2.2 but not the claimed V5-V7 range
  • No tests for the decompression fallback logic

Architecture & Design ✅

5. Good Design Patterns

  • Clean separation between versioned and current types
  • Proper trait implementations for compression/decompression
  • Good use of From/Into conversions for seamless integration
  • Version-aware database initialization logic (crates/storage/db/src/lib.rs:44-51)

6. Resource Bounds Migration

The V6 transaction types properly handle the migration from 2-field to 3-field ResourceBoundsMapping by defaulting l1_data_gas to zero (crates/storage/db/src/models/versioned/transaction/v6.rs:131-135). This is a well-designed backward compatibility pattern.

Performance Considerations

7. Fallback Deserialization Performance

The fallback deserialization tries multiple versions sequentially, which could impact performance on large datasets. Consider ordering attempts by most likely version first (V7 → V6).

Security & Error Handling ✅

8. Input Validation

The deserialization properly handles invalid data by returning appropriate errors rather than panicking. The error messages are descriptive and helpful for debugging.

Provider Integration ✅

9. Seamless Integration

The provider layer (crates/storage/provider/src/providers/db/mod.rs) correctly uses versioned types and converts them back to primitives:

  • Lines 150, 152: header.into() conversion
  • Lines 427, 458: transaction.into() conversion
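
The read path described here can be pictured with the sketch below, where a `HashMap` stands in for the real database table. `HeadersTable` and the trimmed-down header structs are hypothetical, and the zero default for the upgraded field is an assumption.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct HeaderV6 {
    pub number: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Header {
    pub number: u64,
    pub l2_gas_price: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub enum VersionedHeader {
    V6(HeaderV6),
    V7(Header),
}

impl From<VersionedHeader> for Header {
    fn from(versioned: VersionedHeader) -> Self {
        match versioned {
            VersionedHeader::V7(header) => header,
            // Assumption: the field missing in v6 defaults to zero.
            VersionedHeader::V6(old) => Header { number: old.number, l2_gas_price: 0 },
        }
    }
}

/// Hypothetical in-memory stand-in for the Headers table.
pub struct HeadersTable {
    pub rows: HashMap<u64, VersionedHeader>,
}

impl HeadersTable {
    /// Storage holds the versioned type; callers get the current primitive.
    pub fn header(&self, block_number: u64) -> Option<Header> {
        self.rows.get(&block_number).cloned().map(Into::into)
    }
}
```

The conversion happens once at the storage boundary, so code above the provider never needs to know which layout a row was written in.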

Table Definitions ✅

10. Proper Table Updates

The table definitions (crates/storage/db/src/tables.rs:197, 212) correctly use VersionedHeader and VersionedTx for the Headers and Transactions tables respectively.

Test Quality

11. Unit Test Coverage

Good unit tests for individual conversion functions (crates/storage/db/src/models/versioned/transaction/v6.rs:236-299), but missing integration tests for the full backward compatibility workflow.

Recommendations

  1. Fix Critical Issues: Address the version compatibility mismatch immediately
  2. Add Missing Implementation: Implement From<VersionedTx> for Tx conversion
  3. Improve Test Coverage: Add tests with real V5/V6 database files
  4. Standardize Patterns: Use consistent decompression approaches
  5. Documentation: Add clear documentation about which fields are added/removed between versions

Summary

The implementation follows good architectural patterns and successfully provides backward compatibility for block data. The approach of using versioned enums with fallback deserialization is sound and well-integrated with the existing provider infrastructure. However, there are critical version consistency issues that need immediate attention to ensure the feature works as intended.

The resource bounds migration strategy demonstrates thoughtful consideration of field additions, and the overall design maintains the existing API while enabling backward compatibility.

"Very little is needed to make a happy life; it is all within yourself, 
in your way of thinking. Today I escaped anxiety. Or no, 
I discarded it, because it was within me, in my own perceptions 
— not outside."
— Marcus Aurelius, Meditations, Book II

@claude
Contributor

claude bot commented Jun 18, 2025

Claude finished @devin-ai-integration[bot]'s task —— View job


Comprehensive Code Review for Versioned Database Types

I've conducted a thorough review of this versioned database types implementation for backward compatibility.

Review Tasks:

  • Read and understand the core versioned types implementation
  • Analyze version compatibility logic and existing flagged issues
  • Review database table integration and provider layer changes
  • Check test coverage and integration tests
  • Assess performance implications
  • Security and error handling review
  • Overall architecture and design patterns assessment
  • Provide comprehensive feedback and recommendations

Critical Issues ⚠️

1. Version Compatibility Mismatch (crates/storage/db/src/version.rs:63)

The is_block_compatible_version() function claims to support versions 5-7, but:

  • VersionedHeader only has V6 and V7 variants (no V5)
  • VersionedTx only has V6 and V7 variants (no V5)

This inconsistency could lead to runtime errors when attempting to read version 5 databases, as the code would claim compatibility but lack the necessary deserialization support.

Fix: Either update the compatibility check to (6..=CURRENT_DB_VERSION) or implement V5 variants.

Code Quality Issues

2. Decompression Logic Inconsistency

  • VersionedHeader uses a straightforward fallback approach (crates/storage/db/src/models/versioned/block/mod.rs:52-59)
  • VersionedTx uses a complex untagged enum pattern (crates/storage/db/src/models/versioned/transaction/mod.rs:37-56)

This inconsistency makes the codebase harder to maintain. Consider standardizing on the simpler approach used by VersionedHeader.

3. Test Coverage Gaps

  • No integration tests for actual backward compatibility with real V5/V6 database files
  • The database compatibility test (tests/db-compat) only checks V1.2.2 but not the claimed V5-V7 range
  • No tests for the decompression fallback logic

Architecture & Design ✅

4. Good Design Patterns

  • Clean separation between versioned and current types
  • Proper trait implementations for compression/decompression
  • Good use of From/Into conversions for seamless integration
  • Version-aware database initialization logic (crates/storage/db/src/lib.rs:44-51)

5. Resource Bounds Migration

The V6 transaction types properly handle the migration from 2-field to 3-field ResourceBoundsMapping by defaulting l1_data_gas to zero (crates/storage/db/src/models/versioned/transaction/v6.rs:131-135). This is a well-designed backward compatibility pattern.

Performance Considerations

6. Fallback Deserialization Performance

The fallback deserialization tries multiple versions sequentially, which could impact performance on large datasets. Consider ordering attempts by most likely version first (V7 → V6).

Security & Error Handling ✅

7. Input Validation

The deserialization properly handles invalid data by returning appropriate errors rather than panicking. The error messages are descriptive and helpful for debugging.

Provider Integration ✅

8. Seamless Integration

The provider layer (crates/storage/provider/src/providers/db/mod.rs) correctly uses versioned types and converts them back to primitives:

  • Lines 150, 152: header.into() conversion
  • Lines 427, 458: transaction.into() conversion

Table Definitions ✅

9. Proper Table Updates

The table definitions (crates/storage/db/src/tables.rs:197, 212) correctly use VersionedHeader and VersionedTx for the Headers and Transactions tables respectively.

Test Quality

10. Unit Test Coverage

Good unit tests for individual conversion functions (crates/storage/db/src/models/versioned/transaction/v6.rs:236-299), but missing integration tests for the full backward compatibility workflow.

Recommendations

  1. Fix Critical Issues: Address the version compatibility mismatch immediately
  2. Improve Test Coverage: Add tests with real V5/V6 database files
  3. Standardize Patterns: Use consistent decompression approaches
  4. Documentation: Add clear documentation about which fields are added/removed between versions

Summary

The implementation follows good architectural patterns and successfully provides backward compatibility for block data. The approach of using versioned enums with fallback deserialization is sound and well-integrated with the existing provider infrastructure. However, there are critical version consistency issues that need immediate attention to ensure the feature works as intended.

The resource bounds migration strategy demonstrates thoughtful consideration of field additions, and the overall design maintains the existing API while enabling backward compatibility.

"Very little is needed to make a happy life; it is all within yourself, 
in your way of thinking. Today I escaped anxiety. Or no, 
I discarded it, because it was within me, in my own perceptions 
— not outside."
— Marcus Aurelius, Meditations, Book II

@kariy kariy merged commit 5f86bfd into main Jun 18, 2025
11 of 13 checks passed
@kariy kariy deleted the devin/1749581445-database-versioning branch June 18, 2025 15:20