feat(sqllogictest): support dynamic catalog configuration#1943
feat(sqllogictest): support dynamic catalog configuration#1943AndreaBozzo wants to merge 1 commit intoapache:mainfrom
Conversation
|
The This is an intermittent infrastructure issue with GitHub Actions, not related to the code changes in this PR. Re-running the failed job should resolve it. |
liurenjie1024
left a comment
There was a problem hiding this comment.
Thanks @AndreaBozzo for this pr, catalog config should be engine specific, not globally.
crates/sqllogictest/src/schedule.rs
Outdated
|
|
||
| /// Configuration for a catalog in the schedule file. | ||
| #[derive(Debug, Clone, Serialize, Deserialize)] | ||
| pub struct CatalogConfig { |
There was a problem hiding this comment.
Catalog config should be engine specific, e.g., it should be part of an engine, rather in a common section.
|
Sorry, i should have figured that out. I'll work on It and amend the commit when done! |
This PR implements issue apache#1780 by allowing each engine in the sqllogictest framework to configure its own catalog. Changes: - Remove global [catalog] section from schedule parsing - Each engine now creates its own catalog based on engine-specific config - DataFusionEngine reads 'catalog_type' and 'catalog_properties' from config - Default catalog type is 'memory' with a temp warehouse for testing - Support for all catalog types via iceberg-catalog-loader (rest, glue, hms, sql, s3tables) Example configuration: ```toml [engines] df = { type = "datafusion", catalog_type = "rest", catalog_properties = { uri = "http://localhost:8181" } } ``` Closes apache#1780
7e8d26f to
4239a20
Compare
|
I've updated the PR to make the catalog configuration engine-specific as suggested. I considered adding explicit error handling for Let me know if you'd prefer that change and thank you for your patience |
liurenjie1024
left a comment
There was a problem hiding this comment.
Thanks @AndreaBozzo for this pr, while it's helpful, I think we should resolve #1952 first.
…1953) This PR refactors the schedule file parsing in the sqllogictest crate to use serde-derived structs instead of manual TOML parsing, as requested in #1952. ### Changes - **New structs with serde derives:** - `ScheduleConfig` - top-level configuration parsed from TOML - `EngineConfig` - per-engine configuration with `#[serde(flatten)]` for extensibility - `EngineType` - enum of supported engine types - **Refactored parsing flow:** - `Schedule::from_file()` now uses `toml::from_str()` directly - Added `instantiate_engines()` to separate parsing from engine creation - Removed manual `parse_engines()` and `parse_steps()` functions - **Forward-compatibility:** - Uses `#[serde(flatten)]` to capture extra fields in `EngineConfig.extra` - This enables PR #1943 to easily add `catalog_type` and `catalog_properties` support ### Relation to #1943 This PR was suggested by @liurenjie1024 as a prerequisite to #1943 (dynamic catalog configuration). The `#[serde(flatten)]` approach allows #1943 to simply extract the catalog configuration from `EngineConfig.extra` without modifying the parsing logic. ### Testing - All existing tests pass - Added new unit tests for deserialization behavior - Integration test with `df_test.toml` passes unchanged Closes #1952
Which issue does this PR close?
Closes #1780
What changes are included in this PR?
This PR adds support for configuring the catalog type dynamically per engine in sqllogictest schedule files.
Changes:
catalog_typeandcatalog_propertiesfields to engine configurationmemorycatalog (default) and all catalog types fromiceberg-catalog-loader(rest, glue, hms, sql, s3tables)iceberg-catalog-loaderdependency to the workspaceExample configuration:
If no
catalog_typeis specified, defaults to memory catalog for backward compatibility.Are these changes tested?
Yes, unit tests added for:
Note: pr body updated after maintainer's review and feedbacks