feat(csharp/src/Drivers/Apache): Implement GetColumnsExtended metadata for Databricks (#2766)
Merged
CurtHagenlocher merged 12 commits on May 8, 2025
Conversation
CurtHagenlocher (Contributor) requested changes on May 2, 2025 and left a comment:
Thanks! Please pass the allFields parameter directly and not by reference and consider adding the comments about the results being in a single batch.
…sync method properly return Task
CurtHagenlocher (Contributor) requested changes on May 7, 2025 and left a comment:
Thanks! I've made a few suggestions for improvements, and the code needs to pass the whitespace linter.
Description:
This PR adds a new metadata API GetColumnsExtended to the Apache Hive2 driver. This consolidated metadata query combines column information with primary key and foreign key relationships, allowing clients to retrieve complete column metadata in a single call.
Changes:
New Metadata Command: Added GetColumnsExtended to the list of supported metadata commands
Consolidated Query Implementation: The new method retrieves and combines data from:
  GetColumns: basic column metadata
  GetPrimaryKeys: primary key information
  GetCrossReference: foreign key relationships
Schema Enhancement: Added prefixed fields to the schema:
  PK_COLUMN_NAME, PK_KEY_SEQ for primary key information
  FK_PKCOLUMN_NAME, FK_PKTABLE_CAT, FK_PKTABLE_SCHEM, FK_PKTABLE_NAME, FK_FKCOLUMN_NAME for foreign key information
Relationship Mapping: Each column is matched with its corresponding PK/FK data (if any)
Unified Result Set: All data is combined into a single Arrow RecordBatch
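The relationship mapping described above amounts to a left join of the column list against the primary key and foreign key results. The following is a minimal, language-neutral sketch in Python, not the driver's C# implementation. It assumes the three source results expose JDBC-style field names (COLUMN_NAME, KEY_SEQ, PKCOLUMN_NAME, FKCOLUMN_NAME and so on), that columns are matched by name, and that each column has at most one foreign key reference. The function name, the plain-dict row representation, and the sample table are hypothetical.

```python
# Sketch only: rows are plain dicts rather than Arrow arrays, and the input
# field names are assumed to follow JDBC-style metadata naming.

PK_FIELDS = ("COLUMN_NAME", "KEY_SEQ")
FK_FIELDS = (
    "PKCOLUMN_NAME",
    "PKTABLE_CAT",
    "PKTABLE_SCHEM",
    "PKTABLE_NAME",
    "FKCOLUMN_NAME",
)


def get_columns_extended(columns, primary_keys, foreign_keys):
    """Combine column rows with matching PK/FK rows into one result list."""
    # Index key metadata by the name of the column it describes.
    pk_by_column = {row["COLUMN_NAME"]: row for row in primary_keys}
    fk_by_column = {row["FKCOLUMN_NAME"]: row for row in foreign_keys}

    combined = []
    for column in columns:
        name = column["COLUMN_NAME"]
        pk = pk_by_column.get(name, {})
        fk = fk_by_column.get(name, {})
        row = dict(column)  # start from the basic column metadata
        # Prefixed fields stay None when the column has no PK/FK entry.
        for field in PK_FIELDS:
            row["PK_" + field] = pk.get(field)
        for field in FK_FIELDS:
            row["FK_" + field] = fk.get(field)
        combined.append(row)
    return combined


# Hypothetical sample: an "orders" table whose customer_id references customers.id.
columns = [
    {"COLUMN_NAME": "id", "TYPE_NAME": "INT"},
    {"COLUMN_NAME": "customer_id", "TYPE_NAME": "INT"},
]
primary_keys = [{"COLUMN_NAME": "id", "KEY_SEQ": 1}]
foreign_keys = [
    {
        "PKCOLUMN_NAME": "id",
        "PKTABLE_CAT": "main",
        "PKTABLE_SCHEM": "sales",
        "PKTABLE_NAME": "customers",
        "FKCOLUMN_NAME": "customer_id",
    }
]

rows = get_columns_extended(columns, primary_keys, foreign_keys)
```

Every input column yields exactly one output row, so a client reading the combined result sees the same column order as GetColumns, with the PK_ and FK_ fields left empty for columns that take part in no key.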
Benefits:
Reduced API calls: Clients can fetch complete column information with 1 call instead of 3
Simplified client code: No need to manually join metadata from multiple queries
Complete column context: Get column type information along with its relationships
Better performance: Reduces network round-trips for metadata operations
Testing:
Added tests in StatementTests.cs to verify that the extended fields are correctly populated
Tested with tables containing primary and foreign keys to ensure correctness
Sample return for a foreign key column:
TODO:
Switch to DescribeTableExtended based on the runtime version.