Skip to content

Conversation

scovich
Copy link
Contributor

@scovich scovich commented Sep 29, 2025

Which issue does this PR close?

Rationale for this change

While pathfinding support for unshredding variant objects and arrays, I ran into the same issue that variant_get already suffers -- ShreddingState is inconvenient because it owns the value and typed_value columns it refers to, even tho borrowing them usually suffices.

What changes are included in this PR?

Define a new BorrowedShreddingState which does what it says, and update ShreddedPathStep (in variant_get.rs) to use it.

Also, make the constructor fallible in order to correctly report casting failures if the value column is not a binary view.

Are these changes tested?

Yes, existing tests cover this code.

Are there any user-facing changes?

New TryFrom and From implementations.

impl From<&StructArray> for ShreddingState was replaced by a suitable TryFrom.

@scovich
Copy link
Contributor Author

scovich commented Sep 29, 2025

CC @alamb

@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Sep 29, 2025
Copy link
Contributor Author

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-review

Comment on lines -275 to -288
// Extract value and typed_value fields
let value = if let Some(value_col) = inner.column_by_name("value") {
if let Some(binary_view) = value_col.as_binary_view_opt() {
Some(binary_view.clone())
} else {
return Err(ArrowError::NotYetImplemented(format!(
"VariantArray 'value' field must be BinaryView, got {}",
value_col.data_type()
)));
}
} else {
None
};
let typed_value = inner.column_by_name("typed_value").cloned();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now encapsulated by ShreddingState::try_from

}

/// Returns a borrowed version of this shredding state
pub fn borrow(&self) -> BorrowedShreddingState<'_> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had an impl From instead, but this seems to be more readable/intuitive at call sites.

@scovich
Copy link
Contributor Author

scovich commented Sep 29, 2025

Should be a quickie, assuming reviewers agree with the premise.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love it -- thank you @scovich

self.typed_value.as_ref()
}

/// Returns a borrowed version of this shredding state
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// Peel away the prefix of path elements that traverses the shredded parts of this variant
// column. Shredding will traverse the rest of the path on a per-row basis.
let mut shredding_state = input.shredding_state().clone();
let mut shredding_state = input.shredding_state().borrow();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@alamb
Copy link
Contributor

alamb commented Sep 29, 2025

I don't have much to add. FYI @codephage2020 @liamzwbao and @klion26

Copy link
Contributor

@liamzwbao liamzwbao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! Thanks, @scovich

Copy link
Member

@klion26 klion26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT, thanks for this work!

@alamb alamb merged commit 07ae1dd into apache:main Sep 30, 2025
18 checks passed
@alamb
Copy link
Contributor

alamb commented Sep 30, 2025

🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parquet-variant parquet-variant* crates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants