[parquet] Expose whether FileDecryptionProperties uses a KeyRetriever #9721

@adamreeve

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

DataFusion has its own configuration types for Parquet encryption, and for convenience you can convert FileDecryptionProperties from the parquet crate into DataFusion's ConfigFileDecryptionProperties. However, the DataFusion type can't represent decryption properties that use a key retriever.

The public API of FileDecryptionProperties doesn't expose whether a key retriever is used, so DataFusion relies on getting an error back when it requests the footer key. KeyRetriever implementations won't necessarily return an error in this scenario, though. They could return a default key, or return the correct footer key, in which case DataFusion wouldn't know whether different column keys are needed.
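To illustrate why probing for an error is unreliable, here is a minimal sketch. The `KeyRetriever` trait below is a simplified stand-in, not the parquet crate's actual definition, and `StrictRetriever`, `DefaultKeyRetriever`, and `looks_like_key_retriever` are hypothetical names made up for this example:

```rust
/// Simplified stand-in for a key retriever trait (assumption: the real
/// trait in the parquet crate may differ in its error type and bounds).
pub trait KeyRetriever: Send + Sync {
    fn retrieve_key(&self, key_metadata: &[u8]) -> Result<Vec<u8>, String>;
}

/// Hypothetical retriever that rejects requests without key metadata.
pub struct StrictRetriever;

impl KeyRetriever for StrictRetriever {
    fn retrieve_key(&self, key_metadata: &[u8]) -> Result<Vec<u8>, String> {
        if key_metadata.is_empty() {
            Err("no key metadata provided".to_string())
        } else {
            Ok(vec![0u8; 16])
        }
    }
}

/// Hypothetical retriever that falls back to a default key instead of failing.
pub struct DefaultKeyRetriever;

impl KeyRetriever for DefaultKeyRetriever {
    fn retrieve_key(&self, _key_metadata: &[u8]) -> Result<Vec<u8>, String> {
        Ok(vec![1u8; 16])
    }
}

/// Hypothetical heuristic: assume a key retriever is in use only if asking
/// for a key without any metadata produces an error.
pub fn looks_like_key_retriever(retriever: &dyn KeyRetriever) -> bool {
    retriever.retrieve_key(&[]).is_err()
}
```

With these definitions, the heuristic detects `StrictRetriever` but misses `DefaultKeyRetriever`, which is the failure mode described above.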

Describe the solution you'd like

A new uses_key_retriever(&self) -> bool method on FileDecryptionProperties.
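As a rough sketch of what this could look like, assuming the properties store either explicit keys or a retriever internally. The types here are simplified models written for this example, not the parquet crate's real internals; the constructors, the simplified `footer_key` signature, and `try_convert` are hypothetical:

```rust
use std::sync::Arc;

/// Simplified stand-in for the key retriever trait.
pub trait KeyRetriever: Send + Sync {
    fn retrieve_key(&self, key_metadata: &[u8]) -> Result<Vec<u8>, String>;
}

/// Assumed internal representation: explicit keys or a retriever.
enum DecryptionKeys {
    Explicit { footer_key: Vec<u8> },
    ViaRetriever(Arc<dyn KeyRetriever>),
}

/// Simplified model of the decryption properties type.
pub struct FileDecryptionProperties {
    keys: DecryptionKeys,
}

impl FileDecryptionProperties {
    pub fn with_footer_key(footer_key: Vec<u8>) -> Self {
        Self { keys: DecryptionKeys::Explicit { footer_key } }
    }

    pub fn with_key_retriever(retriever: Arc<dyn KeyRetriever>) -> Self {
        Self { keys: DecryptionKeys::ViaRetriever(retriever) }
    }

    /// Simplified footer key lookup: explicit key or delegate to the retriever.
    pub fn footer_key(&self, key_metadata: &[u8]) -> Result<Vec<u8>, String> {
        match &self.keys {
            DecryptionKeys::Explicit { footer_key } => Ok(footer_key.clone()),
            DecryptionKeys::ViaRetriever(r) => r.retrieve_key(key_metadata),
        }
    }

    /// The proposed method: reports the mode without exposing the internals.
    pub fn uses_key_retriever(&self) -> bool {
        matches!(self.keys, DecryptionKeys::ViaRetriever(_))
    }
}

/// Hypothetical retriever that always succeeds with a default key.
pub struct DefaultKeyRetriever;

impl KeyRetriever for DefaultKeyRetriever {
    fn retrieve_key(&self, _key_metadata: &[u8]) -> Result<Vec<u8>, String> {
        Ok(vec![1u8; 16])
    }
}

/// Hypothetical downstream conversion check that no longer depends on errors.
pub fn try_convert(props: &FileDecryptionProperties) -> Result<(), String> {
    if props.uses_key_retriever() {
        Err("key retrievers cannot be represented in this config type".to_string())
    } else {
        Ok(())
    }
}
```

A caller can then reject retriever-based properties up front, even when the retriever would happily return a key.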

Describe alternatives you've considered

Exposing the KeyRetriever itself, e.g. with a method like key_retriever(&self) -> Option<Arc<dyn KeyRetriever>>, or making the internal DecryptionKeys type public.

I think it might be best to minimise how much of the internals is exposed, though, so that things can be changed more easily in future without breaking the API.

We could also change the API of the column_keys method to return an Option, as at the moment an empty result could mean that a key retriever is used or that an explicit column key is set and uniform encryption is used. But this would be a breaking change.
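The ambiguity can be sketched with a simplified model. The enum and both functions below are hypothetical illustrations, and the return types are not the parquet crate's actual column_keys signature:

```rust
use std::collections::HashMap;

/// Simplified model of the two key modes (assumption, not the real type).
pub enum DecryptionKeys {
    Explicit {
        footer_key: Vec<u8>,
        column_keys: HashMap<String, Vec<u8>>,
    },
    ViaRetriever,
}

/// Models the current behaviour: an empty result in both the
/// uniform-encryption case and the key-retriever case.
pub fn column_keys_current(keys: &DecryptionKeys) -> Vec<(String, Vec<u8>)> {
    match keys {
        DecryptionKeys::Explicit { column_keys, .. } => column_keys
            .iter()
            .map(|(name, key)| (name.clone(), key.clone()))
            .collect(),
        DecryptionKeys::ViaRetriever => Vec::new(),
    }
}

/// Models the alternative: `None` signals that a key retriever is in use.
pub fn column_keys_optional(keys: &DecryptionKeys) -> Option<Vec<(String, Vec<u8>)>> {
    match keys {
        DecryptionKeys::Explicit { .. } => Some(column_keys_current(keys)),
        DecryptionKeys::ViaRetriever => None,
    }
}
```

In this model the current-style function returns the same empty list for both modes, while the Option-returning variant tells them apart at the cost of changing the return type.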

Additional context

See apache/datafusion#21603 (comment) for more context.

Metadata

Labels

enhancement: Any new improvement worthy of an entry in the changelog
