Is your feature request related to a problem or challenge? Please describe what you are trying to do.
DataFusion has its own configuration types for Parquet encryption, and for convenience you can convert a FileDecryptionProperties from the parquet crate into DataFusion's ConfigFileDecryptionProperties. However, the DataFusion type can't represent decryption properties that use a key retriever.
The public API of FileDecryptionProperties doesn't expose whether a key retriever is used, so DataFusion relies on receiving an error when it requests the footer key. KeyRetriever implementations won't necessarily return an error in this scenario, though. They could return a default key, or return the correct footer key, in which case DataFusion wouldn't know whether different column keys are needed.
Describe the solution you'd like
A new uses_key_retriever(&self) -> bool method on FileDecryptionProperties.
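As a rough sketch of how this could look and how a caller such as DataFusion might use it, here is a self-contained model. Everything in it other than the proposed uses_key_retriever name is a simplified stand-in written for illustration: the KeyRetriever trait, the DecryptionKeys enum, the FileDecryptionProperties struct, and the hypothetical ExplicitKeysConfig and to_config names are assumptions and do not reproduce the real parquet or DataFusion definitions.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-in for the parquet crate's KeyRetriever trait (assumption).
pub trait KeyRetriever: Send + Sync {
    fn retrieve_key(&self, key_metadata: &[u8]) -> Result<Vec<u8>, String>;
}

// Simplified stand-in for the internal DecryptionKeys type (assumption).
pub enum DecryptionKeys {
    Explicit {
        footer_key: Vec<u8>,
        column_keys: HashMap<String, Vec<u8>>,
    },
    ViaRetriever(Arc<dyn KeyRetriever>),
}

// Simplified stand-in for FileDecryptionProperties (assumption).
pub struct FileDecryptionProperties {
    pub keys: DecryptionKeys,
}

impl FileDecryptionProperties {
    // The proposed method: reports whether keys come from a retriever,
    // without exposing the retriever or the internal enum.
    pub fn uses_key_retriever(&self) -> bool {
        matches!(self.keys, DecryptionKeys::ViaRetriever(_))
    }
}

// Hypothetical config type that can only hold explicit keys, standing in
// for a type like DataFusion's ConfigFileDecryptionProperties.
pub struct ExplicitKeysConfig {
    pub footer_key: Vec<u8>,
    pub column_keys: HashMap<String, Vec<u8>>,
}

// Hypothetical conversion: reject retriever-based properties up front
// instead of inferring them from an error returned by the retriever.
pub fn to_config(props: &FileDecryptionProperties) -> Result<ExplicitKeysConfig, String> {
    if props.uses_key_retriever() {
        return Err("key retrievers cannot be represented in the config type".to_string());
    }
    match &props.keys {
        DecryptionKeys::Explicit { footer_key, column_keys } => Ok(ExplicitKeysConfig {
            footer_key: footer_key.clone(),
            column_keys: column_keys.clone(),
        }),
        DecryptionKeys::ViaRetriever(_) => unreachable!("handled by the check above"),
    }
}

// A retriever that never errors and always returns a default key, which
// is the case that error-based detection cannot catch.
pub struct DefaultKeyRetriever;

impl KeyRetriever for DefaultKeyRetriever {
    fn retrieve_key(&self, _key_metadata: &[u8]) -> Result<Vec<u8>, String> {
        Ok(vec![0u8; 16])
    }
}
```

With a check like this, a retriever such as DefaultKeyRetriever that always succeeds would still be detected, because the caller no longer depends on the retriever returning an error.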
Describe alternatives you've considered
Exposing the KeyRetriever itself, e.g. with a method like key_retriever(&self) -> Option<Arc<dyn KeyRetriever>>, or making the internal DecryptionKeys type public.
I think it would be best to minimise how much of the internals is exposed, though, so that the implementation can change in future without breaking the API.
We could also change the API of the column_keys method to return an Option, as at the moment an empty result could mean that a key retriever is used or that an explicit column key is set and uniform encryption is used. But this would be a breaking change.
Additional context
See apache/datafusion#21603 (comment) for more context.