Add example for how to read encrypted parquet files #7283

Merged
merged 9 commits into from
Mar 14, 2025
52 changes: 52 additions & 0 deletions parquet/src/arrow/mod.rs
@@ -93,6 +93,58 @@
//!
//! println!("Read {} records.", record_batch.num_rows());
//! ```
//!
//! # Example of reading non-uniformly encrypted parquet file into arrow record batch
//!
Contributor

Can we add a note here that this requires the experimental encryption feature to be enabled?

Member Author

Added.

//! Note: This requires the experimental `encryption` feature to be enabled at compile time.
//!
//!
#![cfg_attr(feature = "encryption", doc = "```rust")]
#![cfg_attr(not(feature = "encryption"), doc = "```ignore")]
//! # use arrow_array::{Int32Array, ArrayRef};
//! # use arrow_array::{types, RecordBatch};
//! # use parquet::arrow::arrow_reader::{
//! #     ArrowReaderMetadata, ArrowReaderOptions, ParquetRecordBatchReaderBuilder,
//! # };
//! # use arrow_array::cast::AsArray;
//! # use parquet::file::metadata::ParquetMetaData;
//! # use tempfile::tempfile;
//! # use std::fs::File;
//! # use parquet::encryption::decrypt::FileDecryptionProperties;
//! # let test_data = arrow::util::test_util::parquet_test_data();
//! # let path = format!("{test_data}/encrypt_columns_and_footer.parquet.encrypted");
//! #
//! let file = File::open(path).unwrap();
//!
//! // Define the AES encryption keys required for decrypting the footer metadata
//! // and column-specific data. If only a footer key is used then it is assumed that the
//! // file uses uniform encryption and all columns are encrypted with the footer key.
//! // If any column keys are specified, other columns without a key provided are assumed
//! // to be unencrypted.
//! let footer_key = "0123456789012345".as_bytes(); // Keys are 128 bits (16 bytes)
//! let column_1_key = "1234567890123450".as_bytes();
//! let column_2_key = "1234567890123451".as_bytes();
//!
//! let decryption_properties = FileDecryptionProperties::builder(footer_key.to_vec())
//!     .with_column_key("double_field", column_1_key.to_vec())
//!     .with_column_key("float_field", column_2_key.to_vec())
//!     .build()
//!     .unwrap();
//!
//! let options = ArrowReaderOptions::default()
//!     .with_file_decryption_properties(decryption_properties);
//! let reader_metadata = ArrowReaderMetadata::load(&file, options.clone()).unwrap();
//! let file_metadata = reader_metadata.metadata().file_metadata();
//! assert_eq!(50, file_metadata.num_rows());
//!
//! let mut reader = ParquetRecordBatchReaderBuilder::try_new_with_options(file, options)
//!     .unwrap()
//!     .build()
//!     .unwrap();
//!
//! let record_batch = reader.next().unwrap().unwrap();
//! assert_eq!(50, record_batch.num_rows());
//! ```

experimental!(mod array_reader);
pub mod arrow_reader;
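
Note on the key-selection rule: the comments in the doc example describe how the footer key and column keys interact. That rule can be modelled with a small standalone sketch. This is only an illustration of the rule as stated in the comments, not the `parquet` crate's implementation: `KeySource` and `key_for_column` are hypothetical names introduced here, and the code uses nothing beyond the standard library.

```rust
use std::collections::HashMap;

/// Which key (if any) a reader is assumed to use for a column.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum KeySource<'a> {
    /// Uniform encryption: the column is decrypted with the footer key.
    Footer(&'a [u8]),
    /// The column has its own explicitly supplied key.
    Column(&'a [u8]),
    /// Column keys exist, but none for this column: assumed unencrypted.
    Unencrypted,
}

/// Models the rule from the doc comment (hypothetical helper):
/// - no column keys at all  -> every column uses the footer key;
/// - some column keys given -> columns without a key are treated as plaintext.
fn key_for_column<'a>(
    footer_key: &'a [u8],
    column_keys: &'a HashMap<String, Vec<u8>>,
    column: &str,
) -> KeySource<'a> {
    if column_keys.is_empty() {
        return KeySource::Footer(footer_key);
    }
    match column_keys.get(column) {
        Some(key) => KeySource::Column(key.as_slice()),
        None => KeySource::Unencrypted,
    }
}

fn main() {
    let footer_key = b"0123456789012345".to_vec();

    // Only a footer key: uniform encryption is assumed.
    let no_column_keys: HashMap<String, Vec<u8>> = HashMap::new();
    println!(
        "{:?}",
        key_for_column(&footer_key, &no_column_keys, "double_field")
    );

    // Column keys supplied, as in the doc example above.
    let mut column_keys = HashMap::new();
    column_keys.insert("double_field".to_string(), b"1234567890123450".to_vec());
    column_keys.insert("float_field".to_string(), b"1234567890123451".to_vec());

    for column in ["double_field", "float_field", "int32_field"] {
        println!("{column}: {:?}", key_for_column(&footer_key, &column_keys, column));
    }
}
```

In this model `int32_field` (a made-up column name) comes back as `Unencrypted`, which mirrors the caveat in the comments: once any column key is supplied, a column left out of the key map is not decrypted with the footer key.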