Conversation

@etseidl (Contributor) commented Oct 8, 2025

Which issue does this PR close?

Closes [thrift-remodel] Optimize convert_row_groups (listed under Development below).
Rationale for this change

A good bit (around 15%) of Parquet metadata parsing involves first decoding to thrift structs (FileMetaData, RowGroup, etc), and then converting to the metadata structs used by this crate (ParquetMetaData, RowGroupMetaData, etc). This PR removes some of the intermediate structures and parses straight to the crate structs.
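
Schematically, the change replaces a two-step decode with a single pass. A minimal sketch with hypothetical types (not the crate's actual API):

struct ThriftRowGroup { num_rows: i64 }   // generated intermediate (before)
struct RowGroupMetaData { num_rows: i64 } // crate struct (what callers use)

fn read_i64(bytes: &[u8]) -> i64 {
    // stand-in for real thrift compact-protocol decoding
    i64::from_le_bytes(bytes[..8].try_into().unwrap())
}

// Before: decode to the thrift struct, then convert.
fn decode_two_step(bytes: &[u8]) -> RowGroupMetaData {
    let t = ThriftRowGroup { num_rows: read_i64(bytes) };
    RowGroupMetaData { num_rows: t.num_rows }
}

// After: parse straight into the crate struct, no intermediate.
fn decode_direct(bytes: &[u8]) -> RowGroupMetaData {
    RowGroupMetaData { num_rows: read_i64(bytes) }
}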

What changes are included in this PR?

Some thrift-generated structures are removed, and the code necessary to decode them has been hand-written. This will enable future optimizations such as selectively decoding parts of the metadata.

In addition to the above, this PR cleans up some of the memory size computation, and also boxes some of the objects used for decryption.

Are these changes tested?

Should be covered by existing tests.

Are there any user-facing changes?

No, only private APIs are changed.

@github-actions bot added the parquet label Oct 8, 2025
@etseidl (Contributor, Author) commented Oct 8, 2025

Benchmarks relative to current main.

default features
group                             main_57_0_default                      no_conv
-----                             -----------------                      -------
decode parquet metadata           1.10     19.2±0.29µs        ? ?/sec    1.00     17.4±0.31µs        ? ?/sec
decode parquet metadata (wide)    1.15     85.1±1.73ms        ? ?/sec    1.00     74.2±1.58ms        ? ?/sec
open(default)                     1.10     20.0±0.34µs        ? ?/sec    1.00     18.1±0.41µs        ? ?/sec
open(page index)                  1.02    257.3±7.66µs        ? ?/sec    1.00    252.9±4.98µs        ? ?/sec

all features
group                             main_57_0                              no_conv_encr
-----                             ---------                              ------------
decode parquet metadata           1.08     19.5±0.39µs        ? ?/sec    1.00     18.0±0.32µs        ? ?/sec
decode parquet metadata (wide)    1.14     89.2±1.68ms        ? ?/sec    1.00     78.5±1.69ms        ? ?/sec
open(default)                     1.14     21.3±0.38µs        ? ?/sec    1.00     18.8±0.39µs        ? ?/sec
open(page index)                  1.02    259.8±5.41µs        ? ?/sec    1.00    254.6±5.31µs        ? ?/sec

Comment on lines 59 to 63
impl<T: HeapSize> HeapSize for Box<T> {
    fn heap_size(&self) -> usize {
        self.as_ref().heap_size()
    }
}
@etseidl (Contributor, Author) Oct 8, 2025

Not sure if this is correct. Should this also include the size of T?

@alamb (Contributor)

Per the description of

/// Return the size of any bytes allocated on the heap by this object,
/// including heap memory in those structures
///
/// Note that the size of the type itself is not included in the result --
/// instead, that size is added by the caller (e.g. container).

So in that case I do think the size of T should be included. The size of self (the pointer/Box) should not be included, but the memory it points to should be.

@etseidl (Contributor, Author)

Thanks, I'll fix.
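
(A minimal sketch of the resulting fix, assuming the HeapSize trait contract quoted above:)

impl<T: HeapSize> HeapSize for Box<T> {
    fn heap_size(&self) -> usize {
        // The Box pointer itself is counted by the containing object; the
        // boxed value lives on the heap, so include its size plus any heap
        // memory it owns in turn.
        std::mem::size_of::<T>() + self.as_ref().heap_size()
    }
}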

}
);

// TODO(ets): move a lot of the encryption stuff to its own module
@etseidl (Contributor, Author)

There is a follow-on PR ready to do just this.

Comment on lines +95 to +96
thrift_struct!(
    pub(crate) struct Statistics<'a> {
@etseidl (Contributor, Author)

this just moved from below

return Err(general_err!("Column order length mismatch"));
// using ThriftSliceInputProtocol rather than the ThriftCompactInputProtocol trait because
// these are all internal and operate on slices.
fn read_column_chunk<'a>(
@etseidl (Contributor, Author)

I realize this is quite ugly, but it's necessary both for performance and for stats skipping and other optimizations. I do have a simplified version of this function in the queue, so it will get better with time.

@alamb (Contributor) commented Oct 9, 2025

👀

@alamb (Contributor) left a comment

Thanks @etseidl -- I started going through this PR but didn't make it. I will finish up tomorrow

+ self.unencoded_byte_array_data_bytes.heap_size()
+ self.repetition_level_histogram.heap_size()
+ self.definition_level_histogram.heap_size()
+ self.geo_statistics.heap_size()
@alamb (Contributor)

💯


impl HeapSize for FileMetaData {
    fn heap_size(&self) -> usize {
        #[cfg(feature = "encryption")]
@alamb (Contributor)

@etseidl (Contributor, Author)

No, we still need to implement HeapSize for FileDecryptor.

#[test]
fn test_parquet_1481() {
    let err = read_file("PARQUET-1481.parquet").unwrap_err();
    #[cfg(feature = "encryption")]
@alamb (Contributor)

Why did this test change?

@etseidl (Contributor, Author)

Because I've unified the file metadata decoding for both paths, so now we get the same error regardless of whether encryption is enabled or not.

@etseidl (Contributor, Author) commented Oct 9, 2025

> Thanks @etseidl -- I started going through this PR but didn't make it. I will finish up tomorrow

Thanks @alamb, I know it's a lot to slog through. This is the largest one I've got queued up 😅

@alamb (Contributor) commented Oct 10, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubuntu SMP Wed Sep 3 01:55:36 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing no_row_group_conv (12a8c2f) to 8e669e7
BENCH_NAME=metadata
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench metadata
BENCH_FILTER=
BENCH_BRANCH_NAME=no_row_group_conv
Results will be posted here when complete

@alamb (Contributor) commented Oct 10, 2025

🤖: Benchmark completed


group                             main                                   no_row_group_conv
-----                             ----                                   -----------------
decode parquet metadata           1.07     13.0±0.07µs        ? ?/sec    1.00     12.1±0.13µs        ? ?/sec
decode parquet metadata (wide)    1.19     71.6±1.05ms        ? ?/sec    1.00     60.1±1.02ms        ? ?/sec
open(default)                     1.11     13.0±0.05µs        ? ?/sec    1.00     11.7±0.09µs        ? ?/sec
open(page index)                  1.01    203.5±1.58µs        ? ?/sec    1.00    201.8±2.20µs        ? ?/sec

@alamb (Contributor) commented Oct 10, 2025

> 🤖: Benchmark completed

🚀 that is quite nice @etseidl -- love it.

@alamb (Contributor) left a comment

Thank you @etseidl -- this is a great example of what the new thrift decoding infrastructure allows ❤️

}

/// Create a [`crate::file::statistics::Statistics`] from a thrift [`Statistics`] object.
pub(crate) fn convert_stats(
@alamb (Contributor)

This is probably a good target to consider reworking eventually as well if it shows up in traces.

@etseidl (Contributor, Author)

Added to the list.


let compression = codec;

// NOTE: I tried using the builder for this, but it added 20% to the execution time
@alamb (Contributor)

that is interesting and unexpected. Sounds like a good idea to keep it like this

@etseidl (Contributor, Author)

Spoiler: I eventually take another page from @jhorstmann's book and use the builder to create a default initialized ColumnChunkMetaData whose fields I then fill in directly. It reduces the code bloat and allows me to again split out the ColumnMetaData parsing without any drop in performance.
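
(The pattern described, as a self-contained sketch with hypothetical types rather than the crate's actual ones: build one default-initialized value, then fill its fields in place while decoding instead of constructing and moving a temporary.)

#[derive(Default)]
struct ColumnMeta {
    num_values: i64,
    total_byte_size: i64,
}

fn decode_column(values: i64, bytes: i64) -> ColumnMeta {
    // Start from a default-initialized struct (what the builder produces),
    // then assign fields directly -- no temporary struct to move.
    let mut meta = ColumnMeta::default();
    meta.num_values = values;
    meta.total_byte_size = bytes;
    meta
}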

}

Ok(ParquetMetaData::new(fmd, row_groups))
/// Create [`ParquetMetaData`] from thrift input. Note that this only decodes the file metadata in
@alamb (Contributor)

this is really neat. With this code, I can easily see a bunch more potential optimizations (for example, use a single Vec of ColumnChunkMetadata and have each RowGroup store its starting offset into that Vec, rather than Vecs of Vecs).

(not for this PR obviously, but for a future one)
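
(A sketch of the flattened layout being suggested, with hypothetical types:)

struct ColumnChunk { /* per-column-chunk metadata */ }

struct RowGroup {
    first_column: usize, // offset into the shared columns Vec
    num_columns: usize,
}

struct ParquetMetaDataFlat {
    columns: Vec<ColumnChunk>, // all columns of all row groups, contiguous
    row_groups: Vec<RowGroup>,
}

impl ParquetMetaDataFlat {
    fn columns_of(&self, rg: &RowGroup) -> &[ColumnChunk] {
        &self.columns[rg.first_column..rg.first_column + rg.num_columns]
    }
}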

file_offset = Some(i64::read_thrift(&mut *prot)?);
}
3 => {
// `ColumnMetaData`. Read inline for performance sake.
@alamb (Contributor)

what do you mean by "for performance sake" -- does that mean the rust compiler isn't inlining this when using the macro? Maybe we should sprinkle some #[inline] 🤔

@etseidl (Contributor, Author)

It was hard to tell from profiling exactly what was adding to the overhead, but splitting this into a separate call that decodes and returns a ColumnMetaData object, which is then moved into the ColumnChunkMetaData, added about 5% to execution time IIRC. As mentioned above, my latest code passes the ColumnChunkMetaData in, so no temporary object is created. Then I get some code reuse and this becomes easier to read 😄

@alamb (Contributor)

Another benefit of all this thrift remodel work is that we can do this level of optimization (which is basically impractical with generated code)

@etseidl (Contributor, Author)

especially when it comes to things like recasting the encodings vector as a bitmask. That gets rid of a ton of allocations 😄
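
(A self-contained sketch of that idea, with a hypothetical mask type; the encoding discriminants follow the Parquet spec values. Recording an encoding becomes a bit-or instead of a Vec push and its allocation:)

#[derive(Clone, Copy)]
enum Encoding {
    Plain = 0,
    Rle = 3,
    DeltaBinaryPacked = 5,
}

#[derive(Default, Clone, Copy)]
struct EncodingMask(u32);

impl EncodingMask {
    fn insert(&mut self, e: Encoding) {
        self.0 |= 1 << (e as u32); // set the bit for this encoding
    }
    fn contains(&self, e: Encoding) -> bool {
        self.0 & (1 << (e as u32)) != 0
    }
}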

Comment on lines +1058 to +1062
let mut cols = Vec::with_capacity(list_ident.size as usize);
for i in 0..list_ident.size as usize {
    let col = read_column_chunk(prot, &schema_descr.columns()[i])?;
    cols.push(col);
}
@alamb (Contributor)

I am just curious -- did you try a more functional form? (I never know quite what the rust compiler knows how to optimize.)

Sometimes using this type of loop lets the compiler avoid the bounds checks, which may or may not help in this case

Suggested change
let mut cols = Vec::with_capacity(list_ident.size as usize);
for i in 0..list_ident.size as usize {
    let col = read_column_chunk(prot, &schema_descr.columns()[i])?;
    cols.push(col);
}
let cols = schema_descr
    .columns()
    .iter()
    .map(|c| read_column_chunk(prot, c))
    .collect::<Result<Vec<_>>>()?;

(I don't think we should do this as part of this PR, but it might be interesting to try in the future)

@etseidl (Contributor, Author)

I tried that last night 😅. It was a bit slower. That pattern has been a bit of a mixed bag. I have used it where I can.
https://github.com/etseidl/arrow-rs/blob/12a8c2f34dd9410d449b8562a72665670a6b65d7/parquet/src/file/metadata/thrift_gen.rs#L656-L659

I keep trying it here and there, and I keep it when it doesn't hurt performance.

@alamb (Contributor)

fascinating!

@etseidl (Contributor, Author) commented Oct 10, 2025

Going to merge this and move on to the encryption refactor + part of #8518 (the latter has a small breaking change I want to get in while I can 😁).

@etseidl merged commit b51a000 into apache:main Oct 10, 2025
16 checks passed
@etseidl deleted the no_row_group_conv branch October 12, 2025

Labels

parquet (Changes to the parquet crate), performance


Development

Successfully merging this pull request may close these issues.

[thrift-remodel] Optimize convert_row_groups
