Closed
Description
Describe the bug, including details regarding any error messages, version, and platform.
The first and last lines of the code block below access the same metadata variable, but only one of them does so while holding a lock.
I assume this means the other access should hold the lock too.
Other places in this file also access metadata in tricky ways (e.g. it is not clear at first glance whether a method allows nullptr or not). Those could race as well.
arrow/cpp/src/arrow/dataset/file_parquet.cc
Lines 607 to 618 in 0dbbd43
    if (parquet_fragment->metadata() != nullptr) {
      ARROW_ASSIGN_OR_RAISE(row_groups, parquet_fragment->FilterRowGroups(options->filter));
      pre_filtered = true;
      if (row_groups.empty()) return MakeEmptyGenerator<std::shared_ptr<RecordBatch>>();
    }
    // Open the reader and pay the real IO cost.
    auto make_generator =
        [this, options, parquet_fragment, pre_filtered,
         row_groups](const std::shared_ptr<parquet::arrow::FileReader>& reader) mutable
        -> Result<RecordBatchGenerator> {
      // Ensure that parquet_fragment has FileMetaData
      RETURN_NOT_OK(parquet_fragment->EnsureCompleteMetadata(reader.get()));
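To illustrate the kind of fix suggested above, here is a minimal, self-contained sketch in which the reader and the writer of a metadata pointer take the same mutex. It is not Arrow's code: `FakeMetadata`, `FakeFragment`, its `mutex_` member, and `RunConcurrentScan` are all hypothetical stand-ins, and the real `ParquetFileFragment` may need a different locking scheme.

```cpp
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for parquet::FileMetaData.
struct FakeMetadata {
  int num_row_groups = 0;
};

// Hypothetical stand-in for ParquetFileFragment (not Arrow's actual class).
class FakeFragment {
 public:
  // Reader: takes the same mutex as the writer and returns a copy of the
  // shared_ptr, so the caller never observes a half-updated pointer.
  std::shared_ptr<FakeMetadata> metadata() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return metadata_;
  }

  // Writer: populates the metadata at most once, under the lock.
  void EnsureCompleteMetadata(int num_row_groups) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (metadata_ != nullptr) return;
    auto md = std::make_shared<FakeMetadata>();
    md->num_row_groups = num_row_groups;
    metadata_ = std::move(md);
  }

 private:
  mutable std::mutex mutex_;
  std::shared_ptr<FakeMetadata> metadata_;
};

// Runs one writer and several readers concurrently. Returns the number of
// readers that saw either nullptr or fully populated metadata.
int RunConcurrentScan(FakeFragment& fragment, int num_readers) {
  std::vector<int> ok(num_readers, 0);
  std::vector<std::thread> threads;
  threads.emplace_back([&fragment] { fragment.EnsureCompleteMetadata(4); });
  for (int i = 0; i < num_readers; ++i) {
    threads.emplace_back([&fragment, &ok, i] {
      auto md = fragment.metadata();
      ok[i] = (md == nullptr || md->num_row_groups == 4) ? 1 : 0;
    });
  }
  for (auto& t : threads) t.join();
  int count = 0;
  for (int v : ok) count += v;
  return count;
}
```

Because the accessor returns the pointer by value under the lock, a caller that checks the result against nullptr and then uses it works on one consistent snapshot instead of reading the member twice.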
Component(s)
C++, Parquet