Closed as not planned
Description
Hi team! I am using Arrow 0.15.1 and hit a problem when reading a Parquet file that contains an array column. The ASan output is:
parquet-low-level-example(49396,0x7ff848622680) malloc: nano zone abandoned due to inability to preallocate reserved vm space.
row num:1000000
=================================================================
==49396==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0001087d73f8 at pc 0x0001076ecb8d bp 0x7ff7b8b4d6b0 sp 0x7ff7b8b4d6a8
WRITE of size 8 at 0x0001087d73f8 thread T0
#0 0x1076ecb8c in int arrow::util::RleDecoder::GetBatchWithDictSpaced<long long>(long long const*, long long*, int, int, unsigned char const*, long long) rle_encoding.h:488
#1 0x1076e62c8 in parquet::DictDecoderImpl<parquet::PhysicalType<(parquet::Type::type)2> >::DecodeSpaced(long long*, int, int, unsigned char const*, long long) encoding.cc:1079
#2 0x1075d9e6b in parquet::internal::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)2> >::ReadValuesSpaced(long long, long long) column_reader.cc:1052
#3 0x1075dc1a9 in parquet::internal::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)2> >::ReadRecordData(long long) column_reader.cc:1096
#4 0x1075d6a4c in parquet::internal::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)2> >::ReadRecords(long long) column_reader.cc:822
#5 0x1073d1583 in parquet::arrow::LeafReader::NextBatch(long long, std::__1::shared_ptr<arrow::ChunkedArray>*) reader.cc:414
#6 0x1073d55bd in parquet::arrow::NestedListReader::NextBatch(long long, std::__1::shared_ptr<arrow::ChunkedArray>*) reader.cc:469
#7 0x1073f5a82 in parquet::arrow::RowGroupRecordBatchReader::ReadNext(std::__1::shared_ptr<arrow::RecordBatch>*) reader.cc:320
#8 0x1073b409a in printParquetFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) reader-writer.cc:97
#9 0x1073b5209 in main reader-writer.cc:111
#10 0x7ff8049b230f (<unknown module>)
0x0001087d73f8 is located 40 bytes to the left of global variable 'guard variable for arrow::SparseTensor::dim_name(int) const::kEmpty' defined in 'arrow-apache-arrow-0.15.1/cpp/src/arrow/sparse_tensor.cc' (0x1087d7420) of size 8
0x0001087d73f8 is located 0 bytes to the right of global variable 'kEmpty' defined in 'arrow-apache-arrow-0.15.1/cpp/src/arrow/sparse_tensor.cc:415:28' (0x1087d73e0) of size 24
SUMMARY: AddressSanitizer: global-buffer-overflow rle_encoding.h:488 in int arrow::util::RleDecoder::GetBatchWithDictSpaced<long long>(long long const*, long long*, int, int, unsigned char const*, long long)
Shadow bytes around the buggy address:
0x1000210fae20: 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
0x1000210fae30: 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 00 f9 f9 f9
0x1000210fae40: 00 00 f9 f9 00 f9 f9 f9 01 f9 f9 f9 01 f9 f9 f9
0x1000210fae50: 01 f9 f9 f9 01 f9 f9 f9 00 00 f9 f9 00 f9 f9 f9
0x1000210fae60: 01 f9 f9 f9 00 00 f9 f9 00 f9 f9 f9 00 00 00 00
=>0x1000210fae70: 00 00 00 f9 f9 f9 f9 f9 00 00 00 00 00 00 00[f9]
0x1000210fae80: f9 f9 f9 f9 00 f9 f9 f9 00 00 00 f9 f9 f9 f9 f9
0x1000210fae90: 00 f9 f9 f9 00 00 f9 f9 00 f9 f9 f9 00 00 f9 f9
0x1000210faea0: 00 f9 f9 f9 00 00 f9 f9 00 f9 f9 f9 00 00 f9 f9
0x1000210faeb0: 00 f9 f9 f9 00 00 f9 f9 00 f9 f9 f9 00 00 f9 f9
0x1000210faec0: 00 f9 f9 f9 00 00 f9 f9 00 f9 f9 f9 00 00 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==49396==ABORTING
Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
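For anyone reading the trace: ASan reports a WRITE of size 8 inside `GetBatchWithDictSpaced`, landing just past the end of a global. As a rough illustration of what a spaced dictionary decode does and where bounds checks on the output buffer and dictionary indices would sit, here is a minimal sketch. `DecodeDictSpaced` and its signature are hypothetical and are not Arrow's implementation, and this does not claim to identify the root cause of the crash above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, NOT Arrow's API. Expands dictionary indices into `out`,
// skipping slots whose validity bit is 0 (those slots are left untouched).
// Returns the number of non-null values written.
inline int DecodeDictSpaced(const std::vector<int64_t>& dictionary,
                            const std::vector<int32_t>& indices,
                            const std::vector<uint8_t>& valid_bits,
                            int64_t valid_bits_offset,
                            std::vector<int64_t>& out,
                            int batch_size) {
  assert(valid_bits_offset >= 0);
  // Guard the output buffer: every slot in [0, batch_size) must exist.
  if (batch_size < 0 || static_cast<std::size_t>(batch_size) > out.size()) {
    throw std::out_of_range("batch_size exceeds output buffer");
  }
  std::size_t next_index = 0;  // position in the stream of decoded indices
  int written = 0;
  for (int i = 0; i < batch_size; ++i) {
    const int64_t bit = valid_bits_offset + i;
    if (static_cast<std::size_t>(bit / 8) >= valid_bits.size()) {
      throw std::out_of_range("validity bitmap too short");
    }
    const bool is_valid = ((valid_bits[bit / 8] >> (bit % 8)) & 1) != 0;
    if (!is_valid) continue;  // null slot: consume no index, write nothing
    if (next_index >= indices.size()) {
      throw std::out_of_range("ran out of dictionary indices");
    }
    const int32_t idx = indices[next_index++];
    // Guard the dictionary lookup.
    if (idx < 0 || static_cast<std::size_t>(idx) >= dictionary.size()) {
      throw std::out_of_range("dictionary index out of range");
    }
    out[i] = dictionary[idx];
    ++written;
  }
  return written;
}
```

The sketch throws instead of writing out of bounds, so a mismatch between the requested batch size and the destination buffer surfaces as an exception rather than memory corruption.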
Component(s)
C++