[C++] misaligned-pointer-use after clang-18 upgrade #44372

@dotnwat

Description

Describe the bug, including details regarding any error messages, version, and platform.

After upgrading to clang 18, UBSan reports a misaligned-pointer-use error when reading a Parquet table.

Tried with both the Arrow 16 and Arrow 17 releases.

/v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/arrow/util/ubsan.h:66:21: runtime error: load of misaligned address 0x7f0787c2d3c2 for type 'const unsigned int *', which requires 4 byte alignment
--
  | 0x7f0787c2d3c2: note: pointer points here
  | a5 bd  06 0b 40 20 0c 44 61 1c  48 a2 2c 4c e3 3c 50 24  4d 54 65 5d 58 a6 6d 5c  e7 7d 60 28 8e 24
  | ^
  | SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/arrow/util/ubsan.h:66:21
  | [Backtrace #8]
  | __sanitizer::Abort() at /llvm/3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff/src/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:143
  |  
  |  
  | [Backtrace #9]
  | __sanitizer::Die() at /llvm/3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff/src/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:58
  |  
  |  
  | [Backtrace #10]
  | __ubsan::ScopedReport::~ScopedReport() at /llvm/3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff/src/compiler-rt/lib/ubsan/ubsan_diag.cpp:402
  |  
  |  
  | [Backtrace #11]
  | handleTypeMismatchImpl(__ubsan::TypeMismatchData*, unsigned long, __ubsan::ReportOptions) at /llvm/3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff/src/compiler-rt/lib/ubsan/ubsan_handlers.cpp:137
  |  
  |  
  | [Backtrace #12]
  | __ubsan_handle_type_mismatch_v1 at /llvm/3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff/src/compiler-rt/lib/ubsan/ubsan_handlers.cpp:142
  |  
  |  
  | [Backtrace #13]
  | int arrow::internal::unpack32_specialized<arrow::internal::(anonymous namespace)::UnpackBits512<(arrow::internal::DispatchLevel)3> >(unsigned int const*, unsigned int*, int, int) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/arrow/util/ubsan.h:66
  |  
  |  
  | [Backtrace #14]
  | int arrow::bit_util::BitReader::GetBatch<int>(int, int*, int) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/arrow/util/bit_stream_utils.h:342
  |  
  |  
  | [Backtrace #15]
  | int arrow::util::RleDecoder::GetBatchWithDict<long>(long const*, int, long*, int) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/arrow/util/rle_encoding.h:580
  |  
  |  
  | [Backtrace #16]
  | parquet::(anonymous namespace)::DictDecoderImpl<parquet::PhysicalType<(parquet::Type::type)2> >::Decode(long*, int) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/encoding.cc:1634
  |  
  |  
  | [Backtrace #17]
  | parquet::internal::(anonymous namespace)::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)2> >::ReadValuesDense(long) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/column_reader.cc:1824
  |  
  |  
  | [Backtrace #18]
  | parquet::internal::(anonymous namespace)::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)2> >::ReadRecordData(long) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/column_reader.cc:1879
  |  
  |  
  | [Backtrace #19]
  | parquet::internal::(anonymous namespace)::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)2> >::ReadRecords(long) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/column_reader.cc:1425
  |  
  |  
  | [Backtrace #20]
  | parquet::arrow::(anonymous namespace)::LeafReader::LoadBatch(long) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/arrow/reader.cc:482
  |  
  |  
  | [Backtrace #21]
  | parquet::arrow::ColumnReaderImpl::NextBatch(long, std::__1::shared_ptr<arrow::ChunkedArray>*) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/arrow/reader.cc:109
  |  
  |  
  | [Backtrace #22]
  | parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadColumn(int, std::__1::vector<int, std::__1::allocator<int> > const&, parquet::arrow::ColumnReader*, std::__1::shared_ptr<arrow::ChunkedArray>*) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/arrow/reader.cc:284
  |  
  |  
  | [Backtrace #23]
  | parquet::arrow::(anonymous namespace)::FileReaderImpl::DecodeRowGroups(std::__1::shared_ptr<parquet::arrow::(anonymous namespace)::FileReaderImpl>, std::__1::vector<int, std::__1::allocator<int> > const&, std::__1::vector<int, std::__1::allocator<int> > const&, arrow::internal::Executor*)::$_0::operator()(unsigned long, std::__1::shared_ptr<parquet::arrow::ColumnReaderImpl>) const at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/arrow/reader.cc:1252
  |  
  |  
  | [Backtrace #24]
  | parquet::arrow::(anonymous namespace)::FileReaderImpl::DecodeRowGroups(std::__1::shared_ptr<parquet::arrow::(anonymous namespace)::FileReaderImpl>, std::__1::vector<int, std::__1::allocator<int> > const&, std::__1::vector<int, std::__1::allocator<int> > const&, arrow::internal::Executor*) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/arrow/util/parallel.h:95
  |  
  |  
  | [Backtrace #25]
  | parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadRowGroups(std::__1::vector<int, std::__1::allocator<int> > const&, std::__1::vector<int, std::__1::allocator<int> > const&, std::__1::shared_ptr<arrow::Table>*) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/arrow/reader.cc:1231
  |  
  |  
  | [Backtrace #26]
  | parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadTable(std::__1::vector<int, std::__1::allocator<int> > const&, std::__1::shared_ptr<arrow::Table>*) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/arrow/reader.cc:199
  |  
  |  
  | [Backtrace #27]
  | parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadTable(std::__1::shared_ptr<arrow::Table>*) at /v/build/v_deps_build/arrow-prefix/src/arrow/cpp/src/parquet/arrow/reader.cc:300
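
For readers unfamiliar with this class of report: the address in the error, 0x7f0787c2d3c2, sits two bytes past a 4-byte boundary, so reading it through a `const unsigned int *` violates that type's alignment requirement. The sketch below shows the general pattern for reading a 32-bit value from a possibly misaligned byte address. It is illustrative only and is not Arrow's code; the names `LoadUnaligned32` and `IsAligned32` are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Arrow): read a 32-bit value from a byte
// address that may not be 4-byte aligned. The source is passed as a byte
// pointer, so no misaligned `unsigned int` pointer is ever formed.
inline std::uint32_t LoadUnaligned32(const unsigned char* p) {
  std::uint32_t value;
  std::memcpy(&value, p, sizeof(value));  // memcpy has no alignment requirement
  return value;
}

// Hypothetical helper: true if `p` satisfies the alignment of a 32-bit integer.
inline bool IsAligned32(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % alignof(std::uint32_t) == 0;
}
```

Keeping the source typed as bytes until the copy is done is the point of the pattern: the alignment requirement attaches to the pointer type, not just to the load itself.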

Component(s)

C++
