Releases: apache/arrow-rs

arrow 55.0.0

08 Apr 15:24
9322547

Changelog

55.0.0 (2025-04-08)

Full Changelog

Breaking changes:

Implemented enhancements:

  • Improve the performance of concat #7357 [arrow]
  • Push down predicates to Parquet in-memory row group fetches #7348 [parquet]
  • Improve CSV parsing errors: print the row that makes CSV parsing fail #7344 [arrow]
  • Support ColumnMetaData encoding_stats in Parquet Writing #7341 [parquet]
  • Support writing Parquet with modular encryption #7327 [parquet]
  • Parquet Use U64 Instead of Usize (wasm support for files greater than 4GB) #7238 [parquet]
  • Support different TimeUnits and timezones when reading Timestamps from INT96 #7220 [parquet]
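
To illustrate why #7238 matters for wasm, here is a minimal sketch using only the Rust standard library. It assumes a 32-bit target where `usize` is 32 bits wide; the `Usize32` alias and the `offset_as_usize32` helper are hypothetical names for illustration and are not part of the parquet crate.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for `usize` on a 32-bit target such as wasm32.
type Usize32 = u32;

/// Try to represent a 64-bit file offset in a 32-bit address-sized integer.
/// Returns `None` when the offset does not fit.
fn offset_as_usize32(offset: u64) -> Option<Usize32> {
    Usize32::try_from(offset).ok()
}

fn main() {
    // 5 GiB fits comfortably in a u64 ...
    let five_gib: u64 = 5 * 1024 * 1024 * 1024;
    // ... but cannot be expressed as a 32-bit usize.
    println!("{:?}", offset_as_usize32(five_gib)); // None
    // Small offsets convert without loss.
    println!("{:?}", offset_as_usize32(1024)); // Some(1024)
}
```

Tracking offsets and lengths as `u64` avoids this limit regardless of the target's pointer width.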

Fixed bugs:

  • New clippy failures in code base with release of rustc 1.86 #7381 [parquet] [arrow]
  • Fix bug in ParquetMetaDataReader and add test of suffix metadata reads with encryption #7372 [parquet] (etseidl)

Documentation updates:

  • Improve documentation on ArrayData::offset #7385 [arrow] (alamb)
  • Improve documentation for AsyncFileReader::get_metadata #7380 [parquet] (alamb)
  • Improve documentation on implementing Parquet predicate pushdown #7370 [parquet] (alamb)
  • Add documentation and examples for pretty printing, make pretty_format_columns_with_options pub #7346 [arrow] (alamb)
  • Improve documentation on writing parquet, including multiple threads #7321 [parquet] (alamb)

Merged pull requests:

arrow 54.3.1

26 Mar 16:02
e62b212

Changelog

54.3.1 (2025-03-26)

Full Changelog

Fixed bugs:

  • Round trip encoding of list of fixed list fails when offset is not zero #7315

Merged pull requests:

* This Changelog was automatically generated by github_changelog_generator

arrow 54.3.0

17 Mar 20:57
57942c4

Changelog

54.3.0 (2025-03-17)

Full Changelog

Implemented enhancements:

  • Using column chunk offset index in InMemoryRowGroup::fetch #7300
  • Support reading parquet with modular encryption #7296 [parquet]
  • Add example for how to read/write encrypted parquet files #7281 [parquet]
  • Have writer return parsed ParquetMetadata #7254 [parquet]
  • feat: Support Utf8View in JSON reader #7244 [arrow]
  • StructBuilder should provide a way to get a &dyn ArrayBuilder of a field builder #7193 [arrow]
  • Support div_wrapping/rem_wrapping for numeric arithmetic kernels #7158 [arrow]
  • Improve RleDecoder performance #7195 [parquet] (Dandandan)
  • Improve arrow-json deserialization performance by 30% #7157 [arrow] (mwylde)
  • Add with_skip_validation flag to IPC StreamReader, FileReader and FileDecoder #7120 [arrow] (alamb)
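
As a rough illustration of what "wrapping" division and remainder (#7158) mean for integers, the sketch below uses only the standard library's `wrapping_div` and `wrapping_rem`. It assumes the arrow kernels follow the same semantics for integer types, which this changelog does not state; `div_wrapping_i32` and `rem_wrapping_i32` are hypothetical helper names, not arrow APIs.

```rust
/// Wrapping division: the one overflowing case, `i32::MIN / -1`, wraps
/// around to `i32::MIN` instead of panicking. Division by zero still panics.
fn div_wrapping_i32(a: i32, b: i32) -> i32 {
    a.wrapping_div(b)
}

/// Wrapping remainder: `i32::MIN % -1` yields 0 instead of panicking.
fn rem_wrapping_i32(a: i32, b: i32) -> i32 {
    a.wrapping_rem(b)
}

fn main() {
    // Checked division reports the overflow as `None`.
    println!("{:?}", i32::MIN.checked_div(-1)); // None
    // Wrapping division returns the wrapped value.
    println!("{}", div_wrapping_i32(i32::MIN, -1) == i32::MIN); // true
    println!("{}", rem_wrapping_i32(i32::MIN, -1)); // 0
    // Ordinary inputs behave like normal integer division.
    println!("{}", div_wrapping_i32(7, 2)); // 3
}
```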

Fixed bugs:

  • Archery integration CI test is failing on main: error: package half v2.5.0 cannot be built because it requires rustc 1.81 or newer, while the currently active rustc version is 1.77.2 #7291
  • MSRV CI check is failing on main #7289
  • Incorrect IPC schema encoding for multiple dictionaries #7058 [arrow] [arrow-flight]

Documentation updates:

Merged pull requests:

object_store 0.12.0

03 Apr 10:59
3da5e0d

Changelog

object_store_0.12.0 (2025-03-05)

Full Changelog

Breaking changes:

Implemented enhancements:

Fixed bugs:

Merged pull requests:

arrow 54.2.1

27 Feb 12:07
3f56468

Changelog

54.2.1 (2025-02-27)

Full Changelog

Fixed bugs:

  • Use chrono >= 0.4.34, < 0.4.40 to avoid breaking #7210

arrow 54.2.0

12 Feb 15:34
d4b9482

Changelog

54.2.0 (2025-02-12)

Full Changelog

Implemented enhancements:

  • Casting from Utf8View to Dict(k, Utf8View) #7114
  • Support creating map arrays with key metadata #7100 [arrow]
  • [parquet] Print Parquet BasicTypeInfo id when present #7081 [parquet]
  • Add arrow-ipc benchmarks for the IPC reader and writer #6968 [arrow]

Fixed bugs:

  • NullBufferBuilder::allocated_size Returns Size in Bits #7121 [arrow]
  • [Regression in 54.0.0]. Decimal cast to smaller precision gives invalid (off-by-one) result in some cases #7069 [arrow]
  • Minor: Fix deprecated note to point to the correct const #7067 [arrow]
  • incorrect error message for reading definition levels #7056 [parquet]
  • First None in ListArray panics in cast_with_options #7043 [arrow]
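
The bits-versus-bytes confusion behind #7121 can be sketched with a small standard-library example. This only illustrates the unit conversion and is not the actual arrow fix; `bits_to_bytes` is a hypothetical helper name.

```rust
/// Number of whole bytes needed to store `bits` bits, rounding up.
fn bits_to_bytes(bits: usize) -> usize {
    (bits + 7) / 8
}

fn main() {
    // A validity bitmap with one bit per row.
    let rows = 1000;
    // Reporting the bit count as a size overstates memory use by about 8x.
    println!("bits:  {}", rows); // bits:  1000
    println!("bytes: {}", bits_to_bytes(rows)); // bytes: 125
    // A partial trailing byte still occupies a full byte.
    println!("bytes: {}", bits_to_bytes(1001)); // bytes: 126
}
```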

Documentation updates:

Merged pull requests:

arrow 54.1.0

29 Jan 13:41
3bf29a2

Changelog

54.1.0 (2025-01-29)

Full Changelog

Implemented enhancements:

  • Create GitHub releases automatically on tagging #7041
  • Add required methods to access inner builder for NullBufferBuilder #7002 [arrow]
  • Re-export NullBufferBuilder in the arrow crate #6975 [arrow]
  • arrow-string function should support binary input as well #6923 [arrow]
  • MMap support for IPC files #6709 [arrow]
  • fix: mark (Large)ListView as nested and support in equal data type #6995 [arrow] (rluvaton)
  • Expose min/max values for Decimal128/256 and improve docs #6992 [arrow] (alamb)
  • [Parquet] Improve speed of dictionary encoding NaN float values #6953 [parquet] (adamreeve)
  • Optimize BooleanBufferBuilder for non nullable columns #6973 [arrow]
  • arrow::compute::concat should merge dictionary type when concatenating list of dictionaries #6888 [arrow]
  • Improve error message for unsupported cast between struct and other types #6724 [arrow]
  • implement regexp_match, regexp_scalar_match and regexp_array_match for StringViewArray #6717 [arrow]
  • Speed up Parquet utf8 validation #6667 [parquet]

Fixed bugs:

  • Regression: Concatenating sliced ListArrays is broken #7034
  • PrimitiveDictionaryBuilder with specific value data type and capacity #7011 [arrow]
  • Arrow IPC Writer Panics for sliced nested arrays #6997 [arrow]
  • RecordBatch with no columns cannot be roundtripped through Parquet #6988 [parquet]
  • StringView: Using the Interleave kernel (and potentially others) results in many repeated buffers in variadic_buffers #6780 [arrow]
  • fix prefetch of page index #6999 [parquet] (adriangb)
  • fix: Parquet column writer Dictionary(_, Decimal128) and Dictionary(_, Decimal256) #6987 [parquet] (korowa)
  • Writing floating point values containing NaN to Parquet is slow when using dictionary encoding #6952 [parquet] [arrow]
  • Public API using private types: Buffer::from_bytes takes unexported Bytes #6754 [parquet] [arrow] [arrow-flight]
  • Some MSRVs are inaccurate #6741 [parquet] [arrow] [arrow-flight]

Documentation updates:

Merged pull requests:

53.4.0

27 Jan 12:08
d3fcb4b

Changelog

53.4.0 (2025-01-14)

Full Changelog

Merged pull requests:

  • fix clippy (#6791) (#6940)
  • fix: decimal conversion loses value on lower precision (#6836) (#6936)
  • perf: Use Cow in get_format_string in FFI_ArrowSchema (#6853) (#6937)
  • fix: Encoding of List offsets was incorrect when slice offsets begin …
  • [arrow-cast] Support cast numeric to string view (alternate) (#6816) (#…
  • Enable matching temporal as from_type to Utf8View (#6872) (#6956)
  • [arrow-cast] Support cast boolean from/to string view (#6822) (#6957)
  • [53.0.0_maintenance] Fix CI (#6964)
  • Add Array::shrink_to_fit(&mut self) to 53.4.0 (#6790) (#6817) (#6962)

Update version to 54.0.0, add CHANGELOG (#6894)

27 Jan 12:10
2887cc1

Changelog

54.0.0 (2024-12-18)

Full Changelog

Breaking changes:

Implemented enhancements:

  • Parquet schema hint doesn't support integer types upcasting #6891 [parquet]
  • Parquet UTF-8 max statistics are overly pessimistic #6867 [parquet]
  • Add builder support for Int8 keys #6844 [arrow]
  • Formalize the name of the nested Field in a list #6784 [parquet] [arrow] [arrow-flight]
  • Allow disabling the writing of Parquet Offset Index #6778 [parquet]
  • parquet::record::make_row is not exposed to users, leaving no option to users to manually create Row objects #6761 [parquet]
  • Avoid from_num_days_from_ce_opt calls in timestamp_s_to_datetime if we don't need #6746 [arrow]
  • Support Temporal -> Utf8View casting #6734 [arrow]
  • Add Option To Coerce List Type on Parquet Write #6733 [parquet] [arrow]
  • Support Numeric -> Utf8View casting #6714 [arrow]
  • Support Utf8View <=> boolean casting #6713 [arrow]

Fixed bugs:

  • Buffer::bit_slice loses length with byte-aligned offsets #6895 [arrow]
  • parquet arrow writer doesn't track memory size correctly for fixed sized lists #6839 [parquet]
  • Casting Decimal128 to Decimal128 with smaller precision produces incorrect results in some cases #6833 [arrow]
  • Should empty nullable dictionary be parsed as null from arrow-csv? #6821 [arrow]
  • Array take doesn't make fields nullable #6809
  • Arrow Flight Encodes a Slice's List Offsets If the slice offset starts with zero #6803 [arrow]
  • Parquet readers incorrectly interpret legacy nested lists #6756 [parquet]
  • filter_bits under-allocates resulting boolean buffer #6750 [arrow]
  • Multi-language support issues with Arrow FlightSQL client's execute_update and execute_ingest methods #6545 [arrow] [arrow-flight]

Documentation updates:

Closed issues:

Merged pull requests:

Prepare for 53.3.0 release (#6739)

27 Jan 12:09
f5b51ff

Changelog

53.3.0 (2024-11-17)

Full Changelog

Implemented enhancements:

  • PartialEq of GenericByteViewArray (StringViewArray / ByteViewArray) that compares on equality rather than logical value #6679 [arrow]
  • Need a mechanism to handle schema changes due to dictionary hydration in FlightSQL server implementations #6672 [arrow] [arrow-flight]
  • Support encoding Utf8View columns to JSON #6642 [arrow]
  • Implement append_n for BooleanBuilder #6634 [arrow]
  • Some take optimizations #6621 [arrow]
  • Error Instead of Panic On Attempting to Write More Than 32769 Row Groups #6591 [parquet]
  • Make casting from a timestamp without timezone to a timestamp with timezone configurable #6555
  • Add record_batch! macro for easy record batch creation #6553 [arrow]
  • Support Binary --> Utf8View casting #6531 [arrow]
  • downcast_primitive_array and downcast_dictionary_array are not hygienic wrt imports #6400 [arrow]
  • Implement interleave_record_batch #6731 [arrow] (waynexia)
  • feat: record_batch! macro #6588 [arrow] (ByteBaker)

Fixed bugs:

  • Signed decimal e-notation parsing bug #6728 [arrow]
  • Add support for Utf8View -> numeric in can_cast_types #6715
  • IPC file writer produces incorrect footer when not preserving dict ID #6710 [arrow]
  • parquet from_thrift_helper incorrectly checks index #6693 [parquet]
  • Primitive REPEATED fields not contained in LIST annotated groups aren't read as lists by record reader #6648 [parquet]
  • DictionaryHandling does not recurse into Map fields #6644 [arrow] [arrow-flight]
  • Array writer output empty when no record is written #6613 [arrow]
  • Archery Integration Test with c# failing on main #6577 [arrow]
  • Potential unsoundness in filter_run_end_array #6569 [arrow]
  • Parquet reader can generate incorrect validity buffer information for nested structures #6510 [parquet]
  • arrow-array ffi: FFI_ArrowArray.null_count is always interpreted as unsigned and initialized during conversion from C to Rust. #6497 [arrow]

Documentation updates:

Performance improvements:

Closed issues:

  • Incorrect like results for pattern starting/ending with % percent and containing escape characters #6702 [arrow]

Merged pull requests:
