Releases: apache/arrow-rs
arrow 55.0.0
Changelog
55.0.0 (2025-04-08)
Breaking changes:
- Change Parquet API interaction to use `u64` (support files larger than 4GB in WASM) #7371 [parquet] (kylebarron)
- Remove `AsyncFileReader::get_metadata_with_options`, add `options` to `AsyncFileReader::get_metadata` #7342 [parquet] (corwinjoy)
- Parquet: Support reading Parquet metadata via suffix range requests #7334 [parquet] (kylebarron)
- Upgrade to `object_store` to `0.12.0` #7328 [parquet] (mbrobbel)
- Upgrade `pyo3` to `0.24` #7324 [arrow] (mbrobbel)
- Reapply Box `FlightErrror::tonic` to reduce size (fixes nightly clippy) #7277 [arrow] [arrow-flight] (alamb)
- Improve parquet gzip compression performance using zlib-rs #7200 [parquet] (psvri)
- Fix: `date_part` to extract only the requested part (not the overall interval) #7189 [arrow] (delamarch3)
- chore: upgrade flatbuffer version to `25.2.10` #7134 [arrow] (tisonkun)
- Add hooks to json encoder to override default encoding or add support for unsupported types #7015 [arrow] (adriangb)
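The `u64` change in #7371 matters because `usize` is only 32 bits wide on wasm32, so byte offsets past 4 GiB cannot be represented there. The following is a minimal standard-library sketch of that limitation, not code from the parquet crate: the helper name `offset_as_wasm32_usize` is hypothetical, and `u32` stands in for a 32-bit `usize`.

```rust
/// Hypothetical helper (not part of arrow-rs): `u32` stands in for `usize`
/// on a 32-bit target such as wasm32. Returns `None` when the offset does
/// not fit in 32 bits.
fn offset_as_wasm32_usize(offset: u64) -> Option<u32> {
    u32::try_from(offset).ok()
}

fn main() {
    let small: u64 = 1024;
    let five_gib: u64 = 5 * 1024 * 1024 * 1024;

    // A small offset fits in 32 bits.
    assert_eq!(offset_as_wasm32_usize(small), Some(1024));

    // An offset past 4 GiB cannot be expressed as a 32-bit `usize`,
    // which is why a `u64`-based API is needed for large files.
    assert_eq!(offset_as_wasm32_usize(five_gib), None);
}
```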
Implemented enhancements:
- Improve the performance of `concat` #7357 [arrow]
- Pushdown predictions to Parquet in-memory row group fetches #7348 [parquet]
- Improve CSV parsing errors: Print the row that makes csv parsing fails #7344 [arrow]
- Support ColumnMetaData `encoding_stats` in Parquet Writing #7341 [parquet]
- Support writing Parquet with modular encryption #7327 [parquet]
- Parquet Use U64 Instead of Usize (wasm support for files greater than 4GB) #7238 [parquet]
- Support different TimeUnits and timezones when reading Timestamps from INT96 #7220 [parquet]
Fixed bugs:
- New clippy failures in code base with release of rustc 1.86 #7381 [parquet] [arrow]
- Fix bug in `ParquetMetaDataReader` and add test of suffix metadata reads with encryption #7372 [parquet] (etseidl)
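For background on the suffix metadata reads mentioned above: a Parquet file ends with a 4-byte little-endian metadata length followed by the 4-byte magic `PAR1`, so a reader can fetch only the last N bytes (a suffix range request) and inspect that footer without knowing the file size. The sketch below illustrates the idea using only the standard library. It is not the `ParquetMetaDataReader` implementation, the function names are hypothetical, and it only handles the plain `PAR1` magic.

```rust
const FOOTER_LEN: usize = 8;
const MAGIC: &[u8] = b"PAR1";

/// Hypothetical helper: reads the metadata length from the trailing 8 bytes
/// of `suffix` (the last bytes of a Parquet file). Returns `None` if the
/// suffix is too short or does not end with the `PAR1` magic.
fn metadata_len_from_suffix(suffix: &[u8]) -> Option<u64> {
    if suffix.len() < FOOTER_LEN {
        return None;
    }
    let footer = &suffix[suffix.len() - FOOTER_LEN..];
    if &footer[4..] != MAGIC {
        return None;
    }
    let len = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]);
    Some(u64::from(len))
}

/// Hypothetical helper: `Some(true)` when the fetched suffix already contains
/// the whole metadata block, so no second range request would be needed.
fn suffix_covers_metadata(suffix: &[u8]) -> Option<bool> {
    let len = metadata_len_from_suffix(suffix)?;
    Some(suffix.len() as u64 >= len + FOOTER_LEN as u64)
}

fn main() {
    // Fake file tail: 10 bytes of "metadata", then the length and the magic.
    let mut tail = vec![0u8; 10];
    tail.extend_from_slice(&10u32.to_le_bytes());
    tail.extend_from_slice(b"PAR1");

    assert_eq!(metadata_len_from_suffix(&tail), Some(10));
    assert_eq!(suffix_covers_metadata(&tail), Some(true));

    // Fetching only the 8-byte footer is not enough to hold the metadata.
    let footer_only = &tail[tail.len() - FOOTER_LEN..];
    assert_eq!(suffix_covers_metadata(footer_only), Some(false));
}
```

When the suffix turns out to be too short, a reader would issue a second, larger suffix request sized from the decoded length.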
Documentation updates:
- Improve documentation on `ArrayData::offset` #7385 [arrow] (alamb)
- Improve documentation for `AsyncFileReader::get_metadata` #7380 [parquet] (alamb)
- Improve documentation on implementing Parquet predicate pushdown #7370 [parquet] (alamb)
- Add documentation and examples for pretty printing, make `pretty_format_columns_with_options` pub #7346 [arrow] (alamb)
- Improve documentation on writing parquet, including multiple threads #7321 [parquet] (alamb)
Merged pull requests:
- chore: apply clippy suggestions newly introduced in rust 1.86 #7382 [parquet] [arrow] (westonpace)
- bench: add more {boolean, string, int} benchmarks for concat kernel #7376 [arrow] (rluvaton)
- Add more examples of using Parquet encryption #7374 [parquet] (adamreeve)
- Clean up `ArrowReaderMetadata::load_async` #7369 [parquet] (etseidl)
- bump pyo3 for RUSTSEC-2025-0020 #7368 [arrow] (onursatici)
- Test int96 Parquet file from Spark #7367 [parquet] (mbutrovich)
- fix: respect offset/length when converting ArrayData to StructArray #7366 [arrow] (westonpace)
- Print row, data present, expected type, and row number in error messages for arrow-csv #7361 [arrow] (psiayn)
- Use rust builtins for round_upto_multiple_of_64 and ceil #7358 [arrow] (psvri)
- Write parquet PageEncodingStats #7354 [parquet] (jhorstmann)
- Move `sysinfo` to `dev-dependencies` #7353 [parquet] (mbrobbel)
- chore(deps): update sysinfo requirement from 0.33.0 to 0.34.0 #7352 [parquet] (dependabot[bot])
- Add additional benchmarks for utf8view comparison kernels #7351 [arrow] (zhuqi-lucas)
- Upgrade to twox-hash 2.0 #7347 [parquet] (alamb)
- refactor: apply borrowed chunk reader to Sbbf::read_from_column_chunk #7345 [parquet] (ethe)
- Merge changelog and version from 54.3.1 into main #7340 [parquet] [arrow] (timsaucer)
- Remove `object-store` label from `.asf.yaml` #7339 (mbrobbel)
...
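As an illustration of the "Use rust builtins for round_upto_multiple_of_64 and ceil" entry (#7358) in the list above, the standard library has integer methods for both operations. This is a sketch of what such builtins look like, assuming Rust 1.73 or newer; it is not the code from the pull request, and the wrapper names are hypothetical.

```rust
/// Rounds `n` up to the nearest multiple of 64 using the standard library.
fn round_up_to_multiple_of_64(n: usize) -> usize {
    n.next_multiple_of(64)
}

/// Ceiling division: the smallest `q` such that `q * divisor >= value`.
/// Panics if `divisor` is zero, like ordinary integer division.
fn ceil_div(value: usize, divisor: usize) -> usize {
    value.div_ceil(divisor)
}

fn main() {
    assert_eq!(round_up_to_multiple_of_64(0), 0);
    assert_eq!(round_up_to_multiple_of_64(1), 64);
    assert_eq!(round_up_to_multiple_of_64(64), 64);
    assert_eq!(round_up_to_multiple_of_64(65), 128);

    assert_eq!(ceil_div(10, 3), 4);
    assert_eq!(ceil_div(9, 3), 3);
}
```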
arrow 54.3.1
Changelog
54.3.1 (2025-03-26)
Fixed bugs:
- Round trip encoding of list of fixed list fails when offset is not zero #7315
Merged pull requests:
- Add missing type annotation #7326 [parquet] (mbrobbel)
- bugfix: correct offsets when serializing a list of fixed sized list and non-zero start offset #7318 [arrow] (timsaucer)
* This Changelog was automatically generated by github_changelog_generator
arrow 54.3.0
Changelog
54.3.0 (2025-03-17)
Implemented enhancements:
- Using column chunk offset index in `InMemoryRowGroup::fetch` #7300
- Support reading parquet with modular encryption #7296 [parquet]
- Add example for how to read/write encrypted parquet files #7281 [parquet]
- Have writer return parsed `ParquetMetadata` #7254 [parquet]
- feat: Support Utf8View in JSON reader #7244 [arrow]
- StructBuilder should provide a way to get a &dyn ArrayBuilder of a field builder #7193 [arrow]
- Support div_wrapping/rem_wrapping for numeric arithmetic kernels #7158 [arrow]
- Improve RleDecoder performance #7195 [parquet] (Dandandan)
- Improve arrow-json deserialization performance by 30% #7157 [arrow] (mwylde)
- Add `with_skip_validation` flag to IPC `StreamReader`, `FileReader` and `FileDecoder` #7120 [arrow] (alamb)
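To show what "wrapping" means for the div_wrapping/rem_wrapping entry (#7158) above, the sketch below uses the primitive `i32` methods from the standard library. It illustrates scalar semantics only. The arrow kernels' exact behavior (for example on division by zero) is not shown here, and the helper name `div_rem_wrapping` is hypothetical.

```rust
/// Hypothetical helper: wrapping quotient and remainder of two `i32` values.
/// Like the primitive methods it calls, it panics when `b` is zero.
fn div_rem_wrapping(a: i32, b: i32) -> (i32, i32) {
    (a.wrapping_div(b), a.wrapping_rem(b))
}

fn main() {
    // `i32::MIN / -1` does not fit in an `i32`, so the checked forms fail.
    assert_eq!(i32::MIN.checked_div(-1), None);
    assert_eq!(i32::MIN.checked_rem(-1), None);

    // The wrapping forms return a defined result instead of overflowing.
    assert_eq!(div_rem_wrapping(i32::MIN, -1), (i32::MIN, 0));

    // Ordinary inputs behave like normal truncating division.
    assert_eq!(div_rem_wrapping(7, 2), (3, 1));
    assert_eq!(div_rem_wrapping(-7, 2), (-3, -1));
}
```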
Fixed bugs:
- Archery integration CI test is failing on main: error: package `half v2.5.0` cannot be built because it requires rustc 1.81 or newer, while the currently active rustc version is 1.77.2 #7291
- MSRV CI check is failing on main #7289
- Incorrect IPC schema encoding for multiple dictionaries #7058 [arrow] [arrow-flight]
Documentation updates:
- Add example for how to read encrypted parquet files #7283 [parquet] (rok)
- Update the relative path of the test data in docs #7221 (Ziy1-Tan)
- Minor: fix doc and remove unused code #7194 [arrow] (lewiszlw)
- doc: modify wrong comment #7190 [arrow] (YichiZhang0613)
- doc: fix IPC file reader/writer docs #7178 [arrow] (Jefffrey)
Merged pull requests:
- chore: require ffi feature in arrow-schema benchmark #7298 [arrow] (ethe)
- Fix archery integration test #7292 (alamb)
- Minor: run `test_decimal_list` again #7282 [parquet] (alamb)
- Move Parquet encryption tests into the arrow_reader integration tests #7279 [parquet] (adamreeve)
- Include license and notice files in published crates, part 2 #7275 [arrow] (ankane)
- feat: Support Utf8View in JSON reader #7263 [arrow] (zhuqi-lucas)
- feat: use `force_validate` feature flag when creating an arrays #7241 [arrow] (rluvaton)
- fix: take on empty struct array returns empty array #7224 [arrow] (westonpace)
- fix: correct `bloom_filter_position` description #7223 [parquet] (romanz)
- Minor: Move `make_builder` into mod.rs #7218 (lewiszlw)
- Expose `field_builders` in `StructBuilder` #7217 [arrow] (lewiszlw)
- Minor: Fix json StructMode docs links #7215 [arrow] (gstvg)
- [main] Bump arrow version to 54.2.1 (#7207) #7212 (alamb)
- feat: add `downcast_integer_array` macro helper #7211 [arrow] (rluvaton)
- Remove zstd pin #7199 [parquet] (tustvold)
- fix: Use chrono's quarter() to avoid conflict #7198 [arrow] (yutannihilation)
- Fix some Clippy 1.85 warnings #7167 [parquet] [arrow] (mbrobbel)
- feat: add to concat different data types error message the data types #7166 [arrow] (rluvaton)
- Add Week ISO, Year ISO computation #7163 [arrow] (kosiew)
- fix: create_random_batch fails with timestamp types having a timezone #7162 [arrow] (niebayes)
- Avoid overflow of remainder #7159 [arrow] (wForget)
- fix: Data type inference for NaN, inf and -inf in csv files #7150 [arrow] (Mottl)
- Preserve null dictionary values in `interleave` and `concat` kernels #7144 [arrow] (kawadakk)
- Support casting `Date` to a time zone-specific timestamp #7141 [arrow] (friendlymatthew)
- Minor: Add doctest to ArrayDataBuilder::build_unchecked #7139 [arrow] (gstvg)
- arrow-ord: add support for nested types to `partition` #7131 [arrow] (asubiotto)
- Update prost-build requirement from =0.13.4 to =0.13.5 #7127 [arrow] [arrow-flight] (dependabot[bot])
- Avoid use of `flatbuffers...
object_store 0.12.0
Changelog
object_store_0.12.0 (2025-03-05)
Breaking changes:
- feat: add `Extensions` to object store `PutMultipartOpts` #7214 [object-store] (crepererum)
- feat: add `Extensions` to object store `PutOptions` #7213 [object-store] (crepererum)
- chore: enable conditional put by default for S3 #7181 [object-store] (meteorgan)
- feat: add `Extensions` to object store `GetOptions` #7170 [object-store] (crepererum)
- feat(object_store): Override DNS Resolution to Randomize IP Selection #7123 [object-store] (crepererum)
- Use `u64` range instead of `usize`, for better wasm32 support #6961 [object-store] (XiangpengHao)
- object_store: Add enabled-by-default "fs" feature #6636 [object-store] (Turbo87)
- Return `BoxStream` with `'static` lifetime from `ObjectStore::list` #6619 [object-store] (kylebarron)
- object_store: Migrate from snafu to thiserror #6266 [object-store] (Turbo87)
Implemented enhancements:
- Object Store: S3 IP address selection is biased #7117 [object-store]
- object_store: GCSObjectStore should derive Clone #7113 [object-store]
- Remove all RCs after release #7059 [object-store]
- LocalFileSystem::list_with_offset is very slow over network file system #7018 [object-store]
- Release object store `0.11.2` (non API breaking) Around Dec 15 2024 #6902 [object-store]
Fixed bugs:
- LocalFileSystem errors with satisfiable range request #6749 [object-store]
Merged pull requests:
- ObjectStore WASM32 Support #7226 [object-store] (tustvold)
- [main] Bump arrow version to 54.2.1 (#7207) #7212 (alamb)
- Decouple ObjectStore from Reqwest #7183 [object-store] (tustvold)
- object_store: Disable all compression formats in HTTP reqwest client #7143 [object-store] (kylewlacy)
- refactor: remove unused `async` from `InMemory::entry` #7133 [object-store] (crepererum)
- object_store/gcp: derive Clone for GoogleCloudStorage #7112 [object-store] (james-rms)
- Update version to 54.2.0 and add CHANGELOG #7110 (alamb)
- Remove all RCs after release #7060 [object-store] (kou)
- Update release schedule README.md #7053 (alamb)
- Create GitHub releases automatically on tagging #7042 (kou)
- Change Log On Succesful S3 Copy / Multipart Upload to Debug #7033 [object-store] (diptanu)
- Prepare for `54.1.0` release #7031 (alamb)
- Add a custom implementation `LocalFileSystem::list_with_offset` #7019 [object-store] (corwinjoy)
- Improve docs for `AmazonS3Builder::from_env` #6977 [object-store] (kylebarron)
- Fix WASM CI for Rust 1.84 release #6963 (alamb)
- Update itertools requirement from 0.13.0 to 0.14.0 in /object_store #6925 [object-store] (dependabot[bot])
- Fix LocalFileSystem with range request that ends beyond end of file #6751 [object-store] (kylebarron)
arrow 54.2.1
Changelog
54.2.1 (2025-02-27)
Fixed bugs:
- Use chrono >= 0.4.34, < 0.4.40 to avoid breaking #7210
arrow 54.2.0
Changelog
54.2.0 (2025-02-12)
Implemented enhancements:
- Casting from Utf8View to Dict(k, Utf8View) #7114
- Support creating map arrays with key metadata #7100 [arrow]
- [parquet] Print Parquet BasicTypeInfo id when present #7081 [parquet]
- Add arrow-ipc benchmarks for the IPC reader and writer #6968 [arrow]
Fixed bugs:
- NullBufferBuilder::allocated_size Returns Size in Bits #7121 [arrow]
- [Regression in 54.0.0]. Decimal cast to smaller precision gives invalid (off-by-one) result in some cases #7069 [arrow]
- Minor: Fix deprecated note to point to the correct const #7067 [arrow]
- incorrect error message for reading definition levels #7056 [parquet]
- First None in ListArray panics in `cast_with_options` #7043 [arrow]
Documentation updates:
- Minor: Clarify documentation on `NullBufferBuilder::allocated_size` #7089 [arrow] (alamb)
- Minor: Update release schedule #7086 (alamb)
- Improve `ListArray` documentation for slices #7039 [arrow] (alamb)
Merged pull requests:
- fix: NullBufferBuilder::allocated_size should return Size in Bytes #7122 [arrow] (shuozel)
- minor: fix deprecated_note #7105 [arrow] (Chen-Yuan-Lai)
- Minor: Fix ArrayDataBuilder::build_unchecked docs #7103 [arrow] (gstvg)
- Support setting key field in MapBuilder #7101 [arrow] (rshkv)
- Add tests that arrow IPC data is validated #7096 [arrow] (alamb)
- Print Parquet BasicTypeInfo id when present #7094 [parquet] (devinrsmith)
- Expose record boundary information in JSON decoder #7092 [arrow] (scovich)
- Benchmarks for Arrow IPC reader #7091 [arrow] (alamb)
- Benchmarks for Arrow IPC writer #7090 [arrow] (alamb)
- Add another decimal cast edge test case #7078 [arrow] (findepi)
- minor: re-export `OffsetBufferBuilder` in `arrow` crate #7077 [arrow] (alamb)
- Support converting large dates (i.e. +10999-12-31) from string to Date32 #7074 [arrow] (phillipleblanc)
- fix: issue introduced in #6833 - less than equal check for scale in decimal conversion #7070 [arrow] (himadripal)
- perf: inline `from_iter` for `ScalarBuffer` #7066 [arrow] (0ax1)
- fix: first none/empty list in `ListArray` panics in `cast_with_options` #7065 [arrow] (irenjj)
- Minor: add ticket reference for todo #7064 [parquet] (alamb)
- Refactor some decimal-related code and tests #7062 [arrow] (CurtHagenlocher)
- fix error message for reading definition levels #7057 [parquet] (jp0317)
- Update release schedule README.md #7053 (alamb)
- Support both 0x01 and 0x02 as type for list of booleans in thrift metadata #7052 [parquet] (jhorstmann)
- Refactor arrow-ipc: Move `create_*_array` methods into `RecordBatchDecoder` #7029 [arrow] (alamb)
- Refactor arrow-ipc: Rename `ArrayReader` to `RecodeBatchDecoder` #7028 [arrow] (alamb)
- Introduce `UnsafeFlag` to manage disabling `ArrayData` validation #7027 [arrow] (alamb)
arrow 54.1.0
Changelog
54.1.0 (2025-01-29)
Implemented enhancements:
- Create GitHub releases automatically on tagging #7041
- Add required methods to access inner builder for `NullBufferBuilder` #7002 [arrow]
- Re-export `NullBufferBuilder` in the arrow crate #6975 [arrow]
- `arrow-string` function should support binary input as well #6923 [arrow]
- MMap support for IPC files #6709 [arrow]
- fix: mark (Large)ListView as nested and support in equal data type #6995 [arrow] (rluvaton)
- Expose min/max values for Decimal128/256 and improve docs #6992 [arrow] (alamb)
- [Parquet] Improve speed of dictionary encoding NaN float values #6953 [parquet] (adamreeve)
- Optimize `BooleanBufferBuilder` for non nullable columns #6973 [arrow]
- `arrow::compute::concat` should merge dictionary type when concatenating list of dictionaries #6888 [arrow]
- Improve error message for unsupported cast between struct and other types #6724 [arrow]
- implement regexp_match, regexp_scalar_match and regexp_array_match for StringViewArray #6717 [arrow]
- Speed up Parquet utf8 validation #6667 [parquet]
Fixed bugs:
- Regression: Concatenating sliced `ListArray`s is broken #7034
- `PrimitiveDictionaryBuilder` with specific value data type and capacity #7011 [arrow]
- Arrow IPC Writer Panics for sliced nested arrays #6997 [arrow]
- RecordBatch with no columns cannot be roundtripped through Parquet #6988 [parquet]
- StringView: Using the Interleave kernel (and potentially others) results in many repeated buffers in variadic_buffers #6780 [arrow]
- fix prefetch of page index #6999 [parquet] (adriangb)
- fix: Parquet column writer `Dictionary(_, Decimal128)` and `Dictionary(_, Decimal256)` #6987 [parquet] (korowa)
- Writing floating point values containing NaN to Parquet is slow when using dictionary encoding #6952 [parquet] [arrow]
- Public API using private types: `Buffer::from_bytes` takes unexported `Bytes` #6754 [parquet] [arrow] [arrow-flight]
- Some MSRVs are inaccurate #6741 [parquet] [arrow] [arrow-flight]
Documentation updates:
- docs: add to bit slice iterator docs that the start value is inclusive and end value is exclusive #7022 [arrow] (rluvaton)
- Fix duplicate link references in README #7020 (Jefffrey)
- Enhance ListViewArray related docs #7007 [arrow] (Jefffrey)
- Document data type support and examples to predicates `*like`, `starts_with`, `ends_with`, `contains` #7003 [arrow] (alamb)
- Minor: improve documentation on timezone representations #7000 [arrow] (alamb)
- Add additional documentation for UTC representation of timestamps #6994 [arrow] (Abdullahsab3)
- Improve `ParquetRecordBatchStreamBuilder` docs / examples #6948 [parquet] (alamb)
- Document the `ParquetRecordBatchStream` buffering #6947 [parquet] (alamb)
- Minor: improve `zip` kernel docs, add examples #6928 [arrow] (alamb)
- Add doctest example for `Buffer::from_bytes` #6920 [arrow] (kylebarron)
- [object store] Add planned object_store release schedule to crate readme #6904 (alamb)
- Avoid panics? #6737 [parquet]
Merged pull requests:
- Create GitHub releases automatically on tagging #7042 (kou)
- Fix `concat` for sliced `ListArrays` #7037 [arrow] (alamb)
- Minor: Clarify NullBufferBuilder::new capacity parameter #7016 [arrow] (alamb)
- Add `is_valid` and `truncate` methods to `NullBufferBuilder` #7013 [arrow] (Chen-Yuan-Lai)
- fix: use the values builder capacity for the hash map in `PrimitiveDictionaryBuilder::new_from_builders` #7012 [arrow] (rluvaton)
- Refactor ipc reading code into methods on `ArrayReader` #7006 [arrow] (alamb)
- Minor: make it clear Predicate is crate private #7001 [arrow] (alamb)
- fix: Panic on reencoding offsets in arrow-ipc with sliced nested arrays #6998 [arrow] (HawaiianSpork)
- Add check for empty schema in `parquet::schema::types::from_thrift_helper` #6990 [parquet] ([etseidl](https://github.com/...
53.4.0
Changelog
53.4.0 (2025-01-14)
Merged pull requests:
- fix clippy (#6791) (#6940)
- fix: decimal conversion looses value on lower precision (#6836) (#6936)
- perf: Use Cow in get_format_string in FFI_ArrowSchema (#6853) (#6937)
- fix: Encoding of List offsets was incorrect when slice offsets begin …
- [arrow-cast] Support cast numeric to string view (alternate) (#6816) (#…
- Enable matching temporal as from_type to Utf8View (#6872) (#6956)
- [arrow-cast] Support cast boolean from/to string view (#6822) (#6957)
- [53.0.0_maintenance] Fix CI (#6964)
- Add Array::shrink_to_fit(&mut self) to 53.4.0 (#6790) (#6817) (#6962)
Update version to 54.0.0, add CHANGELOG (#6894)
Changelog
54.0.0 (2024-12-18)
Breaking changes:
- avoid redundant parsing of repeated value in RleDecoder #6834 [parquet] (jp0317)
- Handling nullable DictionaryArray in CSV parser #6830 [arrow] (edmondop)
- fix(flightsql): remove Any encoding of DoPutUpdateResult #6825 [arrow] [arrow-flight] (davisp)
- arrow-ipc: Default to not preserving dict IDs #6788 [arrow] (brancz)
- Remove some very old deprecated functions #6774 [parquet] [arrow] (alamb)
- update to pyo3 0.23.0 #6745 [arrow] (psvri)
- Remove APIs deprecated since v 4.4.0 #6722 [arrow] [arrow-flight] (findepi)
- Return `None` when Parquet page indexes are not present in file #6639 [parquet] (etseidl)
- Add `ParquetError::NeedMoreData` mark `ParquetError` as `non_exhaustive` #6630 [parquet] (etseidl)
- Remove APIs deprecated since v 2.0.0 #6609 [arrow] (findepi)
Implemented enhancements:
- Parquet schema hint doesn't support integer types upcasting #6891 [parquet]
- Parquet UTF-8 max statistics are overly pessimistic #6867 [parquet]
- Add builder support for Int8 keys #6844 [arrow]
- Formalize the name of the nested `Field` in a list #6784 [parquet] [arrow] [arrow-flight]
- Allow disabling the writing of Parquet Offset Index #6778 [parquet]
- `parquet::record::make_row` is not exposed to users, leaving no option to users to manually create `Row` objects #6761 [parquet]
- Avoid `from_num_days_from_ce_opt` calls in `timestamp_s_to_datetime` if we don't need #6746 [arrow]
- Support Temporal -> Utf8View casting #6734 [arrow]
- Add Option To Coerce List Type on Parquet Write #6733 [parquet] [arrow]
- Support Numeric -> Utf8View casting #6714 [arrow]
- Support Utf8View <=> boolean casting #6713 [arrow]
Fixed bugs:
- `Buffer::bit_slice` loses length with byte-aligned offsets #6895 [arrow]
- parquet arrow writer doesn't track memory size correctly for fixed sized lists #6839 [parquet]
- Casting Decimal128 to Decimal128 with smaller precision produces incorrect results in some cases #6833 [arrow]
- Should empty nullable dictionary be parsed as null from arrow-csv? #6821 [arrow]
- Array take doesn't make fields nullable #6809
- Arrow Flight Encodes a Slice's List Offsets If the slice offset is starts with zero #6803 [arrow]
- Parquet readers incorrectly interpret legacy nested lists #6756 [parquet]
- filter_bits under-allocates resulting boolean buffer #6750 [arrow]
- Multi-language support issues with Arrow FlightSQL client's execute_update and execute_ingest methods #6545 [arrow] [arrow-flight]
Documentation updates:
- Should we document at what rate deprecated APIs are removed? #6851 [parquet] [arrow]
- Fix docstring for `Format::with_header` in `arrow-csv` #6856 [arrow] (kylebarron)
- Add deprecation / API removal policy #6852 [parquet] [arrow] (alamb)
- Minor: add example for creating `SchemaDescriptor` #6841 [parquet] (alamb)
- chore: enrich panic context when BooleanBuffer fails to create #6810 [arrow] (tisonkun)
Closed issues:
- [FlightSQL] GetCatalogsBuilder does not sort the catalog names #6807 [arrow] [arrow-flight]
- Add a lint to automatically check for unused dependencies #6796 [arrow] [arrow-flight]
Merged pull requests:
- doc: add comment for timezone string #6899 [arrow] (xxchan)
- docs: fix typo #6890 [arrow] (rluvaton)
- Minor: Fix deprecation notice for `arrow_to_parquet_schema` #6889 [parquet] (etseidl)
- Add Field::with_dict_is_ordered #6885 [arrow] (alamb)
- Deprecate "max statistics size" property in `WriterProperties` #6884 [parquet] (etseidl)
- Add deprecation warnings for everything related to `dict_id` #6873 [[parquet](https://github.com...
Prepare for 53.3.0 release (#6739)
Changelog
53.3.0 (2024-11-17)
Implemented enhancements:
- `PartialEq` of GenericByteViewArray (StringViewArray / ByteViewArray) that compares on equality rather than logical value #6679 [arrow]
- Need a mechanism to handle schema changes due to dictionary hydration in FlightSQL server implementations #6672 [arrow] [arrow-flight]
- Support encoding Utf8View columns to JSON #6642 [arrow]
- Implement `append_n` for `BooleanBuilder` #6634 [arrow]
- Some take optimizations #6621 [arrow]
- Error Instead of Panic On Attempting to Write More Than 32769 Row Groups #6591 [parquet]
- Make casting from a timestamp without timezone to a timestamp with timezone configurable #6555
- Add `record_batch!` macro for easy record batch creation #6553 [arrow]
- Support `Binary` --> `Utf8View` casting #6531 [arrow]
- `downcast_primitive_array` and `downcast_dictionary_array` are not hygienic wrt imports #6400 [arrow]
- Implement interleave_record_batch #6731 [arrow] (waynexia)
- feat: `record_batch!` macro #6588 [arrow] (ByteBaker)
Fixed bugs:
- Signed decimal e-notation parsing bug #6728 [arrow]
- Add support for Utf8View -> numeric in can_cast_types #6715
- IPC file writer produces incorrect footer when not preserving dict ID #6710 [arrow]
- parquet from_thrift_helper incorrectly checks index #6693 [parquet]
- Primitive REPEATED fields not contained in LIST annotated groups aren't read as lists by record reader #6648 [parquet]
- DictionaryHandling does not recurse into Map fields #6644 [arrow] [arrow-flight]
- Array writer output empty when no record is written #6613 [arrow]
- Archery Integration Test with c# failing on main #6577 [arrow]
- Potential unsoundness in `filter_run_end_array` #6569 [arrow]
- Parquet reader can generate incorrect validity buffer information for nested structures #6510 [parquet]
- arrow-array ffi: FFI_ArrowArray.null_count is always interpreted as unsigned and initialized during conversion from C to Rust. #6497 [arrow]
Documentation updates:
- Minor: Document pattern for accessing views in StringView #6673 [arrow] (alamb)
- Improve Array::is_nullable documentation #6615 [arrow] (findepi)
- Minor: improve docs for ByteViewArray->ByteArray From impl #6610 [arrow] (alamb)
Performance improvements:
Closed issues:
- Incorrect like results for pattern starting/ending with `%` percent and containing escape characters #6702 [arrow]
Merged pull requests:
- Fix signed decimal e-notation parsing #6729 [arrow] (gruuya)
- Clean up some arrow-flight tests and duplicated code #6725 [arrow] [arrow-flight] (itsjunetime)
- Update PR template section about API breaking changes #6723 (findepi)
- Support for casting `StringViewArray` to `DecimalArray` #6720 [arrow] (tlm365)
- File writer preserve dict bug #6711 [arrow] (brancz)
- Add filter_kernel benchmark for run array #6706 [arrow] (delamarch3)
- Fix string view ILIKE checks with NULL values #6705 [arrow] (findepi)
- Implement logical_null_count for more array types #6704 [arrow] (findepi)
- Fix LIKE with escapes #6703 [arrow] (findepi)
- Speed up `filter_bytes` #6699 [arrow] (Dandandan)
- Minor: fix misleading comment in byte view #6695 [arrow] (jayzhan211)
- minor fix on checking index #6694 [parquet] (jp0317)
- Undo run end filter performance regression #6691 [arrow] (delamarch3)
- Reimplement `PartialEq` of `GenericByteViewArray` compares by logical value #6689 [arrow] (tlm365)
- feat: expose known_schema from FlightDataEncoder #6688 [arrow] [arrow-flight] (nathanielc)
- Update hashbrown requirement from 0.14.2 to 0.15.1 #6684 [parquet] [arrow] (dependabot[bot])
- Support Duration in JSON Reader #6683 [[arrow](https://github.com...