Releases: aloneguid/parquet-dotnet
5.0.1
New feature
You can deserialise "required" lists and "required" list elements, as raised by @akaloshych84 in #502. See nullability and lists.
Improvements
- Better error reporting in case class serializer has mismatched definition and repetition levels (as per #502).
- Pass property attributes down to list data field, by @agaskill in #559.
Bug fixed
- Compression/decompression would fail on some platforms like x86 or Linux x86 with musl runtime.
Floor
- boolean columns display as checks.
- Structs display as expandable objects, with properly aligned keys.
5.0.0
Support Parquet.Net
If you find the project helpful, you can support Parquet.Net by starring it.
Breaking changes
- This is the first version without the old Table/Row API, which has now been removed completely. This API had been one of the major headaches and sources of bugs since it was introduced in the very first version of this library. If you need similar functionality, consider the untyped serializer, which should be stable enough (the Floor utility has relied on it exclusively for quite some time).
- ParquetSerializer's SerializeAsync was accepting ParquetSerializerOptions but DeserializeAsync was accepting ParquetOptions. This is now aligned for consistency, so they both use ParquetSerializerOptions.
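As a rough illustration of the aligned signatures, the sketch below passes a single ParquetSerializerOptions instance to both SerializeAsync and DeserializeAsync. The Reading class and the in-memory stream are hypothetical, and the PropertyNameCaseInsensitive property name is an assumption based on the case-insensitivity feature listed for this release; check the serializer documentation for the exact option name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Parquet.Serialization;

// Hypothetical class used only for this illustration.
public class Reading
{
    public int Id { get; set; }
    public string? Sensor { get; set; }
}

public static class Program
{
    public static async Task Main()
    {
        var data = new List<Reading>
        {
            new Reading { Id = 1, Sensor = "a" },
            new Reading { Id = 2, Sensor = "b" },
        };

        // One options object shared by both directions.
        // PropertyNameCaseInsensitive is an assumed property name.
        var options = new ParquetSerializerOptions
        {
            PropertyNameCaseInsensitive = true
        };

        using var stream = new MemoryStream();
        await ParquetSerializer.SerializeAsync(data, stream, options);

        // Rewind before reading the same stream back.
        stream.Position = 0;
        IList<Reading> roundTrip =
            await ParquetSerializer.DeserializeAsync<Reading>(stream, options);

        Console.WriteLine(roundTrip.Count);
    }
}
```

Code that previously passed ParquetOptions to DeserializeAsync would need to switch to ParquetSerializerOptions after upgrading.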
New features
- Class deserializer can optionally ignore property name casing (#536).
Improvements
- ParquetWriter supports asynchronous dispose pattern (IAsyncDisposable), thanks to @andagr in #479.
- IronCompress upstream dependency updated to 1.6.0.
Bugs fixed
- Nullable Enums were not correctly unwrapped to primitive types, by @cliedeman in #551.
- Reverting #537 due to it breaking binary compatibility in 4.25.0. Thanks to @NeilMacMullen for reporting this.
4.25.0
Improvements
- File merger utility has Stream overload for non file-based operations.
- File merger utility has extra overload to choose compression codec and specify custom metadata, by @dxdjgl in #519.
- Timestamp logical type is supported, by @cliedeman in #521.
- More data types support encoding using Dictionary encoding, by @EamonHetherton in #531.
- Support for Roslyn nullable types, by @ErikApption in #537.
- internal: fix Decode methods to return the actual destination length, by @artnim in #543.
4.24.0
New features
- Enum serialization is supported, using Enum's underlying type as a storage type.
- [ParquetIgnore] is supported in addition to [JsonIgnore] for class properties. This is useful when you want to ignore a property in Parquet serialization but not in JSON serialization. Thanks to @rhvieira1980 in #411.
- By popular demand, there is now a FileMerger utility which can merge multiple parquet files into a single file by either merging files or actual data together.
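The enum and [ParquetIgnore] items above might look like this on a class. This is a minimal sketch: the Device class and Status enum are hypothetical, and the Parquet.Serialization.Attributes namespace for the attribute is an assumption to verify against the library documentation.

```csharp
using System.Text.Json.Serialization;
using Parquet.Serialization.Attributes; // assumed namespace for [ParquetIgnore]

// Hypothetical enum; its underlying type (short) is what gets stored.
public enum Status : short
{
    Unknown = 0,
    Active = 1
}

// Hypothetical class used only for this illustration.
public class Device
{
    public int Id { get; set; }

    // Written using the enum's underlying type, per the release note.
    public Status Status { get; set; }

    // Skipped by Parquet serialization only; JSON output still includes it.
    [ParquetIgnore]
    public string? Comment { get; set; }

    // Skipped by JSON serialization, and also honoured by the Parquet serializer.
    [JsonIgnore]
    public string? Secret { get; set; }
}
```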
Improvements
- Nullable TimeSpan support in ParquetSerializer by @cliedeman in #409.
- DataFrame support for int16/uint16 types by @asmirnov82 in #469.
- Dropping build targets for .NET Core 3.1 and .NET 7.0 (STS). This should not affect anyone as .NET 6 and 8 are the LTS versions now.
- Added convenience methods to serialize/deserialize collections into a single row group in #506 by @piiertho.
- Serialization of interfaces and interface member properties is now supported, see #513 thanks to @Pragmateek.
- ParquetReader is now easier to use in LINQ expressions thanks to @danielearwicker in #509.
- Upgraded to latest IronCompress dependency.
Bug fixes
- Loop will read past the end of a block #487 by @alex-harper.
- Decimal scale condition check fixed in #504 by @sierzput.
- Class schema reflector was using single cache for reading and writing, which resulted in incorrect schema for writing. Thanks to @Pragmateek in #514.
- Incorrect definition level for null values in #516 by @greg0rym.
Parquet Floor
- New feature "File explorer" lists filesystem using a panel on the left, allowing you to quickly load different files in the same directory and navigate to other directories.
- Hovering over title will show full file path and load time in milliseconds.
- Right-click on a row shows context menu allowing to copy the row to clipboard in text format.
- Icon updated to use the official Parquet logo.
- You will get a notification popup if a new version of Parquet Floor is available.
- Telemetry agreement changed and made clearer to understand.
4.23.5
Bug fixes
- Reading decimal fields ignores precision and scale by @sierzput in #482.
- UUID logical type was not read correctly, it must always be in big-endian format. Thanks to @anatoliy-savchak in #496.
4.23.4
Bug fixes
- Fixed regression in schema discovery of nullables for DateTime, DateOnly, TimeOnly.
4.23.3
4.23.2
Bug fixes
- Avoid file truncation when serializing with Append = true by @danielearwicker in #462.
- Failure to read Parquet file with FIXED_LEN_BYTE_ARRAY generated by Python in #463 thanks to @AndrewDavidLees by @aloneguid.
4.23.1
Improvement
- Flat file converter understands simple arrays and lists.
4.23.0
New features
- Class serializer now supports fields, in addition to properties (#405).
- New helper class ParquetToFlatTableConverter to simplify conversion of parquet files to flat data destinations.
Bugs fixed
- .NET >= 6 specific types DateOnly and TimeOnly deserialization was failing due to schema validation errors (#395).
- TimeOnly nullability wasn't respected.
- Custom attributes like [ParquetTimestamp], [ParquetMicroSecondsTime] or [ParquetDecimal] were ignored for nullable class properties (408).
Floor
- Remembers theme variant - "light" or "dark".
- Ask for permission to send anonymous telemetry data on start.
- New button - reload file from disk.
- Simple conversion to CSV.
- Implemented version check on start.