Releases: aloneguid/parquet-dotnet

5.0.2

14 Nov 18:45

New features

  • Untyped serialisation supports async enumerable, thanks to @flambert860 in #566.

Improvements

  • Serialisation of CLR (not Parquet) structs and nullable structs is now properly handled and supported, thanks to @paulengineer.
  • Microsoft.Data.Analysis-related functionality has moved to a separate NuGet package, Parquet.Net.Data.Analysis, because it pulls in quite a few dependencies that users of the slim main package do not always need.
  • For Windows, run unit tests on x86 and x64 explicitly.
  • Improved the GitHub Actions build/release process by combining all workflows into one and simplifying it, most importantly release management.

Floor

  • Application slimmed down a bit by removing "File Explorer". It sticks to doing one thing and doing it well: viewing Parquet files.
  • Because Avalonia startup times (1-2 seconds) are unsatisfactory when you are in the "mode", Floor now reuses an existing instance to open a file rather than starting the app again.

Announcements 🎉

There is a new, very young project that I had been thinking about for a long time and have finally started: DeltaIO. It attempts to do for Delta tables what Parquet.Net did for Apache Parquet. It relies heavily on this library to read delta logs and data. It's still very young, but if you are interested in Delta with .NET, please check it out, bookmark it, star it, and leave feedback or suggestions.

5.0.1

14 Oct 13:13

New feature

You can deserialise "required" lists and "required" list elements, as raised by @akaloshych84 in #502. See nullability and lists.

Improvements

  • Better error reporting when the class serializer encounters mismatched definition and repetition levels (as per #502).
  • Pass property attributes down to list data field, by @agaskill in #559.

Bug fixed

  • Compression/decompression would fail on some platforms like x86 or Linux x86 with musl runtime.

Floor

  • Boolean columns display as checks.
  • Structs display as expandable objects, with properly aligned keys.

5.0.0

02 Oct 13:21

Support Parquet.Net

If you find the project helpful, you can support Parquet.Net by starring it.

Breaking changes

  • This is the first version without the old Table/Row API, which is now completely removed. This API had been one of the major headaches and sources of bugs since it was introduced in the very first version of this library. If you need similar functionality, consider the untyped serializer, which should be stable enough (the Floor utility has relied on it exclusively for quite some time).
  • ParquetSerializer's SerializeAsync accepted ParquetSerializerOptions, but DeserializeAsync accepted ParquetOptions. Both now take ParquetSerializerOptions for consistency.

New features

Improvements

  • ParquetWriter supports asynchronous dispose pattern (IAsyncDisposable), thanks to @andagr in #479.
  • IronCompress upstream dependency updated to 1.6.0.

Bugs fixed

  • Nullable Enums were not correctly unwrapped to primitive types, by @cliedeman in #551.
  • Reverted #537 because it broke binary compatibility in 4.25.0. Thanks to @NeilMacMullen for reporting this.

4.25.0

09 Sep 08:45

Improvements

  • File merger utility has a Stream overload for non-file-based operations.
  • File merger utility has an extra overload to choose the compression codec and specify custom metadata, by @dxdjgl in #519.
  • Timestamp logical type is supported, by @cliedeman in #521.
  • More data types support dictionary encoding, by @EamonHetherton in #531.
  • Support for Roslyn nullable types, by @ErikApption in #537.
  • internal: Decode methods now return the actual destination length, by @artnim in #543.

4.24.0

06 Jun 12:38

New features

  • Enum serialization is supported, using Enum's underlying type as a storage type.
  • [ParquetIgnore] is supported in addition to [JsonIgnore] for class properties. This is useful when you want to ignore a property in Parquet serialization but not in JSON serialization. Thanks to @rhvieira1980 in #411.
  • By popular demand, there is now a FileMerger utility which can merge multiple parquet files into a single file, either by merging the files or by merging the actual data together.

Improvements

  • Nullable TimeSpan support in ParquetSerializer by @cliedeman in #409.
  • DataFrame support for int16/uint16 types by @asmirnov82 in #469.
  • Dropping build targets for .NET Core 3.1 and .NET 7.0 (STS). This should not affect anyone as .NET 6 and 8 are the LTS versions now.
  • Added convenience methods to serialize/deserialize collections into a single row group in #506 by @piiertho.
  • Serialization of interfaces and interface member properties is now supported, see #513 thanks to @Pragmateek.
  • ParquetReader is now easier to use in LINQ expressions thanks to @danielearwicker in #509.
  • Upgraded to latest IronCompress dependency.

Bug fixes

  • Fixed a loop that would read past the end of a block, in #487 by @alex-harper.
  • Decimal scale condition check fixed in #504 by @sierzput.
  • Class schema reflector was using single cache for reading and writing, which resulted in incorrect schema for writing. Thanks to @Pragmateek in #514.
  • Fixed an incorrect definition level for null values, in #516 by @greg0rym.
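
For readers unfamiliar with definition levels: in the Parquet format, an optional column stores only its non-null values, plus one definition level per row saying whether a value is present. The Python sketch below illustrates that idea for a flat (non-nested) optional column, where the maximum definition level is 1. It is not Parquet.Net code, and the function names `encode_optional` and `decode_optional` are hypothetical.

```python
from typing import List, Optional, Tuple


def encode_optional(values: List[Optional[int]]) -> Tuple[List[int], List[int]]:
    """Split a flat optional column into definition levels and packed values."""
    # Level 1 means "value is defined"; level 0 means "value is null".
    levels = [0 if v is None else 1 for v in values]
    # Nulls take no space in the value stream.
    packed = [v for v in values if v is not None]
    return levels, packed


def decode_optional(levels: List[int], packed: List[int]) -> List[Optional[int]]:
    """Rebuild the column: consume one packed value for every level equal to 1."""
    it = iter(packed)
    return [next(it) if level == 1 else None for level in levels]


levels, packed = encode_optional([1, None, 3])
restored = decode_optional(levels, packed)
```

A wrong level for a null row shifts every later value by one position on read, which is why this class of bug corrupts data rather than failing loudly.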

Parquet Floor

  • New feature "File explorer" lists the filesystem in a panel on the left, allowing you to quickly load different files in the same directory and navigate to other directories.
  • Hovering over the title shows the full file path and load time in milliseconds.
  • Right-clicking a row shows a context menu that lets you copy the row to the clipboard in text format.
  • Icon updated to use the official Parquet logo.
  • You will get a notification popup if a new version of Parquet Floor is available.
  • Telemetry agreement changed and made clearer to understand.

4.23.5

04 Apr 13:10

Bug fixes

  • Reading decimal fields ignored precision and scale; fixed by @sierzput in #482.
  • UUID logical type was not read correctly, it must always be in big-endian format. Thanks to @anatoliy-savchak in #496.
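
To illustrate the byte-order point in the UUID fix: the same UUID has two common 16-byte layouts. One is big-endian throughout, which the note above says Parquet requires. The other stores the first three fields little-endian, which is the layout .NET's `Guid.ToByteArray()` produces. The Python sketch below uses only the standard `uuid` module to show the difference and a conversion; it is not Parquet.Net code, and the helper name `mixed_to_big_endian` is hypothetical.

```python
import uuid

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

big = u.bytes       # big-endian layout throughout
mixed = u.bytes_le  # first three fields little-endian, remainder unchanged


def mixed_to_big_endian(raw: bytes) -> bytes:
    """Convert the mixed-endian 16-byte layout to the big-endian one."""
    if len(raw) != 16:
        raise ValueError("a UUID is exactly 16 bytes")
    # Reverse the 4-byte, 2-byte and 2-byte fields; keep the last 8 bytes as-is.
    return raw[3::-1] + raw[5:3:-1] + raw[7:5:-1] + raw[8:]


converted = mixed_to_big_endian(mixed)
```

Interpreting the mixed-endian bytes as if they were big-endian yields a different UUID, which is how an unconverted read goes wrong.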

4.23.4

02 Feb 10:12

Bug fixes

Fixed regression in schema discovery of nullables for DateTime, DateOnly, TimeOnly.

4.23.3

25 Jan 10:46

Fixed regression in schema discovery of nullable decimal data types. Thanks to @stefer in #465 for investigating and reporting this.

4.23.2

22 Jan 11:06

Bug fixes

4.23.1

19 Jan 10:46

Improvement

  • Flat file converter understands simple arrays and lists.