v0.11.2 (2022-05-05)
New features:
- Added support to append to existing IPC Arrow file #972 (jorgecarleitao)
- Added pop to utf8/binary/fixedSize MutableArray #966 (ygf11)
- Added support for union scalars #930 (ncpenke)
Fixed bugs:
- Added support to read nested binary from parquet #978 (jorgecarleitao)
- Fixed empty reader panic for NDJSON type infer #974 (Roberto-XY)
- Prevented stack overflow (SO) in large parquet files #973 (ritchie46)
- Fixed API bug in `async` read of IPC metadata #969 (jorgecarleitao)
- Fixed writing required list to parquet #968 (jorgecarleitao)
Enhancements:
- Added support to deserialize LargeList and Uint data types from Parquet #979 (b41sh)
- Made reading of IPC dictionaries lazy #971 (jorgecarleitao)
- Allowed creating IPC `FileWriter` without writing to the file #970 (jorgecarleitao)
v0.11.1 (2022-04-27)
Fixed bugs:
v0.11.0 (2022-04-27)
Breaking changes:
- Refactored parquet statistics deserialization #962 (jorgecarleitao)
- Made GroupFilter `Send + Sync` #947 (jorgecarleitao)
New features:
- Added support for non-ordered projections to IPC reading #961 (jorgecarleitao)
- Added support for reading indexed parquet pages #923 (jorgecarleitao)
Fixed bugs:
- Parquet regression: `exceptions.ArrowErrorException: NotYetImplemented("Can't read Dictionary(UInt32, LargeUtf8, false) from parquet")` #955
- Reading Parquet binary column panics during deserialization 'attempt to subtract with overflow' #944
- Reading Parquet file written by pyarrow with `lz4` compression fails with `OutOfSpec("Thrift out of range")` #940
- Issues when trying to create a parquet file with FixedSizedListArray #691
- Fixed bug in writing csv with buffer resizing #965 (ritchie46)
- Fixed bug in reading binary parquet #945 (jorgecarleitao)
- Fixed error in writing fixedSizeListArray to parquet #941 (jorgecarleitao)
- Fixed support to read dict nested binary parquet #924 (jorgecarleitao)
Enhancements:
- Reduced memory usage in reading parquet #964 (jorgecarleitao)
- Simpler IPC code #939 (jorgecarleitao)
- don't allocate string when writing to csv #935 (ritchie46)
- Removed un-needed generic parameter #927 (jorgecarleitao)
- update to odbc-api 0.36.0 #925 (pacman82)
Documentation updates:
- Fixed example of parallel read via rayon #958 (jorgecarleitao)
- Fixed guide deployment #931 (jorgecarleitao)
- Typo fix #919 (bkmgit)
Testing updates:
- Fixed patch of integration tests #960 (jorgecarleitao)
- Added test for MapArray #942 (jorgecarleitao)
- Fixed wrong clippy warning #938 (jorgecarleitao)
v0.10.1 (2022-03-16)
New features:
- Added support to write `StructArray` to Avro #909 (jorgecarleitao)
- Added support to write `ListArray` to Avro #908 (jorgecarleitao)
Fixed bugs:
- Fixed error in `FixedSizeBinaryArray::new_null` #914 (jorgecarleitao)
Enhancements:
- remove csv dependency for csv-write #917 (ritchie46)
- Added `capacity` to some mutable arrays and tests #913 (jorgecarleitao)
- Support `sum`, `min` and `max` for extension and decimal #907 (jorgecarleitao)
Testing updates:
- Added more tests #910 (jorgecarleitao)
v0.10.0 (2022-03-12)
Breaking changes:
- Renamed `Ffi_ArrowArray` and `Ffi_ArrowSchema` #859
- Improved performance and stability of writing to CSV #866 (ritchie46)
- Simplified API for writing to JSON #864 (jorgecarleitao)
- Simplified API to import from FFI #854 (jorgecarleitao)
- Simplified compute (lower/upper) #847 (jorgecarleitao)
- Simplified inferring arrow schema from a parquet schema #819 (jorgecarleitao)
- Bumped parquet and aligned API to fit into it #795 (jorgecarleitao)
New features:
- Added `GrowableUnion` #902 (jorgecarleitao)
- Added cast to `months_days_ns` #900 (jorgecarleitao)
- Added support for `hash` of `month_day_ns` arrays #899 (jorgecarleitao)
- IPC sink types and IPC file stream #878 (dexterduck)
- implemented `futures::Sink` for parquet async writer #877 (dexterduck)
- Added `try_new` and `new` to all arrays #873 (jorgecarleitao)
- Added support for datatypes serde #858 (houqp)
- Added support to the Arrow C stream interface (read and write) #857 (jorgecarleitao)
- Support to read/write from/to ODBC #849 (jorgecarleitao)
- Added operators that include validities in comparisons #846 (ritchie46)
- Added support to read and write `Decimal128` to Avro #837 (potter420)
- Added support to read Arrow streams asynchronously #832 (jorgecarleitao)
- Added support to write `LargeUtf8` and `LargeBinary` to Avro #828 (illumination-k)
- Added support for pushdown projection in reading Avro #827 (jorgecarleitao)
- Added support to read Avro's structs #826 (jorgecarleitao)
- Added support to write largeUtf8/Binary to Avro #825 (jorgecarleitao)
- Added json serialization of timestamp/date32/date64 #814 (ritchie46)
- Added `BooleanArray::from_trusted_len_values_iter_unchecked` #799 (ritchie46)
- Added `MutableUtf8Array::extend_values` #798 (ritchie46)
- Added COW semantics to `Buffer`, `Bitmap` and some arrays #794 (ritchie46)
- Added support to read parquet row groups in chunks #789 (jorgecarleitao)
- Added scalar bitwise ops #788 (jorgecarleitao)
- Migrated to portable simd #747 (jorgecarleitao)
Fixed bugs:
- Fixed edge case in reading multiple parquet pages #904 (jorgecarleitao)
- Bug fix in offset for sliced unions #891 (ncpenke)
- Fix edge case in reading nested parquet #884 (jorgecarleitao)
- Fixed unsoundness of `#derive(Clone)` for FFI structs #882 (jorgecarleitao)
- Fixed json writing of dates and datetimes #867 (jorgecarleitao)
- Fixed reading parquet with timezone #862 (jorgecarleitao)
- Fixed error in writing compressed IPC arrow #855 (jorgecarleitao)
- Fixed wrong null_count when slicing a sliced Bitmap #848 (satlank)
- Fixed error in writing compressed IPC files #840 (jorgecarleitao)
- Fixed float to i128 cast #817 (houqp)
- fix unescaped '"' in json writing #812 (ritchie46)
- Fixed reading parquet binary dict page #791 (danburkert)
Enhancements:
- Add `FixedSizeBinaryScalar` #782
- Use more idiomatic versions #898 (jorgecarleitao)
- Added support for min/max for decimal #897 (jorgecarleitao)
- Made `FixedSizeList::try_push_valid` public and added `new_with_field` #887 (ncpenke)
- Added `MutableFixedList::mut_values` #886 (jorgecarleitao)
- Made IPC IO use `try_new` #879 (jorgecarleitao)
- expose `ListValuesIter` #874 (ritchie46)
- Bumped crc #856 (jorgecarleitao)
- DRY parquet reading #845 (jorgecarleitao)
- Refactored (internal) fmt #842 (jorgecarleitao)
- Bumped zstd #841 (jorgecarleitao)
- inline push #835 (ritchie46)
- Increased API consistency for COW and respective docs #833 (jorgecarleitao)
- Improved flexibility of reading parquet #820 (jorgecarleitao)
- Small improvement to deserializing fixed-len parquet statistics. #818 (jorgecarleitao)
- Added support for other timestamp units from parquet #803 (jorgecarleitao)
- More to `into_mut` implementations #801 (ritchie46)
- Added `FixedSizeListScalar` and `FixedSizeBinaryScalar` #786 (illumination-k)
- DRY parquet module #785 (jorgecarleitao)
Documentation updates:
- Improved documentation #860 (jorgecarleitao)
- Made crate `deny(missing_docs)` #808 (jorgecarleitao)
- Fixed doc for `Bitmap::set_bit` #802 (yjshen)
- Fixed `dyn Array::slice` docstring #792 (ritchie46)
Testing updates:
- Simpler code (DRY) #901 (jorgecarleitao)
- Fixed integration test #885 (jorgecarleitao)
- Simplified code to generate parquet files for tests #883 (jorgecarleitao)
- Removed un-needed `unsafe` #843 (jorgecarleitao)
- Added more tests #810 (jorgecarleitao)
- Reduced code duplication #805 (jorgecarleitao)
- upgrade to clap 3.0 #797 (Jimexist)
- Simplified avro reading and added more tests #737 (jorgecarleitao)
v0.9.1 (2022-01-19)
New features:
- Added support for compare dictionary-encoded with scalar #686 (jorgecarleitao)
Fixed bugs:
- Allowed passing `None` as ipc_fields in flight API #780 (jorgecarleitao)
Enhancements:
- Read dict binary from parquet #781 (jorgecarleitao)
- Added support to read and write float dict from parquet #778 (jorgecarleitao)
Testing updates:
- Fixed CI for SIMD #779 (jorgecarleitao)
v0.9.0 (2022-01-14)
Breaking changes:
- Added number of rows read in CSV inference #765 (jorgecarleitao)
- Refactored `nullif` #753 (jorgecarleitao)
- Migrated to latest parquet2 #752 (jorgecarleitao)
- Replace flatbuffers dependency by Planus #732 (jorgecarleitao)
- Simplified `Schema` and `Field` #728 (jorgecarleitao)
- Replaced `RecordBatch` by `Chunk` #717 (jorgecarleitao)
- Removed `Option` from fields' metadata #715 (jorgecarleitao)
- Moved dict_id to IPC-specific IO #713 (jorgecarleitao)
- Moved is_ordered from `Field` to `DataType::Dictionary` #711 (jorgecarleitao)
- Refactored JSON writing (5-10x) #709 (jorgecarleitao)
- Made Avro read API use `Block` and `CompressedBlock` #698 (jorgecarleitao)
- Simplified most traits #696 (jorgecarleitao)
- Replaced `Display` by `Debug` for `Array` #694 (jorgecarleitao)
- Replaced `MutableBuffer` by `std::Vec` #693 (jorgecarleitao)
- Simplified `Utf8Scalar` and `BinaryScalar` #660 (jorgecarleitao)
- Simplified Primitive and Boolean scalar #648 (jorgecarleitao)
New features:
- Add `and_scalar` and `or_scalar` for boolean_kleene #662
- Add `lower` and `upper` support for string #635
- Added support to cast decimal #761 (jorgecarleitao)
- Added support to deserialize JSON (!= NDJSON) #758 (jorgecarleitao)
- Added support to infer nested json structs #750 (jorgecarleitao)
- Added support to compare intervals #746 (jorgecarleitao)
- Added `any` and `all` kernel #739 (ritchie46)
- Added support to write Avro async #736 (jorgecarleitao)
- Added support to write interval to Avro #734 (jorgecarleitao)
- Added `and_scalar` and `or_scalar` for boolean kleene #723 (silathdiir)
- Added `and_scalar` and `or_scalar` for boolean #707 (silathdiir)
- Refactored JSON read to split IO-bounded from CPU-bounded tasks #706 (jorgecarleitao)
- Added more conversions from parquet #701 (jorgecarleitao)
- Added support for compressed Avro write #699 (jorgecarleitao)
- Added support to write to Avro #690 (jorgecarleitao)
- Added dynamic version of negation #685 (jorgecarleitao)
- Added support to read dictionary-encoded required parquet pages #683 (mdrach)
- Added `upper` #664 (Xuanwo)
- Added `lower` #641 (Xuanwo)
- Added support for `async` read of Avro #620 (jorgecarleitao)
Fixed bugs:
- Pyarrow and Arrow2 don't agree on Timestamp resolution #700
- Writing compressed dictionary in parquet corrupts the files #667
- Replaced assert by error in IPC read #748 (jorgecarleitao)
- Made all panics in IPC read errors #722 (jorgecarleitao)
- Fixed error in compare booleans #721 (jorgecarleitao)
- Fixed error in dispatching scalar arithmetics #682 (jorgecarleitao)
- Fixed error in reading negative decimals from parquet #679 (mdrach)
- Made IPC reader less restrictive #678 (jorgecarleitao)
- Fixed error in trait constraint in compute #665 (jorgecarleitao)
- Fixed performance regression of CSV reading #657 (jorgecarleitao)
- Fixed filter of predicate with validity #653 (ritchie46)
- Made `Scalar: Send+Sync` #644 (jorgecarleitao)
Enhancements:
- Feature: JSON IO? #712
- Simplified code #760 (jorgecarleitao)
- Added iterator of values of `FixedBinaryArray` #757 (jorgecarleitao)
- Remove un-needed `unsafe` #756 (jorgecarleitao)
- Replaced un-needed `unsafe` #755 (jorgecarleitao)
- Made IO `#[forbid(unsafe)]` #749 (jorgecarleitao)
- Improved reading nullable Avro arrays #727 (Igosuki)
- Allow to create primitive array by vec without extra memcopy #710 (sundy-li)
- Removed requirement of `use Array` to access primitives' `data_type` #697 (jorgecarleitao)
- Cleaned up trait usage and added forbid_unsafe to parts #695 (jorgecarleitao)
- Migrated from `avro-rs` to `avro-schema` #692 (jorgecarleitao)
- Added `MutablePrimitiveArray::extend_constant` #689 (jorgecarleitao)
- Do not write validity without nulls in IPC #688 (jorgecarleitao)
- DRY code via macro #681 (jorgecarleitao)
- Made `dyn Array` and `Scalar` usable in `#[derive(PartialEq)]` #680 (jorgecarleitao)
- Made IPC ZSTD-compressed consumable by pyarrow #675 (jorgecarleitao)
- Simplified trait bounds in arithmetics #671 (jorgecarleitao)
- Improved performance of reading utf8 required from parquet (-15%) #670 (jorgecarleitao)
- Avoid double utf8 checks on MutableUtf8 -> Utf8 #655 (jorgecarleitao)
- Made `Buffer::offset` public #652 (ritchie46)
- Improved performance in cast Primitive to Binary/String (2x) #646 (sundy-li)
- Made `Filter: Send+Sync` #645 (jorgecarleitao)
- Made API to create field accept `String` #643 (jorgecarleitao)
Documentation updates:
- Fixed clippy (coming from 1.58) #763 (jorgecarleitao)
- Described how to run part of the tests #762 (jorgecarleitao)
- Improved README #735 (jorgecarleitao)
- clarify boolean value in DataType::Dictionary #718 (ritchie46)
- readme typo #687 (max-sixty)
- Added example to read parquet in parallel with rayon #658 (jorgecarleitao)
- Added documentation to
Bitmap::as_slice
#654 (ritchie46)
Testing updates:
- Improved json tests #742 (jorgecarleitao)
- Added integration tests for writing compressed parquet #740 (jorgecarleitao)
- Updated patch for integration test #731 (jorgecarleitao)
- Added cargo check to benchmarks #730 (sundy-li)
- More tests to CSV writing #724 (jorgecarleitao)
- Added integration tests for other compressions with parquet from pyarrow #674 (jorgecarleitao)
- Bumped nightly in CI #672 (jorgecarleitao)
- Invalidate caches from CI. #656 (jorgecarleitao)
v0.8.1 (2021-11-27)
Fixed bugs:
v0.8.0 (2021-11-27)
Breaking changes:
- Made CSV write options use chrono formatting by default #624
- Add `compression` to `IpcWriteOptions` #570
- Made `cast` accept `CastOptions` parameter #569
- Simplified `ArrowError` #640 (jorgecarleitao)
- Use `DynComparator` for `lexsort` and `partition` #637 (yjshen)
- Split "compute" feature #634 (jorgecarleitao)
- Removed unneeded trait. #628 (jorgecarleitao)
- Sealed 2 traits to forbid downstream implementations #621 (jorgecarleitao)
- Simplified arithmetics compute #607 (jorgecarleitao)
- Refactored comparison `Operator` #604 (jorgecarleitao)
- Simplified dictionary indexes #584 (jorgecarleitao)
- Simplified IPC APIs #576 (jorgecarleitao)
- Simplified IPC stream writer / remove finish on drop from stream writer #575 (jorgecarleitao)
- Simplified trait in compute. #572 (jorgecarleitao)
- Compute: add partial option into CastOptions #561 (sundy-li)
- Introduced `UnionMode` enum #557 (simonvandel)
- Changed DataType::FixedSize*(i32) to DataType::FixedSize*(usize) #556 (simonvandel)
New features:
- Added support to write timestamps with timezones for CSV #623 (jorgecarleitao)
- Added support to read Avro files' metadata asynchronously #614 (jorgecarleitao)
- Added iterator for `StructArray` #613 (illumination-k)
- Added support to read snappy-compressed Avro #612 (jorgecarleitao)
- Added support to read decimal from csv #602 (jorgecarleitao)
- Added support to cast `NullArray` to all other types #589 (flaneur2020)
- Added support dictionaries in nested types over IPC #587 (jorgecarleitao)
- Added support to write Arrow IPC streams asynchronously #577 (jorgecarleitao)
- Added support to write compressed Arrow IPC (feather v2) #566 (jorgecarleitao)
- Added support for ffi for `FixedSizeList` and `FixedSizeBinary` #565 (jorgecarleitao)
- Added support for `async` csv reading. #562 (jorgecarleitao)
- Added support for `bitwise` operations #553 (1aguna)
- Added support to read `StructArray` from parquet #547 (jorgecarleitao)
Fixed bugs:
- Fixed error in reading nullable from Avro. #631 (jorgecarleitao)
- Fixed error in union FFI #625 (jorgecarleitao)
- Fixed error in computing projection in `io::ipc::read::reader::FileReader` #596 (illumination-k)
- Fixed error in compressing IPC LZ4 #593 (jorgecarleitao)
- Fixed growable of dictionaries negative keys #582 (ritchie46)
- Made substring kernel on utf8 take chars into account. #568 (ritchie46)
- Fixed error in passing sliced arrays via FFI #564 (jorgecarleitao)
Enhancements:
- Faster `take` with null values (2-3x) #633 (jorgecarleitao)
- Improved error message for missing feature in compressed parquet #632 (jorgecarleitao)
- Added `to` conversion to `FixedSizeBinary` #622 (ritchie46)
- Bumped `comfy-table` #618 (jorgecarleitao)
- Made `MutableArray` `Send + Sync` #617 (jorgecarleitao)
- Removed most of allocations in IPC reading #611 (jorgecarleitao)
- Speed up boolean comparison kernels (~3x) #610 (Dandandan)
- Improved performance of decimal arithmetics #605 (jorgecarleitao)
- Simplified traits and added documentation #603 (jorgecarleitao)
- Improved performance of `is_not_null`. #600 (jorgecarleitao)
- Added `len` to every array #599 (jorgecarleitao)
- Added support for `NullArray` at FFI. #598 (jorgecarleitao)
- Optimized `MutableBinaryArray` #597 (jorgecarleitao)
- Speedup/simplify bitwise operations (avoid extra allocation) #586 (Dandandan)
- Improved performance of `bitmap::from_trusted` (3x) #578 (jorgecarleitao)
- Made bitmap not cache null count #563 (jorgecarleitao)
- Avoided redundant checks in creating an `Utf8Array` from `MutableUtf8Array` #560 (jorgecarleitao)
- Avoid unnecessary allocations #559 (simonvandel)
- Surfaced errors in reading from avro #558 (jorgecarleitao)
Documentation updates:
- Simplified example #619 (jorgecarleitao)
- Made example of parallel parquet write be over multiple batches #544 (jorgecarleitao)
Testing updates:
- Cleaned up benches #636 (jorgecarleitao)
- Ignored tests code in coverage report #615 (yjhmelody)
- Added more tests #601 (jorgecarleitao)
- Mitigated `RUSTSEC-2020-0159` #595 (jorgecarleitao)
- Added more tests #591 (jorgecarleitao)
v0.7.0 (2021-10-29)
Breaking changes:
- Simplified reading parquet #532 (jorgecarleitao)
- Change IPC `FileReader` to own the underlying reader #518 (blakesmith)
- Migrate to `arrow_format` crate #517 (jorgecarleitao)
New features:
- Added read of 2-level nested lists from parquet #548 (jorgecarleitao)
- add dictionary serialization for csv-writer #515 (ritchie46)
- Added `checked_negate` and `wrapping_negate` for `PrimitiveArray` #506 (yjhmelody)
Fixed bugs:
- Fixed error in reading fixed len binary from parquet #549 (jorgecarleitao)
- Fixed ffi of sliced arrays #540 (jorgecarleitao)
- Fixed s3 example #536 (jorgecarleitao)
- Fixed error in writing compressed parquet dict pages #523 (jorgecarleitao)
- Validity taken into account when writing `StructArray` to json #511 (VasanthakumarV)
Enhancements:
- Bumped Prost and Tonic #550 (PsiACE)
- Speedup scalar boolean operations #546 (Dandandan)
- Added fast path for validating ASCII text (~1.12-1.89x improvement on reading ASCII parquet data) #542 (Dandandan)
- Exposed missing APIs to write parquet in parallel #539 (jorgecarleitao)
- improve utf8 init validity #530 (ritchie46)
- export missing `BinaryValueIter` #526 (yjhmelody)
Documentation updates:
- Added more IPC documentation #534 (HagaiHargil)
- Fixed clippy and fmt #521 (ritchie46)
Testing updates:
- Added more tests for `utf8` #543 (jorgecarleitao)
- Ignored RUSTSEC-2020-0071 and RUSTSEC-2020-0159 #537 (jorgecarleitao)
- Improved parquet read benches #533 (jorgecarleitao)
- Added fmt and clippy checks to CI. #522 (xudong963)
v0.6.2 (2021-10-09)
New features:
Fixed bugs:
- Do not check offsets or utf8 validity in ffi (#505) #510 (NilsBarlaug)
- Made `try_push_valid` public again #509 (ritchie46)
Enhancements:
v0.6.1 (2021-10-07)
Breaking changes:
- Bring `MutableFixedSizeListArray` to the spec used by the rest of the Mutable API #475
- Removed `ALIGNMENT` invariant from `[Mutable]Buffer` #449
- Un-nested `compute::arithmetics::basic` #461 (jorgecarleitao)
- Added more serialization options for csv writer. #453 (ritchie46)
- Changed validity from `&Option<Bitmap>` to `Option<&Bitmap>`. #431 (jorgecarleitao)
- Bumped parquet2 #422 (jorgecarleitao)
- Changed IPC `FileWriter` to own the `writer`. #420 (yjshen)
- Made `DynComparator` `Send+Sync` #414 (yjshen)
New features:
- Read Decimal from Parquet File #444
- Add IO read for Avro #401
- Added support to read Avro logical types, `List`, `Enum`, `Duration` and `Fixed`. #493 (jorgecarleitao)
- Added read `Decimal` from parquet #489 (potter420)
- Implement `BitXor` trait for `Bitmap` #485 (houqp)
- Added `extend`/`extend_unchecked` for `MutableBooleanArray` #478 (VasanthakumarV)
- expose `shrink_to_fit` to mutable arrays #467 (ritchie46)
- Added support for `DataType::Map` and `MapArray` #464 (jorgecarleitao)
- Extract parts of datetime #433 (VasanthakumarV)
- Added support to add an interval to a timestamp #417 (jorgecarleitao)
- Added support to read Avro. #406 (jorgecarleitao)
- Replaced own allocator by `std::Vec`. #385 (jorgecarleitao)
Fixed bugs:
- crash in parquet read #459
- Made writing stream to parquet require a non-static lifetime #471 (GrandChaman)
- Made importing from FFI `unsafe` #458 (jorgecarleitao)
- Fixed panic in division using nulls. #438 (jorgecarleitao)
- Fixed error writing dictionary extension to IPC #397 (jorgecarleitao)
- Fixed error in extending `MutableBitmap` #393 (jorgecarleitao)
Enhancements:
- Some `compare` function are not exported #349
- Investigate how to add support for timezones in timestamp #23
- Made `hash` work for extension type #487 (jorgecarleitao)
- Added `extend`/`extend_unchecked` for `MutableBinaryArray` #486 (VasanthakumarV)
- Improved inference and deserialization of CSV #483 (jorgecarleitao)
- Added `GrowableFixedSizeList` and improved `MutableFixedSizeListArray` #470 (jorgecarleitao)
- Added `MutableBitmap::shrink_to_fit` #468 (jorgecarleitao)
- Added `MutableArray::as_box` #450 (sd2k)
- Improved performance of sum aggregation via aligned loads (-10%) #445 (ritchie46)
- Removed `assert` from `MutableBuffer::set_len` #443 (ritchie46)
- Optimized `null_count` #442 (ritchie46)
- Improved performance of list iterator (- 10-20%) #441 (ritchie46)
- Improved performance of `PrimitiveGrowable` for nulls (-10%) #434 (jorgecarleitao)
- Allowed accessing validity without importing `Array` #432 (jorgecarleitao)
- Optimize hashing using `ahash` and `multiversion` (-30%) #428 (Dandandan)
- Improved performance of iterator of `Utf8Array` and `BinaryArray` (3-4x) #427 (jorgecarleitao)
- Improved performance of utf8 validation of large strings via `simdutf8` (-40%) #426 (Dandandan)
- Added reading of parquet required dictionary-encoded binary. #419 (jorgecarleitao)
- Add `extend`/`extend_unchecked` for `MutableUtf8Array` #413 (VasanthakumarV)
- Added support to extract hours and years from timestamps with timezone #412 (jorgecarleitao)
- Added `io_csv_read` and `io_csv_write` feature #408 (ritchie46)
- Improve `comparison` docs and re-export the array-comparing function #404 (HagaiHargil)
- Added support to read dict-encoded required primitive types from parquet #402 (Dandandan)
- Added `Array::with_validity` #399 (ritchie46)
Documentation updates:
- Improved documentation #491 (jorgecarleitao)
- Added more API docs. #479 (jorgecarleitao)
- Added more documentation #476 (jorgecarleitao)
- Improved documentation #462 (jorgecarleitao)
- Added example showing parallel writes to parquet (x num_cores) #436 (jorgecarleitao)
- Improved documentation #430 (jorgecarleitao)
- [0.5] The docs `io` module has no submodules #390
- Made docs be compiled with feature `full` #391 (jorgecarleitao)
Testing updates:
- DRY via macro. #477 (jorgecarleitao)
- DRY of type check and len check code in `compute` #474 (yjhmelody)
- Added property testing #460 (jorgecarleitao)
- Added fmt to CI. #455 (jorgecarleitao)
- Simplified CI #452 (jorgecarleitao)
- fix filter kernels bench #440 (ritchie46)
- Reduced number of combinations in feature tests. #429 (jorgecarleitao)
- Move tests from `src/compute/` to `tests/` #423 (VasanthakumarV)
- Skipped some feature permutations. #411 (jorgecarleitao)
- Added tests to some invariants of `unsafe` #403 (jorgecarleitao)
- Added support to read and write extension types to and from parquet #396 (jorgecarleitao)
- Fix testing of SIMD #394 (jorgecarleitao)
v0.5.3 (2021-09-14)
New features:
- Added support to read and write extension types to and from parquet #396 (jorgecarleitao)
Fixed bugs:
- Fixed error writing dictionary extension to IPC #397 (jorgecarleitao)
- Fixed error in extending `MutableBitmap` #393 (jorgecarleitao)
Enhancements:
- Added support to read dict-encoded required primitive types from parquet #402 (Dandandan)
- Added `Array::with_validity` #399 (ritchie46)
Testing updates:
- Fix testing of SIMD #394 (jorgecarleitao)
v0.5.1 (2021-09-09)
Documentation updates:
- [0.5] The docs `io` module has no submodules #390
- Made docs be compiled with feature `full` #391 (jorgecarleitao)
v0.5.0 (2021-09-07)
Breaking changes:
- Added `Extension` to `DataType` #361
- `MonthDayNano` added to enum `IntervalUnit` #360
- Make `io::parquet::write::write_*` return size of file in bytes #354
- Renamed `bitmap::utils::null_count` to `bitmap::utils::count_zeros` #342
- Made `GroupFilter` optional in parquet's `RecordReader` and added method to set it. #386 (jorgecarleitao)
- Removed `PartialOrd` and `Ord` of all enums in `datatypes` #379 (jorgecarleitao)
- Made `cargo` features not default #369 (jorgecarleitao)
- Prepare APIs for extension types #357 (jorgecarleitao)
New features:
- Added support for `async` parquet write #372 (GrandChaman)
- Add support to extension types in FFI #363 (jorgecarleitao)
- Added support for field's metadata via FFI #362 (jorgecarleitao)
- Added support for `Extension` (logical) type #359 (jorgecarleitao)
- Added support for compute to `BinaryArray` #346 (zhyass)
- Added support for reading binary from CSV #337 (jorgecarleitao)
- Added support for `MONTH_DAY_NANO` interval type #268 (jorgecarleitao)
Fixed bugs:
- Parquet read skips a few rows at the end of the page #373
- `parquet_read` fails when a column has too many rows with string values #366
- `parquet_read` panics with `index_out_of_bounds` #351
- Fixed error in `MutableBitmap::push_unchecked` #384 (jorgecarleitao)
- Fixed display of timestamp with tz. #375 (jorgecarleitao)
Enhancements:
- Added `extend_*values` to `MutablePrimitiveArray` #383 (ritchie46)
- Improved performance of writing to CSV (20-25%) #382 (jorgecarleitao)
- Bumped `lexical-core` #378 (jorgecarleitao)
- Fixed casting of utf8 <> Timestamp with and without timezone #376 (jorgecarleitao)
- Added `Send+Sync` to `MutableBuffer` #368 (jorgecarleitao)
- Improved performance of unary _not_ for aligned bitmaps (3x) #365 (jorgecarleitao)
- Reduced dependencies within `num` #353 (jorgecarleitao)
- Bumped to parquet2 v0.4 #352 (jorgecarleitao)
- Bumped tonic and prost in flight #344 (PsiACE)
- Improved null count calculation (5x) #343 (jorgecarleitao)
- Improved perf of deserializing integers from json (30%) #340 (jorgecarleitao)
- Simplified code of json schema inference #339 (jorgecarleitao)
Documentation updates:
- Moved guide examples to examples/ #387 (jorgecarleitao)
- Added more docs. #358 (jorgecarleitao)
- Improved API docs. #355 (jorgecarleitao)
Testing updates:
- Moved tests to `tests/` #389 (jorgecarleitao)
- Moved compute tests to tests/ #388 (jorgecarleitao)
- Added more tests. #380 (jorgecarleitao)
- Pinned nightly in SIMD tests #364 (jorgecarleitao)
- Improved benches for take #348 (jorgecarleitao)
- Made IPC integration tests run tests that are not run by arrow-rs #278 (jorgecarleitao)
v0.4.0 (2021-08-24)
Breaking changes:
- Change dictionary iterator of values from `Array`s of one element to `Scalar`s #335
- Align FFI API with arrow's C++ API #328
- Make `*_compare_scalar` not return `Result` #316
- Make `io::print`, `get_value_display` and `get_display` not return `Result` #286
- Add `MetadataVersion` to IPC interfaces #282
- Change `DataType::Union` to enable round trips in IPC #281
- Removed clone requirement in `StructArray -> RecordBatch` #307 (jorgecarleitao)
- Fixed error in reading a non-finished IPC stream. #302 (jorgecarleitao)
- Generalized ZipIterator to accept a `BitmapIter` #296 (jorgecarleitao)
New features:
- Added API to FFI `Field` #321 (jorgecarleitao)
- Added `compare_scalar` #317 (jorgecarleitao)
- Add `UnionArray` #283 (jorgecarleitao)
Fixed bugs:
- SliceIterator of last bytes is not correct #292
- Fixed error in displaying dictionaries with nulls in values #334 (jorgecarleitao)
- Fixed error in dict equality #333 (jorgecarleitao)
- Fixed small inconsistencies between `compute::cast` and `compute::can_cast` #295 (jorgecarleitao)
- Removed order implementation for `days_ms`/`Interval(DayTime)` #285 (jorgecarleitao)
Enhancements:
- Added support for remaining non-nested datatypes #336 (jorgecarleitao)
- Made `multiversion` and `lexical-core` optional #324 (jorgecarleitao)
- Improved performance of utf8 comparison (1.7x-4x) #322 (jorgecarleitao)
- Improved performance of boolean comparison (5x-14x) #318 (jorgecarleitao)
- Added trait `TryPush` #314 (jorgecarleitao)
- Added cast `date32 -> i64` and `date64 -> i32` #308 (ritchie46)
- Improved performance of comparison with SIMD feature flag (2x-3.5x) #305 (jorgecarleitao)
- Added support to read json to `BinaryArray` #304 (jorgecarleitao)
- Improved `MutableFixedSizeBinaryArray` #303 (jorgecarleitao)
- Improved `MutablePrimitiveArray` and `MutableUtf8Array` #299 (jorgecarleitao)
- Improved `MutableBooleanArray` #297 (jorgecarleitao)
- Improved performance of concatenating non-aligned validities (15x) #291 (jorgecarleitao)
- Added support for timestamps with tz and interval to `io::print::write` #287 (jorgecarleitao)
- Improved debug repr of buffers and bitmaps. #284 (jorgecarleitao)
- Cleaned up internals of json integration #280 (jorgecarleitao)
- Removed `serde_derive` dependency #279 (jorgecarleitao)
- Simplified IPC code. #277 (jorgecarleitao)
- Reduced dependencies from comfy-table and enabled `wasm` on `io_print` feature. #276 (jorgecarleitao)
- Improve performance of `rem_scalar/div_scalar` for integer types (4x-10x) #275 (ritchie46)
Documentation updates:
- Cleaned examples and docs from old API. #330 (jorgecarleitao)
- Improved documentation #306 (jorgecarleitao)
Testing updates:
- Improved naming of testing workflows #315 (jorgecarleitao)
- Added tests to scalar API #300 (jorgecarleitao)
- Made CSV and JSON tests not use files. #290 (jorgecarleitao)
- Moved tests to integration tests #289 (jorgecarleitao)
Closed issues:
- Make parquet_read_record support async #331
- Panic due to SIMD comparison #312
- Bitmap::mutable line 155 may Panic/segfault #309
- IPC's `StreamReader` may abort due to excessive memory by overflowing a `usize`d variable #301
- Improve performance of `rem_scalar/div_scalar` for integer types (4x-10x) #259
v0.3.0 (2021-08-11)
Breaking changes:
- Renamed `sum` to `sum_primitive` #273
- Moved trait `Index` from `array::Index` to `types::Index` #272
- Added optional `projection` to IPC FileReader #271
- Added optional `page_filter` to parquet's `RecordReader` and `get_page_iterator` #270
- Renamed parquet's `CompressionCodec` to `Compression` #269
New features:
- Added support for FFI of dictionary-encoded arrays #267 (jorgecarleitao)
- Added support for projection pushdown on IPC files #264 (jorgecarleitao)
- Added support to read parquet asynchronously #260 (jorgecarleitao)
- Added support to filter parquet pages. #256 (jorgecarleitao)
- Added wrapping_cast to cast kernels #254 (sundy-li)
- Added support to parquet IO on wasm32 #239 (jorgecarleitao)
- Added support to round-trip dictionary arrays on parquet #232 (jorgecarleitao)
- Added Scalar API #56 (jorgecarleitao)
Fixed bugs:
- Fixed error in computing remainder of chunk iterator #262 (jorgecarleitao)
- Fixed error in slicing bitmap. #250 (jorgecarleitao)
Enhancements:
- Improve the performance in cast kernel using AsPrimitive trait in generic dispatch #252
- Poor performance in `sort::sort_to_indices` with limit option in arrow2 #245
- Support loading Feather v2 (IPC) files with more than 1 million tables #231
- Migrated to parquet2 v0.3 #265 (jorgecarleitao)
- Added more tests to cast and min/max #253 (jorgecarleitao)
- Prettytable is unmaintained. Change to comfy-table #251 (PsiACE)
- Added IndexRange to remove checks in hot loops #247 (jorgecarleitao)
- Make merge_sort_slices MergeSortSlices public #243 (sundy-li)
Documentation updates:
- Added example and guide section on compute #242 (jorgecarleitao)
Closed issues:
- Allow projection pushdown to IPC files #261
- Add support to write dictionary-encoded pages #211
- Make IpcWriteOptions easier to find. #120
v0.2.0 (2021-07-30)
Breaking changes:
- Simplified `new` signature of growable API #238 (jorgecarleitao)
- Add support to merge sort with a limit #222 (sundy-li)
- Generalized sort to accept indices other than i32. #220 (jorgecarleitao)
- Added support for limited sort #218 (jorgecarleitao)
New features:
- Merge sort support limit option #221
- Introduce limit option to sort #215
- Added support for take of interval of days_ms #219 (jorgecarleitao)
- Added FFI for remaining types #213 (jorgecarleitao)
Fixed bugs:
- Filter operation on sliced utf8 arrays are incorrect #233
- Fixed error in slicing bitmap. #237 (jorgecarleitao)
- Fixed nested FFI. #212 (jorgecarleitao)
Enhancements:
- Avoid materialization of indices in filter_record_batch for single arrays #234
- Add integration tests for writing to parquet #80
- Short-circuited boolean evaluation in GrowableList #228 (ritchie46)
- Add extra inlining to speed up take #226 (Dandandan)
- Removed un-needed `unsafe` #225 (jorgecarleitao)
Documentation updates:
- Add documentation to guide #96
- Add git submodule command to correct the test doc #223 (sundy-li)
- Added badges to README #216 (sundy-li)
- Clarified differences with arrow crate #210 (alamb)
- Clarified differences with arrow crate #209 (alamb)
* This Changelog was automatically generated by github_changelog_generator