07 Mar 19:32

arupcsedu

2e95291

0.6.0 Latest

Cylon 0.6.0 is a major release. We are excited to present UCC and Gloo integration, along with more distributed operations.

Features

Cylon C++ and Python

Implementation of Slice, Head and Tail operations
Adding conda docker
UCC integration
Adding cylonflow as a submodule
Use generic operator
Summit fixes
Adding custom mpirun params CMake variable
Adding CMake parallelism flag
Gloo Python binding
Enabling Gloo CI
Add downloading Catch2 header dynamically
Distributed sort on CPU
Cylon Gloo integration
Adding distributed scalar aggregates
Extending datatypes
Allowing custom MPI_Comm for MPI
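
To make the distributed Slice, Head and Tail operations more concrete, here is a minimal plain-Python sketch of one plausible semantics: the table is treated as the concatenation of the per-rank partitions in rank order, and each rank keeps only the rows of its own partition that fall inside the requested global window. This is an illustration only and does not use the Cylon API; `local_slice`, `dist_head` and `dist_tail` are hypothetical helper names, and the `counts` list stands in for row counts that a real run would exchange between workers.

```python
from typing import List, Sequence


def local_slice(rows: Sequence, rank: int, counts: List[int],
                offset: int, length: int) -> list:
    """Rows of this rank's partition inside the global window
    [offset, offset + length), assuming partitions are ordered by rank."""
    start = sum(counts[:rank])  # global index of this rank's first row
    lo = max(offset, start)
    hi = min(offset + length, start + len(rows))
    if lo >= hi:
        return []  # the window does not touch this partition
    return list(rows[lo - start:hi - start])


def dist_head(rows: Sequence, rank: int, counts: List[int], n: int) -> list:
    """First n rows of the global table, as seen from one rank."""
    return local_slice(rows, rank, counts, 0, n)


def dist_tail(rows: Sequence, rank: int, counts: List[int], n: int) -> list:
    """Last n rows of the global table, as seen from one rank."""
    total = sum(counts)
    n = min(n, total)
    return local_slice(rows, rank, counts, total - n, n)


# Three simulated workers, each holding one partition of a 9-row table.
partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
counts = [len(p) for p in partitions]

head4 = [dist_head(p, r, counts, 4) for r, p in enumerate(partitions)]
tail3 = [dist_tail(p, r, counts, 3) for r, p in enumerate(partitions)]
middle = [local_slice(p, r, counts, 2, 4) for r, p in enumerate(partitions)]
```

In this sketch the result stays partitioned: a rank whose rows fall outside the window simply ends up with an empty partition.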

Build

Updating to Arrow 9.0.x
Windows build support
MacOS build support
Conda build is the default build
Improving docker build

You can download source code from GitHub
Conda binaries are available in Anaconda

Commits

91bdd54 Update conda-actions.yml (#645)
d1739ed Added buildable instructions for Rivanna (#643)
d9a6420 Arrow 9.0.0 and gcc-11 update (#601)
4c867b1 Summit Fixes (#623)
7f8a3b1 Fixing sample bug (#631)
ce12454 Cython binding for slice, head and tail (#619)
ef4c904 #610: SampleArray util method replaced by using arrow::compute::Take … (#612)
4694a9e Minor fixes (#608)
121b386 Fixing: Corrupted result when joining tables contain list data types #615 (#616)
68fa598 Summit fixes (#607)
de3ec7b fixing bash splitting (#606)
0a489fc adding cmake parallelism flag (#605)
035fd70 Implement Slice, Head and Tail Operation in both centralize and distr… (#592)
d99a6f2 adding custom mpirun params cmake var (#604)
f20c119 Update README-summit.md (#603)
4bc27f9 Create README-summit.md (#602)
e6b7306 Minor fixes (#596)
2e6ac80 adding conda docker (#600)
4dd359f Ucc integration (#591)
61b4a82 adding cylonflow as a submodule (#593)
e4dd38b Use generic operator (#583)
6c0dfa8 Gloo python binding (#587)
773f11f Gloo python bindings (#585)
2fc95be Add downloading catch2 header dynamically (#584)
c56ab2d Enabling gloo CI (#582)
a820ed8 Dist sort cpu (#574)
f68cc62 Adding UCC build (#579)
2759a30 Cylon Gloo integration (#576)
b2c0820 Adding distributed scalar aggregates (#570)
9c2fdc4 Extending datatypes (#568)
e3d553c Bump ua-parser-js from 0.7.22 to 0.7.31 in /docs (#566)
3bafb75 Bump ssri from 6.0.1 to 6.0.2 in /docs (#565)
814a463 minor fixes (#564)
be92253 Bump lodash from 4.17.20 to 4.17.21 in /docs (#561)
e87dd7c Bump shelljs from 0.8.4 to 0.8.5 in /docs (#562)
71bd8bf Bump nanoid from 3.1.22 to 3.2.0 in /docs (#563)
49b343d Allowing custom MPI_Comm for MPI (#559)
fa52dd4 Update contributors.md
54d4a53 added io functions (#550)
1a8c3d7 Fixing 554 (#558)
887ea18 update arrow link (#557)
1ce4c6b Fixing 552 (#553)
f5e31a1 Merging 0.5.0 release (#547)

Contributors

Ahmet Uyar
Chathura Widanage
Damitha Sandeepa Lenadora
dependabot[bot]
Hasara Maithree
Kaiying Shan
niranda perera
Supun Kamburugamuve
Vibhatha Lakmal Abeykoon
Ziyao22
Arup Kumar Sarker
Mills Wellons Staylor
Gregor von Laszewski

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

16 Dec 19:29

nirandaperera

0.5.0

2b74ade

0.5.0

Cylon 0.5.0 is a major release. We are excited to present GCylon, a cuDF-based distributed
DataFrame for NVIDIA GPUs, along with UCX integration, Anaconda support, and much more.

Features

Cylon C++ and Python

Adding UCX integration with MPI
Adding read distribution
Changing join column naming convention to match SQL and pandas
Adding DataFrame.applymap and DataFrame.isin
Add iloc operation to DataFrame
Adding null handling to table operators and comparators
Adding Equal and distributed Equal operators
Adding array flattening
Adding Repartition
Adding mapreduce style group-by aggregators
Adding table level AllGather, Gather and Broadcast operators
Performance improvements and bug fixes
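
As a rough illustration of what a mapreduce-style group-by aggregation looks like, the sketch below simulates the three usual phases in plain Python: a local combine on each worker, a shuffle that routes every key to one owning worker, and a final reduce. It is not Cylon code; `local_combine`, `shuffle` and `reduce_mean` are hypothetical names, and the (sum, count) intermediate state for a mean is an assumption chosen to keep the example short.

```python
from collections import defaultdict


def local_combine(rows):
    """Combine step: per-key partial state [sum, count] on one worker."""
    state = defaultdict(lambda: [0.0, 0])
    for key, value in rows:
        state[key][0] += value
        state[key][1] += 1
    return dict(state)


def shuffle(partials, world_size):
    """Route each key's partial state to the worker that owns the key."""
    inbox = [defaultdict(list) for _ in range(world_size)]
    for partial in partials:
        for key, st in partial.items():
            inbox[hash(key) % world_size][key].append(st)
    return inbox


def reduce_mean(received):
    """Reduce step: merge partial states and finalize the mean per key."""
    out = {}
    for key, states in received.items():
        total = sum(s for s, _ in states)
        count = sum(c for _, c in states)
        out[key] = total / count
    return out


# Two simulated workers, each holding (key, value) rows.
workers = [
    [("a", 1.0), ("b", 2.0), ("a", 3.0)],
    [("a", 5.0), ("b", 4.0)],
]
partials = [local_combine(rows) for rows in workers]

results = {}
for received in shuffle(partials, world_size=2):
    results.update(reduce_mean(received))
```

Combining locally before the shuffle means only one small state per key and worker crosses the network, rather than every row.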

Build

Updating to Arrow 5.0.x
Windows build support
MacOS build support
Conda build is the default build
Improving docker build

GCylon

First release of GCylon, which supports distributed DataFrame processing on NVIDIA GPUs using cuDF:

Implemented shuffling and distributed sorting
Distributed Join/merge
Distributed GroupBy
DataFrame Set operations
Repartitioning DataFrames
Distributed IO for reading/writing CSV, JSON and Parquet files

You can download source code from GitHub
Conda binaries are available in Anaconda

Commits

3344bf9 Mapreduce style group-by aggregators (#535)
50ef890 Remove minor warnings (#544)
559e8eb Adding CPU serializer (#539)
abb4404 fixed unused variable/parameter and casting warnings (#542)
62a3f08 Distributed IO (#533)
15d06d6 Bump color-string from 1.5.4 to 1.7.4 in /docs (#534)
810c4ed fixing RNG issue (#538)
fbb049b fixing build error (#536)
a10e052 Bump algoliasearch-helper from 3.3.3 to 3.6.2 in /docs (#532)
112ea97 Repartition - CPU (#526)
79c4b73 create a MacOS yml file (#530)
b9e7a8c Repartition - GPU (#528)
2191b9f fixed function name change in cudf api from gcylon test files (#529)
3e9036e Upgrading to arrow 5.0.0 (#525)
24d182a Groupby values null handling (#527)
54a5074 Null handling for Comparators (#524)
0b9516e Adding array flattening (#522)
b3fc2a2 Implemented MergeOrSort when merging sorted tables (#523)
1e061b2 Feature/equal (#499)
e378d1d reformatted gcylon codes with tab size 2, non-functional changes (#521)
8450d9b Added support for sliced tables in gather, broadcast and sorting (#520)
92b8124 Update windows.yml
1f9790d Update macos.yml
d33f9ac Update conda-actions.yml
963d491 Update c-cpp.yml
2229981 added mpi datatype dispatching for primitive data types (#519)
d9936b4 Head tail operators (#512)
ac99d00 Formatting code (#518)
fff84cc Code formatting (#517)
f32f04d Null handling in splitters and build arrays (#511)
4cab7ca Delete files from CPP example folder that are not needed (#516)
d174430 moving tutorial repo to (#514)
9cd7911 Python example cleanup (#513)
fe4caf3 Distributed sorting (#510)
2302f58 Minor improvements to the Table API (#508)
71eb80a adding new test utils (#507)
24b83dd Adding to docker docs (#498)
6f2faf8 Update conda.md
4f8f3c7 Gcylon docs (#501)
a786258 Adding contributing guide to documentation (#496)
8ab8b2d changing join column naming convention to match SQL and pandas (#487)
f18b91f improvements to ucx build from conda (#484)
912fb54 Windows build (#482)
216758a making improvements to the build (#483)
4e2894e Add functions to dataframe (#481)
1f1ddd9 Documentation update (#479)
e623315 Bump tar from 6.1.5 to 6.1.11 in /docs (#477)
1e5db7b improve docs (#476)
58c0595 removing extra examples (#474)
3c823f6 Gcylon integration (#470)
92748eb Cpp example cleanup (#475)
fa14527 Docs improvements (#469)
1306220 Bump url-parse from 1.4.7 to 1.5.3 in /docs (#473)
8234ae7 Bump path-parse from 1.0.6 to 1.0.7 in /docs (#472)
c8b435b Bump tar from 6.0.5 to 6.1.5 in /docs (#471)
1cc28dd Performance improvements (#453)
9092bbf MacOS build (#464)
d59d91e Add iloc operation to DataFrame (#465)
8d7a8dc Removed glog files from the header files (#463)
ea62eef License updates (#462)
2f56265 changed all relative Cylon header references to global (#461)
123c93c Building in conda env without using conda-build (#457)
3b3a285 Compilation document improvements (#454)
8578b1f Adding barrier at the end of the test case (#458)
e6eded5 Fix for empty df (#455)
8f14992 Fixed mpi test case (#456)
cb06998 Changes to the Docs (#451)
4ce1d7e updates to the docker readme
e011e0f enhancing readme
adfa6c0 adding read distribution (#432)
bd2e024 UCX integration (#439)
a42d04a Bump ws from 6.2.1 to 6.2.2 in /docs (#437)
710b562 Bump dns-packet from 1.3.1 to 1.3.4 in /docs (#435)
07aee74 adding new operators to DataFrame API (#429)
71e57f8 Updating to arrow 4.0 (#418)
a490dc2 changing ctx to const reference in methods (#419)
18a5447 missing docs (#428)
38534f5 0.4.1 release (#427)
10f5a6a Enabling scalars in df set_item (#425)
0be7897 Op bench refactor (#417)
ec964d8 Bug fixes in dataframe (#420)
e0ba964 Update c-cpp.yml
0200c02 adding finalize check and removing destructor finalize call. (#412)
149919c Update README.md
016c5c9 adding missing test case
5609535 Update README.md
e3ca0bf 0.4.0 release (#411)

Contributors

Ahmet Uyar
Chathura Widanage
Damitha Sandeepa Lenadora
dependabot[bot]
Hasara Maithree
Kaiying Shan
niranda perera
Supun Kamburugamuve
Vibhatha Lakmal Abeykoon
Ziyao22

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

04 May 16:28

chathurawidanage

0.4.1

1476926

0.4.1

Cylon 0.4.1 is a bug fix release.

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

21 Apr 16:49

chathurawidanage

0.4.0

76d150e

0.4.0

Cylon 0.4.0 is a major release with the following features.

Major Features

Python

DataFrame API similar to Pandas supporting around 40 operators commonly used in Pandas.
Conda build and Conda-based binaries for installation on Linux.
Python binding to all the operators added on the C++ level.
Providing compute functions with both Arrow and Numpy for filtering, math operations and comparison operators.
Added operator benchmarks.
Added new options for CSV reading supporting all the options in PyArrow for reading CSV.

C++

Added distributed multi-column operations on tables for join, union, intersection, set difference and sort.
Added improved hash operations using Bytell hash maps. Performance improved 2x for union, intersection, set difference and unique.
Added new aggregate operations for GroupBy operation (Mean, Variance, Std Dev, Quantile, NUnique, Median).
Implemented GroupBy aggregators using CRTP (curiously recurring template pattern).
Improved indexing at the core by adding more types and improving the performance of indexed lookups.
Added unique distributed operator.
Added temporal data types like DateTime, Date32 (days resolution), Date64 (milliseconds resolution) and Timestamp (with time zone information).
Other performance improvements and bug fixes.
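
For readers unfamiliar with the aggregates listed above, the following plain-Python sketch shows what a group-by with several of them computes. It uses only the standard library, not Cylon; `groupby_aggregate` is a hypothetical helper, the sample (n - 1) convention for variance and standard deviation is an assumption, and Quantile is left out to keep the sketch short.

```python
import statistics
from collections import defaultdict


def groupby_aggregate(rows):
    """Group (key, value) pairs and compute several aggregates per key."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)

    result = {}
    for key, values in groups.items():
        many = len(values) > 1
        result[key] = {
            "mean": statistics.mean(values),
            # Sample variance/std (n - 1 denominator). This is an assumption;
            # the convention used by Cylon may differ.
            "variance": statistics.variance(values) if many else 0.0,
            "std": statistics.stdev(values) if many else 0.0,
            "median": statistics.median(values),
            "nunique": len(set(values)),
        }
    return result


rows = [("x", 1.0), ("y", 2.0), ("x", 2.0), ("x", 3.0), ("y", 2.0), ("x", 4.0)]
stats = groupby_aggregate(rows)
```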

Build

Compiling using an external Apache Arrow installation (local or pip).

Applications and Benchmarks

Implemented a subset of TPCx-BB queries (Queries 6, 7, 9, 14, 22, 23); the rest are in progress.
Applications with connections to deep learning.

You can download source code from GitHub

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

18 Dec 04:22

nirandaperera

v0.3.1

d153525

0.3.1

Cylon 0.3.1 is a bug fix release.

You can download source code from GitHub

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

12 Dec 15:15

nirandaperera

v0.3.0

c944d2d

0.3.0

Cylon 0.3.0 adds the following features. Please note that this release may not be backward
compatible with previous releases.

Major Features

C++

Adding order-by and distributed table sort operations
Multiple partitioning schemes (modulo, hash, and range)
C++ API refactoring
Performance improvements in the existing C++ API
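
The three partitioning schemes named above (modulo, hash, and range) can be sketched in a few lines of plain Python. This is a conceptual illustration rather than Cylon's implementation; the function names and the `split_points` parameter are hypothetical.

```python
from bisect import bisect_right
from typing import Hashable, Sequence


def modulo_partition(key: int, num_partitions: int) -> int:
    """Integer keys only: the target partition is key mod num_partitions."""
    return key % num_partitions


def hash_partition(key: Hashable, num_partitions: int) -> int:
    """Any hashable key: hash it first, then take the modulo."""
    return hash(key) % num_partitions


def range_partition(key, split_points: Sequence) -> int:
    """split_points is a sorted list of boundaries; partition i receives
    the keys between boundary i - 1 (inclusive) and boundary i (exclusive)."""
    return bisect_right(split_points, key)


keys = [3, 14, 27, 8, 21, 10]

by_modulo = [modulo_partition(k, 3) for k in keys]
by_range = [range_partition(k, [10, 20]) for k in keys]
by_hash = [hash_partition(str(k), 3) for k in keys]
```

Range partitioning keeps keys ordered across partitions (everything in partition 0 is smaller than everything in partition 1), which is why it is the natural companion to a distributed sort, while modulo and hash partitioning only guarantee that equal keys land together.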

Python (Pycylon)

Exposing table operators similar to Pandas (28 new operators).
- Comparison operators
- Logical Operators
- Math operators
- Null/NA value filtering and filling
- Filtering and updating (including inplace ops)
- Schema refactoring
- Experimental indexing abstraction
Distributed data sorting Python bindings
Adding new examples for updated operations. (https://github.com/cylondata/cylon/tree/master/python/examples)

You can download source code from GitHub

Examples

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

20 Oct 00:57

nirandaperera

0.2.0

26cf812

0.2.0

Cylon 0.2.0 adds the following features. Please note that this release may not be backward
compatible with v0.1.0.

Major Features

C++

Adding aggregates and group-by API
Creating tables using std::vectors or cylon::Columns
C++ API refactoring
Major performance improvements in the existing C++ API

Python (Pycylon)

Extending the Cython API so that other Cython/Python libraries can build on it
Aggregates and Groupby addition
Column name-based relational algebra operations and aggregate/groupby ops addition
Major performance improvements in the existing Python API

Java (JCylon)

Performance improvements

You can download source code from GitHub

Examples

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

18 Jul 01:01

nirandaperera

0.1.0

de48798

Cylon Release 0.1.0

Cylon 0.1.0 is the first open-source public release of the Cylon project. We are excited to bring a high-performance
data engineering toolkit that can work as a library as well as a standalone framework. This is the first step towards building a complete toolkit designed to work with AI/ML systems and integrate with data processing systems, with the
vision of "data engineering everywhere".

You can download source code from GitHub

Who should use Cylon?

Users of Pandas DataFrames or SQL interfaces
Those needing parallel data engineering
Those needing Python, C++ and Java interoperability
HPC Python (Dask) and Big Data (Kubernetes) environments

Major Features in v0.1.0

Introducing the Cylon C++ engine based on Apache Arrow.
Cylon C++, Python (PyCylon) and Java language bindings
Seamless integration with Pandas and NumPy
Distributed operations using MPI
Local and distributed operations (Select, Project, Joins, Intersection, Union, Subtract)
Jupyter notebook support and experimental Google Colab support
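
For readers new to the relational operations listed above, here is a small plain-Python sketch of what Select, Project, Join, Union, Intersection and Subtract compute on in-memory rows. It does not use the Cylon API; the helper names are hypothetical, and treating Union, Intersection and Subtract with set semantics (duplicate rows removed) is an assumption made for the illustration.

```python
def select(rows, predicate):
    """Keep the rows for which the predicate holds."""
    return [r for r in rows if predicate(r)]


def project(rows, columns):
    """Keep only the named columns of every row."""
    return [{c: r[c] for c in columns} for r in rows]


def inner_join(left, right, key):
    """Hash join: index the right table by key, then probe with the left."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


people = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}, {"id": 3, "name": "cy"}]
orders = [{"id": 1, "item": "disk"}, {"id": 3, "item": "ram"}, {"id": 3, "item": "cpu"}]

joined = inner_join(people, orders, "id")
recent = project(select(joined, lambda r: r["id"] > 1), ["name", "item"])

# Set-style operations on whole rows, with rows represented as tuples.
a = {(1, "x"), (2, "y")}
b = {(2, "y"), (3, "z")}
union, intersection, subtract = a | b, a & b, a - b
```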

Examples

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

Releases: cylondata/cylon
