Releases: EntilZha/PyFunctional
Release 1.4.3
1.0.0
Reaching 1.0 primarily means that API stability has been reached, so I don't expect to run into many new breaking changes. The library is also relatively feature complete at this point, almost two years after the first commit (February 5, 2015).
This release includes several new minor features and usability improvements in Jupyter notebook environments.
New Features
- Added optional initial value for reduce (#86)
- Added table of contents to readme (#88)
- Added data interchange tutorial with pandas (https://github.com/EntilZha/PyFunctional/blob/master/examples/PyFunctional-pandas-tutorial.ipynb)
- Implemented itertools.starmap as Sequence.starmap and Sequence.smap (#90)
- Added interface to csv.DictReader via seq.csv_dict_reader (#92)
- Improved _html_repr_, show, and tabulate by auto detecting named tuples as column names (#91)
- Improved _html_repr_ and show to tell the user 10 of N rows are being shown if there are more than 10 rows (#94)
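The reduce and starmap additions mirror standard-library behavior. As a minimal sketch that uses only functools and itertools (so it makes no assumptions about PyFunctional's exact signatures), the semantics look roughly like this:

```python
from functools import reduce
from itertools import starmap

numbers = [1, 2, 3, 4]

# An initial value seeds the accumulator and is what an empty input returns.
total = reduce(lambda acc, x: acc + x, numbers, 10)
empty_total = reduce(lambda acc, x: acc + x, [], 0)

# starmap unpacks each tuple into positional arguments of the function.
pairs = [(2, 3), (-2, 1), (0, 10)]
sums = list(starmap(lambda x, y: x + y, pairs))

print(total)        # 20
print(empty_total)  # 0
print(sums)         # [5, -1, 10]
```

Without an initial value, functools.reduce raises TypeError on an empty input, which is the usual motivation for making the initial value available.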
Dependencies and Supported Python Versions
- Bumped version dependencies (#89)
- Added Python 3.6 via Travis CI testing
0.7.1
0.7.0
New Features
- Auto parallelization by using pseq instead of seq. Details at #47
- Parallel functions: map, select, filter, filter_not, where, flatten, and flat_map
- Compressed file IO support for gzip/lzma/bz2 as detailed at #54
- Cartesian product from itertools.product implemented as Pipeline.cartesian
- Website at pyfunctional.org and docs at docs.pyfunctional.org
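To illustrate what compressed file IO for gzip/lzma/bz2 involves, here is a minimal standard-library sketch. The helper names open_text and roundtrip, and the choice to pick a codec by file extension, are assumptions made for illustration; the library may detect compression differently.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Hypothetical mapping from file extension to opener; not PyFunctional's code.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_text(path, mode="rt", encoding="utf-8"):
    """Open path as text, decompressing when the extension is recognized."""
    _, ext = os.path.splitext(path)
    opener = OPENERS.get(ext, open)
    return opener(path, mode, encoding=encoding)


def roundtrip(suffix, lines):
    """Write lines to a temporary file with the given suffix and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.txt" + suffix)
        with open_text(path, "wt") as handle:
            handle.write("\n".join(lines))
        with open_text(path) as handle:
            return handle.read().splitlines()


for suffix in ("", ".gz", ".bz2", ".xz"):
    print(suffix or "(plain)", roundtrip(suffix, ["a", "b", "c"]))
```

All three standard-library modules expose an open function that accepts text modes, which is what lets a single code path serve plain and compressed files.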
Bug Fixes
- No option for encoding in to_json (#70)
Internal Changes
- Pinned versions of all dependencies
Contributors
0.6.0
The largest changes in this release are adding SQLite support and changing the project name to PyFunctional.
Name Change
Details can be found in the RFC issue. On PyPI, 0.6.0 was published as both PyFunctional and ScalaFunctional to support the transition to the new name. Overall, the name change better suits the package, as it is about functional programming with Python, even if it is inspired by Scala/Spark.
New Features
- Added support for reading from and writing to SQLite databases with seq.sqlite3
- Added to_pandas call integration
Internal Changes
- Changed code quality check service
Release 0.5.0
Release 0.5.0 groups a few new features and bug fixes into a release.
Breaking Changes
Sequence.zip_with_index has modified behavior to extend usability and conform to the Scala/Spark APIs, which breaks prior compatibility. The drop-in replacement for code bases upgrading to 0.5.0 is changing zip_with_index to enumerate.
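The difference can be sketched with plain Python. This assumes the new behavior follows Scala's zipWithIndex, which pairs each element with its index (element first); the helper below is a hypothetical stand-in, not the library's code.

```python
def zip_with_index(iterable, start=0):
    """Hypothetical helper: Scala-style pairs of (element, index)."""
    return [(element, index) for index, element in enumerate(iterable, start)]


letters = ["a", "b", "c"]

# Scala-style ordering: element first, index second.
print(zip_with_index(letters))   # [('a', 0), ('b', 1), ('c', 2)]

# Python's enumerate ordering: index first, element second.
print(list(enumerate(letters)))  # [(0, 'a'), (1, 'b'), (2, 'c')]
```

Code that unpacks pairs as (index, element) is what breaks under the Scala-style ordering, which is why switching to enumerate restores the old shape.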
New Features
- Delimiter option on to_file
- Sequence.sliding for sliding windows over sequence of elements
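As a rough illustration of sliding windows, the sketch below uses a hypothetical standalone function. Its parameter names (size, step) and its choice to emit only full windows are assumptions, so Sequence.sliding may differ in its edge cases.

```python
def sliding(items, size, step=1):
    """Hypothetical helper: yield full windows of `size`, advancing by `step`."""
    items = list(items)
    for start in range(0, len(items) - size + 1, step):
        yield items[start:start + size]


windows = list(sliding([1, 2, 3, 4, 5], size=3))
strided = list(sliding([1, 2, 3, 4, 5], size=2, step=2))

print(windows)  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(strided)  # [[1, 2], [3, 4]]
```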
Internal Changes
- Changed relative imports to absolute imports
Bug Fixes
- _wrap incorrectly converted tuples to arrays
- to_file documentation fixed
- Prior mentioned zip_with_index in breaking changes
Changelog: https://github.com/EntilZha/ScalaFunctional/blob/master/CHANGELOG.md
Milestone: https://github.com/EntilZha/ScalaFunctional/milestones/0.5.0
Release 0.4.1: File reading, writing, and LINQ
The primary goals of this release were to:
- Support reading and writing data from files in common formats
- Improve LINQ support
Reading and Writing text, json, jsonl, and csv
The large feature additions of this release include functions to natively read and write text, json, jsonl, and csv files. Details on the issue can be found at #19. The examples on the README.md page illustrate how these can be used and why they are useful. A full list of changes can be found in CHANGELOG.md or the copy of it at the bottom of the release notes.
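For readers unfamiliar with jsonl, the format stores one JSON document per line. A minimal standard-library sketch of a round trip follows; dump_jsonl and load_jsonl are hypothetical helper names and do not reflect how the library implements its readers and writers.

```python
import io
import json

records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]


def dump_jsonl(rows, handle):
    """Write each row as one JSON document per line."""
    for row in rows:
        handle.write(json.dumps(row) + "\n")


def load_jsonl(handle):
    """Parse one JSON document per non-empty line."""
    return [json.loads(line) for line in handle if line.strip()]


buffer = io.StringIO()  # in-memory stand-in for a file on disk
dump_jsonl(records, buffer)
buffer.seek(0)
loaded = load_jsonl(buffer)

print(loaded == records)  # True
```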
LINQ
While doing research, I found that a common use case where ScalaFunctional could be helpful is LINQ-like data manipulation. To better serve this group of users, functions like select and where were added, and documentation was improved to cover this use case.
Breaking Changes
The bug detailed at #44 exposed that fold_left and fold_right were using the passed function incorrectly. This was corrected, but it is a breaking change relative to all prior versions.
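Argument order matters for any non-commutative function, which is why this fix is breaking. The sketch below follows the Scala convention (accumulator first for a left fold, element first for a right fold). That convention is an assumption here, since these notes do not spell out the corrected order, and the functions are hypothetical stand-ins.

```python
def fold_left(items, zero, func):
    """Fold from the left; func receives (accumulator, element)."""
    acc = zero
    for item in items:
        acc = func(acc, item)
    return acc


def fold_right(items, zero, func):
    """Fold from the right; func receives (element, accumulator)."""
    acc = zero
    for item in reversed(list(items)):
        acc = func(item, acc)
    return acc


numbers = [1, 2, 3]

# ((0 - 1) - 2) - 3
left = fold_left(numbers, 0, lambda acc, x: acc - x)
# 1 - (2 - (3 - 0))
right = fold_right(numbers, 0, lambda x, acc: x - acc)

print(left)   # -6
print(right)  # 2
```

Swapping the two parameters of either lambda changes the result, so code written against the old argument order needs to be reviewed when upgrading.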
0.4.1 enum34 Removed
In the release of 0.4.0, an issue was found where the wheel built with Python 2 contained enum34, which broke the Python 3 installation. If it were built with Python 3, it would not include enum34, causing problems with Python 2. The solution was to remove enum34 and use vanilla Python instead.
Changelog
Release 0.4.0
New Features
- Official and tested support for python 3.5. Thus ScalaFunctional is tested on Python 2.7, 3.3, 3.4, 3.5, pypy, and pypy3
- aggregate from LINQ
- order_by from LINQ
- where from LINQ
- select from LINQ
- average from LINQ
- sum modified to allow LINQ projected sum
- product modified to allow LINQ projected product
- seq.jsonl to read jsonl files
- seq.json to read json files
- seq.open to read files
- seq.csv to read csv files
- seq.range to create range sequences
- Sequence.to_jsonl to save jsonl files
- Sequence.to_json to save json files
- Sequence.to_file to save files
- Sequence.to_csv to save csv files
- Improved documentation with more examples and mention LINQ explicitly
- Change PyPi keywords to improve discoverability
- Created Google groups mailing list
- Change PyPi keywords to improve discoverability
- Created Google groups mailing list
Bug Fixes
- fold_left and fold_right had incorrect order of arguments for passed function
Release 0.4.1
Fixed the Python 3 build error caused by wheel installation of enum34. The package no longer depends on enum34.
Contributors
Thank you to adrian17 for contributing seq.range to the release.
Release 0.4.0
Refer to the release notes for 0.4.1 for a summary of changes in 0.4.0. Both versions are nearly identical, with 0.4.1 being a hotfix for a pip install issue on Python 3.
Release 0.3.1: Addition of distinct_by
This is a very minor release which adds distinct_by to the API. distinct_by takes a single identity function as its argument. The returned sequence is unique by the identity function and consists of the first element found for each identity key. Code example below:
from functional import seq
seq([(1, 2), (1, 3), (2, 3), (4, 5), (0, 1), (0, 0)]).distinct_by(lambda x: x[0])
# [(0, 1), (1, 2), (2, 3), (4, 5)]
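The "first element found for each identity key" rule can be sketched in plain Python as follows. The function is a hypothetical stand-in, and it returns results in first-seen order, which may differ from the ordering printed above.

```python
def distinct_by(items, key):
    """Hypothetical helper: keep the first item seen for each key."""
    seen = {}
    for item in items:
        # setdefault stores the item only if the key is not present yet.
        seen.setdefault(key(item), item)
    return list(seen.values())


pairs = [(1, 2), (1, 3), (2, 3), (4, 5), (0, 1), (0, 0)]
result = distinct_by(pairs, lambda x: x[0])

print(result)  # [(1, 2), (2, 3), (4, 5), (0, 1)]
```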
Release 0.3.0: Lineage Performance and Stable API
The primary goal of this release was to improve the performance of longer data pipelines. There were also several API additions and several minor breaking changes.
Performance Improvements
The largest under-the-hood change is making all operations lazy by default. 0.2.0 calculates a new list at every transformation. Laziness was initially implemented using generators, but this could lead to unexpected behavior. The problem with this approach is highlighted in #20. Code sample below:
from functional import seq
def gen():
    for e in range(5):
        yield e
nums = gen()
s = seq(nums)
s.map(lambda x: x * 2).sum()
# prints 20
s.map(lambda x: x * 2).sum()
# prints 0
s = seq([1, 2, 3, 4])
a = s.map(lambda x: x * 2)
a.sum()
# prints 20
a.sum()
# prints 0
Either ScalaFunctional would need to aggressively cache results, or a new approach was needed. That approach is called lineage. The basic concept is that ScalaFunctional:
- Tracks the most recent concrete data (eg list of objects)
- Tracks the list of transformations that need to be applied to the list to find the answer
- Whenever an expression is evaluated, the result is cached for (1) and returned
As a result, the problems above are fixed. Below is an example showing how the backend calculates results:
from functional import seq
In [8]: s = seq(1, 2, 3, 4)
In [9]: s._lineage
Out[9]: Lineage: sequence
In [10]: s0 = s.map(lambda x: x * 2)
In [11]: s0._lineage
Out[11]: Lineage: sequence -> map(<lambda>)
In [12]: s0
Out[12]: [2, 4, 6, 8]
In [13]: s0._lineage
Out[13]: Lineage: sequence -> map(<lambda>) -> cache
Note how initially, since the expression is not evaluated, it is not cached. Since printing s0 in the repl calls __repr__, it is evaluated and cached so it is not recomputed if s0 is used again. You can also call cache() directly if desired. You may also notice that seq can now take a list of arguments like list (added in #27).
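The lineage idea can be sketched in a few lines of plain Python. The class below is a hypothetical, heavily simplified model (the name LazySeq and its methods are illustrative only); it is not the library's implementation.

```python
class LazySeq:
    """Hypothetical lineage tracker: concrete data plus pending transformations."""

    def __init__(self, base, names=("sequence",), pending=()):
        self._base = list(base)        # most recent concrete data
        self._names = list(names)      # human-readable lineage
        self._pending = list(pending)  # transformations not yet applied

    def map(self, func):
        # Record the step instead of computing it right away.
        step = lambda data: [func(x) for x in data]
        return LazySeq(self._base, self._names + ["map"], self._pending + [step])

    def to_list(self):
        # Apply pending steps once, then cache the concrete result.
        if self._pending:
            data = self._base
            for apply_step in self._pending:
                data = apply_step(data)
            self._base, self._pending = data, []
            self._names.append("cache")
        return self._base

    def sum(self):
        return sum(self.to_list())

    def lineage(self):
        return "Lineage: " + " -> ".join(self._names)


doubled = LazySeq(e for e in range(5)).map(lambda x: x * 2)
before = doubled.lineage()
first = doubled.sum()
second = doubled.sum()  # reuses the cached list instead of an exhausted generator
after = doubled.lineage()

print(before)         # Lineage: sequence -> map
print(first, second)  # 20 20
print(after)          # Lineage: sequence -> map -> cache
```

Because the concrete data is materialized once and transformations are replayed from it, repeated evaluation gives the same answer, unlike the generator-based version.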
Next up
Next up are improvements in documentation and a redo of README.md. The next release will focus on extending ScalaFunctional to work with other data input/output and on more usability improvements. This release also marks relative stability in the collections API. Everything that seemed worth porting from Scala/Spark has been completed, with a few additions (predominantly left, right, inner, and outer joins). There aren't currently any foreseeable breaking changes.