Releases · J535D165/recordlinkage

20 Jul 13:00

b93d976

A new release of recordlinkage after a long time (too long, I'm sorry). This release bumps the minor version to 0.16. This version supports pandas 2 and pandas 1. It doesn't contain any structural changes or improvements to the API.

What's Changed

Fix typo by @havardox in #184
Fix usage examples by @martinhohoff in #190
Fix links by @andyjessen in #186
add threshold None and label docstrings for String by @davidggphy in #189
Add support for pandas==2 by @J535D165 in #192
Replace setup.py by pyproject.toml by @J535D165 in #195
Lint with Ruff and format with Black by @J535D165 in #196
Update CI docs generation and CI pipeline by @J535D165 in #197
Update the docs CI pipeline by @J535D165 in #198
Add pre-commit hooks by @J535D165 in #199

New Contributors

@havardox made their first contribution in #184
@martinhohoff made their first contribution in #190
@andyjessen made their first contribution in #186
@davidggphy made their first contribution in #189

Full Changelog: v0.15...v0.16

Contributors

J535D165, havardox, and 3 other contributors


19 Apr 14:00

github-actions

v0.15

bd5cd08

Release v0.15 (19 Apr 2022)

Remove deprecated recordlinkage classes (#173)
Bump min Python version to 3.6, ideally 3.8+ (#171)
Bump min pandas version to >=1
Resolve deprecation warnings for numpy and pandas
Happy lint, sort imports, format code with yapf
Remove unnecessary np.sort in SNI algorithm (#141)
Fix bug for cosine and qgram string comparisons with threshold (#135)
Fix several typos in docs (#151)(#152)(#153)(#154)(#163)(#164)
Fix random indexer (#158)
Fix various deprecation warnings and broken docs build (#170)
Fix broken docs build due to pandas depr warnings (#169)
Fix broken build and removed warning messages (#168)
Update narrative
Replace Travis by GitHub Actions (#132)
Fix broken test NotFittedError
Fix bug in low memory random sampling and add more tests (#130)
Add extras_require to setup.py for deps management
Add banner to README and update title
Add Binder and Colab buttons at tutorials (#174)

Special thanks to Tomasz Waleń @twalen and other contributors for their work on this release.

Contributors

twalen


01 Dec 15:54

J535D165

v0.14

4590be3

Version 0.14 (1 Dec 2019)

Drop Python 2.7 and Python 3.4 support. (#91)
Upgrade minimal pandas version to 0.23.
Simplify the use of all cpus in parallel mode. (#102)
Store large example datasets in user home folder or use environment variable. Before, example datasets were stored in the package. (see issue #42) (#92)
Add support to write and read annotation files for recordlinkage ANNOTATOR. See the docs and https://github.com/J535D165/recordlinkage-annotator for more information.
Replace .labels by .codes for pandas.MultiIndex objects for newer versions of pandas (>0.24). (#103)
Fix totals for pandas.MultiIndex input on confusion matrix and accuracy metrics. (see issue #84) (#109)
Initialize Compare with (a list of) features (Bug). (#124)
Various updates in relation to deprecation warnings in third-party libraries such as sklearn, pandas and networkx.
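The .labels to .codes rename noted above affects any code that reads the integer level codes of a pandas.MultiIndex. A minimal sketch of a compatibility helper follows; the helper name `multiindex_codes` is hypothetical, this is not how recordlinkage handles the rename internally, and plain stand-in objects are used instead of a real pandas.MultiIndex so the sketch needs no third-party imports.

```python
from types import SimpleNamespace


def multiindex_codes(index):
    """Return the integer level codes of a MultiIndex-like object.

    Newer pandas exposes them as ``.codes``; older pandas used ``.labels``.
    (Hypothetical helper, for illustration only.)
    """
    codes = getattr(index, "codes", None)
    if codes is None:
        # Fall back to the older attribute name.
        codes = index.labels
    return codes


# Stand-ins for a real pandas.MultiIndex (assumption: only the attribute
# name differs between the old and new style).
new_style = SimpleNamespace(codes=[[0, 0, 1], [0, 1, 0]])
old_style = SimpleNamespace(labels=[[0, 0, 1], [0, 1, 0]])

print(multiindex_codes(new_style))
print(multiindex_codes(old_style))
```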


27 Mar 21:51

J535D165

v0.13.2

87a5f4a

Version 0.13.2 (27 Mar 2019)

Fix distribution problem.


15 Mar 15:07

J535D165

v0.13

702899e

Version 0.13 (15 Mar 2019)

Resolve conflict with threshold and missing value (#85)

Closes #70


04 Jan 15:34

J535D165

v0.11.2

2f259de

Version 0.11.2 (4 Jan 2018)

Minor installation improvement. Exclude unwanted files


04 Jan 15:26

J535D165

v0.11.1

18b9634

Version 0.11.1 (4 Jan 2018)

Fix installation issue. Submodule 'preprocessing' was not added to the source distribution.


04 Jan 09:07

J535D165

v0.11.0

e5cf451

Version 0.11.0 (22 Dec 2017)

The submodule 'standardise' is renamed. The new name is 'preprocessing'. The submodule 'standardise' will be deprecated in a future version.
Deprecation errors were not visible to many users. In this version, the errors are more visible.
Improved and new logs for indexing, comparing and classification.
Faster comparing of string variables. Thanks Joel Becker.
Changes make it possible to pickle Compare and Index objects. This makes it easier to run code in parallel. Tests were added to ensure that pickling remains possible.
Important change. In previous versions, MultiIndex objects with many record pairs were automatically split into pieces to lower memory usage. In this version, this automatic splitting is removed. Please split the data yourself.
Integer indexing. Blog post will follow on this.
The metrics submodule has changed heavily. This breaks compatibility with the previous version.
repr() and str() now return informative descriptions of index and compare objects.
It is possible to use abbreviations for string similarity methods. For example 'jw' for the Jaro-Winkler method.
The FEBRL dataset loaders can now return the true links as a pandas.MultiIndex for each FEBRL dataset. This option is disabled by default. See the FEBRL datasets for details.
Fix issue with automatic recognition of license on GitHub.
Various small improvements.
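Since automatic splitting of large sets of record pairs was removed in this version, splitting is left to the caller. The following is a minimal sketch of one way to do it; the function name `split_pairs` and the `chunk_size` parameter are hypothetical and not part of recordlinkage. It assumes only that the pairs object supports `len()` and slicing, and a plain list of tuples stands in for the candidate pairs here.

```python
def split_pairs(pairs, chunk_size):
    """Yield consecutive slices of ``pairs`` with at most ``chunk_size`` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(pairs), chunk_size):
        yield pairs[start:start + chunk_size]


# A plain list of (left_id, right_id) tuples stands in for the candidate pairs.
candidate_pairs = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0)]
chunks = list(split_pairs(candidate_pairs, chunk_size=2))
print([len(chunk) for chunk in chunks])  # [2, 2, 1]
```

A pandas.MultiIndex also supports `len()` and slicing, so the same generator should apply to real record pairs. Each chunk could then be compared separately and the results concatenated afterwards.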

Note: In the next release, the Pairs class will get removed. Migrate now.


28 Dec 12:58

J535D165

v0.10.1

b859926

Version 0.10.1 (30 Aug 2017)

Removed a print statement from the geo compare algorithm.
String, numeric and geo compare functions now raise directly when an incorrect algorithm name is passed.
Fix unit test that failed on Python 2.7.


28 Dec 12:58

J535D165

v0.10.0

a11b5ef

Version 0.10.0 (30 Aug 2017)

A new compare API. The new Compare class no longer takes the datasets and pairs as arguments. The actual computation is now performed when calling .compute(PAIRS, DF1, DF2). The documentation is updated as well, but still needs improvement.
Two new string similarity measures are added: Smith Waterman (smith_waterman) and Longest Common Substring (lcs). Thanks to Joel Becker and Jillian Anderson from the Networks Lab of the University of Waterloo.
Added and/or updated a large number of unit tests.
Various small improvements.
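To give a feel for the idea behind a longest-common-substring measure, here is a small pure-Python sketch using dynamic programming. It is an illustration only and not the library's lcs implementation, which may differ in how it normalises or repeats the matching. The function names and the scoring rule (match length divided by the longer string's length) are assumptions made for this example.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by ``a`` and ``b``."""
    best_len = 0
    best_end = 0  # end position (exclusive) of the best match in ``a``
    # previous[j]: length of the common suffix of a[:i-1] and b[:j]
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                current[j] = previous[j - 1] + 1
                if current[j] > best_len:
                    best_len = current[j]
                    best_end = i
        previous = current
    return a[best_end - best_len:best_end]


def lcs_similarity(a, b):
    """Hypothetical score in [0, 1]: match length over the longer string's length."""
    if not a or not b:
        return 0.0
    return len(longest_common_substring(a, b)) / max(len(a), len(b))


print(longest_common_substring("jonathan", "johnathan"))  # nathan
print(round(lcs_similarity("jonathan", "johnathan"), 3))  # 0.667
```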
