v1.9.0 - 2024-01-11
This release makes a number of improvements. It introduces a new concept to the metadata known as column relationships! Column relationships can be used to define when certain groups of columns in a table should be treated as a special concept (eg. address). You can add a column relationship by using the new add_column_relationship method. The metadata detection was also improved by allowing semantic sdtypes (eg. 'email', 'phone_number') to be detected as primary keys.
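To make the idea of a column relationship concrete, here is a minimal, library-free sketch of metadata that groups several columns under one concept. It is an illustration only, not SDV's implementation: the helper record_column_relationship, the table layout, the column names, and the validation rules (columns must exist, and a column may belong to at most one relationship) are all assumptions made for this example.

```python
from typing import List


def record_column_relationship(metadata: dict, relationship_type: str,
                               column_names: List[str]) -> None:
    """Hypothetical helper: mark a group of columns as one higher-level concept."""
    # Assumed rule: every referenced column must already be described in the metadata.
    missing = [name for name in column_names if name not in metadata["columns"]]
    if missing:
        raise ValueError(f"Unknown columns: {missing}")

    # Assumed rule: a column may take part in at most one relationship.
    relationships = metadata.setdefault("column_relationships", [])
    used = {name for rel in relationships for name in rel["column_names"]}
    overlap = [name for name in column_names if name in used]
    if overlap:
        raise ValueError(f"Columns already in a relationship: {overlap}")

    relationships.append({"type": relationship_type,
                          "column_names": list(column_names)})


# Illustrative single-table metadata; the column names are made up.
metadata = {
    "columns": {
        "guest_id": {"sdtype": "id"},
        "street": {"sdtype": "street_address"},
        "city": {"sdtype": "city"},
        "zip": {"sdtype": "postcode"},
    },
    "primary_key": "guest_id",
}

# Treat the three location columns as a single "address" concept.
record_column_relationship(metadata, "address", ["street", "city", "zip"])
```

In SDV itself the equivalent step goes through the add_column_relationship method on the single table metadata; check the SDV documentation for its exact arguments and for which relationship types are supported.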
This release also patches some bugs. An issue messing up the likelihood matching in the HMASynthesizer was resolved. The CTGANSynthesizer no longer fails when using the FixedCombinations constraint. The Inequality constraint was also patched to handle datetimes better.
Deprecations
- The set_address_columns method is deprecated in favor of add_column_relationship.
New Features
- Improve error messages for composite keys - Issue #1684 by @frances-h
- Add column relationship validation to single table metadata - Issue #1698 by @frances-h
- Add add_column_relationship method to single table metadata - Issue #1699 by @frances-h
- Make synthesizers work with column_relationships - Issue #1700 by @R-Palazzo
- Metadata auto-detection should find primary keys of semantic sdtypes - Issue #1724 by @fealho
Bugs Fixed
- InvalidDataError for Inequality constraint (even though data is valid) - Issue #1692 by @fealho
- BaseIndependentSampler crashes because it tries to cast id columns - Issue #1712 by @pvk-developer
- KeyError in CTGANSynthesizer when applying FixedCombinations constraint - Issue #1717 by @pvk-developer
- Fix _get_likelihoods not generating likelihood values - Issue #1720 by @frances-h