v1.9.0 - 2024-01-11

@amontanez24 released this 11 Jan 19:48

This release introduces a new metadata concept: column relationships. A column relationship declares that a group of columns in a table should be treated together as a single higher-level concept (e.g. an address). You can add one with the new add_column_relationship method. Metadata detection was also improved: columns with semantic sdtypes (e.g. 'email', 'phone_number') can now be detected as primary keys.
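To make the idea concrete, here is a minimal plain-Python sketch of what recording a column relationship could involve. It does not import SDV and is not the library's implementation: the helper below works on an ordinary dictionary, and its argument names (relationship_type, column_names), the shape of the stored entry, the example column names and sdtypes, and the specific checks are all assumptions made for illustration. The validation SDV actually performs may differ.

```python
def add_column_relationship(metadata, relationship_type, column_names):
    """Hypothetical helper: record a column relationship in a plain metadata dict.

    Illustrative only; the checks below are assumptions, not SDV's rules.
    """
    columns = metadata["columns"]

    # Every column in the relationship must already be described in the metadata.
    missing = [name for name in column_names if name not in columns]
    if missing:
        raise ValueError(f"Unknown columns: {missing}")

    # Assumed rule: the primary key may not be part of a relationship.
    if metadata.get("primary_key") in column_names:
        raise ValueError("The primary key cannot be part of a column relationship")

    # Assumed rule: a column may belong to at most one relationship.
    relationships = metadata.setdefault("column_relationships", [])
    already_used = {name for rel in relationships for name in rel["column_names"]}
    overlap = sorted(already_used.intersection(column_names))
    if overlap:
        raise ValueError(f"Columns already used in a relationship: {overlap}")

    relationships.append({"type": relationship_type, "column_names": list(column_names)})
    return metadata


# Example table description (column names and sdtypes are made up).
metadata = {
    "primary_key": "customer_id",
    "columns": {
        "customer_id": {"sdtype": "id"},
        "street": {"sdtype": "street_address"},
        "city": {"sdtype": "city"},
        "zip": {"sdtype": "postcode"},
    },
}

add_column_relationship(metadata, "address", ["street", "city", "zip"])
print(metadata["column_relationships"])
```

Grouping the columns under one entry lets downstream code treat street, city and zip as a unit instead of three unrelated columns, which is the purpose the release notes describe for column relationships.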

This release also fixes several bugs. A bug that broke likelihood matching in the HMASynthesizer was resolved. The CTGANSynthesizer no longer fails with a KeyError when the FixedCombinations constraint is applied. The Inequality constraint was also patched to improve its handling of datetimes.

Deprecations

  • The set_address_columns method is deprecated in favor of add_column_relationship.

New Features

  • Improve error messages for composite keys - Issue #1684 by @frances-h
  • Add column relationship validation to single table metadata - Issue #1698 by @frances-h
  • Add add_column_relationship method to single table metadata - Issue #1699 by @frances-h
  • Make synthesizers work with column_relationships - Issue #1700 by @R-Palazzo
  • Metadata auto-detection should find primary keys of semantic sdtypes - Issue #1724 by @fealho

Bugs Fixed

  • InvalidDataError for Inequality constraint (even though data is valid) - Issue #1692 by @fealho
  • BaseIndependentSampler crashes because it tries to cast id columns - Issue #1712 by @pvk-developer
  • KeyError in CTGANSynthesizer when applying FixedCombinations constraint - Issue #1717 by @pvk-developer
  • Fix _get_likelihoods not generating likelihood values - Issue #1720 by @frances-h