v1.6.0 - 2023-11-07

Released by @amontanez24 on 07 Nov at 17:21

This release improves user messaging in several ways. Most notably, users will now see an alert if the HMASynthesizer is likely to be slow for their data's schema. Additionally, the logger messaging for constraints and the error messaging when setting distributions on non-parametric models were made more detailed.

The visualization functions in the sdv.evaluation sub-package all have a new parameter called plot_type, which lets users specify the plot type to use when the inferred one is not useful. The sdv.datasets.local.load_csvs method now has a parameter called read_csv_parameters, which lets users specify how the CSV files should be read during loading. The same parameter was also added to the sdv.metadata.multi_table.detect_table_from_csv, sdv.metadata.multi_table.detect_from_csvs and sdv.metadata.single_table.detect_from_csv methods.
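
To illustrate the idea behind read_csv_parameters, here is a minimal, standard-library-only sketch of a folder loader that forwards caller-supplied reader options to the CSV parser. The function load_csv_folder is a hypothetical stand-in written for this example. It is not SDV's implementation, and it uses Python's csv module rather than whatever reader SDV uses internally. Only the parameter name read_csv_parameters comes from the release notes.

```python
import csv
import tempfile
from pathlib import Path


def load_csv_folder(folder_name, read_csv_parameters=None):
    """Hypothetical stand-in for a folder loader (not SDV's implementation)."""
    # Copy the caller's options so the original dictionary is never mutated.
    reader_options = dict(read_csv_parameters or {})
    tables = {}
    for path in sorted(Path(folder_name).glob('*.csv')):
        with path.open(newline='', encoding='utf-8') as handle:
            # Forward the options (for example, the delimiter) to the reader.
            rows = list(csv.DictReader(handle, **reader_options))
        # Use the file name without its extension as the table name.
        tables[path.stem] = rows
    return tables


# Usage: read a semicolon-delimited file from a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, 'guests.csv').write_text('name;room\nAna;101\n', encoding='utf-8')
    tables = load_csv_folder(folder, read_csv_parameters={'delimiter': ';'})
```

Without the delimiter option, the same file would be parsed as a single column, which is the kind of loading problem the new parameter is meant to address.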

Multiple bugs were resolved, including one that caused new categories to be created during the sample step of CTGANSynthesizer.

New Features

  • Improve debug messages when a constraint falls back to reject sampling approach - Issue #1478 by @amontanez24
  • Constraints should work with timezone-aware datetime columns - Issue #1576 by @fealho
  • Better error message when trying to get distributions from non-parametric models - PR #1633 by @frances-h
  • Add options to read CSV files - Issue #1644 by @lajohn4747
  • Print alert if HMASynthesizer is likely to be slow - Issue #1646 by @lajohn4747
  • Make SDV compatible with SDMetrics 0.12.1 - Issue #1650 by @pvk-developer
  • Make function to estimate number of columns CTGAN produces - Issue #1657 by @fealho

Bugs Fixed

  • In get_available_demos, the num_tables column should be an int - Issue #1420 by @lajohn4747
  • AttributeError when using specific locale strings (es_AR, fr_BE) - Issue #1439 by @lajohn4747
  • Confusing error when passing in an empty dataframe (with constraints) - Issue #1455 by @lajohn4747
  • HMASynthesizer: Better error message for learned distributions (misleading fit error) - Issue #1579 by @fealho
  • Fix tests in SDV after update in RDT 1.7.1 - Issue #1638 by @lajohn4747
  • CTGAN sometimes creates new categories (int data) - Issue #1647 by @pvk-developer
  • CTGAN sometimes creates new categories (object data) - Issue #1648 by @pvk-developer
  • Better error message if I provide an incompatible sdtype/locale combo - Issue #1653 by @pvk-developer
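
The CTGAN fixes listed above (Issues #1647 and #1648) concern sampled data that contained categories absent from the training data. A simple way to check for this symptom in your own data might look like the sketch below. The helper find_new_categories is a hypothetical name introduced for this example and is not part of SDV. It assumes each column is available as a plain list of comparable values.

```python
def find_new_categories(real_values, synthetic_values):
    """Return synthetic values that never occur in the real column."""
    seen = set(real_values)
    # Sorting assumes the values are mutually comparable (all int or all str).
    return sorted({value for value in synthetic_values if value not in seen})


# Integer categories, as in the int-data case.
real_column = [1, 2, 2, 3]
synthetic_ok = [3, 1, 1, 2]
synthetic_bad = [1, 4, 2, 5]

new_in_ok = find_new_categories(real_column, synthetic_ok)
new_in_bad = find_new_categories(real_column, synthetic_bad)

# String categories, as in the object-data case.
new_in_strings = find_new_categories(['red', 'blue'], ['blue', 'green'])
```

An empty result means every sampled category was seen during training. A non-empty result lists the unexpected values.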