Releases: sdv-dev/SDMetrics
v0.12.1 - 2023-11-01
This release fixes a bug with the new Intertable Trends property and older pandas versions and a bug with how the ML Efficacy metric handled train and test data. Reports handle missing relationships more gracefully.
Bugs Fixed
- Multiple FutureWarning lines printed out when running the Quality Report (Intertable Trends property) - Issue #490 by @frances-h
- Transformer should not be fit on test data - Issue #291 by @fealho
- Reports should not crash if there are no relationships - Issue #481 by @lajohn4747
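The train/test fix in Issue #291 follows a general rule: a transformer's parameters should be learned from the training split only and then reused on the test split. The sketch below illustrates that rule with a hypothetical `MeanStdScaler` class written for this example; it is not the transformer SDMetrics uses, and the data values are made up.

```python
from statistics import mean, pstdev


class MeanStdScaler:
    """Hypothetical scaler used only to illustrate fit-on-train-only."""

    def fit(self, values):
        # Learn the parameters from the data passed in (the training split).
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # avoid dividing by zero
        return self

    def transform(self, values):
        # Apply the already-learned parameters; nothing is re-estimated here.
        return [(v - self.mean_) / self.std_ for v in values]


train = [1.0, 2.0, 3.0, 4.0]
test = [10.0, 12.0]

scaler = MeanStdScaler().fit(train)     # fit on train only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)    # test data is transformed, never fit
```

Because the test split never influences `mean_` or `std_`, the evaluation cannot leak information from the held-out data.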
v0.12.0 - 2023-10-31
This release adds a new property, InterTable Trends. Several plots were moved from the reports module into the new visualizations module. The metadata parameter was removed for these plots, and the plot_types parameter was added, which lets the user control which plot type is used. Several crashes have been resolved.
New Features
- Provide meta information about the reports - Pull #472 by @frances-h
- Validate that the metadata is always a dict - Issue #428 by @R-Palazzo
- Expose reports module in top-level init - Pull #459 by @frances-h
- Add new get_column_pair_plot - Issue #444 by @pvk-developer
- Add InterTable Trends property - Issue #451 by @frances-h
- Add new get_column_plot - Issue #443 by @pvk-developer
- Add new get_cardinality_plot - Issue #445 by @frances-h
- Create visualizations module - Issue #442 by @frances-h, @pvk-developer
Bugs Fixed
- Fix NewRowSynthesis on datetime columns without formats - Issue #473 by @fealho
- Intertable trends property crashes if a table has no statistical columns - Issue #476 by @lajohn4747
- Fix BoundaryAdherence NaN handling - Issue #470 by @frances-h
- The Intertable Trends visualization is mislabeled as 'Column Shapes' - Issue #477 by @lajohn4747
- ValueError when using get_cardinality_plot on some schemas - Issue #447 by @frances-h
Internal
- Switch default branch from master to main - Issue #420 by @amontanez24
v0.11.1 - 2023-09-14
This release makes multiple changes to better handle errors raised from the DiagnosticReport. The report should now run to completion and record any errors it encounters in a column of the details returned by get_details. It also resolves many warnings that were interrupting the printing of the report's results and progress.
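The error-handling pattern described above (keep running, and record failures alongside the scores) can be sketched in plain Python. Everything here is an assumption made for illustration: `run_metrics`, the two toy metrics, and the row layout are hypothetical and do not reproduce the DiagnosticReport internals.

```python
import math


def run_metrics(metrics, real, synthetic):
    """Run every metric; on failure, store the error instead of raising."""
    rows = []
    for name, metric in metrics.items():
        try:
            score, error = metric(real, synthetic), None
        except Exception as exc:
            # The report keeps going; the failure is kept in an 'Error' field.
            score, error = math.nan, f"{type(exc).__name__}: {exc}"
        rows.append({"Metric": name, "Score": score, "Error": error})
    return rows


def range_overlap(real, synthetic):
    # Toy metric: share of synthetic values inside the real min/max range.
    lo, hi = min(real), max(real)
    return sum(lo <= v <= hi for v in synthetic) / len(synthetic)


def always_fails(real, synthetic):
    # Toy metric that simulates a crash inside a metric.
    raise ValueError("unsupported column type")


details = run_metrics(
    {"range_overlap": range_overlap, "always_fails": always_fails},
    real=[1, 2, 3],
    synthetic=[2, 3, 4],
)
```

Each row carries either a score or an error message, so one failing metric does not prevent the remaining results from being reported.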
New Features
- Create single table coverage property - Issue #389 by @R-Palazzo
- Create single table synthesis property - Issue #390 by @R-Palazzo
- Create single table Boundaries property - Issue #391 by @R-Palazzo
- Add multi table Coverage, Synthesis and Boundaries property - Issue #393 by @R-Palazzo
Bugs Fixed
- Ensure that the Synthesis property score doesn't change - Issue #425 by @amontanez24
- The Error column contains a mix of NaN and None values - Issue #427 by @pvk-developer
- Always show the Table column in get_details - Issue #429 by @frances-h
- Diagnostic explanations should not repeat if I generate multiple times - Issue #430 by @amontanez24
- RangeCoverage errors on datetime columns in DiagnosticReport - Issue #431 by @frances-h
- The coverage visualization shows empty bar graph for nan values - Issue #432 by @frances-h
- Diagnostic report should skip over all NaN columns - Issue #433 by @pvk-developer
- Quality report is printing out a long warning message (hundreds of lines) - Issue #448 by @amontanez24
Internal
- Use property classes in single table DiagnosticReport - Issue #392 by @R-Palazzo
- Use property classes in multi table DiagnosticReport - Issue #394 by @R-Palazzo
v0.11.0 - 2023-08-10
This release adds a function that allows users to plot the cardinality of foreign and primary keys in synthetic data. More specifically, it graphs how often each number of children per parent row occurs in the parent table.
Additionally, architectural changes are made to improve the efficiency and error handling of the QualityReport! The progress bar is also enhanced to be more informative when the report is generating.
This release also adds support for Python 3.11 and drops support for Python 3.7.
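To make the cardinality description concrete, here is a minimal sketch of the quantity being plotted: for each parent row, count its children, then count how many parents share each child count. The key values are invented for the example, and the code uses only the standard library rather than the SDMetrics plotting function.

```python
from collections import Counter

# Hypothetical primary keys of the parent table.
parent_ids = ["u1", "u2", "u3", "u4"]
# Hypothetical foreign key column of the child table.
child_foreign_keys = ["u1", "u1", "u2", "u1", "u4"]

# Step 1: number of children per parent (parents without children get 0).
children_per_parent = Counter(child_foreign_keys)

# Step 2: how often each "number of children" occurs among the parents.
cardinality_frequency = Counter(
    children_per_parent.get(pid, 0) for pid in parent_ids
)

for n_children, n_parents in sorted(cardinality_frequency.items()):
    print(f"{n_parents} parent(s) have {n_children} child row(s)")
```

Computing the same frequencies for the real and the synthetic tables gives the two distributions that a cardinality plot would compare.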
New Features
- Visualize cardinality of foreign key columns - Issue #283 by @R-Palazzo
- Create single table BaseProperty class - Issue #354 by @amontanez24
- Create single table column shapes property - Issue #355 by @R-Palazzo
- Create single table column pair trends property - Issue #356 by @R-Palazzo
- Create multi table BaseProperty class - Issue #357 by @pvk-developer
- Create multi table column shapes and column pair trends properties - Issue #358 by @R-Palazzo
- Create Parent Child Relationships property class - Issue #359 by @pvk-developer
- In Multi Table Quality Report: Rename "Table Relationships" property to "Cardinality" - Issue #360 by @frances-h
- More accurate progress bar for single table Quality Report - Issue #361 by @R-Palazzo
- More accurate progress bar for multi table Quality Report - Issue #362 by @fealho
- Raise error in CorrelationSimilarity if either column is constant - Issue #407 by @fealho
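The constant-column check in Issue #407 has a simple mathematical motivation: the Pearson correlation divides by each column's spread, which is zero for a constant column. The sketch below shows that reasoning with a hypothetical `pearson` helper; it is not the CorrelationSimilarity implementation, and the error message is made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation that refuses constant inputs (illustrative only)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("columns must have the same length (at least 2)")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    # A constant column has zero spread, so the ratio below is undefined.
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


print(pearson([1, 2, 3], [2, 4, 6]))

try:
    pearson([5, 5, 5], [1, 2, 3])
except ValueError as exc:
    print(f"rejected: {exc}")
```

Raising early gives the caller a clear message instead of a NaN score or a division warning further down the line.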
Bug Fixes
- Issue in building the denormalized table inside the Parent-Child Detection metrics - Issue #328 by @fealho
- Don't modify the rounding in the quality report - Issue #401 by @R-Palazzo
- The Cardinality property is missing some relationships - Issue #404 by @pvk-developer
- The Cardinality property is not returning a DataFrame - Issue #405 by @fealho
- Overall property score should be the average across all breakdowns - Issue #415 by @amontanez24
Internal
- Use property classes in single table QualityReport - Issue #370 by @R-Palazzo
- Use property classes in multi table QualityReport - Issue #371 by @fealho
- Add add-on detection for premium metrics - Issue #388 by @amontanez24
Maintenance
- Add support for Python 3.11 - Issue #353 by @amontanez24
- Drop support for Python 3.7 - Issue #380 by @amontanez24
v0.10.1 - 2023-06-06
This release fixes a bug that was causing the DiagnosticReport to crash on the NewRowSynthesis metric. It also adds support for PyTorch 2.0!
Bug Fixes
- ValueError: multi-line expressions (NewRowSynthesis metric in DiagnosticReport) - Issue #327 by @R-Palazzo
Maintenance
v0.10.0 - 2023-05-03
This release makes the DiagnosticReport more fault tolerant by preventing it from crashing if a metric it uses fails. It also adds support for Pandas 2.0!
Additionally, support for the old SDV metadata format (pre SDV 1.0) has been dropped.
New Features
- Cleanup SDMetrics to only accept SDV 1.0 metadata format - Issue #331 by @amontanez24
- Make the diagnostic report more fault-tolerant - Issue #332 by @frances-h
Maintenance
- Remove upper bound for pandas - Issue #338 by @pvk-developer
v0.9.3 - 2023-04-12
This release improves the clarity of warning and error messages. We also add a version add-on, update the workflow to optimize the runtime, and fix a bug in the NewRowSynthesis metric when computing the synthetic_sample_size for multi-table.
New Features
- Add functionality to find version add-on - Issue #321 by @frances-h
- More detailed warning in QualityReport when there is a constant input - Issue #316 by @pvk-developer
- Make error more informative in QualityReport when tables cannot be merged - Issue #317 by @frances-h
- More detailed warning in QualityReport for unexpected category values - Issue #315 by @frances-h
Bug Fixes
- Multi table DiagnosticReport sets synthetic_sample_size too low for NewRowSynthesis - Issue #320 by @pvk-developer
v0.9.2 - 2023-03-08
This release fixes bugs in the NewRowSynthesis metric when too many columns were present. It also fixes bugs around datetime columns that are formatted as strings in both get_column_pair_plot and get_column_plot.
Bug Fixes
- Method get_column_pair_plot: Does not plot synthetic data if datetime column is formatted as a string - Issue #310 by @frances-h
- Method get_column_plot: ValueError if a datetime column is formatted as a string - Issue #309 by @frances-h
- Fix ValueError in the NewRowSynthesis metric (also impacts DiagnosticReport) - Issue #307 by @frances-h
v0.9.1 - 2023-02-17
This release fixes bugs in the existing metrics and reports.
Bug Fixes
- Fix issue-296 for discrete and continuous columns - Issue #296 by @R-Palazzo
- Support new metadata for datetime_format - Issue #303 by @frances-h