Releases: sdv-dev/SDV
v0.4.3 - 2020-09-28
This release moves the models and algorithms for generating synthetic relational data
to a new sdv.relational subpackage (Issue #198).
As part of this change, the old sdv.models have also been removed, and relational modeling
is now based on the recently introduced sdv.tabular models.
v0.4.2 - 2020-09-19
In this release the sdv.evaluation module has been reworked to include 4 different
metrics, each of which returns a normalized score between 0 and 1.
Included metrics are:
- cstest
- kstest
- logistic_detection
- svc_detection
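As an illustration of what a score normalized between 0 and 1 can look like, here is a minimal sketch of a Kolmogorov-Smirnov based score for one numerical column. It assumes the score is defined as 1 minus the two-sample KS statistic; the release notes do not state the formula, and ks_statistic and ks_score are hypothetical helper names, not part of the sdv.evaluation API.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    if not real_sorted or not synth_sorted:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for value in set(real_sorted) | set(synth_sorted):
        cdf_real = bisect_right(real_sorted, value) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, value) / len(synth_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_synth))
    return max_gap


def ks_score(real, synthetic):
    """Assumed normalization: 1.0 means identical distributions, 0.0 means disjoint."""
    return 1.0 - ks_statistic(real, synthetic)


identical = ks_score([1, 2, 3], [1, 2, 3])
disjoint = ks_score([1, 2, 3], [10, 11, 12])
overlapping = ks_score([1, 2, 3, 4], [3, 4, 5, 6])
```

Under this assumed definition a higher score means the synthetic column follows the real column more closely, which matches the "normalized score" wording above.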
v0.4.1 - 2020-09-07
This release fixes a couple of minor issues and introduces an important rework of the
User Guides section of the documentation.
Issues fixed
- Error Message: "make sure the Graphviz executables are on your systems' PATH" - Issue #182 by @csala
- Anonymization mappings leak - Issue #187 by @csala
v0.4.0 - 2020-08-08
In this release SDV gets new documentation, new tutorials, improvements to the Tabular API
and broader Python and dependency support.
Complete list of changes:
- New Documentation site based on the pydata-sphinx-theme.
- New User Guides and Notebook tutorials.
- New Developer Guides section within the docs with details about the SDV architecture,
the ecosystem libraries and how to extend and contribute to the project.
- Improved API for the Tabular models with focus on ease of use.
- Support for Python 3.8 and the newest versions of pandas, scipy and scikit-learn.
- New Slack Workspace for development discussions and community support.
v0.3.6 - 2020-07-23
This release introduces the new concept of Constraints, which allow the user to define
special relationships between columns that will not be handled via modeling.
This is done via a new sdv.constraints subpackage, which defines some well-known
pre-defined constraints as well as a generic framework that allows the user to customize
the constraints to their needs as much as necessary.
New Features
- Support for Constraints - Issue #169 by @csala
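To make the idea of a relationship that is enforced outside the model more concrete, here is a minimal sketch of one possible constraint: a "high column is never below low column" rule. The class name GreaterThanSketch, its methods and the dict-based rows are all hypothetical and chosen for illustration; this is not the sdv.constraints API. The assumed approach is to replace the high column with its difference from the low column before modeling, then rebuild it afterwards so the rule holds by construction.

```python
class GreaterThanSketch:
    """Hypothetical constraint: row[high] must be >= row[low]."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.diff = f"{high}_minus_{low}"  # helper column seen by the model

    def is_valid(self, row):
        """Check whether a row already satisfies the relationship."""
        return row[self.high] >= row[self.low]

    def transform(self, row):
        """Swap the high column for its (non-negative) distance from low."""
        out = dict(row)
        out[self.diff] = out.pop(self.high) - out[self.low]
        return out

    def reverse_transform(self, row):
        """Rebuild the high column; abs() keeps sampled rows valid."""
        out = dict(row)
        out[self.high] = out[self.low] + abs(out.pop(self.diff))
        return out


constraint = GreaterThanSketch(low="start", high="end")
original = {"start": 3, "end": 10}
round_trip = constraint.reverse_transform(constraint.transform(original))

# A sampled row with a negative difference is still mapped to a valid row.
sampled = constraint.reverse_transform({"start": 5, "end_minus_start": -2})
```

The design choice in this sketch is that the model never sees the constrained column directly, so it cannot produce a row that violates the rule.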
v0.3.5 - 2020-07-09
This release introduces a new subpackage sdv.tabular with models designed specifically
for single table modeling, while still providing all the usual conveniences from SDV, such
as:
- Seamless multi-type support
- Missing data handling
- PII anonymization
Currently implemented models are:
- GaussianCopula: Multivariate distributions modeled using copula functions. This is a stronger
version, with more marginal distributions and options, than the one used to model multi-table
datasets.
- CTGAN: GAN-based data synthesizer that can generate synthetic tabular data with high fidelity.
v0.3.4 - 2020-07-04
New Features
- Support for Multiple Parents - Issue #162 by @csala
- Sample by default the same number of rows as in the original table - Issue #163 by @csala
General Improvements
- Add benchmark - Issue #165 by @csala
v0.3.3 - 2020-06-26
General Improvements
- Use SDMetrics for evaluation - Issue #159 by @csala
v0.3.2 - 2020-02-03
General Improvements
- Improve metadata visualization - Issue #151 by @csala @JDTheRipperPC
v0.3.1 - 2020-01-22
New Features
- Add Metadata Validation - Issue #134 by @csala @JDTheRipperPC
- Add Metadata Visualization - Issue #135 by @JDTheRipperPC
General Improvements
- Add path to metadata JSON - Issue #143 by @JDTheRipperPC
- Use new Copulas and RDT versions - Issue #147 by @csala @JDTheRipperPC