Problem Description

The detection metrics for single table data and sequential data both compute the ROC AUC and return 1 - AUC as the final score. This score is hard to interpret:
- An extreme score (close to 0 or close to 1) indicates that the synthetic and real data are noticeably different -- enough for a model to tell them apart. This indicates lower quality or, alternatively, higher privacy.
- A middle score (close to 0.5) indicates that the synthetic and real data are similar -- enough to fool the model, since it does no better than random guessing. This indicates higher quality or, alternatively, lower privacy.
This is an odd way to interpret the score. Usually, we want 1 to represent success and 0 to represent failure.
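To make the issue concrete, here is a rough sketch of what a detection-style metric does, not the library's actual implementation: the helper name, the logistic-regression classifier, and the assumption of purely numeric columns are all illustrative. It trains a model to tell real rows from synthetic rows, computes the ROC AUC, and returns 1 - AUC.

```python
# Simplified sketch of a detection metric (illustrative, not the library's code).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def detection_score(real: pd.DataFrame, synthetic: pd.DataFrame) -> float:
    """Return 1 - AUC, mirroring the current behavior described above."""
    # Stack real and synthetic rows and label them 1 (real) / 0 (synthetic).
    X = pd.concat([real, synthetic], ignore_index=True).to_numpy()
    y = np.concatenate([np.ones(len(real)), np.zeros(len(synthetic))])

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, stratify=y, random_state=0
    )

    # Train a classifier to distinguish real from synthetic and score it with ROC AUC.
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

    # 1 - AUC is hard to interpret: 0.5 is the "best" value, extremes are "worst".
    return 1 - auc
```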
Proposed Changes
Instead of returning 1 - AUC, perhaps we could use a different formula, such as:
$\text{score} = |\text{AUC} - 0.5| \times 2$
This would yield a score that is geared towards privacy:

- 0 if the AUC was close to 0.5, which means lower privacy
- 1 if the AUC was closer to an extreme (0 or 1), which means higher privacy
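For illustration, a minimal sketch of the proposed rescaling in Python (the function name is hypothetical):

```python
def rescaled_detection_score(auc: float) -> float:
    """Proposed rescaling: 0 when the classifier is at chance (AUC ~ 0.5),
    1 when it separates real from synthetic perfectly (AUC near 0 or 1)."""
    return abs(auc - 0.5) * 2


# Examples:
# rescaled_detection_score(0.5)  -> 0.0  (indistinguishable: lower privacy)
# rescaled_detection_score(0.95) -> 0.9  (easily distinguished: higher privacy)
# rescaled_detection_score(0.05) -> 0.9  (symmetric around 0.5)
```

A side benefit of the absolute value is symmetry: an AUC of 0.05 and an AUC of 0.95 both mean the model can separate the two datasets, so both map to the same score.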