This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

Recover Datasets for NLI Task #900

Open
13 of 15 tasks
Rebecca-Qian opened this issue Mar 11, 2022 · 1 comment
Rebecca-Qian commented Mar 11, 2022

Some datasets on the Dynabench NLI task were accidentally deleted on 03/10. As a result, their files and metrics were also removed from S3. This issue tracks recovery progress for each dataset, along with mitigation steps:

Leaderboard:

  • snli-test
  • mnli-test-mismatched
  • mnli-test-matched
  • anli-r1-test
  • anli-r2-test
  • anli-r3-test

Non-leaderboard:

  • superglue-winogender
  • mnli-dev-mismatched
  • mnli-dev-matched
  • snli-dev
  • hans
  • nli-stress-test
  • anli-r1-dev
  • anli-r2-dev
  • anli-r3-dev

Next Steps

This was caused by confusion with the "add dataset" interface, where previous datasets looked like part of the submission form. Some steps to prevent future incidents:

  • UX improvements: Pop-up warning when someone tries to delete a dataset, or header above the datasets marking a clear separation from the submission form.
  • Enable bucket versioning: With versioned S3 buckets, deleted objects would remain recoverable for X days.
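
The bucket versioning step could be sketched roughly as follows. This is a minimal sketch under assumptions, not the project's actual setup: the helper names, the rule ID, and the bucket name are hypothetical, and the retention period stands in for the unspecified "X days". The `s3_client` argument is expected to be a boto3 S3 client; `put_bucket_versioning` and `put_bucket_lifecycle_configuration` are the boto3 calls used.

```python
from typing import Any, Dict


def build_noncurrent_expiry_rule(retention_days: int) -> Dict[str, Any]:
    """Build a lifecycle configuration that expires old object versions.

    The rule ID is a hypothetical name chosen for this sketch.
    """
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-noncurrent-versions",
                "Status": "Enabled",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": ""},
                # Overwritten or deleted versions are kept this many days.
                "NoncurrentVersionExpiration": {"NoncurrentDays": retention_days},
            }
        ]
    }


def enable_versioning(s3_client: Any, bucket: str, retention_days: int) -> None:
    """Turn on versioning and bound how long noncurrent versions are kept."""
    # Build (and validate) the rule before making any API call.
    lifecycle = build_noncurrent_expiry_rule(retention_days)
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=lifecycle,
    )
```

With boto3 installed and credentials configured, this might be called as `enable_versioning(boto3.client("s3"), "example-bucket", retention_days=30)`, where the bucket name and the 30 days are placeholders. On a versioned bucket, a delete adds a delete marker instead of removing the data, so earlier versions stay recoverable until the lifecycle rule expires them.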

This process also uncovered several UX bugs, e.g., a successful dataset upload still returned an error message.

@ktirumalafb
Contributor

One of the key sources of confusion here: if you don't assign a round to a dataset after uploading it, updating the scores for that dataset in the DB eventually fails. This has happened before, see #835. Maybe we should change the default round for dataset uploads to something non-zero.
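
One way to picture the suggested default is the small sketch below. It is illustrative only and does not reflect the actual Dynabench upload code: the function name, the `UNASSIGNED_ROUND` constant, and the default of round 1 are all assumptions, with round 0 treated as "no round assigned" based on the comment above.

```python
from typing import Optional

# Assumption for this sketch: round 0 means "no round assigned".
UNASSIGNED_ROUND = 0


def resolve_upload_round(requested_round: Optional[int], default_round: int = 1) -> int:
    """Pick the round for an uploaded dataset, never returning an unassigned one.

    Falls back to `default_round` (hypothetical value) when no round is given,
    and rejects zero or negative rounds up front so the problem surfaces at
    upload time instead of later, during score updates.
    """
    round_id = default_round if requested_round is None else requested_round
    if round_id <= UNASSIGNED_ROUND:
        raise ValueError(f"dataset round must be positive, got {round_id}")
    return round_id
```

Failing at upload time is a design choice in this sketch; silently coercing an invalid round to the default would be an alternative.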
