Added base scoring program #14

DavidCarlyn · 2024-05-17T19:42:38Z

Addresses #5

I've migrated the major-minor functionality over but haven't adapted it for this particular format yet. I should be able to add it soon.

The .txt files I added to the reference_data folder were used to quickly test this setup. This will still need testing for an entire approach (once we get the baseline up and running on this format, we can use that).

I made several assumptions about where the data (predictions, solutions, etc.) will be held. @egrace479 let me know if you see a problem with my assumptions.

Also worth noting: we don't have error checking for the input files (predictions, for example). Should we?

egrace479 · 2024-05-17T22:49:55Z

The location for data is described in the competition.yaml file.

I think we're meant to have the ingestion program pull in the input_data (the validation and testing images); the submitted model.py should then return the predictions to be matched against the reference_data. Presumably, they could return a table with the image filename and prediction (hybrid or not). I currently have the reference_data CSVs (butterfly_ref_<valid or test>_<A or mimic>.csv) set up with an ssp_indicator column (major and minor) for species A. They all have filename and hybrid_stat_ref columns to match with the testing images.
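To make the matching step concrete, here is a minimal sketch of pairing a predictions table with a reference CSV by filename. The `filename`, `hybrid_stat_ref`, and `ssp_indicator` columns come from the description above; the `prediction` column name, the label values, the sample rows, and both helper functions are assumptions for illustration, not what the scoring program actually does.

```python
import csv
import io


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def match_predictions(ref_rows, pred_rows):
    """Pair each reference row with its prediction by filename.

    Returns (pairs, missing): pairs holds (reference_label, predicted_label)
    tuples, and missing lists reference filenames that have no prediction.
    """
    preds = {row["filename"]: row["prediction"] for row in pred_rows}
    pairs, missing = [], []
    for row in ref_rows:
        name = row["filename"]
        if name in preds:
            pairs.append((row["hybrid_stat_ref"], preds[name]))
        else:
            missing.append(name)
    return pairs, missing


# Hypothetical file contents; the label values are assumptions.
REFERENCE_CSV = """filename,hybrid_stat_ref,ssp_indicator
img_001.jpg,hybrid,
img_002.jpg,non-hybrid,major
img_003.jpg,non-hybrid,minor
"""
PREDICTION_CSV = """filename,prediction
img_001.jpg,hybrid
img_002.jpg,hybrid
"""

pairs, missing = match_predictions(load_rows(REFERENCE_CSV), load_rows(PREDICTION_CSV))
accuracy = sum(ref == pred for ref, pred in pairs) / len(pairs)
print(f"matched={len(pairs)} missing={missing} accuracy={accuracy:.2f}")
```

Returning the unmatched filenames separately is one possible way to surface incomplete prediction files instead of silently scoring a subset.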

I'll read through what you have here on Monday. Glad you included a test case!

helper_scripts/dataio.py

scoring_program/scoring_config.yaml

scoring_program/score.py

DavidCarlyn · 2024-05-29T04:07:54Z

I still have to update the second scoring program score_maj_min.py to take into account the major and minor scoring, but there should be enough here to start testing.

* add bioclip code_submission
* add ingestion program and bioclip model submission
* add model environment and change prediction.txt file
* remove defaults in ingestion.py
* Update baselines/BioCLIP_code_submission/metadata
  Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
* Update baselines/BioCLIP_code_submission/model.py
  Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
* Update baselines/BioCLIP_code_submission/model.py
  Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
* Update baselines/BioCLIP_code_submission/model.py
  Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
* Update baselines/BioCLIP_code_submission/model.py
  Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
* Update baselines/BioCLIP_code_submission/requirements.txt
  Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
* Update ingestion_program/ingestion.py
  Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
* Update baselines/BioCLIP_code_submission/model.py
  Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>
* Apply suggestions from code review
  deal with device variable

---------

Co-authored-by: Elizabeth Campolongo <38985481+egrace479@users.noreply.github.com>

DavidCarlyn · 2024-06-05T03:25:22Z

I updated the sample code submission for the DINOv2 baseline. Some notes:

Need to test & debug updated baseline code (with ingestion, scoring, etc.)
Still need to update scoring for major-minor use-case.

scoring_program/score.py

* divide and rename scoring programs by task
  make helper functions accessible to scoring programs
* Update metadata files to point at proper scoring function, input, and output
* Update mimic scoring program
  get proper input files to read, not just directories
  print all scores and output challenge scores to scores.json for CodaBench to read
  Add requirements file for temporary fix
* Update species A scoring program
  get proper input files to read, not just directories
  print all scores and output challenge scores to scores.json for CodaBench to read
  scores still need proper labels and requirements file is temporary fix pending base container
* Add 'ref' to solution filenames to match competition.yaml in formatting branch
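As a rough illustration of the "output challenge scores to scores.json" step, the sketch below writes a flat mapping of score names to numbers into an output directory. The function signature, the score names, and the values are assumptions for illustration; the real keys would have to match whatever the leaderboard configuration in competition.yaml expects.

```python
import json
import os
import tempfile


def save_scores(scores, output_dir):
    """Write a flat {name: number} mapping to <output_dir>/scores.json."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, "scores.json")
    with open(path, "w") as handle:
        json.dump(scores, handle, indent=2)
    return path


# Hypothetical score names and values, for illustration only.
example_scores = {"challenge_score": 0.5, "accuracy": 0.75}

# A temporary directory stands in for the scoring program's output directory.
with tempfile.TemporaryDirectory() as output_dir:
    written_path = save_scores(example_scores, output_dir)
    with open(written_path) as handle:
        reloaded = json.load(handle)

print(reloaded)
```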

DavidCarlyn · 2024-06-09T21:47:49Z

When we are evaluating in scoring_program_A, will all the entries be either major or minor subspecies and no others? I'm double-checking because previously we included more than just the major and minor subspecies when calculating the threshold and reported the accuracy of only the major and minor rows. Since we are splitting this up into different test sets/tasks, I am assuming that we now calculate the threshold per task. I believe we are in agreement on that; just double-checking.
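To illustrate the per-task assumption, here is a minimal sketch that derives one threshold from a single task's rows and then reports accuracy separately for the major and minor rows. How the threshold is actually chosen is not stated in this thread; the quantile-style rule, the `score` and `is_hybrid` field names, the sample rows, and both functions are assumptions for illustration only.

```python
def threshold_at_rate(non_hybrid_scores, keep_rate=0.95):
    """Pick a cutoff from the sorted non-hybrid scores at the keep_rate position.

    Assumption: a higher score means "more likely hybrid". The 0.95 default
    is illustrative, not a value taken from the scoring program.
    """
    ordered = sorted(non_hybrid_scores)
    index = min(len(ordered) - 1, int(keep_rate * len(ordered)))
    return ordered[index]


def accuracy_by_group(rows, threshold):
    """Accuracy of thresholded predictions, reported per ssp_indicator value."""
    totals, correct = {}, {}
    for row in rows:
        group = row["ssp_indicator"]
        predicted_hybrid = row["score"] > threshold
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + int(predicted_hybrid == row["is_hybrid"])
    return {group: correct[group] / totals[group] for group in totals}


# Hypothetical rows for one task; every row here is major or minor.
task_rows = [
    {"score": 0.10, "is_hybrid": False, "ssp_indicator": "major"},
    {"score": 0.20, "is_hybrid": False, "ssp_indicator": "major"},
    {"score": 0.30, "is_hybrid": False, "ssp_indicator": "minor"},
    {"score": 0.40, "is_hybrid": False, "ssp_indicator": "minor"},
    {"score": 0.90, "is_hybrid": True, "ssp_indicator": "major"},
    {"score": 0.35, "is_hybrid": True, "ssp_indicator": "minor"},
]

# The threshold is computed from this task's rows only, then applied to them.
threshold = threshold_at_rate([r["score"] for r in task_rows if not r["is_hybrid"]])
per_group = accuracy_by_group(task_rows, threshold)
print(threshold, per_group)
```

If rows outside the major and minor groups were present in a task, they would shift the threshold without appearing in the reported groups, which is the situation the question above is checking for.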

scoring_program_A/helper_scripts/dataio.py

egrace479 · 2024-06-11T21:10:40Z

@work4cs, this is working as expected now (just without the container).

egrace479

We will want to remove the requirements installation from both scoring programs once we have the container sorted, but this is functioning as-is on CodaBench with codalab/competitions-v2-compute-worker.

baselines/DINO_SGD_code_submission/model.py

scoring_program_A/requirements.txt

scoring_program_mimic/requirements.txt

scoring_program_mimic/scoring_config.yaml

egrace479 · 2024-06-12T16:07:38Z

@work4cs and @DavidCarlyn I think we're good at this point? We'll change the scoring programs to not require the requirements file once we get the container functioning.

Added base scoring program

e924227

DavidCarlyn requested review from egrace479 and work4cs May 17, 2024 19:42

egrace479 reviewed May 24, 2024

helper_scripts/dataio.py (outdated, resolved)

scoring_program/scoring_config.yaml (outdated, resolved)

scoring_program/score.py (outdated, resolved)

updated to utilize .csv format

d4b1098

work4cs and others added 5 commits May 29, 2024 18:06

fix the prediction of scores

2f09805

Added DINOv2 inference code

8a33049

Added DINOv2 requirements file

d82fd0a

Added SGD classfier weights from DINO baseline

218cdaa

egrace479 reviewed Jun 5, 2024

scoring_program/score.py (outdated, resolved)

egrace479 and others added 4 commits June 7, 2024 22:25

updated parse_solution_file function to return if row is a major subspecies

2c116e9

updated major_minor scoring program to include the recall separately

44f97bb

updated proper keys in save_scores function

df34556

DavidCarlyn commented Jun 9, 2024

scoring_program_A/helper_scripts/dataio.py (resolved)

egrace479 self-requested a review June 10, 2024 14:25

fix definition of signal, non-signal indices

fe1c0fb

egrace479 approved these changes Jun 11, 2024

This was linked to issues Jun 11, 2024

Add scoring program #5

Closed

Baseline submission--DinoV2 #10

Closed

egrace479 added this to the Functional Challenge Bundle milestone Jun 11, 2024

fix dino

cd3bd9e

egrace479 reviewed Jun 12, 2024

baselines/DINO_SGD_code_submission/model.py (outdated, resolved)

egrace479 reviewed Jun 12, 2024

scoring_program_A/requirements.txt (outdated, resolved)

egrace479 reviewed Jun 12, 2024

scoring_program_mimic/requirements.txt (outdated, resolved)

egrace479 reviewed Jun 12, 2024

scoring_program_mimic/scoring_config.yaml (outdated, resolved)

Apply suggestions from code review

ed8e13d

formatting

work4cs merged commit da18386 into set-up Jun 12, 2024

work4cs deleted the scoring branch June 12, 2024 19:14
