
What should we benchmark? #232

Open
ccbaumler opened this issue Dec 20, 2022 · 3 comments
Comments

@ccbaumler
Contributor

Following the same process as sourmash-bio/sourmash#2410, we will benchmark the charcoal workflow with the demo directory and/or the six signatures included in that issue. Suggested runs:

  1. Run the demo repo
  2. Run each sequence alone
  3. Run a variety of sequence sets, from small to large
  4. Run all six together

It may also be interesting to compare the results of `sourmash search --containment` to `charcoal.contigs_list_contaminents.py` in this repo.
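For reference, the containment score that `sourmash search --containment` reports is, conceptually, the fraction of the query's hashes present in the subject; a from-scratch sketch on plain hash sets (not sourmash's actual implementation):

```python
def containment(query_hashes, subject_hashes):
    """Fraction of the query's hashes found in the subject sketch.

    This is what a containment search scores, conceptually:
    |query ∩ subject| / |query|.
    """
    query = set(query_hashes)
    if not query:
        return 0.0
    return len(query & set(subject_hashes)) / len(query)
```

Comparing this number against what the charcoal script reports per contig would show whether the two approaches agree.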

@ccbaumler
Contributor Author

It would also be interesting to compare the accuracy of `sourmash gather` and genome-grist MinSetCov taxonomic outputs with and without charcoal.

@ctb
Member

ctb commented Dec 21, 2022

It sounds like you might be trying to benchmark both computational performance and classification performance. Those are pretty different things.

I don't think that charcoal has any individually expensive steps or computationally complex scripts that are part of it; it's just the workflow overall that involves an awful lot of steps, much like genome-grist. That may change your benchmarking strategy.

@ccbaumler
Contributor Author

I agree! They are completely different benchmarks. Mostly I wanted to jot down the notion before it left me forever.
Additionally, since we will be writing an analytical benchmark for computational performance first, we will have a foundation to come back to later, when we are ready for a biological (classification-accuracy) benchmark.

Would you suggest forking and adding `benchmark:` directives throughout the Snakefile instead of a single global benchmark?
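For context, a per-rule `benchmark:` directive in Snakemake looks like the sketch below; Snakemake records wall-clock time and memory usage for each execution of the rule in the benchmark file. The rule name, paths, and shell command here are hypothetical, not charcoal's actual rules:

```
rule contigs_search:
    input:
        "outputs/{genome}.sig"
    output:
        "outputs/{genome}.search.csv"
    benchmark:
        "benchmarks/{genome}.contigs_search.txt"
    shell:
        "sourmash search --containment {input} db.zip -o {output}"
```

A global timing can still come from wrapping the top-level `snakemake` call, so the two approaches are complementary rather than exclusive.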
