
What are the criteria for saving the top k models in Casanovo version 4? #315

Closed

irleader opened this issue Mar 18, 2024 · 1 comment

@irleader
In version 3, every_n_train_steps specified how often model weights were saved (every n training steps). I usually picked the model for which both the validation AA precision and peptide precision were best.

In version 4, however, Casanovo only saves the top k models.

  1. If calculate_precision is set to False, as recommended, does Casanovo save the models based on the lowest training loss or the lowest validation loss? And are previously saved, worse checkpoints removed from the folder?
  2. Can I train the model without specifying a validation set?
  3. Is it still possible to set calculate_precision to True and save the model when the AA precision and peptide precision are best, as in version 3?

Sorry for the dumb questions.

bittremieux added the question (Further information is requested) label on Mar 18, 2024
@bittremieux (Collaborator)

  1. The top-k models are selected based on the validation loss, and checkpoints that fall out of the top k are removed from the folder. We have observed a strong correlation between the validation loss and the AA/peptide precision, so the former can be used as a proxy to select the best model. Hence, during training you can skip calculating the precision to speed up the validation steps, because running beam search during inference is expensive. Note that even when the AA/peptide precision is calculated, the top-k models are still determined based on the validation loss. The ultimate metric of interest is of course the AA/peptide precision, so it is recommended to calculate it afterwards for the final model.
     We use the ModelCheckpoint callback from PyTorch Lightning, so check out their documentation for specific implementation details. For example, set save_top_k=-1 to retain all model checkpoints (a minimal sketch follows this list).
  2. No, a validation set is required.
  3. This is essentially what already happens, except that the validation loss, rather than the precision, is used to determine the best model. See point (1).
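
For reference, here is a minimal sketch of such a ModelCheckpoint configuration. The monitored metric name valid_loss is a placeholder assumption; use whichever key the LightningModule actually logs during validation.

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# Keep the k checkpoints with the lowest monitored validation loss;
# checkpoints that fall out of the top k are deleted automatically.
# "valid_loss" is an assumed placeholder metric name: use whichever key
# the LightningModule logs via self.log() during validation.
checkpoint_callback = ModelCheckpoint(
    dirpath="checkpoints/",
    monitor="valid_loss",
    mode="min",        # lower validation loss is better
    save_top_k=5,      # or save_top_k=-1 to retain every checkpoint
)

trainer = pl.Trainer(callbacks=[checkpoint_callback])
```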

We want to make saving models more flexible (#313, #291), but this will take a bit more time.
