[Feature] Support distillation loss weight scheduler #444

Open
HIT-cwh wants to merge 2 commits into main

Conversation

@HIT-cwh HIT-cwh (Collaborator) commented on Feb 1, 2023

Modification

  1. Support Cos / Linear / MultiStep loss weight schedulers.
    Notes: the Linear scheduler can be used for distillation loss weight warm-up, and the MultiStep scheduler can be used to stop distillation early (see the scheduler sketch after this list).
  2. Add a loss weight scheduler manager that records the current loss weight and the base loss weight (the base value is needed by the cosine schedule).
  3. Add a loss weight scheduler hook (see the hook sketch after this list).
  • Before run: build the schedulers and set the milestones according to the begin / end iteration of each scheduler.
  • Before train epoch: update the current loss weight of the manager according to the schedulers whose by_epoch attribute is True, and reset the base loss weight of the manager to the current loss weight if the current epoch is one of the milestones. For example, suppose we use the Linear scheduler in the first 5 epochs for warm-up and then switch to the Cosine scheduler; before the 6th epoch, the base loss weight should be set to the current one so that the cosine schedule is computed from it.
  • Before train iter: update the current loss weight of the manager according to the schedulers whose by_epoch attribute is False, and reset the base loss weight of the manager to the current loss weight if the current iteration is one of the milestones.
  4. Add the corresponding pytests.
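
To make the intended behavior concrete, below is a minimal, self-contained sketch of the three schedulers and the manager. All class, method, and parameter names (`LossWeightSchedulerManager`, `LinearLossWeightScheduler`, `CosineLossWeightScheduler`, `MultiStepLossWeightScheduler`, `start_factor`, `eta_min`, ...) are illustrative assumptions and may differ from the actual implementation in this PR.

```python
import math


class LossWeightSchedulerManager:
    """Records the base and current distillation loss weights.

    ``base_value`` is the reference weight a schedule (e.g. cosine) decays
    from; ``cur_value`` is the weight actually multiplied onto the
    distillation loss at the current step.
    """

    def __init__(self, init_value: float):
        self.base_value = init_value
        self.cur_value = init_value


class LinearLossWeightScheduler:
    """Linearly scales the base weight from ``start_factor`` to ``end_factor``."""

    def __init__(self, begin: int, end: int, start_factor: float = 0.0,
                 end_factor: float = 1.0, by_epoch: bool = True):
        self.begin, self.end = begin, end
        self.start_factor, self.end_factor = start_factor, end_factor
        self.by_epoch = by_epoch

    def value(self, manager: LossWeightSchedulerManager, step: int) -> float:
        # Reaches ``end_factor`` on the last step of the [begin, end) range.
        progress = (step - self.begin) / max(self.end - 1 - self.begin, 1)
        factor = self.start_factor + (self.end_factor - self.start_factor) * progress
        return manager.base_value * factor


class CosineLossWeightScheduler:
    """Cosine-anneals the base weight down to ``eta_min``."""

    def __init__(self, begin: int, end: int, eta_min: float = 0.0,
                 by_epoch: bool = True):
        self.begin, self.end = begin, end
        self.eta_min = eta_min
        self.by_epoch = by_epoch

    def value(self, manager: LossWeightSchedulerManager, step: int) -> float:
        progress = (step - self.begin) / max(self.end - 1 - self.begin, 1)
        return self.eta_min + 0.5 * (manager.base_value - self.eta_min) * (
            1 + math.cos(math.pi * progress))


class MultiStepLossWeightScheduler:
    """Multiplies the weight by ``gamma`` at every milestone.

    With ``gamma=0.`` the distillation loss weight drops to zero at the first
    milestone, i.e. distillation is stopped early.
    """

    def __init__(self, milestones, gamma: float = 0.1, begin: int = 0,
                 end: int = 10**9, by_epoch: bool = True):
        self.milestones = sorted(milestones)
        self.gamma = gamma
        self.begin, self.end = begin, end
        self.by_epoch = by_epoch

    def value(self, manager: LossWeightSchedulerManager, step: int) -> float:
        exponent = sum(1 for m in self.milestones if step >= m)
        return manager.base_value * self.gamma ** exponent
```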
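
And here is a sketch of the hook logic from item 3. The method names mirror the hook points described above (before run / before train epoch / before train iter), but the class is written standalone here rather than on top of a concrete Hook base class, and the milestone bookkeeping is simplified to a single set of steps.

```python
class LossWeightSchedulerHook:
    """Drives the manager from the schedulers at the usual hook points."""

    def __init__(self, manager, schedulers):
        self.manager = manager
        self.schedulers = schedulers
        self.milestones = set()

    def before_run(self):
        # Record where one schedule hands over to the next, e.g. the first
        # epoch of the cosine phase that follows a linear warm-up.
        self.milestones = {sched.begin for sched in self.schedulers}

    def _update(self, step: int, by_epoch: bool):
        # At a hand-over step, freeze the current weight as the new base so
        # the next schedule (e.g. cosine) decays from the warmed-up value.
        if step in self.milestones:
            self.manager.base_value = self.manager.cur_value
        for sched in self.schedulers:
            if sched.by_epoch is by_epoch and sched.begin <= step < sched.end:
                self.manager.cur_value = sched.value(self.manager, step)

    def before_train_epoch(self, epoch: int):
        self._update(epoch, by_epoch=True)

    def before_train_iter(self, cur_iter: int):
        self._update(cur_iter, by_epoch=False)
```

Used together with the schedulers above, the warm-up-then-cosine example from item 3 would look roughly like this:

```python
# Linear warm-up of the distillation loss weight for the first 5 epochs,
# then cosine decay until epoch 100.
manager = LossWeightSchedulerManager(init_value=1.0)
hook = LossWeightSchedulerHook(
    manager,
    schedulers=[
        LinearLossWeightScheduler(begin=0, end=5, start_factor=0.1),
        CosineLossWeightScheduler(begin=5, end=100),
    ])
hook.before_run()
for epoch in range(100):
    hook.before_train_epoch(epoch)
    # manager.cur_value is the weight applied to the distillation loss
    # during this epoch.
```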

TODO

  1. Add docstrings.
