Question about the negative label #1

Open · Z-ZHHH opened this issue Sep 25, 2022 · 2 comments

Comments


Z-ZHHH commented Sep 25, 2022

Great work!
Can the loss value become negative during training if we use negative labels? When the features collapse to the class prototype, the logits become strictly one-hot, so it seems that the loss value goes to -infinity.

weijiaheng (Collaborator) commented

Yes, the loss can go negative when learning with negative labels under the cross-entropy loss. This is simply because the CE loss multiplies each per-class term -log(p_i) by the corresponding soft label. For the irrelevant classes (those not equal to the training label), the soft label is negative, so those terms contribute negative values and the total loss can become negative (see here).
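To make the sign flip concrete, here is a minimal sketch (not the repository's code) of cross-entropy with a generalized smoothing rate r, where r < 0 corresponds to negative labels. The function name and the toy logits are illustrative assumptions; with a near one-hot prediction, the negative soft labels on the wrong classes dominate and the loss drops below zero.

```python
import torch
import torch.nn.functional as F

def gls_cross_entropy(logits, targets, smooth_rate):
    """Cross-entropy with a generalized smoothing rate.
    smooth_rate > 0: positive label smoothing; smooth_rate < 0: negative label smoothing (NLS)."""
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    one_hot = F.one_hot(targets, num_classes).float()
    # Soft label: (1 - r) * one_hot + r / K. For r < 0, the wrong-class entries are negative.
    soft_targets = (1.0 - smooth_rate) * one_hot + smooth_rate / num_classes
    return -(soft_targets * log_probs).sum(dim=-1).mean()

# A confident, near one-hot prediction on the correct class:
logits = torch.tensor([[10.0, -10.0, -10.0]])
target = torch.tensor([0])
print(gls_cross_entropy(logits, target, smooth_rate=-0.4))  # roughly -5.3: negative loss
```

As the prediction gets ever more confident, -log(p_i) for the wrong classes grows without bound while its weight stays negative, which is exactly the loss going toward minus infinity described in the question.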

In our paper, we discuss how to address this issue in practice (Appendix D.2). Briefly, negative labels rely on a relatively well-pre-trained model, since the mechanism works by enhancing the model's confidence in its own predictions. If we train with negative labels from the very beginning of the training procedure, the model may become overly confident in a bad representation (the learned representation is likely to be poor early in training).
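As a hypothetical illustration of that advice (not the paper's exact recipe), one could warm up with plain cross-entropy and only switch to a negative smoothing rate once the representation is reasonably trained. The epoch split, the rate of -0.4, and the toy model below are all assumptions; it reuses gls_cross_entropy from the sketch above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 20)                 # toy inputs, purely illustrative
y = torch.randint(0, 3, (256,))          # toy labels over 3 classes
model = nn.Linear(20, 3)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

total_epochs, warmup_epochs = 20, 10     # assumption: the split is dataset/model dependent
for epoch in range(total_epochs):
    # Plain CE (smooth_rate = 0) while the representation is still poor,
    # then switch to a negative smoothing rate (NLS) for the rest of training.
    smooth_rate = 0.0 if epoch < warmup_epochs else -0.4
    opt.zero_grad()
    loss = gls_cross_entropy(model(X), y, smooth_rate)
    loss.backward()
    opt.step()
```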


Z-ZHHH commented Sep 26, 2022

Thanks a lot!
I just tried NLS for the whole training process and it didn't work. Thanks for the details.
