
Weird condition for sequence classification in Bart #494

Closed
cyr0930 opened this issue Feb 13, 2023 · 0 comments · Fixed by #624
Labels: bug (Something isn't working), do-not-stale (This issue won't be automatically staled and closed after 90 days)

Comments


cyr0930 commented Feb 13, 2023

Environment info

  • adapter-transformers version: 3.1.0

Information

Model I am using (Bert, XLNet ...): Bart
I think checking the condition for sequence classification by shape is quite buggy.
I wanted to use the LM head with the same encoder and decoder seq_len (and, of course, I could have a different number of eos tokens), but the conditional code below makes that difficult.
I worked around it by setting dec_seq_len == enc_seq_len - 1, but it should be fixed to avoid potential bugs.
https://github.com/adapter-hub/adapter-transformers/blob/master/src/transformers/adapters/models/bart/adapter_model.py#L96
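For reference, the problematic condition looks roughly like the sketch below (paraphrased from the linked line; variable names are illustrative, not the exact adapter-transformers source):

```python
# Paraphrased sketch of the shape-based check in BartAdapterModel.forward.
x = outputs[0]  # decoder last hidden state: (batch, dec_seq_len, hidden)

if input_ids is not None and x.shape[1] == input_ids.shape[1]:
    # Only when the decoder length happens to equal the encoder input length
    # does the model extract a "CLS" representation from the EOS positions ...
    eos_mask = input_ids.eq(self.config.eos_token_id)
    if len(torch.unique_consecutive(eos_mask.sum(1))) > 1:
        raise ValueError("All examples must have the same number of <eos> tokens.")
    cls_representation = x[eos_mask, :].view(x.size(0), -1, x.size(-1))[:, -1, :]
else:
    # ... otherwise the extraction is silently skipped, so behavior flips
    # depending purely on sequence lengths, not on the head being used.
    cls_representation = x
```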

Expected behavior

@cyr0930 cyr0930 added the bug Something isn't working label Feb 13, 2023
@lenglaender lenglaender added do-not-stale This issue won't be automatically staled and closed after 90 days and removed Stale labels May 15, 2023
@adapter-hub adapter-hub deleted a comment from adapter-hub-bert Dec 16, 2023
@calpt calpt self-assigned this Dec 16, 2023
@calpt calpt closed this as completed in #624 Jan 7, 2024
calpt added a commit that referenced this issue Jan 7, 2024
Fixes #494 and fixes #563.

`BartAdapterModel` currently tries to extract CLS token representations
from EOS tokens independently of the heads in use (if any). This causes
weird issues for models that don't use CLS-token-based heads.

This PR moves CLS token rep extraction into the specific head classes to
only compute it when needed. Two (re-usable) new args to
`forward_head()` are introduced:
- `get_cls_from_eos_tokens`: to indicate a model uses CLS reps from EOS
tokens
- `eos_mask`: token mask to extract CLS reps (required when passing
`get_cls_from_eos_tokens=True`)
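
A rough sketch of how a head could use these arguments after the change (a hypothetical simplification for illustration, not the exact adapter-transformers implementation):

```python
import torch

def extract_cls_from_eos(hidden_states, eos_mask):
    """Take the representation of the last EOS token per example as the CLS rep.
    (Illustrative helper; the name is hypothetical, not the library's API.)"""
    return hidden_states[eos_mask, :].view(
        hidden_states.size(0), -1, hidden_states.size(-1)
    )[:, -1, :]

def forward_head(head, outputs, get_cls_from_eos_tokens=False, eos_mask=None):
    hidden_states = outputs[0]  # (batch, seq_len, hidden)
    if get_cls_from_eos_tokens:
        if eos_mask is None:
            raise ValueError("eos_mask is required when get_cls_from_eos_tokens=True")
        # CLS-based heads (e.g. sequence classification) opt in explicitly.
        cls_output = extract_cls_from_eos(hidden_states, eos_mask)
        return head(cls_output)
    # Non-CLS heads (e.g. an LM head) never touch eos_mask, so they no longer
    # depend on encoder and decoder sequence lengths matching.
    return head(hidden_states)
```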