For DBpedia 8-shot on GPT-2, I get the warning "token indices sequence length is longer than the specified maximum sequence length" followed by the error "RuntimeError: The size of tensor a (1024) must match the size of tensor b (1060) at non-singleton dimension 3" on line 81 of utils.py, where gpt2_model.generate() is called.
One possible solution: in the current version of the code, there is no check for encoded sequences that exceed the maximum length GPT-2 can handle (1024 tokens). Passing such a sequence to the model crashes it.
You can either truncate the sequence manually, e.g. seq = seq[:1023], or use the tokenizer's max_length parameter so that it handles truncation on its own:
# Clip both the token ids and the attention mask to GPT-2's context window.
if input_ids['input_ids'].shape[1] > 1023:
    input_ids['input_ids'] = input_ids['input_ids'][:, :1023]
    input_ids['attention_mask'] = input_ids['attention_mask'][:, :1023]
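For reference, a minimal sketch of the max_length alternative, assuming the standard Hugging Face transformers tokenizer call (the prompt variable here is a placeholder for the actual 8-shot input):

from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
prompt = "..."  # placeholder for the concatenated 8-shot DBpedia prompt
# With truncation=True, the tokenizer clips the encoding at max_length,
# so both input_ids and attention_mask come back already truncated.
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1023)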
I'm not sure whether this is the optimal way to circumvent the issue. I'd appreciate it if you could help. Thank you.