Minimal example:
This gives attention scores of about -116,000 on the correct segment (the positions the causal mask allows), so after the softmax the attention pattern ends up uniform across the positions outside the causal mask. I have no idea why the scores get this large, and it seems ridiculous, but we should probably set attn.IGNORE to -torch.inf.
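To illustrate why a finite mask value could produce this, here is a minimal pure-Python sketch of one attention row. It is not the library's code: the fill value of -100,000 is an assumption (the issue does not state the current value of attn.IGNORE), and `softmax`, `masked_row`, and `FINITE_IGNORE` are hypothetical names used only for this example.

```python
import math

def softmax(row):
    # Numerically stable softmax: subtract the row maximum before exponentiating.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def masked_row(scores, allowed, fill):
    # Keep the score where the causal mask allows attention, else use the fill value.
    return [s if ok else fill for s, ok in zip(scores, allowed)]

# One query row: positions 0 and 1 are allowed, positions 2 and 3 are masked out.
scores = [-116_000.0] * 4
allowed = [True, True, False, False]

FINITE_IGNORE = -100_000.0  # assumed finite fill value, not taken from the issue

# The real scores are below the finite fill, so the masked positions win.
finite_pattern = softmax(masked_row(scores, allowed, FINITE_IGNORE))

# With -inf, the masked positions get exactly zero weight regardless of scale.
inf_pattern = softmax(masked_row(scores, allowed, -math.inf))

print(finite_pattern)  # [0.0, 0.0, 0.5, 0.5]
print(inf_pattern)     # [0.5, 0.5, 0.0, 0.0]
```

With the finite fill, all of the weight lands uniformly on the masked positions, which matches the behaviour described above. One caveat with -inf: a row in which every position is masked would produce NaN, though that should not occur with a causal mask, since each query can always attend to its own position.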