Hi all :)
I was wondering whether dilated local attention, as described in the Longformer paper, is already integrated. I could not find it, but the reference to the Longformer paper in the docstring of the LocalAttention class made me think something might already be implemented. If not, is an implementation planned?
Thanks in advance!
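To make the question concrete: below is a minimal sketch of what Longformer-style dilated local attention computes, assuming the standard definition (each position attends to every `dilation`-th token within its window). The helper names are hypothetical and not part of this library's API, and the dense mask illustrates the semantics only, not a memory-efficient implementation.

```python
import torch
import torch.nn.functional as F

def dilated_local_mask(seq_len: int, window: int, dilation: int) -> torch.Tensor:
    # True where query i may attend to key j: the offset j - i lies within
    # `window` dilated steps of i and is a multiple of `dilation`.
    idx = torch.arange(seq_len)
    rel = idx[None, :] - idx[:, None]            # offset j - i, shape (n, n)
    in_window = rel.abs() <= window * dilation   # inside the dilated span
    on_stride = rel % dilation == 0              # only every dilation-th token
    return in_window & on_stride

def dilated_local_attention(q, k, v, window: int = 4, dilation: int = 2):
    # q, k, v: (batch, heads, seq_len, head_dim)
    n, d = q.shape[-2], q.shape[-1]
    scores = q @ k.transpose(-2, -1) / d ** 0.5
    mask = dilated_local_mask(n, window, dilation).to(q.device)
    scores = scores.masked_fill(~mask, torch.finfo(scores.dtype).min)
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 8, 128, 64)
out = dilated_local_attention(q, k, v)           # (1, 8, 128, 64)
```

With `dilation = 1` this reduces to a plain sliding window, which is why the two patterns are often discussed together.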
Replies:

- Hi @Coluding
- Hi again, is your implementation of LocalAttention also sparse-aware, such that empty tokens are not represented in memory? I could not find any docs on that. Thanks in advance!
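For reference, the common baseline is to mask padding rather than skip it; here is a sketch under that assumption (the function name and arguments are hypothetical, not this repo's API). Masked keys receive zero attention weight after the softmax, but their scores are still allocated, which is exactly the distinction the question draws.

```python
import torch
import torch.nn.functional as F

def padded_local_attention(q, k, v, key_padding_mask, window: int = 4):
    # q, k, v: (batch, heads, seq_len, head_dim)
    # key_padding_mask: (batch, seq_len) bool, True for real tokens.
    n, d = q.shape[-2], q.shape[-1]
    idx = torch.arange(n, device=q.device)
    local = (idx[None, :] - idx[:, None]).abs() <= window   # sliding window
    allowed = local[None, None] & key_padding_mask[:, None, None, :]
    scores = q @ k.transpose(-2, -1) / d ** 0.5
    # Padded keys get ~-inf scores (zero weight after softmax), but the
    # full (n, n) score matrix is still materialized: masking, not sparsity.
    scores = scores.masked_fill(~allowed, torch.finfo(scores.dtype).min)
    return F.softmax(scores, dim=-1) @ v
```

Truly leaving empty tokens out of memory would instead require length bucketing or a block-sparse kernel, which is beyond this sketch.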
- Great! Thanks for your quick replies!