
Releases: tatp22/linformer-pytorch

Latest working version

10 Oct 13:21

Have not pushed up a release in a while; this is the latest working version, after two miscellaneous bugs were fixed.

Added intermediate dim change

04 Aug 16:03

Added intermediate ff dimension

The model dimension can now differ in the intermediate layers. This change
applies to the ff module, and only in the encoder. If the flag
ff_intermediate is not None, the layers will look like this:

channels -> ff_dim -> ff_intermediate (For layer 1)
ff_intermediate -> ff_dim -> ff_intermediate (For layers 2 to depth-1)
ff_intermediate -> ff_dim -> channels (For layer depth)

As opposed to

channels -> ff_dim -> channels (For all layers)
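The scheme above can be sketched as a small helper. This is illustrative only, not the library's actual code, and the function name ff_layer_dims is hypothetical; it just computes the (input, hidden, output) dimensions per encoder layer under the rule described above.

```python
# Hypothetical sketch (not the library's code): per-layer feed-forward
# dimensions in the encoder when ff_intermediate is set.

def ff_layer_dims(channels, ff_dim, depth, ff_intermediate=None):
    """Return a list of (d_in, d_hidden, d_out) for each encoder ff layer."""
    dims = []
    for layer in range(depth):
        if ff_intermediate is None:
            # Old behavior: every layer maps channels -> ff_dim -> channels.
            dims.append((channels, ff_dim, channels))
        else:
            # New behavior: only the first layer reads `channels`, and only
            # the last layer writes `channels`; everything in between uses
            # ff_intermediate.
            d_in = channels if layer == 0 else ff_intermediate
            d_out = channels if layer == depth - 1 else ff_intermediate
            dims.append((d_in, ff_dim, d_out))
    return dims
```

For example, ff_layer_dims(64, 128, 3, ff_intermediate=96) gives (64, 128, 96) for layer 1, (96, 128, 96) for layer 2, and (96, 128, 64) for layer 3.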

Able to use convolutional nets instead of linear

31 Jul 09:28

The Linformer now supports convolution as a way to downsample the input, instead of relying on linear layers. This may reduce the number of parameters needed.
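As a rough sketch (assumed shapes, written in plain PyTorch rather than this repo's internals), here are the two ways of downsampling the sequence axis from n positions to k:

```python
# Sketch: downsampling a (batch, seq_len, dim) tensor along the sequence
# axis from n positions to k, as the Linformer projection does for keys
# and values. Shapes and sizes here are assumptions for illustration.
import torch
import torch.nn as nn

n, k, d = 512, 64, 32
x = torch.randn(2, n, d)                          # (batch, n, d)

# 1) Linear projection: an (n -> k) map applied along the sequence axis.
proj = nn.Linear(n, k, bias=False)                # n * k parameters
x_lin = proj(x.transpose(1, 2)).transpose(1, 2)   # (batch, k, d)

# 2) Convolution: a strided Conv1d over the sequence axis; with
#    kernel_size = stride = n // k the output length is exactly k.
conv = nn.Conv1d(d, d, kernel_size=n // k, stride=n // k)
x_conv = conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, k, d)
```

With these particular sizes the linear map has n * k = 32768 weights, while the convolution has d * d * (n // k) + d = 8224 parameters, which is one way the convolutional variant can come out smaller; the comparison depends on n, k, and d.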

Encoder Decoder finished, Causal attention

28 Jul 16:13
2158efc

Finished an encoder and a decoder module. Also, causal attention now works when the causal=True flag is set. Will update the README shortly...
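For reference, causal attention means token i may only attend to tokens at positions <= i. The sketch below is generic PyTorch, not this repo's API, and shows the standard upper-triangular masking trick:

```python
# Illustrative causal attention in plain PyTorch (not this repo's code):
# an upper-triangular mask hides future positions before the softmax.
import torch

def causal_attention(q, k, v):
    # q, k, v: (batch, seq_len, dim)
    n = q.size(1)
    scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5   # (batch, n, n)
    # True above the diagonal = positions in the future of the query.
    mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```

A quick sanity check of causality: changing the input at a later position must not change the output at earlier positions, since the mask zeroes out those attention weights.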

Added Masking

16 Jul 11:03

Added masking to the Linformer. However, this is still a WIP, since masking cannot be done in the traditional sense, as in the Attention Is All You Need paper: that would add the overhead of another (n,n) matrix, which is infeasible.
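To illustrate why the traditional mask does not fit (a shape sketch with assumed sizes, not the library's code): after the Linformer projection the score matrix is (n, k) rather than (n, n), so a per-position (n, n) mask over the original sequence no longer lines up.

```python
# Shape sketch: the Linformer's compressed score matrix vs. a full
# sequence-length mask. Sizes are assumptions for illustration.
import torch

n, k, d = 512, 64, 32
q = torch.randn(n, d)
keys_proj = torch.randn(k, d)        # keys after the (n -> k) downsampling
scores = q @ keys_proj.T             # (n, k): one column per projected key
mask = torch.ones(n, n, dtype=torch.bool)   # traditional (n, n) mask
# scores.shape is (512, 64) while mask.shape is (512, 512): they cannot be
# combined elementwise without materializing an (n, n) matrix again.
```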

Started Encoder/Decoder work

06 Jul 13:41
3973b39

The repo now supports an encoder and a decoder.

TODO: Masking

Bug fixed

06 Jul 13:13
111caaa

Fixed a bug with the sequencing of the Linformer. It should now train properly.

LM model

02 Jul 14:30

An LM model is now available for language modeling tasks.

Rebase, added option to plot MHAttention heads

29 Jun 05:49

Rebased the code so it looks better, and added the option to plot the MHAttention module as well as the Linformer module.

No weight matrices in `LinearAttentionHead`

28 Jun 00:11
7c5c3a0

Check out pull request #7 to see the changes.