Releases: lucidrains/st-moe-pytorch

0.0.28

10 Sep 16:59
oops

0.0.27

10 Sep 16:25
chip away at edge cases

0.0.25

10 Sep 15:05
another micro optimization for communication

0.0.24

10 Sep 14:50
in split by rank function, cache the sizes so on backwards there is not an extra call

0.0.23

09 Sep 16:10
start journeying into distributed mixture of experts implementation

0.0.22

26 Aug 00:14
add ability to use differentiable topk

0.0.21

21 Aug 22:46
allow for different thresholds between second and third expert

0.0.20

21 Aug 22:09
multiply gates by mask_flat twice, as in mesh tensorflow code for top-n gating

0.0.19

21 Aug 18:33
better naming

0.0.18

21 Aug 18:20
generalize to top-n gating, parallelize as much as possible
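
For readers unfamiliar with the term, the following is a minimal sketch of what top-n gating means in general. It is not the code from this release: the function name top_n_gating is hypothetical, it handles a single token in pure Python, and it leaves out the thresholds, expert capacity limits and auxiliary losses that a full mixture-of-experts router would need.

```python
import math

def top_n_gating(logits, n):
    """Return (indices, gates) for the n highest-scoring experts of one token.

    Hypothetical helper for illustration only; not part of st-moe-pytorch.
    """
    # Numerically stable softmax over all experts.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Indices of the n largest probabilities, highest first.
    indices = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:n]

    # Renormalize the selected gates so they sum to 1.
    denom = sum(probs[i] for i in indices)
    gates = [probs[i] / denom for i in indices]
    return indices, gates

# Route one token across four experts, keeping the best two.
indices, gates = top_n_gating([0.1, 2.0, -1.0, 1.0], n=2)
```

With n=1 this reduces to ordinary top-1 routing; larger n sends each token to more experts and weights their outputs by the renormalized gates.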