Releases: hfxunlp/transformer
v0.3.8
support pre-trained models (BERT, RoBERTa, BART, T5, MBART);
add regression loss, bucket relative positional encoding and self-dependency units;
support compression (gz, bz2, xz) and character-level text data processing;
support Unicode standardization and Chinese desegmentation;
configure BF16/FP16 and inference mode for PyTorch (see the sketch after this list);
fixes & enhancements.
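For context, reduced-precision inference in recent PyTorch versions is typically wrapped in torch.inference_mode() and torch.autocast(); the following is a minimal sketch only, assuming a CUDA device, and does not reflect this repository's actual configuration options:

    import torch

    def infer(model, src, dtype=torch.bfloat16):
        # inference_mode() disables autograd bookkeeping more aggressively than no_grad();
        # autocast runs matrix multiplications in the reduced-precision dtype
        # (torch.bfloat16 here, torch.float16 for FP16) while keeping
        # numerically sensitive operations in FP32.
        with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=dtype):
            return model(src)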
v0.3.7
v0.3.6
v0.3.5
support multilingual NMT;
support contiguous model parameters;
add sentencepiece (spm) support (a usage sketch follows this list);
add a C backend for core modules (this saves resources but is slower than the Python backend);
cleanup & enhancements (the class components of transformer.Encoder/Decoder(s) have changed, so model files from previous commits cannot be loaded correctly).
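For reference, a minimal sketch of typical sentencepiece usage from Python (the file names, vocabulary size and model type below are illustrative assumptions, not this project's defaults):

    import sentencepiece as spm

    # Train a subword model on raw training text.
    spm.SentencePieceTrainer.train(
        input="train.txt", model_prefix="bpe", vocab_size=32000, model_type="bpe"
    )

    # Segment a sentence into subword pieces and restore it.
    sp = spm.SentencePieceProcessor(model_file="bpe.model")
    pieces = sp.encode("This is a test.", out_type=str)
    restored = sp.decode(pieces)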
v0.3.4
v0.3.3
v0.3.2
v0.3.1
In this release, we:
Support the AdaBelief optimizer;
Accelerate zero_grad by enabling set_to_none (see the sketch below);
Support RealFormer.
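The zero_grad speedup comes from PyTorch's set_to_none option, which frees the gradient tensors instead of zero-filling them, so the next backward pass recreates them. A minimal sketch with a toy model and optimizer:

    import torch

    model = torch.nn.Linear(8, 8)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    loss = model(torch.randn(4, 8)).sum()
    loss.backward()
    optimizer.step()
    # Dropping the gradients entirely avoids the memory writes of zeroing them.
    optimizer.zero_grad(set_to_none=True)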
v0.3.0
In this release, we:
Move AMP support from apex to torch.cuda.amp, introduced in PyTorch 1.6;
Support sampling during greedy decode (for back-translation);
Accelerate the Average Attention Network by replacing the matrix multiplication with cumsum (a typo in this release is fixed in commit ed5eb60; see the sketch after this list);
Add APE support;
Support the Mish activation function.
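The Average Attention Network computes, at each target position, the average of all states up to that position. A cumsum yields the same result as multiplying by a lower-triangular averaging matrix, but in linear rather than quadratic time in the sequence length; a minimal sketch of the equivalence (not the repository's actual code):

    import torch

    def cumulative_average(x):
        # x: (batch, length, dim); returns the running mean over positions 1..t.
        steps = torch.arange(1, x.size(1) + 1, dtype=x.dtype, device=x.device)
        return x.cumsum(dim=1) / steps.view(1, -1, 1)

    def cumulative_average_matmul(x):
        # Equivalent formulation with an explicit (length x length) averaging matrix.
        length = x.size(1)
        weights = torch.tril(torch.ones(length, length, dtype=x.dtype, device=x.device))
        weights = weights / weights.sum(dim=-1, keepdim=True)
        return weights.matmul(x)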
v0.2.9
In this release, we:
adapt to PyTorch 1.5;
explicitly support Lipschitz constrained parameter initialization;
incorporate features: n-gram dropout, dynamic batch sizes, and source phrase representation learning.
Sorry that we did not include the update of utils in this release; please find utils/comm.py (a.k.a. utils.comm), required by parallel/base.py (a.k.a. parallel.base), or use commit 2b6b22094b545e74b05c075f3daac9c14f16414d instead.