This is a centralised version of the release changelogs automatically generated by GitHub.
- Refactor position encoding configuration by @vince62s in #60
- Fix update vocab by @vince62s in #63
- bfloat16 support, and an attempt at homogenizing model_dtype & precision by @francoishernandez in #54
- Fix prefix and suffix transforms - avoid adding empty suffix or prefix by @sersh88 in #57
- Fix the incorrect Docker images in the README by @aaaallleen in #68
- Remove unnecessary optim in convert_HF by @francoishernandez in #71
- Add onmt_config converter to facilitate the switch by @francoishernandez in #69
- Update some FAQ sections by @francoishernandez in #74
- Added TER and BLEU for early stopping by @aaaallleen in #73
- [fix] fix normalize and clean transforms config management by @francoishernandez in #87
- [docs] Fix quickstart config and command by @francoishernandez in #90
- Add head_dim setting for when it differs from hidden // heads by @vince62s in #78
- Some MHA and RoPE refactoring, llama-3.1 rope_scaling by @francoishernandez in #91 (see the RoPE sketch after this list)
- Fixed "variable referenced before assignment" error when position_embeddings is None by @dameikle in #95
- Send src_pad_mask and tgt_pad_mask to decoder in _align_forward by @dameikle in #96
- Fix distrib by @vince62s in #100
- Fix added tokens by @vince62s in #101
- Support mapped tokens e.g. <im_start> ==> ⦅im_start⦆ in inference.yaml … by @vince62s in #102
- Add WMT22 recipes with TowerInstruct and Llama3.1 LLMs by @vince62s in #103
- Remove duplicate sentencepiece requirement by @francoishernandez in #104
- [patch] Adapt some warning behaviours for reduced verbosity by @francoishernandez in #105
- [patch] Update precision to compute_dtype in forgotten places by @francoishernandez in #106
- Inference server, lots of related changes by @francoishernandez in #42
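To make the llama-3.1 rope_scaling change (#91) concrete, here is a minimal sketch of the frequency rescaling Llama 3.1 applies on top of standard RoPE. The function name is illustrative and the defaults (factor 8, smoothing band between factors 1 and 4, 8192 original context) follow Meta's published scheme; eole's actual integration may wire this differently.

```python
import numpy as np

def llama31_scale_inv_freq(inv_freq, factor=8.0, low_freq_factor=1.0,
                           high_freq_factor=4.0, original_max_pos=8192):
    """Rescale RoPE inverse frequencies the way Llama 3.1 does."""
    wavelen = 2 * np.pi / inv_freq
    low_wavelen = original_max_pos / low_freq_factor    # beyond this: scale down
    high_wavelen = original_max_pos / high_freq_factor  # below this: keep as-is
    # Interpolation weight for the band between the two thresholds.
    smooth = (original_max_pos / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor)
    return np.where(
        wavelen > low_wavelen, inv_freq / factor,       # low frequencies: scaled
        np.where(wavelen < high_wavelen, inv_freq,      # high frequencies: kept
                 (1 - smooth) * inv_freq / factor + smooth * inv_freq))

# Standard RoPE inverse frequencies for head_dim=128, theta=500000 (Llama 3 base).
inv_freq = 1.0 / 500_000.0 ** (np.arange(0, 128, 2) / 128)
print(llama31_scale_inv_freq(inv_freq)[:4])
```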
Full Changelog: https://github.com/eole-nlp/eole/compare/0.0.1...0.0.2
- MLP refactoring by @vince62s in #1
- Fix llama3 and parallel_residual by @vince62s in #4
- Fixed mismatch between mask and batch dimensions by @l-k-11235 in #6
- Simplify LayerNorm access as a constant by @vince62s in #7
- Fix the checkpoint directory cleaning by @l-k-11235 in #10
- Modify default model config behaviour by @francoishernandez in #8
- Rename num_kv, remove multiquery by @vince62s in #12
- Fix MMLU config by @vince62s in #13
- Fix the tokenizer saving in the HF converter by @l-k-11235 in #14
- Remove unused average attention by @vince62s in #15
- MHA refac: rope without complex operations + query only as input of the forward by @vince62s in #20
- Revert "MHA refac: rope without complex operations + query only as input of the forward" by @vince62s in #22
- Missing removal of average attention by @vince62s in #23
- config.models.BaseModelConfig._override_values updates everything once by @francoishernandez in #24
- [fix] Patch lora bin to dump json config by @francoishernandez in #28
- Review flash/sdpa arg by @vince62s in #25
- Fix missing layer names by @vince62s in #30
- Split MHA by @vince62s in #29
- Resize the key_pad_mask by @l-k-11235 in #36
- [patch] upgrade docusaurus deps, fix build script by @francoishernandez in #37
- Add gpt2 converter, hellaswag eval tool, misc fixes by @francoishernandez in #38
- Forgot hellaswag.py tool in #38 by @francoishernandez in #39
- Estim lambda scheduler by @vince62s in #40
- Add support for XLM-Roberta-XL (and XXL) conversion by @vince62s in #41
- Some fixes, get rid of data_task, homogenize model_task to model_type by @francoishernandez in #43
- Some improvements to config.json readability by @francoishernandez in #44
- [docs] Github Actions workflow to facilitate docs deployment by @francoishernandez in #47
- [fix] Allow build_vocab with full train config, patch vocab validation by @francoishernandez in #49
- Enable PyPI release workflow by @francoishernandez in #50
- [fix] Fix paths in wiki_103 recipe, add pyarrow opt requirement by @francoishernandez in #51
- Estim first token instead of average by @vince62s in #46 (see the pooling sketch after this list)
- Add Recipe to train a cometkiwi-like encoder model (which can be used to score sentence pairs) by @vince62s in #53
- Simplify init files, remove some unused code by @francoishernandez in #52
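To make the "Estim first token instead of average" change (#46) concrete, below is a minimal, hypothetical sketch of an estimator head supporting both pooling modes; `EstimatorHead` and its arguments are illustrative names, not eole's actual API.

```python
import torch
import torch.nn as nn

class EstimatorHead(nn.Module):
    """Scores a sentence from encoder states (illustrative, not eole's API)."""
    def __init__(self, hidden_size: int, pooling: str = "first"):
        super().__init__()
        self.pooling = pooling
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, hidden, pad_mask):
        # hidden: (batch, seq, hidden); pad_mask: (batch, seq), True at padding.
        if self.pooling == "first":
            pooled = hidden[:, 0]  # first-token state, as in #46
        else:  # "average": mean over non-padded positions (previous behaviour)
            keep = (~pad_mask).unsqueeze(-1).float()
            pooled = (hidden * keep).sum(1) / keep.sum(1).clamp(min=1.0)
        return self.proj(pooled).squeeze(-1)  # one score per sentence

head = EstimatorHead(hidden_size=8)
scores = head(torch.randn(2, 5, 8), torch.zeros(2, 5, dtype=torch.bool))
print(scores.shape)  # torch.Size([2])
```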
Full Changelog: https://github.com/eole-nlp/eole/commits/0.0.1rc1