# Changelog

This is a centralised copy of the automatically generated GitHub Release changelogs.

## 0.0.2

- Refactor position encoding configuration by @vince62s in #60
- fix update vocab by @vince62s in #63
- bfloat16 support, and an attempt at homogenizing model_dtype & precision by @francoishernandez in #54
- Fix prefix and suffix transforms - avoid adding empty suffix or prefix by @sersh88 in #57
- fix the incorrect dockerimages in the ReadMe by @aaaallleen in #68
- Remove unnecessary optim in convert_HF by @francoishernandez in #71
- Add onmt_config converter to facilitate switch by @francoishernandez in #69
- Update some FAQ sections by @francoishernandez in #74
- Added TER and BLEU for early stopping by @aaaallleen in #73
- [fix] fix normalize and clean transforms config management by @francoishernandez in #87
- [docs] Fix quickstart config and command by @francoishernandez in #90
- add head_dim setting when diff from hidden // heads by @vince62s in #78
- Some MHA and RoPE refactoring, llama-3.1 rope_scaling by @francoishernandez in #91
- Fixed variable referenced before assignment when position_embeddings is None error by @dameikle in #95
- Send src_pad_mask and tgt_pad_mask to decoder in _align_forward by @dameikle in #96
- Fixdistrib by @vince62s in #100
- fix added tokens by @vince62s in #101
- Support mapped tokens eg: <im_start> ==> ⦅im_start⦆ in inference.yaml … by @vince62s in #102
- add wmt22 recipes with TowerInstruct and Llama3.1 LLMs by @vince62s in #103
- Remove duplicate sentencepiece requirement by @francoishernandez in #104
- [patch] Adapt some warning behaviours for reduced verbosity by @francoishernandez in #105
- [patch] Update precision to compute_dtype in forgotten places by @francoishernandez in #106
- Inference server, lots of related changes by @francoishernandez in #42

**Full Changelog**: https://github.com/eole-nlp/eole/compare/0.0.1...0.0.2

## 0.0.1

- mlp refact by @vince62s in #1
- fix llama3 and parallel_residual by @vince62s in #4
- fixed mismatch between mask and batch dimensions by @l-k-11235 in #6
- simplify LayerNorm access as a constant by @vince62s in #7
- Fix the checkpoint directory cleaning by @l-k-11235 in #10
- Modify default model config behaviour by @francoishernandez in #8
- rename num_kv remove multiquery by @vince62s in #12
- fix mmlu config by @vince62s in #13
- Fix the tokenizer saving in the HF converter by @l-k-11235 in #14
- remove unsused average attn by @vince62s in #15
- MHA refac: rope without complex operations + query only as input of the forward by @vince62s in #20
- Revert "MHA refac: rope without complex operations + query only as input of the forward" by @vince62s in #22
- missing removal of average attn by @vince62s in #23
- config.models.BaseModelConfig._override_values updates everything once by @francoishernandez in #24
- [fix] Patch lora bin to dump json config by @francoishernandez in #28
- review flash/sdpa arg by @vince62s in #25
- fix missing layers names by @vince62s in #30
- Split MHA by @vince62s in #29
- Resize the key_pad_mask by @l-k-11235 in #36
- [patch] upgrade docusaurus deps, fix build script by @francoishernandez in #37
- Add gpt2 converter, hellaswag eval tool, misc fixes by @francoishernandez in #38
- Forgot hellaswag.py tool in #38 by @francoishernandez in #39
- estim lambda scheduler by @vince62s in #40
- Add support for XLM-Roberta-XL (and XXL) conversion by @vince62s in #41
- Some fixes, get rid of data_task, homogenize model_task to model_type by @francoishernandez in #43
- Some improvements to config.json readability by @francoishernandez in #44
- [docs] Github Actions workflow to facilitate docs deployment by @francoishernandez in #47
- [fix] Allow to build_vocab with full train config, patch vocab validation by @francoishernandez in #49
- Enable PyPI release workflow by @francoishernandez in #50
- [fix] Fix paths in wiki_103 recipe, add pyarrow opt requirement by @francoishernandez in #51
- Estim first token instead of average by @vince62s in #46
- Add Recipe to train a cometkiwi-like encoder model (which can be used to score sentence pairs) by @vince62s in #53
- Simplify init files, remove some unused code by @francoishernandez in #52

**Full Changelog**: https://github.com/eole-nlp/eole/commits/0.0.1rc1