Releases: foundation-model-stack/fms-hf-tuning

v2.1.2

19 Nov 14:41
1e82e02

What's Changed

Dependency Changes

  • build(deps): set lower limit for transformers to 4.45 for granite 3.0 by @willmj in #387

Additional Changes

  • docs: Update supported models by @aluu317 in #389

Full Changelog: v2.1.1...v2.1.2

v2.1.2-rc.1

18 Nov 16:18
398c2a8
Pre-release

What's Changed

  • build(deps): set lower limit for transformers to 4.45 for granite 3.0 by @willmj in #387
  • docs: Update supported models by @aluu317 in #389

Full Changelog: v2.1.1...v2.1.2-rc.1

v2.1.1

06 Nov 00:14
e2ac091

What's Changed

Dependency Changes

  • Pull in new versions of fms-acceleration-peft and fms-acceleration-foak, with fixes for AutoGPTQ and gradient-accumulation hooks, and add the granite GPTQ model
  • build(deps): set transformers below 4.46, waiting on fixes by @anhuong in #384

Additional Changes

  • docs: Update supported models list in README by @tharapalanivel in #382

Full Changelog: v2.1.0...v2.1.1

v2.1.1-rc.2

01 Nov 18:59
0e664f2
Pre-release
deps: set transformers below 4.46, waiting on fixes (#384)

Signed-off-by: Anh Uong <anh.uong@ibm.com>

v2.1.1-rc.1

31 Oct 22:22
e51ec36
Pre-release
docs: Update supported models list in README (#382)

Signed-off-by: Thara Palanivel <130496890+tharapalanivel@users.noreply.github.com>

v2.1.0

18 Oct 22:10
8f16818

New Major Feature

  • Support for GraniteForCausalLM model architecture

Dependency upgrades

  • Upgraded transformers to version 4.45.2, which adds support for GraniteForCausalLM models. Note that if a model is trained with transformers v4.45, you need transformers>=4.45 to load it; prior versions of transformers are not compatible (see the version-guard sketch after this list).
  • Upgraded accelerate to version 1.0.1.
  • Capped torch below v2.5.0 for compatibility with flash-attention-2.
  • Upgraded fms_acceleration_peft to v0.3.1, which disables offloading of the state dict (previously the cause of ephemeral-storage issues when loading large models with QLoRA) and sets defaults when target_modules=None.
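
The loading caveat above can be made explicit with a small version guard. This is a minimal sketch, assuming a model saved with transformers v4.45; the model path is a hypothetical placeholder.

```python
# Hedged sketch: refuse to load a v4.45-trained model on older transformers.
from packaging import version  # packaging is already a transformers dependency
import transformers

if version.parse(transformers.__version__) < version.parse("4.45"):
    raise RuntimeError(
        "Models trained with transformers v4.45 need transformers>=4.45 to load; "
        f"found {transformers.__version__}"
    )

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("path/to/tuned-granite-model")  # hypothetical path
```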

Additional bug fix

  • Fix for a crash when running multi-GPU training with a non-existent output directory.

Full List of Changes

  • ci: run unit tests, fmt, image build on release branch by @anhuong in #361
  • chore: update code owners by @anhuong in #363
  • fix: crash when output directory doesn't exist by @HarikrishnanBalagopal in #364
  • refactor: move tokenizer_data_utils with the rest of utils, add further unit testing. by @willmj in #348
  • build(deps): update transformers and accelerate deps by @anhuong in #355
  • build(deps): Update peft requirement from <0.13,>=0.8.0 to >=0.8.0,<0.14 by @dependabot in #354
  • build(deps): Upgrade accelerate requirement to allow version 1.0.0 by @willmj in #371
  • build: Set triton environment variables by @willmj in #370
  • build(deps): torch<2.5 due to FA2 error with new version by @anhuong in #375
  • chore: merge set of changes for v2.1.0 by @aluu317 in #376

Full Changelog: v2.0.1...v2.1.0

v2.1.0-rc.1

16 Oct 22:22
1570d04
Pre-release

What's Changed

  • ci: run unit tests, fmt, image build on release branch by @anhuong in #361
  • chore: update code owners by @anhuong in #363
  • fix: crash when output directory doesn't exist by @HarikrishnanBalagopal in #364
  • refactor: move tokenizer_data_utils with the rest of utils, add further unit testing. by @willmj in #348
  • build(deps): update transformers and accelerate deps by @anhuong in #355
  • build(deps): Update peft requirement from <0.13,>=0.8.0 to >=0.8.0,<0.14 by @dependabot in #354
  • build(deps): Upgrade accelerate requirement to allow version 1.0.0 by @willmj in #371
  • build: Set triton environment variables by @willmj in #370

Full Changelog: v2.0.1...v2.1.0-rc.1

v2.0.1

01 Oct 16:07
9b8245e

New major features:

  1. Support for LoRA for the following model architectures: llama3, llama3.1, granite (GPTBigCode and LlamaForCausalLM), mistral, mixtral, and allam (see the sketch after this list)
  2. Support for QLoRA for the following model architectures: llama3, granite (GPTBigCode and LlamaForCausalLM), mistral, mixtral
  3. Addition of a post-processing function to format tuned adapters as required by vLLM for inference. Refer to the README for how to run it as a script. When tuning on the image, post-processing can be enabled with the flag lora_post_process_for_vllm; see the build README for details on how to set it.
  4. Enablement of new flags for throughput improvements: padding_free to process multiple examples without adding padding tokens, multipack to balance the number of tokens processed on each device during multi-GPU training, and fast_kernels for optimized tuning with fused operations and Triton kernels. See the README for details on how to set these flags and their use cases.
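
To make the LoRA support in item 1 concrete, here is a minimal sketch that configures an adapter with the peft library directly; this is not fms-hf-tuning's own entry point, and the model name and target modules are assumptions (see the README for the supported invocation).

```python
# Hedged sketch: attach a LoRA adapter to a causal LM via the peft library.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-3b-code-base")  # example model
lora_config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only adapter weights remain trainable
```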

Dependency upgrades:

  1. Upgraded transformers to version 4.44.2, needed for tuning all models
  2. Upgraded accelerate to version 0.33, needed for tuning all models. Version 0.34.0 has a bug affecting FSDP.

API/interface changes:

  1. The train() API now returns a tuple of the trainer instance and additional metadata as a dict; see the sketch below
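
A minimal sketch of the new return shape, assuming the train() entry point in tuning/sft_trainer.py; the argument objects are hypothetical placeholders built from your tuning config (see the README).

```python
# Hedged sketch: train() now returns a (trainer, metadata) tuple.
from tuning import sft_trainer  # assumed import path

# model_args, data_args, train_args are hypothetical config objects; see the
# README for how they are constructed.
trainer, metadata = sft_trainer.train(model_args, data_args, train_args)
print(metadata)  # additional metadata, returned as a dict
```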

Additional features and fixes

  1. Support for resuming tuning from an existing checkpoint. Refer to the README for how to use the flag; resume_training defaults to True.
  2. Addition of a default PAD token in the tokenizer when the EOS and PAD tokens are equal, to improve training quality (see the sketch after this list).
  3. JSON compatibility for input datasets. See docs for details on data formats.
  4. Fix to not resize the embedding layer by default; the embedding layer can still be resized as needed using the flag embedding_size_multiple_of.
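
The PAD-token behavior in item 2 can be illustrated with a short sketch; this is not the library's exact code, and "<PAD>" plus the gpt2 tokenizer (which ships without a distinct PAD token) are illustrative choices.

```python
# Hedged sketch: give the tokenizer a distinct PAD token when it is missing
# or collides with EOS, mirroring the behavior described above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # example tokenizer
if tokenizer.pad_token is None or tokenizer.pad_token == tokenizer.eos_token:
    tokenizer.add_special_tokens({"pad_token": "<PAD>"})  # assumed token string
print(tokenizer.pad_token)
```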

Full Changelog: v1.2.2...v2.0.1

v2.0.0

30 Sep 21:03
3b150ab

This version has an outdated dependency; users should move to v2.0.1 instead.

v2.0.0-rc.2

27 Sep 23:08
a37f074
Pre-release

What's Changed

  • fix: check for wte.weight along with embed_tokens.weight by @willmj in #356

Full Changelog: v2.0.0-rc.1...v2.0.0-rc.2