Releases: pytorch/text
TorchText 0.18
Warning: TorchText development is stopped and the 0.18 release will be the last stable release of the library.
This release is compatible with PyTorch 2.3.0. There are no new features added.
TorchText 0.17.2 Release
This release is compatible with the PyTorch 2.2.2 patch release. There are no new features added.
TorchText 0.17.1 Release
This release is compatible with the PyTorch 2.2.1 patch release. There are no new features added.
TorchText 0.17.0 Release
This release is compatible with PyTorch 2.2.0. There are no new features added.
TorchText 0.16.2 Release
This is a patch release, which is compatible with PyTorch 2.1.2. There are no new features added.
TorchText 0.16.1
This is a patch release, which is compatible with PyTorch 2.1.1. There are no new features added.
TorchText 0.16
Current status
As of September 2023 we have paused active development of TorchText because our focus has shifted away from building out this library offering. We will continue to release new versions but do not anticipate any new feature development as we figure out future investments in this space.
Bug Fixes
- Update links to multi30k dataset since original servers are down (#2194)
- Use filelock to block on concurrent model downloads (#2166)
New Features
- Add support for __contains__ for Vectors class (#2144)
- Add generation utility support to T5Bundle (#2146)
- Add option to ignore UTF-8 decoding error to scripted tokenizer (#2134)
- Add shift-right method to T5 model (#2131)
- Add XLMR and RoBERTa transforms as factory functions (#2102)
- Make sure to include padding mask in generation (#2096)
- (Prototype) Add top-p and top-k sampling (#2137)
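To illustrate what top-k and top-p (nucleus) sampling do in principle, here is a minimal pure-Python sketch of the two filtering steps. It is not TorchText's implementation: the function names (`softmax`, `filter_top_k_top_p`) and the plain-list representation are assumptions made for this example only.

```python
import math
import random
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(
    probs: List[float],
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
) -> List[float]:
    """Zero out tokens outside the top-k / nucleus set, then renormalize.

    Hypothetical helper for illustration; not part of TorchText.
    """
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")

    # Token ids ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens.
    if top_k is not None:
        order = order[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        order = kept

    keep = set(order)
    total = sum(probs[i] for i in order)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


# Usage: filter a toy distribution, then draw one token id from what remains.
probs = softmax([2.0, 1.0, 0.5, -1.0])
filtered = filter_top_k_top_p(probs, top_k=3, top_p=0.9)
rng = random.Random(0)
token_id = rng.choices(range(len(filtered)), weights=filtered, k=1)[0]
```

Applying top-k before top-p, as done here, is one common ordering; a library may combine the two differently.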
TorchText 0.15.2 Release
This is a minor release, which is compatible with PyTorch 2.0.1. There are no new features added.
v0.15.1
Highlights
In this release, we add a new model architecture along with pre-trained weights, increase flexibility in our tokenizers, and improve the overall stability of the library.
- Added T5 & Flan-T5 model architecture with pre-trained weights
- Added DistilRoBERTa
- Added tutorial showing T5 in action
- Added prototype GenerationUtils
Models
TorchText expanded its models to include T5, Flan-T5, and DistilRoBERTa, along with the corresponding pre-trained model weights. These additions represent both the smallest and largest models available in TorchText to date, as well as the first encoder/decoder model, T5. As usual, all models are TorchScriptable.
Utils
Since TorchText now has encoder/decoder models available, we prototyped GenerationUtils to provide generic decoding capabilities for encoder/decoder and decoder-only models.
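As a rough picture of what "generic decoding" means, the following sketch shows a greedy decoding loop that works with any function returning next-token scores. It does not use or reproduce the GenerationUtils API: `greedy_decode`, `toy_step`, and the token ids are hypothetical names and values chosen for this example.

```python
from typing import Callable, List


def greedy_decode(
    step_fn: Callable[[List[int]], List[float]],
    start_id: int,
    eos_id: int,
    max_len: int = 20,
) -> List[int]:
    """Repeatedly append the highest-scoring next token until EOS or max_len.

    step_fn stands in for a model: it maps the tokens generated so far
    to one score per vocabulary entry.
    """
    tokens = [start_id]
    for _ in range(max_len):
        scores = step_fn(tokens)
        next_id = max(range(len(scores)), key=lambda i: scores[i])
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def toy_step(tokens: List[int]) -> List[float]:
    """Hypothetical stand-in model over a 5-token vocabulary.

    It always prefers (last token + 1), capped at id 4, which this
    example treats as the end-of-sequence token.
    """
    vocab_size = 5
    target = min(tokens[-1] + 1, vocab_size - 1)
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


decoded = greedy_decode(toy_step, start_id=0, eos_id=4)
print(decoded)  # [0, 1, 2, 3, 4]
```

Because the loop depends only on `step_fn`, the same structure applies whether the scores come from an encoder/decoder model or a decoder-only model.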
Improvements
Features
- Add DistilRoBERTa to OSS (#1998)
- Beginning of GenerationUtils (#2011)
- Add Flan-T5 architecture (#2027)
- Optimize T5 for sequence generation (#2054)
- Add bundles for FLAN-T5 (#2061)
- Promote T5 and variants (#2064)
- Fixup generation utils for prototype release (#2065)
CI (Migrate from CircleCI to GitHub Actions)
- Remove CUDA binary builds (#1994)
- Remove Linux and MacOS unit tests from CircleCI (#1993)
- Validate binaries for nightly/release testing (#2010)
- Rename variable to avoid conflict with PIP system variable PIP_PREFIX (#2015, #2016)
- Refactor validation using MATRIX vars (#2021)
- Migrate validation workflows to test-infra (#2022)
- 3.11 Windows Wheels Support in CircleCI (#2053)
- Adding RC triggers for all build jobs (#2057)
- Add windows 3.11 conda (#2063)
- Channel=test for build matrix generation (#2066)
- Turn off CircleCI 3.11 unit tests (#2078)
- Fix validation workflow for test channel (#2071)
- Modify integration test workflow to use PyTorch generic CI job (#2051)
Bug Fixes
- Change read_from_tar call to load_from_tar (#1997)
- Update Multi30k test dataset hash (#2003)
- Fix device setting for T5 Model (#2007)
- Fix overwite typo (#2006)
- Fix linting error (#2019)
- Fix memory leak with C++ RegEx operator (#2024)
- Fix CodeQL workflow failure (#2046)
- Fix UTF8 decoding error in GPT2BPETokenizer decode method (#2092)
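For context on the UTF-8 decoding fix above: a byte-level tokenizer can produce a byte sequence that ends in the middle of a multi-byte character, and strict UTF-8 decoding then fails. The sketch below reproduces that failure mode with Python's built-in `bytes.decode` and its standard `errors` handlers. It is an assumption-level illustration of the problem, not the change made in #2092, which may handle the error differently.

```python
# "é" is encoded as two bytes in UTF-8, so "café" is 5 bytes long.
raw = "café".encode("utf-8")

# Simulate a byte-level split that cuts the last character in half.
truncated = raw[:-1]

# Strict decoding (the default) raises on the incomplete sequence.
try:
    truncated.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient handlers keep going instead of raising.
replaced = truncated.decode("utf-8", errors="replace")  # 'caf' + U+FFFD
ignored = truncated.decode("utf-8", errors="ignore")    # 'caf'
```

Whether to replace or silently drop undecodable bytes depends on the application; replacing with U+FFFD keeps a visible marker that data was lost.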
Examples
- Update T5 tutorial for 2.0 release (#2080)
Documentation
Testing
- Replaced tabs w/ spaces to fix CodeMod (#1999)
- Add GPU testing for RoBERTa models (#2025)
- Add TorchData version to smoke tests (#2034)
- Update integration-test.yml (#2038)
- Update CUDA version on GPU test (#2040)
- Add prototype GPU tests for T5 (#2055)
- Install portalocker for testing (#2056)
- Test newly uploaded Flan-T5 weights (#2074)
Dependencies
- Add TorchData as a hard dependency (#1985)
Others
TorchText 0.14.1 Release
This is a minor release, which is compatible with PyTorch 1.13.1. There are no new features added.