A light implementation of the 2017 Google paper 'Attention Is All You Need'. BETSI is the model's name, a recursive acronym standing for BETSI: English To Shitty Italian, since the training time I allowed on my graphics card was not enough for amazing results.
This implementation translates from English to Italian, as Transformer models are exceptional at language translation and this seems to be a common choice for light implementations of this paper.
The dataset I will be using is the opus_books dataset, a collection of copyright-free books. The book content of these translations is free for personal, educational, and research use (see the OPUS language resource paper).
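For reference, here is a minimal sketch of pulling the English-Italian pairs. This assumes the Hugging Face `datasets` library, which is the usual way to load opus_books; the actual loading code in this repo may differ:

```python
# Minimal sketch: load the English-Italian split of opus_books.
# Assumes the Hugging Face `datasets` package; "en-it" is the config
# name for this language pair on the Hub.
from datasets import load_dataset

ds = load_dataset("opus_books", "en-it", split="train")
print(ds[0]["translation"])  # e.g. {'en': '...', 'it': '...'}
```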
I'm creating notes as I go, which can be found in NOTES.md.
There is a requirements.txt listing the packages needed to run this. I used PyTorch with ROCm, as this sped up training A LOT: training this model on CPU on my laptop takes around 5.5 hours per epoch, while training on GPU on my desktop takes around 13.5 minutes (24.4 times faster!).
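A handy detail here: ROCm builds of PyTorch expose the GPU through the same `torch.cuda` API as NVIDIA builds, so the standard device check works unchanged. A sketch only, not the exact code in this repo:

```python
import torch

# With a ROCm wheel installed, torch.cuda.is_available() reports the AMD GPU,
# so the same one-liner picks the GPU on the desktop and the CPU on the laptop.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Training on: {device}")
```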
- Input Embeddings
- Positional Encoding (see the sketch after this list)
- Layer Normalization - Due by 11/1
- Feed Forward
- Multi-Head Attention
- Residual Connection
- Encoder
- Decoder - Due by 11/8
- Linear Layer
- Transformer
- Tokenizer - Due by 11/15
- Dataset
- Training Loop
- Visualization of the model - Due by 11/22
- Install AMD ROCm to train with GPU - Attempt to do by the end
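As a taste of what these pieces look like, here is a minimal sketch of the sinusoidal positional encoding from Section 3.5 of the paper. This is my own sketch, assuming an even d_model; the module in this repo may be organized differently:

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Sinusoidal positional encoding from 'Attention Is All You Need'."""

    def __init__(self, d_model: int, max_len: int = 5000, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        position = torch.arange(max_len).unsqueeze(1)            # (max_len, 1)
        # Frequencies 10000^(-2i/d_model) for each even dimension index 2i
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)             # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)             # odd dimensions
        self.register_buffer("pe", pe.unsqueeze(0))              # (1, max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); add the encoding for the first seq_len positions
        x = x + self.pe[:, : x.size(1)]
        return self.dropout(x)

# Usage: the encoding is added to the token embeddings before the first encoder layer.
enc = PositionalEncoding(d_model=512)
out = enc(torch.zeros(2, 10, 512))   # (batch=2, seq_len=10, d_model=512)
```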