This repo is a collection of PyTorch implementations of Transformer architectures with a simple, flexible config system. The goal is learning and easy experimentation.
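A minimal sketch of what such a flexible config might look like as a dataclass (the class name, fields, and defaults here are hypothetical illustrations, not the repo's actual API):

```python
from dataclasses import dataclass


# Hypothetical config sketch: the repo's real config class and field
# names may differ. Defaults are sized for quick char-level experiments.
@dataclass
class TransformerConfig:
    vocab_size: int = 65      # e.g. Shakespeare character-level vocabulary
    d_model: int = 128        # embedding / residual stream width
    n_heads: int = 4          # attention heads per layer
    n_layers: int = 4         # number of Transformer blocks
    d_ff: int = 512           # feed-forward hidden width
    dropout: float = 0.1
    max_seq_len: int = 256

    def __post_init__(self):
        # Attention splits d_model evenly across heads.
        assert self.d_model % self.n_heads == 0, "d_model must be divisible by n_heads"


# Override only the fields you want to experiment with:
cfg = TransformerConfig(d_model=256, n_heads=8)
```

Keeping every architectural knob in one small object like this makes it cheap to sweep variants without touching model code.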
Tests can be run with pytest from the root directory. There are also online Colabs that should exercise any new architecture added to the repo on Shakespeare character-level prediction.
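For example, from the repository root (this sketch also creates a throwaway test file so the invocation is self-contained; the repo's actual test paths may differ):

```shell
# Run the repo's full test suite from the root directory:
#   pytest
# Self-contained illustration: write a trivial test and run it with pytest.
cat > test_sanity.py <<'EOF'
def test_addition():
    assert 1 + 1 == 2
EOF
pytest -q test_sanity.py
```

`pytest -q` gives terse output; drop `-q` (or add `-k <pattern>`) when debugging a specific failure.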
In addition, each architecture and layer should be benchmarked for speed using: