A character-level GPT written from scratch in a single file, gpt.py.
Optimized for readability and learnability.
- single file
- as readable as possible
- comments for learnings and common errors
- type annotations with runtime type checking
- working code that trains on text and generates text like it
We train a character-level GPT on a small corpus of Shakespearean English taken from plays.
After training, the same model generates new text that imitates the style and syntax of the input.
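As a rough illustration of what "character-level" means here, the sketch below builds a vocabulary from the distinct characters of a text and maps strings to integer ids and back. The names `stoi`, `itos`, `encode`, and `decode` are illustrative assumptions and are not necessarily the ones used in gpt.py.

```python
# Minimal character-level tokenizer sketch (names are hypothetical, not from gpt.py).
text = "To be, or not to be, that is the question."

# The vocabulary is simply the sorted set of distinct characters in the corpus.
chars = sorted(set(text))
vocab_size = len(chars)

# Lookup tables: character -> id and id -> character.
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}


def encode(s: str) -> list[int]:
    """Map each character of s to its integer id."""
    return [stoi[ch] for ch in s]


def decode(ids: list[int]) -> str:
    """Map a list of integer ids back to a string."""
    return "".join(itos[i] for i in ids)


ids = encode("to be")
roundtrip = decode(ids)
```

The model only ever sees the integer ids, so decoding the ids it produces yields generated text one character at a time.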
python >= 3.10
torch >= 2.0
jaxtyping==0.2.28
beartype>=0.15.0
pip install -r requirements.txt
python gpt.py
This project uses jaxtyping for type annotations and runtime type checking.
This is what the type annotations look like:
import torch
from torch import Tensor
from jaxtyping import Float, Int
def func(x: Float[Tensor, "A B C"]) -> Int[Tensor, ""]:
    # Wrap the size in a 0-dim tensor so the result matches Int[Tensor, ""]
    return torch.tensor(x.shape[0])
Float[Tensor, "A B C"] # float tensor with shape (A, B, C)
Int[Tensor, ""] # int scalar (0-dim) tensor
func # function that takes in a float tensor with shape (A, B, C) and returns an int scalar tensor
All contributions are welcome, whether questions, concerns, suggestions, or improvements!
- add type annotations for variables inside functions as well, once this is well supported by jaxtyping (see this issue)
This repo is heavily influenced by Andrej Karpathy's nanoGPT.