A GPT from scratch in a single file. Optimized for readability and learnability.

veezbo/single_file_gpt

single_file_gpt

A character-level GPT from scratch in a single file gpt.py.

Optimized for readability and learnability.

features

  • single file
  • as readable as possible
  • comments for learnings and common errors
  • type annotations with runtime type checking
  • working code that trains on text and generates text like it

demo

We train a character-level GPT on a small corpus of Shakespearean English taken from the plays.
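
"Character-level" means each distinct character in the corpus is one token. The following is a minimal sketch of that idea, not the code from gpt.py: the toy `text` and the names `stoi`, `itos`, `encode`, and `decode` are assumptions made for illustration.

```python
# Hypothetical stand-in corpus; the real training text is much larger.
text = "To be, or not to be"

# The vocabulary is the sorted set of distinct characters in the corpus.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> character


def encode(s: str) -> list[int]:
    """Map a string to a list of token ids, one id per character."""
    return [stoi[ch] for ch in s]


def decode(ids: list[int]) -> str:
    """Map a list of token ids back to the string they came from."""
    return "".join(itos[i] for i in ids)


ids = encode("to be")
roundtrip = decode(ids)
```

Encoding followed by decoding returns the original string, and the vocabulary size is simply the number of distinct characters, which keeps the model's embedding table small.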

After training, the same model generates new text that reproduces the style and syntax of the input.
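
Generation of this kind is autoregressive: the model gives a distribution over the next character, one character is sampled and appended, and the loop repeats. The sketch below shows only that loop. A bigram count table stands in for a trained transformer, and `next_char_probs`, `generate`, and the toy corpus are hypothetical names rather than anything from gpt.py.

```python
import random
from collections import Counter, defaultdict

text = "to be or not to be"  # hypothetical toy corpus

# Stand-in for a trained model: count which character follows which.
counts: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1


def next_char_probs(context: str) -> dict[str, float]:
    """Hypothetical model: next-character distribution given the context.

    A real GPT attends to the whole context; this stand-in uses only
    the last character.
    """
    followers = counts[context[-1]]
    total = sum(followers.values())
    return {ch: n / total for ch, n in followers.items()}


def generate(prompt: str, max_new_chars: int, seed: int = 0) -> str:
    """Sample one character at a time and append it to the running text."""
    rng = random.Random(seed)
    out = prompt
    for _ in range(max_new_chars):
        probs = next_char_probs(out)
        if not probs:  # the last character never had a follower in the corpus
            break
        chars, weights = zip(*probs.items())
        out += rng.choices(chars, weights=weights, k=1)[0]
    return out


sample = generate("to", max_new_chars=20)
```

Replacing `next_char_probs` with a softmax over a network's logits leaves the loop itself unchanged, which is why the same trained model can be reused for generation.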

dependencies

python >= 3.10
torch >= 2.0
jaxtyping==0.2.28
beartype>=0.15.0

install

pip install -r requirements.txt

run

python gpt.py

typing

This project uses jaxtyping for type annotations and runtime type checking.

This is what the type annotations look like:

import torch
from torch import Tensor
from jaxtyping import Float, Int

def func(x: Float[Tensor, "A B C"]) -> Int[Tensor, ""]:
    # x.shape[0] is a plain Python int, so wrap it in a 0-dim tensor
    # to match the annotated return type
    return torch.tensor(x.shape[0])

Float[Tensor, "A B C"]  # float tensor with shape (A, B, C)
Int[Tensor, ""]  # int scalar (0-dim) tensor
func  # function that takes in a float tensor with shape (A, B, C) and returns an int scalar tensor

contributing

All contributions in the form of confusions, concerns, suggestions, or improvements are welcome!

future

  • add type annotations for all variables inside functions as well, once jaxtyping supports this well (see this issue)

acknowledgements

This repo is heavily influenced by Andrej Karpathy's nanoGPT.

license

License: MIT
