This tutorial has been developed for the VI Pyrenees winter school (Feb. 2023) of quantum information.
If you’re a participant of the school, please see material below.
If you’re not a participant of the school, you’re probably seeing this content after the actual hands-on tutorial has taken place. Even so, you may find it useful to visit the tutorial page and read through the completed walkthrough to learn about unsupervised learning and how to train a GPT-like architecture.
First, we start by introducing the basics of unsupervised learning and generative models. We explain the main concepts behind language models and how to train them, such as the tokenization process and the chain rule of probability.
Note
To prepare this tutorial, we have used some of the content from our machine learning course.
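As a quick illustration of the chain rule mentioned above, the probability of a whole token sequence factorizes into per-token conditionals, p(w1, …, wn) = p(w1) · p(w2 | w1) · … · p(wn | w1…wn−1). The sketch below uses made-up conditional probabilities for a toy three-word sequence:

```python
import math

# Toy conditional probabilities p(token | prefix) for the sequence
# ["the", "cat", "sat"]; the numbers are invented for illustration only.
conditionals = [0.1,   # p("the")
                0.05,  # p("cat" | "the")
                0.2]   # p("sat" | "the", "cat")

# Chain rule: the joint probability is the product of the conditionals.
joint = math.prod(conditionals)

# In practice, language models sum log-probabilities to avoid underflow.
log_joint = sum(math.log(p) for p in conditionals)

print(joint)                                      # 0.001
print(math.isclose(math.exp(log_joint), joint))   # True
```

Working in log space is exactly what the cross-entropy loss does during training.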
Then, we build a bigram language model, which lets us introduce the concept of an embedding table and the typical training loop. This model serves both as a baseline and a warm-up.
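A bigram model along these lines can be sketched in a few lines of PyTorch (a minimal illustration, not the tutorial's exact code): the embedding table stores, for each current token, the logits over the next token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BigramLM(nn.Module):
    """Bigram language model: next-token logits come straight from a table."""

    def __init__(self, vocab_size):
        super().__init__()
        # Row i of the table holds the logits for the token following token i.
        self.logits_table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx):
        # idx: (batch, time) token ids -> (batch, time, vocab_size) logits
        return self.logits_table(idx)

vocab_size = 8
model = BigramLM(vocab_size)
x = torch.randint(0, vocab_size, (2, 5))   # random batch of token ids
y = torch.randint(0, vocab_size, (2, 5))   # next-token targets

logits = model(x)
loss = F.cross_entropy(logits.view(-1, vocab_size), y.view(-1))
loss.backward()                            # one step of the usual training loop
print(logits.shape)                        # torch.Size([2, 5, 8])
```

The training loop then just repeats forward pass, loss, backward pass, and an optimizer step.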
Finally, we proceed to learn about the specific inner workings of GPT. We start by introducing the transformer architecture and the self-attention mechanism. Then, we show how to use causal self-attention (also called masked self-attention) to train our model to generate new text. Lastly, we dive into details such as skip connections and layer normalization.
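The key trick in causal self-attention is a lower-triangular mask that stops each position from attending to future tokens. Here is a minimal single-head sketch on toy tensors (illustrative only; the tutorial builds this up step by step):

```python
import torch

T, d = 4, 8                          # sequence length, head size
q = torch.randn(T, d)                # queries
k = torch.randn(T, d)                # keys
v = torch.randn(T, d)                # values

scores = q @ k.T / d**0.5            # scaled dot-product attention scores
mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
scores = scores.masked_fill(~mask, float("-inf"))   # hide the future

weights = torch.softmax(scores, dim=-1)  # each row sums to 1
out = weights @ v                        # (T, d) attended values

# Position 0 has no past, so it can only attend to itself:
print(weights[0])                        # tensor([1., 0., 0., 0.])
```

Setting the masked scores to −∞ before the softmax makes their attention weights exactly zero, which is what lets the model be trained on all positions in parallel while still generating left to right.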
To finish, we train a larger model that successfully learns to count! :)
We provide two main notebooks that can run in colab:
- nbs/tutorial.ipynb is ready to be filled in during the session
- nbs/metatutorial.ipynb is the already completed tutorial
To follow along with the tutorial, you need the first notebook. Ideally, you can simply go to its page and run it on Google Colab. However, since the wifi here is unstable, you can also clone (or simply download) the repository to your computer.
If you want to follow along, make sure you have PyTorch installed. We won’t use any fancy features, so you can simply run `conda install pytorch -c pytorch` to install it with conda, or `pip install torch` to install it with pip. In case you have a GPU, find your installed CUDA version (e.g. 11.7) and install with `conda install pytorch pytorch-cuda=11.7 -c pytorch -c nvidia`. Follow their getting started page in case of doubt.
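Once installed, a quick sanity check like the following (not part of the tutorial notebooks) confirms that PyTorch works and reports whether a GPU is visible:

```python
import torch

print(torch.__version__)            # installed PyTorch version
print(torch.cuda.is_available())    # True if a CUDA GPU is usable

x = torch.randn(3, 3)
print((x @ x.T).shape)              # torch.Size([3, 3])
```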