FNALLPC/machine-learning-hats


CMS Machine Learning Hands-on Advanced Tutorial Session (HATS)

Introduction

This is a set of tutorials for the CMS Machine Learning Hands-on Advanced Tutorial Session (HATS). They are intended to show you how to build machine learning models in Python using xgboost, Keras, TensorFlow, and PyTorch, and how to use them in your ROOT-based analyses. We will build event-level classifiers that separate VBF Higgs from standard model background in four-muon events, and jet-level classifiers that separate boosted W boson jets from QCD jets, using boosted decision trees (BDTs) and dense and convolutional neural networks. We will also explore more advanced models such as graph neural networks (GNNs), variational autoencoders (VAEs), and generative adversarial networks (GANs) on simple datasets.

Setup

Purdue Analysis Facility (New and recommended!)

The recommended method for running the tutorials live is the Purdue AF; follow the instructions here.

Vanderbilt JupyterHub

Another option is the Vanderbilt JupyterHub; instructions are here.

FNAL LPC

The FNAL LPC is not as well supported, but instructions are here.

Locally

All these notebooks can be run on your local machine as well. It can often be useful to test your models and pipelines locally, but it is not recommended to run full trainings as these can be resource-intensive.

To run locally, run these commands from your terminal:

# Download the setup bash file for your machine from here https://github.com/conda-forge/miniforge#mambaforge
# e.g. wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
# Install: (the mamba directory can end up taking O(1-10GB) so make sure the directory you're using allows that quota)
chmod u+x Mambaforge-Linux-x86_64.sh
./Mambaforge-Linux-x86_64.sh  # follow instructions in the installation

git clone https://github.com/FNALLPC/machine-learning-hats/
cd machine-learning-hats
mamba env create -f environment.yml  # creates the environment defined in environment.yml
mamba activate machine-learning-hats
jupyter lab  # this will create a JupyterLab instance from which you can run all the notebooks.
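
Before opening the notebooks, it can help to confirm that the activated environment actually provides the libraries named in the introduction. The following is a minimal sketch, not part of the repository: the helper name `missing_packages` and the `REQUIRED` list are hypothetical, and the list assumes that `environment.yml` installs xgboost, Keras, TensorFlow, and PyTorch under their usual import names.

```python
import importlib.util

# Import names of the libraries named in the introduction. This list is an
# assumption about what environment.yml installs; adjust it as needed.
REQUIRED = ["xgboost", "keras", "tensorflow", "torch"]


def missing_packages(names):
    """Return the subset of `names` that cannot be located for import."""
    # find_spec only locates a top-level package; it does not import it,
    # so this check stays fast even for heavy libraries.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the `machine-learning-hats` environment active. If any package is reported missing, recreating the environment is likely the simplest fix.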

Binder

You can launch this repository in a Binder instance using the Binder badge, or launch a specific notebook by navigating to the rocket icon on the website and clicking the Binder option.

This may be more convenient, but it has not been well tested and the setup time can be slow.

Google Colab

Each notebook can also be launched in a Google Colab instance by clicking the "Google Colab" option in the menu bar above. To use this, you will have to install any extra libraries needed for the tutorial yourself and re-download the relevant datasets each time.
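
Because a Colab session starts without the datasets, a small download helper avoids fetching the same file twice within one session. This is a sketch under stated assumptions: `fetch_if_missing` is a hypothetical helper, not something the tutorials provide, and the URL in the usage comment is a placeholder, since the actual file URLs are not given here.

```python
import urllib.request
from pathlib import Path


def fetch_if_missing(url, dest):
    """Download `url` to `dest` unless the file already exists; return the path."""
    dest = Path(dest)
    if dest.exists():
        return dest  # already downloaded earlier in this session
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(dest))
    return dest


# Hypothetical usage; replace the placeholder with a real dataset file URL:
# fetch_if_missing("https://example.org/path/to/dataset.root", "data/dataset.root")
```

The helper only checks whether the destination file exists, so a partially downloaded file would need to be deleted by hand before retrying.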

Links

The indico page is: https://indico.cern.ch/event/1444116/

The Mattermost for live support is: https://mattermost.web.cern.ch/cms-exp/channels/hatslpc-2024

The datasets we will use are located here: DOI

Credits

This project was created using the excellent open source Jupyter Book project and the executablebooks/cookiecutter-jupyter-book template.