Michael Hu michaelnny

RL is the king

32 followers · 6 following

Shanghai
www.vectortheta.com

Pinned

alpha_zero (Public)

A PyTorch implementation of DeepMind's AlphaZero agent to play Go and Gomoku board games

Python · 89 stars · 18 forks

deep_rl_zoo (Public archive)

A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.

Python · 106 stars · 11 forks

muzero (Public archive)

A PyTorch implementation of DeepMind's MuZero agent

Python · 28 stars · 3 forks

Llama3-FunctionCalling (Public)

Fine-tune the Llama3 model to support function calling

Jupyter Notebook · 29 stars · 1 fork

InstructLLaMA (Public)

Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to Instru…

Jupyter Notebook · 47 stars · 9 forks

QLoRA-LLM (Public archive)

A simple custom QLoRA implementation for fine-tuning a large language model (LLM) with basic tools such as PyTorch and Bitsandbytes, completely decoupled from Hugging Face.

Python · 6 stars · 1 fork