rhlf

Here are 2 public repositories matching this topic...

michaelnny / InstructLLaMA

Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructGPT or ChatGPT but on a much smaller scale.

ppo rhlf instructgpt qlora llam2 4bit-fine-tune

Updated Mar 9, 2024
Jupyter Notebook

anas-zafar / LLM-Survey

The official GitHub page for the survey paper "Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects"

natural-language-processing rhlf pre-trained-language-models large-language-models llms generative-ai chatgpt vision-language-model

Updated Nov 19, 2024
