Generates texts using an LLM fine-tuned on previous messages.
- Run `pip install -r requirements.txt`
- Follow the instructions below to initialize a model if needed
- Run `python3 texts_ai.py --help` for usage instructions
Both GPT-type and LLaMA-type models are supported. I wrote the prompts and query.py to match Vicuna (for LLaMA-type models) and a custom nanoGPT model (for GPT-type models). Instructions for configuring both are below, but the code should be hackable enough to work with a variety of models.
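For context, here is a rough sketch of the Vicuna-style conversation format that query.py targets. The system line and the `USER:`/`ASSISTANT:` role tags are assumptions based on FastChat's published v1.1 template, not necessarily the exact strings query.py emits; check query.py for the real ones.

```python
# A Vicuna v1.1-style prompt, sketched from FastChat's published template.
# The system line and role tags are assumptions; see query.py in this repo
# for the exact strings it uses.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_prompt(history: list[tuple[str, str]], user_msg: str) -> str:
    """Flatten prior (user, assistant) turns plus a new user message into one prompt."""
    parts = [SYSTEM]
    for user, assistant in history:
        parts.append(f"USER: {user} ASSISTANT: {assistant}</s>")
    parts.append(f"USER: {user_msg} ASSISTANT:")
    return " ".join(parts)
```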
## Fine-tuning with nanoGPT
- Clone [karpathy/nanoGPT](https://github.com/karpathy/nanoGPT)
- Run `training_messages.py` from this repo
- Edit `nanoGPT/data/shakespeare/prepare.py` to use the file written by query.py rather than Shakespeare (see the sketch after this list)
- Run `prepare.py`
- Follow the instructions from nanoGPT to fine-tune or train using the `train.bin` and `val.bin` files created by `prepare.py`
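For reference, a minimal sketch of what the edited `prepare.py` might look like. It assumes the message dump is a plain-text file named `messages.txt` (a hypothetical name; point it at whatever query.py actually writes) and otherwise follows nanoGPT's Shakespeare example: 90/10 split, GPT-2 BPE via tiktoken, and `train.bin`/`val.bin` written as flat uint16 arrays.

```python
# nanoGPT/data/shakespeare/prepare.py, edited to read a message dump
# instead of downloading Tiny Shakespeare. "messages.txt" is a
# hypothetical filename; use the file query.py writes.
import os
import numpy as np
import tiktoken

input_file_path = os.path.join(os.path.dirname(__file__), 'messages.txt')
with open(input_file_path, 'r', encoding='utf-8') as f:
    data = f.read()

# 90/10 train/val split, as in the original Shakespeare script
n = len(data)
train_data = data[:int(n * 0.9)]
val_data = data[int(n * 0.9):]

# Encode with the GPT-2 BPE tokenizer and dump to flat uint16 binaries,
# the format nanoGPT's train.py expects
enc = tiktoken.get_encoding("gpt2")
train_ids = np.array(enc.encode_ordinary(train_data), dtype=np.uint16)
val_ids = np.array(enc.encode_ordinary(val_data), dtype=np.uint16)
print(f"train has {len(train_ids):,} tokens; val has {len(val_ids):,} tokens")

train_ids.tofile(os.path.join(os.path.dirname(__file__), 'train.bin'))
val_ids.tofile(os.path.join(os.path.dirname(__file__), 'val.bin'))
```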
## Setting up Vicuna
- Run `brew install bash`
- Download `llama.sh` from this repo
- Modify `llama.sh` to download only the 7B weights, then run it with Homebrew's bash (see the snippet below)
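In the copies of this download script I've seen, a single variable controls which model sizes are fetched. The variable name is an assumption and may differ in your copy of `llama.sh`:

```bash
# In llama.sh: restrict the download to the 7B weights.
# MODEL_SIZE is an assumption based on common versions of the script.
MODEL_SIZE="7B"   # was something like "7B,13B,30B,65B"
```

Then run it with Homebrew's bash rather than the stock macOS bash (macOS ships bash 3.2, which lacks features newer scripts rely on):

```bash
"$(brew --prefix)/bin/bash" llama.sh
```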
- Convert the weights to Hugging Face format using the conversion script from transformers:
```
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights \
    --model_size 7B \
    --output_dir /output/path
```
- Apply the Vicuna delta weights to the converted LLaMA model:
```
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-7b \
    --target-model-path /path/to/output/vicuna-7b \
    --delta-path lmsys/vicuna-7b-delta-v1.1 \
    --low-cpu-mem
```
- Run blocks 2 and 3 of this script on the Vicuna bin, replacing `LLaMa` with `Llama`
- Run the conversion scripts from llama.cpp to convert the Vicuna `.pth` weights to f16 and then q4_0 (see the commands below)
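The exact script names depend on your llama.cpp checkout; the following is a sketch assuming a checkout contemporary with the Vicuna v1.1 deltas (spring 2023) and weights under a hypothetical `models/vicuna-7b/` directory. Newer llama.cpp versions replaced these scripts with `convert.py` and produce GGUF files instead.

```bash
# From the llama.cpp repo root. Script names and paths are assumptions
# for a spring-2023 checkout; check your version's README.
python3 convert-pth-to-ggml.py models/vicuna-7b/ 1   # 1 = f16 output

# Quantize f16 -> q4_0. Older builds took the numeric code 2 instead of "q4_0".
./quantize models/vicuna-7b/ggml-model-f16.bin \
           models/vicuna-7b/ggml-model-q4_0.bin q4_0
```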
This project is under the MIT License. Models may have their own licenses.