This guide will help you understand how to convert AI models into different formats that our applications can use.
We convert LLMs hosted on Hugging Face into specialized formats (GGUF, TensorRT, ONNX) so they can run in our applications, Jan and Cortex. Think of this like converting a video file from one format to another so it can play on different devices.
This step creates a model repository on Cortexso's Hugging Face account and generates two files: `model.yml` (the model's configuration file) and `metadata.yml`.
- Visit: https://github.com/janhq/models/actions/workflows/create-model-yml.yml
- Click the **Run workflow** dropdown
- Fill in the required information:
  - `model_name`: Name of the model to create (used in the repo name and generated files)
  - `prompt_template`: Prompt template for the model (default: `<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n`)
  - `stop_tokens`: Stop tokens for the model, comma-separated (e.g., `</s>`; default: `<|im_end|>`)
  - `engine`: Engine to run the model (default: `llama-cpp`)
- Click the **Run workflow** button to start the conversion
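The steps above can also be scripted. As a sketch, the same workflow can be triggered through GitHub's `workflow_dispatch` REST endpoint; the input names mirror the form fields above, while the `ref` value (`main`) and the helper function are assumptions, not something this guide specifies. The snippet only builds the request URL and body; actually sending it would require an authenticated POST.

```python
import json

OWNER_REPO = "janhq/models"
WORKFLOW = "create-model-yml.yml"

def build_dispatch(model_name: str, engine: str = "llama-cpp",
                   stop_tokens: str = "<|im_end|>") -> tuple[str, bytes]:
    """Build the (url, body) pair for a workflow_dispatch POST.

    Hypothetical helper: input names come from the workflow form above;
    the branch ref is an assumption.
    """
    url = (f"https://api.github.com/repos/{OWNER_REPO}"
           f"/actions/workflows/{WORKFLOW}/dispatches")
    body = json.dumps({
        "ref": "main",  # assumed default branch
        "inputs": {
            "model_name": model_name,
            "engine": engine,
            "stop_tokens": stop_tokens,
        },
    }).encode()
    return url, body

url, body = build_dispatch("tinyllama")
print(url)
```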
Common Errors:
- **Wrong stop tokens**: If the model keeps generating too much text, check the `stop` tokens in `model.yml`
- **Engine errors**: Make sure you picked the right engine type in `model.yml`
- **Template issues**: Double-check your `prompt_template` if the model gives weird outputs
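For orientation, here is a hypothetical sketch of the `model.yml` fields those errors refer to. The exact schema is defined by Cortex, so treat the field names and layout below as illustrative assumptions, not the authoritative format:

```yaml
# Illustrative sketch only -- consult the generated model.yml for the real schema
name: tinyllama
engine: llama-cpp            # wrong value here -> "Engine errors"
prompt_template: |           # wrong template -> weird outputs
  <|im_start|>system
  {system_message}<|im_end|>
  <|im_start|>user
  {prompt}<|im_end|>
  <|im_start|>assistant
stop:                        # wrong stop tokens -> runaway generation
  - <|im_end|>
```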
- Visit: https://github.com/janhq/models/actions/workflows/convert-model-all-quant.yml
- Choose your conversion type:
  - For GGUF format: Click **Convert model to gguf with specified quant**
  - For TensorRT: Coming soon
  - For ONNX: Coming soon
- Click the **Run workflow** dropdown
- Fill in the required information:
  - `source_model_id`: The source Hugging Face model ID (e.g., `meta-llama/Meta-Llama-3.1-8B-Instruct`)
  - `source_model_size`: The model size (e.g., `8b`)
  - `target_model_id`: The target Hugging Face model ID
  - `quantization_level`: Quantization level (e.g., `q4-km`), or `all` for all levels
- Click the **Run workflow** button to start the conversion
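To see what the quantization level means in practice, here is a rough size estimate. The bits-per-weight figures are approximations I am assuming for illustration (actual GGUF sizes vary with metadata and per-tensor choices); the point is only that a q4 variant is roughly a quarter the size of f16.

```python
# Approximate bits per weight for a few GGUF quantization levels.
# These numbers are assumptions for illustration, not exact values.
APPROX_BITS_PER_WEIGHT = {
    "q4-km": 4.5,   # mixed 4/6-bit blocks
    "q8-0": 8.5,
    "f16": 16.0,
}

def estimate_gb(params_billion: float, quant: str) -> float:
    """Rough weight-file size in GB for a model of the given parameter count."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return params_billion * 1e9 * bits / 8 / 1e9

print(round(estimate_gb(8, "q4-km"), 1))  # an 8b model at q4-km ~ 4.5 GB
```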
- Watch the conversion process (it looks like a loading bar)
- If you see any errors ❌:
- Click on the failed step to see what went wrong
- Create a "Bug Report" on GitHub
- Contact Rex or Alex for help
After the conversion completes:
- Check the Hugging Face page to make sure all files are there
- Test the model in Cortex:
  - Make sure it generates text properly
  - Check that it stops generating when it should
For instructions on installing and running the model with Cortex, refer to the official documentation.
- Navigate to the newly created model repository on Cortexso's Hugging Face account
- Open the repository and select "Create Model Card"
- Use the template below for your model card, replacing the example content with your model's information:
---
license: your_model_license
---
## Overview
[Provide a brief description of your model, including its key features, use cases, and performance characteristics]
## Variants
| No | Variant | Cortex CLI command |
| --- | --- | --- |
| 1 | [variant_name](variant_url) | `cortex run model_name:variant` |
| 2 | [main/default](main_url) | `cortex run model_name` |
## Use with Jan (UI)
1. Install **Jan** from the [Quickstart Guide](https://jan.ai/docs/quickstart)
2. In Jan model Hub, enter:
```
cortexso/your_model_name
```
## Use with Cortex (CLI)
1. Install **Cortex** using the [Quickstart Guide](https://cortex.jan.ai/docs/quickstart)
2. Run the model with:
```
cortex run your_model_name
```
## Credits
- **Author:** [Original model creator]
- **Converter:** [Converting organization/person]
- **Original License:** [Link to original license]
- **Papers/References:** [Relevant papers or documentation]
- Review and verify:
- Model license is correctly specified
- All URLs are valid and point to the correct resources
- Model names and commands are accurate
- Click "Commit changes" to save the model card