
v6.4 - stream responses

@BBC-Esq released this 03 Aug 18:03

Improvements

  • All "local models" now stream their responses for a better user experience (a streaming sketch follows this list).
  • Various small improvements.
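
For reference, here is a minimal sketch of how response streaming can work with the Hugging Face transformers library; the model id and generation settings are illustrative assumptions, not the project's exact code.

```python
# Minimal streaming sketch using transformers' TextIteratorStreamer.
from threading import Thread

from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_id = "h2oai/h2o-danube3-4b-chat"  # illustrative; any local chat model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain streaming generation briefly.", return_tensors="pt").to(model.device)
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# generate() blocks until it finishes, so run it in a thread
# and consume decoded text chunks as they become available.
thread = Thread(target=model.generate, kwargs={**inputs, "streamer": streamer, "max_new_tokens": 256})
thread.start()
for chunk in streamer:
    print(chunk, end="", flush=True)  # in the app, this would update the UI incrementally
thread.join()
```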

Local Models

  • Fixed Dolphin Phi3-Medium
  • Added Yi 1.5 - 6b
  • Added H2O Danube3 - 4b
    • Great quality small model.
  • Removed Mistral v0.3 - 7b
    • The model is gated, which makes it difficult to implement in a program, and plenty of other good models are available.
  • Removed Llama 3.1 - 8b
    • Removed for the same reasons as Mistral v0.3.
  • Added Internlm 2.5 - 7b
  • Fixed Dolphin-Mistral-Nemo

Vision Models

  • Added Falcon-vlm - 11b
    • Great quality. Uses Llava 1.6's processor (see the loading sketch below).
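
Since Falcon-vlm reuses Llava 1.6's processor, loading it through the LlavaNext classes in transformers might look like the sketch below; the repo id tiiuae/falcon-11B-vlm, the image URL, and the prompt template are assumptions.

```python
# Hedged sketch: loading Falcon-vlm via the Llava 1.6 (LlavaNext) classes.
import requests
from PIL import Image
from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor

repo_id = "tiiuae/falcon-11B-vlm"  # assumed Hugging Face repo id
processor = LlavaNextProcessor.from_pretrained(repo_id)
model = LlavaNextForConditionalGeneration.from_pretrained(repo_id, device_map="auto")

image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
prompt = "USER: <image>\nDescribe this image. ASSISTANT:"  # template may differ per model
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```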

Falcon-vlm, Llava 1.6 Vicuna - 7b, and Llava 1.6 Vicuna - 13b have arguably surpassed Cogvlm and are faster while using less VRAM. Thus, Cogvlm may be deprecated in the future.

Misc.

  • Most, but not all, models now download to the Models folder so you can take the folder with you. Making every model do so is a work in progress; the goal is to carry the program plus all necessary files on a flash drive (a download sketch follows).
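
One common way to pin downloads to a local Models folder is huggingface_hub's snapshot_download; the folder layout here is an assumption based on the note above, not the project's exact implementation.

```python
# Sketch: download a model into a Models folder that travels with the program.
from pathlib import Path

from huggingface_hub import snapshot_download

models_dir = Path(__file__).parent / "Models"  # assumed layout: Models/ next to the program
local_path = snapshot_download(
    repo_id="h2oai/h2o-danube3-4b-chat",  # illustrative model
    local_dir=models_dir / "h2o-danube3-4b-chat",
)
print(local_path)  # all files for this model now live under ./Models
```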

Current Chat and Vision Models

(Charts: current chat models and current vision models.)