Releases: BBC-Esq/VectorDB-Plugin-for-LM-Studio
v6.7.0 - LONG CONTEXT no see!
General Updates
- CITATIONS! Responses now include hyperlinked citations when searching the Vector DB.
- Display of a chat model's max context and how many tokens you've used.
2X Speed Increase
Choose "half" in the database creation settings. It will automatically choose `bfloat16` or `float16` based on your GPU. This results in a 2x speed increase with extremely low loss in quality.
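The dtype choice can be sketched like this. This is a minimal illustration assuming selection by CUDA compute capability (Ampere, i.e. 8.x and newer, supports bfloat16 natively); the plugin's actual logic may differ:

```python
def choose_half_dtype(compute_capability):
    """Pick a 16-bit dtype from the GPU's CUDA compute capability:
    Ampere (8.x) and newer support bfloat16 natively; older GPUs
    fall back to float16."""
    major, _minor = compute_capability
    return "bfloat16" if major >= 8 else "float16"

print(choose_half_dtype((8, 6)))  # RTX 30-series -> bfloat16
print(choose_half_dtype((7, 5)))  # RTX 20-series -> float16
```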
Chat Models
- Removed `Internlm2_5 - 1.8b` and `Qwen 1.5 - 1.6b` as underperforming.
- Removed `Dolphin-Llama 3 - 8b` and `Internlm2 - 20b` as superseded.
- Added `Danube 3 - 4b` with 8k context.
- Added `Phi 3.5 Mini - 4b` with 8k context.
- Added `Hermes-3-Llama-3.1 - 8b` with 8k context.
- Added `Internlm2_5 - 20b` with 8k context.
The following models now have 8192 context:

| Model Name | Parameters (billion) | Context Length |
|---|---|---|
| Danube 3 - 4b | 4 | 8192 |
| Dolphin-Qwen 2 - 1.5b | 1.5 | 8192 |
| Phi 3.5 Mini - 4b | 4 | 8192 |
| Internlm2_5 - 7b | 7 | 8192 |
| Dolphin-Llama 3.1 - 8b | 8 | 8192 |
| Hermes-3-Llama-3.1 - 8b | 8 | 8192 |
| Dolphin-Qwen 2 - 7b | 7 | 8192 |
| Dolphin-Mistral-Nemo - 12b | 12 | 8192 |
| Internlm2_5 - 20b | 20 | 8192 |
Text to Speech Models
- Excited to add additional models to choose from when using `whisperspeech` as the text-to-speech backend - see the chart below for the various `s2a` and `t2s` model combinations and "relative" compute times along with real VRAM usage stats.
Current Chat and Vision Models
v6.6.0 - 8192 CONTEXT!
General Updates
- Ensured that vector model pulldown menu auto-updates.
- Made the vector model pulldown menu more descriptive.
Local Models
- Added `Internlm v 2.5 1.8b`. In the last release, version 2.0 of Internlm's 1.8b model was removed. However, the quality increased noticeably with their version 2.5, so I'm re-adding it.
Vector Models
- Excited to add `Alibaba-NLP/gte-base-en-v1.5` and `Alibaba-NLP/gte-large-en-v1.5`. These vector models have a context limit of 8192, which is automatically set within the program. With a conservative estimate of 3 characters per token, that means you can set the chunk size to approximately 24,576!
- Removed `Stella` as it was under-performing and too difficult to work with. There is no love lost, since the prior release marked it as "experimental" anyway.
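For reference, the chunk-size ceiling mentioned above is just the context limit multiplied by the characters-per-token estimate:

```python
def max_chunk_chars(context_limit_tokens, chars_per_token=3):
    """Conservative chunk-size ceiling in characters, assuming
    roughly 3 characters per token."""
    return context_limit_tokens * chars_per_token

print(max_chunk_chars(8192))  # 8192 tokens * 3 chars/token = 24576
```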
Current Chat and Vision Models
v6.5.0 - Llama 3.1 & MiniCPM v2
General updates
- Removed the `triton` dependency, as the `cogvlm` vision model is also removed.
- Redid all benchmarks with more accurate parameters.
Local Models
Overall, the large number of chat models was becoming unnecessary or redundant. Therefore, I removed models that weren't providing optimal responses to simplify the user's experience, and added Llama 3.1.
Removed Models
- Qwen 2 - 0.5b
- Qwen 1.5 - 0.5b
- Qwen 2 - 1.5b
- Qwen 2 - 7b - Redundant with Dolphin Qwen 2 - 7b
- Yi 1.5 - 6b
- Stablelm2 - 12b
- Llama 3 - 8b - Redundant with Dolphin Llama 3 - 8b
Added Models
- Dolphin Llama 3.1 - 8b
Vision Models
Overall, two vision models were removed as unnecessary, and `MiniCPM-V-2_6 - 8b` was added. As of the date of this release, `MiniCPM-V-2_6 - 8b` is the best model in terms of quality. I currently recommend using this model if you have the time and VRAM.
Removed Models
- cogvlm
- MiniCPM-Llama3
Vector Models
- Added `Stella_en_1.5B_v5`, which ranks very high on the leaderboard.
- Note: this is a work in progress, as the results currently seem sub-optimal.
Current Chat and Vision Models
v6.4 - stream responses
Improvements
- All "local models" now stream their responses for a better user experience.
- Various small improvements.
Local Models
- Fixed `Dolphin Phi3-Medium`.
- Added `Yi 1.5 - 6b`.
- Added `H2O Danube3 - 4b` - Great quality small model.
- Removed `Mistral v.03 - 7b` - The model is gated, so it's difficult to implement in a program. Plus, there are a plethora of other good models.
- Removed `Llama 3.1 - 8b` - Same as with Mistral.
- Added `Internlm 2.5 - 7b`.
- Fixed `Dolphin-Mistral-Nemo`.
Vision Models
- Added `Falcon-vlm - 11b` - Great quality. Uses Llava 1.6's processor.

`Falcon-vlm`, `Llava 1.6 Vicuna - 7b`, and `Llava 1.6 Vicuna - 13b` have arguably surpassed `Cogvlm` and are faster for less VRAM. Thus, `Cogvlm` may be deprecated in the future.
Misc.
- Most, but not all, models should now download to the `Models` folder so you can take your folder with you. FYI, ensuring that all models do so is a work in progress, the goal being to carry all of the necessary files + program on a flash drive.
Current Chat and Vision Models
v6.3.0 - whisper upgrade
NOTE
This release has been deleted a few times because of errors, but this one should work now....
Updates:
- Added the large-v3 whisper model and removed large-v2.
- Added all three distil whisper model sizes.
- Ensured that all whisper model files are downloaded to the `Models/whisper` folder in the source code folder.
- Added error handling in the metrics bar for if/when the numbers go over 100% - e.g., a model overflows the VRAM.
- Modified `gui.py` to specify the multiprocessing start method earlier in the script to avoid some errors.
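The start-method change can be sketched generically (this is an illustration of the pattern, not the actual `gui.py` code):

```python
import multiprocessing

# Set the start method once, as early in the script as possible --
# before any Pool or Process object is created. force=True avoids
# "context has already been set" errors if it was set elsewhere.
multiprocessing.set_start_method("spawn", force=True)

print(multiprocessing.get_start_method())  # spawn
```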
v6.2.3 - FAST installation
Uses the impressive `uv` library, written in Rust, for a 2x-4x speedup of `setup_windows.py`.

Make sure to run `pip install uv` first, as outlined in the updated installation instructions.
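The setup flow is roughly as follows (the second command is an assumption based on the script's name; see the repo's installation instructions for the exact steps):

```shell
# Install uv first; the setup script relies on it for fast installs.
pip install uv
python setup_windows.py
```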
v6.2.2 - Welcome LLAVA_NEXT
New Vector Models
Reintroducing these after an unduly long hiatus:
New Vision Models
Welcome `llava-next`, also known as `Llava 1.6`:
Other Changes
- Removed the `sentence-t5-xxl` vector model.
- Set batch sizes for all current vector models.
- Fixed a bug where chat model didn't automatically eject when the program's window was closed, thus preventing the command prompt from being returned to a user.
v6.2.1 - PERFECT install patch
Patch release to add dependencies accidentally missing from `setup_windows.py`. See the release notes for version 6.2.0 for more details on the release itself.
v6.2.0 - PERFECT installation
- Note: use the `setup_windows.py` script attached to this release instead, or check out release 6.2.1.
Breaking Changes
- Overhauled the installation procedure. Too many dependencies were creating conflicts that neither `pip`, `pip-compile`, nor any other approach I'm aware of could solve. Thus, `setup_windows.py` has been completely revamped to install on `Windows` + `Nvidia GPU` systems.
- It should now install every library needed EVERY SINGLE TIME without exceptions. The tradeoff is that it's slightly slower, which is no biggie.
- If I have time, I will re-incorporate an installation procedure for CPU-only systems.
New Chat Models
Other Changes
- Clean up Unneeded Portions of Scripts
- Disable PHI3 Mini Models Temporarily due to errors.
- Update the System Message for the Chat Models
- Upgraded to `transformers==4.43.1` and downgraded to `cuda==12.1` (to avoid errors).
Currently Supported Chat Models:
v6.1.0 - complexity growing!
Version 6.1
Stability-geared release.
Bug Fixes
- VectorDBs can now be created with images again and searched. `Sentence-transformers` was the main culprit.
- Solved the issue of the DB not being created by using `from_texts` instead of `from_documents` within the `TileDB` library.
- Massive improvement in stability when switching to/from "local models." This involved heavy troubleshooting of `multiprocessing`.
- Greatly improved the installation procedure - i.e., `setup_windows.py` and `requirements.txt`, which were responsible for a lot of conflicting dependencies and therefore random errors.
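The `from_texts` workaround can be illustrated with a minimal stand-in (the `Document` class and helper below are illustrative placeholders, not the plugin's or TileDB's actual code): `from_documents` takes document objects, while `from_texts` takes raw strings plus optional metadata, so the documents just need to be unpacked first.

```python
class Document:
    """Minimal stand-in for a LangChain-style document (illustrative only)."""
    def __init__(self, page_content, metadata=None):
        self.page_content = page_content
        self.metadata = metadata or {}

def docs_to_texts(docs):
    """Unpack Document objects into the (texts, metadatas) pair that a
    vector store's from_texts constructor expects, so from_texts can be
    used where from_documents misbehaves."""
    texts = [d.page_content for d in docs]
    metadatas = [d.metadata for d in docs]
    return texts, metadatas

docs = [Document("alpha", {"page": 1}), Document("beta", {"page": 2})]
texts, metadatas = docs_to_texts(docs)
print(texts)      # ['alpha', 'beta']
print(metadatas)  # [{'page': 1}, {'page': 2}]
```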
Regressions
- Temporarily commented out Phi3 (original) models to solve an inference issue, but `dolphin phi3` works fine.