Add support for Lucene int4 SQ #2253

Open
wants to merge 1 commit into
base: main

naveentatikonda
Member

Description

Add support for Lucene SQ 4 bits

Related Issues

Resolves #2252

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@naveentatikonda added the Features and v2.19.0 labels Nov 7, 2024
@naveentatikonda changed the title from "Add support for Lucene SQ 4 bits" to "Add support for Lucene int4 SQ" Nov 7, 2024
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
@navneet1v
Collaborator

@naveentatikonda Can you link the benchmarks for Lucene SQ 4 bit, or point to where they are published?

@naveentatikonda
Member Author

> @naveentatikonda Can you link the benchmarks for Lucene SQ 4 bit, or point to where they are published?

@navneet1v Please find the recall benchmark results below.

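| m | ef_construction | ef_search | confidence interval | Primary | Replica |
|---|---|---|---|---|---|
| 16 | 100 | 100 | 0 (dynamic) | 8 | 0 |

| Dataset | Bits | spaceType | Recall |
|---|---|---|---|
| glove-200-angular | 4 | cosine | 0.68 |
| glove-200-angular | 7 | cosine | 0.72 |
| cohere-1m | 4 | Inner Product | 0.67 |
| cohere-1m | 4 | L2 | 0.87 |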
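
For readers comparing the recall numbers above, here is a minimal sketch of how recall@k is commonly computed. The thread does not say which k or which benchmarking tool was used, so the function names and the choice of k are assumptions for illustration only.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k neighbors that the approximate search also returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


def mean_recall(approx_results, exact_results, k):
    """Average recall@k over a batch of queries (one id list per query)."""
    per_query = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(per_query) / len(per_query)
```

Under this definition, a recall of 0.68 would mean that roughly 68% of the true nearest neighbors appear in the returned results, averaged over all queries.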

@navneet1v
Collaborator

navneet1v commented Nov 8, 2024

Thanks for sharing the results.
Since recall is on the order of 0.7, do you think we should enable rescoring out of the box (OOB) with this quantization before we launch the feature? I would like to know your thoughts.

@naveentatikonda
Member Author

naveentatikonda commented Nov 9, 2024

> Thanks for sharing the results. Since recall is on the order of 0.7, do you think we should enable rescoring out of the box (OOB) with this quantization before we launch the feature? I would like to know your thoughts.

Yes, that's definitely a good idea: rescoring would improve recall at the cost of latency. But I thought we wanted to support rescoring only for on_disk mode, and as of today we support it only for the Faiss engine. Also, we might not include this (as 8x compression) as part of on_disk, because we prefer the Faiss engine over Lucene.

From a UX perspective, do you want to add rescoring support to Lucene with SQ irrespective of on_disk?

@navneet1v
Collaborator

> From a UX perspective, do you want to add rescoring support to Lucene with SQ irrespective of on_disk?

This is a good point. I think rescoring and on_disk should be two different things: I should be able to use rescoring without specifying on_disk mode. The more I think about it, the more tangled this gets. I think we should start a discussion on whether rescoring can be used outside of on_disk mode or whether it is always tied to on_disk.

@jmazanec15 , @shatejas , @vamshin

@jmazanec15
Member

jmazanec15 commented Nov 11, 2024

We are able to do rescoring without specifying on_disk; on_disk just sets a default for rescoring. The issue is that we do not support rescoring for Lucene, because we use Lucene's query. But we should onboard support for it with this.
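
To make "rescoring without specifying on_disk" concrete, here is a hedged sketch of a k-NN search body that requests rescoring explicitly. The `rescore` / `oversample_factor` keys follow the OpenSearch k-NN query DSL as I understand it; the field name `my_vector` and the helper `knn_query` are hypothetical. Per the comment above, this would apply to the Faiss engine at the time of this thread, not to Lucene.

```python
import json


def knn_query(field, vector, k, oversample_factor=None):
    """Build a k-NN search body; add an explicit rescore clause only when requested."""
    knn_clause = {"vector": vector, "k": k}
    if oversample_factor is not None:
        # Assumption: explicit rescoring is requested per query, independent of on_disk mode.
        knn_clause["rescore"] = {"oversample_factor": oversample_factor}
    return {"size": k, "query": {"knn": {field: knn_clause}}}


if __name__ == "__main__":
    print(json.dumps(knn_query("my_vector", [0.1, 0.2, 0.3], 10, oversample_factor=2.0), indent=2))
```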

@navneet1v
Collaborator

> We are able to do rescoring without specifying on_disk; on_disk just sets a default for rescoring. The issue is that we do not support rescoring for Lucene, because we use Lucene's query. But we should onboard support for it with this.

If this is the case, then I think we should start implementing the rescoring feature for the Lucene query clause, given that recall for int4 is not good.

@heemin32
Collaborator

heemin32 commented Nov 11, 2024

Rescoring makes sense when we use full-precision vectors during rescoring. If we use quantized vectors during rescoring, it won't increase recall much. In that sense, rescoring is effectively tied to on_disk.

@navneet1v
Collaborator

> Rescoring makes sense when we use full-precision vectors during rescoring. If we use quantized vectors during rescoring, it won't increase recall much. In that sense, rescoring is effectively tied to on_disk.

That's correct, @heemin32. When we talk about rescoring here, we mean rescoring with full-precision vectors only.
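
A minimal sketch of the two-phase idea discussed here: rank with lossy (quantized) vectors, oversample, then re-rank the candidates with full-precision vectors. This is an illustration under stated assumptions, not the plugin's implementation: phase 1 is a brute-force scan rather than an HNSW traversal, the score is a plain inner product, and `search_with_rescore` is a hypothetical name.

```python
import heapq
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def search_with_rescore(query, approx_vectors, full_vectors, k, oversample_factor=2.0):
    """Return the ids of the top-k vectors after rescoring with full precision."""
    # Phase 1: score every vector with its lossy copy and keep an oversampled candidate set.
    n_candidates = min(len(approx_vectors), max(k, math.ceil(k * oversample_factor)))
    candidates = heapq.nlargest(
        n_candidates,
        range(len(approx_vectors)),
        key=lambda i: dot(query, approx_vectors[i]),
    )
    # Phase 2: re-rank only those candidates using the full-precision vectors.
    return heapq.nlargest(k, candidates, key=lambda i: dot(query, full_vectors[i]))
```

If phase 2 reused the quantized vectors, it would reproduce the phase 1 ordering, which is why rescoring only helps when the full-precision vectors are available.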

@heemin32
Collaborator

heemin32 commented Nov 11, 2024

> That's correct, @heemin32. When we talk about rescoring here, we mean rescoring with full-precision vectors only.

Then it is "on_disk", right?

@navneet1v
Collaborator

> > That's correct, @heemin32. When we talk about rescoring here, we mean rescoring with full-precision vectors only.
>
> Then it is "on_disk", right?

Why would it be on_disk then? As Jack mentioned earlier, on_disk is just a way to set up some defaults.

@heemin32
Collaborator

> > > That's correct, @heemin32. When we talk about rescoring here, we mean rescoring with full-precision vectors only.
> >
> > Then it is "on_disk", right?
>
> Why would it be on_disk then? As Jack mentioned earlier, on_disk is just a way to set up some defaults.

It might make the user experience confusing. I thought SQ throws away the original vector and stores only the quantized vector, whereas on_disk stores the full-precision vector but uses the quantized vector for the in-memory index.

@naveentatikonda
Member Author

> > Why would it be on_disk then? As Jack mentioned earlier, on_disk is just a way to set up some defaults.
>
> It might make the user experience confusing. I thought SQ throws away the original vector and stores only the quantized vector, whereas on_disk stores the full-precision vector but uses the quantized vector for the in-memory index.

No @heemin32, Lucene SQ also stores the full-precision vectors in a segment file on disk, which we use to requantize the data if the quantiles change in that segment.
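
As a rough illustration of why the full-precision copy matters, here is a sketch of 4-bit scalar quantization with quantile-based bounds. It is not Lucene's implementation: the quantile rule, the rounding, and all function names are assumptions, and it uses a fixed confidence interval rather than the dynamic mode (confidence interval 0) used in the benchmarks.

```python
def quantile_bounds(values, confidence_interval):
    """Pick lower/upper bounds that cut off the extreme tails on each side."""
    ordered = sorted(values)
    tail = (1.0 - confidence_interval) / 2.0
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = len(ordered) - 1 - lo_idx
    return ordered[lo_idx], ordered[hi_idx]


def quantize(vec, lo, hi, bits=4):
    """Clamp each value to [lo, hi] and map it to an integer code in [0, 2**bits - 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    levels = (1 << bits) - 1  # 15 for 4 bits
    step = (hi - lo) / levels
    return [round((min(max(x, lo), hi) - lo) / step) for x in vec]


def dequantize(codes, lo, hi, bits=4):
    """Map integer codes back to approximate float values."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + c * step for c in codes]


def requantize(full_vectors, confidence_interval, bits=4):
    """Recompute bounds from the full-precision vectors and re-encode all of them."""
    flat = [x for vec in full_vectors for x in vec]
    lo, hi = quantile_bounds(flat, confidence_interval)
    return lo, hi, [quantize(vec, lo, hi, bits) for vec in full_vectors]
```

Because `requantize` starts from the full-precision vectors, the codes can be rebuilt when the bounds shift. The same stored copy is what a full-precision rescoring pass would read.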

@naveentatikonda
Member Author

@navneet1v @shatejas As discussed offline, I ran some tests tuning the hyperparameters ef_search (using method_parameters), ef_construction, and m. The results show little improvement in recall (not close to 0.9) even after increasing these parameters substantially. So we must invest some time in adding rescoring support to Lucene.

| bits | confidence interval | Primary | Replica |
|---|---|---|---|
| 4 | 0 (dynamic) | 8 | 0 |

| Dataset | ef_construction | ef_search | m | Recall |
|---|---|---|---|---|
| glove-200-angular | 100 | 100 | 16 | 0.68 |
| glove-200-angular | 100 | 256 | 16 | 0.72 |
| glove-200-angular | 100 | 512 | 16 | 0.74 |
| glove-200-angular | 256 | 512 | 16 | 0.76 |
| glove-200-angular | 256 | 512 | 64 | 0.78 |
| glove-200-angular | 512 | 512 | 64 | 0.78 |
| glove-200-angular | 1024 | 1024 | 100 | 0.78 |
| cohere-ip-1m | 100 | 100 | 16 | 0.67 |
| cohere-ip-1m | 100 | 256 | 16 | 0.67 |
| cohere-ip-1m | 100 | 512 | 16 | 0.67 |
| cohere-l2-1m | 100 | 100 | 16 | 0.87 |
| cohere-l2-1m | 100 | 256 | 16 | 0.88 |
| cohere-l2-1m | 100 | 512 | 16 | 0.89 |
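
For anyone reproducing a row of the table above, here is a hedged sketch of the index and query bodies involved. The mapping keys follow the OpenSearch k-NN plugin's Lucene `sq` encoder as I understand it. Accepting `bits: 4` is what this PR proposes, so treat it as an assumption rather than released behavior. The field name `my_vector`, the helper names, and k = 100 in the example are hypothetical.

```python
def lucene_sq_mapping(dimension, m, ef_construction, bits, confidence_interval, space_type):
    """Index body for an HNSW field on the Lucene engine with the scalar quantizer encoder."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "my_vector": {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": space_type,
                        "parameters": {
                            "m": m,
                            "ef_construction": ef_construction,
                            "encoder": {
                                "name": "sq",
                                # Assumption: bits=4 is only accepted once this PR is merged.
                                "parameters": {
                                    "bits": bits,
                                    "confidence_interval": confidence_interval,
                                },
                            },
                        },
                    },
                }
            }
        },
    }


def knn_query_with_ef_search(field, vector, k, ef_search):
    """Search body that overrides ef_search at query time through method_parameters."""
    return {
        "size": k,
        "query": {
            "knn": {
                field: {
                    "vector": vector,
                    "k": k,
                    "method_parameters": {"ef_search": ef_search},
                }
            }
        },
    }


# Example: the glove-200-angular row with ef_construction=100, ef_search=512, m=16.
index_body = lucene_sq_mapping(200, 16, 100, 4, 0, "cosinesimil")
query_body = knn_query_with_ef_search("my_vector", [0.0] * 200, 100, 512)
```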

Labels
Features Introduces a new unit of functionality that satisfies a requirement v2.19.0
Development

Successfully merging this pull request may close these issues.

[FEATURE] Lucene Inbuilt Scalar Quantizer to convert float 32 bits to 4 bits
5 participants