Relevance score dynamic threshold #788

qixdev · 2024-11-17T16:53:31Z


Hi, I like Meilisearch, but unfortunately it seems to be most beneficial when there is a lot of data to search.
In small setups with only a few thousand unique documents, it tends to append irrelevant results to the end of the result list.
In one of my projects I wrote a wrapper that checks whether the first hit scores more than 0.99 (which means extremely relevant). If it does, I filter the original results and keep only the hits whose score equals that of the first result. This way I filter out unnecessary noise from the search results.
But as you are probably thinking, this is not very effective. It cuts the noise when the top results are very relevant, but when the scores are low (0.66, 0.15) the filter does not apply and all results are returned.
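
For reference, here is a minimal sketch of the kind of wrapper described above. It assumes the hits are dicts that carry a `_rankingScore` field, which Meilisearch returns when the search is made with `showRankingScore` enabled. The names `filter_top_hits` and `min_top_score` are made up for illustration and are not taken from the actual project.

```python
import math


def filter_top_hits(hits, min_top_score=0.99):
    """Keep only hits tied with the first one, but only when the first
    hit is extremely relevant; otherwise return the hits unchanged.

    `hits` is assumed to be sorted by relevance, best first, and each
    hit is assumed to have a `_rankingScore` float (hypothetical setup).
    """
    if not hits:
        return hits
    top = hits[0]["_rankingScore"]
    if top <= min_top_score:
        # Low-scoring result sets pass through untouched, which is
        # exactly the weakness described in the post.
        return hits
    return [
        h for h in hits
        if math.isclose(h["_rankingScore"], top, abs_tol=1e-9)
    ]
```

With top scores of 1.0, 1.0, 0.7 this keeps the first two hits, while a result set scored 0.66, 0.15 is returned in full.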

I have noticed a pattern, though. The difference between the first result's score and the scores of the other results indicates relevancy. This may be just a heuristic, but I see that it works.
Imagine this set of result scores:
0.66, 0.66, 0.36, 0.36, 0.15, 0.10
Here, the first two results will probably be the best, and there is no need to return the others.
The same applies here:
0.66, 0.33, 0.10
The difference between the first and second result is very large, so only the first result is going to be relevant.
0.33, 0.33, 0.31, 0.31, 0.27
Here, all 5 results are relevant, due to the small rolling difference between consecutive results.
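
The three examples above are consistent with one simple rule: walk the sorted scores and stop at the first drop between consecutive results that exceeds some limit. The sketch below is one possible reading of that heuristic, not an existing Meilisearch feature. The function name `cut_at_first_gap` and the default `max_gap=0.1` are assumptions, with the value chosen only because it fits these examples.

```python
def cut_at_first_gap(scores, max_gap=0.1):
    """Return the leading scores up to the first large drop.

    `scores` is assumed to be sorted in descending order. `max_gap` is
    a hypothetical tuning knob: the largest allowed difference between
    two consecutive scores before the rest is treated as noise.
    """
    kept = []
    for score in scores:
        # Compare against the previously kept score (rolling difference).
        if kept and kept[-1] - score > max_gap:
            break
        kept.append(score)
    return kept


print(cut_at_first_gap([0.66, 0.66, 0.36, 0.36, 0.15, 0.10]))  # [0.66, 0.66]
print(cut_at_first_gap([0.66, 0.33, 0.10]))                    # [0.66]
print(cut_at_first_gap([0.33, 0.33, 0.31, 0.31, 0.27]))        # all five kept
```

An absolute gap is the simplest choice. A relative drop (the gap divided by the previous score) might behave better when all scores are low, but that is a design decision this sketch leaves open.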

My suggestion is to add this kind of dynamic thresholding as a feature, since a regular fixed threshold would drop relevant results in some cases and fail to cut the noise in others. You could also add some customization, similar to the mean and sigma values in the distribution settings for embeddings in Meilisearch vector search.

Thank you!
