I'm learning Arraymancer and experimenting with text embeddings.
In Python I use SentenceTransformers and scikit-learn's KNeighborsClassifier with the cosine metric to find the closest matches.
It seems that Arraymancer doesn't support the cosine metric. Are there plans to add it?
I tried kdTree with the Euclidean metric, and the results were all wrong.
Can Arraymancer help me normalize the text embeddings, so that I can use the Euclidean metric and still get good results?
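For context, here is why I expect normalization to help, as far as I understand it. This is just the standard identity for unit-length vectors, not anything specific to Arraymancer; the symbols a, b and θ (the angle between them) are my own notation:

```latex
% Assumption: both embeddings are scaled to unit length, \|a\| = \|b\| = 1.
\begin{align*}
\|a - b\|^2 &= \|a\|^2 + \|b\|^2 - 2\, a \cdot b \\
            &= 2 - 2\cos\theta
\end{align*}
% Squared Euclidean distance is then a decreasing function of cosine
% similarity, so both metrics rank nearest neighbors in the same order.
```

So if every row is divided by its L2 norm first, a Euclidean nearest-neighbor query should return the same neighbors as a cosine one.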
Here is my code:
import arraymancer
# Load the precomputed title embeddings (one row per title)
let vectors = read_npy[float64]("title_vectors.txt.npy")
echo vectors.shape
# [1226242, 350]
# Build a k-d tree over all rows
let kd = kdtree(vectors)
# Find the 3 nearest neighbors of the first entry
let (dist, ix) = kd.query(vectors[0, _].reshape(350), k = 3)
Another thing I'm confused about is why I need reshape(350).
When I tried: let (dist, ix) = kd.query(vectors[0, _], k = 3) it failed with: Broadcasting error: non-singleton dimensions must be the same in both tensors.
Thanks