Attention over similarity matrix (AOSM) question #8

reddyav1 · 2024-10-21T17:12:09Z

I see here that X-CLIP uses learnable weight matrices to compute the final scores from the similarity vectors/matrices. However, I am having trouble reconciling this with the equations in Section 3.3 of the paper, which seem to show just a softmax-weighted summation with no learnable parameters.
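To make the discrepancy concrete, here is a minimal sketch of the two computations as I understand them. The function names, the temperature `tau`, and the square `weight` matrix are illustrative assumptions of mine, not names or shapes taken from the X-CLIP code or the paper.

```python
import math


def softmax(xs, tau=1.0):
    # Numerically stable softmax with temperature tau (tau is an assumed knob).
    scaled = [x / tau for x in xs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def aosm_parameter_free(sim, tau=1.0):
    # My reading of Section 3.3: score = sum_i softmax(sim / tau)_i * sim_i.
    # Nothing here is learnable.
    weights = softmax(sim, tau)
    return sum(w * s for w, s in zip(weights, sim))


def aosm_with_learnable_weights(sim, weight, tau=1.0):
    # Hypothetical variant: a learnable square matrix `weight` transforms the
    # similarity vector before the softmax produces the attention weights.
    n = len(sim)
    projected = [sum(weight[i][j] * sim[j] for j in range(n)) for i in range(n)]
    weights = softmax(projected, tau)
    return sum(w * s for w, s in zip(weights, sim))


sim = [0.2, 0.9, 0.4]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
print(aosm_parameter_free(sim))
print(aosm_with_learnable_weights(sim, identity))
```

If the weight matrix is the identity, the second function reduces to the first, so one possible reconciliation is that the learnable matrices start as identity and the paper's equations describe that special case. That is only a guess on my part.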

I'm not sure what I'm missing here. Would you mind clarifying this for me?
