I see here that X-CLIP uses learnable weight matrices to compute the final scores from the similarity vectors/matrices. However, I am having trouble reconciling this with the equations in Section 3.3 of the paper, which seem to show just a softmax-weighted summation with no learnable parameters.
I'm not sure what I'm missing here. Would you mind clarifying this for me?
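To make the question concrete, here is a minimal NumPy sketch of the two variants as I understand them. The function names, the temperature `tau`, and the `weight_matrix` argument are my own placeholders, not identifiers from the repository or the paper, and the learnable variant is only a guess at how a trainable matrix could enter the computation.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def softmax_weighted_score(sim, tau=0.01):
    # Parameter-free aggregation, as I read Section 3.3: the weights are
    # a softmax over the similarities themselves (tau is an assumed value).
    weights = softmax(sim / tau, axis=-1)
    return np.sum(weights * sim, axis=-1)


def learnable_weighted_score(sim, weight_matrix, tau=0.01):
    # Hypothetical learnable variant: the similarity vector is first
    # transformed by a trainable matrix, then used to form the weights.
    logits = sim @ weight_matrix
    weights = softmax(logits / tau, axis=-1)
    return np.sum(weights * sim, axis=-1)


sim = np.array([0.1, 0.5, 0.3])
score_fixed = softmax_weighted_score(sim)
# With an identity matrix the learnable variant reduces to the fixed one.
score_learned = learnable_weighted_score(sim, np.eye(3))
```

If the learnable matrices were initialized to the identity, the two would agree at the start of training and diverge afterwards, which may be part of what I am missing.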