Text retrieval performance degrades after fine-tuning m3e-base on a custom dataset #105

Open
XuHao777 opened this issue Nov 1, 2023 · 1 comment
Labels: bug (Something isn't working)

Comments

XuHao777 commented Nov 1, 2023

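🐛 Bug description

Hello, I built a custom dataset in the format [label, query1, query2] and used it to fine-tune the m3e-base model. I also built a test dataset in the format [query, passage1, passage2, passage3, passage4, passage5]. I computed MAE, P@top3, and Spearman on the test dataset with both the original m3e-base model and the fine-tuned model, and all three metrics got worse after fine-tuning (MAE rose, while P@top3 and Spearman fell). What could be the cause?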

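Relevant metrics:

| Model | MAE | P@top3 | Spearman | Notes |
| --- | --- | --- | --- | --- |
| m3e-base | 1.068 | 0.733 | 0.431 | normalized |
| m3e-base | 1.072 | 0.7333 | 0.428 | not normalized |
| m3e-base-ft | 1.24 | 0.6766 | 0.3039 | fine-tuned on the query2query dataset |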
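
For anyone trying to reproduce the comparison, here is a minimal sketch of how the three metrics might be computed for a single test row. The issue does not give the exact definitions, so this rests on assumptions: each of the five passages has a gold rank from 1 to 5, the model's similarity scores are converted to ranks, MAE is the mean absolute difference between predicted and gold ranks, and P@top3 is the overlap between the predicted and gold top-3 sets. The scores and gold ranks below are hypothetical, not data from this issue.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Convert similarity scores to 1-based ranks (rank 1 = highest score).

    Ties are broken by original position because sorted() is stable.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        ranks[idx] = position
    return ranks


def rank_mae(pred_ranks, gold_ranks):
    """Mean absolute difference between predicted and gold ranks (assumed definition)."""
    return mean(abs(p - g) for p, g in zip(pred_ranks, gold_ranks))


def precision_at_k(pred_ranks, gold_ranks, k=3):
    """Fraction of the predicted top-k passages that are also in the gold top-k."""
    pred_top = {i for i, r in enumerate(pred_ranks) if r <= k}
    gold_top = {i for i, r in enumerate(gold_ranks) if r <= k}
    return len(pred_top & gold_top) / k


def spearman(pred_ranks, gold_ranks):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(pred_ranks)
    d2 = sum((p - g) ** 2 for p, g in zip(pred_ranks, gold_ranks))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical row: similarity of one query to its five passages.
scores = [0.82, 0.40, 0.75, 0.10, 0.55]
gold = [1, 3, 2, 5, 4]  # hypothetical gold ranks for the same passages

pred = ranks_from_scores(scores)
row_mae = rank_mae(pred, gold)
row_p3 = precision_at_k(pred, gold, k=3)
row_rho = spearman(pred, gold)
print(pred, row_mae, row_p3, row_rho)
```

Averaging these per-row values over the whole test set would give one number per metric and model. Under this definition only the ordering of the scores matters, which would be consistent with the normalized and non-normalized rows above being nearly identical.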

Python Version

None

XuHao777 added the bug label on Nov 1, 2023

@wangyuxinwhy (Owner)

How did the loss change during training? Could the model have overfit?
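
One rough way to answer this is to hold out part of the fine-tuning pairs, log the loss on both splits once per epoch, and check whether the held-out loss starts rising while the training loss keeps falling. The sketch below assumes such per-epoch loss lists are already available. The function names and the loss values are hypothetical and are not part of the m3e or uniem API.

```python
def best_epoch(val_losses):
    """Index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i])


def looks_overfit(train_losses, val_losses, tolerance=0.0):
    """Heuristic: validation loss has risen since its minimum while training loss kept falling."""
    best = best_epoch(val_losses)
    last = len(val_losses) - 1
    val_rose = val_losses[last] > val_losses[best] + tolerance
    train_fell = train_losses[last] < train_losses[best]
    return best < last and val_rose and train_fell


# Hypothetical per-epoch losses, for illustration only.
train = [0.90, 0.60, 0.40, 0.25, 0.15]
val = [0.95, 0.70, 0.65, 0.72, 0.80]

print(best_epoch(val), looks_overfit(train, val))
```

If the check fires, evaluating the checkpoint from the best validation epoch on the retrieval test set is a cheap next experiment. If it does not, the drop may have another cause that this check cannot detect.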
