Lbfgs #10265

Merged
merged 22 commits into from
Jun 26, 2023

@L-Xiafeng (Contributor) commented May 15, 2023

oneflow, strong_wolfe training results

num_epoch: 100, loss: 0.09013008

real    1m17.526s
user    2m24.450s
sys     0m50.515s

torch, strong_wolfe training results

num_epoch: 100, loss: 0.09780428

real    1m10.364s
user    1m9.138s
sys     0m5.576s

@levi131 levi131 self-requested a review May 15, 2023 05:31
@L-Xiafeng L-Xiafeng marked this pull request as ready for review May 22, 2023 01:49
@levi131 levi131 self-assigned this May 22, 2023
@levi131 (Contributor) commented May 23, 2023

Tests need to be added to the oneflow repository to verify this, covering both the case that uses strong Wolfe line search and the case that does not.
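For readers unfamiliar with the term, the strong Wolfe conditions accept a step size only if it gives sufficient decrease in the objective and also reduces the magnitude of the directional derivative. The sketch below only checks those two conditions on a one-dimensional toy problem; it is not the code added in this PR. The function name `satisfies_strong_wolfe` and the constants `c1=1e-4`, `c2=0.9` are assumptions chosen for illustration (common textbook values).

```python
def satisfies_strong_wolfe(phi0, dphi0, phi_a, dphi_a, alpha, c1=1e-4, c2=0.9):
    """Check the strong Wolfe conditions for a trial step size alpha.

    phi0, dphi0  : objective and directional derivative at step 0
    phi_a, dphi_a: objective and directional derivative at step alpha
    c1, c2       : illustrative constants (assumed), with 0 < c1 < c2 < 1
    """
    # Sufficient decrease (Armijo) condition.
    sufficient_decrease = phi_a <= phi0 + c1 * alpha * dphi0
    # Strong curvature condition: the slope magnitude must shrink enough.
    curvature = abs(dphi_a) <= c2 * abs(dphi0)
    return sufficient_decrease and curvature


# Toy problem: f(x) = x^2 starting at x = 1 with search direction p = -2,
# so phi(alpha) = f(1 - 2 * alpha).
def phi(alpha):
    return (1.0 - 2.0 * alpha) ** 2


def dphi(alpha):
    return -4.0 * (1.0 - 2.0 * alpha)


def check(alpha):
    return satisfies_strong_wolfe(phi(0.0), dphi(0.0), phi(alpha), dphi(alpha), alpha)


if __name__ == "__main__":
    for alpha in (0.01, 0.5, 1.0):
        print(alpha, check(alpha))
```

In this toy case the exact minimizer `alpha = 0.5` is accepted, a tiny step such as `0.01` fails the curvature condition, and an overshooting step such as `1.0` fails sufficient decrease. A test of the optimizer "without strong Wolfe" would skip this acceptance check and use a fixed step instead.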

@L-Xiafeng L-Xiafeng enabled auto-merge (squash) May 26, 2023 09:07
@L-Xiafeng (Contributor, Author) commented Jun 16, 2023

float32, 1000-epoch stability test

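| Framework | allow_tf32 | allow_fp16_reduced | Line search method | RTX 3090 result | RTX 2080 Ti result |
| --- | --- | --- | --- | --- | --- |
| oneflow | False | False | None | OK | OK |
| oneflow | False | True | None | OK | OK |
| oneflow | True | False | None | training diverges | OK |
| oneflow | True | True | None | training diverges | OK |
| oneflow | False | False | Strong_wolfe | OK | OK |
| oneflow | False | True | Strong_wolfe | OK | OK |
| oneflow | True | False | Strong_wolfe | loss stuck at 0.27 | OK |
| oneflow | True | True | Strong_wolfe | loss stuck at 0.27 | OK |
| torch | False | False | None | OK | OK |
| torch | False | True | None | OK | OK |
| torch | True | False | None | loss is NaN | OK |
| torch | True | True | None | loss is NaN | OK |
| torch | False | False | Strong_wolfe | loss stuck at 0.094 | loss stuck at 0.102 |
| torch | False | True | Strong_wolfe | loss stuck at 0.094 | loss stuck at 0.102 |
| torch | True | False | Strong_wolfe | loss stuck at 1.70 | loss stuck at 0.102 |
| torch | True | True | Strong_wolfe | loss stuck at 1.70 | loss stuck at 0.102 |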
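As background for the `allow_tf32` column: TF32 keeps the float32 exponent range but only 10 explicit mantissa bits instead of 23. The sketch below emulates that loss of precision in pure Python by clearing the low 13 mantissa bits of a float32 value. It is an illustration only: the helper name `to_tf32_truncated` is hypothetical, real hardware rounds rather than truncates, and nothing here is claimed to explain the specific results in the table.

```python
import struct


def to_tf32_truncated(x):
    """Emulate TF32 precision by truncating a float32 to 10 mantissa bits.

    Assumption: truncation is used for simplicity; actual TF32 hardware
    rounding behaves differently in the last retained bit.
    """
    # Reinterpret the value as its 32-bit float32 bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 has 23 mantissa bits; keep the top 10 and clear the low 13.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    small = 1.0 + 2.0 ** -11   # representable in float32, not in TF32
    kept = 1.0 + 2.0 ** -10    # smallest increment above 1.0 that TF32 keeps
    print(to_tf32_truncated(small))  # collapses to 1.0
    print(to_tf32_truncated(kept))   # unchanged
```

Differences smaller than about one part in a thousand are lost at this precision, which is one reason a reader might expect quantities built from small differences of gradients to be sensitive to the flag on hardware that supports TF32.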

@levi131 (Contributor) left a comment

LGTM

@lucky9-cyou lucky9-cyou self-requested a review June 19, 2023 08:19
@L-Xiafeng L-Xiafeng merged commit 6f5e0f6 into Oneflow-Inc:master Jun 26, 2023
3 participants