
CUDA OOM when the model processes large-scale datasets #23

Open
xinshame opened this issue Dec 23, 2021 · 4 comments

Comments

@xinshame

No description provided.

xinshame changed the title from "CUDA OOM when handling large-scale datasets" to "CUDA OOM when the model processes large-scale datasets" Dec 23, 2021
@xinshame
Author

xinshame commented Dec 23, 2021

I tried reducing the batch size to 32 but still hit OOM. Do you have any suggestions for handling this?

@huangtinglin
Owner

Hi, thanks for your interest! Could you tell me how much memory your GPU has, and which dataset you are running? Also, the algorithm uses a transductive setting, which requires generating embeddings for all nodes at every step. You could extend it to an inductive setting that generates embeddings only for the nodes in each batch.
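
A minimal sketch of that distinction, with hypothetical sizes (illustrative only, not this repository's code): in the transductive pattern, every forward pass touches the embeddings of all nodes, so GPU memory scales with the total node count; an inductive-style variant gathers only the rows for the nodes in the current batch.

```python
import torch

# Hypothetical graph size; not taken from the repository.
num_nodes, dim = 5_000_000, 64
emb = torch.nn.Embedding(num_nodes, dim)    # full table stays on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Transductive pattern: each step needs ALL node embeddings on the GPU,
# i.e. O(num_nodes * dim) floats plus gradients:
#   full = emb.weight.to(device)

# Inductive-style pattern: gather only the current batch's rows first,
# so GPU usage is O(batch_size * dim) regardless of the graph size.
batch_nodes = torch.randint(0, num_nodes, (1024,))
batch_emb = emb(batch_nodes).to(device)
```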

@xinshame
Author

xinshame commented Dec 24, 2021

The GPU has 16 GB of memory, about 14 GB of which is actually usable. The dataset is one I built myself, with roughly 13 million triples.

@huangtinglin
Owner

huangtinglin commented Apr 4, 2022

Sorry for the late reply. For transductive learning algorithms, the number of nodes is the main factor limiting training on large-scale datasets. The way to scale up is to rewrite it as inductive learning; you can refer to PyG's training logic:

https://github.com/pyg-team/pytorch_geometric/blob/master/examples/graph_sage_unsup.py
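
For reference, a minimal sketch of the mini-batch, neighbor-sampling style of training that the linked example demonstrates, using PyG's NeighborLoader with a placeholder dataset and a supervised loss for brevity (the linked example is unsupervised; none of the names below come from this repository):

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import SAGEConv

class SAGE(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden_dim)
        self.conv2 = SAGEConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

dataset = Planetoid(root="data/Cora", name="Cora")  # stand-in for a large KG
data = dataset[0]

# Sample a 2-hop neighborhood around each batch of seed nodes, so only
# that subgraph (never the full graph) is moved to the GPU.
loader = NeighborLoader(data, num_neighbors=[10, 10],
                        batch_size=1024, shuffle=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SAGE(dataset.num_features, 128, dataset.num_classes).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
for batch in loader:
    batch = batch.to(device)
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index)
    # The first `batch.batch_size` rows correspond to the seed nodes.
    loss = F.cross_entropy(out[:batch.batch_size],
                           batch.y[:batch.batch_size])
    loss.backward()
    optimizer.step()
```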
