
If the dataset has a large average number of corners #2

Open
zssjh opened this issue Jun 22, 2022 · 5 comments


zssjh commented Jun 22, 2022

Thank you very much for your open-source code. I built floorplans on Structured3D with very good results! Now I am running the algorithm on my own, more complex dataset, which has an average/maximum number of corners of 300/800 (Structured3D is 22/52), so the algorithm uses a huge amount of GPU memory and fails to run. Maybe this is because the initial number of edges is set to the square of the number of corners (O(N^2))? Do you have any good solutions? Looking forward to your reply!

@woodfrog
Owner

In the current implementation, the number of edges after edge filtering is set to 3 * N instead of O(N^2), where N is the number of corners (check here and the corresponding description in the paper's Sec. 4.2). We don't use O(N^2) because 1) most edge candidates are easy negatives that can be eliminated with independent processing (i.e., the edge filtering part), so feeding all edges into the transformer decoder is wasteful; and 2) keeping all the edge candidates makes the computational cost of the transformer decoders unaffordable.
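For intuition, here is a minimal sketch of that top-(3 * N) selection, assuming a per-edge confidence score is already available (all names here are hypothetical, not the repo's actual interfaces):

```python
import torch

def keep_top_edges(corners, edge_scores, keep_factor=3):
    """Keep only the top (keep_factor * N) candidate edges by confidence.

    corners:     (N, 2) tensor of corner coordinates
    edge_scores: (M,) confidence per candidate edge, M ~ N^2, e.g. from a
                 lightweight per-edge filtering head (hypothetical).
    Returns the indices of the kept candidates.
    """
    n = corners.shape[0]
    k = min(keep_factor * n, edge_scores.shape[0])
    return torch.topk(edge_scores, k).indices
```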

In your case, I guess the GPU memory is used up even before the edge filtering part, as you have too many corners. A potential solution would be to: 1) run the edge filtering over all O(N^2) edge candidates in an iterative manner and eliminate all the easy negatives; then 2) run the edge transformer decoder on the remaining edge candidates. But running a transformer decoder with over 2000 input nodes is still computationally expensive, so you would still need substantial GPU resources.
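A rough sketch of step 1, scoring the O(N^2) candidates in fixed-size chunks so peak GPU memory stays bounded (`edge_classifier` is a stand-in callable, not the repo's API):

```python
import torch

@torch.no_grad()
def filter_in_chunks(edge_classifier, corner_feats, pairs,
                     chunk_size=4096, thresh=0.01):
    """Score all candidate edges chunk by chunk and drop easy negatives.

    corner_feats: (N, C) per-corner features
    pairs:        (M, 2) long tensor of corner-index pairs, M ~ N^2
    """
    kept = []
    for start in range(0, pairs.shape[0], chunk_size):
        chunk = pairs[start:start + chunk_size]
        feats = corner_feats[chunk]          # (chunk, 2, C) pair features
        scores = edge_classifier(feats)      # (chunk,) edge confidences
        kept.append(chunk[scores > thresh].cpu())  # free GPU memory early
    return torch.cat(kept, dim=0) if kept else pairs.new_empty((0, 2))
```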

Another workaround could be splitting each big scene into multiple sub-regions and running the algorithm on each part separately. Since your scenes are so large, the relations between distant areas might be weak, so this division might not hurt the performance significantly.
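A toy sketch of such a split, assuming axis-aligned overlapping tiles (merging the per-tile reconstructions back together is left out):

```python
def split_scene(corners, scene_size, tile=256, overlap=32):
    """Bucket corners into overlapping square tiles so each tile can be
    reconstructed independently; corners near a border land in several
    tiles, and duplicated edges must be merged in post-processing.
    """
    stride = tile - overlap
    tiles = []
    for y0 in range(0, scene_size, stride):
        for x0 in range(0, scene_size, stride):
            in_tile = [(x, y) for (x, y) in corners
                       if x0 <= x < x0 + tile and y0 <= y < y0 + tile]
            if in_tile:
                tiles.append(((x0, y0), in_tile))
    return tiles
```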

Hope this will help :)

@zssjh zssjh closed this as completed Jun 22, 2022
@zssjh zssjh reopened this Jun 22, 2022
@zssjh
Author

zssjh commented Jun 22, 2022


Thank you, it's very helpful. I'll try it!

@zssjh
Author

zssjh commented Jun 27, 2022

Hello @woodfrog, I still have a problem with this dataset. It contains about 120 images; I augment 10x to get about 1000 inputs, but training still overfits at around epoch 50, and the network learns nothing after that. For a small dataset like this, which parts of the network could I remove or simplify with relatively little impact on accuracy? Or do you have other suggestions? Thank you very much!

@woodfrog
Owner


Hi @zssjh, according to your previous description, your dataset seems to contain quite large-scale scenes, so I don't think 120 such scenes would lead to very serious overfitting. Could you elaborate on what you observed for "the network learns nothing now"? If you run a test on the training images, are the results near-perfect? If so, then data augmentation should be the right way to go -- what is your current augmentation strategy for getting the 10 augmented copies?
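For floorplan-style data, one common strategy is to use the axis-aligned symmetries (90-degree rotations and flips), applied consistently to the image and the corner coordinates. A minimal sketch, purely as an illustration (hypothetical helper, not the repo's augmentation code):

```python
import random
import numpy as np

def augment(image, corners):
    """Apply one random 90-degree rotation plus an optional horizontal
    flip to the image, and transform the (x, y) corners to match."""
    h, w = image.shape[:2]
    k = random.randrange(4)                  # number of 90-degree turns
    image = np.rot90(image, k).copy()
    x, y = corners[:, 0].copy(), corners[:, 1].copy()
    for _ in range(k):                       # rotate coordinates in sync
        x, y = y, w - 1 - x
        h, w = w, h
    if random.random() < 0.5:                # horizontal flip
        image = image[:, ::-1].copy()
        x = w - 1 - x
    return image, np.stack([x, y], axis=1)
```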

@zssjh
Author

zssjh commented Jul 5, 2022

Hi @woodfrog,
Thank you for your reply! I trained for about 300 epochs. At around epoch 20 the validation loss began to rise and kept rising until the end, including the corner loss, the s1 edge loss, and the image decoder loss. Only the geometry loss did not rise, but it stayed flat from around epoch 150, so I judged this to be overfitting. Following your suggestion, I tested the best checkpoint (from epoch 144) on the test set, and I found that the network does seem to have learned some rules: about 40% of the edges and corners are detected correctly, but it seems that overfitting prevents the network from learning more.
