Training from Scratch #3

Open · Mahsa13473 opened this issue Oct 5, 2019 · 13 comments


@Mahsa13473

Hi there,

Thanks for releasing the code, it is amazing work!
I tried to train the network from scratch and followed all the steps mentioned in the README, but I couldn't match the results of the pretrained model.

I was wondering which hyperparameters were used for the pretrained one. Are they the same as the defaults in train_sdf.py?
How many epochs did you train to get the best accuracy?
Also, which dataset was used for training: the old one or the new one mentioned in the README?

@no-materials

Hello, in addition to @Mahsa13473's comment, could you also provide the approximate training time?

@Xharlie (Owner) commented Oct 15, 2019

Hi, at the time we submitted, we used the old dataset, which everyone else used as well. We used an ImageNet-pretrained VGG-16 (provided by the official TensorFlow release), as shown in the command in the README. We haven't tried training everything from scratch yet, since I suspect the dataset itself is not big enough for the network to learn 2D image features perfectly.

@Xharlie (Owner) commented Oct 15, 2019

Training time can vary from 1 to 3 days depending on your GPU, but I'd say at most 3 days. The bottleneck is on the CPU, since we have to read the SDF ground truth and image h5 files on the fly; with a faster CPU, or an SSD for the SDF/image storage, you can train faster.
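For illustration, a minimal sketch of what that on-the-fly loading looks like, assuming h5py; the file layout and dataset keys here are hypothetical, not the repo's exact pipeline:

```python
# Minimal sketch of on-the-fly HDF5 reads (hypothetical keys, not the
# repo's exact data pipeline). Every call touches the disk, which is why
# CPU/storage speed, not the GPU, bounds the training throughput.
import h5py
import numpy as np

def load_sample(h5_path):
    with h5py.File(h5_path, 'r') as f:
        sdf_points = np.asarray(f['pc_sdf_sample'])  # sampled points + SDF values
        image = np.asarray(f['img'])                 # rendered view of the shape
    return sdf_points, image
```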

@asurada404

Hi, I'm also training the network from scratch using the pretrained VGG-16, but I can't get the same result. Did you use the pretrained VGG-16? @Mahsa13473

@Mahsa13473 (Author)

Hi. Yes, but I couldn't get the same result even with the pretrained VGG-16. That was a few months ago, though, so I'm not sure how it works with the updated version of the code. @asurada404

@JohnG0024

Hello, does anyone know where the pretrained model vgg_16.ckpt is? Running

python -u train/train_sdf.py --gpu 0 --img_feat_twostream --restore_modelcnn ./models/CNN/pretrained_model/vgg_16.ckpt --log_dir checkpoint/SDF_JG --category all --num_sample_points 2048 --batch_size 20 --learning_rate 0.0001 --cat_limit 36000

gives the error:

tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./models/CNN/pretrained_model/vgg_16.ckpt

@asurada404

Download vgg_16.ckpt and save it to ./models/CNN/pretrained_model first. @JohnG0024
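A minimal sketch of that download step, assuming the checkpoint is the standard TF-Slim VGG-16 release (vgg_16_2016_08_28.tar.gz from download.tensorflow.org); adjust the URL if the repo points elsewhere:

```python
# Minimal sketch: fetch the TF-Slim VGG-16 checkpoint and unpack it where
# --restore_modelcnn expects it. Assumes the standard TF-Slim release is
# the checkpoint the repo intends.
import os
import tarfile
import urllib.request

URL = 'http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz'
DEST = './models/CNN/pretrained_model'

os.makedirs(DEST, exist_ok=True)
archive = os.path.join(DEST, 'vgg_16_2016_08_28.tar.gz')
urllib.request.urlretrieve(URL, archive)
with tarfile.open(archive) as tar:
    tar.extractall(DEST)  # produces ./models/CNN/pretrained_model/vgg_16.ckpt
```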

@JohnG0024

@asurada404 Thanks!

@JohnG0024

@Xharlie In your opinion, what's missing from the dataset that keeps the network from understanding 2D images perfectly?

@asurada404 commented Aug 30, 2020

The VGG is used as an encoder to extract features from the image.
The pretrained VGG was trained on the ImageNet dataset (more than 14 million images in more than 20,000 categories), which is much larger than ShapeNet.
As a result, a VGG pretrained on ImageNet extracts image features better than a VGG trained only on ShapeNet. @JohnG0024
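For concreteness, a minimal TF1-style sketch of restoring those ImageNet weights into the encoder, roughly what --restore_modelcnn does; the vgg_16/conv1/conv1_1/weights variable name follows the TF-Slim checkpoint convention and is an assumption, not the repo's exact code:

```python
# Minimal sketch (TF1 API): restore one encoder variable from the ImageNet
# VGG-16 checkpoint. Variable names follow the TF-Slim convention; this
# illustrates --restore_modelcnn, it is not the repo's exact code.
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

CKPT = './models/CNN/pretrained_model/vgg_16.ckpt'

# One encoder variable, named to match the TF-Slim checkpoint layout.
with tf.variable_scope('vgg_16/conv1/conv1_1'):
    weights = tf.get_variable('weights', shape=[3, 3, 3, 64])

saver = tf.train.Saver(var_list=[weights])
with tf.Session() as sess:
    saver.restore(sess, CKPT)        # loads the pretrained filter weights
    print(sess.run(weights).std())   # sanity check: trained, non-zero values
```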

@JohnG0024

@asurada404 That makes sense. So the vgg_16.ckpt is from the full ImageNet dataset, not the 1k-category subset of ImageNet used in the ImageNet Challenge?

@asurada404 commented Aug 31, 2020

You can find more details in this paper. @JohnG0024

@AlexsaseXie

Has anyone successfully reproduced the results?

I trained the network with ground-truth camera parameters, with no modifications to the code:

nohup python -u train/train_sdf.py --gpu 0 --img_feat_twostream --restore_modelcnn ./models/CNN/pretrained_model/vgg_16.ckpt --log_dir checkpoint/{your training checkpoint dir} --category all --num_sample_points 2048 --batch_size 20 --learning_rate 0.0001 --cat_limit 36000 &> log/DISN_train_all.log &

The train/test split is 3D-R2N2. I trained for about 3 days, approximately 23 epochs. The SDF loss stopped dropping, so I assumed the network had converged, but I only get bad visual results on test-set models.
