Robust SqueezeNet for ImageNet #89
At the moment, I have managed to run your library on my computer with a GPU, and I would like to train a robust SqueezeNet 1.1 on the ImageNet dataset. But I ran into a problem: ImageNet is no longer available for download. I managed to download the validation and training parts from academictorrents, but I couldn't find the devkit archive. Please upload this archive here; it is only 2.5 MB. Without it, it is impossible to start training... ☹️
As far as I know, ImageNet is still available for download (under certain terms and conditions) from here. That being said, I do not fully understand why the training and validation images are not enough to train a model.
As I understand it, that file contains the annotations for the validation part of the dataset, so it is impossible to evaluate the model without it. But the problem appears to be solved: robustness does not use torchvision.datasets.ImageNet. In any case, I am now preparing the datasets for robustness.
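For anyone following along, preparing the data for robustness just means the standard per-class folder layout; a minimal loading sketch (the path is a placeholder for wherever the train/ and val/ folders were unpacked):

```python
from robustness.datasets import ImageNet

# Placeholder path; assumes the usual layout:
# /path/to/imagenet/train/<wnid>/*.JPEG and /path/to/imagenet/val/<wnid>/*.JPEG
ds = ImageNet('/path/to/imagenet/')
train_loader, val_loader = ds.make_loaders(workers=4, batch_size=128)
```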
Well, I managed to prepare the dataset and run training through the robustness CLI. But there is a problem: the maximum batch size that fits is 4, and that is with SqueezeNet 1.1 on 8 GB of RAM and 3 GB of video memory. Why is the memory consumption so high, and is there any way to fix it? One epoch on ImageNet takes 24 hours, so it would be nice to increase the batch size.
It is possible to measure the number of model parameters and the number of activations produced during the forward and backward passes to see directly what is consuming memory. Unfortunately, we do not have the capacity to investigate this, especially since it is not an issue directly related to the library, but rather standard DNN training.
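As a starting point for that kind of measurement, here is a rough plain-PyTorch sketch (the helper names are mine, not from robustness) that counts parameters and sums the sizes of all leaf-module outputs in one forward pass, which approximates what autograd must keep around for the backward pass:

```python
import torch
from torchvision import models

def count_params(model):
    return sum(p.numel() for p in model.parameters())

def activation_bytes(model, input_shape=(4, 3, 224, 224)):
    # Sum the output sizes of every leaf module during one forward pass.
    total = 0
    hooks = []
    def hook(module, inputs, output):
        nonlocal total
        if torch.is_tensor(output):
            total += output.numel() * output.element_size()
    for m in model.modules():
        if not list(m.children()):                 # leaf modules only
            hooks.append(m.register_forward_hook(hook))
    model(torch.zeros(input_shape))
    for h in hooks:
        h.remove()
    return total

net = models.squeezenet1_1()
print(f"parameters: {count_params(net):,}")
print(f"activations (batch of 4): {activation_bytes(net) / 2**20:.1f} MiB")
```

Gradients add another parameter-sized buffer and optimizer state (e.g. momentum) more on top, while the activation term scales linearly with batch size.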
Okay, then I'll rephrase the question a little. Does robust learning put an extra load on memory compared to regular learning?
Nope. Robust learning requires additional passes through the model, but the memory footprint is essentially the same.
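To make the "additional passes" point concrete, here is a generic L2 PGD training sketch (illustrative only, not the library's actual attack code). Each attack step runs its own forward/backward and frees the graph immediately, so the extra steps cost time rather than peak memory:

```python
import torch
import torch.nn.functional as F

def pgd_l2(model, x, y, eps=3.0, step_size=0.5, steps=7):
    # Generic L2 PGD: the graph from each step is freed after autograd.grad.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # ascent step, normalized to unit L2 norm per example
        g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        delta = (delta + step_size * grad / g_norm).detach()
        # project back into the L2 ball of radius eps
        d_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        delta = (delta * (eps / d_norm).clamp(max=1.0)).requires_grad_(True)
    return (x + delta).detach()

def robust_step(model, opt, x, y):
    # Same peak memory as a standard step with this batch size,
    # just steps + 1 forward/backward passes instead of one.
    x_adv = pgd_l2(model, x, y)
    opt.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    opt.step()
```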
Good. Then what settings would you recommend I use? I mean learning_rate, etc. I will train on the standard ImageNet.
And one more thing: is it possible to compute different parts of the network on different devices? For example, with VGG19: the convolutional part computed on the GPU and the fully connected part on the CPU.
We typically train robust models with the same parameters as their standard versions, so I would start with the parameters used for a standard SqueezeNet on ImageNet. Yes, it should be possible to use different devices. It might require modifying the training code a bit, though, since this is not a typical use case.
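On the device-splitting question, a minimal sketch with plain torchvision (forward_split is an illustrative helper, not part of robustness): autograd handles graphs that span devices, so it is mostly a matter of moving the submodules and the intermediate tensors explicitly.

```python
import torch
from torchvision import models

model = models.vgg19()
model.features.cuda()        # convolutional part on the GPU
model.avgpool.cuda()
model.classifier.cpu()       # fully connected part stays on the CPU

def forward_split(x):
    x = model.features(x.cuda())
    x = model.avgpool(x)
    x = torch.flatten(x, 1)
    return model.classifier(x.cpu())   # hop back to the CPU for the FC layers
```

Note that the CPU-side matmuls and the device transfers will likely dominate the step time, so this trades speed for GPU memory.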
So, I keep trying to train the robust SqueezeNet. I decided to use RestrictedImageNet due to limited resources, and I have a problem: the robust SqueezeNet 1.1 seems to behave incorrectly. When I try to use it for style transfer, the loss is always nan, even at the first iteration, before the image update and the optimizer step. The same code was tested with a regular SqueezeNet from the PyTorch repository and no problems were observed. I also do not know how normal it is that the training loss decreases from 1.6000 to 1.5500 per epoch; in my opinion, this is too little. The parameters are as follows: lr = 0.01, attack-lr = 0.05, attack-steps = 7, eps = 3.0, batch-size = 4, constraint = 2. And one more question: is it possible to extract from ImageNet only the data that is used in training on RestrictedImageNet? I'd like to train the model in Google Colab.
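For reference, those parameters would map onto the CLI roughly like this (a sketch: --dataset, --data, --arch, --adv-train, and --out-dir follow the README example, while the arch name and paths are placeholders to verify against `python -m robustness.main --help`):

```
python -m robustness.main --dataset restricted_imagenet --data /path/to/imagenet \
    --arch squeezenet1_1 --adv-train 1 --constraint 2 --eps 3.0 \
    --attack-lr 0.05 --attack-steps 7 --lr 0.01 --batch-size 4 \
    --out-dir ./checkpoints
```

As for extracting only the RestrictedImageNet data: the dataset is defined over a fixed set of ImageNet class ranges in the library source, so in principle you could copy just the corresponding synset folders to Colab and train on that subset.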