Attack accuracy discrepency between default and custom models. #1355

KarthikGanesan88 · 2021-10-12T17:20:37Z

KarthikGanesan88
Oct 12, 2021

Hello, I am trying to use ART to evaluate the robustness of some techniques against attacks, using Pytorch. These techniques require me to make very low level changes (e.g., how floating point operations are performed). I started with the get_started_pytorch.py program and created a new model, where I changed just the final 2 linear layers. I used the linear layer code given by pytorch here. Specifically, the 'Extending torch.autograd' and 'Extending torch.nn' sections right at the top.

I am creating two classifier to train them separately. 'Default' which is the baseline code given in the ART example and 'custom' where I change the linear layers as follows:

#self.fc_1 = nn.Linear(in_features=4 * 4 * 10, out_features=100)
#self.fc_2 = nn.Linear(in_features=100, out_features=10)
self.fc_1 = MyLinear(4 * 4 * 10, 100)
self.fc_2 = MyLinear(100, 10)

For now, I am performing the exact same operations in both. The classification accuracy is identical between the default and custom models.
However, I am getting different attack accuracies, for the FSM attack. Here is a full readout:
Acc. of default model on adv. inputs trained on default model: 16%
Acc. of custom model on adv. inputs trained on custom model: 5%

When I swap the inputs, the accuracies move with the inputs. So:
Acc. of default model on adv. inputs trained on custom model: 5%
Acc. of custom model on adv. inputs trained on default model: 16%

So it seems it is performing the attack using the custom model leads to the difference. Is there something else I need to do with this model to make it work with ART the same as the default model does?

Answered by beat-buesser

Oct 12, 2021

Hi @KarthikGanesan88 This sounds like an interesting experiment! I would have a few questions to get a better understanding. Could you please describe the "swap the inputs" step in more detail, maybe with code (how are the sample defined, etc.)?

View full answer

beat-buesser · 2021-10-12T21:16:56Z

beat-buesser
Oct 12, 2021
Maintainer

Hi @KarthikGanesan88 This sounds like an interesting experiment! I would have a few questions to get a better understanding. Could you please describe the "swap the inputs" step in more detail, maybe with code (how are the sample defined, etc.)?

7 replies

KarthikGanesan88 Oct 13, 2021
Author

Thank you. That must be it! The reason I switched to double was that I was failing torch.autograd.gradcheck which they recommend for custom layers. But using the default float, I am getting the right result now.
Yes, I am using the default line of x_test_adv = attack.generate(x=x_test). If I provided the label as well, would that make this a 'targeted' attack, vs. it being un-targeted right now? Sorry, my background is computer hardware so this is all very new to me. Thanks!

beat-buesser Oct 13, 2021
Maintainer

@KarthikGanesan88 That's great. If the attack is targeted or untargeted depends on the attacks argument targeted. If targeted=Truethe attack treats the labels provided togenerateas target labels, iftargeted=Falsethe attack teats the labels provided togenerate` s true labels.

KarthikGanesan88 Oct 13, 2021
Author

OK. so I'm not sure about your previous suggestion then. What is the difference between x_test_adv = attack.generate(x=x_test) and x_test_adv = attack.generate(x=x_test, y=y_test)?

Also, a different topic but does ART use GPU acceleration? Are there compute intensive portions of code for other attacks and would I benefit from having a GPU for those? Sorry if this is off-topic. I can post this as a new topic if that would be preferable.

beat-buesser Oct 13, 2021
Maintainer

In the first case attack.generate(x=x_test) the model's predictions are used as true labels to calculate the classification loss, in the latter case x_test_adv = attack.generate(x=x_test, y=y_test) the provided true labels are used to calculate the loss. In the first case it can happen that the attack generates "adversarial" examples that are eventually correctly classified, this is a possibility for samples where the benign version is initially mis-classified by the model.

Yes, ART uses GPUs, if available, for model evaluation and gradient calculations, and in framework-specific attack implementations.

KarthikGanesan88 Oct 14, 2021
Author

I see, I understand now. Thank you for all your help!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Attack accuracy discrepency between default and custom models. #1355

{{title}}

Replies: 1 comment 7 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

Attack accuracy discrepency between default and custom models. #1355

KarthikGanesan88 Oct 12, 2021

Replies: 1 comment · 7 replies

beat-buesser Oct 12, 2021 Maintainer

KarthikGanesan88 Oct 13, 2021 Author

beat-buesser Oct 13, 2021 Maintainer

KarthikGanesan88 Oct 13, 2021 Author

beat-buesser Oct 13, 2021 Maintainer

KarthikGanesan88 Oct 14, 2021 Author

KarthikGanesan88
Oct 12, 2021

Replies: 1 comment 7 replies

beat-buesser
Oct 12, 2021
Maintainer

KarthikGanesan88 Oct 13, 2021
Author

beat-buesser Oct 13, 2021
Maintainer

KarthikGanesan88 Oct 13, 2021
Author

beat-buesser Oct 13, 2021
Maintainer

KarthikGanesan88 Oct 14, 2021
Author