Attribute is reset per batch in dp mode #3301
-
❓ Questions and Help
What is your question?

I don't know whether this is a bug...

Code

import os
import torch
from torch.nn import functional as F
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision import transforms
import pytorch_lightning as pl
from pytorch_lightning import Trainer
from argparse import Namespace


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.l1 = torch.nn.Linear(28 * 28, 10)
        self._dummy_property = None

    @property
    def dummy_property(self):
        # lazily initialize the attribute; expected to happen only once per GPU
        if self._dummy_property is None:
            self._dummy_property = '*' * 30
            print('print only once per gpu')
        return self._dummy_property

    def forward(self, x):
        return torch.relu(self.l1(x.view(x.size(0), -1)))

    def training_step(self, batch, batch_idx):
        print(self._dummy_property)
        # access the property every batch
        self.dummy_property
        print(self._dummy_property)
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        return pl.TrainResult(loss)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.02)


train_loader = DataLoader(
    MNIST(
        os.getcwd(),
        download=True,
        transform=transforms.ToTensor()
    ),
    batch_size=128
)
trainer = pl.Trainer(gpus=2,
                     distributed_backend='dp',
                     max_epochs=2)
model = LitModel()
trainer.fit(model, train_loader)
Replies: 3 comments
-
@yukw777 mind having a look? 🐰
-
@Borda sure thing! A bit busy right now, but I’ll try to get to it later this week or weekend!
-
oh, this is actually a known problem and comes from DataParallel in PyTorch itself.
See #565 and #1649 for reference.
@ananyahjha93 has been working on a workaround, but it seems to be super non-trivial: #1895
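For reference, here is a minimal sketch that reproduces the same behavior with plain torch.nn.DataParallel (a hypothetical Toy module; assumes at least two visible GPUs): every forward call re-replicates the wrapped module onto each device, so any attribute initialized lazily inside forward/training_step is set on a throw-away replica and never persists on the original module.

import torch


class Toy(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)
        self.cached = None  # lazily-initialized state, like _dummy_property above

    def forward(self, x):
        if self.cached is None:
            # runs on a per-batch replica, not on the module wrapped by DataParallel
            self.cached = '*' * 30
            print('initialized on', x.device)
        return self.linear(x)


model = Toy().cuda()
dp = torch.nn.DataParallel(model, device_ids=[0, 1])

for _ in range(3):
    dp(torch.randn(8, 4, device='cuda'))
    # stays None after every batch: DataParallel rebuilds the replicas on each call
    print(model.cached)

The practical consequence is that any state the module needs during dp training has to exist on the original module before replication (created in __init__ or setup, or registered as a buffer) rather than being initialized lazily per batch.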