
Incorrect time_dim for intermediate temporal layers #4

Open
mlomnitz opened this issue Apr 24, 2024 · 1 comment

@mlomnitz

I have been working through your code trying to get it running, and I believe I found an issue in how you set the time_dim for the temporal layers here:

def set_time_dim_(
    klasses: Tuple[Type[Module]],
    model: Module,
    time_dim: int
):
    # every matching submodule receives the same time_dim
    for module in model.modules():
        if isinstance(module, klasses):
            module.time_dim = time_dim

You are setting the same time_dim for all of the layers, but the size of the temporal dimension is cut in half after each step in the UNet. Because of this, the model crashes when trying to reshape/rearrange the tensors for the intermediate layers, for instance here (and possibly elsewhere):

if is_video:
    batch_size = x.shape[0]
    x = rearrange(x, 'b c t h w -> b h w t c')
else:
    assert exists(batch_size) or exists(self.time_dim)

    rearrange_kwargs = dict(b = batch_size, t = self.time_dim)
    x = rearrange(x, '(b t) c h w -> b h w t c', **compact_values(rearrange_kwargs))
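To make the failure concrete, here is the shape arithmetic in plain Python (the numbers are illustrative, not from the repo): frames are folded into the batch dimension, so once a temporal downsample halves the true frame count, the stale time_dim no longer divides the leading dimension and the (b t) factoring in rearrange cannot succeed.

```python
b, t0 = 3, 8              # 3 videos, 8 frames each (illustrative numbers)
leading = b * t0          # frames folded into batch: x is (b*t, c, h, w), so 24

t1 = t0 // 2              # one temporal downsample halves the true frame count
leading = b * t1          # the leading dim is now 12 ...

stale_time_dim = t0       # ... but every layer still has time_dim = 8

# rearrange(x, '(b t) c h w -> b h w t c', t=stale_time_dim) requires
# leading % stale_time_dim == 0; here the remainder is nonzero, so it fails
print(leading % stale_time_dim)  # 4, i.e. 12 is not divisible by 8
```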

I am working on my own workaround in the same set_time_dim_ function but thought I would report it in case it is helpful.
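For reference, a minimal sketch of one possible fix, assuming the temporal length is halved once per UNet resolution level. TemporalLayer and its depth bookkeeping are hypothetical stand-ins for illustration, not the repo's actual classes:

```python
class TemporalLayer:
    """Stand-in for a temporal attention layer (illustration only)."""
    def __init__(self, depth):
        self.depth = depth      # number of temporal downsamples before this layer
        self.time_dim = None

def set_time_dim_(layers, time_dim):
    # Instead of one global value, shrink time_dim by 2 per downsample level
    for layer in layers:
        layer.time_dim = time_dim // (2 ** layer.depth)

layers = [TemporalLayer(d) for d in (0, 1, 2)]
set_time_dim_(layers, 8)
print([layer.time_dim for layer in layers])  # [8, 4, 2]
```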

lucidrains added a commit that referenced this issue May 7, 2024
@lucidrains
Owner

@mlomnitz hey Michael! thanks for testing this out

fear i may be attempting too much magic here, but could you try the latest version and see if it works? 🤞
