Replies: 2 comments 1 reply
-
Hi there, this seems possible to me. A new dataloader shouldn't be necessary; we could define dataset transforms which take a 3D volume and extract 2D slices from it, along the lines of the example here. Adding temporal components to the DDPM and the autoencoder should be fairly doable, too. Do you have any interest in having a go at implementing this? Mark
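As a rough illustration of the slicing idea (not the actual MONAI transform API — `ExtractSlices` is a hypothetical name, and a real implementation would plug into the existing transform pipeline), such a transform could look like:

```python
import numpy as np

class ExtractSlices:
    """Hypothetical transform: turn one 3D volume into a list of 2D slices.

    A real version would follow MONAI's transform conventions; this sketch
    only shows the core slicing logic.
    """
    def __init__(self, axis=0):
        self.axis = axis  # axis to slice along (e.g. the z-dimension)

    def __call__(self, volume):
        # Move the chosen axis to the front, then split into individual 2D arrays.
        vol = np.moveaxis(volume, self.axis, 0)
        return [vol[i] for i in range(vol.shape[0])]

# A 16x64x64 volume yields 16 slices of shape (64, 64).
slices = ExtractSlices(axis=0)(np.zeros((16, 64, 64)))
```

Each 2D slice can then be fed to the existing 2D models unchanged.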
-
Hi Mark, I'm Aju, from Lorenz's team, following up on his earlier query. Just wanted to check whether anyone has started working on this. Best, Aju
-
First of all, thanks for this great library bringing generative models to the MONAI framework, and also for the nice tutorials!
I was wondering if it would be possible to add the approach described by Blattmann et al. in their paper "Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models" (https://arxiv.org/abs/2304.08818) (see also https://research.nvidia.com/labs/toronto-ai/VideoLDM/). As far as I understand, for efficiency they start from a pre-trained 2D autoencoder and fine-tune it along a temporal dimension (used for video generation there, which would correspond to the z-dimension in medical images) by adding 3D layers to the decoder.
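A minimal sketch of what "adding 3D layers to a pre-trained 2D decoder" could mean in practice, assuming a PyTorch model (the `TemporalBlock` name and zero-initialization scheme are my illustration of the idea, not the paper's exact implementation):

```python
import torch
import torch.nn as nn

class TemporalBlock(nn.Module):
    """Hypothetical temporal mixing layer inserted after a frozen 2D decoder layer.

    The residual branch is zero-initialized, so at the start of fine-tuning the
    block is an identity map and the decoder reproduces the 2D model's output.
    Only these inserted layers would be trained on the temporal/z axis.
    """
    def __init__(self, channels):
        super().__init__()
        # Convolve over the depth axis only: kernel 3 in depth, 1 in space.
        self.temporal = nn.Conv3d(channels, channels,
                                  kernel_size=(3, 1, 1), padding=(1, 0, 0))
        nn.init.zeros_(self.temporal.weight)
        nn.init.zeros_(self.temporal.bias)

    def forward(self, x):
        # x: (batch, channels, depth, height, width)
        return x + self.temporal(x)

# Sanity check: with zero initialization the block starts as an identity.
x = torch.randn(1, 8, 4, 16, 16)
block = TemporalBlock(8)
out = block(x)
```

The zero-init means the fine-tuned decoder can only improve on the 2D baseline rather than destroy it at initialization.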
For GenerativeModels integration, I was thinking of
Maybe this approach could also help generate synthetic 3D medical datasets at diagnostic resolution.
Looking forward to hearing your opinion on this topic.
Best, Lorenz