Deep Generative Models Course - Homework Solutions

Introduction

This repository contains my solutions to four Deep Generative Models assignments, each exploring a different topic in generative modeling and neural networks. The notebooks feature implementations of key models such as CLIP, LSTMs, VAEs, GANs, flow-based models, DDPMs, and EBMs. The solutions are applied to real-world datasets, including MNIST, CelebA, historical stock prices, and a custom captcha dataset, showcasing a variety of deep learning techniques.

Table of Contents

  1. Homework 1: OpenAI CLIP Model (Food-101)
  2. Homework 2: Autoregressive Generative Models and Variational Autoencoders
  3. Homework 3: GANs and Flow-Based Models
  4. Homework 4: Denoising Diffusion and Energy-Based Models

Homework 1: OpenAI CLIP Model (Food-101)

In this notebook, I evaluate the OpenAI CLIP model on the Food-101 dataset. The focus is on understanding how CLIP associates text and images in a shared embedding space.
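At inference time, CLIP classification reduces to a cosine-similarity comparison between one image embedding and a set of text embeddings. A minimal sketch using the openai `clip` package; the file name `food.jpg` and the three class names are placeholders, not the notebook's actual setup:

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("food.jpg")).unsqueeze(0).to(device)  # placeholder image
labels = ["pizza", "sushi", "pad thai"]  # placeholder Food-101 class names
text = clip.tokenize([f"a photo of {c}" for c in labels]).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)  # (1, 512) image embedding
    text_features = model.encode_text(text)     # (3, 512) text embeddings
    # Cosine similarity in the shared embedding space, softmaxed over labels
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print({c: p.item() for c, p in zip(labels, probs[0])})
```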

Visualization of embeddings for fine-grained labels (represented with smaller data points and an alpha value of 0.4) alongside the original labels (depicted with larger data points). Each class is assigned a distinct color for differentiation, and a legend is included for clarity.

Homework 2: Autoregressive Generative Models and Variational Autoencoders

I implemented an LSTM-based autoregressive generative model to predict stock prices, using historical stock data. The model captures complex temporal dependencies by factorizing the joint probability of the time series using the chain rule of probability.
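Concretely, the chain-rule factorization p(x_1, ..., x_T) = prod_t p(x_t | x_<t) is realized by predicting each step from the hidden state summarizing its prefix. A minimal PyTorch sketch with illustrative layer sizes, not the notebook's exact architecture:

```python
import torch
import torch.nn as nn

class StockLSTM(nn.Module):
    def __init__(self, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(1, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):               # x: (batch, seq_len, 1), normalized prices
        out, _ = self.lstm(x)           # hidden state at every timestep
        return self.head(out)           # one-step-ahead prediction at each t

model = StockLSTM()
x = torch.randn(8, 30, 1)               # toy batch: 8 windows of 30 days
pred = model(x)
# Each position predicts the next value, so targets are the series shifted by one
loss = nn.functional.mse_loss(pred[:, :-1], x[:, 1:])
```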

Time series visualization of stock market data, with actual prices shown in blue and predicted prices in red, highlighting the model's forecasting performance.

Stock price chart showing the 10-, 20-, and 50-day moving averages alongside adjusted closing prices, highlighting overall trends.

In this part, I implemented a Variational Autoencoder (VAE) to generate new handwritten digits using the MNIST dataset. The VAE model learns a latent space representation and generates new digits by sampling from the learned distribution.
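A minimal sketch of the core VAE machinery, assuming flattened 28x28 MNIST inputs; the layer sizes are illustrative rather than the notebook's exact architecture:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, latent_dim=20):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(784, 400), nn.ReLU())
        self.mu = nn.Linear(400, latent_dim)
        self.logvar = nn.Linear(400, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 400), nn.ReLU(),
            nn.Linear(400, 784), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick keeps sampling differentiable
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def elbo_loss(recon, x, mu, logvar):
    bce = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld  # negative ELBO: reconstruction + KL to the prior
```

Sampling new digits then amounts to drawing z from the standard normal prior and passing it through the decoder.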

Average images for each digit (0-9) generated by the Variational Autoencoder, illustrating the typical features of handwritten digits.

Homework 3: GANs and Flow-Based Models

This section includes implementations of a basic GAN, a Conditional GAN, and a Wasserstein GAN. The goal was to improve training stability while generating high-quality images through adversarial learning.
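For reference, one WGAN training step looks roughly like the sketch below, using the original weight-clipping recipe (the notebook may instead use a gradient penalty); `critic` and `gen` are assumed nn.Module instances:

```python
import torch

def wgan_step(critic, gen, real, opt_c, opt_g, z_dim=100, clip=0.01):
    # Critic: maximize E[critic(real)] - E[critic(fake)],
    # i.e. minimize the negated objective below
    z = torch.randn(real.size(0), z_dim, device=real.device)
    fake = gen(z).detach()
    loss_c = critic(fake).mean() - critic(real).mean()
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()
    for p in critic.parameters():        # crude Lipschitz constraint
        p.data.clamp_(-clip, clip)

    # Generator: maximize E[critic(fake)]
    z = torch.randn(real.size(0), z_dim, device=real.device)
    loss_g = -critic(gen(z)).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_c.item(), loss_g.item()
```

In practice the critic is usually updated several times per generator step; the conditional variant additionally feeds a label embedding to both networks.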

Generated images of handwritten digits (0-9) from the Basic GAN model trained on the MNIST dataset, demonstrating the model's ability to capture the diversity of digit styles.

Conditional GAN-generated images of handwritten digits, showcasing the model's capability to produce digits based on specified labels from the MNIST dataset.

Generated images from the Wasserstein GAN model trained on the CIFAR-10 dataset, illustrating improved visual quality and diversity compared to traditional GANs.

I applied normalizing flows to enhance a VAE for more realistic image generation on the CelebA dataset. Flow-based models apply a sequence of invertible transformations with tractable Jacobians, enabling exact and efficient sampling and a more expressive latent space representation.
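A RealNVP-style affine coupling layer is one common building block for such a flow (the notebook's exact flow architecture may differ). A sketch assuming an even latent dimension:

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):  # dim assumed even
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim // 2, 128), nn.ReLU(),
            nn.Linear(128, dim))      # outputs scale and shift together

    def forward(self, z):
        z1, z2 = z.chunk(2, dim=-1)            # split the latent vector
        s, t = self.net(z1).chunk(2, dim=-1)
        s = torch.tanh(s)                      # keep scales well behaved
        z2 = z2 * torch.exp(s) + t             # invertible affine transform
        log_det = s.sum(dim=-1)                # exact log|det Jacobian|
        return torch.cat([z1, z2], dim=-1), log_det
```

Stacking several such layers (alternating which half is transformed) yields a flow whose log-density change is just the sum of the per-layer log-determinants.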

Generated images from the Normalizing Flow model.

Homework 4: Denoising Diffusion and Energy-Based Models

I implemented a Denoising Diffusion Probabilistic Model (DDPM) to generate new captcha images by progressively adding noise and learning to reverse this process. The captcha dataset consists of RGB images with corresponding text labels.
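The training loop reduces to corrupting x_0 at a random timestep via the closed-form forward process q(x_t | x_0) and regressing the injected noise. A sketch assuming a linear noise schedule and a noise-prediction network `eps_model` (e.g. a U-Net) taking (x_t, t):

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def ddpm_loss(eps_model, x0):                   # x0: (batch, C, H, W) images
    t = torch.randint(0, T, (x0.size(0),), device=x0.device)
    eps = torch.randn_like(x0)
    ab = alphas_bar.to(x0.device)[t].view(-1, 1, 1, 1)
    x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps  # sample q(x_t | x_0) in closed form
    # Train the network to predict the noise that was added
    return torch.nn.functional.mse_loss(eps_model(x_t, t), eps)
```

Generation then runs the learned reverse process from pure noise down to t = 0; the conditional variant also feeds the captcha text embedding to `eps_model`.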

Generated captcha images from the Denoising Diffusion Probabilistic Model (DDPM) trained on the captcha dataset, showcasing the model's ability to create diverse and realistic samples without conditioning.

Conditional generated captcha images from the Denoising Diffusion Probabilistic Model (DDPM), illustrating the model's capability to produce targeted outputs based on specified text conditions.

In this part, I implemented an Energy-Based Model (EBM) trained with contrastive divergence on the MNIST dataset. The model learns an energy function that assigns low energy to training data, and generates new samples by running MCMC on the learned energy landscape.
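In contrastive divergence, the negative samples come from short-run MCMC, typically Langevin dynamics on the energy landscape. A sketch assuming an `energy` network mapping image batches to scalar energies; step counts and sizes are illustrative:

```python
import torch

def langevin_samples(energy, x, steps=60, step_size=10.0, noise=0.005):
    x = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        # Gradient descent on the energy plus Gaussian noise
        x = (x - step_size * grad + noise * torch.randn_like(x)).detach()
        x = x.clamp(0, 1).requires_grad_(True)   # keep pixels in a valid range
    return x.detach()

def cd_loss(energy, real, init_noise):
    fake = langevin_samples(energy, init_noise)
    # Push energy down on data, up on model samples
    return energy(real).mean() - energy(fake).mean()
```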

Generated digit samples from the Energy-Based Model (EBM) trained on the MNIST dataset using contrastive divergence, showcasing the model's ability to create new images.
