Good question. I have been experimenting quite a bit with this repo in an attempt to replicate the results from the original RQ-VAE paper, but current performance still lags behind the paper's numbers. The transformer architecture is not exactly the same as the one in the paper (the paper uses an encoder-decoder, while I have been using a decoder-only model with roughly the same number of parameters), but I suspect the encoding also plays a part.
As far as the encoding model is concerned, I do see an advantage in reduced storage: instead of a full item-ID embedding table, this approach only stores the shared codebooks plus a short code tuple per item (rough back-of-the-envelope comparison below).
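To make that concrete, here is a quick sketch of the parameter counts under some hypothetical catalog sizes (all the numbers below are illustrative assumptions, not measurements from this repo):

```python
# Rough storage comparison (hypothetical sizes): a full item-ID embedding
# table vs. shared RQ-VAE codebooks plus a per-item semantic-ID tuple.
num_items = 1_000_000            # hypothetical catalog size
dim = 128                        # embedding dimension
levels, codebook_size = 3, 256   # RQ depth and codes per level

id_table_params = num_items * dim         # 128,000,000 floats
rq_params = levels * codebook_size * dim  # 98,304 floats (shared codebooks)
per_item_codes = num_items * levels       # 3,000,000 small ints (one tuple per item)

print(f"ID table:  {id_table_params:,} floats")
print(f"RQ-VAE:    {rq_params:,} floats + {per_item_codes:,} int8 codes")
```

The codebooks are shared across all items, so the float storage no longer scales with catalog size; only the (much cheaper) integer code tuples do.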
However, I have found the RQ-VAE very hard to train stably. I had to experiment with a number of tricks from the literature to fight codebook collapse before I could get good codebook utilization (> 80%); one such trick, restarting dead codes, is sketched below.
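For reference, a minimal sketch of a dead-code restart, a common collapse mitigation from the VQ literature. This assumes a PyTorch-style codebook with EMA usage counts; the function and argument names are illustrative, not this repo's actual API:

```python
import torch

def restart_dead_codes(codebook, ema_usage, batch_latents, threshold=1.0):
    """Reinitialize rarely used codes with encoder outputs from the batch.

    codebook:      (K, D) tensor of code vectors
    ema_usage:     (K,) EMA count of how often each code was selected
    batch_latents: (B, D) encoder outputs (residuals) from the current batch
    """
    dead = ema_usage < threshold          # codes whose EMA usage dropped too low
    n_dead = int(dead.sum())
    if n_dead == 0:
        return
    # Replace each dead code with a randomly sampled latent from the batch,
    # pulling it back into the region the encoder actually occupies.
    idx = torch.randint(0, batch_latents.shape[0], (n_dead,))
    codebook.data[dead] = batch_latents[idx].detach()
    ema_usage[dead] = threshold           # reset usage so restarted codes get a chance
```

Other tricks in the same family include k-means initialization of the codebooks and EMA codebook updates in place of the codebook loss term.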
Just curious: based on your experiments, does the residual coding mechanism truly perform better than traditional MLP-based embedding methods?