
Does the residual coding really work? #23

Open
XiHuYan opened this issue Jan 10, 2025 · 2 comments

XiHuYan commented Jan 10, 2025

Just curious, based on your experiments, does the residual coding mechanism truly perform better than the traditional MLP-based embedding methods?

EdoardoBotta (Owner) commented:

Good question. I have been experimenting quite a bit with this repo in an attempt to replicate the results from the original RQ-VAE paper, but current performance still lags behind the paper's results. The transformer architecture is not exactly the same as the paper's (the paper uses an encoder-decoder, while I have been using a decoder-only model with roughly the same number of parameters), but I suspect the encoding also plays a part in the gap.

As far as the encoding model is concerned, I do see a clear advantage in storage: the residual codes are far more compact than storing a full item-ID embedding table.
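To make the storage argument concrete, here is a minimal NumPy sketch of residual quantization (names like `codebooks` and `rq_encode` are mine, not the repo's; this is illustrative, not the repo's implementation). Each item is represented by one small integer index per level instead of a full float vector, so N items cost roughly N × m integers plus the shared codebooks, versus N × d floats for an embedding table.

```python
# Minimal residual-quantization sketch (illustrative, not the repo's code).
# `codebooks` is assumed to be a list of (K, d) arrays, one per level.
import numpy as np

def rq_encode(x, codebooks):
    """Encode embedding x of shape (d,) into one index per codebook level."""
    residual = x.copy()
    codes = []
    for cb in codebooks:  # cb: (K, d)
        # Pick the codeword nearest to the current residual.
        idx = np.argmin(((cb - residual) ** 2).sum(axis=1))
        codes.append(int(idx))
        residual = residual - cb[idx]  # the next level quantizes what is left
    return codes

def rq_decode(codes, codebooks):
    """Reconstruct the embedding as the sum of the selected codewords."""
    return sum(cb[i] for cb, i in zip(codebooks, codes))

# Example: d = 64 dims, K = 256 codewords per level, m = 3 levels.
rng = np.random.default_rng(0)
d, K, m = 64, 256, 3
codebooks = [rng.normal(size=(K, d)) for _ in range(m)]
x = rng.normal(size=d)
codes = rq_encode(x, codebooks)      # e.g. [17, 203, 91]: 3 ints per item
x_hat = rq_decode(codes, codebooks)  # approximate reconstruction of x
```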

However, I have found the RQ-VAE very hard to train in a stable manner. I experimented with a number of tricks from the literature to fight codebook collapse before I was able to get good codebook utilization (> 80%).
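For reference, one widely used anti-collapse trick from the literature is to restart "dead" codebook entries by re-initializing them to encoder outputs from the current batch. The sketch below is my own illustration of that idea (names like `restart_dead_codes` and `usage_counts` are hypothetical, and this is not necessarily the variant used in this repo):

```python
# Sketch of a dead-code restart (illustrative; not the repo's implementation).
import torch

@torch.no_grad()
def restart_dead_codes(codebook, usage_counts, batch_latents, min_usage=1):
    """codebook: (K, d) parameter; usage_counts: (K,) hits since last check;
    batch_latents: (B, d) encoder outputs from the current batch."""
    dead = (usage_counts < min_usage).nonzero(as_tuple=True)[0]
    if dead.numel() == 0:
        return
    # Re-seed each dead code with a randomly chosen latent from the batch
    # (sampled with replacement in case the batch is smaller than the count).
    picks = torch.randint(0, batch_latents.size(0), (dead.numel(),))
    codebook.data[dead] = batch_latents[picks]
    usage_counts[dead] = 0
```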

XiHuYan (Author) commented Jan 10, 2025

Thanks for sharing these valuable insights. I hope you find a way to fully overcome the collapse! :D
