Lower-case in LM1B #7
Hi, thank you for your question! I have to admit that we made a mistake in that statement; we will remove it in a later version. Nevertheless, we believe the comparison is fair. We re-implemented D3PM in PyTorch, replaced its backbone with the bert-base-uncased architecture, and used the same tokenizer, so both methods operate on lower-cased text. The baseline results were obtained from this re-implementation. It is also worth noting that our reported D3PM-absorbing results are only slightly worse than those in the original paper, despite our limited computational resources, which suggests our implementation is correct. Note that DiffusionBert was trained for even less time. Hope this helps! Please feel free to contact me if you have any other questions. We will also include the cased results in the final version. :)
Hello!
In the paper you write:
But the D3PM paper never states that the LM1B data was lower-cased (and the samples from their model in the appendix contain upper-case characters). So the perplexity comparison seems incorrect, because all-lowercased text is easier to model. Am I missing something?
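The intuition behind "lowercased text is easier to model" can be sketched with a toy stdlib-only example (not from this thread): mapping cased symbols onto their lowercase forms merges categories in the symbol distribution, which can only reduce the empirical entropy, and lower entropy translates directly into lower perplexity.

```python
# Toy illustration: lowercasing shrinks the symbol inventory, which can only
# reduce the empirical per-symbol entropy (merging distribution categories
# never increases entropy), and perplexity is 2 ** entropy.
from collections import Counter
import math


def char_entropy(text: str) -> float:
    """Empirical character-level entropy in bits per character."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


sample = "The cat sat on The Mat. THE CAT SAT."
cased = char_entropy(sample)
lowered = char_entropy(sample.lower())

print(f"cased:   {len(set(sample))} symbols, {cased:.3f} bits/char")
print(f"lowered: {len(set(sample.lower()))} symbols, {lowered:.3f} bits/char")
assert lowered <= cased  # holds for any text, not just this sample
```

This is only a character-level caricature; the same effect applies at the subword level when both models share a lowercasing tokenizer such as the one shipped with bert-base-uncased, which is why comparing a cased baseline against a lowercased model inflates the baseline's perplexity.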