https://arxiv.org/abs/2107.06499
Deduplicating Training Data Makes Language Models Better (Katherine Lee, Daphne Ippolito, Andrew Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini)