Justen, Lennart, Kilian Müller, et al. No Time Like the Present: Effects of Temporal Language Change on Comment Moderation. 2022.
(abstract).
The spread of online hate has become a major problem for newspapers that host comment sections. There is growing interest in using machine learning and natural language processing for (semi-)automated abusive language detection to avoid the costs of manual comment moderation or having to shut down comment sections completely. However, much of the past work on abusive language detection with ML uses random train-test splitting procedures that assume an unrealistically static language environment. Using a new dataset of German newspaper comments, we show that a time-stratified evaluation procedure provides a more realistic measure of a classifier’s performance on future data. We also show that the performance of classifiers can degrade quickly as the training data grows more outdated and as language and news coverage evolve. In particular, the performance of classifiers trained on data from before the Covid-19 pandemic drops sharply when evaluated on Covid-era comments.
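
The contrast between random and time-stratified splitting can be illustrated with a minimal sketch. This is not the authors' procedure: the record fields (`date`, `text`, `abusive`), the toy data, and the March 2020 cutoff are all hypothetical and chosen only for illustration.

```python
import random
from datetime import date


def random_split(records, test_fraction=0.2, seed=0):
    """Shuffle records and split them, ignoring when each was written."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def time_stratified_split(records, cutoff):
    """Train on records dated before `cutoff`, test on the rest."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test


# Hypothetical toy comments: ten from 2019, five from mid-2020.
comments = [
    {"date": date(2019, m, 1), "text": f"comment 2019-{m}", "abusive": m % 2}
    for m in range(1, 11)
] + [
    {"date": date(2020, m, 1), "text": f"comment 2020-{m}", "abusive": m % 2}
    for m in range(4, 9)
]

# Assumed cutoff for illustration; the paper's actual split dates may differ.
cutoff = date(2020, 3, 1)

rand_train, rand_test = random_split(comments)
time_train, time_test = time_stratified_split(comments, cutoff)

# In the time-stratified split, every training comment predates every test comment.
latest_train = max(r["date"] for r in time_train)
earliest_test = min(r["date"] for r in time_test)
print(latest_train < earliest_test)
```

With a random split, the training and test sets are typically drawn from the same period, so a classifier is scored on language and topics it has effectively already seen. The time-stratified split instead scores it only on comments written after the training data ends, which is closer to how a deployed moderation model is used.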