This repository contains canonical copies of useful reference datasets for the Wikimedia movement. It is maintained jointly by the Wikimedia Foundation's Product Analytics and Data Engineering teams.
The data here is stored in tab-separated values (TSV) format. For simplicity, values should not contain any tabs or newlines.
Because values never contain the delimiter, the files need no escaping or quoting, which avoids the issues often caused by the CSV format (for an example, see T327983).
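
As an illustration of the format described above, here is a minimal Python sketch of writing and reading such files. The helper names `write_tsv` and `read_tsv` and the sample rows are hypothetical and not part of this repository. The sketch assumes that rejecting values with tabs or newlines at write time is acceptable, and it uses the standard library's `csv` module with quoting disabled, so quote characters are treated as ordinary text.

```python
import csv
import io


def write_tsv(rows, out):
    """Write rows as plain TSV, rejecting values that contain tabs or newlines.

    Hypothetical helper: the name and behavior are assumptions for illustration.
    """
    for row in rows:
        for value in row:
            if "\t" in value or "\n" in value or "\r" in value:
                raise ValueError(f"value contains a tab or newline: {value!r}")
        out.write("\t".join(row) + "\n")


def read_tsv(source):
    """Read plain TSV by splitting on tabs only, with no quote processing."""
    reader = csv.reader(source, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader]


# Made-up sample data: quotes and commas pass through untouched.
sample_rows = [
    ["id", "label"],
    ["1", 'a "quoted" value, with a comma'],
]

buffer = io.StringIO()
write_tsv(sample_rows, buffer)
buffer.seek(0)
round_tripped = read_tsv(buffer)
```

Since no value can contain the delimiter or a line break, each line maps to exactly one row and each tab to exactly one column boundary, so the reader needs no quoting rules at all.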