SDC

The Saudi Dialect Corpus (SDC) is a 210,396-word corpus.
It was built to train the Saudi model and contains the mixed dialects of Saudi Arabia.
It was collected from social media platforms such as Facebook and Twitter.
It is 2,018 KB in size.
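
As a quick sanity check after downloading, the word count and file size can be recomputed with a few lines of Python. This is a minimal sketch, not the authors' procedure: it assumes the corpus is the `SDC.txt` file in this repository, that it is UTF-8 encoded, and that words are separated by whitespace. The helper name `corpus_stats` is hypothetical.

```python
from pathlib import Path


def corpus_stats(path, encoding="utf-8"):
    """Return (word_count, size_in_kb) for a plain-text corpus file."""
    file_path = Path(path)
    # UTF-8 is an assumption; pass a different encoding if decoding fails.
    text = file_path.read_text(encoding=encoding)
    # Whitespace tokenization is an assumption; the authors' counting
    # method is not stated here, so the total may differ from 210,396.
    word_count = len(text.split())
    # Size on disk in kilobytes (1 KB taken as 1024 bytes).
    size_kb = file_path.stat().st_size / 1024
    return word_count, size_kb
```

Calling `corpus_stats("SDC.txt")` from the repository root should give figures close to those quoted above if these assumptions hold; a large mismatch would suggest a different tokenization or an incomplete download.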

If you use the SDC, please cite this paper:

Tarmom, T., Teahan, W., Atwell, E. and Alsalka, M.A., 2020. Compression versus traditional machine learning classifiers to detect code-switching in varieties and dialects: Arabic as a case study. Natural Language Engineering, 26(6), pp.663-676.
