The Kyrgyz News Corpus is a dataset of 256,364 Kyrgyz-language news articles collected from various news sites.

Akyl-AI/Kyrgyz_News_Corpus

Kyrgyz_News_Corpus

The Kyrgyz News Corpus is a dataset of 256,364 Kyrgyz-language news articles collected from various news sites by web scraping.

This corpus contains news on various topics, including politics, economics, culture, sports and others. Each entry in the dataset is a separate news item, including the text and the source.
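As a rough illustration of the entry structure described above, the sketch below counts news items per source. It assumes each entry has been loaded as a mapping with `text` and `source` keys; those field names, the `count_by_source` helper, and the sample entries are hypothetical and not taken from the dataset, whose actual file format is not stated here.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_source(records: Iterable[Mapping[str, str]]) -> Counter:
    """Count news items per source.

    The "source" key is an assumed field name, not confirmed by the dataset.
    """
    return Counter(record["source"] for record in records)


# Made-up sample entries for illustration only; not real corpus data.
sample = [
    {"text": "Биринчи жаңылык.", "source": "site-a.example"},
    {"text": "Экинчи жаңылык.", "source": "site-b.example"},
    {"text": "Үчүнчү жаңылык.", "source": "site-a.example"},
]

counts = count_by_source(sample)
print(counts.most_common())
```

If the real field names differ, only the key used inside `count_by_source` would need to change.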

This dataset may be used for research purposes only, such as natural language processing and topic modeling. It can be useful for researchers, developers, and students interested in analyzing Kyrgyz-language texts and related tasks.

References

This work was made possible by the robust AI community in Kyrgyzstan and by the contributions of individuals within the AkylAI project (by TheCramer.com). We also thank the Kyrgyz news agencies, whose work allowed us to create this dataset.

Dataset

The Kyrgyz News Corpus dataset can be downloaded from here.

Next

We are working on a Kyrgyz spell checker and grammar corrector. Please feel free to reach out to timur.turat@gmail.com or rkizmailov@gmail.com if you are interested in any form of collaboration!
