CLDF dataset derived from Beijing University's "Chinese Dialect Vocabularies" from 1964

lexibank/beidasinitic

How to cite

If you use these data, please cite

  • the original source

    Běijīng Dàxué 北京大学 (1964): Hànyǔ fāngyán cíhuì 汉语方言词汇 [Chinese dialect vocabularies]. Beijing: Wenzi Gaige.

  • the derived dataset, using the DOI of the particular released version you used

Description

This dataset is licensed under a CC-BY-4.0 license.

Available online at https://github.com/digling/cddb/

Conceptlists in Concepticon:

Notes

This dataset, which is well known among Sinologists, comprises 18 dialect varieties. The data were collected during the 1950s and digitized during 2012 and 2016. We offer the data in morpheme-segmented form, with a slightly adjusted IPA transcription.
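To illustrate what working with morpheme-segmented forms can look like, here is a minimal Python sketch. It assumes the common CLDF convention that segments are separated by spaces and that a standalone `+` marks a morpheme boundary; the helper name `split_morphemes` and the example string are hypothetical and not taken from the dataset.

```python
def split_morphemes(segments: str) -> list[list[str]]:
    """Split a space-separated segment string into morphemes.

    Assumption: a standalone '+' token marks a morpheme boundary.
    """
    morphemes: list[list[str]] = []
    current: list[str] = []
    for token in segments.split():
        if token == "+":
            # Close the current morpheme, skipping empty ones.
            if current:
                morphemes.append(current)
            current = []
        else:
            current.append(token)
    if current:
        morphemes.append(current)
    return morphemes


# Hypothetical two-morpheme form, not an actual entry of the dataset.
example = "tʰ ai ⁵¹ + i a ŋ ³⁵"
for index, morpheme in enumerate(split_morphemes(example), start=1):
    print(index, " ".join(morpheme))
```

Returning a list of segment lists keeps the individual sounds accessible, which is convenient when morphemes are later compared across varieties.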

Statistics

  • Glottolog: 100%
  • Concepticon: 82%
  • Source: 100%
  • BIPA: 100%
  • CLTS SoundClass: 100%

  • Varieties: 18 (linked to 18 different Glottocodes)
  • Concepts: 905 (linked to 738 different Concepticon concept sets)
  • Lexemes: 18,059
  • Sources: 1
  • Synonymy: 1.11
  • Invalid lexemes: 0
  • Tokens: 120,791
  • Segments: 279 (0 BIPA errors, 0 CLTS sound class errors, 279 CLTS modified)
  • Inventory size (avg): 62.78
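Figures of this kind can be approximated from a CLDF form table. The sketch below uses only the Python standard library and a tiny invented sample in place of the real data. It assumes the standard CLDF column names `Language_ID`, `Parameter_ID`, and `Segments`, that synonymy is the mean number of lexemes per variety-concept pair, and that `+` boundary markers do not count as tokens. The function name `wordlist_stats` is hypothetical, and the official statistics may be computed differently.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows in the shape of a CLDF form table (not real data).
SAMPLE = """ID,Language_ID,Parameter_ID,Form,Segments
1,A,sun,tai yang,t ai + j a ŋ
2,A,sun,ri,r i
3,A,moon,yue,y e
4,B,sun,nit,n i t
"""


def wordlist_stats(rows):
    """Compute simple wordlist statistics from form-table rows."""
    rows = list(rows)
    # Each attested (variety, concept) pair counts once for synonymy.
    slots = {(r["Language_ID"], r["Parameter_ID"]) for r in rows}
    inventories = defaultdict(set)
    tokens = 0
    for r in rows:
        for seg in r["Segments"].split():
            if seg == "+":  # assumed morpheme boundary marker, not a sound
                continue
            inventories[r["Language_ID"]].add(seg)
            tokens += 1
    all_segments = set().union(*inventories.values()) if inventories else set()
    return {
        "varieties": len({r["Language_ID"] for r in rows}),
        "concepts": len({r["Parameter_ID"] for r in rows}),
        "lexemes": len(rows),
        "synonymy": len(rows) / len(slots) if slots else 0.0,
        "tokens": tokens,
        "segments": len(all_segments),
        "inventory_avg": (
            sum(len(inv) for inv in inventories.values()) / len(inventories)
            if inventories
            else 0.0
        ),
    }


stats = wordlist_stats(csv.DictReader(io.StringIO(SAMPLE)))
for key, value in stats.items():
    print(f"{key}: {value}")
```

To run the same function on the actual data, pass it a `csv.DictReader` opened on the dataset's form table instead of the sample string.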

Contributors

| Name | GitHub user | Description | Role |
| --- | --- | --- | --- |
| Beijing University | | data collection | Author |
| Johann-Mattis List | @LinguList | maintainer | Editor |

CLDF Datasets

The following CLDF datasets are available in cldf: