Specifications of Pāḷi Tipiṭaka & Dictionary Websites

Multilingual Support of the Website

Two different implementations are allowed. The locales are listed here.

Implementation #1 (Current Implementation)

When users visit /.*, the server should serve the website content in the locale selected by the HTTP ACCEPT_LANGUAGES header. If none of the locales in ACCEPT_LANGUAGES is supported, the server should serve English (en_US) content by default.

When users visit /{{ locale }}/.*, the server should serve the website content in the {{ locale }} language regardless of the HTTP ACCEPT_LANGUAGES header. For example, when users visit /zh_TW/.*, the server should serve the website content in Traditional Chinese.
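The two rules above can be sketched as a single resolution function. This is only a minimal illustration, not the site's actual code: the function names are hypothetical, the locale list is assumed from the top-level paths in the URL structure section, and falling back from a bare language tag (such as "fr") to the first supported locale of that language is an assumption the spec does not state.

```python
# Assumed locale list, taken from the top-level paths listed in this spec.
SUPPORTED_LOCALES = ["en_US", "zh_TW", "zh_CN", "fr_FR", "vi_VN"]
DEFAULT_LOCALE = "en_US"


def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag or tag == "*":
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def match_locale(tag):
    """Map a tag such as 'zh-TW' or 'fr' to a supported locale, or None."""
    normalized = tag.replace("-", "_").lower()
    for locale in SUPPORTED_LOCALES:
        if locale.lower() == normalized:
            return locale
    # Assumption: fall back to the first locale sharing the primary language.
    primary = normalized.split("_")[0]
    for locale in SUPPORTED_LOCALES:
        if locale.lower().split("_")[0] == primary:
            return locale
    return None


def resolve_locale(path, accept_language=""):
    """A locale prefix in the path wins; otherwise negotiate; else en_US."""
    first_segment = path.lstrip("/").split("/", 1)[0]
    if first_segment in SUPPORTED_LOCALES:
        return first_segment
    for tag in parse_accept_language(accept_language):
        locale = match_locale(tag)
        if locale:
            return locale
    return DEFAULT_LOCALE
```

For instance, `resolve_locale("/zh_TW/browse/s/sacca", "fr-FR")` ignores the header because the path carries a locale prefix, while `resolve_locale("/", "de-DE")` falls through to the English default.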

Implementation #2 (sub-domain implementation)

When users visit example.org/.* or www.example.org/.*, the server should serve the website content in the locale selected by the HTTP ACCEPT_LANGUAGES header. If none of the locales in ACCEPT_LANGUAGES is supported, the server should serve English (en_US) content by default.

When users visit {{ locale }}.example.org/.*, the server should serve the website content in the {{ locale }} language regardless of the HTTP ACCEPT_LANGUAGES header. For example, when users visit zh_TW.example.org/.*, the server should serve the website content in Traditional Chinese.
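For the sub-domain variant, the forced locale can be read from the Host header instead of the path. The following is a rough sketch under assumptions: `locale_from_host` and `BASE_DOMAIN` are hypothetical names, the locale list is assumed from the top-level paths in this spec, and a return value of `None` is taken to mean "negotiate with ACCEPT_LANGUAGES as for example.org and www.example.org".

```python
# Assumed values for illustration only.
SUPPORTED_LOCALES = ["en_US", "zh_TW", "zh_CN", "fr_FR", "vi_VN"]
BASE_DOMAIN = "example.org"


def locale_from_host(host):
    """Return the locale forced by the host name, or None to negotiate."""
    # Drop an optional port and compare case-insensitively,
    # since host names are not case-sensitive.
    hostname = host.split(":", 1)[0].lower()
    suffix = "." + BASE_DOMAIN
    if not hostname.endswith(suffix):
        return None  # bare example.org, or an unrelated host
    label = hostname[: -len(suffix)]
    for locale in SUPPORTED_LOCALES:
        if locale.lower() == label:
            return locale
    return None  # e.g. www.example.org
```

With this sketch, `locale_from_host("zh_TW.example.org")` yields `"zh_TW"`, and both `example.org` and `www.example.org` yield `None`, leaving the decision to header negotiation.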

Pali Text Title Translation

Question: What is a Pali text title? Answer: For example, vinaya, Dīghanikāya, and Brahmajālasuttaṃ are Pali text titles. You can see this url. The left-side treeview, the HTML title, and the links above the translation all contain translations of Pali text titles in Traditional Chinese.

If implementation #1 of the multilingual support is chosen, then when users visit /.*, Pali text titles are translated according to the HTTP ACCEPT_LANGUAGES header. If none of the locales in ACCEPT_LANGUAGES is supported, Pali text titles are translated to English (en_US) by default. When users visit /{{ locale }}/.*, Pali text titles are translated to the {{ locale }} language. For example, when users visit /zh_TW/.*, Pali text titles are translated to Traditional Chinese.

Order of Dictionaries When Users Look Up a Word

Dictionaries in 5 languages are currently supported: Pali-English, Pali-Chinese, Pali-Japanese, Pali-Vietnamese, Pali-Burmese.

When users look up the definition of a word, whether in the tooltip or in the preview, the order of the dictionary languages should by default follow the HTTP ACCEPT_LANGUAGES header. The order of any dictionary languages not listed in ACCEPT_LANGUAGES is left to the programmers.

In addition, the website settings should provide an option for users to choose the order of the dictionary languages.
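One possible reading of these ordering rules is sketched below. Everything named here is an assumption made for illustration: the two-letter language codes, the fallback order in `DICTIONARY_LANGUAGES`, the function names, and the choice that an explicit user setting takes precedence over the header.

```python
# Assumed codes for the five dictionary languages; the list order doubles
# as the programmer-chosen fallback order for languages not requested.
DICTIONARY_LANGUAGES = ["en", "zh", "ja", "vi", "my"]


def languages_from_header(accept_language):
    """Return primary language codes from Accept-Language, best first."""
    weighted = []
    for position, item in enumerate(accept_language.split(",")):
        tag, _, params = item.strip().partition(";")
        params = params.strip()
        quality = 1.0
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        primary = tag.strip().replace("_", "-").split("-")[0].lower()
        if primary and primary != "*" and quality > 0:
            weighted.append((-quality, position, primary))
    return [primary for _, _, primary in sorted(weighted)]


def order_dictionaries(accept_language="", user_preference=None):
    """Order dictionary languages: user setting, else header, then the rest."""
    if user_preference:
        preferred = list(user_preference)
    else:
        preferred = languages_from_header(accept_language)
    ordered = []
    for language in preferred:
        if language in DICTIONARY_LANGUAGES and language not in ordered:
            ordered.append(language)
    # Languages the user did not mention keep the fallback order.
    for language in DICTIONARY_LANGUAGES:
        if language not in ordered:
            ordered.append(language)
    return ordered
```

Under these assumptions, a browser sending `zh-TW,zh;q=0.9,en;q=0.8` sees Pali-Chinese first and Pali-English second, followed by the remaining dictionaries in the fallback order.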

URL Structure of Dictionary Website

Current Implementation

The following 6 paths are top-level paths, and they all share the same sub-structure.

  • /
  • /zh_TW/
  • /zh_CN/
  • /en_US/
  • /fr_FR/
  • /vi_VN/

The only difference among the above 6 paths is that / detects the HTTP ACCEPT_LANGUAGES header and shows the website in the corresponding {{ locale }} language; if that language is not supported, English (en_US) is shown by default. Paths starting with /{{ locale }}/ show the website in the corresponding {{ locale }} language, regardless of the HTTP ACCEPT_LANGUAGES header.

In /, there are links pointing to /browse/{{ first_char_of_pali_word }}. The possible first characters of Pali words are:

['a', 'ā', 'b', 'c', 'd', 'ḍ', 'e', 'g', 'h', 'i', 'ī', 'j', 'k', 'l', 'ḷ', 'm', 'ŋ', 'n', 'ñ', 'ṅ', 'ṇ', 'o', 'p', 'r', 's', 't', 'ṭ', 'u', 'ū', 'v', 'y', '-', '°']

Under /browse/{{ first_char_of_pali_word }}, there are links to words starting with the same first character. For example, there are more than 30,000 words starting with "a".

Every Pali word should have a unique URL. For example, the URL of the word 'sacca' is /browse/s/sacca.

The structure of /zh_TW/, /fr_FR/, etc. should be the same as that of /, except that the content at those URLs (not including word explanations) is shown in the corresponding {{ locale }} language.
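The URL scheme can be illustrated with a small helper. This is a sketch only: `word_url` is a hypothetical name, and both the NFC normalization and the percent-encoding of non-ASCII characters are assumptions about how a word such as 'ākāsa' might be mapped to a single stable URL; the spec itself does not say how such characters are encoded.

```python
import unicodedata
from urllib.parse import quote


def word_url(word, locale=None):
    """Build /browse/{first_char}/{word}, optionally under a locale prefix."""
    # Assumption: normalize to NFC so that precomposed and decomposed
    # spellings of the same word map to one URL.
    word = unicodedata.normalize("NFC", word)
    if not word:
        raise ValueError("word must not be empty")
    path = "/browse/{}/{}".format(quote(word[0]), quote(word))
    if locale:
        path = "/{}{}".format(locale, path)
    return path


print(word_url("sacca"))           # /browse/s/sacca
print(word_url("sacca", "zh_TW"))  # /zh_TW/browse/s/sacca
```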

Data of Pali-English, Pali-Chinese, Pali-Japanese, Pali-Vietnamese, Pali-Burmese dictionaries

All data of the Pali-English, Pali-Chinese, Pali-Japanese, Pali-Vietnamese, and Pali-Burmese dictionaries is located in the dictionary directory of the data repository. The format is explained in the README under the same directory. The data is in CSV format and can be easily processed. To see how the dictionary data is pre-processed before deployment of the dictionary website, please refer to the Python scripts under dictionary/setup in the pali repository.

There are 504,414 explanations in the database, covering 210,111 words.

Data of XML files of Pali texts written in Roman characters (including canons, commentaries, sub-commentaries, etc.) released by VRI

The data is located in the tipitaka/romn directory of the data repository.