Replace raw "language code" with a popup #790
Where did you get the list from? It is probably not complete, right? |
I googled BCP 47 language tags. @justvanrossum, where should we get the complete list? |
I was hoping you could help find out. The spec is apparently here: https://www.rfc-editor.org/info/bcp47 but I don't see a "list of codes + names". Wakamaifondue has this list: https://github.com/Wakamai-Fondue/wakamai-fondue-engine/blob/master/src/tools/ot-to-html-lang.js It also contains the OpenType tag for each language, which we don't need per se. Please check in that repository how that file was made: it would be nice if we could have a script that generates the needed data. |
Actually, the OpenType tag would be useful if we ever parse the language information from the actual fonts. Maybe we should just use ot-to-html-lang.js as is. (While parsing the font would be best, I'm worried about performance for big fonts.) |
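To illustrate why keeping the OpenType tag next to each language code could pay off, here is a minimal sketch of the lookup direction from OpenType language system tags to BCP 47 codes. The `OT_TO_BCP47` table is a tiny hand-written excerpt for illustration only, and both it and the helper name `to_language_codes` are hypothetical. The real data would come from ot-to-html-lang.js or a generated equivalent, whose exact shape is not assumed here.

```python
from typing import Iterable, List, Tuple

# Hypothetical excerpt: OpenType language system tag -> BCP 47 code.
# OpenType tags are always four characters, padded with spaces.
OT_TO_BCP47 = {
    "BEL ": "be",  # Belarusian
    "DEU ": "de",  # German
    "NLD ": "nl",  # Dutch
    "ROM ": "ro",  # Romanian
    "TRK ": "tr",  # Turkish
}


def to_language_codes(ot_tags: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (known BCP 47 codes, tags that had no entry in the table)."""
    codes: List[str] = []
    unknown: List[str] = []
    for tag in ot_tags:
        key = tag.ljust(4)  # tolerate tags whose padding was stripped
        code = OT_TO_BCP47.get(key)
        if code is None:
            unknown.append(tag)
        elif code not in codes:  # keep first-seen order, drop duplicates
            codes.append(code)
    return codes, unknown


codes, unknown = to_language_codes(["NLD ", "TRK", "ZZZ "])
print(codes, unknown)
```

Tags missing from the table are returned separately rather than dropped, so the UI could still fall back to a raw language code field for them.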
This is too much. I am not sure if |
This list looks more realistic: https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes |
There is a language code list on the Google Fonts page. I noticed that Belarusian is missing. |
Maybe parsing the font could be done on the Python side if we are worried about performance. |
(I do worry about the imbalance between the effort of implementing this feature versus its (low) priority.) |
I tried Wakamaifondue; it isn't working well. Try it with |
Wakamaifondue worked well with the IBM Flex Sans font. It returns these: If we are going to parse the font, as you suggested, we can look up these language names in ot-to-html-lang.js and get their language codes. We can show the result in UI in |
The simplest solution looks like defining all the language codes in |
Can you be more specific about what didn't work well? |
It didn't work with the font I tried (Google Sans). Supported languages were empty in the result. |
You tried it on the Wakamaifondue beta site? It worked for me with the subset, but froze on the full font. (It could perhaps have completed eventually, but I didn't have the patience.) I'm pretty sure Wakamaifondue does a lot more font parsing than only getting the languages out, and I'd like to know how long it would take to do just that. Can you find the code in Wakamaifondue that is responsible for parsing the languages, and adapt it for a test we can do ourselves? |
Yes
It reads the "GSUB" table of the binary font. A full file read is required. https://github.com/Wakamai-Fondue/wakamai-fondue-engine/blob/master/src/fondue/Fondue.js#L964 In my opinion, the full binary read should happen on one side, either in Python or in JavaScript. |
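For a sense of how little parsing "just the languages" needs, here is a minimal sketch that pulls the language system tags out of a GSUB table using only `struct`. It is not Wakamaifondue's code. It follows the OpenType layout (sfnt table directory, then GSUB header, ScriptList, Script tables) and assumes a single uncompressed font held in memory, not a WOFF/WOFF2 file or a collection. The function names and the synthetic `SAMPLE_GSUB` bytes are made up for the demo.

```python
import struct
from typing import Dict, List, Optional


def find_table(font: bytes, tag: bytes) -> Optional[bytes]:
    """Return the raw bytes of one table from a single-font sfnt file."""
    num_tables = struct.unpack_from(">H", font, 4)[0]
    for i in range(num_tables):
        # Table record: tag, checksum, offset, length (16 bytes each).
        rec_tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", font, 12 + 16 * i
        )
        if rec_tag == tag:
            return font[offset:offset + length]
    return None


def language_system_tags(table: bytes) -> Dict[str, List[str]]:
    """Map each script tag in a GSUB table to its language system tags."""
    # GSUB header: major, minor, scriptListOffset, featureListOffset, ...
    script_list = struct.unpack_from(">H", table, 4)[0]
    script_count = struct.unpack_from(">H", table, script_list)[0]
    result: Dict[str, List[str]] = {}
    for i in range(script_count):
        # ScriptRecord: tag + offset, relative to the start of ScriptList.
        script_tag, script_offset = struct.unpack_from(
            ">4sH", table, script_list + 2 + 6 * i
        )
        script = script_list + script_offset
        # Script table: defaultLangSysOffset, langSysCount, LangSysRecords.
        lang_count = struct.unpack_from(">H", table, script + 2)[0]
        tags = []
        for j in range(lang_count):
            lang_tag = struct.unpack_from(">4s", table, script + 4 + 6 * j)[0]
            tags.append(lang_tag.decode("ascii"))
        result[script_tag.decode("ascii")] = tags
    return result


# Synthetic GSUB table for the demo: one script ("latn"), two languages.
SAMPLE_GSUB = (
    struct.pack(">HHHHH", 1, 0, 10, 0, 0)            # header, ScriptList at 10
    + struct.pack(">H4sH", 1, b"latn", 8)            # ScriptList, 1 record
    + struct.pack(">HH4sH4sH", 0, 2, b"NLD ", 16, b"TRK ", 22)  # Script table
    + struct.pack(">HHH", 0, 0xFFFF, 0) * 2          # two empty LangSys tables
)

print(language_system_tags(SAMPLE_GSUB))
```

With a real font, something like `language_system_tags(find_table(data, b"GSUB"))` wrapped in `time.perf_counter()` calls would give a rough number for the parsing cost alone. A browser measurement would need the equivalent in JavaScript with a `DataView`, since the offsets are the same.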
Ah great. Check also:
Like I said before, reference-font is a client-side feature, so JS it is. I would love to know how long it takes to extract the languages from a big (GS) font in a browser. |
I will delete this PR and start a new one. |
Fixes #746.