Russian in PWG, a few more #414
Comments
|
I'd like to review the spelling of other OCS words already written in PWG.
Yes, OCS and Russian words should be marked up separately. OCS is usually printed with a special extrabold uncial cyrillic font. |
I'm taking Gasyoun's texts with SergeA's revisions on cases 2,3. Regarding the distinction between OCS and Russian:
@SergeA -- Do you want to review those 82 cases? If so, what do you need from me for that review? If it seems important to revise the markup to distinguish the OCS cases, then it probably makes sense to do this after a review of the 82 cases. |
All. Just a list will do.
Yap, most letters are the same. |
Yes.
You know, I'll be happy if you'll provide a UI for the task. Or perhaps a list with (headword / Russian word / link to the online article / link to the scan image). Is it easy to generate such a list? I do not know the technical side. I do not know if it is possible to give a link to the exact Cologne article. Often it'd be very handy to provide in the discussion a direct link to the MW meaning etc. |
@SergeA Give this russian02.html a try. Note 1: I developed this for a desktop screen (1920 x 1080), and have only used Chrome in testing. Note 2: The |
Thank you, it's workable ok. I think I'll finish the check quickly. |
Need correction:
(№20 краты und №21 кратъ)
(№56 краса ... №57 красити ... №58 красьнъ ... №59 красный)
Remove question mark, tag as LS:
Language = OCS:
Language = RUS:
Language = Greek: |
Three more corrections, which require fonts with support of Cyrillic Extended-B Unicode range. (In my comp they are viewable with fonts: DejaVu Serif & Old Standard TT.)
I am not sure about the last if it makes any difference to spell with ы or ꙑ. I am not an expert in OCS, but it seems to me they are optional graphical variants. In another example Böhtlingk spells through ы. |
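A small sketch of how one might flag words that need a font covering Cyrillic Extended-B, such as the ꙑ mentioned above. The block range U+A640..U+A69F is the Unicode definition of Cyrillic Extended-B; the function names are hypothetical and not part of any Cologne tooling.

```python
# Cyrillic Extended-B occupies U+A640..U+A69F in Unicode.
CYRILLIC_EXT_B = range(0xA640, 0xA6A0)


def ext_b_chars(word):
    """Return the characters of `word` that lie in Cyrillic Extended-B."""
    return [ch for ch in word if ord(ch) in CYRILLIC_EXT_B]


def needs_ext_b_font(word):
    """True if `word` contains at least one Cyrillic Extended-B character."""
    return bool(ext_b_chars(word))


# Illustrative input only: U+044B is basic Cyrillic yeru,
# U+A651 and U+A657 are Extended-B letters.
for ch in ["\u044b", "\ua651", "\ua657"]:
    print(f"U+{ord(ch):04X}", needs_ext_b_font(ch))
```

Running something like this over the list of corrected words would show which entries depend on fonts such as DejaVu Serif or Old Standard TT.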
In case 38: I'm removing the Should we provide in pwgbib an English translation and/or other description of this Russian language work? |
Language names
We should probably aim to follow a standard spelling for language name in I used language names consistent with ISO 639-2.
Why the hyphens in Old-Church-Slavonic?
There is a clash between two standards:
It seems like a good consistency and documentation feature to require specific spellings for So a compromise has to be made in our usage of one or the other of these two standards. The compromise I made is to replace the space character with the hyphen character.
Capitalization
Note that the attribute list is case-sensitive. In the one case (among the 84) where the language was Greek, I spelled the attribute All of this is like arguing about the number of angels that can dance on the head of a pin. But sometime, when there's nothing better to do, we should standardize the language name spellings |
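The space-to-hyphen compromise can be sketched as follows. This assumes the constraint comes from the XML rule that enumerated attribute values in a DTD must be name tokens, which cannot contain spaces; the function name and the ASCII-only pattern are illustrative, not the actual Cologne code.

```python
import re

# ASCII-only approximation of an XML name token (NMTOKEN); the real
# production also admits many non-ASCII letters.
NMTOKEN_ASCII = re.compile(r"^[A-Za-z0-9._:-]+$")


def to_attribute_token(language_name):
    """Turn a language name into a space-free token by joining words with hyphens."""
    token = "-".join(language_name.split())
    if not NMTOKEN_ASCII.match(token):
        raise ValueError(f"not a valid ASCII name token: {token!r}")
    return token


print(to_attribute_token("Old Church Slavonic"))
print(to_attribute_token("Russian"))
```

Because the comparison of attribute values is case-sensitive, "Greek" and "greek" would count as two different tokens, so one spelling has to be chosen and used everywhere.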
Cyrillic Extended-B Unicode
I made no change in the display program to add a font representing the characters requiring this portion of the Unicode code points. Nonetheless, when I view the 'kfte' example where such a character occurs, I see no problem. At the moment, I don't know whether this is a fortuitous accident with my browser/OS/font setup.
status of russion01.html
This display is in an in-between state now.
|
It works, that's enough. ISO was not aware of DTD.
Dance, dance 💃 |
I also see it ok in the browser in my comp. But in the mail-program ꙗ and ꙑ are changing to squares. |
So seeing is NOT believing in this case! I didn't realize Russian and Greek have 'homoglyphs'. Will make the change to Greek Unicode. |
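Since homoglyphs cannot be spotted by eye, a check along these lines could find letters from the wrong script inside text tagged as Greek. It is only a sketch: it takes the first word of each Unicode character name as the script, and the sample word is a made-up illustration, not one of the PWG cases.

```python
import unicodedata


def script_of(ch):
    """First word of the Unicode character name, e.g. 'GREEK', 'CYRILLIC', 'LATIN'."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def foreign_letters(text, expected_script):
    """List (index, char, script) for letters whose script differs from the expected one."""
    return [
        (i, ch, script_of(ch))
        for i, ch in enumerate(text)
        if ch.isalpha() and script_of(ch) != expected_script
    ]


# Hypothetical sample: a Greek word in which one omicron was typed
# as the look-alike Cyrillic letter U+043E.
sample = "\u03bb\u043e\u03b3\u03bf\u03c2"
print(foreign_letters(sample, "GREEK"))
```

The same function, called with "CYRILLIC" as the expected script, would flag Greek or Latin look-alikes inside Russian and OCS words.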
Have added ls expansions for Buslaev and Minaev. NOTE: This may be one more ls with Russian that needs expansion, This appears to me to be TWO references: |
Changed display to use Old Standard for Russian and OCS within Does this work in that Win7 computer? |
"Антарабида" is a supposed Sanskrit (or not Sanskrit?) word written in Cyrillic letters (the same way as Sanskrit words written with Latin letters in English text). the source There was aslo German edition: |
I mentioned that comp just to say there can be a font problem. As I said, that comp is without any additional fonts. And without fonts you can not help. If I'll install some fonts perhaps it will work. But which font must I install? Perhaps in the site there should be some recommendations about fonts. |
Let's assume you are using one of the Cologne displays, say the basic display for PWG. In this scenario, you should not have to manually download any fonts to the Win7 computer. When required by the display, the browser takes care of downloading Old Standard Font from the Cologne server. This is a usage of what is called a 'web font'; this article describes the details of web programming. If you 'inspect' one of the russian or ocs words in chrome browser you can see this: Note1: I had to clear cached files in browser for this to work; Ctrl-F5 doesn't adequately clear things in russian02.html here since the displays are in Iframes. Note2: This network (or web) font may not be in a place used by other programs on the computer, such as the mail program. If you need to install the Old Standard Font into the Windows OS Fonts, I can give you a link. |
Actually, that's not our goal - mail programs. So web font works well enough for web. |
Ooops! I've tried the link with Russian list from this thread. Now tested it in Win7 + Chrome63 with basic PWG interface - yes, it works! )) Thanks for explanation and for the picture how to check.
The behavior of fonts and browsers is quite mysterious. |
The question is - does it really work well enough always and everywhere? |
The only remaining question is why only etymologies are in Old Standard, and not all of the non-originally-Devanagari text. Rendering of fonts in mail exchange software is out of our reach. |
Impossible to tell. Browser technologies change all the time. My informal aim in developing web apps is that they work with modern browsers. I'm often unsure whether a particular new feature (say of Javascript ES6) should be used. For instance, if such a feature works with Chrome but not Firefox, I won't use it. Of course there are so-called Javascript transpilers which convert ES6 to ES5, but I've avoided workflows using such 'build' steps. So, if it works on a venerable WINXP SP3, that's nice; but if it doesn't, that will have to remain an unfortunate case. |
This is a good question. In the current displays, there is some looseness in the specification of what CSS styles apply to what parts of the page. My implicit game plan is:
It is in this second step that clearing up that CSS looseness can be addressed. Currently about 50% of the dictionaries have been converted (see the tracker). It might be that work can begin now on that second step for those converted dictionaries, without |
Agree. Keep in mind Serge is on WINXP and I myself in the countryside am on WINXP SP3.
Agree, so it's time for me to get in the game. As you state it's |
@gasyoun |
I guess you should convert all the works to Unicode (and IAST) at the earliest and keep them in these GitHub repos, in addition to whatever encoding you prefer to continue working with. It makes collaboration easier for people like me. |
These two items have already been done for all dictionaries as far as I remember. If there are any aberrations, they should be treated as bugs and be corrected. |
Where can I access the Unicode (or IAST) files? I am seeing only slp1 (Jim's version) or HK (Thomas's version ?) files everywhere (rather mostly). |
If I had got the AP90 and PWG Unicode files, @funderburkjim would've happily/easily used all my corrections in them by now. |
Even the MW99 was converted to IAST only after my joining here and asking specifically for it, not before that. |
SLP1 for Sanskrit and IAST for everything else was the rule of thumb when we converted the various encodings. |
SLP1 does not even require Unicode code points beyond ASCII. It is accommodated within ASCII itself. So SLP1 is Unicode compliant. You may be intending to say IAST when you say Unicode. |
Sorry to be blunt, but the reason much of your good work is not being used immediately is that it changes the markup or structure irreversibly. As an observer of processes at Cologne for many years, I find that Jim (and over the years I too) adheres to the "invertibility principle" very religiously. Whenever a drastic change is made (like changing the encoding), Jim writes a converter to and from. Only if the output of the to-and-fro function yields the original file are the drastic changes made. I see three ways in which your marvellous and fast work may be incorporated in Cologne.
I think the third way will suit you more. |
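The "invertibility principle" described above can be illustrated with a round-trip check between SLP1 and IAST. This is a minimal sketch with a hand-written mapping table and hypothetical function names; it is not the converter used at Cologne, and the reverse direction uses a simple greedy longest-match that is an assumption of this example.

```python
import unicodedata


def nfc(s):
    """Normalize to NFC so accented letters compare reliably."""
    return unicodedata.normalize("NFC", s)


# SLP1 -> IAST for the common Sanskrit letters (Vedic extras omitted).
SLP1_TO_IAST = {k: nfc(v) for k, v in {
    "a": "a", "A": "ā", "i": "i", "I": "ī", "u": "u", "U": "ū",
    "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "e": "e", "E": "ai", "o": "o", "O": "au", "M": "ṃ", "H": "ḥ",
    "k": "k", "K": "kh", "g": "g", "G": "gh", "N": "ṅ",
    "c": "c", "C": "ch", "j": "j", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "t": "t", "T": "th", "d": "d", "D": "dh", "n": "n",
    "p": "p", "P": "ph", "b": "b", "B": "bh", "m": "m",
    "y": "y", "r": "r", "l": "l", "v": "v",
    "S": "ś", "z": "ṣ", "s": "s", "h": "h",
}.items()}
IAST_TO_SLP1 = {v: k for k, v in SLP1_TO_IAST.items()}


def slp1_to_iast(text):
    """Convert letter by letter; unknown characters pass through unchanged."""
    return "".join(SLP1_TO_IAST.get(ch, ch) for ch in text)


def iast_to_slp1(text):
    """Greedy longest-match conversion (two-character units such as 'gh' first)."""
    text = nfc(text)
    out, i = [], 0
    while i < len(text):
        for width in (2, 1):
            chunk = text[i:i + width]
            if chunk in IAST_TO_SLP1:
                out.append(IAST_TO_SLP1[chunk])
                i += len(chunk)
                break
        else:
            out.append(text[i])  # not in the table: keep as-is
            i += 1
    return "".join(out)


def round_trips(slp1_text):
    """True if converting to IAST and back reproduces the input exactly."""
    return iast_to_slp1(slp1_to_iast(slp1_text)) == slp1_text


for word in ["rAmAyaRa", "dIrGa", "prauga"]:
    print(word, slp1_to_iast(word), round_trips(word))
```

The last word is there on purpose: with this naive inverse, the hiatus in "prauga" is read back as the diphthong "au", so the round trip fails. That is the kind of loss a to-and-fro check is meant to catch before a drastic change is committed.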
Did not expect this from you, Dr. Dhaval!! Let me give an example, rAmAyaRa : rāmāyaṇa : रामायण By Unicode, I mean the native language lettering (if Skt- Devanagari etc.) for all the languages involved. |
The reason for this is very simple- I did all those portions in our format/style, which includes the conversion from one encoding to another; we don't do things in parts/batches, but as a whole set of processes involved altogether at one go. If I had got the (converted) files from CDSL itself, they would have been straight away useful for further work, as was the case in MW99 work. |
You need to specify your requirements fully. I will work and provide you file in that format. |
There are not many requirements from my side; a fairly simple, single conversion (for all the works at once, not one by one against a specific request) is all that I asked for. @funderburkjim had agreed in principle and gave a few binding points for my further work, but then nothing happened afterwards. BTW, he was still talking about only one work (PWG), not all the CDSL works!! |
Most of the Russian script appearing in PWG has been provided.
Here are a few more that need filling in
Case 1. hw = Ahanas
Case 2. hw = dIrGa
Case 3. hw = dIrGa
Case 4. hw = dIrGa
Case 5. hw = kAkud
Case 6. hw = dIrGa
all the dIrGa's are on page 3-0654