VN missing pages, continued #76

Open
funderburkjim opened this issue Sep 18, 2024 · 39 comments

Comments

@funderburkjim
Contributor

Continue the discussion of VN (additions and improvements) for PWG, that was begun in #39.

funderburkjim added a commit that referenced this issue Sep 18, 2024
@funderburkjim
Contributor Author

https://sanskrit-lexicon.uni-koeln.de/pwgindex.html alleges to be a scanned edition of PWG,
created long ago from material from @maltenth.
readme_cdsl_vn.txt is a sort of index to the VN portions from the various volumes.

Compare the volume 1 VN material from this cdsl source to
the volume 1 VN material supplied by @Andhrabharati in #37, #39 : AB vol 1 vn pdf.

They are very different.

How to account for this difference? If we want to improve the VN coding in cdsl pwg.txt, which of the two sources should be used?

Although AB does not divulge the exact source of his pdfs, perhaps he could retrieve some information from the title pages that would explain the difference.

@Andhrabharati

Andhrabharati commented Sep 18, 2024

@maltenth has to respond about the scans at CDSL!

And it is my lookout to reach the best possible original sources (either scans or physical books) that enhance my collection, as a continuous process.
[I find that many works cited in PWG could be traced at the Bavarian library (having excellent quality). Of course, there are quite many other sources as well.]

So far as I am concerned, the text in the pwgheader (of older date) is exactly what is in the print volumes; and I just had proofed the same (and at times split some matter into separate lines) and posted earlier.

Jim could probably start with converting the PWG-VN data in Thomas's original format to current CDSL format.

@gasyoun
Member

gasyoun commented Sep 18, 2024

https://sanskrit-lexicon.uni-koeln.de/pwgindex.html alleges to be a scanned edition of PWG,
created long ago from material from @maltenth.

This was 2002 or 2004, I received them on a CD from Germany in Moscow.

@Andhrabharati

This post indicates some of my PWG "sources".

readme_cdsl_vn.txt is a sort of index to the VN portions from the various volumes.
... ...
vol. 1 (no VN matter)

If the Vol.1 does not contain any VN matter, how would Jim (and/or Thomas) explain the presence of the typed matter from those pages in the pwgheader file (that was received from Thomas)?

@funderburkjim
Contributor Author

@maltenth has to respond about the scans at CDSL!

Unfortunately, communication by me with Thomas has become unpredictable this year.

@funderburkjim
Contributor Author

funderburkjim commented Sep 18, 2024

If the Vol.1 does not contain any VN matter ...

Just now, I compared
a. the volume 2 material from pwgindex to
b. pwgheader/PWG.V.2.VN.pages.pdf

They are identical. From this I infer that the material in pwgheader/PWG.V.1.VN.pages.pdf is simply absent from the pwgindex images. Why absent -- no way to know.
This allays my concern regarding possible version difference.

I have revised the pwgindex program to include

  • pages from pwgheader/PWG.V.1.VN.pages.pdf.
  • page from pwgheader/PWG.V.6.VN.pages.pdf.

@funderburkjim
Contributor Author

Based on recent review (see revised readme_cdsl_vn.txt),
the first task is to provide 'entries' for the 'missing' VN material.
The source for this missing material has two parts:

  • PWG.VN.text.Vol.s.1-6.txt typed by AB . Jim's task is to convert [v.pppp] lines to the metaline-body-lend format of pwg.txt entries. Let's refer to this file by the shorter name VNTXT.
  • The AV reference improvements on page 3 of PWG.V.1.VN.pages.pdf. AB's task is to type this in some straightforward way. Then Jim's task will be to convert AB's file to the m-b-l format of pwg.txt entries.
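The conversion named in the first bullet can be sketched roughly as follows. This is only an illustration, assuming the usual pwg.txt layout of a metaline (`<L>…<pc>…<k1>…<k2>…`), one or more body lines, and a closing `<LEND>`. The function name `make_entry` and the sample L-number and page value are hypothetical and are not taken from the actual conversion program.

```python
def make_entry(lnum, pc, k1, k2, body_lines):
    """Build one entry in metaline-body-lend form (assumed layout)."""
    metaline = f"<L>{lnum}<pc>{pc}<k1>{k1}<k2>{k2}"
    # metaline first, then the body lines, then the closing <LEND> line
    return "\n".join([metaline, *body_lines, "<LEND>"])


# Hypothetical values, for illustration only.
entry = make_entry(
    lnum="900001",
    pc="VN1-001",
    k1="akna",
    k2="akna",
    body_lines=["{#akna#}¦ [1.0012] streiche das Beispiel u. ..."],
)
print(entry)
```

The real L-numbers and page codes would have to follow whatever convention pwg.txt already uses for its VN entries.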

The VNTXT file has 599 lines 'to convert'. 131 of these are, following the printed text, without headwords. The first examples:

[1.0012]	 ¦ streiche das Beispiel u. अक्न ...    <<< headword is 'akna'
[1.0014] 	 ¦ Z. 31 streiche <ls>ṚV.</ls> <8,46,26>.   <<< headword is 'akza'

AB: Have you already determined these missing headwords?

@Andhrabharati

AB: Have you already determined these missing headwords?

No, Jim; you may refer to my related post.
But, it is a fairly simple task if decided to be done!

@Andhrabharati

If I am to do it, I might wish to re-look at the whole content for a possible 'revision'!!

@Andhrabharati

Andhrabharati commented Sep 25, 2024

  • The AV reference improvements on page 3 of PWG.V.1.VN.pages.pdf. AB's task is to type this in some straightforward way.

It may be noted that the form <X> = <Y> is to be considered something like lies <X> st. <Y> at these AV citation changes. As such, I don't suggest changing this format.

@Andhrabharati

@funderburkjim

I think I have now properly changed these PWGVN lines, to the format as in the pwkvn pages.
PWGVN_1-6_reformatted_(dng).txt

There are a couple of places (the lines having ...do..., ??? and ;;) that you may need to look at first.
-----------------------------
PS. I feel the VN lines of PWG-5 (lines 553-573 in my file) could be discarded, as the page is not to be seen in the original Bavarian Library and the re-printed Japanese (MLBD) ed. copies.

@funderburkjim
Contributor Author

funderburkjim commented Sep 25, 2024

  • Looks like this reformatted file is the one I should work with.
    • In particular, it has filled in what I called the 'missing' headwords
  • Why the 21 do... items in headword field ?

@Andhrabharati

Andhrabharati commented Sep 25, 2024

...do... denotes that the VN line belongs to the same HW as above!
Or in other words, those HWs contain two or more corrections.
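A mechanical way to expand the ...do... placeholders might look like the sketch below. It assumes each VN line has the form `headword ¦ text` and simply carries the previous headword forward; the function name `resolve_ditto` is hypothetical. Because the carried-over headword can still be wrong in detail (homonym numbers, accents), the output would need a manual check.

```python
def resolve_ditto(lines, ditto="...do..."):
    """Replace a ditto headword with the headword of the preceding VN line."""
    resolved = []
    last_hw = None
    for line in lines:
        hw, sep, rest = line.partition("¦")
        if not sep:
            # not a VN line (no broken bar); keep unchanged
            resolved.append(line)
            continue
        hw = hw.strip()
        if hw == ditto:
            if last_hw is None:
                raise ValueError("ditto line without a preceding headword")
            hw = last_hw
        else:
            last_hw = hw
        resolved.append(f"{hw} ¦{rest}")
    return resolved


sample = [
    "{#akz#} ¦ [1.0013] first correction",
    "...do... ¦ [1.0014] second correction",
]
for line in resolve_ditto(sample):
    print(line)
```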

@Andhrabharati

Andhrabharati commented Sep 25, 2024

And you may note that the [v.pppp] after the broken bar denotes the actual correction location, not the pc-field for the metaline (which should be built with the previous [Page:VNv-ppp]).
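The distinction between the two page values can be illustrated with a small parser. This is a sketch under assumptions: the page marker is taken to look like `[Page:VN1-001]` on its own line, the correction location like `[1.0014]` after the broken bar, and the output field names (`pc`, `loc`) and their formats are hypothetical.

```python
import re

PAGE_RE = re.compile(r"^\[Page:VN(\d+)-(\d+)\]")   # assumed marker shape
LOC_RE = re.compile(r"\[(\d+)\.(\d+)\]")           # [v.pppp] correction location


def attach_pages(lines):
    """Pair each VN line with the most recent [Page:...] marker (the pc source)
    and with the [v.pppp] location found after the broken bar."""
    current_page = None
    records = []
    for line in lines:
        m = PAGE_RE.match(line)
        if m:
            current_page = f"VN{m.group(1)}-{m.group(2)}"
            continue
        if "¦" not in line:
            continue
        hw, _, body = line.partition("¦")
        loc = LOC_RE.search(body)
        records.append({
            "hw": hw.strip(),
            "pc": current_page,  # for the metaline
            "loc": f"{loc.group(1)}-{loc.group(2)}" if loc else None,  # where the correction applies
            "body": body.strip(),
        })
    return records


sample = ["[Page:VN1-001]", "{#akza#} ¦ [1.0014] Z. 31 streiche ..."]
print(attach_pages(sample))
```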

@funderburkjim
Contributor Author

Note: ...do... ¦ [1.0014] Z. 31 streiche <ls>ṚV. 8,46,26.</ls> actually refers to अक्ष, not to अक्ष् (the HW above). This is the only one I've checked.

@Andhrabharati

yes; in fact it should be referring to <hom>2.</hom> अक्ष.

Probably these should be checked again all over for the homonyms and accent marks (which I had missed at some places), after you prepare the file.

@Andhrabharati

PS. I feel the VN lines of PWG-5 (lines 553-573 in my file) could be discarded, as the page is not to be seen in the original Bavarian Library and the re-printed Japanese (MLBD) ed. copies.

The "actual" reason I had in mind is not about the VN part in the Cologne-scan on Sp.1677-8 (which is present in PWG7 as well), but that many entries in the Bavarian copy do not "appear" anywhere else, incl. the CDSL text.

Bavarian Library copy scan page--
image

CDSL scan page--
image

@funderburkjim
Contributor Author

funderburkjim commented Sep 26, 2024

transcoding

The transcoding to slp1 (from vntext_0_deva.txt to vntxt_0.txt) required a few edits of vntext_0_deva. See change_vntxt_0_deva.txt.

@funderburkjim
Contributor Author

vntxt_1.txt

Correct pwg-devanagari accents that were missed in vntxt_0.txt.

@funderburkjim
Contributor Author

lines 553-573 of AB file

From an examination of these 21 headwords with current PWG display:

  • only one headword does not have an entry in vol 7; the exception is AlokagadADarI, which appears
    as headword AlokagAdADarI in vol. 7
  • The content in lines 553-573 for a given headword is generally similar (but not identical) to the corresponding
    content in vol 7; however the content under pAriBAzika seems different.

I see no problem (and some minor benefit) in KEEPING lines 553-573, since this material corresponds to the
scan Thomas made for cdsl.

It is mysterious that the Bavarian edition (per scan above)

  • doesn't have the material at the bottom of the corresponding page of cdsl scan
  • Is different in the top half also. e.g. There is a legitimate correction to moGa in Bavarian
    edition, which I don't find in the cdsl scan.

BTW: it is good that you have not only filled in headwords, but also added page-references for corrections.

@gasyoun
Member

gasyoun commented Sep 27, 2024

BTW: it is good that you have not only filled in headwords, but also added page-references for corrections.

long live @Andhrabharati

@Andhrabharati

[from Jim's file: pwgissues/issue76/readme.txt]

# transcode
cd /c/xampp/htdocs/sanskrit-lexicon/PWG/pwgissues/issue76/transcode
mkdir pwgtranscoder1
cp /c/xampp/htdocs/sanskrit-lexicon/MWS/mwtranscode/transcoder1/deva_slp1.xml pwgtranscoder1/deva_slp1.xml
cp /c/xampp/htdocs/sanskrit-lexicon/MWS/mwtranscode/transcoder1/slp1_deva.xml pwgtranscoder1/slp1_deva.xml

cp /c/xampp/htdocs/sanskrit-lexicon/MWS/mwtranscode/transcoder.py .
cp /c/xampp/htdocs/sanskrit-lexicon/MWS/mwtranscode/mw_transcode.py pwg_transcode.py

# heavily edit pwg_transcode.py

It is quite surprising to see that Jim has copied MW's transcoder files to "handle" the PWG transcoding, and had to "heavily edit" the same for the purpose!!

Probably (a) MW is fully overshadowing Jim's thoughts, or (b) Jim is also now entering into "dotage" as Thomas, who himself said thus in response to one of my points earlier.

Jim has a separate "transcoder file-set" for the PWG family from the very initial days (which he had updated for the devanagari accent, upon some prolonged debating with me); and the same should've been used here.

Otherwise, it leads to unnecessary contamination of MW-style and PWG-style of accents, as can be seen from the below snippets from the PWG print and Jim's current revision--

image

[from Jim's file: change_vntxt_0_deva.txt]

image

@Andhrabharati

Andhrabharati commented Sep 27, 2024

[from AB's file: PWGVN_1-6_reformatted_(dng).txt]

{#रााण꣫#} ¦ [6.0317] (auf Bogen 21*) Z. 1; in {#राणि#} und {#पैलादि#} ist der Haken über dem {#ि#} abgebrochen.
;; Jim, this is a case of non-invertibility of Devanagari-slp1-devanagari!!

The transcoding to slp1 (from vntext_0_deva.txt to vntxt_0.txt) required a few edits of vntext_0_deva.

[from Jim's file: change_vntxt_0_deva.txt]

old:
{#रााण꣫#} ¦ [6.0317] (auf Bogen 21*) ...
new: PWG style udAtta -> MW style udAtta, also hiatus
; the cdsl spelling headword in rARa/ = राण॑
{#राण॑#} ¦ [6.0317] (auf Bogen 21*)
---
old:
{#राण॑#} ¦ [6.0317] (auf Bogen 21*) Z. 1; in {#राणि#} und {#पैलादि#} ist der Haken über dem {#ि#} abgebrochen.
new: Replace DEVANAGARI VOWEL SIGN I with DEVANAGARI LETTER I
{#राण॑#} ¦ [6.0317] (auf Bogen 21*) Z. 1; in {#राणि#} und {#पैलादि#} ist der Haken über dem {#इ#} abgebrochen.
; Jim doesn't know how to represent in slp1 the 'naked' vowel sign.
; the hook above the {#ि#} is broken
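The two characters involved in this change are distinct Unicode code points, which is easy to confirm. The sketch below also flags "naked" vowel signs, i.e. dependent vowel signs that do not follow a consonant. The helper names are hypothetical and the consonant test is a simplification (ranges U+0915 to U+0939 and U+0958 to U+095F, with a nukta allowed in between).

```python
import unicodedata

print(unicodedata.name("\u093f"))  # DEVANAGARI VOWEL SIGN I
print(unicodedata.name("\u0907"))  # DEVANAGARI LETTER I


def is_vowel_sign(ch):
    return unicodedata.name(ch, "").startswith("DEVANAGARI VOWEL SIGN")


def is_consonant(ch):
    return 0x0915 <= ord(ch) <= 0x0939 or 0x0958 <= ord(ch) <= 0x095F


def naked_vowel_signs(text):
    """Return (index, char) for each vowel sign that does not follow a consonant."""
    hits = []
    prev = ""
    for i, ch in enumerate(text):
        if is_vowel_sign(ch) and not (prev and is_consonant(prev)):
            hits.append((i, ch))
        if ch != "\u093c":  # a nukta keeps the preceding consonant as the base
            prev = ch
    return hits


print(naked_vowel_signs("{#\u093f#}"))                # the bare sign from the VN line
print(naked_vowel_signs("\u0930\u093e\u0923\u093f"))  # राणि: no naked signs
```

Run over the Devanagari source file, a check like this would list the places where a deliberate naked sign occurs, so they can be reviewed rather than silently altered.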

Incidentally, I had discussed this very item with @drdhaval2785 in private mail exactly 3 years back!

Here is my initial mail to Dhaval--
image

followed by further responses--
image

@Andhrabharati

Andhrabharati commented Sep 27, 2024

; Jim doesn't know how to represent in slp1 the 'naked' vowel sign.

It's because Jim is following the slp1 from Peter Scharf, who had duly made a note of this particular point in "his study/survey" (before coming up with slp1)--

image

but for some reason, did not even "try" to propose any solution!

So it is not just slp1 alone that doesn't handle this, but also (any and) every existing Roman transliteration scheme!

If Jim is "willing" to "update" the CDSL transcoding rules (as he had done in quite many cases till now), I shall post my proposal to handle the same (with which the invertibility condition also gets satisfied).

Probably, Jim might wish to get Peter Scharf's opinion also about that proposal (before taking any action on it).

@Andhrabharati

I see no problem (and some minor benefit) in KEEPING lines 553-573, since this material corresponds to the scan Thomas made for cdsl.

It is mysterious that the Bavarian edition (per scan above)

* doesn't have the material at the bottom of the corresponding page of cdsl scan

* Is different in the top half also. e.g. There is a legitimate correction to moGa in Bavarian
  edition, which I don't find in the cdsl scan.

In fact, I would consider it exactly the opposite: the CDSL scan is THE mystery case!

As I had already indicated earlier, both the Bavarian Library scan (1868) and the Japanese reprint (1976) tally exactly with each other, so does any physical book that I had seen in various Indian libraries (or in market now for sale).

Now I have found a scan copy digitised by Google [from the Sapienza University of Rome (Biblioteca di Studi Orientali)] in August 2013, which has both the "proper ending page" of Bavarian copy followed by the "extraneous page" of the CDSL scan (after a blank page).

This is somewhat similar to what we had seen earlier in one of the MW99 scans having two of MD errata pages, about which some discussion took place; it was finally concluded that it was an error in binding that particular copy, and those two pages were NOT brought into the MW annexure data.

It is surprising that the CDSL scan copy has the "original" ending page (as in all the three above scan copies) MISSING and is left only with the dubious "extraneous" page.

@funderburkjim
Contributor Author

MW is fully overshadowing Jim's thoughts

No. The reason I used the mw transcoders was that I had available
the inverse transcoder deva_slp1.xml but did not have deva1_slp1.xml.

I'm constructing deva1_slp1.xml now.

funderburkjim added a commit that referenced this issue Sep 28, 2024
@funderburkjim
Contributor Author

vntxt_1_rev.txt

The inverse transcoder file deva1_slp1.xml now created. I should have done that in the first place.
This was used to generate the slp1 version of AB's file: vntxt_1_rev.txt.

Jim thinks that vntxt_1_rev.txt is ready for further use.

"update" the CDSL transcoding rules

I'm curious what such an update would look like. Let's see the proposed transcoder file.

or Jim is also now entering into "dotage" :
image

@Andhrabharati

Andhrabharati commented Sep 28, 2024

First things first!

Against Jim's two posts 1 and 2 just above this, I would like to re-iterate from AB's post:

Jim has a separate "transcoder file-set" for the PWG family from the very initial days (which he had updated for the devanagari accent, upon some prolonged debating with me); and the same should've been used here.

Here are the transcoders I have with me (as recd from Jim)--

[MW-version, which has no "deva1_slp1.xml" indeed]
image

[pw-version, which DOES have the "deva1_slp1.xml"]
image

And he had clearly said those days that the deva1 <> slp1 files were specifically made for the pw-family!! He had also indicated how to check the invertibility using the "to & fro transcoders" one after the other.
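An invertibility check of that kind, applying the two transcoders one after the other and comparing with the input, can be sketched as below. The two toy functions are stand-ins covering only the characters at issue; they are not the CDSL transcoder.py or the deva1_slp1.xml / slp1_deva1.xml tables, and all names here are hypothetical.

```python
def roundtrip_failures(lines, forward, backward):
    """Return (line_number, original, round_tripped) for every line that
    does not survive forward followed by backward transcoding."""
    failures = []
    for n, line in enumerate(lines, start=1):
        back = backward(forward(line))
        if back != line:
            failures.append((n, line, back))
    return failures


# Toy stand-ins: both the letter I and the vowel sign I collapse to "i".
def toy_deva_to_roman(text):
    table = {"\u0907": "i", "\u093f": "i"}
    return "".join(table.get(ch, ch) for ch in text)


def toy_roman_to_deva(text):
    return text.replace("i", "\u0907")


lines = ["{#\u0907#}", "{#\u093f#}"]
for n, orig, back in roundtrip_failures(lines, toy_deva_to_roman, toy_roman_to_deva):
    print(n, repr(orig), "->", repr(back))
```

With these toy tables only the second line fails, because the bare vowel sign comes back as the independent letter.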

I can as well show (point) him where he has posted these transcoders (for me) earlier, if he is still not convinced that these were already existing before!
------------------
PS. Sorry Jim, I didn't use the "dotage" term in any derogatory sense; it was just indicating the state-of-the-mind (forgetfulness) sometimes seen in younger guys as well.

@Andhrabharati

Andhrabharati commented Sep 28, 2024

As I had mentioned in my mail to Dhaval (in the above post), a need to transcode the vowel-marker (mAtrA) characters arises not only in case of grammar books, as in

[Macdonell]
image

or [Monier Williams]
image

or in reference works, as in [Monier Williams dictionary]
image

[Unicode Chart: Devanagari]
image

or in posters, as at Marcis's post (a comment in #37)
image

image

Of course, for most of such works that go to actual publishing, other 'professional means' would be resorted to (and not these Roman transcoding schemes) for the intended text matter appropriately!!

[... post continues further below ...]

@gasyoun
Member

gasyoun commented Sep 28, 2024

@Andhrabharati I can hardly imagine a case other than a textbook for having the need to separate the vowel representation.

@Andhrabharati

but esp. in the cases of "truthfully" showing/indicating [in plain text format] the mistakes or wrong readings (or prints), as at--

[PWG6-0317] ;; which became 6-0333 after correction
image
image

[PWGVN 6-001]
image
image

[PWG3-0271]
image
image

[PWGVN3-001]
image
image

Here these Devanagari strings are deliberately typed thus in the text matter, and are NOT at all typos as Jim has commented and "changed" them to the 'corrected' forms--

The transcoding to slp1 (from vntext_0_deva.txt to vntxt_0.txt) required a few edits of vntext_0_deva.
image

and
image

@Andhrabharati

Andhrabharati commented Sep 28, 2024

Now is the time for my proposal to transcode these--

I would like to propose using the ¬ ["Not sign"] character (alt+0172; u+00ac) for denoting the following 'vowel-mātrā' character as a 'Not-vowel' character!

The Unicode std. prescribes ◌ ["Dotted circle"] (u+25cc) character to be used as a place-holder, and showed it in positioning the diacritic-marks (which I am now extending to positioning the vowel-markers as well).

Namely, the proposal goes like this--
image

Note: Devanagari transcoding would not be with the dotted circle (the uniscribe engine would take care of rendering the appropriate script character), but the Roman transcoding should be having dotted circle prior to the resp. Roman letter.
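Since the proposal table itself is an image, the following is only one possible reading of it, reduced to a toy: a bare vowel sign is written in the Roman text as `¬` followed by the vowel letter, so that the two Devanagari characters no longer collapse to the same Roman string. The function names are hypothetical, the tables cover only the letter I and the vowel sign I, every U+093F is treated as "naked", and the dotted-circle part of the proposal is not modelled.

```python
NOT_SIGN = "\u00ac"  # ¬


def toy_deva_to_roman(text):
    """Toy forward table: letter I -> 'i', bare vowel sign I -> '¬i'."""
    table = {"\u0907": "i", "\u093f": NOT_SIGN + "i"}
    return "".join(table.get(ch, ch) for ch in text)


def toy_roman_to_deva(text):
    """Toy inverse: handle the marked form first, then the plain vowel."""
    return text.replace(NOT_SIGN + "i", "\u093f").replace("i", "\u0907")


for s in ["{#\u0907#}", "{#\u093f#}"]:
    roman = toy_deva_to_roman(s)
    back = toy_roman_to_deva(roman)
    print(repr(s), "->", roman, "->", repr(back), back == s)
```

In this toy, both strings survive the round trip, which is the invertibility property the proposal is after.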

@Andhrabharati

With this notation, we would get the round-trip strings properly--

image

funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Sep 28, 2024
funderburkjim added a commit to sanskrit-lexicon/csl-apidev that referenced this issue Sep 28, 2024
funderburkjim added a commit that referenced this issue Sep 28, 2024
…ior version.

The current transcodepwg/pwgtranscoder2/deva1_slp1.xml is slightly better. Use it
@funderburkjim
Contributor Author

deva1 comparison

@Andhrabharati After your comment, I was able to find a deva1_slp1.xml from 2023.
In conversion of your file to slp1, the current version shows one improvement.
So the preferred version is the current deva1_slp1.xml.
And this version is now also available in csl-websanlexicon and csl-apidev (which are the cdsl 'official' locations for the transcoders).

Some details on comparison to the 2023 version are in readme_deva1.txt.

@gasyoun
Member

gasyoun commented Sep 29, 2024

deliberately typed thus in the text matter, and are NOT at all typos

agree @Andhrabharati

funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Oct 1, 2024
funderburkjim added a commit to sanskrit-lexicon-scans/pwg that referenced this issue Oct 1, 2024
@funderburkjim
Contributor Author

question on ??? ¦ [1.0956] — [1.1016]

AB file comment states:

20 cases— 12121, 12122, 12145, 12196 (2), 12217, 12247, 12282, 12291, 12350, 12352, 
12369, 12448, 12457, 12470, 12513, 12561, 12593, 12602 and 12691

But there are only 19 L-numbers listed. Does the (2) have some significance that
yields 20 cases ?
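The count can be checked mechanically. The sketch below assumes, as one possible reading only, that "(2)" means the L-number accounts for two cases; under that assumption the list gives 19 distinct L-numbers and 20 cases. The function name `count_cases` is hypothetical.

```python
import re

raw = """12121, 12122, 12145, 12196 (2), 12217, 12247, 12282, 12291, 12350, 12352,
12369, 12448, 12457, 12470, 12513, 12561, 12593, 12602 and 12691"""


def count_cases(text):
    """Return (distinct L-numbers, total cases), reading 'N (k)' as k cases."""
    distinct = 0
    total = 0
    for item in re.split(r",|\band\b", text):
        m = re.match(r"\s*(\d+)(?:\s*\((\d+)\))?\s*$", item)
        if not m:
            continue
        distinct += 1
        total += int(m.group(2) or 1)
    return distinct, total


print(count_cases(raw))
```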

I plan to generate a VN entry for each of these 19 (or 20) .

@funderburkjim
Contributor Author

Also, I think there are two more at the beginning of the list

11847 {#upanayana#} 1-0956
    Here the reference <ls>ŚĀṄKH. GṚHY. 1, 5.</ls> has only two numbers.
    But it is on page 1-0956, so Author must have intended this to change also,
    otherwise he would not have put "Sp. 956–1016" in the VN.
12106 {#upaSaya/#}  1-0974 

@gasyoun
Member

gasyoun commented Oct 5, 2024

@funderburkjim I continue to upload my new scans of Sanskrit dictionaries; I do not know whether they are better than what you have: https://vk.com/samskrtamru?w=wall-88831040_22648

funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Oct 7, 2024
funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Oct 7, 2024
funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Oct 7, 2024
funderburkjim added a commit to sanskrit-lexicon/csl-apidev that referenced this issue Oct 7, 2024
funderburkjim added a commit that referenced this issue Oct 7, 2024
@funderburkjim
Contributor Author

vn missing installed

Work files are here.

vntxt_4.txt contains the new entries.

These entries have been inserted into pwg.txt, and csl-orig updated.
Various small adjustments made to the display programs (see commit links above).

I think the goals of this issue have been satisfied.
Request @Andhrabharati to review.

Next step for me: changes to pwg.txt that were noticed during this missing VN work. Will detail these proposed changes after AB review of this vn work .
