Standardization of pwg links for 'M.' (Manu) #74

Open
funderburkjim opened this issue Sep 9, 2024 · 28 comments
@funderburkjim
Contributor

funderburkjim commented Sep 9, 2024

The 'normal' markup in the PWG dictionary for a reference to Manusmfti has the form
<ls>M. a, s.</ls> , where a is the adhyAya number and s is the shloka number (both are digit sequences).
In a display for pwg, this form generates a link to the cdsl link target for Manusmfti, as discussed in #73.
Other activated link forms are
<ls n="M.">a, s.</ls> and <ls n="M. a,">s.</ls>

But there are many 'implied' forms whose markup can be changed to a sequence of these normal forms.
For instance, <ls>M. 12, 33. 1, 17. 8, 50</ls> generates a link for 12,33 but not for 1,17 or 8,50.

This example can be recoded in pwg.txt as
<ls>M. 12, 33.</ls> <ls n="M.">1, 17.</ls> <ls n="M.">8, 50</ls>

In this recoding, 1,17 and 8,50 are active links.
This issue describes work that recodes most of these implied forms.
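
The recoding described above can be sketched in Python. This is only an illustration of the idea, not the script used for the actual work: the function name `recode_m` and the regular expressions are assumptions, and it handles only the simplest implied form (a plain `<ls>M. ...</ls>` containing a run of "adhyAya, shloka" pairs). Anything else is left unchanged.

```python
import re

# A plain <ls> element whose text is "M. " followed by one or more
# "adhyAya, shloka" pairs, each optionally ending in a period.
LS_M = re.compile(r"<ls>M\. ((?:\d+, ?\d+\.? ?)+)</ls>")
PAIR = re.compile(r"(\d+), ?(\d+)(\.?)")


def recode_m(text: str) -> str:
    """Rewrite implied multi-pair M. references as a sequence of normal forms."""

    def expand(match: re.Match) -> str:
        out = []
        for i, (adhyaya, shloka, dot) in enumerate(PAIR.findall(match.group(1))):
            body = f"{adhyaya}, {shloka}{dot}"
            if i == 0:
                # The first pair keeps the explicit "M." prefix.
                out.append(f"<ls>M. {body}</ls>")
            else:
                # Later pairs carry the abbreviation in the n attribute.
                out.append(f'<ls n="M.">{body}</ls>')
        return " ".join(out)

    return LS_M.sub(expand, text)


print(recode_m("<ls>M. 12, 33. 1, 17. 8, 50</ls>"))
```

An element that does not fit the pattern exactly (for example one with a lone number between pairs) is not matched and so stays as it was, which leaves it for manual review.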

@gasyoun
Member

gasyoun commented Sep 10, 2024

M. 12, 33. 1, 17. 8, 50 generates a link for 12,33 but not for 1,17 or 8,50.

Love you Jim. Long live Jim.

funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Sep 11, 2024
funderburkjim added a commit that referenced this issue Sep 11, 2024
@funderburkjim
Contributor Author

finished

work directory.

We started with 13979 M. links.
Now we have 22557 links.

See lexextract_all.txt for a summary of counts of all ls references in pwg.

Compare 2022 lsextract,
done exactly 2 years ago.

@funderburkjim
Contributor Author

small oddity

<L>1289<pc>1-0095<k1>atiTin
 has M. 12,917   but there is no such shloka in our 'manu'!

I did no systematic check of the 'validity' of (adhyAya,shloka) references, though this would be possible and might turn up other non-existing links.
This could be done by comparing linksort2_3.txt with Manu.Deslongchamps.index.txt.
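
A minimal sketch of such a validity check follows. The layouts of linksort2_3.txt and Manu.Deslongchamps.index.txt are not shown in this thread, so the sketch assumes both have already been reduced to simple Python data: the references as (adhyAya, shloka) pairs and the index as a mapping from adhyAya to its highest shloka number. The function name `find_invalid` and the toy numbers are hypothetical.

```python
def find_invalid(refs, max_shloka):
    """Return the (adhyaya, shloka) pairs that the index cannot contain.

    refs       -- iterable of (adhyaya, shloka) integer pairs taken from pwg
    max_shloka -- dict mapping an adhyaya number to its highest shloka number
    """
    invalid = []
    for adhyaya, shloka in refs:
        limit = max_shloka.get(adhyaya)
        # Unknown adhyaya, or shloka outside 1..limit, counts as invalid.
        if limit is None or not 1 <= shloka <= limit:
            invalid.append((adhyaya, shloka))
    return invalid


# Toy data: the limits are placeholders, NOT counts from the real index.
toy_index = {1: 100, 12: 200}
toy_refs = [(12, 33), (12, 917), (1, 17), (13, 5)]
print(find_invalid(toy_refs, toy_index))
```

If the real index has gaps in its numbering, a set of valid pairs would be a safer lookup than a per-adhyAya maximum.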

@funderburkjim
Contributor Author

Main work done. Closing issue.

@Andhrabharati

@funderburkjim

00004 UNKNOWN ls is unknown

you may like to correct these 4 instances as
<L>25591<pc>2-1032<k1>cira : <ls>AMAR 39.</ls> -> <ls>AMAR. 39.</ls>
<L>36198<pc>3-0879<k1>DaraRIDara : <ls>Lot. de la b. 1. 2.</ls> -> <ls>Lot. de la b. l. 2.</ls>
<L>96821<pc>6-1488<k1>vraj : <ls n="DHĀTUP 7,.">40</ls> -> <ls n="DHĀTUP. 7,">40</ls>
<L>114521<pc>7-1378<k1>sPUrj : <ls>RAJA-TAR. 4,471.</ls> -> <ls>RĀJA-TAR. 4,471.</ls>

funderburkjim added a commit that referenced this issue Sep 12, 2024
@funderburkjim funderburkjim reopened this Sep 12, 2024
@funderburkjim
Contributor Author

invalid adhyaya, shloka

Found 37 cases where the adhyaya, shloka in pwg is inconsistent with the index table.
See ls_M_invalid.txt.

@Andhrabharati Would you investigate these and resolve where possible?

@Andhrabharati

Andhrabharati commented Sep 13, 2024

Found 37 cases where the adhyaya, shloka in pwg is inconsistent with the index table.
See ls_M_invalid.txt.

@Andhrabharati Would you investigate these and resolve where possible?

Resolved 35 cases successfully; in the remaining 2 cases the <ls n="M."> is stripped down to just <ls>, making them numeric strings (increasing their count from 60 to 62).

This has been a tiring exercise!!

@Andhrabharati

Andhrabharati commented Sep 13, 2024

I have also identified a few composite ls-strings (pertaining to ṚV., AV. and R.) that are to be properly split into individual ls-strings (to link)--
ls-entries to split further.txt

@funderburkjim
Contributor Author

Resolved 35 cases successfully

Where are the resolutions located?

@Andhrabharati

I did not post them, assuming that you would have to work on my other two points (above) first!

@Andhrabharati

Anyways, here is the file--
resolving ls_M_invalid cases (AB).txt

@funderburkjim
Contributor Author

Thanks! - I've attended to the 4 unknowns above; will aim to implement the other two.

@funderburkjim
Contributor Author

@Andhrabharati A question for one of the 'invalid' cases:

<L>89631<pc>6-0904<k1>vARijaka
;; also 
<ls>MBH. 14, 4283</ls> -> 
<ls>MBH. 13, 4283</ls>

I could not find vARijaka in MBH. 13, 4283.
https://sanskrit-lexicon-scans.github.io/mbhcalc/?13.4283

@Andhrabharati

See this--
image

@Andhrabharati

Found one instance of the two left-over cases--

<L>26306<pc>3-0001<k1>ja
(AB) old res.: <ls n="M.">51, 5.</ls> -> <ls>51, 5.</ls> ;; ls with numeric string (yet to identify the actual ls entity)
(AB) new res.: <ls n="M.">51, 5.</ls> -> <ls n="VARĀH. BṚH. S.">52,5.</ls>

image

@funderburkjim
Contributor Author

funderburkjim commented Sep 14, 2024

<ls>MBH. 13, 4283

I misread the (blurry) verse numbers !

funderburkjim added a commit that referenced this issue Sep 14, 2024
@funderburkjim
Contributor Author

work on AB's split further file

About 50 cases.

See change_5.txt.
There are 7 identified as '?' -- Maybe @Andhrabharati can resolve.

@Andhrabharati

Before I post my findings on these 7 cases, I would like Jim to look at my other post #66 and act on it, which is a very minor issue.

By the way, I had to "blame" Jim at one or two places while resolving these 7 cases!!

@funderburkjim
Contributor Author

funderburkjim commented Sep 15, 2024

@Andhrabharati Please provide

  1. the 126 prosody instances mentioned above and in issues 66 and 29.
    • the 'old/new' format used in 66 would be best for me.
  2. replacements for two pages with bad scans in issue66:
    a. https://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/PWGScanpdf/pwg5-0907.pdf
    b. https://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/PWGScanpdf/pwg5-1165.pdf

@Andhrabharati

Andhrabharati commented Sep 16, 2024

@funderburkjim
I am glad that you have looked at the mentioned issue and are asking for data.

I shall post the data at those resp. issues first, and then post my findings in the above 7 cases here.

@Andhrabharati

Andhrabharati commented Sep 16, 2024

Just posted the relevant data at #66 and #29.

Now it is time to post my findings of the 7 cases here--

; <L>4336<pc>1-0320<k1>apriya<k2>a/priya

; oldls:<ls>AV. 8, 10, 3, 1. 6, 26.</ls>
; ? <ls>AV. 8, 10, 3, 1.</ls>
36273 old <ls>AV. 8, 10, 3, 1. 6, 26.</ls> <ls n="AV.">12, 1, 30.</ls>
;
36273 new <ls>AV. 8, 10, 3, 1.</ls> <ls n="AV. 8,">6, 26.</ls> <ls n="AV.">12, 1, 30.</ls>

AB: <ls>AV. 8, 10, 3, 1.</ls> -> <ls>AV. 8, 10, 18.</ls> ;; this is mentioned in PWG-1 VN pages (it also has another 100+ corrections in AV citations of Vol.1!).

On the whole, a great many corrections mentioned in Vols.1-4 and Vol.6 are missing from the current CDSL file; though I had posted the "fully proofed" text of these portions, Jim had interpreted that they are all included in the VN portion of Vol.7 with random checking [vide this issue], and did not add them to the CDSL file.

I find that the majority of these 1000+ entries are not present in Vol.7, and even if some got repeated in Vol.7, I wonder what made Jim not add them in PWG, when he has added a whole lot of 10k+ entries in pwk that are just index words of other volumes (& do not contain any 'objective' body as such), in spite of my pointing out the same. It only shows that he is NOT consistent in following his 'own rules' throughout the project!

; -----------------------------------------------------
; <L>7429<pc>1-0555<k1>asura<k2>a/sura

; oldls:<ls n="ṚV.">83, 6. 10, 124, 3.</ls>
; ? <ls n="ṚV.">83, 6.</ls>
64273 old <ls n="ṚV.">83, 6. 10, 124, 3.</ls> {#ma\haspu\trAso\ asu^rasya vI\rA di\vo Da\rtAra^ urvi\yA pari^ Kyan#}
;
64273 new <ls n="ṚV.">83, 6.</ls> <ls n="ṚV.">10, 124, 3.</ls> {#ma\haspu\trAso\ asu^rasya vI\rA di\vo Da\rtAra^ urvi\yA pari^ Kyan#}

AB: <ls n="ṚV.">83, 6.</ls> -> <ls n="ṚV. 5,">83, 6.</ls>;; also <ls n="ṚV. 5,">63, 7, 3.</ls> -> <ls n="ṚV. 5,">63, 7.</ls>

; <L>7429<pc>1-0555<k1>asura<k2>a/sura

; oldls:<ls n="ṚV.">10, 2. 6, 7, 2. 3, 56, 3.</ls>
; ? <ls n="ṚV.">10, 2.</ls>
64274 old <ls n="ṚV.">10, 2. 6, 7, 2. 3, 56, 3.</ls> {#yadI^ Gf\teBi\rAhuto\ vASI^ma\gnirBara^ta\ uccAva^ ca . asu^ra iva ni\rRija^m#}
;
64274 new <ls n="ṚV.">10, 2.</ls> <ls n="ṚV.">6, 7, 2.</ls> <ls n="ṚV.">3, 56, 3.</ls> {#yadI^ Gf\teBi\rAhuto\ vASI^ma\gnirBara^ta\ uccAva^ ca . asu^ra iva ni\rRija^m#}

AB: <ls n="ṚV.">10, 2.</ls> -> <ls n="ṚV. 10,">10, 2.</ls>

; -----------------------------------------------------
; <L>14692<pc>2-0051<k1>kan<k2>kan

; ? <ls n="ṚV. 1,">175, 5.</ls> no mention of kan
; oldls:<ls n="ṚV.">1, 51, 12. 33, 14. 175, 5.</ls>
135296 old <ls n="ṚV.">1, 51, 12. 33, 14. 175, 5.</ls> {#asta^M nanakze\ yasmi^M cA\kan#}
;
135296 new <ls n="ṚV.">1, 51, 12.</ls> <ls n="ṚV. 1,">33, 14.</ls> <ls n="ṚV. 1,">175, 5.</ls> {#asta^M nanakze\ yasmi^M cA\kan#}

AB: <ls n="ṚV. 1,">175, 5.</ls> -> <ls n="ṚV. 1,">174, 5.</ls>

; -----------------------------------------------------
; <L>22042<pc>2-0708<k1>garhaRa<k2>garhaRa

; oldls:<ls>R. 2, 25. 73. 3, 66</ls>
; ?
211474 old <ls>R. 2, 25. 73. 3, 66</ls> in den Unterschrr. Auch {#garhaRA#} <lex>f.</lex>
;
211474 new <ls>R. 2, 25.</ls> <ls n="R. 2,">73.</ls> <ls n="R.">3, 66</ls> in den Unterschrr. Auch {#garhaRA#} <lex>f.</lex>

AB: कैकेयीगर्हण is the name/topic of the resp. sargas (2,25), (2.73; Gorr. 2,75) and (3,66); how to denote the same here?

; -----------------------------------------------------
; <L>43539<pc>4-0579<k1>paryAya<k2>paryAya/

; oldls:<ls>AV. 8, 10. 9, 6. 11, 3. 12, 5. 15, 1</ls>
; ?
433859 old <ls>AV. 8, 10. 9, 6. 11, 3. 12, 5. 15, 1</ls> u.s.w. heissen
;
433859 new <ls>AV. 8, 10.</ls> <ls n="AV.">9, 6.</ls> <ls n="AV.">11, 3.</ls> <ls n="AV.">12, 5.</ls> <ls n="AV.">15, 1</ls> u.s.w. heissen

AB: Various sets of AV hymns (8,10,xx), (9,6,xx), (11,3,xx), (12,5,xx) and (15,1,xx) are known as पर्यायसूक्तs; how to denote the same here?

; -----------------------------------------------------
; <L>68757<pc>5-1212<k1>fkzarajas<k2>fkzarajas

; oldls:<ls n="R.">45. 50. 52.</ls>
; ?
657960 old <ls>R. 7, 37, 1, 1.</ls> <ls n="R.">45. 50. 52.</ls> {#°rajasa#}
;
657960 new <ls>R. 7, 37, 1, 1.</ls> <ls n="R.">45. 50. 52.</ls> {#°rajasa#}

AB: <ls>R. 7, 37, 1, 1.</ls> and <ls n="R. 7, 37, 1,">45. 50. 52.</ls> are all from the 1st प्रक्षिप्त sarga after 37th sarga of Vol.7; cf. my earlier post
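
The splits shown in the cases above follow a rule that can be sketched in Python: a part with three numbers is a complete reference, and a following part with only two numbers inherits the first number of the most recent complete reference. This is a sketch under that assumption only. The names `split_composite` and `to_markup` are hypothetical, and parts that cannot be completed this way (such as the leading `83, 6.` in the asura case) are merely flagged for manual review, as was done with the '?' marks.

```python
import re


def split_composite(abbrev: str, body: str):
    """Split a composite body into (n_attribute, text, resolved) parts.

    abbrev -- work abbreviation, e.g. "ṚV." (three-number references assumed)
    body   -- e.g. "1, 51, 12. 33, 14. 175, 5."
    Every emitted part is given a closing period.
    """
    parts = [p.strip() for p in body.split(".") if p.strip()]
    book = None  # first number of the most recent complete reference
    result = []
    for part in parts:
        numbers = re.findall(r"\d+", part)
        if len(numbers) == 3:
            book = numbers[0]
            result.append((abbrev, part + ".", True))
        elif len(numbers) == 2 and book is not None:
            result.append((f"{abbrev} {book},", part + ".", True))
        else:
            # Nothing to inherit from: keep as is and flag for review.
            result.append((abbrev, part + ".", False))
    return result


def to_markup(parts) -> str:
    """Render the parts as a sequence of <ls n="..."> elements."""
    return " ".join(f'<ls n="{n}">{text}</ls>' for n, text, _ in parts)


print(to_markup(split_composite("ṚV.", "1, 51, 12. 33, 14. 175, 5.")))
```

The `resolved` flag matters more than the markup: as the findings above show, a flagged part usually needs the printed text or the VN pages to settle, not a rule.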

funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Sep 17, 2024
funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Sep 17, 2024
funderburkjim added a commit to sanskrit-lexicon/csl-apidev that referenced this issue Sep 17, 2024
funderburkjim added a commit that referenced this issue Sep 17, 2024
@funderburkjim
Contributor Author

I think all the extra specific corrections suggested by AB have been included.
There remain several items that may be soluble:

  • link target for 2-number references to ṚV. and AV. I think a solution would be
    to assume that the verse number is 1, and to do this with a change to basicadjust.php
  • the प्रक्षिप्त references for Ramayana. The three extract-n.pdf files from Ramayana link markup in pwg #57 could provide a new link target for <ls>R. x, y, z, w.</ls>. Indexing needed.
  • VN missing pages -- This was begun in VN missing pages #39. I'll review that to see what might be done.
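
The first item can be illustrated with a short Python sketch of the rule itself. It is not code from basicadjust.php (the real change would be made there, in PHP), and the function name `link_key` and its parameters are hypothetical.

```python
def link_key(numbers, default_verse=1):
    """Return a (book, hymn, verse) key for a ṚV. or AV. reference.

    A two-number reference (book, hymn) is pointed at `default_verse`;
    a three-number reference is returned unchanged.
    """
    if len(numbers) == 3:
        return tuple(numbers)
    if len(numbers) == 2:
        return (numbers[0], numbers[1], default_verse)
    raise ValueError(f"expected 2 or 3 numbers, got {len(numbers)}")


print(link_key([8, 10]))      # two numbers: verse defaults to 1
print(link_key([12, 1, 30]))  # three numbers: unchanged
```

Pointing at verse 1 gives the reader a place to land in the hymn; it does not claim that the citation refers to that verse.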

Can this issue be closed now?

@Andhrabharati

Andhrabharati commented Sep 17, 2024

I think all the extra specific corrections suggested by AB have been included.

The correction suggested at <L>13998 entry in my post is "missed" by Jim, though he has acted on changing the metrical symbols at lines other than the 126 cases in my "Ç lines" file.

This specific correction is at line 128490, as
(2 Mal 16 + 18 Moren; Ausgang -o- und o—) -> (2 Mal 16 + 18 Moren; Ausgang – ⏑ – und ⏑ –)

Next some wrong corrections [2 cases, due to my "Ç lines" file data being prior to my ls-working; and 3 cases, by Jim's error!] that are to be changed--
line 1894: <ls>ṚV.</ls> <ls>PRĀTIŚ. 17, 30.</ls> -> <ls>ṚV. PRĀTIŚ. 17, 30.</ls>
line 130057: <ls n="AV.">3, 4, 2.</ls> <ls n="ṚV.">7, 76, 3.</ls> -> <ls n="AV.">3, 4, 2.</ls> <ls n="AV.">7, 76, 3.</ls>
line 277744: <pic name='rajatamudra.png'/> -> <lang n="mongolian">ᠲᠠᠮᠭᠠ</lang> ;; Mongolian word "tamga" as at my subsequent post and the related change in the inventory.txt may be reverted.
line 655630: <ls>386</ls> -> <ls n="Ind. St.">386</ls>
line 910175: <ls n="DHĀTUP 7,">40</ls> -> <ls n="DHĀTUP. 7,">40</ls>

There is yet another Mongolian word at <L>52576, which is completely "missed" in the digital text--
line 518808: <lang n="arabic">بهانور</lang> -> <lang n="arabic">بهانور</lang>,
line 518809: , -> <lang n="mongolian">ᠪᠠᠭᠠᠶᠤᠷ</lang> ;; Mongolian word "bagayur" as at my pwk post

[With these the issues #29 and #66 can be closed, to which I forced (rather "blackmailed") Jim to have a look at!]

@Andhrabharati

Andhrabharati commented Sep 17, 2024

There remain several items that may be soluble:

* link target for 2-number references to ṚV. and AV.  I think a solution would be
  to assume that the verse number is 1, and to do this with a change to basicadjust.php
  
  * @Andhrabharati agree?

Not a bad idea; but we need to cover a range of hymns in these resp. cases and just pointing to a particular hymn is not a "proper" solution. The Roth & Whitney ed. of AV has the related hymns under these पर्यायसूक्तs "marked" in round-braces; I do not see such "demarcation" in the AV data compiled by @gasyoun (which is being used for AV linking presently)!

And see what PWG has at the biblio listing for AV.--

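In den ersten Bogen des Wörterbuchs finden sich mehrere Citate, deren Zahlen mit der in der Ausgabe angenommenen Zählung nicht ganz zusammentreffen. Man kann den Unterschied dadurch ausgleichen, dass man in den zusammengesetzten Liedern (Paryāyasūkta), welche in der Ausgabe als Einheiten gezählt sind, die Unterabtheilungen (Strophen) als besondere Lieder zählt.

[In English: In the first sheets of the dictionary there are several citations whose numbers do not quite agree with the numbering adopted in the edition. The difference can be reconciled by counting, in the composite hymns (Paryāyasūkta), which the edition counts as units, the subdivisions (strophes) as separate hymns.]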

[As such, for the "intended" hymns around these places, we might need to see where Marcis's AV. data points to!!]

Finally in my opinion, these are all what would properly come under 'titular' citations category that we have recently discussed in MW.

  • the प्रक्षिप्त references for Ramayana. The three extract-n.pdf files could provide a new link target for R. x, y, z, w.. Indexing needed.

    @Andhrabharati agree?

I was deliberating of late, if the 'actual' source used by PWG is to be given-out; the earlier extracts posted for these प्रक्षिप्त sargas were from a different edition (as I already mentioned there), and do not always reflect what PWG cites!

With this, not only the प्रक्षिप्त sarga citations (~180) but the entire Vol.7 citations (~2000) of PWG [from the Bomb. ed. Rāmāyaṇa] could also be linked up.

However, if CDSL likes to "live with" the different ed. citations [as in case of ṚV. and AV.], we can look for Marcis to come up with indexing the Parab. ed. (Nirnayasagar) of Rāmāyaṇa [as he himself once "committed" earlier, after Mahābhārata (Calc. ed.) linking was finished].

@Andhrabharati

Andhrabharati commented Sep 17, 2024

VN missing pages -- This was begun in #39. I'll review that to see what might be done.

I await to see if you would come up with "yet another model" for the VN integration!!

@Andhrabharati

Can this issue be closed now?

YES, after attending to the further corrections as at my post above.

funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Sep 17, 2024
funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Sep 17, 2024
funderburkjim added a commit that referenced this issue Sep 17, 2024
@funderburkjim
Contributor Author

final touches for this issue

  • replace better scans: pwg5-0907.pdf, pwg5-1165.pdf
  • change_10.txt per the 'above' link in previous comment

@Andhrabharati

I shall prepare the index/map file for Roth's AV (that was mentioned above), and hope Jim might "find" a means and time to link that work as well [there is every uncertainty that Marcis would be coming up with the text of it added in his AV data (as he once mentioned)].

Thus all the points talked about here would've been covered and this issue is thus closable now!

3 participants