Document false positive Pithopus inermis #128

Open
Archilegt opened this issue Aug 29, 2022 · 5 comments

Comments

@Archilegt

Document false positive Pithopus inermis on page https://www.biodiversitylibrary.org/page/663902
The name does not occur on that page. If we can figure out what went wrong, maybe we can fix it.

@Archilegt
Author

Archilegt commented Aug 29, 2022

Maybe "petiolis inermibus" or a spelling variant is producing the false positive.

@dimus
Member

dimus commented Aug 29, 2022

I think the related output from gnfinder is this one:

    {
      "cardinality": 2,
      "verbatim": "Petrolus inermis,",
      "name": "Petrolus inermis",
      "oddsLog10": 11.983664170973137,
      "oddsDetails": [
        {
          "feature": "spDict: inSpecies",
          "odds": 8904.045433955427
        },
        {
          "feature": "uniDict: inGenus",
          "odds": 2976.794090112943
        },
        {
          "feature": "uniEnd3: lus",
          "odds": 570.6314549737272
        },
        {
          "feature": "spEnd3: mis",
          "odds": 210.6946910672223
        },
        {
          "feature": "spLen: 7",
          "odds": 3.6025724692203513
        },
        {
          "feature": "uniLen: 8",
          "odds": 0.9606164921956841
        },
        {
          "feature": "abbr: false",
          "odds": 0.8732848865715452
        },
        {
          "feature": "priorOdds: true",
          "odds": 0.1
        }
      ],
      "start": 143,
      "end": 160,
      "annotationNomenType": "NO_ANNOT",
      "verification": {
        "id": "0dbc49e2-b393-5d52-a0be-2b09ce6231fa",
        "name": "Petrolus inermis",
        "cardinality": 2,
        "matchType": "PartialExact",
        "bestResult": {
          "dataSourceId": 181,
          "dataSourceTitleShort": "IRMNG",
          "curation": "Curated",
          "recordId": "urn:lsid:irmng.org:taxname:1391559",
          "entryDate": "2022-06-10",
          "sortScore": 8.67908829458864,
          "matchedName": "Petrolus Rafinesque, 1815",
          "matchedCardinality": 1,
          "matchedCanonicalSimple": "Petrolus",
          "matchedCanonicalFull": "Petrolus",
          "currentRecordId": "urn:lsid:irmng.org:taxname:1391559",
          "currentName": "Petrolus Rafinesque, 1815",
          "currentCardinality": 1,
          "currentCanonicalSimple": "Petrolus",
          "currentCanonicalFull": "Petrolus",
          "isSynonym": false,
          "classificationPath": "Biota|Animalia|Chordata|Vertebrata|Reptilia|Reptilia|Reptilia|Petrolus",
          "classificationRanks": "|Kingdom|Phylum|Subphylum|Class|Order|Family|Genus",
          "classificationIds": "urn:lsid:irmng.org:taxname:1|urn:lsid:irmng.org:taxname:2|urn:lsid:irmng.org:taxname:148|urn:lsid:irmng.org:taxname:11905117|urn:lsid:irmng.org:taxname:1448|urn:lsid:irmng.org:taxname:10544|urn:lsid:irmng.org:taxname:100138|urn:lsid:irmng.org:taxname:1391559",
          "editDistance": 0,
          "stemEditDistance": 0,
          "matchType": "PartialExact",
          "scoreDetails": {
            "cardinalityScore": 0,
            "infraSpecificRankScore": 0,
            "fuzzyLessScore": 1,
            "curatedDataScore": 0.6666667,
            "authorMatchScore": 0.14285715,
            "acceptedNameScore": 1,
            "parsingQualityScore": 1
          }
        },

So it looks like Pithopus inermis is not returned by gnfinder.

@mlichtenberg and @cajunjoel, can you help find out how this false positive appeared in BHL?
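
For readers wondering how the `oddsDetails` entries relate to `oddsLog10` in the output above: the numbers are consistent with the per-feature odds being multiplied together, which is the same as summing their base-10 logarithms. The sketch below only checks that arithmetic against the values quoted in this thread. The function name `combined_log10_odds` is made up for illustration, and this is not a description of gnfinder's internal code.

```python
import math

# Odds copied verbatim from the "oddsDetails" array quoted above.
odds_details = {
    "spDict: inSpecies": 8904.045433955427,
    "uniDict: inGenus": 2976.794090112943,
    "uniEnd3: lus": 570.6314549737272,
    "spEnd3: mis": 210.6946910672223,
    "spLen: 7": 3.6025724692203513,
    "uniLen: 8": 0.9606164921956841,
    "abbr: false": 0.8732848865715452,
    "priorOdds: true": 0.1,
}

# Value of "oddsLog10" reported in the same output.
reported_log10 = 11.983664170973137


def combined_log10_odds(odds):
    """Sum the log10 of each odds value (equivalent to log10 of their product)."""
    return sum(math.log10(value) for value in odds.values())


computed_log10 = combined_log10_odds(odds_details)
print(f"computed: {computed_log10:.6f}")
print(f"reported: {reported_log10:.6f}")
```

The two dictionary features (`spDict: inSpecies` and `uniDict: inGenus`) contribute most of the score, which fits the explanation that "Petrolus" and "inermis" each exist in the lookup dictionaries even though the combination is an OCR artifact.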

@mlichtenberg

It was old data left over from a previous name-finding algorithm. I re-ran that page through the latest version of GNFinder (1.0.0) and the data now reflects the GNFinder output shown in the previous comment (https://www.biodiversitylibrary.org/page/663902).

@dimus
Member

dimus commented Aug 30, 2022

@mlichtenberg, @cajunjoel, with bhlindex v1.0.0 imminent, maybe we should plan to run it in October against the whole of BHL and get rid of the outdated inaccuracies left by the old algorithms?

@Archilegt
Author

Recognition of Petrolus is as expected for the "Petiolus inermis" sentence in line 5, given the underlying uncorrected OCR "Petrolus inermis".
There is one less false positive for a centipede name! ;)
I will leave the issue open in case you wish to continue working on it.
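
To illustrate how small the OCR error is, here is a minimal sketch that computes the Levenshtein edit distance between the word as printed ("Petiolus") and the word as it appears in the uncorrected OCR ("Petrolus"). The `levenshtein` helper is written for this example only and is an assumption, not something taken from gnfinder or BHL.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


printed = "Petiolus"   # the word as read from the page, per the comment above
ocr_text = "Petrolus"  # the uncorrected OCR text
print(levenshtein(printed, ocr_text))
```

A single substituted character ("i" read as "r") is enough to turn a descriptive Latin word into a string that exactly matches a genus name in IRMNG, which is why the name finder reports it with `"editDistance": 0`.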
