Add interactive mode #5

vsoch · 2020-06-30T00:05:46Z

No description provided.

yarikoptic · 2020-07-30T21:06:59Z

crazy idea -- if it was a bot -- it could have presented those options as an issue (or just part of the PR description) with

- [ ]

for choices, so we could just select the one we want to add (or not)...
e.g.

- [x] Important one
  - [ ] [orcid1](https://orcid.org/orcid1)  (name, affiliation etc)
  - [ ] ...
- [x] The one we decide to ignore and just uncheck the [x]
  - [ ] [orcid1](https://orcid.org/orcid1)  (name, affiliation etc)
  - [ ] ...

and then monitor for some @con/tributors command in a comment, so it would
take that marked-up description and use it to actually apply all the choices
etc.
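To make the checkbox idea concrete, here is a minimal sketch of how a bot might read the choices back out of a marked-up description. Nothing like this exists in con/tributors as far as this thread shows; `parse_task_list` and `checked_labels` are hypothetical names, and the sketch assumes items are written exactly as `- [ ]` or `- [x]`.

```python
import re

# Matches a markdown task-list item such as "  - [x] some label".
# Assumption: items always use "- [ ]" / "- [x]" as in the example above.
TASK_RE = re.compile(r"^(?P<indent>\s*)- \[(?P<mark>[ xX])\]\s*(?P<label>.*)$")


def parse_task_list(markdown):
    """Return (indent_width, checked, label) for every task-list item."""
    items = []
    for line in markdown.splitlines():
        match = TASK_RE.match(line)
        if match:
            checked = match.group("mark").lower() == "x"
            label = match.group("label").strip()
            items.append((len(match.group("indent")), checked, label))
    return items


def checked_labels(markdown):
    """Return only the labels the user ticked."""
    return [label for _, checked, label in parse_task_list(markdown) if checked]
```

The indent width is kept so that a caller could tell top-level entries from the nested ORCID candidates beneath them.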

vsoch · 2020-07-30T21:13:06Z

This is a cool idea, although it would probably need to be a separate action from the main con/tributors (I'm not sure how one GitHub repo can serve more than one). I think to start wouldn't it be more reasonable to make an actually interactive mode, to run on the command line, and ask the user to choose from a list?

In terms of metadata, it's not trivial to just list all the names/affiliations, because the initial request just returns orcids, and a follow-up request shows the metadata. So if we find 30 results, that means 30 more calls to get all of the detailed metadata before we can prompt the user. Is that a reasonable thing to do? I think it's just as easy to see the list, and then copy paste the identifiers into a URL to look carefully at the records. In practice just the name and affiliation isn't enough - a lot of users just have a name and you need to use papers, etc. to actually figure out the affiliation.

vsoch · 2020-07-30T21:26:21Z

@yarikoptic this won't work because here is the metadata that we get back:

[{'orcid-identifier': {'uri': 'https://orcid.org/0000-0001-9374-7098',
   'path': '0000-0001-9374-7098',
   'host': 'orcid.org'}},
 {'orcid-identifier': {'uri': 'https://orcid.org/0000-0001-9750-2514',
   'path': '0000-0001-9750-2514',
   'host': 'orcid.org'}},
 {'orcid-identifier': {'uri': 'https://orcid.org/0000-0003-3181-8561',
   'path': '0000-0003-3181-8561',
   'host': 'orcid.org'}},
 {'orcid-identifier': {'uri': 'https://orcid.org/0000-0003-0925-2012',
   'path': '0000-0003-0925-2012',
   'host': 'orcid.org'}}]

And the API works to look up based on metadata, but you can't just get full metadata for any orcid (it only works for your own). So we could add an interactive mode to ask the user to select an orcid (and they would need to still open the browser to do it) or we could keep as is and give them the list, and then assume they are skilled at opening a text file and copy pasting the entry.
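For reference, a minimal sketch of the first option (prompting the user to pick from the list), assuming search results in the shape shown above. `extract_ids` and `choose_orcid` are hypothetical helpers, not existing con/tributors functions; `ask` defaults to the builtin `input` and is only a parameter so the prompt can be scripted.

```python
def extract_ids(results):
    """Pull the bare ORCID iD out of each search result."""
    return [entry["orcid-identifier"]["path"] for entry in results]


def choose_orcid(candidates, ask=input):
    """Show a numbered list of ORCID iDs; return the chosen one, or None to skip."""
    for number, orcid in enumerate(candidates, start=1):
        # The link is printed so the user can open the record in a browser.
        print(f"[{number}] https://orcid.org/{orcid}")
    while True:
        answer = ask("Select a number (or press enter to skip): ").strip()
        if not answer:
            return None
        try:
            index = int(answer)
        except ValueError:
            index = 0  # not a number, falls through to the retry message
        if 1 <= index <= len(candidates):
            return candidates[index - 1]
        print(f"Please enter a number between 1 and {len(candidates)}.")
```

Calling `choose_orcid(extract_ids(results))` would prompt on the command line; the user still has to open each link to inspect the record.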

yarikoptic · 2020-07-31T00:43:37Z

Right, according to https://orcid.org/organizations/integrators/API it would require the "Basic Member API" to "Search/retrieve member-subscriber data Subject to permissions granted by iD holders", and according to https://orcid.org/about/membership "Standard (single legal entity): US$5,150 " (minus some discounts), which is quite obnoxious IMHO. I would probably have paid ~$100 to support/use it ... but hey -- Dartmouth is a member without any contact information :-(. I will inquire. If I can get a token I could use, then I will see what it would return!

vsoch · 2020-07-31T00:46:07Z

Awesome! Yes please let me know and I can update here to support it, if it's something we could reasonably do.

yarikoptic · 2020-08-03T21:36:35Z

uff, ok - looked at https://pub.orcid.org/v3.0/#!/Development_Public_API_v3.0/viewRecordv3 and it seems you do not even need any TOKEN to access public records. So, as long as you already have (a candidate) orcid id, it seems to be possible to retrieve an entire public record, e.g.

$> curl --silent -X GET --header 'Accept: application/json' 'https://pub.orcid.org/v3.0/0000-0003-3456-2493' | jq . > /tmp/myorcid.json

$> grep email /tmp/myorcid.json
    "verified-email": true,
    "verified-primary-email": true
    "emails": {
      "email": [
          "email": "debian@onerussian.com",
      "path": "/0000-0003-3456-2493/email"

from which you could display name, affiliation(s), etc. It seems no token is even needed for basic search:

$> curl --silent -X GET --header 'Accept: application/json' 'https://pub.orcid.org/v3.0/expanded-search?q=Vanessa+AND+Sochat' | jq . | head -n 20
{
  "expanded-result": [
    {
      "orcid-id": "0000-0002-4387-3819",
      "given-names": "Vanessa",
      "family-names": "Sochat",
      "credit-name": null,
      "other-name": [
        "Vanessasaurus"
      ],
      "email": [],
      "institution-name": [
        "Stanford University School of Medicine"
      ]
    }
  ],
  "num-found": 1
}
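The same expanded-search request could be made from Python roughly like this. It is a sketch using only the standard library and the endpoint and field names visible in the output above; `build_search_url`, `fetch_candidates` and `format_candidate` are illustrative names, not part of any existing client.

```python
import json
from urllib.parse import quote_plus
from urllib.request import Request, urlopen

EXPANDED_SEARCH = "https://pub.orcid.org/v3.0/expanded-search"


def build_search_url(*terms):
    """Join the terms with AND, as in the curl call above, and URL-encode them."""
    return f"{EXPANDED_SEARCH}?q={quote_plus(' AND '.join(terms))}"


def fetch_candidates(*terms):
    """Run the search (no token sent) and return the list of expanded results."""
    request = Request(build_search_url(*terms), headers={"Accept": "application/json"})
    with urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the key may be missing or null when nothing is found.
    return payload.get("expanded-result") or []


def format_candidate(result):
    """One line per candidate: iD, name, institutions."""
    names = (result.get("given-names"), result.get("family-names"))
    name = " ".join(part for part in names if part)
    institutions = "; ".join(result.get("institution-name") or [])
    return f"{result['orcid-id']} | {name} | {institutions}"
```

Because name and institution come back with the search itself, listing the candidates takes one request per query rather than one per result.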

vsoch · 2020-08-03T21:38:56Z

okay so let's say that we do this - and that we get a result of N=400 for some other names query. Then we would do 400 other requests just to show the user a list? :/

yarikoptic · 2020-08-03T21:42:15Z

So why did we need a token at all? It seems to be doing quite a good job for me as well:

$> curl --silent -X GET --header 'Accept: application/json' 'https://pub.orcid.org/v3.0/expanded-search?q=Yaroslav+AND+Halchenko' | jq .             
{
  "expanded-result": [
    {
      "orcid-id": "0000-0003-3456-2493",
      "given-names": "Yaroslav",
      "family-names": "Halchenko",
      "credit-name": null,
      "other-name": [
        "Ярослав Олеговіч Гальченко"
      ],
      "email": [
        "debian@onerussian.com"
      ],
      "institution-name": [
        "Center for Open Neuroscience",
        "Dartmouth College",
        "Debian Project",
        "New Jersey Institute of Technology",
        "Rutgers University",
        "University of New Mexico",
        "Vinnytsia State Technical University"
      ]
    }
  ],
  "num-found": 1
}

For Michael Hanke it brings up some false positives, and I am actually not even sure the real one is among them -- a good example of where showing emails etc. would help to disambiguate.

The question, though, is why we needed to do the whole API token dance ;)

yarikoptic · 2020-08-03T21:43:04Z

note: doesn't tolerate unicode well ;)

$> curl --silent -X GET --header 'Accept: application/json' 'https://pub.orcid.org/v3.0/expanded-search?q=Ярослав+AND+Олеговіч+AND+Гальченко'
<!doctype html><html lang="en"><head><title>HTTP Status 400 – Bad Request</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 400 – Bad Request</h1><hr class="line" /><p><b>Type</b> Exception Report</p><p><b>Message</b> Invalid character found in the request target. The valid characters are defined in RFC 7230 and RFC 3986</p><p><b>Description</b> The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).</p><p><b>Exception</b></p><pre>java.lang.IllegalArgumentException: Invalid character found in the request target. The valid characters are defined in RFC 7230 and RFC 3986
	org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:483)
	org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:502)
	org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
	org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:810)
	org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1623)
	org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
	java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	java.lang.Thread.run(Thread.java:748)
</pre><p><b>Note</b> The full stack trace of the root cause is available in the server logs.</p><hr class="line" /><h3>Apache Tomcat/8.5.50</h3></body></html>%
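The 400 above complains about invalid characters in the request target, which is what happens when raw non-ASCII text is placed in a URL. A minimal sketch of percent-encoding the query first, using only the standard library (`encode_query` is a hypothetical helper); whether the search then returns matches for the Cyrillic name was not checked in this thread.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_query(*terms):
    """Join terms with AND and percent-encode them as UTF-8 for use in a URL."""
    return quote_plus(" AND ".join(terms))


encoded = encode_query("Ярослав", "Олеговіч", "Гальченко")
# Only ASCII characters remain, so the request line is valid per RFC 3986.
print(encoded.isascii())
# Decoding gives back the original query text.
print(unquote_plus(encoded))
```

With curl, the `--data-urlencode` option combined with `-G` performs the same encoding on the command line.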

vsoch · 2020-08-03T21:43:43Z

Good question - I never got it to work without the token! Let me give that a try.

vsoch · 2020-08-03T22:38:34Z

Yep that works! The difference is that you are using expanded-search and not regular search, which I had never tried. I'll update the current PR to do this, and also add interactive mode since we have the metadata available.

vsoch self-assigned this Jun 30, 2020
