User-side feedback #183

Open
dmitriz opened this issue May 5, 2020 · 3 comments

dmitriz commented May 5, 2020

Opening this issue to document feedback and recommendation from the users' perspectives.

It is 2020 and we still talk about papers. 😄

  • Nice and easy installation with npm.
  • Many deprecated packages are reported during installation; they might need fixing in the future, though maybe not as an immediate priority.

Minimal usage

$ getpapers -q covid
info: Searching using eupmc API
error: No output directory given. You must provide the --outdir argument.
  • Use a default directory to save some typing for users in a hurry?
  • Even nicer: Use the search string as directory name by default. That will make it really easy to use.
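
A minimal sketch of the second suggestion, in Python for illustration only (getpapers itself is a Node.js tool, and this is not its code). The function name `default_outdir`, the 40-character cap, and the hash suffix for long queries are all assumptions made for this example.

```python
import hashlib
import re

MAX_LEN = 40  # assumed cap so very long queries still give a usable name


def default_outdir(query: str, max_len: int = MAX_LEN) -> str:
    """Derive a filesystem-friendly directory name from a search string."""
    # Lowercase, then collapse every run of other characters into one "-".
    slug = re.sub(r"[^a-z0-9]+", "-", query.lower()).strip("-")
    if not slug:
        slug = "query"  # fallback for queries with no letters or digits
    if len(slug) <= max_len:
        return slug
    # Long query: truncate, then append a short hash of the full query so
    # two long queries that share a prefix get different directories.
    digest = hashlib.sha1(query.encode("utf-8")).hexdigest()[:8]
    return f"{slug[:max_len].rstrip('-')}-{digest}"


print(default_outdir("covid"))                # covid
print(default_outdir("covid tracing korea"))  # covid-tracing-korea
```

With a default like this, `getpapers -q covid` could write to `./covid` without any extra typing, while an explicit `-o` would still take precedence.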

Next simplest choice:

$ getpapers -q covid -o covid
info: Searching using eupmc API
info: Found 37494 open access results
warn: This version of getpapers wasn't built with this version of the EuPMC api in mind
warn: getpapers EuPMCVersion: 5.3.2 vs. 6.2 reported by api
Retrieving results [==----------------------------] 8% (eta 232.8s)^C
  • Does it really download 40k results by default? That could be unexpected.
  • I terminated the download and the directory is empty. An alternative could be to keep the results retrieved so far.
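
One way to keep partial results is to write each page of results to disk before requesting the next one. The sketch below is a hypothetical illustration in Python, not how getpapers works internally; `fetch_page` stands in for whatever call retrieves one page from the API, and the `page_NNNNN.json` file naming is made up for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def retrieve_incrementally(
    fetch_page: Callable[[int], List[Dict]],
    outdir: Path,
    n_pages: int,
) -> int:
    """Fetch pages one by one, saving each page before asking for the next."""
    outdir.mkdir(parents=True, exist_ok=True)
    saved = 0
    try:
        for page in range(n_pages):
            records = fetch_page(page)
            # Persist immediately, so an interrupt loses at most one page.
            path = outdir / f"page_{page:05d}.json"
            path.write_text(json.dumps(records), encoding="utf-8")
            saved += len(records)
    except KeyboardInterrupt:
        print(f"Interrupted; kept {saved} results in {outdir}")
    return saved


def fake_fetch(page: int) -> List[Dict]:
    """Stand-in for an API call; simulates Ctrl-C on the third page."""
    if page == 2:
        raise KeyboardInterrupt
    return [{"id": f"rec-{page}-{i}"} for i in range(2)]


with tempfile.TemporaryDirectory() as tmp:
    kept = retrieve_incrementally(fake_fetch, Path(tmp) / "covid", n_pages=5)
    print(kept)
```

The trade-off is more, smaller writes during retrieval in exchange for a usable directory after an interrupted run.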

Smaller searches work nicely, apart from the warnings that are a bit confusing.

$ getpapers -q "covid tracing" -o tracing
info: Searching using eupmc API
info: Found 1155 open access results
warn: This version of getpapers wasn't built with this version of the EuPMC api in mind
warn: getpapers EuPMCVersion: 5.3.2 vs. 6.2 reported by api
Retrieving results [==============================] 100% (eta 0.0s)
info: Done collecting results
info: Saving result metadata
info: Full EUPMC result metadata written to eupmc_results.json
info: Individual EUPMC result metadata records written
info: Extracting fulltext HTML URL list (may not be available for all articles)
info: Fulltext HTML URL list written to eupmc_fulltext_html_urls.txt

Now refining:

$ getpapers -q "covid tracing korea" -o tracing
info: Searching using eupmc API
info: Found 254 open access results
warn: This version of getpapers wasn't built with this version of the EuPMC api in mind
warn: getpapers EuPMCVersion: 5.3.2 vs. 6.2 reported by api
Retrieving results [==============================] 100% (eta 0.0s)
info: Done collecting results
info: Saving result metadata
info: Full EUPMC result metadata written to eupmc_results.json
info: Individual EUPMC result metadata records written
info: Extracting fulltext HTML URL list (may not be available for all articles)
info: Fulltext HTML URL list written to eupmc_fulltext_html_urls.txt

And again:

$ getpapers -q "covid tracing korea taiwan vietnam" -o tracing
info: Searching using eupmc API
info: Found 26 open access results
warn: This version of getpapers wasn't built with this version of the EuPMC api in mind
warn: getpapers EuPMCVersion: 5.3.2 vs. 6.2 reported by api
Retrieving results [==============================] 100% (eta 0.0s)
info: Done collecting results
info: Saving result metadata
info: Full EUPMC result metadata written to eupmc_results.json
info: Individual EUPMC result metadata records written
info: Extracting fulltext HTML URL list (may not be available for all articles)
info: Fulltext HTML URL list written to eupmc_fulltext_html_urls.txt
  • It looks like the results are merged rather than lost, which is nice.
  • But each created subdirectory has only one JSON file. Would it be easier to navigate just a list of files instead of directories?
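
Until the layout changes, a flat view can be produced on the user's side. This is a small Python sketch that assumes only what is described above (one JSON file inside each per-result subdirectory); the directory and file names in the demo are invented for illustration.

```python
import tempfile
from pathlib import Path
from typing import List


def list_result_files(outdir: Path) -> List[Path]:
    """Return the JSON files found exactly one level below outdir, sorted."""
    return sorted(outdir.glob("*/*.json"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "tracing"
    # Hypothetical layout: one subdirectory per result, one JSON file in each.
    for name in ("PMC0000002", "PMC0000001"):
        (root / name).mkdir(parents=True)
        (root / name / "result.json").write_text("{}", encoding="utf-8")
    for path in list_result_files(root):
        print(path.relative_to(root))
```
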
Member

petermr commented May 5, 2020

Thanks. Valuable.

I am NOT the author of getpapers. Rick Smith-Unna is, and we should try to get his views. Here are mine. I think they should be refiled as separate issues.

  1. default directory.
    • pros: it's simple
    • cons: some queries are a page long. We either truncate or hash.

  2. infinite download.
    Yes, this is a major problem. There needs to be an inbuilt limit.

  3. cached download.
    The JSON is (I think) ordered by scientific priority. I don't know if the download order follows this.

  4. overwriting and merging.
    This is an important issue. It's nice that you can download on top of an existing dir/CProject. But there may be implicit context that is lost. It is probably useful to have a switch --overwrite.
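
A rough sketch of how an `--overwrite` switch could behave, written in Python with `argparse` purely for illustration. The switch is only a proposal in this thread and does not exist in getpapers; `build_parser` and `prepare_outdir` are hypothetical names, and merging into an existing directory is assumed to stay the default.

```python
import argparse
import shutil
import tempfile
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="downloader-sketch")
    parser.add_argument("-q", "--query", required=True)
    parser.add_argument("-o", "--outdir", required=True, type=Path)
    parser.add_argument(
        "--overwrite",
        action="store_true",
        help="discard an existing output directory instead of merging into it",
    )
    return parser


def prepare_outdir(outdir: Path, overwrite: bool) -> str:
    """Return 'created', 'merged' or 'overwritten' for the chosen directory."""
    if not outdir.exists():
        outdir.mkdir(parents=True)
        return "created"
    if overwrite:
        shutil.rmtree(outdir)  # drop everything from earlier runs
        outdir.mkdir(parents=True)
        return "overwritten"
    return "merged"  # keep existing files; new results are added alongside


with tempfile.TemporaryDirectory() as tmp:
    target = str(Path(tmp) / "tracing")
    for argv in (
        ["-q", "covid tracing", "-o", target],
        ["-q", "covid tracing korea", "-o", target],
        ["-q", "covid", "-o", target, "--overwrite"],
    ):
        args = build_parser().parse_args(argv)
        print(prepare_outdir(args.outdir, args.overwrite))
```

Recording the query that produced each run somewhere in the directory would be one way to preserve the context that is otherwise lost when runs are merged.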

I am having to deal with some of this in ami download https://github.com/petermr/ami3

Author

dmitriz commented May 6, 2020

> Thanks. Valuable.

Thank you for your appreciation. :)

> I am NOT the author of getpapers. Rick Smith-Unna is and we should try to get his views.

Judging by the lack of responses to previous issues and the last code change back in 2016, this may have been off his radar for quite a while.

> default directory.
> pros: it's simple
> cons: some queries are a page long. We either truncate or hash.

What about using the search string?

> infinite download.
> Yes, this is a major problem. There needs to be an inbuilt limit

100 results seem like a common default I've seen with many APIs.
Also, a defined order is needed for a limit to make sense; maybe the 100 most recent ones?
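
A minimal sketch of that default, assuming each result is a dict with an ISO-formatted `date` field (a made-up field name for this example, not the actual EuPMC schema). Passing `limit=None` would restore the current download-everything behaviour.

```python
from typing import Dict, Iterable, List, Optional

DEFAULT_LIMIT = 100  # the default suggested above


def select_results(
    results: Iterable[Dict],
    limit: Optional[int] = DEFAULT_LIMIT,
) -> List[Dict]:
    """Order results newest first and keep at most `limit` of them."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    ordered = sorted(results, key=lambda r: r["date"], reverse=True)
    return ordered if limit is None else ordered[:limit]


sample = [
    {"id": "A", "date": "2020-03-01"},
    {"id": "B", "date": "2020-05-04"},
    {"id": "C", "date": "2019-12-31"},
]
print([r["id"] for r in select_results(sample, limit=2)])  # ['B', 'A']
```

In practice the ordering and the cap would more sensibly be requested from the API itself, so that only the wanted records are transferred.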

> cached download.
> The JSON is (I think) ordered by scientific priority. I don't know if the download order follows this.

By scientific priority, you mean the first mention? I didn't know the APIs could do such things. :)

> overwriting and merging.
> This is an important issue. It's nice that you can download on top of an existing dir/CProject. But there may be implicit context that is lost. It probably useful to have a switch --overwrite

Agreed. The most user-friendly way is probably to print an overwrite warning with options to select: yes, no, or yes to all (to skip the remaining warnings).
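
The yes / no / yes-to-all prompt could look roughly like the Python sketch below. The function name `confirm_overwrites` and the prompt wording are made up for this example; the `ask` parameter defaults to the built-in `input` and is swapped for scripted answers in the demo so that it runs without a terminal.

```python
from typing import Callable, Iterable, List


def confirm_overwrites(
    paths: Iterable[str],
    ask: Callable[[str], str] = input,
) -> List[str]:
    """Return the paths the user agreed to overwrite."""
    approved = []
    yes_to_all = False
    for path in paths:
        if not yes_to_all:
            answer = ask(f"Overwrite {path}? [y]es / [n]o / [a]ll: ")
            answer = answer.strip().lower()
            if answer in ("a", "all"):
                yes_to_all = True  # stop asking for the remaining paths
            elif answer not in ("y", "yes"):
                continue  # anything else counts as "no"
        approved.append(path)
    return approved


# Scripted answers: no, yes, all (the fourth path is never asked about).
answers = iter(["n", "y", "a"])
chosen = confirm_overwrites(
    ["a.json", "b.json", "c.json", "d.json"],
    ask=lambda prompt: next(answers),
)
print(chosen)  # ['b.json', 'c.json', 'd.json']
```

For scripted or batch use, a prompt like this would need a non-interactive bypass, which is where a switch such as --overwrite fits.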

> I am having to deal with some of this in ami download https://github.com/petermr/ami3

Do you still need getpapers then?

Member

petermr commented May 6, 2020

Yes, we still need it. There are tutorials out there.
