Problem with DOI containing trailing slash "/" (and maybe other url-breaking symbols like ; # ? < > \ ) #239

Open
eutanatos opened this issue Jul 21, 2023 · 1 comment

Comments

@eutanatos

I have a problem retrieving metadata for DOIs that contain a trailing slash, such as "10.36652/0042-4633-2023-102-5-404-413/".

cr_works does not resolve this DOI:

rcrossref::cr_works("10.36652/0042-4633-2023-102-5-404-413/")$data

#Warning message:
#404 (client error): /works/10.36652/0042-4633-2023-102-5-404-413/ - Resource not found. 

and neither does the Crossref REST API,

but doi.org resolves this DOI, and I'm sure that it is registered in Crossref.

A little investigation led me to this statement in the API overview chapter of the current Crossref Unified Resource API documentation:

"You should always url-encode DOIs and parameter values when using the API. DOIs are notorious for including characters that break URLs (e.g. semicolons, hashes, slashes, ampersands, question marks, etc.)."

So I suggest a fix for this issue: add a transformation that url-encodes the DOI. In my case that means changing the trailing "/" to "%2F":

rcrossref::cr_works("10.36652/0042-4633-2023-102-5-404-413%2F", .progress = "text")$data
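The transformation could look roughly like the sketch below. It is only an illustration, not rcrossref code: `encode_doi_suffix` is a hypothetical helper name, and it assumes that the first "/" separates the DOI prefix from the suffix and should stay literal, while everything after it is percent-encoded with base R's `utils::URLencode(reserved = TRUE)`. Whether the Crossref API accepts every character encoded this way is untested here.

```r
# Hypothetical helper (not part of rcrossref): percent-encode the DOI suffix
# while leaving the first "/" (assumed prefix/suffix separator) untouched.
encode_doi_suffix <- function(doi) {
  pos <- as.integer(regexpr("/", doi, fixed = TRUE))
  if (pos < 0) {
    return(doi)  # no slash at all, nothing to split
  }
  prefix <- substr(doi, 1, pos - 1)
  suffix <- substr(doi, pos + 1, nchar(doi))
  # reserved = TRUE also encodes characters such as / ; # ? in the suffix
  paste0(prefix, "/", utils::URLencode(suffix, reserved = TRUE))
}

doi <- "10.36652/0042-4633-2023-102-5-404-413/"
encoded <- encode_doi_suffix(doi)
print(encoded)

# The encoded value could then be passed on as in the call above:
# rcrossref::cr_works(encoded)$data
```

For this DOI the helper produces the same string as the manual replacement, and a DOI without any special characters in its suffix passes through unchanged.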

P.S.: I found Crossref documentation for members on constructing DOIs, where they ask not to use / in DOIs: "Do not encode forward slash / when resolving DOIs or retrieving metadata from our REST API", but words alone can only do so much...

----------- additional info, perhaps this will be useful to someone else -----------

I found that adding an extra slash also gives a positive result, both for cr_works and for the Crossref REST API:

rcrossref::cr_works("10.36652/0042-4633-2023-102-5-404-413//")$data

My investigation also led me to two old Crossref API issue threads, thread_1 and thread_2, that touch on this kind of problem. As I understand it, they did not rule out the creation of DOIs with a trailing slash, and a trailing slash now breaks "/agency" queries even if the DOI is url-encoded.

@eutanatos changed the title from 'Problem with DOI containing trailing slash "/" (and maybe' to 'Problem with DOI containing trailing slash "/" (and maybe other url-breaking symbols like ; # ? < > \ )' on Jul 21, 2023
@njahn82
Member

njahn82 commented Oct 2, 2023

Hi @eutanatos, I can confirm that Crossref does not return metadata for this record:

https://api.crossref.org/works/10.36652/0042-4633-2023-102-5-404-413/

However, I don't think rcrossref has an issue with trailing slashes, e.g.:

rcrossref::cr_works("10.1002/asi.24460/")
#> $meta
#> NULL
#> 
#> $data
#> # A tibble: 1 × 35
#>   alternative.id    archive container.title    created deposited published.print
#>   <chr>             <chr>   <chr>              <chr>   <chr>     <chr>          
#> 1 10.1002/asi.24460 Portico Journal of the As… 2021-0… 2023-08-… 2021-09        
#> # ℹ 29 more variables: published.online <chr>, doi <chr>, indexed <chr>,
#> #   issn <chr>, issue <chr>, issued <chr>, member <chr>, page <chr>,
#> #   prefix <chr>, publisher <chr>, score <chr>, source <chr>,
#> #   reference.count <chr>, references.count <chr>,
#> #   is.referenced.by.count <chr>, subject <chr>, title <chr>, type <chr>,
#> #   update.policy <chr>, url <chr>, volume <chr>, abstract <chr>,
#> #   language <chr>, short.container.title <chr>, assertion <list>, …
#> 
#> $facets
#> NULL

Created on 2023-10-02 with reprex v2.0.2
