
v0.12

@J535D165 released this 27 Jun 12:15
a203837

What's Changed

  • Fix issue with new Dryad REST API format for downloading files #75 by @micafer in #76
  • Remove Pandas dep and fix argument checksum on CLI by @J535D165 in #79
  • [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #74
  • Add support for providers in OSF: #69 by @micafer in #77
  • Fix unused, overwritten line of code by @J535D165 in #82
  • Change GitHub Actions to run workflow job per service by @J535D165 in #83

New Contributors

Full Changelog: v0.11...v0.12

Coverage report

The following benchmark was run on 1000 randomly selected records from DataCite.

Percentages

Percentage of datasets supported: 26.7%
Percentage of datasets not supported: 69.1%
Percentage of datasets with error: 4.2%
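
The three percentages sum to 100%, so on 1000 records they correspond to 267, 691, and 42 records. The arithmetic can be sketched as follows, assuming each record ends in exactly one of three outcomes. The outcome labels and the function name are hypothetical and are not taken from the benchmark code.

```python
from collections import Counter


def outcome_percentages(outcomes):
    """Return the percentage share of each outcome label, rounded to one decimal."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Counts implied by the reported percentages on 1000 records (labels are hypothetical).
outcomes = ["supported"] * 267 + ["not_supported"] * 691 + ["error"] * 42
percentages = outcome_percentages(outcomes)
print(percentages)
```

The 42 records in the error bucket match the 42 rows of the table with unexpected errors.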

Table with unexpected errors

index id type url service error
47 10.58100/ibcr0302rx67ws2 dois http://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=21038&SAM=IBCR0302RX67WS2 nan 503 Server Error: Service Unavailable for url: https://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=21038&SAM=IBCR0302RX67WS2
52 10.18730/v7c2= dois https://glis.fao.org/glis/doi/10.18730/V7C2= nan '10.18730/v7c2=' is not a correct resource identifier (e.g. a URL, DOI, Handle)
73 10.20345/digitue.1029.61 dois http://idb.ub.uni-tuebingen.de/opendigi/litrdsch_1902#p=141 nan 500 Server Error: Internal Server Error for url: https://opendigi.ub.uni-tuebingen.de/opendigi/litrdsch_1902#p=141
90 10.6068/dp15afea413954 dois http://statisticaldatasets.data-planet.com/dataplanet/Datasheet_DOI_Servlet?ID=15afea413954&type=gwtdatasheet&version=1 nan HTTPConnectionPool(host='statisticaldatasets.data-planet.com', port=80): Max retries exceeded with url: /dataplanet/Datasheet_DOI_Servlet?ID=15afea413954&type=gwtdatasheet&version=1 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f369f730fb0>, 'Connection to statisticaldatasets.data-planet.com timed out. (connect timeout=3)'))
96 10.17876/plate/dr.2/plates/201_33742 dois https://www.plate-archive.org/objects/dr.2/plates/201_33742 nan 500 Server Error: Internal Server Error for url: https://www.plate-archive.org/objects/dr.2/plates/201_33742/
119 10.18430/m3.irrmc.4168 dois https://proteindiffraction.org/project/SETDB1-x122 nan 'NoneType' object has no attribute 'find'
129 10.58100/ibcr0310rxocku2 dois http://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=22618&SAM=IBCR0310RXOCKU2 nan 503 Server Error: Service Unavailable for url: https://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=22618&SAM=IBCR0310RXOCKU2
133 10.14469/ch/8676 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/to-8701 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/to-8701 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f3693833500>, 'Connection to spectradspace.lib.imperial.ac.uk timed out. (connect timeout=3)'))
136 10.17614/q4h70857g dois http://pqr.pitt.edu/mol/KFKSYDSVYUWMHK-UHFFFAOYSA-N nan HTTPConnectionPool(host='pqr.pitt.edu', port=80): Max retries exceeded with url: /mol/KFKSYDSVYUWMHK-UHFFFAOYSA-N (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f369fbb6b40>, 'Connection to pqr.pitt.edu timed out. (connect timeout=3)'))
156 10.6068/dp14ba7fada6a81 dois http://statisticaldatasets.data-planet.com/dataplanet/Datasheet_DOI_Servlet?ID=14ba7fada6a81&type=datasheet&version=1 nan HTTPConnectionPool(host='statisticaldatasets.data-planet.com', port=80): Max retries exceeded with url: /dataplanet/Datasheet_DOI_Servlet?ID=14ba7fada6a81&type=datasheet&version=1 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f36938bb440>, 'Connection to statisticaldatasets.data-planet.com timed out. (connect timeout=3)'))
241 10.4233/uuid:51dde3f6-2a38-47a0-b719-420ff74ded5d dois http://resolver.tudelft.nl/uuid:51dde3f6-2a38-47a0-b719-420ff74ded5d nan HTTPSConnectionPool(host='repository.tudelft.nl', port=443): Read timed out. (read timeout=10)
256 10.17171/1-8-2854 dois http://repository.edition-topoi.org/collection/ICG/object/3675 nan HTTPConnectionPool(host='repository.edition-topoi.org', port=80): Max retries exceeded with url: /collection/ICG/object/3675 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f36938b9e80>, 'Connection to repository.edition-topoi.org timed out. (connect timeout=3)'))
259 10.6068/dp15e784c851034 dois http://statisticaldatasets.data-planet.com/dataplanet/Datasheet_DOI_Servlet?ID=15e784c851034&type=datasheet&version=1 nan HTTPConnectionPool(host='statisticaldatasets.data-planet.com', port=80): Max retries exceeded with url: /dataplanet/Datasheet_DOI_Servlet?ID=15e784c851034&type=datasheet&version=1 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f369fcc8050>, 'Connection to statisticaldatasets.data-planet.com timed out. (connect timeout=3)'))
267 10.17614/q4td9p06n dois http://pqr.pitt.edu/mol/HJQMFSDWWCLFTC-TWJUVVLDSA-N nan HTTPConnectionPool(host='pqr.pitt.edu', port=80): Max retries exceeded with url: /mol/HJQMFSDWWCLFTC-TWJUVVLDSA-N (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f369fddeae0>, 'Connection to pqr.pitt.edu timed out. (connect timeout=3)'))
318 10.48550/arxiv.1509.07661 dois https://arxiv.org/abs/1509.07661 nan HTTPSConnectionPool(host='arxiv.org', port=443): Read timed out. (read timeout=10)
362 10.14469/ch/1303 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/to-1328 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/to-1328 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f369fecc590>, 'Connection to spectradspace.lib.imperial.ac.uk timed out. (connect timeout=3)'))
383 10.14456/scitechasia.2022.12 dois http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14456/scitechasia.2022.12 nan HTTPSConnectionPool(host='doi.nrct.go.th', port=443): Max retries exceeded with url: /?page=resolve_doi&resolve_doi=10.14456/scitechasia.2022.12 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))
397 10.17876/plate/dr.2/envelopes/201_50873 dois https://www.plate-archive.org/objects/dr.2/envelopes/201_50873 nan 500 Server Error: Internal Server Error for url: https://www.plate-archive.org/objects/dr.2/envelopes/201_50873/
400 10.23725/akhp-6959 dois https://ors.datacite.org/doi:/10.23725/akhp-6959 nan HTTPSConnectionPool(host='ors.datacite.org', port=443): Max retries exceeded with url: /doi:/10.23725/akhp-6959 (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f3693693e30>: Failed to resolve 'ors.datacite.org' ([Errno -2] Name or service not known)"))
403 10.58100/ibcr0381exz5001 dois http://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=26882&SAM=IBCR0381EXZ5001 nan 503 Server Error: Service Unavailable for url: https://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=26882&SAM=IBCR0381EXZ5001
434 10.58100/ibcr0364exxoa01 dois http://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=26567&SAM=IBCR0364EXXOA01 nan 503 Server Error: Service Unavailable for url: https://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=26567&SAM=IBCR0364EXXOA01
452 10.14469/ch/129258 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/134211 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/134211 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f3693892de0>, 'Connection to spectradspace.lib.imperial.ac.uk timed out. (connect timeout=3)'))
458 10.14469/ch/41814 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/48213 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/48213 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f369fbb4620>, 'Connection to spectradspace.lib.imperial.ac.uk timed out. (connect timeout=3)'))
483 10.18730/12n7m$ dois https://glis.fao.org/glis/doi/10.18730/12N7M$ nan '10.18730/12n7m$' is not a correct resource identifier (e.g. a URL, DOI, Handle)
496 10.14457/cmu.the.2009.132 dois http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14457/CMU.the.2009.132 nan HTTPSConnectionPool(host='doi.nrct.go.th', port=443): Max retries exceeded with url: /?page=resolve_doi&resolve_doi=10.14457/CMU.the.2009.132 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))
501 10.14456/stj.2019.4 dois http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14456/stj.2019.4 nan HTTPSConnectionPool(host='doi.nrct.go.th', port=443): Max retries exceeded with url: /?page=resolve_doi&resolve_doi=10.14456/stj.2019.4 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))
503 10.14457/kmutt.res.2010.25 dois http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14457/KMUTT.res.2010.25 nan HTTPSConnectionPool(host='doi.nrct.go.th', port=443): Max retries exceeded with url: /?page=resolve_doi&resolve_doi=10.14457/KMUTT.res.2010.25 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))
505 10.14469/ch/175982 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/180406 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/180406 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f369363d490>, 'Connection to spectradspace.lib.imperial.ac.uk timed out. (connect timeout=3)'))
515 10.15781/t2c824g2w dois https://repositories.lib.utexas.edu/handle/2152/41169 nan 502 Server Error: Proxy Error for url: https://repositories.lib.utexas.edu/items/e85c1b13-5c44-488b-a9a6-0507078f39d3
551 10.17876/plate/dr.2/plates/201_35722 dois https://www.plate-archive.org/objects/dr.2/plates/201_35722 nan 500 Server Error: Internal Server Error for url: https://www.plate-archive.org/objects/dr.2/plates/201_35722/
557 10.14457/mu.the.1999.140 dois http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14457/MU.the.1999.140 nan HTTPSConnectionPool(host='doi.nrct.go.th', port=443): Max retries exceeded with url: /?page=resolve_doi&resolve_doi=10.14457/MU.the.1999.140 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))
625 10.17182/hepdata.60582.v1/t187 dois https://www.hepdata.net/record/61173 nan HTTPSConnectionPool(host='www.hepdata.net', port=443): Read timed out. (read timeout=10)
639 10.58100/ibcr0364exf0601 dois http://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=26688&SAM=IBCR0364EXF0601 nan 503 Server Error: Service Unavailable for url: https://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=26688&SAM=IBCR0364EXF0601
680 10.58108/csrwa19919 dois https://rockstore.csiro.au/arrc/#/browsesamples/CSRWA19919 nan HTTPSConnectionPool(host='rockstore.csiro.au', port=443): Max retries exceeded with url: /arrc/ (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))
683 10.20379/dbaud-1041 dois http://webdatenbank.grass-medienarchiv.de/receive/ggrass_mods_00001019 nan 503 Server Error: Service Unavailable for url: https://webdatenbank.grass-medienarchiv.de/receive/ggrass_mods_00001019
757 10.18730/q3s0= dois https://glis.fao.org/glis/doi/10.18730/Q3S0= nan '10.18730/q3s0=' is not a correct resource identifier (e.g. a URL, DOI, Handle)
782 10.20372/nadre:1554185535.13 dois https://nadre.ethernet.edu.et/record/3238?ln=en nan HTTPSConnectionPool(host='nadre.ethernet.edu.et', port=443): Max retries exceeded with url: /record/3238?ln=en (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f369385f0e0>, 'Connection to nadre.ethernet.edu.et timed out. (connect timeout=3)'))
816 10.14469/ch/90617 dois https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/97675 nan HTTPSConnectionPool(host='spectradspace.lib.imperial.ac.uk', port=8443): Max retries exceeded with url: /dspace/handle/10042/97675 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f36938b1ca0>, 'Connection to spectradspace.lib.imperial.ac.uk timed out. (connect timeout=3)'))
821 10.14456/apsr.2022.3 dois http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14456/apsr.2022.3 nan HTTPSConnectionPool(host='doi.nrct.go.th', port=443): Max retries exceeded with url: /?page=resolve_doi&resolve_doi=10.14456/apsr.2022.3 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))
823 10.17171/1-9-1799-5 dois http://repository.edition-topoi.org/collection/MRMD/single/0047/13 nan HTTPConnectionPool(host='repository.edition-topoi.org', port=80): Max retries exceeded with url: /collection/MRMD/single/0047/13 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f3693885c40>, 'Connection to repository.edition-topoi.org timed out. (connect timeout=3)'))
852 10.17171/1-13-16687 dois http://repository.edition-topoi.org/collection/ANCM/object/9341 nan HTTPConnectionPool(host='repository.edition-topoi.org', port=80): Max retries exceeded with url: /collection/ANCM/object/9341 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f36938927b0>, 'Connection to repository.edition-topoi.org timed out. (connect timeout=3)'))
894 10.5287/bodleianjpcy.2 dois https://databank.ora.ox.ac.uk/ww1archives/datasets/ww1-3945?version=2 nan HTTPSConnectionPool(host='databank.ora.ox.ac.uk', port=443): Max retries exceeded with url: /ww1archives/datasets/ww1-3945?version=2 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f369f6bcb00>, 'Connection to databank.ora.ox.ac.uk timed out. (connect timeout=3)'))
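
Most rows above fall into a handful of recurring failure modes (connect timeouts, read timeouts, SSL verification failures, 5xx responses, rejected identifiers). A rough way to group them is to match on substrings of the error message. This is only a sketch: the category names and the `classify_error` helper are assumptions made for illustration and are not part of the benchmark.

```python
# Ordered (substring, category) rules; the first match wins.
# Category names are hypothetical and chosen only for this illustration.
RULES = [
    ("CERTIFICATE_VERIFY_FAILED", "ssl_verification"),
    ("NameResolutionError", "dns_resolution"),
    ("ConnectTimeoutError", "connect_timeout"),
    ("Read timed out", "read_timeout"),
    ("Server Error", "server_error"),
    ("is not a correct resource identifier", "invalid_identifier"),
]


def classify_error(message):
    """Map an error message from the table to a coarse category."""
    for needle, category in RULES:
        if needle in message:
            return category
    return "other"


# Messages copied from rows 318, 52, 119, and 47 of the table.
samples = [
    "HTTPSConnectionPool(host='arxiv.org', port=443): Read timed out. (read timeout=10)",
    "'10.18730/v7c2=' is not a correct resource identifier (e.g. a URL, DOI, Handle)",
    "'NoneType' object has no attribute 'find'",
    "503 Server Error: Service Unavailable for url: "
    "https://dis.iodp.pangaea.de/BCRDIS/webview/CORES_INFO.aspx?SKEY=21038&SAM=IBCR0302RX67WS2",
]
categories = [classify_error(m) for m in samples]
print(categories)
```

Grouping this way separates transient network problems on the remote host from failures that may point at the client itself, such as the `'NoneType' object has no attribute 'find'` row.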

Table with unsupported repositories

netloc count
pid.geoscience.gov.au 103
app.geosamples.org 79
doi.plutof.ut.ee 60
www.gbif.org 57
glis.fao.org 30
www.e-periodica.ch 26
ba.e-pics.ethz.ch 22
dlc.library.columbia.edu 19
bacdive.dsmz.de 18
rgdoi.net 16
digitallibrary.usc.edu 14
www.ccdc.cam.ac.uk 14
www.lfi.ch 11
nakala.fr 9
catalog.paradisec.org.au 8
www.osti.gov 8
digital.ucd.ie 7
www.plate-archive.org 7
doi.library.ubc.ca 7
ntnu.tind.io 6
architekturmuseum.ub.tu-berlin.de 6
doi.nrct.go.th 6
www.die-bonn.de 6
spectradspace.lib.imperial.ac.uk:8443 6
straininfo.dsmz.de 5
dadosdepesquisa.fiocruz.br 5
publikationen.bibliothek.kit.edu 5
dis.iodp.pangaea.de 5
digi.ub.uni-heidelberg.de 5
www.rvdata.us 4
hdl.handle.net 4
era.library.ualberta.ca 4
data.neotomadb.org 4
sage.figshare.com 4
repositories.lib.utexas.edu 3
apex.ipk-gatersleben.de 3
www.boldsystems.org 3
epos.myesr.org 3
statisticaldatasets.data-planet.com 3
journals.ub.uni-heidelberg.de 3
ageconsearch.umn.edu 3
doi.ala.org.au 3
sr.ethz.ch 3
www.hepdata.net 3
repository.edition-topoi.org 3
classiques-garnier.com 2
ikee.lib.auth.gr 2
biosys.e-pics.ethz.ch 2
gdac.broadinstitute.org 2
bib-pubdb1.desy.de 2
d.lib.msu.edu 2
cyberleninka.ru 2
cocoon.huma-num.fr 2
www.e-manuscripta.ch 2
scholarworks.wm.edu 2
pqr.pitt.edu 2
search.rads-doi.org 2
springernature.figshare.com 2
147.156.5.176:8080 2
doi.roper.center 2
viurrspace.ca 2
core.tdar.org 2
hasp.ub.uni-heidelberg.de 2
www.e-gs.ethz.ch 2
www.psycharchives.org 1
underline.io 1
www.sozialpolitik.ch 1
proteindiffraction.org 1
idb.ub.uni-tuebingen.de 1
publica.fraunhofer.de 1
ads.nipr.ac.jp 1
data.caltech.edu 1
www.worldpop.org.uk 1
nsidc.org 1
didomena.ehess.fr 1
archaeologydataservice.ac.uk 1
www.elibrary.ru 1
cyberdoi.ru 1
spiral.imperial.ac.uk 1
opus.bibliothek.uni-wuerzburg.de 1
www.tib.eu 1
resolver.tudelft.nl 1
daac.ornl.gov 1
doi.ciser.cornell.edu 1
journals.open.tudelft.nl 1
tuprints.ulb.tu-darmstadt.de 1
academiccommons.columbia.edu 1
www.archaeolog.ru 1
bl.iro.bl.uk 1
dataservices.gfz-potsdam.de 1
boris.unibe.ch 1
ors.datacite.org 1
www.e-rara.ch 1
theses.gla.ac.uk 1
www.jamstec.go.jp 1
drops.dagstuhl.de 1
www.icpsr.umich.edu 1
archiviostorico.fondazione1563.it 1
ruor.uottawa.ca 1
archive.materialscloud.org 1
www.zora.uzh.ch 1
elib.spbstu.ru 1
campagnes.flotteoceanographique.fr 1
arxiv.org 1
www.openagrar.de 1
ojs.utlib.ee 1
esdcdoi.esac.esa.int 1
epub.uni-regensburg.de 1
archiv.ub.uni-heidelberg.de 1
encyclopedia.1914-1918-online.net 1
www.repository.cam.ac.uk 1
b2share.eudat.eu 1
rucore.libraries.rutgers.edu 1
dlc.mpg.de 1
deepblue.lib.umich.edu 1
rockstore.csiro.au 1
webdatenbank.grass-medienarchiv.de 1
mdsoar.org 1
dataverse.callisto.calmip.univ-toulouse.fr 1
ascomycete.org 1
ap.elte.hu 1
depositonce.tu-berlin.de 1
nadre.ethernet.edu.et 1
www.openaccessrepository.it 1
www.crd.york.ac.uk 1
tecnoscienza.unibo.it 1
databank.ora.ox.ac.uk 1
qatest.labarchives.com 1
data.oceannetworks.ca 1
ad.e-pics.ethz.ch 1
resume.uni.lu 1
www.bindingdb.org 1
cdr.lib.unc.edu 1
resolver.caltech.edu 1
figshare.com 1
cwm-archiv.gbv.de 1
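
A table like the one above can be produced by extracting the network location from each resolved URL and counting occurrences. The sketch below uses only the Python standard library. The `count_netlocs` helper is hypothetical and the benchmark may compute the table differently. Note that `netloc` keeps an explicit port, which is consistent with entries such as `spectradspace.lib.imperial.ac.uk:8443`.

```python
from collections import Counter
from urllib.parse import urlparse


def count_netlocs(urls):
    """Count URLs per network location (host plus optional port), most frequent first."""
    counts = Counter(urlparse(url).netloc for url in urls)
    return counts.most_common()


# A few URLs taken from the error table, used here purely as sample input.
urls = [
    "https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/to-8701",
    "https://spectradspace.lib.imperial.ac.uk:8443/dspace/handle/10042/to-1328",
    "https://arxiv.org/abs/1509.07661",
]
netloc_counts = count_netlocs(urls)
for netloc, count in netloc_counts:
    print(netloc, count)
```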