Background
Those who wish to do clustering currently need to use Java to retrieve existing candidate records for a user. To facilitate alternate clustering approaches, we should allow third parties to retrieve these candidate records through an API service.
Requirements
Please create a new API called Candidate Article Retrieval, with this URL: /reciter/candidate-article-retrieval/by/uid. It should be in the re-citer-controller.
Parameters should be personIdentifier, useGoldStandard, and retrievalRefreshFlag.
The articles returned should be consistent with what is returned by Feature Generator API, except that in this case we return all candidate records irrespective of score (the scoring phase hasn't happened yet).
retrievalRefreshFlag and useGoldStandard should behave the same as they do for Feature Generator API. For useGoldStandard, the value determines whether records stored in the GoldStandard table are also included.
The resulting data should be stateless, meaning that these data are not saved anywhere.
Authentication can be done using the same key that authenticates for Feature Generator API.
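To make the requirements above concrete, here is a minimal client-side sketch in Python of how a third party might build a request to the proposed endpoint. Only the URL path and the three parameter names come from this issue. The base URL, the "api-key" header name, and the example parameter values are assumptions for illustration; the real header and accepted values should match whatever Feature Generator API already uses.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Path proposed in this issue.
ENDPOINT_PATH = "/reciter/candidate-article-retrieval/by/uid"


def build_candidate_request(base_url, api_key, person_identifier,
                            use_gold_standard, retrieval_refresh_flag):
    """Build (but do not send) a GET request for the proposed endpoint."""
    query = urlencode({
        "personIdentifier": person_identifier,
        "useGoldStandard": use_gold_standard,
        "retrievalRefreshFlag": retrieval_refresh_flag,
    })
    url = base_url.rstrip("/") + ENDPOINT_PATH + "?" + query
    # Assumption: the key is sent in an "api-key" header, mirroring
    # however Feature Generator API is authenticated in a given deployment.
    return Request(url, headers={"api-key": api_key}, method="GET")


# Hypothetical host, key, and flag values, for illustration only.
request = build_candidate_request(
    "https://reciter.example.org", "PLACEHOLDER_KEY",
    "meb7002", "AS_EVIDENCE", "FALSE")
print(request.full_url)
```

Sending the request would then be a matter of passing it to urllib.request.urlopen and decoding the body with json.load. Because the endpoint is meant to be stateless, the client is responsible for storing anything it wants to keep.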
Sample data
Here's the proposed output for the user meb7002. For brevity, it shows only 1 of the 42 articles the API would customarily return.
{
"personIdentifier": "meb7002",
"dateRun": "2020-10-27T22:14:44.038+00:00",
"countArticles": 42,
"reCiterArticleFeatures": [
{
"pmid": 24694772,
"pmcid": "PMC4180817",
"publicationDateDisplay": "2014 Mar 30",
"publicationDateStandardized": "2014-03-30",
"datePublicationAddedToEntrez": "2014-04-04",
"doi": "10.1016/j.jbi.2014.03.013",
"publicationType": {
"publicationTypeCanonical": "Academic Article",
"publicationTypePubMed": [
"Journal Article"
],
"publicationTypeScopus": {
"publicationTypeScopusAbbreviation": "ar",
"publicationTypeScopusLabel": "Article"
}
},
"timesCited": 6,
"citesCitedBy": {
"cites": ["25046832","20016547","23362505","20072710","19567789","14728536","20083443"],
"citedBy": ["25046832","29339930"],
"type": "pmid"
},
"publicationAbstract": "OBJECTIVE: Publications are a key data source for investigator profiles and research networking systems. We developed ReCiter, an algorithm that automatically extracts bibliographies from PubMed using institutional information about the target investigators. METHODS: ReCiter executes a broad query against PubMed, groups the results into clusters that appear to constitute distinct author identities and selects the cluster that best matches the target investigator. Using information about investigators from one of our institutions, we compared ReCiter results to queries based on author name and institution and to citations extracted manually from the Scopus database. Five judges created a gold standard using citations of a random sample of 200 investigators. RESULTS: About half of the 10,471 potential investigators had no matching citations in PubMed, and about 45% had fewer than 70 citations. Interrater agreement (Fleiss' kappa) for the gold standard was 0.81. Scopus achieved the best recall (sensitivity) of 0.81, while name-based queries had 0.78 and ReCiter had 0.69. ReCiter attained the best precision (positive predictive value) of 0.93 while Scopus had 0.85 and name-based queries had 0.31. DISCUSSION: ReCiter accesses the most current citation data, uses limited computational resources and minimizes manual entry by investigators. Generation of bibliographies using named-based queries will not yield high accuracy. Proprietary databases can perform well but requite manual effort. Automated generation with higher recall is possible but requires additional knowledge about investigators.",
"articleKeywords": [
{
"keyword": "Abstracting and Indexing",
"type": "MESH_MAJOR",
"count": null
},
{
"keyword": "Algorithms",
"type": "MESH_MAJOR",
"count": 199455
},
{
"keyword": "Authorship",
"type": "MESH_MAJOR",
"count": 5259
},
{
"keyword": "Data Mining",
"type": "MESH_MAJOR",
"count": 12200
},
{
"keyword": "Natural Language Processing",
"type": "MESH_MAJOR",
"count": 3223
},
{
"keyword": "Pattern Recognition, Automated",
"type": "MESH_MAJOR",
"count": 35284
},
{
"keyword": "PubMed",
"type": "MESH_MAJOR",
"count": 39308
}
],
"journalCategory": {
"journalCategoryID": 36,
"journalCategoryLabel": "Medical Informatics"
},
"grantIdentifiers": [
"UL1 RR024996",
"UL1 TR000040",
"UL1 TR000457"
],
"scopusDocID": "84907990604",
"journalTitleVerbose": "Journal of biomedical informatics",
"issn": [
{
"issntype": "Electronic",
"issn": "1532-0480"
},
{
"issntype": "Linking",
"issn": "1532-0464"
}
],
"journalTitleISOabbreviation": "J Biomed Inform",
"articleTitle": "Automatic generation of investigator bibliographies for institutional research networking systems.",
"reCiterArticleAuthorFeatures": [
{
"rank": 1,
"lastName": "Johnson",
"firstName": "Stephen B",
"initials": "S",
"affiliations": {
"affiliationStatementLabel": "Department of Public Health, Weill Cornell Medical College, New York, United States. Electronic address: johnsos@med.cornell.edu.",
"affiliationStatementLabelSource": "PUBMED",
"affiliationInstitutions": [
{
"affiliationInstitutionLabel": "Weill Cornell Medicine",
"affiliationInstitutionId": 60007997,
"affiliationInstitutionSource": "SCOPUS"
}
]
},
"email": "johnsos@med.cornell.edu",
"targetAuthor": false
},
{
"rank": 2,
"lastName": "Bales",
"firstName": "Michael E",
"initials": "M",
"affiliations": {
"affiliationStatementLabel": "Department of Biomedical Informatics, Columbia University, New York, United States.",
"affiliationStatementLabelSource": "PUBMED",
"affiliationInstitutions": [
{
"affiliationInstitutionLabel": "Columbia University in the City of New York",
"affiliationInstitutionId": 60030162,
"affiliationInstitutionSource": "SCOPUS"
}
]
},
"targetAuthor": true
},
{
"rank": 3,
"lastName": "Dine",
"firstName": "Daniel",
"initials": "D",
"affiliations": {
"affiliationStatementLabel": "Department of Biomedical Informatics, Columbia University, New York, United States; The Irving Institute for Clinical and Translational Research, Columbia University, New York, United States.",
"affiliationStatementLabelSource": "PUBMED",
"affiliationInstitutions": [
{
"affiliationInstitutionLabel": "Columbia University in the City of New York",
"affiliationInstitutionId": 60030162,
"affiliationInstitutionSource": "SCOPUS"
}
]
},
"targetAuthor": false
},
{
"rank": 4,
"lastName": "Bakken",
"firstName": "Suzanne",
"initials": "S",
"affiliations": {
"affiliationStatementLabel": "Department of Biomedical Informatics, Columbia University, New York, United States; The Irving Institute for Clinical and Translational Research, Columbia University, New York, United States.",
"affiliationStatementLabelSource": "PUBMED",
"affiliationInstitutions": [
{
"affiliationInstitutionLabel": "Columbia University in the City of New York",
"affiliationInstitutionId": 60030162,
"affiliationInstitutionSource": "SCOPUS"
}
]
},
"targetAuthor": false
},
{
"rank": 5,
"lastName": "Albert",
"firstName": "Paul J",
"initials": "P",
"affiliations": {
"affiliationStatementLabel": "Samuel J. Wood Library, Weill Cornell Medical College, New York, United States.",
"affiliationStatementLabelSource": "PUBMED",
"affiliationInstitutions": [
{
"affiliationInstitutionLabel": "Weill Cornell Medicine",
"affiliationInstitutionId": 60007997,
"affiliationInstitutionSource": "SCOPUS"
}
]
},
"targetAuthor": false
},
{
"rank": 6,
"lastName": "Weng",
"firstName": "Chunhua",
"initials": "C",
"affiliations": {
"affiliationStatementLabel": "Department of Biomedical Informatics, Columbia University, New York, United States; The Irving Institute for Clinical and Translational Research, Columbia University, New York, United States.",
"affiliationStatementLabelSource": "PUBMED",
"affiliationInstitutions": [
{
"affiliationInstitutionLabel": "Columbia University in the City of New York",
"affiliationInstitutionId": 60030162,
"affiliationInstitutionSource": "SCOPUS"
}
]
},
"targetAuthor": false
}
],
"volume": "51",
"pages": "8-14"
}
]
}
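As a rough illustration of how a third party could consume this output for an alternate clustering approach, the sketch below groups PMIDs by the institution ID attached to the author flagged as targetAuthor. The field names are taken from the sample above. The grouping rule itself is a toy stand-in chosen for brevity, not ReCiter's clustering logic, and the function names are hypothetical.

```python
from collections import defaultdict


def target_author(article):
    """Return the author entry flagged targetAuthor, or None if absent."""
    for author in article.get("reCiterArticleAuthorFeatures", []):
        if author.get("targetAuthor"):
            return author
    return None


def group_pmids_by_target_institution(response):
    """Group PMIDs by the institution ID(s) of the target author.

    Articles with no target author or no institution go under the key None.
    """
    groups = defaultdict(list)
    for article in response.get("reCiterArticleFeatures", []):
        author = target_author(article) or {}
        affiliations = author.get("affiliations") or {}
        institutions = affiliations.get("affiliationInstitutions") or []
        ids = [inst.get("affiliationInstitutionId") for inst in institutions]
        for institution_id in ids or [None]:
            groups[institution_id].append(article["pmid"])
    return dict(groups)


# A trimmed-down version of the sample response above.
sample = {
    "personIdentifier": "meb7002",
    "reCiterArticleFeatures": [{
        "pmid": 24694772,
        "reCiterArticleAuthorFeatures": [
            {"rank": 1, "lastName": "Johnson", "targetAuthor": False},
            {"rank": 2, "lastName": "Bales", "targetAuthor": True,
             "affiliations": {"affiliationInstitutions": [
                 {"affiliationInstitutionId": 60030162}]}},
        ],
    }],
}
groups = group_pmids_by_target_institution(sample)
print(groups)  # {60030162: [24694772]}
```

A real clustering approach would likely draw on more of the returned features (keywords, grant identifiers, co-authors, journal category), but the access pattern would be the same.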