As we look towards architecting a system that can support tracking objects from multiple source systems, it becomes important not to hard-code equivalency between our two current source systems, Scopus and PubMed.
We might want to say that a set of objects in Web of Science, Scopus, and PubMed refer to the same article, or that two out of those three do. Also, because the matching is imperfect, we may need to support manually overriding our dynamically defined logic for asserting equivalence.
I suggest we create a table called ArticleEquivalents.
Data model
id: 2344342
id-PubMed: “2342340”,
id-Scopus: “435342234”,
assertion: “SAME”,
sourceAssertion: “RECITER”,
dateAssertion: 2019051600000Z
/* Meaning: ReCiter system says these two records are equivalent. */
id: 2344342
id-Scopus: “234234110”,
id-Scopus: “435342234”,
assertion: “SAME”,
sourceAssertion: “USER”,
preferredObjectID: “435342234”,
dateAssertion: 2019051600000Z
/* Meaning: User says these two records in the same system are equivalent and identifies which article/object is preferred. */
id: 6613599
id-PubMed: “2342340”,
id-Scopus: “435342234”,
assertion: “DIFFERENT”,
sourceAssertion: “USER”,
sourceAssertion-id: “paa2013”,
dateAssertion: 2019051900000Z
/* Meaning: From some yet-to-be-created UI, a user has asserted these two records are not equivalent. Manual assertions should take precedence over automated ones. */
id: 11113423
id-PubMed: “987432”,
id-Scopus: “4928272”,
assertion: “SAME”,
sourceAssertion: “USER”,
sourceAssertion-id: “paa2013”,
dateAssertion: 2019051900000Z
/* Meaning: From some yet-to-be-created UI, a user has asserted these two records are equivalent. Manual assertions should take precedence over automated ones. */
id: 9348799
id-WebOfScience: “wos123498234”,
id-Scopus: “43534342234”,
assertion: “SAME”,
sourceAssertion: “RECITER”,
dateAssertion: 2019051900000Z
/* Meaning: ReCiter system says these two records are equivalent. */
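To make the precedence rule concrete, here is a rough Python sketch of how one of these records and the "manual beats automated" rule might look in code. It is only an illustration: the class `EquivalenceAssertion`, its field names, and the helper `effective_assertion` are hypothetical and not part of ReCiter, and breaking ties by the latest `dateAssertion` is an added assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical: an object is identified by (source system, id in that system).
ObjectRef = Tuple[str, str]


@dataclass(frozen=True)
class EquivalenceAssertion:
    """Hypothetical in-memory shape of one ArticleEquivalents record."""
    id: int
    left: ObjectRef                 # e.g. ("PubMed", "2342340")
    right: ObjectRef                # e.g. ("Scopus", "435342234")
    assertion: str                  # "SAME" or "DIFFERENT"
    source_assertion: str           # "RECITER" or "USER"
    date_assertion: str             # kept as an opaque, sortable string
    source_assertion_id: Optional[str] = None
    preferred_object_id: Optional[str] = None

    def pair(self) -> frozenset:
        # Order-independent key, so (A, B) and (B, A) are the same pair.
        return frozenset((self.left, self.right))


def effective_assertion(
    rows: Iterable[EquivalenceAssertion], a: ObjectRef, b: ObjectRef
) -> Optional[str]:
    """Return the assertion that wins for the pair (a, b), or None."""
    wanted = frozenset((a, b))
    matching = [r for r in rows if r.pair() == wanted]
    if not matching:
        return None
    # USER assertions outrank RECITER ones; within the same source,
    # the most recent date wins (assumption, not stated in the issue).
    best = max(
        matching,
        key=lambda r: (r.source_assertion == "USER", r.date_assertion),
    )
    return best.assertion


rows = [
    EquivalenceAssertion(2344342, ("PubMed", "2342340"), ("Scopus", "435342234"),
                         "SAME", "RECITER", "2019051600000Z"),
    EquivalenceAssertion(6613599, ("PubMed", "2342340"), ("Scopus", "435342234"),
                         "DIFFERENT", "USER", "2019051900000Z",
                         source_assertion_id="paa2013"),
]
print(effective_assertion(rows, ("Scopus", "435342234"), ("PubMed", "2342340")))
# prints: DIFFERENT
```

Storing each object as a (source, id) pair rather than one column per source is just one option; it avoids the repeated `id-Scopus` key in the same-system example, but the column-per-source layout above would work too.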
How lookup and storage of Scopus articles should change
Candidate articles are retrieved for a given user.
PubMed to Scopus matching logic is invoked. It would help to clearly define all the matching logic so that others can expand on it and add mappings between other sources. We would want a PubMed-Scopus mapping file, a WebOfScience-Scopus mapping file, etc. I talked to Jie about this briefly; he suggests that maybe this should be a separate service.
ReCiter comes up with a series of equivalency statements, e.g., PMID1 = ScopusDocID2, PMID3 = ScopusDocID4, etc.
ReCiter stores new Scopus records in the ScopusArticle table using ScopusDocID as the primary key.
ReCiter stores equivalency relationships in ArticleEquivalents (see above data model).
ReCiter now needs to use PubMed and Scopus metadata to compute suggestions.
Current way: start with the PMID and look it up in the ScopusArticle table.
New way: use the PMID to look up the equivalent ScopusDocID in the ArticleEquivalents table, skipping any pair that a manual assertion marks as not equivalent (i.e., assertion=“DIFFERENT”). Then take the ScopusDocID and look it up in the ScopusArticle table.
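The "new way" lookup could be sketched roughly as follows. This is a minimal Python illustration that uses a list of dictionaries in place of the ArticleEquivalents table and a dictionary keyed by ScopusDocID in place of the ScopusArticle table. The function name `find_scopus_article`, the row keys, and the placeholder article metadata are all hypothetical.

```python
from typing import Dict, List, Optional


def find_scopus_article(
    pmid: str,
    equivalents: List[dict],
    scopus_articles: Dict[str, dict],
) -> Optional[dict]:
    """Resolve a PMID to a Scopus record via equivalence assertions."""
    # Only rows that link this PMID to some Scopus document are relevant.
    rows = [
        r for r in equivalents
        if r.get("id_pubmed") == pmid and r.get("id_scopus") is not None
    ]
    # Scopus documents a user has explicitly marked as NOT equivalent.
    rejected = {
        r["id_scopus"] for r in rows
        if r["assertion"] == "DIFFERENT" and r["source_assertion"] == "USER"
    }
    candidates = [
        r for r in rows
        if r["assertion"] == "SAME" and r["id_scopus"] not in rejected
    ]
    # Try manual (USER) assertions before automated (RECITER) ones.
    candidates.sort(key=lambda r: r["source_assertion"] != "USER")
    for row in candidates:
        article = scopus_articles.get(row["id_scopus"])
        if article is not None:
            return article
    return None


equivalents = [
    {"id_pubmed": "2342340", "id_scopus": "435342234",
     "assertion": "SAME", "source_assertion": "RECITER"},
    {"id_pubmed": "2342340", "id_scopus": "435342234",
     "assertion": "DIFFERENT", "source_assertion": "USER"},
    {"id_pubmed": "987432", "id_scopus": "4928272",
     "assertion": "SAME", "source_assertion": "USER"},
]
# Placeholder metadata, keyed by ScopusDocID.
scopus_articles = {
    "435342234": {"title": "placeholder A"},
    "4928272": {"title": "placeholder B"},
}

print(find_scopus_article("2342340", equivalents, scopus_articles))  # None
print(find_scopus_article("987432", equivalents, scopus_articles))
```

In this sketch a user's DIFFERENT assertion always blocks the pair; how a later manual SAME assertion on the same pair should be handled is left open.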
paulalbert1
changed the title
Use a separate table, ArticleEquivalents, for asserting and updating equivalence between source systems
Use a separate table, ArticleEquivalents, for asserting and overriding equivalence between source systems
May 16, 2019