Use a separate table, ArticleEquivalents, for asserting and overriding equivalence between source systems #342

Open
paulalbert1 opened this issue May 16, 2019 · 0 comments

Problem

As we look toward architecting a system that can track objects from multiple source systems, it becomes important not to hard-code equivalence between our two current source systems, Scopus and PubMed.

We might want to say that a set of objects in Web of Science, Scopus, and PubMed all refer to the same article, or that two out of those three do. Also, because the matching is imperfect, we may need to support manually overriding our dynamically defined logic for asserting equivalence.

I suggest we create a table called ArticleEquivalents.

Data model

id: 2344342
  id-PubMed: “2342340”,
  id-Scopus: “435342234”,
  assertion: “SAME”,
  sourceAssertion: “RECITER”,
  dateAssertion: 2019051600000Z

/* Meaning: ReCiter system says these two records are equivalent. */
  
id: 2344342
  id-Scopus: “234234110”,
  id-Scopus: “435342234”,
  assertion: “SAME”,
  sourceAssertion: “USER”,
  preferredObjectID: “435342234”,
  dateAssertion: 2019051600000Z

/* Meaning: User says these two records in the same system are equivalent and identifies which article/object is preferred. */

  
id: 6613599
  id-PubMed: “2342340”,
  id-Scopus: “435342234”,
  assertion: “DIFFERENT”,
  sourceAssertion: “USER”,
  sourceAssertion-id: “paa2013”,
  dateAssertion: 2019051900000Z

/* Meaning: From some yet-to-be-created UI, a user has asserted that these two records are not equivalent. Manual assertions should take precedence over automated ones. */


id: 11113423
  id-PubMed: “987432”,
  id-Scopus: “4928272”,
  assertion: “SAME”,
  sourceAssertion: “USER”,
  sourceAssertion-id: “paa2013”,
  dateAssertion: 2019051900000Z

/* Meaning: From some yet-to-be-created UI, a user has asserted that these two records are equivalent. Manual assertions should take precedence over automated ones. */


id: 9348799
  id-WebOfScience: “wos123498234”,
  id-Scopus: “43534342234”,
  assertion: “SAME”,
  sourceAssertion: “RECITER”,
  dateAssertion: 2019051900000Z
  
/* Meaning: ReCiter system says these two records are equivalent. */
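
As a minimal sketch of how these records and the precedence rule might look in code (Python is used purely for illustration): the class name `ArticleEquivalent`, its field names, and the function `effective_assertion` are hypothetical. Breaking ties by the latest `dateAssertion` is an assumption; the only rule stated above is that manual assertions take precedence over automated ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArticleEquivalent:
    """One pairwise assertion, mirroring the example records above."""
    id: int
    source_a: str            # e.g. "PubMed"
    id_a: str
    source_b: str            # e.g. "Scopus"
    id_b: str
    assertion: str           # "SAME" or "DIFFERENT"
    source_assertion: str    # "RECITER" (automated) or "USER" (manual)
    date_assertion: str      # kept as a string; assumed to sort chronologically
    source_assertion_id: Optional[str] = None
    preferred_object_id: Optional[str] = None

    def pair(self) -> frozenset:
        """Order-independent key for the two objects being compared."""
        return frozenset({(self.source_a, self.id_a), (self.source_b, self.id_b)})


def effective_assertion(records, obj_a, obj_b):
    """Return the record that wins for a pair, or None if nothing is asserted.

    Manual (USER) assertions beat automated ones.
    Assumption: among records of the same kind, the latest date wins.
    """
    key = frozenset({obj_a, obj_b})
    matching = [r for r in records if r.pair() == key]
    if not matching:
        return None
    return max(matching, key=lambda r: (r.source_assertion == "USER", r.date_assertion))


# Usage with two of the example records above.
records = [
    ArticleEquivalent(2344342, "PubMed", "2342340", "Scopus", "435342234",
                      "SAME", "RECITER", "2019051600000Z"),
    ArticleEquivalent(6613599, "PubMed", "2342340", "Scopus", "435342234",
                      "DIFFERENT", "USER", "2019051900000Z", "paa2013"),
]
winner = effective_assertion(records, ("PubMed", "2342340"), ("Scopus", "435342234"))
print(winner.assertion)  # DIFFERENT
```

Keying each record on an unordered pair of (source, id) tuples lets the same shape cover cross-system pairs, such as PubMed and Scopus, as well as two records within one system.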

How lookup and storage of Scopus articles should change

  1. Candidate articles are retrieved for a given user.

  2. PubMed to Scopus matching logic is invoked. It would help to define all the matching logic clearly so that others can expand on it and add mappings between other sources. We would want a PubMed-Scopus mapping file, a WebOfScience-Scopus mapping file, etc. I talked to Jie about this briefly; he suggests that this could be a separate service.

  3. ReCiter comes up with a series of equivalency statements, e.g., PMID1 = ScopusDocID2, PMID3 = ScopusDocID4, etc.

  4. ReCiter stores new Scopus records in ScopusArticle table using ScopusDocID as primary key.

  5. ReCiter stores equivalency relationships in ArticleEquivalents (see above data model).

  6. ReCiter now needs to use PubMed and Scopus metadata to compute suggestions.

  • Current way: start with the PMID and look it up in the ScopusArticle table.
  • New way: use the PMID to look up the equivalent ScopusDocID in the ArticleEquivalents table, then take that ScopusDocID and look it up in the ScopusArticle table. The one exception is when there is a manual assertion that a pair of articles are not equivalent (i.e., assertion=“DIFFERENT”); in that case the lookup is skipped for that pair.
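
The "new way" lookup could be sketched roughly as follows. Plain lists and dictionaries stand in for the ArticleEquivalents and ScopusArticle tables; the function name `scopus_article_for_pmid` and the placeholder article titles are hypothetical. The sketch also assumes that any manual DIFFERENT assertion acts as a veto for that pair, which is one possible reading of the precedence rule.

```python
def scopus_article_for_pmid(pmid, equivalents, scopus_articles):
    """Resolve a PMID to its Scopus record via ArticleEquivalents.

    equivalents: list of dicts with keys "id-PubMed", "id-Scopus",
                 "assertion", and "sourceAssertion".
    scopus_articles: dict keyed by ScopusDocID (stand-in for ScopusArticle).
    Returns the Scopus record, or None if no usable equivalent exists.
    """
    rows = [e for e in equivalents
            if e.get("id-PubMed") == pmid and "id-Scopus" in e]

    # Scopus IDs that a user has manually declared DIFFERENT from this PMID.
    rejected = {e["id-Scopus"] for e in rows
                if e["sourceAssertion"] == "USER" and e["assertion"] == "DIFFERENT"}

    # Keep SAME assertions that were not vetoed; try manual ones first.
    same = [e for e in rows
            if e["assertion"] == "SAME" and e["id-Scopus"] not in rejected]
    same.sort(key=lambda e: e["sourceAssertion"] != "USER")

    for e in same:
        article = scopus_articles.get(e["id-Scopus"])
        if article is not None:
            return article
    return None


# Usage with records from the data model; titles are placeholders.
equivalents = [
    {"id": 2344342, "id-PubMed": "2342340", "id-Scopus": "435342234",
     "assertion": "SAME", "sourceAssertion": "RECITER"},
    {"id": 6613599, "id-PubMed": "2342340", "id-Scopus": "435342234",
     "assertion": "DIFFERENT", "sourceAssertion": "USER"},
    {"id": 11113423, "id-PubMed": "987432", "id-Scopus": "4928272",
     "assertion": "SAME", "sourceAssertion": "USER"},
]
scopus_articles = {
    "435342234": {"title": "placeholder A"},
    "4928272": {"title": "placeholder B"},
}

print(scopus_article_for_pmid("2342340", equivalents, scopus_articles))  # None
print(scopus_article_for_pmid("987432", equivalents, scopus_articles))   # {'title': 'placeholder B'}
```

The first call returns nothing because the manual DIFFERENT assertion overrides ReCiter's automated SAME for that pair; the second follows the manual SAME assertion through to the ScopusArticle stand-in.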

paulalbert1 changed the title from “Use a separate table, ArticleEquivalents, for asserting and updating equivalence between source systems” to “Use a separate table, ArticleEquivalents, for asserting and overriding equivalence between source systems” on May 16, 2019.