📕 Documentation: Documentation: Dictionary.xml and DictionaryDescription.md of: eoActivity #77

Open
EmanuelFaria opened this issue Jan 28, 2020 · 9 comments

@EmanuelFaria
Collaborator

EmanuelFaria commented Jan 28, 2020

Here we describe the process of creating a [DictionaryName]DictionaryDescription.md document, which describes the contents of the dictionary named in the title of this issue. That dictionary was created (or is being created) from data collected for Oil186.

I will begin this thread by pasting the contents of the INDEX description, followed by a first-draft copy below for discussion and direction.

@EmanuelFaria
Collaborator Author

EmanuelFaria commented Jan 28, 2020

 EO Activities

ActivityDictionaryDescription.md

@EmanuelFaria
Collaborator Author

Activity Dictionary

 

A dictionary of 184 activities mentioned in the 186 test articles downloaded from PubMed.

 

File Data

 

Table Column Headings

  • title: type of data to be normalized and tagged with Wikidata ID.

  • desc: data source

  • id: CM.activities.n where n is a serialized number

  • name: The name is a human-readable string describing the concept.

  • term: The term is the precise string used to identify the concept. Name and Term are often the same.

  • wikidata: Unique identifier linked to Wikidata.org — a free and open knowledge base that can be read and edited by both humans and machines.

  • wikipedia:

 

Contents/Results

  • No. of source papers: 186

  • No. of Entries (Headers are not counted): 184

  • No. of unique activity names (including alternate spellings or synonyms): 184

  • No. of activities resolved in Wikidata: 74

  • No. of activities NOT resolved in Wikidata: 110

 

Notes:

  • No source papers are listed. Should we assume 186, or delete that from Contents/Results?

  • We need to normalize the headings across all Dictionaries

    • This is the third case where the column heading “description” means something other than "data source / method of input"

    • Capitalization

  • In this case, is the column heading “id” related to Essoil? I don’t know how to describe it here. The format is: CM.activities.n where n is a serialized number

  • I don’t know how to describe the column headings for “Wikipedia” here
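
To make the `id` format concrete, here is a minimal sketch of how identifiers of the form `CM.activities.n` could be generated and checked. The function names and the choice to start numbering at 1 are assumptions for illustration; the thread does not say how the serial numbers were actually assigned.

```python
import re

# Assumed pattern for IDs of the form CM.activities.n, where n is a serial number.
ID_PATTERN = re.compile(r"CM\.activities\.(\d+)")


def make_ids(count, start=1):
    """Return `count` serialized IDs. Starting at 1 is an assumption."""
    return [f"CM.activities.{n}" for n in range(start, start + count)]


def is_valid_id(value):
    """Return True if `value` matches the CM.activities.n format exactly."""
    return ID_PATTERN.fullmatch(value) is not None


ids = make_ids(184)
print(ids[0], ids[-1])  # prints: CM.activities.1 CM.activities.184
```

A check like `is_valid_id` could be run over the `id` column to catch entries that drifted from the format during manual editing.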

EmanuelFaria changed the title from "📕 Documentation: DictionaryDescription of: EO Activities - ActivityDictionaryDescription.md" to "📕 Documentation: DictionaryDescription of: EOActivities" on Feb 3, 2020
@EmanuelFaria
Collaborator Author

EmanuelFaria commented Feb 11, 2020

@petermr I am currently cleaning the activities.xml dictionary.

Searching Wikidata for “antiacne” I found this entry:

https://www.wikidata.org/wiki/Q143139 "therapeutic subgroup of the Anatomical Therapeutic Chemical Classification System: Anti-acne preparations"

which led me to search and find this:

https://www.wikidata.org/wiki/Q192093 "classification of active ingredients of drugs according to the organ or system on which they act and their therapeutic, pharmacological and chemical properties."

and this: https://en.wikipedia.org/wiki/Anatomical_Therapeutic_Chemical_Classification_System

Questions:

  1. In the absence of a Wikidata ID for "antiacne", should I:
    a) use no ID at all
    b) use https://www.wikidata.org/wiki/Q143139
    c) use the ID for "acne" and let users infer the "anti-" part?

  2. Should we add the Anatomical Therapeutic Chemical Classification System's IDs to the activities dictionary alongside the Wikidata IDs?
    https://www.whocc.no/atc_ddd_index/
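
For reference, the manual search described above could also be scripted. The sketch below uses Wikidata's public `wbsearchentities` API action to list candidate items for a term. The helper names, the result limit, and the `User-Agent` string are assumptions for illustration, and nothing here reflects how the lookups in this thread were actually done.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://www.wikidata.org/w/api.php"
# Hypothetical User-Agent; replace with a string identifying your own tool.
USER_AGENT = "dictionary-cleanup-sketch/0.1"


def build_search_url(term, language="en", limit=5):
    """Build a wbsearchentities request URL for `term`."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "format": "json",
        "limit": limit,
    }
    return f"{API_URL}?{urlencode(params)}"


def parse_candidates(payload):
    """Extract (id, label, description) tuples from a decoded API response."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in payload.get("search", [])
    ]


def search_wikidata(term):
    """Query Wikidata (needs network access) and return candidate items."""
    request = Request(build_search_url(term), headers={"User-Agent": USER_AGENT})
    with urlopen(request) as response:
        return parse_candidates(json.load(response))


# Example call, left commented out because it needs network access:
# for qid, label, description in search_wikidata("antiacne"):
#     print(qid, label, description)
```

A script like this only lists candidates. Choosing between options a), b) and c) for a term with no exact match remains a curation decision.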

@EmanuelFaria
Collaborator Author

EmanuelFaria commented Feb 11, 2020

Incidentally, the WHO Collaborating Centre for Drug Statistics Methodology also defines standard abbreviations for the following, which could serve as dictionaries as well.

Units

g = gram
mg = milligram
mcg = microgram
U = unit
TU = thousand units
MU = million units
mmol = millimole
ml = milliliter (e.g. eyedrops)

Route of administration (Adm.R)

Implant = Implant
Inhal = Inhalation
Instill = Instillation
N = nasal
O = oral
P = parenteral
R = rectal
SL = sublingual/buccal/oromucosal
TD = transdermal
V = vaginal
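
As a rough illustration of how these two lists could be used as lookup dictionaries, here is a minimal sketch with plain Python mappings. The variable and function names are assumptions, and the entries are copied from the lists above rather than taken from any published data file.

```python
# Abbreviation tables transcribed from the lists above.
UNITS = {
    "g": "gram",
    "mg": "milligram",
    "mcg": "microgram",
    "U": "unit",
    "TU": "thousand units",
    "MU": "million units",
    "mmol": "millimole",
    "ml": "milliliter",
}

ROUTES = {
    "Implant": "Implant",
    "Inhal": "Inhalation",
    "Instill": "Instillation",
    "N": "nasal",
    "O": "oral",
    "P": "parenteral",
    "R": "rectal",
    "SL": "sublingual/buccal/oromucosal",
    "TD": "transdermal",
    "V": "vaginal",
}


def expand(code, table):
    """Return the expansion of `code`, or None if the code is unknown.

    Lookups are case-sensitive on purpose, since the abbreviations
    above mix upper and lower case.
    """
    return table.get(code)


print(expand("mcg", UNITS))   # prints: microgram
print(expand("SL", ROUTES))   # prints: sublingual/buccal/oromucosal
```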

@petermr
Owner

petermr commented Feb 12, 2020 via email

@EmanuelFaria
Collaborator Author

OK, I will add new entries as I go. If that proves too time-consuming, I'll come back and do it after the dictionaries are cleaned, and then update them accordingly.

@EmanuelFaria
Collaborator Author

I have just finished uploading the cleaned, disambiguated and Wikidata-attributed activities dictionary, and have updated its description as well as the master INDEX of descriptions.

ActivityDictionaryDescription.md

Hallelujah.

@EmanuelFaria
Collaborator Author

activity.xml and ActivityDictionaryDescription.md are now updated and working.

I have also updated master INDEXofOIL186Dictionaries.md

EmanuelFaria changed the title from "📕 Documentation: DictionaryDescription of: EOActivities" to "📕 Documentation: Documentation: Dictionary.xml and DictionaryDescription.md of: eoActivity" on Mar 25, 2020
@EmanuelFaria
Collaborator Author

As of today, I believe this dictionary and its description document are complete. Below I will copy the contents of the description document:

EO Activity Dictionary

 

File Data

 

Table Column Headings

  • id: serialized identification number

  • term: The term is a human-readable string describing the concept.

  • wikidataID: Unique identifier linked to Wikidata.org — a free and open knowledge base that can be read and edited by both humans and machines.

  • description: short description of the activity sourced from wikidata and/or wikipedia

 

Contents/Results

  • No. of source papers: 186

  • No. of entries (Headers are not counted): 438

  • No. of unique activity names (including alternate spellings or synonyms): 438

  • No. of activities resolved in wikidata (including alternate spellings or synonyms): 340

  • Number of unique wikidata ids attributed to activities (normalizing for alternate spellings and synonyms): 250

  • No. of entries without wikidataID: 98

  • No. of entries with descriptions: 336

  • No. of entries without descriptions: 102
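
Counts like the ones listed above can be recomputed from the dictionary file instead of being tallied by hand. The sketch below assumes a layout in which each `<entry>` element carries `id`, `term`, `wikidataID` and `description` attributes, mirroring the column headings. The actual structure of the XML file is not shown in this thread, so that layout, the function name, and the sample data (including the placeholder Wikidata IDs) are all assumptions.

```python
import xml.etree.ElementTree as ET

# Made-up sample in an ASSUMED layout; the wikidataID values are placeholders.
SAMPLE = """<dictionary title="eoActivity">
  <entry id="1" term="antibacterial" wikidataID="Q-PLACEHOLDER-1" description="placeholder text"/>
  <entry id="2" term="anti-bacterial" wikidataID="Q-PLACEHOLDER-1" description="placeholder text"/>
  <entry id="3" term="antiacne"/>
  <entry id="4" term="antifungal" wikidataID="Q-PLACEHOLDER-2"/>
</dictionary>"""


def summarize(xml_text):
    """Compute the Contents/Results counts for a dictionary in the assumed layout."""
    entries = ET.fromstring(xml_text).findall("entry")
    # Keep only non-empty Wikidata IDs; synonyms may share the same ID.
    resolved = [e.get("wikidataID", "").strip() for e in entries]
    resolved = [qid for qid in resolved if qid]
    described = [e for e in entries if e.get("description", "").strip()]
    return {
        "entries": len(entries),
        "unique_terms": len({e.get("term") for e in entries}),
        "resolved": len(resolved),
        "unique_wikidata_ids": len(set(resolved)),
        "without_wikidata_id": len(entries) - len(resolved),
        "with_description": len(described),
        "without_description": len(entries) - len(described),
    }


stats = summarize(SAMPLE)
for key, value in stats.items():
    print(f"{key}: {value}")
```

The same relationships hold in the figures above: 340 resolved plus 98 unresolved gives 438 entries, and 336 with descriptions plus 102 without also gives 438.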

 

Notes:

  •  
