miniproject: viral epidemics and zoonoses
Which Viral Zoonoses Lead to Viral Epidemics?
SANA SAIFI
Zoonoses are diseases transmissible from animals to humans. Both new and old viral zoonoses are important in emerging and re-emerging viral diseases that lead to epidemics. Scientists estimate that more than 6 out of every 10 known infectious diseases in people can be spread from animals, and 3 out of every 4 new or emerging infectious diseases in people come from animals.
Viruses from wildlife hosts have caused such emerging high-impact diseases as severe acute respiratory syndrome (SARS), Ebola fever, and influenza in humans.
OBJECTIVE
This mini project aims to find out how, and which, zoonotic diseases lead to viral epidemics.
METHODOLOGY
- Using the communal corpus Viral Epidemic, 50 articles were downloaded using getpapers. 🟩 FINISHED
- Binary classification of the 50 articles into true positives/false positives, i.e., whether or not the articles are based on viral epidemics. 🟩 FINISHED
- Using ami search to find whether the articles mentioned any comorbidity in a viral epidemic, annotating with dictionaries to create ami DataTables. 🟩 FINISHED
- Sectioning the articles using ami section, which splits a document in a Ctree into sections based on tags from JATS, etc. 🟩 FINISHED
- Re-running the query to get a corpus of 950 articles on Viral Epidemics and Zoonoses. 🟩 FINISHED
- Scrutinizing the 950 articles for true positives and false positives and creating a spreadsheet. 🟨 STARTED
- Using ami search to create DataTables and ami section to section the 950 articles. 🟩 FINISHED
- Creating a dictionary specifically related to the mini project. 🟩 FINISHED
- Sectioning the papers on the basis of the diseases related to animals. 🟪 IN PROGRESS
- Using relevant machine learning techniques to classify the papers by whether they are related to viral epidemics and which viral zoonotic diseases were reported. 🟨 STARTED
- Displaying the results using R/KNIME. 🟥 NOT STARTED
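The true positive/false positive classification in the list above can be summarised as a precision value, TP / (TP + FP). The following is a minimal sketch, assuming the spreadsheet column has been exported as a list of `TP`/`FP` strings; the function name and the example labels are hypothetical and are not the project's real results.

```python
from collections import Counter


def precision_from_labels(labels):
    """Return TP / (TP + FP) for a list of 'TP'/'FP' labels."""
    # Normalise case and whitespace so ' tp ' and 'TP' count as the same label.
    counts = Counter(label.strip().upper() for label in labels)
    tp, fp = counts["TP"], counts["FP"]
    if tp + fp == 0:
        raise ValueError("no TP/FP labels found")
    return tp / (tp + fp)


# Hypothetical labels for illustration only, NOT the real classification.
example_labels = ["TP", "TP", "FP", "TP"]
print(f"precision = {precision_from_labels(example_labels):.2f}")  # precision = 0.75
```

A low precision would suggest that the query is pulling in too many articles that are not about viral epidemics and may need refining.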
PROGRESS
◾ Spreadsheet of 50 articles classified into the subcategories of viruses, funders, countries, year of publication, testing and tracing, and type of paper. 🟩 FINISHED
◾ Sectioning of the 950 papers using ami section 🟩FINISHED
◾ Downloaded a corpus of 950 articles on viral epidemics and zoonoses using getpapers. 🟩 FINISHED
◾ Created a dictionary with 135 entries on zoonotic disease using ami dict. 🟩 FINISHED
◾ Created a Dictionary using Wikidata Query Service and SPARQL.🟩FINISHED
◾ Ran ami search on corpus 950. 🟩 FINISHED
◾ Released corpus 950 using GitHub Desktop. 🟩 FINISHED
◾ Installed Anaconda for installing various tools, e.g., Jupyter. 🟩 FINISHED
◾ Initially, the communal corpus of 50 articles on viral epidemics was used.
getpapers -q "viral epidemics" -k 950 -o "viral epidemics" -x -p
◾ Next, a new corpus of 950 articles was created using the dictionary Zoonoses.
◾ Downloaded the corpus of 950 articles using getpapers with the command:
getpapers -q "Zoonoses in Viral epidemics" -k 950 -o "viral epidemics" -x -p
◾ This corpus was classified, searched and sectioned.
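Multi-word queries and folder names are easy to break on the command line when quotes are missed. As a rough sketch, the getpapers call above can be assembled as an argument list in Python, where each value stays a single argument. The helper name `build_getpapers_command` is hypothetical; the flags are the ones used on this page.

```python
import shlex


def build_getpapers_command(query, limit, outdir):
    """Build the getpapers argument list used on this page.

    -q query, -k maximum number of hits, -o output directory,
    -x download fulltext XML, -p download fulltext PDF.
    """
    return ["getpapers", "-q", query, "-k", str(limit), "-o", outdir, "-x", "-p"]


cmd = build_getpapers_command("Zoonoses in Viral epidemics", 950, "viral epidemics")
print(shlex.join(cmd))  # shows the command with shell-safe quoting

# To actually run it (needs getpapers installed and on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
# On Windows the npm-installed launcher may additionally need shell=True.
```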
There are three methods to upload the corpus.
- Through Visual Studio Code.
See @Ambreen's Page for the instructions
- Through COMMAND PROMPT
Prerequisite: the openVirus repository on your PC. If you do not have it, clone it with the following command.
git clone https://github.com/petermr/openVirus.git
Then run the following commands.
C:\Users\admin>cd openVirus
C:\Users\admin\openVirus> cd miniproject
C:\Users\admin\openVirus\miniproject> cd zoonoses
C:\Users\admin\openVirus\miniproject\zoonoses>git status
C:\Users\admin\openVirus\miniproject\zoonoses>dir
C:\Users\admin\openVirus\miniproject\zoonoses>git add .
C:\Users\admin\openVirus\miniproject\zoonoses>git status
C:\Users\admin\openVirus\miniproject\zoonoses>git commit -am "first commit all corpus"
C:\Users\admin\openVirus\miniproject\zoonoses>git pull
C:\Users\admin\openVirus\miniproject\zoonoses>git push
Git will ask for your username and password. After you enter them, the files will start uploading and will appear in the remote repository under your name.
- Through Github Desktop
Prerequisites: GitHub Desktop installed and a cloned openVirus repository.
- Open the folder where you cloned the repository. Open your files in CProject.
- Copy the files and paste them into the folder of your local openVirus clone where you want to commit them.
- Open the Github desktop.
- Go to 'File', then 'Add Local Repository'.
- Now, choose the openVirus repository from your system.
- Add a commit message and go to 'Commit to master'.
- After committing, go to 'Push to origin'.
- After completion of pushing the repository, your uploaded files can be viewed on the Github repository.
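The Command Prompt method above always runs the same git steps in the same order, so it can be scripted. This is only a sketch: the function name `upload_corpus` and the injectable `run` parameter are hypothetical, and it uses `git commit -m` rather than `-am` because `git add .` has already staged the files.

```python
import subprocess


def upload_corpus(repo_dir, message, run=subprocess.run):
    """Stage, commit, pull and push everything under repo_dir.

    `run` is injectable so the sequence can be dry-run without git.
    """
    steps = [
        ["git", "add", "."],
        ["git", "commit", "-m", message],
        ["git", "pull"],
        ["git", "push"],
    ]
    for step in steps:
        # check=True makes the real subprocess.run stop at the first failing step.
        run(step, cwd=repo_dir, check=True)
    return steps


# Dry run: record the commands instead of executing them.
recorded = []
upload_corpus(
    "openVirus/miniproject/zoonoses",
    "first commit all corpus",
    run=lambda cmd, **kwargs: recorded.append(cmd),
)
for cmd in recorded:
    print(" ".join(cmd))
```

Calling `upload_corpus(path, message)` without the `run` argument would execute the real git commands, so try the dry run first.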
- How to create a dictionary? (https://github.com/petermr/openVirus/wiki/Dictionary:-Zoonosis#how-i-created-)
- The test dictionary created using amidict.
- The dictionary created using the SPARQL Query Service from Wikidata.
Results
- The test dictionary created using amidict was made manually and lacked synonyms, hosts, variable names, descriptions, Wikidata links, Wikipedia links, etc.
- The dictionary created using SPARQL had descriptions, links, some synonyms, labels and ids. However, the returned results were scientific articles and journals. This needs refining, as we want the ids of zoonotic diseases/viruses.
As PMR suggested, this zoonotic disease dictionary has to be made manually.
Link to the manually made dictionary - https://github.com/petermr/openVirus/blob/master/dictionaries/zoonoses/zoonosis.xml
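When a dictionary is made by hand, a small script can help keep the entries consistent and free of duplicates. The sketch below assumes a simple layout: a `dictionary` root with a `title` attribute and one `entry` element per term carrying `term` and `name` attributes. That layout is an assumption made for illustration and should be checked against the real zoonosis.xml; the helper `build_dictionary` is hypothetical too.

```python
import xml.etree.ElementTree as ET


def build_dictionary(title, terms):
    """Return a <dictionary> element with one <entry> per unique term.

    ASSUMPTION: the element and attribute names here are illustrative;
    compare them with the real zoonosis.xml before relying on them.
    """
    root = ET.Element("dictionary", {"title": title})
    seen = set()
    for term in terms:
        key = term.strip().lower()
        if not key or key in seen:
            continue  # skip blanks and case-insensitive duplicates
        seen.add(key)
        ET.SubElement(root, "entry", {"term": key, "name": term.strip()})
    return root


# Illustrative terms only (diseases named on this page), not the real 135 entries.
dictionary = build_dictionary("zoonosis", ["Ebola fever", "Influenza", "influenza "])
print(ET.tostring(dictionary, encoding="unicode"))
```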
- nodejs and nvm for installing getpapers
- getpapers for retrieving 950 articles from EuPMC
- AMI for sectioning and searching.
- SPARQL and amidict for creating dictionaries.
- KNIME for displaying results.
AMI SECTIONING:
Sectioning of the dataset is usually done for greater precision.
- Downloaded the corpus of 950 papers using getpapers in XML, PDF and JSON format.
getpapers -q "Zoonoses in Viral epidemics" -k 950 -o "viral epidemics" -x -p
- To ease the process, made 5 subfolders of up to 200 papers each.
- To divide the content of the papers into sections of front, body, back and float groups, open the Command Prompt again and run:
ami -p <name of directory> section
- This will create a sections subfolder in the folder of each scientific paper in your directory.
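Splitting the corpus into subfolders and running `ami section` on each one can be planned with a few lines of Python. This is a sketch under stated assumptions: the paper names and the `batch1` to `batch5` folder names are placeholders, and the script only prints the commands rather than moving files or running ami.

```python
from pathlib import Path


def split_into_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def section_commands(batch_dirs):
    """Return one 'ami -p <dir> section' argument list per batch directory."""
    return [["ami", "-p", str(d), "section"] for d in batch_dirs]


# Placeholder names standing in for the 950 downloaded paper folders.
paper_ids = [f"paper{i:03d}" for i in range(950)]
batches = split_into_batches(paper_ids, 200)
print([len(b) for b in batches])  # [200, 200, 200, 200, 150]

# Hypothetical subfolder names, one per batch.
batch_dirs = [Path(f"batch{n}") for n in range(1, len(batches) + 1)]
for cmd in section_commands(batch_dirs):
    print(" ".join(cmd))
```

With 950 papers and a batch size of 200, the last subfolder holds the remaining 150 papers.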
AMI SEARCH
- Downloaded the corpus of 950 papers in XML, PDF and JSON format using the same command as above.
- To search with the country, drugs, funders and diseases dictionaries, open the Command Prompt and run:
ami -p <name of directory> search --dictionary country drugs funders diseases
- Open the directory; at the end of the folder you will find various HTML documents.
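To locate the HTML documents produced by the search without scrolling through hundreds of paper folders, the top level of the project directory can be filtered by extension. A minimal sketch follows; the helper name and the demo file names are hypothetical, and the demo runs in a temporary folder rather than on the real corpus.

```python
import tempfile
from pathlib import Path


def list_html_reports(project_dir):
    """Return the .html files directly inside project_dir, sorted by name."""
    return sorted(p for p in Path(project_dir).glob("*.html") if p.is_file())


# Demo in a temporary folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("example.html", "notes.txt"):
        (Path(tmp) / name).write_text("")
    print([p.name for p in list_html_reports(tmp)])  # ['example.html']
```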
AMI VALIDATION
Open the Command Prompt and type:
cd ami3
git pull
mvn clean install -Dmaven.test.skip=true
Wait until the build ends with BUILD SUCCESS.
NOT STARTED: KNIME, Keras, R
STARTED : dictionary
BLOCKED : .
FINISHED : downloading and installing getpapers, manual classification, list of zoonotic diseases, installing ami, maven, jdk, sectioning of corpus 950, ami search of corpus 950.