MiniProject:Testing and tracing in viral epidemics
Vanisha Arora
Om Prakash
The project aims to extract information about the various tests used to diagnose viral infections during an epidemic, and to cover the various ways of preventing further spread.
- "Contact tracing" is the process of identification of persons who may have come into contact with an infected person ("contacts") and subsequent collection of further information about these contacts
The goals of contact tracing are:
- Interrupting ongoing transmission and reducing the spread of an infection.
- Offering diagnosis, counseling and treatment to already infected individuals.
- If the infection is treatable, helping prevent reinfection of the originally infected patient.
This miniproject is based on extracting information about the testing and tracing carried out during viral epidemics.
🟨 Conduct binary classification on communal corpus "EpidemicnoCov50" and create a spreadsheet.
🟨 Creating dictionaries for the project using AMI. Searching for the test names for disease diagnosis in the scientific literature.
🟨 Dictionary:Testing and Tracing Dictionary (https://github.com/petermr/openVirus/blob/master/cambiohack2020/dictionaries/testTrace.xml)
🟨 Downloading a corpus of 250 articles using getpapers, with the query "Testing and tracing in viral epidemics".
🟨 Using ami section for sectioning the papers.
Command used:
ami -p <name of directory> section
(<name of directory> is the directory containing the corpus)
🟨 Running ami search on the corpus to find the dictionary terms in the papers.
Command used:
ami -p <name of directory> search --dictionary test_trace
🟨 Multi dictionary search.
🟨 Jupyter notebooks
🟨 Annotation of the corpus.
🟨 Machine learning
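As a rough picture of what the dictionary search step in the list above does, the sketch below counts how often each dictionary term occurs in a piece of text. It is only an illustration in plain Python, not how ami search is implemented; the function name count_terms, the sample text and the sample terms are assumptions made for the example.

```python
import re
from collections import Counter


def count_terms(text, terms):
    """Count case-insensitive whole-phrase matches of each term in text."""
    counts = Counter()
    for term in terms:
        # \b keeps a short term such as "PCR" from matching inside a longer word
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = re.findall(pattern, text, flags=re.IGNORECASE)
        if hits:
            counts[term] = len(hits)
    return counts


# Illustrative text and terms (assumptions, not taken from the project corpus)
sample_text = (
    "Contact tracing was combined with RT-PCR testing. "
    "Contact tracing teams followed up each case."
)
sample_terms = ["contact tracing", "RT-PCR", "ELISA"]

term_counts = count_terms(sample_text, sample_terms)
for term, n in term_counts.items():
    print(term, n)
```

Terms that never occur are simply left out of the result, which is similar in spirit to an empty data table when none of the dictionary terms appear in the papers.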
The following steps were used to create the dictionary:
- Create a text file (.txt) containing a list of terms related to "Testing and contact tracing" (from Wikipedia or from research papers). The terms include the tests used to diagnose the virus during an epidemic as well as tracing terms.
- Meanwhile, create a directory by giving this command in the command prompt: mkdir mydictionaries (this is the output directory where the dictionary will be written).
- Open the command prompt and give the command:
amidict -v --dictionary testing_and_tracing --directory mydictionaries --input test_trace.txt create --informat list --outformats xml,html
- The input file in this syntax is the .txt file with the terms.
- After the above command is given, the dictionary takes a while to be created.
- Open the folder 'mydictionaries' on the system; the dictionary is created as both an XML and an HTML file.
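The first step above (preparing the term list) can also be done with a few lines of Python. This is a minimal sketch, assuming the list format is one term per line; the helper name write_term_list and the sample terms are hypothetical, and the file is written to a temporary directory here rather than next to mydictionaries.

```python
import tempfile
from pathlib import Path


def write_term_list(terms, path):
    """Write unique, non-empty terms to path, one per line, in first-seen order."""
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        key = term.lower()  # treat "Contact tracing" and "contact tracing" as the same term
        if term and key not in seen:
            seen.add(key)
            cleaned.append(term)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


# Illustrative terms only (assumption); the real list comes from Wikipedia and papers
raw_terms = ["contact tracing", "RT-PCR", " ELISA ", "Contact tracing", ""]

out_file = Path(tempfile.mkdtemp()) / "test_trace.txt"
cleaned = write_term_list(raw_terms, out_file)
print(cleaned)
```

Removing duplicates and blank lines before running amidict keeps the resulting dictionary free of repeated entries.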
Link to the dictionary: https://github.com/petermr/openVirus/blob/master/dictionaries/test/test_trace.xml
This dictionary includes only names and terms. Adding other attributes requires a different syntax.
The command given was:
amidict -v --dictionary testTrace --directory mydictionaries --input test_trace.txt create --informat list --outformats xml,html --wikilinks wikipedia, wikidata
Link to the dictionary: https://github.com/petermr/openVirus/blob/master/cambiohack2020/dictionaries/testTrace.xml
The above dictionary also includes Wikidata IDs, URLs and descriptions. It is valid.
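To check which entries of a dictionary still lack a Wikidata ID, the XML can be read with Python's standard library. The sketch below assumes a layout of `entry` elements with `term` and `wikidataID` attributes inside a `dictionary` root; that layout, the sample XML and the placeholder ID are assumptions for illustration, so compare them with the actual testTrace.xml before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the assumed layout; the ID is a placeholder, not a real Wikidata ID
SAMPLE_XML = """<dictionary title="testTrace">
  <entry term="contact tracing" name="contact tracing" wikidataID="Q_PLACEHOLDER_1"/>
  <entry term="RT-PCR" name="RT-PCR"/>
</dictionary>"""


def entries_missing_wikidata(xml_text):
    """Return the terms of all entries that have no wikidataID attribute."""
    root = ET.fromstring(xml_text)
    return [e.get("term") for e in root.iter("entry") if not e.get("wikidataID")]


missing = entries_missing_wikidata(SAMPLE_XML)
print(missing)
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`.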
Tried committing through: ✔️ Github desktop
If you are using GitHub Desktop to commit, follow these steps:
- Install Github desktop from : https://desktop.github.com
- Clone the openVirus repository onto your system using the Git Bash command line:
git clone https://github.com/petermr/openVirus.git
- Open the folder where you want to upload your CProject.
- Paste your project into the folder in the openVirus repository (our remote repository) where you want to commit the files.
- Open the Github desktop.
- Go to 'File', then 'Add Local Repository'.
- Now, choose the openVirus repository from your system.
- Add a commit message and go to 'Commit to master'.
- After committing, go to 'Push to origin'.
- After completion of pushing the repository, your uploaded files can be viewed on the Github repository.
Issues faced:
- Existence of a lock file in the repository, which has to be deleted to proceed. ✔️ Suggestion: delete the lock file, if there is one, before starting to commit, to avoid wasting time.
- Connectivity issue : A good internet connection is required.
- Used the git pull command in Git Bash to download the corpus for running ami section and ami search:
git pull path.git
This command showed an error on my Windows system, so I used the command prompt to clone the repository instead.
PMR SUGGESTION: Start working on a small corpus to make things easier and avoid wasting time.
- GETPAPERS for downloading the corpus.
- AMI for creating the dictionary and sectioning the corpus.
- KNIME for data extraction and binary classification.
- KNIME, R for analysis.
- Jupyter notebooks
****Usage of knime and R
****Usage of jupyter notebooks
****Language variants
****Ami section for sectioning of corpus and Ami search for searching the tests in the corpus.
****Dictionary modification. Annotating the corpus
****Adding more attributes to the dictionary.
**** PREVIOUS BLOCKER (solved now)
Ami search: testing and tracing are very rarely mentioned in the papers, so the search did not find the tests and the data tables were empty. Searching the same corpus for funders or countries did give results.
Tried ami search on corpora of 100 and 150 papers as well, but the data tables were still empty.
**PMR: I agree. This is a hard search. I think we need to collect terms iteratively from Wikipedia, from papers and gradually build a multi-term query. Use Wikipedia's "Contact tracing" as a good source of words and phrases. Currently, not blocked on anything.
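PMR's suggestion of gradually building a multi-term query can be sketched as follows. This is a minimal example, assuming the search service accepts boolean OR and double-quoted phrases; the helper name build_or_query and the example terms are hypothetical and not the project's actual query.

```python
def build_or_query(terms):
    """Join terms with OR, putting multi-word phrases in double quotes."""
    parts = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip blank entries from the term list
        parts.append(f'"{term}"' if " " in term else term)
    return "(" + " OR ".join(parts) + ")"


# Example terms only (assumption); grow this list as new phrases are collected
query = build_or_query(["contact tracing", "quarantine", "case finding"])
print(query)
```

Each time new words or phrases are collected from Wikipedia or from papers, they can be appended to the list and the query rebuilt.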
**** Binary classification for EpidemicnoCov50
**** Creating a Corpus of 250 papers.
**** Dictionary:
- With wikidataIDs and URLs: https://github.com/petermr/openVirus/blob/master/cambiohack2020/dictionaries/testTrace.xml
- With only names and terms: https://github.com/petermr/openVirus/blob/master/dictionaries/test/test_trace.xml
**** AMI sectioning.
**** Ami search: Results in the form of data tables.
**** Multi dictionary search against a small corpus. The dictionaries funders, drugs, test trace, diseases and country were searched against the corpus. Got the data tables and co-occurrence graphs.
**** Dictionary validation.
**** Corpus commit by PMR.
**** Annotation of the 50 papers with 12 false positives and 38 true positives.
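The annotation counts in the last item give a quick measure of how much of the corpus is on topic. The sketch below works it out as precision, assuming the 38 true positives and 12 false positives together make up the 50 annotated papers; the function name is hypothetical.

```python
def precision(true_positives, false_positives):
    """Fraction of retrieved papers that were annotated as relevant."""
    retrieved = true_positives + false_positives
    if retrieved == 0:
        raise ValueError("no annotated papers")
    return true_positives / retrieved


# Counts taken from the annotation of the 50 papers
corpus_precision = precision(38, 12)
print(f"{corpus_precision:.2f}")  # 38 / 50 = 0.76
```

So about three quarters of the annotated papers were judged relevant to testing and tracing.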