A platform to create crowd-sourced gene function gold standards with Amazon Mechanical Turk

Thyra/gold-crowd

gold-crowd

A platform to create crowd-sourced gene function gold standards with Amazon Mechanical Turk

Installation

  1. Make sure all requirements are installed: python2, pipenv, and java (required by NobleCoder; tested with openjdk 1.8).
  2. Download the repository.
  3. Change into the repository folder and run pipenv install to install the Python dependencies.
  4. Launch NobleCoder from tools/NobleCoder-1.0.jar and import the Gene Ontology (download from here) under the name go. The process.py script runs NobleCoder on your abstracts and tells it to use the ontology named "go", so if you choose a different name you will have to adapt the script.
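
Before installing, it can help to confirm that the tools from step 1 are available. The following is only a minimal sketch and not part of the repository: the helper names `find_executable` and `missing_requirements` are hypothetical, and it assumes a POSIX-like system where the tools are invoked as `python2`, `pipenv`, and `java` on your PATH.

```python
import os


def find_executable(name, path=None):
    """Return the full path of `name` if it is an executable on PATH, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_requirements(names=("python2", "pipenv", "java"), path=None):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if find_executable(name, path) is None]


if __name__ == "__main__":
    missing = missing_requirements()
    if missing:
        print("Missing requirements: " + ", ".join(missing))
    else:
        print("All requirements found.")
```

The check only looks at whether the executables exist; it does not verify versions, so a Java release other than the tested openjdk 1.8 would still pass.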

Usage

  1. Put the PubMed IDs of the abstracts you're interested in into data/pmid_list.txt.
  2. Run pipenv run python process.py.
  3. Output is written to data/abstracts and data/brat-input. Copy all files from both folders into a single folder in your brat installation. That folder also needs an annotation.conf file, which could look like this (more information here):
    [entities]
    
    Gene
    Function
    
    [relations]
    
    Does	Arg1:Gene, Arg2:Function
    Does	Arg1:Function, Arg2:Gene
    DoesNot	Arg1:Function, Arg2:Gene
    DoesNot	Arg1:Gene, Arg2:Function
    
    [attributes]
    
    [events]
    
The script also writes a file data/statistics.cvs containing the number of words, genes, and functions for each abstract.
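
The copying described in step 3 can be sketched in Python as follows. This is an illustration under assumptions, not a script shipped with the repository: the function name `collect_for_brat` is hypothetical, the target folder is whatever folder you chose inside your brat installation, and the configuration text is the example annotation.conf shown above.

```python
import os
import shutil

# Contents of the example annotation.conf shown above
# (relation names and arguments are separated by a tab).
ANNOTATION_CONF = """[entities]

Gene
Function

[relations]

Does\tArg1:Gene, Arg2:Function
Does\tArg1:Function, Arg2:Gene
DoesNot\tArg1:Function, Arg2:Gene
DoesNot\tArg1:Gene, Arg2:Function

[attributes]

[events]
"""


def collect_for_brat(target_dir, source_dirs=("data/abstracts", "data/brat-input")):
    """Copy every file from source_dirs into target_dir and add annotation.conf.

    Returns the list of copied file names. Files with the same name in both
    source folders would overwrite each other.
    """
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    copied = []
    for source_dir in source_dirs:
        for name in sorted(os.listdir(source_dir)):
            source_path = os.path.join(source_dir, name)
            if os.path.isfile(source_path):
                shutil.copy(source_path, os.path.join(target_dir, name))
                copied.append(name)
    # Write the configuration next to the copied files.
    with open(os.path.join(target_dir, "annotation.conf"), "w") as handle:
        handle.write(ANNOTATION_CONF)
    return copied
```

Run from the repository root after process.py has finished, a call such as `collect_for_brat("/path/to/brat/folder")` (placeholder path) would gather everything in one place.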
