(corresponds to v1.0)
STARGEO is a web app that allows users to identify genes that are differentially expressed between samples of their choosing. Users annotate studies in GEO to indicate which samples belong to which conditions. We've annotated many samples for their membership in specific disease or control classes. Then, for a specific query (a case versus control specification), STARGEO meta-analyzes across all the studies with relevant samples.
Here, we perform STARGEO analyses for diseases in our drug repurposing hetnet. See the Thinklab discussion for more information.
This repository depends on the starapi package. See environment.yml for the other installed packages in the environment.
The notebooks are executed in the following order:
1. retrieve-tags.ipynb retrieves the current tags from the STARGEO database. The connection details are stored in dsn.txt (private).
2. prepare_queries.ipynb prepares the STARGEO queries based on manual Disease Ontology to STARGEO tag mappings (data/DO-tag-mapping.tsv). The query specifics are stored in data/queries.tsv.
3. querier.ipynb performs the STARGEO analyses. The output for each disease is stored in data/doslim.
4. combine.ipynb aggregates the differential expression results for all diseases. data/diffex.tsv contains the significant differential expression results. data/summary.tsv shows the number of up- and down-regulated genes per disease.
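To illustrate the kind of per-disease summary described for data/summary.tsv, here is a minimal sketch using only the Python standard library. It is not the code in combine.ipynb. The column names (disease, gene_symbol, log2_fold_change), the placeholder rows, and the summarize helper are all assumptions made for illustration, and the real files may use a different schema.

```python
import csv
import io
from collections import Counter

# Made-up rows in the spirit of data/diffex.tsv. The column names and values
# are assumptions for illustration, not the repository's actual schema or data.
EXAMPLE_TSV = (
    "disease\tgene_symbol\tlog2_fold_change\n"
    "disease A\tGENE1\t1.8\n"
    "disease A\tGENE2\t-0.9\n"
    "disease A\tGENE3\t0.4\n"
    "disease B\tGENE1\t-2.1\n"
)


def summarize(tsv_text):
    """Count up- and down-regulated genes per disease (hypothetical helper)."""
    counts = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        change = float(row["log2_fold_change"])
        if change == 0:
            continue  # no direction to assign
        direction = "up" if change > 0 else "down"
        counts.setdefault(row["disease"], Counter())[direction] += 1
    # Counter returns 0 for a missing direction, so both keys are always present.
    return {d: {"up": c["up"], "down": c["down"]} for d, c in counts.items()}


summary = summarize(EXAMPLE_TSV)
for disease, c in sorted(summary.items()):
    print(disease, c["up"], c["down"], sep="\t")
```

The sign of the fold change stands in for the direction of regulation here. If the real results encode direction differently, only the line that assigns direction would need to change.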
All original content in this repository is released under CC0 1.0.