This repository contains the code needed to implement all the examples shown in our paper:
A network perspective on the ecology of gut microbiota and progression of Type 2 Diabetes: linkages to keystone taxa in a Mexican cohort
We obtained the data from a Mexican cohort of Type 2 Diabetes (T2D) patients described in Diener et al. (2021). Abundances and taxonomy of the samples can be found here.
We organized the data into separate tables according to T2D status: healthy, impaired fasting glycemia (IFG), impaired glucose tolerance (IGT), IFG + IGT, type 2 diabetes (DT2), and treated type 2 diabetes (DT2_treated). You can find these files in the folder data.
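As a rough illustration of this grouping step, the sketch below splits sample identifiers by clinical status using only the Python standard library. The column names `sample_id` and `status`, the function `group_samples_by_status`, and the inline toy metadata are assumptions made for the example; they are not taken from the files in the data folder.

```python
import csv
import io
from collections import defaultdict

# Hypothetical metadata: one row per sample with its clinical status.
# Column names and values are illustrative assumptions, not the real files.
METADATA_CSV = """sample_id,status
S1,healthy
S2,IFG
S3,healthy
S4,DT2
"""


def group_samples_by_status(metadata_file):
    """Return {status: [sample_id, ...]} from a CSV with sample_id and status columns."""
    groups = defaultdict(list)
    for row in csv.DictReader(metadata_file):
        groups[row["status"]].append(row["sample_id"])
    return dict(groups)


groups = group_samples_by_status(io.StringIO(METADATA_CSV))
for status, samples in sorted(groups.items()):
    print(status, samples)
```

Each list of sample identifiers could then be used to select the matching columns of the abundance table and write one table per group.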
- SparCC
  - Python = 2.7
  - NumPy = 1.16.5
  - Pandas = 0.22.0
- R packages
  - tidyverse
  - visNetwork
  - edgeR
  - phyloseq
The following figure presents the methods used in our paper. Every step is detailed in the following sections.
We used SparCC to infer microbial interactions for each study group. Run the following command in your terminal to infer the microbial networks:
bash net_inference_sparcc.sh
Run the following R script to get the nodes and edges of the network for each study group:
Note: you will also need to run an additional Python script, which is available in notebook format.
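To make the idea of extracting nodes and edges concrete, here is a minimal sketch that turns a correlation matrix and a matching p-value matrix into an edge list. It is not the repository's script: the function `build_edge_list`, the thresholds `min_abs_corr=0.3` and `max_p=0.05`, and the toy matrices are all assumptions chosen for illustration.

```python
def build_edge_list(taxa, corr, pvals, min_abs_corr=0.3, max_p=0.05):
    """Return (nodes, edges), keeping taxon pairs that pass both thresholds.

    The default thresholds are illustrative assumptions, not values from the paper.
    """
    edges = []
    for i in range(len(taxa)):
        # Upper triangle only: the matrices are symmetric.
        for j in range(i + 1, len(taxa)):
            if abs(corr[i][j]) >= min_abs_corr and pvals[i][j] <= max_p:
                edges.append({"from": taxa[i], "to": taxa[j], "weight": corr[i][j]})
    # Nodes are the taxa that take part in at least one retained edge.
    nodes = sorted({e["from"] for e in edges} | {e["to"] for e in edges})
    return nodes, edges


# Toy input, made up for the example.
taxa = ["Bacteroides", "Prevotella", "Roseburia"]
corr = [[1.0, -0.6, 0.1],
        [-0.6, 1.0, 0.4],
        [0.1, 0.4, 1.0]]
pvals = [[0.0, 0.01, 0.50],
         [0.01, 0.0, 0.20],
         [0.50, 0.20, 0.0]]

nodes, edges = build_edge_list(taxa, corr, pvals)
print(nodes)
print(edges)
```

In this toy case only the Bacteroides and Prevotella pair passes both filters, so the third taxon does not appear as a node.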
Run the following R script; it creates an HTML file with the network visualization:
In this step, we used NetShift to evaluate the topological features of each network and to compare networks between clinical conditions. To do so, upload the filtered tables obtained during the network preparation step. NetShift has an online interface, which you can find here.
We performed the following comparisons between groups:
Control | Case |
---|---|
Healthy | IFG |
IFG | IGT |
IGT | IFG + IGT |
IFG + IGT | T2D |
T2D | T2D_treated |
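The comparisons in the table pair each clinical stage with the next one. A short sketch of how those control/case pairs could be generated is shown below; the list name `STAGES` is an assumption for the example, and its order is simply the order of the table.

```python
# Clinical stages in the order used by the comparison table.
STAGES = ["Healthy", "IFG", "IGT", "IFG + IGT", "T2D", "T2D_treated"]

# Pair each stage (control) with the stage that follows it (case).
comparisons = list(zip(STAGES[:-1], STAGES[1:]))

for control, case in comparisons:
    print(f"{control} vs {case}")
```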
In this step, we used edgeR to infer how the abundance of gut microbiota is related to the clinical status of patients. The analysis was carried out at the genus level using the same groups described above.
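Working at the genus level means that counts of finer-grained taxa are summed within each genus before the differential abundance test. The sketch below shows only that aggregation step, not the edgeR analysis itself; the function `collapse_to_genus`, the taxon identifiers, and the counts are hypothetical.

```python
from collections import defaultdict


def collapse_to_genus(counts, genus_of):
    """Sum per-taxon counts into per-genus counts for each sample."""
    collapsed = {}
    for sample, taxon_counts in counts.items():
        per_genus = defaultdict(int)
        for taxon, n in taxon_counts.items():
            per_genus[genus_of[taxon]] += n
        collapsed[sample] = dict(per_genus)
    return collapsed


# Toy data, made up for the example.
genus_of = {"otu1": "Bacteroides", "otu2": "Bacteroides", "otu3": "Prevotella"}
counts = {
    "S1": {"otu1": 10, "otu2": 5, "otu3": 2},
    "S2": {"otu1": 0, "otu2": 7, "otu3": 9},
}

genus_counts = collapse_to_genus(counts, genus_of)
print(genus_counts)
```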
We used the UpSetR library to generate presence/absence plots from the network nodes of each clinical condition.
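An UpSet plot is built from a presence/absence table: one row per taxon, one column per condition, with 1 where the taxon is a node in that condition's network. The sketch below builds such a table from per-condition node sets. The function `presence_absence` and the node sets are assumptions for illustration; they are not results from the paper.

```python
def presence_absence(nodes_by_condition):
    """Return {taxon: {condition: 0 or 1}} for every taxon seen in any condition."""
    conditions = list(nodes_by_condition)
    all_taxa = sorted(set().union(*nodes_by_condition.values()))
    return {
        taxon: {c: int(taxon in nodes_by_condition[c]) for c in conditions}
        for taxon in all_taxa
    }


# Toy node sets, made up for the example.
nodes_by_condition = {
    "Healthy": {"Bacteroides", "Roseburia"},
    "IFG": {"Bacteroides", "Prevotella"},
}

matrix = presence_absence(nodes_by_condition)
for taxon, row in matrix.items():
    print(taxon, row)
```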
Please follow the links below to visualize the results.
The networks use the following shape code for each group:
State | Shape |
---|---|
Healthy | Square |
IFG | Triangle |
IGT | Diamond |
IFG + IGT | Star |
T2D | Dot |
T2D_Treated | Triangle down |
The color of each node indicates its phylum; the phyla are labeled in each network graph.
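The shape code can be written as a simple lookup that tags every node of a group with its shape. In the sketch below, the shape strings follow the node shape names used by visNetwork (for example `triangleDown`); treat that spelling, the dictionary `SHAPE_BY_STATE`, and the helper `add_shape` as assumptions of the example rather than the repository's actual code.

```python
# State-to-shape lookup, mirroring the shape table.
# The string values are assumed to match visNetwork node shape names.
SHAPE_BY_STATE = {
    "Healthy": "square",
    "IFG": "triangle",
    "IGT": "diamond",
    "IFG + IGT": "star",
    "T2D": "dot",
    "T2D_Treated": "triangleDown",
}


def add_shape(nodes, state):
    """Return copies of the node records with the shape for the given state added."""
    shape = SHAPE_BY_STATE[state]
    return [dict(node, shape=shape) for node in nodes]


# Toy node records, made up for the example.
igt_nodes = add_shape([{"id": "Bacteroides"}, {"id": "Prevotella"}], "IGT")
print(igt_nodes)
```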