Machine Learning with Microbiome Data Workshop

Guest speaker: Dr Giovanni Birolo, University of Turin (Italy)

Workshop Overview

In this workshop, you will:

Learn how to import a typical microbiome dataset into Python.
Explore the dataset's structure, including metadata, feature tables, and taxonomy tables.
Use statistical and machine learning libraries to classify and predict labels within your dataset.

We will focus on a dataset from Pat Schloss's lab that examines the murine gut microbiome to understand how community membership and structure change over time.

⚠️ We used this dataset with Matteo Calgaro.

Pre-requisites

Before attending, please make sure you have a basic understanding of Python and some familiarity with its data manipulation libraries, such as pandas and numpy.

Tools and Libraries

We will use the following Python libraries:

numpy: For numerical operations.
pandas: For data manipulation and analysis.
scikit-learn (imported as sklearn): For applying machine learning techniques.
seaborn and matplotlib: For data visualization.

Ensure these libraries are installed and up to date in your Python environment before the workshop.

Dataset Overview

The dataset comprises several tables reflecting different aspects of the microbiome study:

Metadata: Attributes for each sample, including sample identifiers and labels.
Feature Table: Abundances of each feature/species in each sample (sometimes referred to as an OTU table).
Taxonomy Table: Taxonomic classification for each feature.
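
As a rough illustration of how these three tables relate, the sketch below loads tiny in-memory stand-ins with pandas and aligns them. The column names (`sample`, `group`, `feature`, `genus`), the OTU identifiers, and the values are all hypothetical; the actual workshop files may be laid out differently.

```python
import io

import pandas as pd

# Toy stand-ins for the three tables (hypothetical names and values).
metadata_tsv = "sample\tgroup\nS1\tearly\nS2\tearly\nS3\tlate\n"
features_tsv = "sample\tOtu001\tOtu002\nS1\t10\t0\nS2\t7\t3\nS3\t1\t12\n"
taxonomy_tsv = "feature\tgenus\nOtu001\tLactobacillus\nOtu002\tBacteroides\n"

# With real files, pass a file path instead of io.StringIO(...).
metadata = pd.read_csv(io.StringIO(metadata_tsv), sep="\t", index_col="sample")
features = pd.read_csv(io.StringIO(features_tsv), sep="\t", index_col="sample")
taxonomy = pd.read_csv(io.StringIO(taxonomy_tsv), sep="\t", index_col="feature")

# Keep only samples present in both tables, in the same order.
shared = metadata.index.intersection(features.index)
metadata, features = metadata.loc[shared], features.loc[shared]

# Every feature column should have a row in the taxonomy table.
assert features.columns.isin(taxonomy.index).all()

# Convert raw counts to per-sample relative abundances (rows sum to 1).
rel_abundance = features.div(features.sum(axis=1), axis=0)
print(rel_abundance.round(2))
```

Indexing the metadata and feature table by sample identifier, and the taxonomy table by feature identifier, makes it easy to check that the tables agree before any analysis starts.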

Structure

The workshop is structured as follows:

Introduction to Microbiome Data: Understanding the structure and content of microbiome datasets.
Data Importing and Exploration: Loading datasets into Python and performing initial explorations.
Visual Data Exploration: Using PCA for visual exploration of sample similarity.
Statistical Analysis: Applying statistical tests to uncover significant differences in the data.
Machine Learning Applications: Building and evaluating predictive models using machine learning.
Model Evaluation: Assessing model performance with cross-validation.
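
To give a feel for the PCA and cross-validation steps listed above, here is a minimal sketch on synthetic count data. The data, the log transform with a pseudocount, and the choice of a random forest classifier are assumptions made for illustration; they are not necessarily what the workshop notebooks use.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic count table: 40 samples x 20 features, two groups of 20 samples.
labels = np.array([0] * 20 + [1] * 20)
counts = rng.poisson(lam=20, size=(40, 20)).astype(float)
# Raise the first 5 features in group 1 so the groups differ.
counts[labels == 1, :5] += rng.poisson(lam=30, size=(20, 5))

# Relative abundances, then a log transform with a small pseudocount.
rel = counts / counts.sum(axis=1, keepdims=True)
X = np.log(rel + 1e-6)

# PCA for visual exploration: project each sample onto 2 components.
pca = PCA(n_components=2)
coords = pca.fit_transform(X)
print("explained variance ratio:", pca.explained_variance_ratio_)

# Stratified 5-fold cross-validation keeps the class balance in each fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, labels, cv=cv, scoring="accuracy")
print("mean accuracy:", scores.mean())
```

The `coords` array can be passed to a matplotlib or seaborn scatter plot, coloured by `labels`, to see whether the groups separate. Cross-validation scores on real microbiome data will usually be noisier than on this deliberately easy synthetic example.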

Installation Instructions

Please ensure you have a recent version of Python installed. You can download and install the necessary libraries using pip (or conda; an environment file is shared in this folder):

pip install numpy pandas scikit-learn matplotlib seaborn
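
One possible way to confirm that the installation worked is to import each library and print its version. This is only a convenience sketch, not part of the workshop material; note that scikit-learn is imported under the name `sklearn`.

```python
import importlib

# Import names; scikit-learn is installed as "scikit-learn" but imported as "sklearn".
modules = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn"]

versions = {}
for name in modules:
    try:
        module = importlib.import_module(name)
        versions[name] = getattr(module, "__version__", "unknown")
    except ImportError:
        versions[name] = None  # library is not installed

for name, version in versions.items():
    print(f"{name}: {version if version else 'NOT INSTALLED'}")
```

Any library reported as not installed can be added by rerunning the pip command above.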
