Welcome to the ALS Biomarker Identification Project repository. This project, conducted in collaboration with Aliaksei at Paris-Saclay University, aims to identify potential biomarkers associated with Amyotrophic Lateral Sclerosis (ALS) through RNA-Seq data analysis.
The primary objective is to use advanced computational techniques to identify genes with significant expression differences between ALS patients and a Non-Neurological control group. By identifying these biomarkers, we aim to contribute to early diagnosis and targeted interventions for ALS.
- Data Preprocessing: Parsed RNA-Seq data into a DataFrame, split samples, and organized data using an object-oriented approach.
- Descriptive Analysis: Analyzed and visualized the distribution of gene expression across samples, highlighting differences between ALS and control groups.
- Dimensionality Reduction Techniques: Used Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE) to visualize complex patterns and clusters.
- Advanced Analysis Techniques: Conducted univariate analyses and used the PyDESeq2 library to identify genes with significant expression differences.
- Model Tuning and Generalization: Applied normalization and ElasticNet model tuning to accurately identify ALS-impacted genes.
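To make the univariate step above more concrete, here is a minimal sketch that ranks genes by a Welch t-statistic computed on log2-transformed counts. It is an illustration only and not the code in the notebooks: the gene names, the counts, the group labels ("ALS", "CTRL"), and the helpers `welch_t` and `rank_genes` are all hypothetical, and it uses only the Python standard library rather than PyDESeq2.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_fold_change: float  # mean log2(ALS) minus mean log2(CTRL)
    t_statistic: float       # Welch's t on the log2-transformed counts


def welch_t(a, b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0:
        # No within-group variation: the statistic is degenerate.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def rank_genes(counts, groups, pseudocount=1.0):
    """Rank genes by absolute Welch t-statistic, largest first.

    counts: dict mapping gene name -> list of counts, one per sample
    groups: list of labels ("ALS" or "CTRL") in the same sample order
    """
    results = []
    for gene, values in counts.items():
        # The pseudocount avoids log2(0) for genes with zero counts.
        logged = [math.log2(v + pseudocount) for v in values]
        als = [x for x, g in zip(logged, groups) if g == "ALS"]
        ctrl = [x for x, g in zip(logged, groups) if g == "CTRL"]
        lfc = statistics.mean(als) - statistics.mean(ctrl)
        results.append(GeneResult(gene, lfc, welch_t(als, ctrl)))
    return sorted(results, key=lambda r: abs(r.t_statistic), reverse=True)


# Hypothetical toy data: three genes, three ALS and three control samples.
groups = ["ALS", "ALS", "ALS", "CTRL", "CTRL", "CTRL"]
counts = {
    "GENE_A": [120, 135, 150, 30, 28, 35],  # higher in ALS
    "GENE_B": [50, 52, 47, 49, 51, 50],     # roughly unchanged
    "GENE_C": [10, 12, 9, 40, 44, 38],      # lower in ALS
}

ranked = rank_genes(counts, groups)
for r in ranked:
    print(f"{r.gene}: log2FC={r.log2_fold_change:.2f}, t={r.t_statistic:.2f}")
```

A per-gene test like this ignores library-size normalization and dispersion modeling, which is one reason a dedicated tool such as PyDESeq2 is preferable for real RNA-Seq counts. With thousands of genes, the resulting statistics would also need a multiple-testing correction before any gene is called significant.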
Analysis of Amyotrophic Lateral Sclerosis RNA-Seq
To get started with this project:
- Clone this repository to your local machine.
git clone https://github.com/demic-dev/als-biomarker-identification-project.git
- From the repository root, create and activate a conda environment.
conda create -n bio python=3.9 -y
conda activate bio
- Install the necessary dependencies listed in requirements.txt.
pip install -r requirements.txt
- Explore the Jupyter notebooks in the analysis directory to understand our methodology and findings.
We thank Paris-Saclay University for their resources and support. We also appreciate the guidance and expertise of our mentor and collaborators.
The data comes from the study "Postmortem Cortex Samples Identify Distinct Molecular Subtypes of ALS: Retrotransposon Activation, Oxidative Stress, and Activated Glia".