
Computational Strategies in Nutrigenetics: Constructing a Reference Dataset of Nutrition-Associated Genetic Polymorphisms https://doi.org/10.1101/2023.08.04.23293659


johndef64/GRPM_system


GRPM System

The GRPM (Gene-Rsid-Pmid-Mesh) System is a tool for integrating and analyzing genetic polymorphism data associated with specific biomedical subjects. It comprises five modules that cover data retrieval, merging, analysis, and the incorporation of GWAS data.


Overview

Introduction

GRPM System is a Python framework for building a comprehensive dataset of human genetic polymorphisms associated with nutrition. By combining data from multiple sources and using MeSH terms as the organizing framework, the workflow lets researchers search the genetic literature for variants significantly associated with a specific biomedical subject. The resource was developed primarily to assist nutritionists in investigating gene-diet interactions and implementing personalized nutrition interventions.

Graphical Abstract

Modules

The GRPM System comprises five modules that perform various tasks to facilitate the integration and analysis of genetic polymorphism data associated with nutrition. These modules are as follows:

To try out GRPM System, run each module separately by clicking its "Open in Colab" badge. Make sure to import all the necessary dependencies and files. An option to sync with a Google Drive folder is available.

Each Jupyter notebook includes the code to download and install the requirements it needs to run.

No. Notebook Module Description
1. Open In Colab Dataset Builder Retrieves data from LitVar and PubMed databases, merging them into a CSV format.
2. Open In Colab MeSH Selection for Retrieval Defines a coherent MeSH term list for information retrieval over the whole GRPM Dataset using NLP.
3. Open In Colab GRPM Dataset MeSH Query Uses MeSH terms to query the GRPM Dataset. It extracts the subset of matched entities and produces a data report.
4. Open In Colab GRPM Data Analyzer Analyzes retrieved data and calculates survey metrics. Data visualization through matplotlib and seaborn.
5. Open In Colab GRPM-GWAS Data Integration Integrates GWAS data, associating GWAS phenotypes and potential risk/effect alleles with the GRPM Dataset.

GRPM System: Integrating Genetic Polymorphism Data with PMIDs and MeSH Terms to Retrieve Genes and rsIDs for Biomedical Research Fields. GRPM Dataset: pcg, protein-coding genes; rna, RNA genes; pseudo, pseudogenes; in parentheses, dataset shape.

These modules provide a comprehensive framework for researchers and nutritionists to explore genetic polymorphism data and gain insights into gene-diet interactions and personalized nutrition interventions.
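The core idea behind the MeSH query (Module 3) can be illustrated with a minimal sketch: gene-rsID-PMID rows are kept only when the PMID is annotated with one of the queried MeSH terms. This is not the notebook's actual code. The gene names, rsIDs, PMIDs, and the function `query_by_mesh` are hypothetical placeholders, and the sketch uses only the standard library so it runs anywhere, although pandas is listed in the requirements.

```python
from collections import defaultdict

# Placeholder gene-rsID-PMID rows (hypothetical values, not real GRPM data).
GRPM_ROWS = [
    ("GENE_A", "rs0000001", "PMID_1"),
    ("GENE_A", "rs0000001", "PMID_2"),
    ("GENE_B", "rs0000002", "PMID_3"),
    ("GENE_C", "rs0000003", "PMID_4"),
]

# Placeholder PMID -> MeSH term annotations (hypothetical assignments).
PMID_MESH = {
    "PMID_1": {"Obesity", "Diet"},
    "PMID_2": {"Body Mass Index"},
    "PMID_3": {"Folic Acid", "Diet"},
    "PMID_4": {"Lactose Intolerance"},
}


def query_by_mesh(rows, pmid_mesh, mesh_query):
    """Return {gene: {rsid: set of PMIDs}} for rows whose PMID
    carries at least one of the queried MeSH terms."""
    query = set(mesh_query)
    matches = defaultdict(lambda: defaultdict(set))
    for gene, rsid, pmid in rows:
        # Keep the row only if its PMID shares a term with the query.
        if pmid_mesh.get(pmid, set()) & query:
            matches[gene][rsid].add(pmid)
    # Convert nested defaultdicts to plain dicts for a clean result.
    return {gene: dict(rsids) for gene, rsids in matches.items()}


report = query_by_mesh(GRPM_ROWS, PMID_MESH, ["Diet", "Folic Acid"])
for gene, rsids in sorted(report.items()):
    for rsid, pmids in sorted(rsids.items()):
        print(gene, rsid, "matching PMIDs:", len(pmids))
```

Counting matching PMIDs per gene or rsID, as in the final loop, is one simple way to rank entities by how much of the literature supports the association with the queried subject.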

Updates

The GRPM Dataset available on Zenodo is a snapshot of LitVar1. LitVar1 is now deprecated and has been fully replaced by LitVar2. Module 1 (Dataset Builder) has been updated to retrieve data from LitVar2. The subsequent modules in the pipeline remain functional and can be tested using the original version of the GRPM Dataset available on Zenodo.

Installation

To install GRPM System, clone the repository to your local machine:

git clone https://github.com/johndef64/GRPM_system.git

Alternatively, run each module separately in Google Colab, mounting Google Drive to save your progress between sessions.
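A minimal sketch of choosing a working directory that persists on Google Drive when running in Colab and falls back to a local folder elsewhere. `drive.mount` is the standard Colab call; the helper `resolve_workdir` and the folder names `grpm_workdir` and `GRPM_system` are assumptions made for illustration, and the notebooks may organize their files differently.

```python
import os


def resolve_workdir(local_dir="grpm_workdir", drive_subdir="GRPM_system"):
    """Return a directory for saving outputs: a Google Drive folder
    when running in Colab, a local folder otherwise."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        workdir = local_dir
    else:
        # Prompts for authorization the first time it runs.
        drive.mount("/content/drive")
        workdir = os.path.join("/content/drive/MyDrive", drive_subdir)
    os.makedirs(workdir, exist_ok=True)
    return workdir


workdir = resolve_workdir()
print("Saving outputs to:", workdir)
```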

Usage

Detailed instructions for each module of GRPM System are given in the corresponding Jupyter notebook in the repository. Follow those instructions and install the Python packages specified for each module.

Requirements

GRPM System has the following requirements:

  • Python 3.9 or above
  • pandas
  • requests
  • biopython
  • nbib
  • beautifulsoup4
  • openai
  • matplotlib
  • seaborn
  • nltk
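Before running a module locally, the requirements above can be checked with a short script. This is a sketch rather than part of the repository: the helper `missing_modules` is hypothetical, and the list uses import names, which differ from package names in two cases (biopython is imported as `Bio`, and Beautiful Soup as `bs4`).

```python
import sys
from importlib.util import find_spec

# Import names for the listed packages (biopython -> Bio, beautifulsoup -> bs4).
REQUIRED_MODULES = [
    "pandas", "requests", "Bio", "nbib", "bs4",
    "openai", "matplotlib", "seaborn", "nltk",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the import names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


python_ok = sys.version_info >= (3, 9)
missing = missing_modules()

print("Python 3.9 or above:", "yes" if python_ok else "no")
print("Missing modules:", ", ".join(missing) if missing else "none")
```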
