  • The Gamaleya National Center of Epidemiology and Microbiology

Olga-Bagrova/README.md

Olga Bagrova

⚛️Physics in my mind 🧠 🧬Biology in my heart❤️ 💻Coding in my soul🕊️

LinkedIn Gmail Telegram

This is the GitHub profile of an aspiring bioinformatician. I graduated from the Bioinformatics Institute as a biostatistician and bioinformatician, and from Lomonosov Moscow State University with BSc and MSc degrees in Biophysics. I have collaborated with virologists, biochemists, and immunologists, but my primary passion is researching the structure and properties of proteins.

Skills:

Python, Pandas, NumPy, Matplotlib, SciPy, scikit-learn, PyTorch, Biopython, R, Tidyverse, Linux, Git, Anaconda, GNU Bash

Research experience

Studying population frequencies of T-cell receptor (TCR) alleles using immune repertoire sequencing

Repo

Comprehensive analysis of the TCR repertoire for a large group of donors.

We analyzed:

  • Gene usage distribution of the TRA and TRB chains for the identification of deletions

  • Co-expression factors for V-V, J-J, V-J pairs within and between TRA and TRB chains

  • Gene usage in functional versus non-functional sequences, to identify signs of thymic selection

  • Differences in the observed patterns between populations

  • Skills: Python (scipy, statsmodels, numpy, pandas, matplotlib, seaborn, re, os), Bash, JupyterHub, Conda, Biological databases (IMGT).
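As an illustration only (this is not the code from the repository), the comparison of gene usage between functional and non-functional sequences can be sketched as a difference of relative frequencies. The function names `gene_usage` and `usage_difference` and the toy gene calls below are hypothetical; a real analysis would work on full repertoires and add a statistical test.

```python
from collections import Counter


def gene_usage(genes):
    """Return the relative frequency of each gene in a list of gene calls."""
    counts = Counter(genes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n / total for gene, n in counts.items()}


def usage_difference(functional, nonfunctional):
    """Per-gene frequency difference: functional minus non-functional."""
    func = gene_usage(functional)
    nonf = gene_usage(nonfunctional)
    all_genes = sorted(set(func) | set(nonf))
    return {g: func.get(g, 0.0) - nonf.get(g, 0.0) for g in all_genes}


# Toy data, invented for this example (not taken from any donor).
functional = ["TRBV5-1", "TRBV5-1", "TRBV7-2", "TRBV20-1"]
nonfunctional = ["TRBV5-1", "TRBV7-2", "TRBV7-2", "TRBV7-2"]

diff = usage_difference(functional, nonfunctional)
for gene, delta in diff.items():
    print(f"{gene}: {delta:+.2f}")
```

A positive value means the gene is relatively more frequent among functional sequences in this toy set, a negative value the opposite.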

Analysis of secondary structure distributions along the polypeptide chains of proteins within different functional classes, homologous proteins, and topologous proteins

Repo

Development of a new representation of proteins for comparing structures, with a focus on the distribution of secondary structures.

  • A new representation of protein molecules, based on the distributions of secondary structures along their chains, was developed

  • Proteins from PDB were divided into groups according to function and homology

  • Frequencies of occurrence of various secondary structures for the selected groups were compared

  • Skills: Python (biotite, numpy, pandas, matplotlib, seaborn, re, os), Bash, Conda, Biological databases (PDB, CATH, UniProt, NCBI, Pfam, GO).
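One possible way to picture a "distribution of secondary structures along the chain" is sketched below. It is a simplified assumption, not the representation developed in the repository: the chain is split into a fixed number of bins by relative position, and the fraction of each secondary structure type is computed per bin. The function name `sse_profile`, the labels `H`/`E`/`C` (helix, strand, coil), and the example string are all hypothetical.

```python
def sse_profile(sse, n_bins=10, labels=("H", "E", "C")):
    """Fractions of each secondary structure label per relative-position bin.

    sse    -- string with one secondary structure label per residue
    n_bins -- number of equal-width bins along the chain
    """
    counts = [{lab: 0 for lab in labels} for _ in range(n_bins)]
    n = len(sse)
    for i, lab in enumerate(sse):
        # Map residue index to a bin by its relative position in the chain.
        b = min(i * n_bins // n, n_bins - 1)
        if lab in counts[b]:
            counts[b][lab] += 1

    profile = []
    for bin_counts in counts:
        total = sum(bin_counts.values())
        profile.append(
            {lab: (c / total if total else 0.0) for lab, c in bin_counts.items()}
        )
    return profile


# Invented 10-residue example: a helix followed by coil and a strand.
profile = sse_profile("HHHHHCCEEE", n_bins=2)
for i, fractions in enumerate(profile):
    print(i, fractions)
```

Because every chain is reduced to the same number of bins, profiles of proteins with different lengths can be compared bin by bin.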

Biostatistical analysis of melanoma patients’ transcriptomic data from the open TCGA database

Repo

Analysis of transcriptomic data from melanoma samples to identify trends in gene expression, and construction of a model that predicts overall survival from expression levels.

  • Clinical and transcriptomic data of melanoma patients were studied to identify patient groups and their characteristic patterns

  • Transcriptomic signatures were selected based on a literature review and Cox regression to predict overall patient survival

  • A survival prediction model based on Cox regression was developed

  • Skills: R (survival, survminer, glmnet, ComplexHeatmap, tidyverse, gtsummary, factoextra, ggbiplot, ggplot2, ggpubr, dplyr, plotly, tibble, matrixStats), Biological databases (TCGA, GO).

Pinned

  1. RepSeq_TCRanalysis (Public)

    Bioinformatics Institute project: TCR analysis of Rep-Seq data

    Jupyter Notebook 4

  2. DistrProtStruc (Public)

    Distribution of protein structure

    Jupyter Notebook

  3. TCGA_SKCM (Public)

    Forked from flajole/TCGA_SKCM

    Analysis of the data from http://www.cbioportal.org/study/summary?id=skcm_tcga_pan_can_atlas_2018

    HTML

  4. BI_ML_2024 (Public)

    Repository for ML homework at the Bioinformatics Institute