santi-souza/README.md

👋 Hi, I’m Santiago Souza Nava (@santi-souza)

👀 What I'm interested in:

  • I specialize in the intersection of Bioinformatics, Biostatistics, and Computational Biology, with a focus on machine learning applications to biological data. I’m especially enthusiastic about using AI and machine learning to analyze complex omics data and predict disease outcomes in healthcare. I am looking to contribute to multidisciplinary teams and help develop reproducible research pipelines for cutting-edge biological research.

🌱 Education:

  • MSc in Bioinformatics and Biostatistics | CEMP

    • Thesis: "Stroke: Statistical Analysis, Exploratory Data Analysis (EDA), and Machine Learning Prediction"
    • This project focused on applying statistical methods and machine learning models (including logistic regression, random forests, and gradient boosting) to predict stroke based on complex datasets.
  • BSc in Biotechnology Engineering | ORT University

    • Thesis: "Molecular Detection of Lethal White Overo Syndrome and Congenital Stationary Night Blindness in Horses"
    • I used Sanger sequencing and capillary electrophoresis to identify genetic mutations in horses, contributing to disease diagnostics in veterinary medicine.

💼 Experience: Bioinformatics and Biostatistics:

  • Conducted Exploratory Data Analysis (EDA) on healthcare datasets (stroke prediction).
  • Built predictive models using machine learning algorithms (logistic regression, random forests, gradient boosting).
  • Working knowledge of omics data analysis (RNA-Seq, WGS) for biological insights.
  • Proficient in genomic analysis tools such as FastQC, HISAT2, GATK, and DESeq2.

Programming Skills:

  • Bash/Unix
  • R (Data Analysis, Statistics, Machine Learning)
  • Python (Bioinformatics, Data Science, Machine Learning)
  • SQL (Data Management, Querying)
  • Familiar with bioinformatics pipelines and tools for high-throughput sequencing analysis.

📫 How to reach me:

  • 📧 Email: santisouza97@gmail.com
  • 🌐 Website: santisouza.my.canva.site
  • 🔗 LinkedIn: linkedin.com/in/santiagosouza

Pinned

  1. stroke-eda-ml Public

    Stroke: Statistical analysis of risk factors and creation of predictive models using machine learning

    R 1

  2. WGS-GATK Public

    Pipeline for WGS analysis following GATK best practices

    Shell

  3. python-datascience-machinelearning Public

    Python for Data Science and Machine Learning Bootcamp

    Jupyter Notebook

  4. bioinfo-ML-project Public

    Roff

  5. knight-tour Public

    This repository contains a Python solution to the Knight's Shortest Path Problem, which calculates all possible shortest paths a knight can take to travel from a start position to an end position o…

    Python

  6. python-oop-pandas Public

    Jupyter Notebook