I created this repository to share some of the materials and opinions I have generated over the years working as a bioinformatician in Pat Edger's lab and as an intern at Syngenta.
- Best Practices for High-Quality Data Science
- Purpose: This was created for the 2024 MSU Plant Genomics REU to introduce fundamental concepts of reproducibility, data management, code anti-patterns, and version control. My brother was instrumental in introducing me to many topics such as version control, Makefiles, and TeX. I wish I had learned about these topics much sooner, as they have made my job a lot easier and more fulfilling, and I hope to do the same for the REU students.
- Audience: A mixed group of undergraduates with minimal research experience and little to no knowledge of bioinformatics.
pull_request_pattern_form.md
- This is a small sample document that I like to use for pull requests on GitHub/GitLab.
regular_document.tex
- I like to use this for generating small reports in LaTeX.
template_python.py
- This is a good starter template for bioinformatics work; it uses `argparse` and the `logging` module. Its primary purpose is to give beginner bioinformaticians a skeleton to build on.
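For readers who have not seen `argparse` and `logging` used together, here is a minimal sketch of what such a skeleton might look like. It is not the contents of `template_python.py`; the FASTA-counting task, the `fasta` argument, the `--verbose` flag, and the function names are all hypothetical and chosen only for illustration.

```python
"""Hypothetical starter script: count the records in a FASTA file."""
import argparse
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def parse_args(argv=None):
    """Parse command-line arguments (argv=None means use sys.argv[1:])."""
    parser = argparse.ArgumentParser(description="Count records in a FASTA file.")
    # Both arguments are illustrative assumptions, not part of the real template.
    parser.add_argument("fasta", type=Path, help="path to the input FASTA file")
    parser.add_argument(
        "-v", "--verbose", action="store_true", help="enable debug-level logging"
    )
    return parser.parse_args(argv)


def count_records(fasta_path):
    """Return the number of header lines (those starting with '>')."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def main(argv=None):
    """Run the script and return an exit code (0 on success, 1 on error)."""
    args = parse_args(argv)
    logging.basicConfig(
        level=logging.DEBUG if args.verbose else logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    if not args.fasta.is_file():
        logger.error("Input file not found: %s", args.fasta)
        return 1
    n_records = count_records(args.fasta)
    logger.info("Found %d records in %s", n_records, args.fasta)
    return 0
```

To run this as a script, end the file with `if __name__ == "__main__": raise SystemExit(main())`. Keeping the logic in `main(argv)` rather than at module level means the same code can be imported and tested without touching the real command line.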
weekly_report.md
- I originally created this to facilitate communication between me and an undergraduate I was mentoring while in Pat's lab. I find it particularly useful when working on complex coding projects that have multiple moving parts. I also used it at Syngenta to provide regular updates to my manager and team in lieu of meetings. The document is also good for keeping a record of planned and completed work.