Goal of the final project
The goal of this project is to explore an environmental data science question for which you do not know the answer. You will construct a research question, collect relevant data, and design a statistical analysis to answer your question. Your analysis must apply at least some of the statistical concepts you have learned in this course, including concepts covered in the second half of the class. You will summarize your findings and communicate clearly how they have or have not helped you answer your question.
Technical blog post
This blog post is a write-up that summarizes, in text and with figures and/or tables, your question, the data you have collected, your analysis plan, and your results. Your target audience is other quantitative scientists and practitioners who are familiar with the basics of statistics and data science but are not necessarily experts in environmental science or in the methods studied in this course.
Email the blog post to the professor both as a PDF and as a live link.
Some guidelines for the blog post:
• 3-5 pages in length, including figures, tables, captions, and the reference list
• 1-3 tables or figures, each carefully labeled and captioned so that they are easily interpretable
• Include scientific references when applicable
• Include links to the underlying data you use. If your data cannot be shared publicly, note this in a short “data availability” statement at the end of your post.
• [Recommended, but optional] If you can, include a link to a repository with your replication code. I will not evaluate your code as part of your grade, but this is good practice for reproducibility and transparency and will make your blog post more exciting to the outside world.
General guidelines:
• Motivate your question. Why is this important? Is there existing evidence on this question? If so, why is it inconclusive? If not, why not?
• Describe your data. Where did you access it? What are its spatial and temporal features? What are its limitations? What do you know about the sampling strategy and what biases that may introduce? If helpful, you can use a histogram, scatterplot, or summary statistics table to describe your data.
• Clearly describe your analysis plan. What analysis did you choose, and why is it appropriate given your data and question? What are its limitations?
• Summarize your results visually and in words. Show us your results in figure(s) and/or table(s) that are carefully labeled and captioned. Describe in the text (and orally when presenting) what you found, and how these results either do or do not help you answer your question.
• What might you do next? One short analysis cannot fully answer an interesting scientific question. If you had time to collect more data or conduct more analysis, what would help you answer this question better?
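As an illustration of the data description step in the guidelines above, here is a minimal sketch of a summary statistics table and histogram bin counts. It is written in Python using only the standard library, which is an assumption (this handout does not prescribe a language). The function names `summarize` and `histogram_counts` and the example values are hypothetical placeholders, not real data.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    ordered = sorted(values)
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "sd": statistics.stdev(ordered),  # sample standard deviation
        "min": ordered[0],
        "max": ordered[-1],
    }


def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning min to max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; equal-width bins are undefined")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Made-up placeholder measurements, for illustration only.
example = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
stats = summarize(example)
counts = histogram_counts(example, 3)

for name, value in stats.items():
    print(f"{name:>6}: {value}")
print("bin counts:", counts)
```

Whatever tool you use, report the units and the sample size alongside these numbers so that readers can interpret the table without referring back to the text.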