# Example code for instructor
fruits = ["apple", "banana", "cherry"]
print("Original list:", fruits)
@@ -512,7 +512,7 @@ Task 2: Dicti
Update the quantity of an existing item.
Print the final inventory.
-
+
# Example code for instructor
inventory = {
    "apples": 50,
@@ -549,7 +549,7 @@ Task
Find and print the intersection of the two sets.
Add a new element to the evens set.
-
+
# Example code for instructor
evens = {2, 4, 6, 8, 10}
odds = {1, 3, 5, 7, 9}
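As a rough illustration of the two set operations this task asks for, the sketch below intersects two sets and then adds an element to one of them. The `primes` set and the added value `12` are assumptions chosen for the example; they are not taken from the answer key.

```python
# Illustrative data; `primes` is a hypothetical second set for this sketch
evens = {2, 4, 6, 8, 10}
primes = {2, 3, 5, 7}

# Intersection: elements present in both sets
common = evens & primes  # same result as evens.intersection(primes)
print("Intersection:", common)

# add() mutates the set in place; adding an existing element changes nothing
evens.add(12)
print("Evens after add:", sorted(evens))
```

Sets are unordered, so `sorted()` is used only to make the printed result predictable.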
@@ -583,7 +583,7 @@ odds
Use a list comprehension to remove duplicates.
Print the results of both methods.
-
+
# Example code for instructor
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 5]
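The truncated hunk does not show the key's own solution, so the following is only a minimal sketch of two common ways to remove duplicates, assuming the two methods being compared are a `set()` conversion and a list comprehension. The names `unique_via_set`, `seen`, and `unique_via_comprehension` are hypothetical.

```python
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Method 1: a set drops duplicates; sorted() restores a predictable order
unique_via_set = sorted(set(numbers))

# Method 2: a list comprehension that keeps the first occurrence of each value.
# set.add() returns None, so `not seen.add(n)` is always True and only records n.
seen = set()
unique_via_comprehension = [n for n in numbers if n not in seen and not seen.add(n)]

print(unique_via_set)
print(unique_via_comprehension)
```

The comprehension preserves the original order of first appearance, which matters when the input is not already sorted.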
diff --git a/docs/course-materials/answer-keys/3b_control_flows-key.html b/docs/course-materials/answer-keys/3b_control_flows-key.html
index 81bd3eb..5de93ae 100644
--- a/docs/course-materials/answer-keys/3b_control_flows-key.html
+++ b/docs/course-materials/answer-keys/3b_control_flows-key.html
@@ -470,10 +470,10 @@ Task 1: Simpl
Otherwise, print “Enjoy the pleasant weather!”
-
+
temperature = 20
-
+
if temperature > 25:
    print("It's a hot day, stay hydrated!")
else:
@@ -497,10 +497,10 @@ Task 2: Grade Clas
Below 60: “F”
-
+
score = 85
-
+
if score >= 90:
    grade = 'A'
elif score >= 80:
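An `if`/`elif` chain is evaluated top to bottom and stops at the first true condition, which is why the thresholds are listed from highest to lowest. The sketch below wraps the idea in a hypothetical `classify` function; only the 90 and 80 thresholds and the "below 60 is F" rule are visible in this diff, so the "B", "C", and "D" labels and the 70 and 60 cut-offs are assumptions.

```python
def classify(score):
    """Return a letter grade; the 70 and 60 cut-offs are assumed for this sketch."""
    if score >= 90:
        return "A"
    elif score >= 80:  # only reached when score < 90
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"


for s in (95, 85, 59):
    print(s, classify(s))
```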
@@ -528,7 +528,7 @@ Task 3: Counting She
Use a for loop with the range() function
Print each number followed by “sheep”
-
+
for i in range(1, 6):
    print(f"{i} sheep")
@@ -548,10 +548,10 @@ Task 4: Sum of Numbe
Use a for loop with the range() function to add each number to total
After the loop, print the total
-
+
total = 0
-
+
for i in range(1, 11):
    total = total + i
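One way to sanity-check an accumulator loop like this is to compare it against the built-in `sum()` and the closed form n(n + 1)/2. The following is a small sketch of that check, not part of the answer key.

```python
total = 0
for i in range(1, 11):  # range stops before 11, so i runs from 1 to 10
    total += i

# Cross-checks: built-in sum() and the closed form n * (n + 1) / 2 with n = 10
matches = total == sum(range(1, 11)) == 10 * 11 // 2
print(total, matches)
```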
@@ -573,10 +573,10 @@ Task 5: Countdown
After each print, decrease the countdown by 1
When the countdown reaches 0, print “Blast off!”
-
+
countdown = 5
-
+
while countdown > 0:
    print(countdown)
    # (-= is a Python syntax shortcut inherited from C)
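The loop above only terminates because the counter shrinks on every pass; without the decrement, the condition `countdown > 0` would never become false. Here is a self-contained sketch of the whole countdown. The `lines` list is a hypothetical helper added so the output can be inspected; it is not in the answer key.

```python
countdown = 5
lines = []  # hypothetical helper list that collects the output

while countdown > 0:
    lines.append(str(countdown))
    countdown -= 1  # equivalent to countdown = countdown - 1

lines.append("Blast off!")
print("\n".join(lines))
```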
diff --git a/docs/course-materials/answer-keys/3d_pandas_series-key.html b/docs/course-materials/answer-keys/3d_pandas_series-key.html
index ff2f163..a092e0c 100644
--- a/docs/course-materials/answer-keys/3d_pandas_series-key.html
+++ b/docs/course-materials/answer-keys/3d_pandas_series-key.html
@@ -440,7 +440,7 @@ Resources
Setup
First, let’s import the necessary libraries and create a sample Series.
-
+
import pandas as pd
import numpy as np
@@ -463,7 +463,7 @@ Exercise 1: C
apple: $0.5, banana: $0.3, cherry: $1.0, date: $1.5, elderberry: $2.0
-
+
# Create a Series called 'prices' with the same index as 'fruits'
# Use these prices: apple: $0.5, banana: $0.3, cherry: $1.0, date: $1.5, elderberry: $2.0
prices = pd.Series([0.5, 0.3, 1.0, 1.5, 2.0], index=fruits.values, name='Prices')
@@ -486,7 +486,7 @@ prices Exercise 2: S
Find the most expensive fruit.
Apply a 10% discount to all fruits priced over 1.0.
-
+
# 1. Calculate the total price of all fruits
total_price = prices.sum()
@@ -522,7 +522,7 @@ Exercise 3: Ser
How many fruits cost less than $1.0?
What is the price range (difference between max and min prices)?
-
+
# 1. Calculate the average price of the fruits
average_price = prices.mean()
@@ -550,7 +550,7 @@ Exercise 4:
Remove ‘banana’ from both Series.
Sort both Series by fruit name (alphabetically).
-
+
# 1. Add 'fig' to both Series (price: $1.2)
fruits = pd.concat([fruits, pd.Series(['fig'], name='Fruits')])
prices = pd.concat([prices, pd.Series([1.2], index=['fig'], name='Prices')])
diff --git a/docs/course-materials/answer-keys/5c_cleaning_data-key.html b/docs/course-materials/answer-keys/5c_cleaning_data-key.html
index dfe2b9c..9821089 100644
--- a/docs/course-materials/answer-keys/5c_cleaning_data-key.html
+++ b/docs/course-materials/answer-keys/5c_cleaning_data-key.html
@@ -435,7 +435,7 @@ prices Resources
Setup
First, let’s import the necessary libraries and load an example messy dataframe.
-
+
import pandas as pd
import numpy as np
@@ -450,36 +450,36 @@
Removing duplicates
-
+
messy_df.drop_duplicates(inplace=True)
- Handling missing values (either fill or dropna to remove rows with missing data)
-
+
messy_df = messy_df.dropna()
- Ensuring consistent data types (dates, strings)
-
+
messy_df['site'] = messy_df['site'].astype('string')
messy_df['collection date'] = pd.to_datetime(messy_df['collection date'])
- Formatting the ‘site’ column for consistency
-
+
messy_df['site'] = messy_df['site'].str.lower().replace('sitec','site_c')
- Making sure all column names are lower case, without whitespace.
-
+
messy_df.rename(columns={'collection date': 'collection_date'}, inplace=True)
Try to implement these steps using the techniques we’ve learned.
-
+
cleaned_df = messy_df.copy()
print("Cleaned DataFrame:")
diff --git a/docs/course-materials/answer-keys/7c_visualizations-key.html b/docs/course-materials/answer-keys/7c_visualizations-key.html
index a7a1e2b..72df041 100644
--- a/docs/course-materials/answer-keys/7c_visualizations-key.html
+++ b/docs/course-materials/answer-keys/7c_visualizations-key.html
@@ -437,7 +437,7 @@ Introduction
Setup
First, let’s import the necessary libraries and load our dataset.
-
+
Code
import pandas as pd
@@ -495,7 +495,7 @@
+
Code
# Answer for Task 1
@@ -536,7 +536,7 @@ Task 2: Exam
Modify the pairplot to show the species information using different colors.
Interpret the pairplot: which variables seem to be most strongly correlated? Do you notice any patterns related to species?
-
+
Code
# Answer for Task 2
@@ -572,7 +572,7 @@
+
Code
# Answer for Task 3
@@ -621,7 +621,7 @@ Task 4: Jo
Experiment with different kind parameters in the joint plot (e.g., ‘scatter’, ‘kde’, ‘hex’).
Create another joint plot, this time for ‘bill_length_mm’ and ‘bill_depth_mm’, colored by species.
-
+
Code
# Answer for Task 4
@@ -696,7 +696,7 @@ Bonus Challenge
Customize the heatmap by adding annotations and adjusting the colormap.
Compare the insights from this heatmap with those from the pairplot. What additional information does each visualization provide?
-
+
Code
# Answer for Bonus Challenge
diff --git a/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-1.png b/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-1.png
index c095130..d5a28f6 100644
Binary files a/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-1.png and b/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-1.png differ
diff --git a/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-2.png b/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-2.png
index f21785e..2aeee46 100644
Binary files a/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-2.png and b/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-2.png differ
diff --git a/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-3.png b/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-3.png
index 1eea260..3e3232d 100644
Binary files a/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-3.png and b/docs/course-materials/answer-keys/7c_visualizations-key_files/figure-html/cell-5-output-3.png differ
diff --git a/docs/course-materials/answer-keys/eod-day1-key.html b/docs/course-materials/answer-keys/eod-day1-key.html
index 976bb04..792dbc5 100644
--- a/docs/course-materials/answer-keys/eod-day1-key.html
+++ b/docs/course-materials/answer-keys/eod-day1-key.html
@@ -441,7 +441,7 @@ Instructions
Import the necessary libraries to work with data (pandas) and create plots (matplotlib.pyplot). Use the standard Python conventions: import pandas as pd and import matplotlib.pyplot as plt
-
+
import pandas as pd
import matplotlib.pyplot as plt
@@ -454,7 +454,7 @@ Instructions
Create a variable called url that stores the URL provided above.
Use the pandas read_csv() function to load the data from the URL into a new DataFrame called df. Any pandas function is called using the pd object and dot notation: pd.read_csv()
-
+
url = 'https://raw.githubusercontent.com/environmental-data-science/eds217-day0-comp/main/data/raw_data/toolik_weather.csv'
df = pd.read_csv(url)
@@ -467,7 +467,7 @@ Instructions
Note: Because head() is a method of a DataFrame, you call it using dot notation on the DataFrame you just created: df.head()
-
+
df.head()
@@ -635,7 +635,7 @@ Instructions
Use the isnull() method combined with sum() to count missing values in each column.
-
+
df.isnull().sum()
Year 0
@@ -671,7 +671,7 @@ Instructions
Use the info() method to get an overview of the DataFrame, including data types and non-null counts. Just like head(), these are methods associated with your df object, so you call them with dot notation.
-
+
df.describe()
df.info()
@@ -712,7 +712,7 @@ Instructions
- Choose a strategy to handle missing data in the columns. For example, fill missing values with the mean of the column using the `fillna()` method or drop rows with missing data using the `dropna()` method.
-::: {#a87d2b0b .cell execution_count=6}
+::: {#85e9c032 .cell execution_count=6}
``` {.python .cell-code}
df['Daily_AirTemp_Mean_C'].fillna(df['Daily_AirTemp_Mean_C'].mean(), inplace=True)
df.dropna(subset=['Daily_globalrad_total_jcm2'], inplace=True)
@@ -720,7 +720,7 @@ Instructions
::: {.cell-output .cell-output-stderr}
```
-/var/folders/bs/x9tn9jz91cv6hb3q6p4djbmw0000gn/T/ipykernel_61613/1318736512.py:1: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
+/var/folders/bs/x9tn9jz91cv6hb3q6p4djbmw0000gn/T/ipykernel_85158/1318736512.py:1: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
@@ -741,7 +741,7 @@ Instructions
Calculate the mean of the ‘Daily_AirTemp_Mean_C’ column for each month in the monthly grouped object using the mean() function. Save this result to a new variable called monthly_means.
-
+
monthly = df.groupby('Month')
monthly_means = monthly['Daily_AirTemp_Mean_C'].mean()
@@ -755,7 +755,7 @@ Instructions
Syntax Similarity: Use plt.plot() or plt.bar() to create plots. In R, you would use ggplot().
-
+
plt.plot(monthly_means)
@@ -765,7 +765,7 @@ Instructions
-
+
months = ['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
plt.bar(months, monthly_means)
@@ -784,7 +784,7 @@ Instructions
Hint: Similar to calculating monthly averages, group by the ‘Year’ column.
-
+
year = df.groupby('Year')
yearly_means = year['Daily_AirTemp_Mean_C'].mean()
plt.plot(yearly_means)
@@ -796,7 +796,7 @@ Instructions
-
+
year_list = df['Year'].unique()
plt.bar(year_list, yearly_means)
diff --git a/docs/course-materials/answer-keys/eod-day2-key.html b/docs/course-materials/answer-keys/eod-day2-key.html
index f5c36d4..ea4b6e1 100644
--- a/docs/course-materials/answer-keys/eod-day2-key.html
+++ b/docs/course-materials/answer-keys/eod-day2-key.html
@@ -464,7 +464,7 @@ Learning Objectives
Setup
First, let’s import the necessary libraries:
-
+
Code
# We won't use the random library until the end of this exercise,
@@ -480,7 +480,7 @@ Part 1: Data Collec
Task 1: Create a List of Classmates
Create a list containing the names of at least 4 of your classmates in this course.
-
+
Code
classmates = ["Alice", "Bob", "Charlie", "David", "Eve"]
@@ -500,7 +500,7 @@ classmates
+
Code
classmate_info = {
@@ -549,7 +549,7 @@ classmate_info Task 3: List Operat
Sort the list alphabetically
Find and print the index of a specific classmate
-
+
Code
# a) Add a new classmate
@@ -584,7 +584,7 @@ Task 4: Dicti
Update the “number of pets” for one classmate
Create a list of all the favorite colors your classmates mentioned
-
+
Code
# a) Add favorite_study_spot
@@ -643,7 +643,7 @@ Task 5: Basic Stat
The average number of pets among your classmates
The name of the classmate who got the most sleep last night
-
+
Code
# a) Average number of pets
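The two statistics in this task can be computed directly over the nested dictionary. The sketch below is one possible approach, not the key's own code: the sample data and the `hours_of_sleep` key are hypothetical, while `number_of_pets` is the key name used elsewhere in this answer key.

```python
# Hypothetical sample data; only the "number_of_pets" key appears in the source
classmate_info = {
    "Alice": {"number_of_pets": 2, "hours_of_sleep": 7.5},
    "Bob": {"number_of_pets": 0, "hours_of_sleep": 6.0},
    "Charlie": {"number_of_pets": 1, "hours_of_sleep": 8.0},
}

# a) Average number of pets: total pets divided by number of classmates
total_pets = sum(info["number_of_pets"] for info in classmate_info.values())
average_pets = total_pets / len(classmate_info)

# b) Classmate with the most sleep: max() over the names, ranked by sleep hours
most_sleep = max(classmate_info, key=lambda name: classmate_info[name]["hours_of_sleep"])

print(f"Average pets: {average_pets:.2f}")
print(f"Most sleep: {most_sleep}")
```

Passing a `key` function to `max()` avoids building a separate list of sleep values and then looking the name up again.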
@@ -663,7 +663,7 @@ Task 5: Basic Stat
Task 6: Data Filtering
Create a new list containing only the classmates who have at least one pet.
-
+
Code
classmates_with_pets = [name for name, info in classmate_info.items() if info["number_of_pets"] > 0]
@@ -681,7 +681,7 @@ classmates_with_pets Part 4:
Example: Random Selection from a Dictionary
Here’s a simple example of how to select random items from a dictionary:
-
+
Code
import random
@@ -710,10 +710,10 @@ print(f"Randomly selected {num_selections} fruits: {random_fruits}")
-Randomly selected fruit: banana
-Its color: yellow
+Randomly selected fruit: grape
+Its color: purple
Another randomly selected fruit: kiwi
-Randomly selected 3 fruits: ['grape', 'apple', 'kiwi']
+Randomly selected 3 fruits: ['orange', 'apple', 'banana']
This example demonstrates how to:
@@ -734,7 +734,7 @@ Task 7: Random
# Test your function
assign_random_snacks(classmate_info)
-
+
Code
def assign_random_snacks(classmate_info):
@@ -746,7 +746,7 @@ Task 7: Random
assign_random_snacks(classmate_info)
-Alice will share almonds with Bob
+Alice will share almonds with Charlie
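The changed output lines in this hunk reflect unseeded random choices, which differ on every render. The body of `assign_random_snacks` is not shown in the diff, so the following is only a hypothetical re-implementation under stated assumptions: the `favorite_snack` key, the `rng` parameter, the `_sketch` suffix, and the `demo` data are all invented for illustration.

```python
import random


def assign_random_snacks_sketch(classmate_info, rng=random):
    """Pair each classmate's snack with a randomly chosen other classmate.

    Hypothetical sketch: the "favorite_snack" key and the `rng` parameter
    are assumptions, not taken from the answer key.
    """
    names = list(classmate_info)
    messages = []
    for name in names:
        others = [n for n in names if n != name]  # never pair someone with themselves
        partner = rng.choice(others)
        snack = classmate_info[name]["favorite_snack"]
        messages.append(f"{name} will share {snack} with {partner}")
    return messages


demo = {
    "Alice": {"favorite_snack": "almonds"},
    "Bob": {"favorite_snack": "chips"},
    "Charlie": {"favorite_snack": "grapes"},
}
for line in assign_random_snacks_sketch(demo, rng=random.Random(0)):
    print(line)
```

Passing a seeded `random.Random` instance makes the pairing repeatable, which would keep rendered output stable from one build to the next.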
diff --git a/docs/course-materials/answer-keys/eod-day3-key.html b/docs/course-materials/answer-keys/eod-day3-key.html
index 2fc130c..350fb3b 100644
--- a/docs/course-materials/answer-keys/eod-day3-key.html
+++ b/docs/course-materials/answer-keys/eod-day3-key.html
@@ -447,7 +447,7 @@ Introduction
Setup
First, let’s import the necessary libraries and set up our environment.
-
+
Code
import pandas as pd
@@ -461,7 +461,7 @@
Creating a Random Number Generator
We can create a random number generator object like this:
-
+
Code
rng = np.random.default_rng()
@@ -472,7 +472,7 @@ Creatin
Using a Seed for Reproducibility
In data science, it’s often crucial to be able to reproduce our results. We can do this by setting a seed for our random number generator. Here’s how:
-
+
Code
rng = np.random.default_rng(seed=42)
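The effect of a seed can be checked without any third-party packages. The sketch below uses Python's standard-library `random.Random` as a stand-in for NumPy's generator; that substitution is an assumption made to keep the example dependency-free, and it is not what the answer key does. All variable names are hypothetical.

```python
import random

# Two generators built from the same seed produce identical sequences
rng_a = random.Random(42)
rng_b = random.Random(42)
same_a = [rng_a.randint(70, 100) for _ in range(10)]  # randint includes both endpoints
same_b = [rng_b.randint(70, 100) for _ in range(10)]

# A generator with a different seed follows its own sequence
rng_c = random.Random(7)
other = [rng_c.randint(70, 100) for _ in range(10)]

print("Same seed, same values:", same_a == same_b)
print("Different seed, same values:", same_a == other)
```

The same principle applies to `np.random.default_rng(seed=42)`: the seed fixes the starting state, so every run draws the same numbers in the same order.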
@@ -487,7 +487,7 @@ Creating t
Create a series called scores that contains 10 elements representing monthly test scores. We’ll use random integers between 70 and 100 to generate the monthly scores, and set the index to be the month names from September to June:
months = ['Sep', 'Oct', 'Nov', 'Dec', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
-
+
Code
# Create the month list:
@@ -505,7 +505,7 @@ Analyzing the Te
1. What is the student’s average test score for the entire year?
Calculate the mean of all scores in the series.
-
+
Code
# 1. Average score for the entire year
@@ -520,7 +520,7 @@
2. What is the student’s average test score during the first half of the year?
Calculate the mean of the first five months’ scores.
-
+
Code
# 2. Average score for the first half of the year
@@ -537,7 +537,7 @@
3. What is the student’s average test score during the second half of the year?
Calculate the mean of the last five months’ scores.
-
+
Code
second_half_average = scores.iloc[5:].mean()
@@ -553,7 +553,7 @@ second_half_average
4. Did the student improve their performance in the second half? If so, by how much?
Compare the average scores from the first and second half of the year.
-
+
Code
# 4. Performance improvement
@@ -572,7 +572,7 @@
Exploring Reproducibility
To demonstrate the importance of seeding, try creating two series with different random number generators:
-
+
Code
rng1 = np.random.default_rng(seed=42)
@@ -588,7 +588,7 @@ rng1 Exploring Reprod
Now try creating two series with random number generators that have different seeds:
-
+
Code
rng3 = np.random.default_rng(seed=42)
diff --git a/docs/course-materials/answer-keys/eod-day4-key.html b/docs/course-materials/answer-keys/eod-day4-key.html
index 425cfd2..8c63702 100644
--- a/docs/course-materials/answer-keys/eod-day4-key.html
+++ b/docs/course-materials/answer-keys/eod-day4-key.html
@@ -455,7 +455,7 @@ rng3 Introduction
This end-of-day session is focused on using pandas for loading, visualizing, and analyzing marine microplastics data. This session is designed to help you become more comfortable with the pandas library, equipping you with the skills needed to perform data analysis effectively.
The National Oceanic and Atmospheric Administration, via its National Centers for Environmental Information, has an entire section related to marine microplastics (that is, microplastics found in water) at https://www.ncei.noaa.gov/products/microplastics.
We will be working with a recent download of the entire marine microplastics dataset. The URL for this data is:
-
+
Code