From 3d7366e7cf8233ff54c4a015e8a1f5d08bfeb140 Mon Sep 17 00:00:00 2001
From: GitHub Actions

Objectives
task.
For reading, writing and analysing data stored in the netCDF file format, atmosphere and ocean scientists will typically do most of their -work with either the xarray or iris libraries. These libraries -are built on top of more generic data science libraries like numpy and -matplotlib, to make the types of analysis we do faster and more -efficient. To learn more about the PyAOS “stack” shown in the diagram -below (i.e. the collection of libraries that are typically used for data -analysis and visualisation in the atmosphere and ocean sciences), check -out the overview of the PyAOS -stack at the PyAOS community site.
+work with either the xarray or iris libraries. These +libraries are built on top of more generic data science libraries like +numpy and matplotlib, to make the types of analysis we do faster and +more efficient. To learn more about the PyAOS “stack” shown in the +diagram below (i.e. the collection of libraries that are typically used +for data analysis and visualisation in the atmosphere and ocean +sciences), check out the overview of the PyAOS stack at +the PyAOS community site.Now that we’ve identified the Python libraries we might want to use, @@ -430,17 +430,18 @@
According to the latest
-documentation, Anaconda comes with over 250 of the most widely used
-data science libraries (and their dependencies) pre-installed. In
-addition, there are several thousand more libraries available via the
-conda install
command, which can be executed using the Bash
-Shell or Anaconda Prompt (Windows only). It is also possible to install
-packages using the Anaconda Navigator graphical user interface.
According to the latest documentation,
+Anaconda comes with over 300 of the most widely used data science
+libraries (and their dependencies) pre-installed. In addition, there are
+several thousand more libraries available via the Anaconda Public
+Repository, which can be installed by running the
+conda install
command at the Bash Shell or Anaconda Prompt
+(Windows only). It is also possible to install packages using the
+Anaconda Navigator graphical user interface.
If you don’t want to install the entire Anaconda distribution, you -can install Miniconda instead. It -essentially comes with conda and nothing else.
+can install Miniconda +instead. It essentially comes with conda and nothing else.You can search Anaconda Cloud to find the command needed to install
the package. For instance, here is the search result for the
iris
package:
As you can see, there are often multiple versions of the same package
-up on Anaconda Cloud. To try and address this duplication problem, conda-forge has been launched,
-which aims to be a central repository that contains just a single
-(working) version of each package on Anaconda Cloud. You can therefore
-expand the selection of packages available via
-conda install
beyond the chosen few thousand by adding the
-conda-forge channel:
conda install
beyond
+the chosen few thousand by adding the conda-forge channel:
For these particular lessons we will use xarray
, but all
the same tasks could be performed with iris
. We’ll also
-install dask
+install dask
(xarray
uses this for parallel processing), netCDF4
(xarray
requires this to read netCDF files), cartopy
(to
help with geographic plot projections), cmocean
(for
@@ -553,7 +552,7 @@
For instance, we could create an environment called @@ -615,28 +614,13 @@
$ conda env create -f pyaos-lesson.yml
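The `pyaos-lesson.yml` file referenced by this command would look something like the sketch below. The package list follows the libraries named in this lesson, but the exact contents and version pins are illustrative, not the lesson's actual file:

```yaml
name: pyaos-lesson
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.10
  - jupyter
  - xarray
  - netcdf4
  - cartopy
  - cmocean
  - dask
```

Anyone with this file can then reproduce the environment with `conda env create -f pyaos-lesson.yml`, which is what makes it handy to share.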
For ease of sharing the YAML file, it can be uploaded to your account -at the Anaconda Cloud website,
- -so that others can re-create the environment by simply refering to -your Anaconda username:
-The ease with which others can recreate your environment (on any operating system) is a huge breakthough for reproducible research.
To delete the environment:
- @@ -658,10 +642,10 @@>>>
prompt indicates that you are now
talking to the Python interpreter.
A more powerful alternative to the default Python interpreter is -IPython (Interactive Python). The online -documentation outlines all the special features that come with -IPython, but as an example, it lets you execute bash shell commands -without having to exit the IPython interpreter:
+IPython (Interactive Python). The online documentation outlines +all the special features that come with IPython, but as an example, it +lets you execute bash shell commands without having to exit the IPython +interpreter:$ ipython Python 3.7.1 (default, Dec 14 2018, 13:28:58) Type 'copyright', 'credits' or 'license' for more information @@ -695,11 +679,11 @@
directory using the Bash Shell: -BASHdata-carpentry
+@@ -748,10 +731,9 @@BASH
-+$ cd ~/Desktop/data-carpentry -$ jupyter notebook &
$ cd ~/Desktop/data-carpentry +$ jupyter notebook &
(The
@@ -714,9 +698,8 @@&
allows you to come back and use the bash shell without closing your notebook first.)BASHJupyterLab
-The Jupyter team have recently launched JupyterLab -which combines the Jupyter Notebook with many of the features common to -an IDE.
+If you like Jupyter Notebooks you might want to try JupyterLab, which combines +the Jupyter Notebook with many of the features common to an IDE.
Show me the solution
@@ -776,13 +758,13 @@-The setup -menu at the top of the page contains drop-down boxes explaining how -to install the Python libraries using the Bash Shell or Anaconda -Navigator.
+The software +installation instructions explain how to install the Python +libraries using the Bash Shell or Anaconda Navigator.
Launch a Jupyter Notebook
Once your notebook is open, import
-xarray
,cartopy
,matplotlib
and numpy
using the following Python commands:+PYTHON
-+import xarray as xr -import cartopy.crs as ccrs -import matplotlib.pyplot as plt -import numpy as np
import xarray as xr +import cartopy.crs as ccrs +import matplotlib.pyplot as plt +import numpy as np
(Hint: Hold down the shift and return keys to execute a code cell in a Jupyter Notebook.)
diff --git a/02-visualisation.html b/02-visualisation.html index 5a72227..349953d 100644 --- a/02-visualisation.html +++ b/02-visualisation.html @@ -626,8 +626,7 @@PYTHON< plt.show()
The default colorbar used by matplotlib is
viridis
. It -used to bejet
, but that was changed a couple of years ago -in response to the #endtherainbow +used to bejet
, but that was changed in response to the #endtherainbow campaign.Putting all the code together (and reversing viridis so that wet is purple and dry is yellow)…
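The combined code block itself is cut from this hunk, but "reversing viridis" just means asking for the `viridis_r` colormap (every matplotlib colormap has a `_r` reversed twin). A minimal sketch with synthetic data, not the lesson's actual plotting code:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

data = np.random.rand(10, 10)  # synthetic stand-in for a precipitation field

fig, ax = plt.subplots()
# viridis_r reverses the default colormap: high values are now dark purple
# (wet) and low values yellow (dry)
mesh = ax.pcolormesh(data, cmap="viridis_r")
fig.colorbar(mesh)
```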
@@ -711,7 +710,7 @@Season selection
Rather than plot the annual climatology, edit the code so that it plots the June-August (JJA) season.
-(Hint: the groupby functionality can be used to group +
(Hint: the
diff --git a/03-functions.html b/03-functions.html index 2b8b111..a666d28 100644 --- a/03-functions.html +++ b/03-functions.html @@ -574,7 +574,7 @@groupby
functionality can be used to group all the data into seasons prior to averaging over the time axis)PYTHON< -
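That hint can be sketched on a tiny synthetic dataset as follows (the variable names are illustrative, not the lesson's actual files):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of monthly values standing in for the precipitation data
time = pd.date_range("2015-01-01", periods=24, freq="MS")
pr = xr.DataArray(np.arange(24.0), coords={"time": time}, dims="time", name="pr")

# Group by climatological season (DJF, MAM, JJA, SON), average over the
# time axis, then pull out the June-August season
seasonal = pr.groupby("time.season").mean("time")
jja = seasonal.sel(season="JJA")
```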
+@@ -968,7 +968,7 @@PYTHON diff --git a/04-cmdline.html b/04-cmdline.html index 10c7f11..fb6c9c6 100644 --- a/04-cmdline.html +++ b/04-cmdline.html @@ -388,8 +388,8 @@
Objectives
the task (as we have for plotting the precipitation climatology), that code can be transferred to a Python script so that it can be executed at the command line. It’s likely that your data processing workflows will -include command line utilities from the CDO and NCO projects in addition -to Python code, so the command line is the natural place to manage your +include command line utilities from the CDO and NCO projects in addition to +Python code, so the command line is the natural place to manage your workflows (e.g. using shell scripts or make files).In general, the first thing that gets added to any Python script is the following:
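The code block itself is cut off at the hunk boundary, but the pattern being referred to is the standard `if __name__ == '__main__':` entry point, usually paired with an `argparse` command line interface. A sketch of that pattern (the argument names here are illustrative, not the lesson's exact script):

```python
import argparse


def create_parser():
    """Build the command line interface."""
    parser = argparse.ArgumentParser(description="Plot the precipitation climatology.")
    # nargs="?" with a default keeps this sketch runnable without arguments
    parser.add_argument("infile", type=str, nargs="?", default="pr_data.nc",
                        help="Input netCDF file name")
    return parser


def main(inargs):
    """Run the program on the parsed command line arguments."""
    print(f"Would read data from {inargs.infile}")


if __name__ == "__main__":
    main(create_parser().parse_args())
```

The guard means the script can be both executed at the command line and imported from other code without side effects.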
diff --git a/05-git.html b/05-git.html index c3334c3..eb63e7a 100644 --- a/05-git.html +++ b/05-git.html @@ -1014,9 +1014,9 @@Checking out with GitScientist’s nightmareTypes of errors
There are essentially two kinds of errors that can arise in Python: @@ -710,8 +710,8 @@
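The list is cut off by the hunk boundary; the two kinds in question are syntax errors, which are caught when the code is parsed, and exceptions, which are raised only when the offending line actually runs. A small illustration:

```python
# A syntax error is detected before any code executes
try:
    compile("plt.plot(x, y", "<example>", "exec")  # unbalanced parenthesis
except SyntaxError as exc:
    print(f"SyntaxError: {exc.msg}")

# An exception (runtime error) appears only when the line is executed
try:
    result = 1 / 0
except ZeroDivisionError as exc:
    print(f"ZeroDivisionError: {exc}")
```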
Testing and continuous integrationResearch -Software Engineering With Python has a chapter on -testing that is well worth a read. +Software Engineering With Python has a chapter on testing +that is well worth a read.
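The chapter's core idea can be sketched with a pytest-style unit test. The function below mirrors the lesson's unit-conversion step, but this simplified list-based version is illustrative, not the lesson's actual code:

```python
def convert_pr_units(values, seconds_per_day=86400):
    """Convert precipitation values from kg m-2 s-1 to mm/day.

    A flux of 1 kg m-2 s-1 over 1 m2 is 1 mm of water per second,
    so multiplying by 86400 gives mm/day.
    """
    return [value * seconds_per_day for value in values]


def test_convert_pr_units():
    """pytest collects and runs any function whose name starts with test_."""
    assert convert_pr_units([1.0]) == [86400.0]
    assert convert_pr_units([0.0]) == [0.0]
```

Running `pytest` in the directory containing this file would then discover and execute the test automatically, which is what makes it easy to wire into continuous integration.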
PYTHON< -
+The basic configuration command at the top of the
main
function @@ -1012,7 +1012,7 @@PYTHON< -
+PYTHON @@ -1047,7 +1047,7 @@
plot_precipitation_climatology.py
Show me the solution
-+PYTHON diff --git a/09-provenance.html b/09-provenance.html index 16b96e4..397726f 100644 --- a/09-provenance.html +++ b/09-provenance.html @@ -570,7 +570,7 @@
PYTHON<
Handling different image formats
-The
plt.savefig
+The
plt.savefig
documentation provides information on the metadata keys accepted by PNG, PDF, EPS and PS image formats.Using that information as a guide, add a new function called @@ -605,7 +605,7 @@
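The function name is lost at the hunk boundary, but the underlying call is `plt.savefig` with its `metadata` argument. A sketch for the PNG case, which accepts free-form text keys such as `Title` (the key/value shown is illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
# For PNG output, metadata is a dict of text key/value pairs embedded in the
# file; PDF, EPS and PS accept different keys (see the savefig documentation)
fig.savefig("plot_with_metadata.png", metadata={"Title": "Precipitation climatology"})
```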
PYTHON< -
+The new function could read as follows:
@@ -738,7 +738,7 @@PYTHON< -
+The beginning of the script would need to be updated to import the
@@ -781,7 +781,7 @@cmdline_provenance
library:plot_precipitation_climatology.py
Show me the solution
-+@@ -754,7 +754,7 @@PYTHON diff --git a/10-large-data.html b/10-large-data.html index 08c145a..6bc5ba2 100644 --- a/10-large-data.html +++ b/10-large-data.html @@ -399,7 +399,7 @@
Data download
Instructors teaching this lesson can download the CNRM-CM6-1-HR daily precipitation data from the Earth System Grid Federation (ESGF). See the -instructor +instructor notes for details. Since it is a very large download (45 GB), learners are not expected to download the data. (None of the exercises at the end of the lesson require downloading the data.)
@@ -726,8 +726,8 @@Writing your own Dask-aware functionsxarray (e.g. an interpolation routine from the SciPy library) we’d first need to use the
apply_ufunc
ormap_blocks
function to make -those operations “Dask-aware”. The xarray -tutorial from SciPy 2020 explains how to do this. +those operations “Dask-aware”. There’s an xarray +tutorial that explains how to do this.Alternatives to Dask
Show me the solution
-+@@ -1540,7 +1521,7 @@@@ -838,13 +820,13 @@Other options include writing a loop to process each netCDF file one at a time or a series of sub-regions one at a time.
@@ -786,7 +786,7 @@Find out what you’re working with
Show me the solution
-+@@ -810,10 +793,9 @@@@ -718,10 +702,10 @@@@ -545,18 +546,16 @@Run the following two commands in your notebook to setup the client and then type
client
to view the display of diff --git a/aio.html b/aio.html index ae5fb5a..13e5320 100644 --- a/aio.html +++ b/aio.html @@ -459,14 +459,14 @@Objectives
task.For reading, writing and analysing data stored in the netCDF file format, atmosphere and ocean scientists will typically do most of their -work with either the xarray or iris libraries. These libraries -are built on top of more generic data science libraries like numpy and -matplotlib, to make the types of analysis we do faster and more -efficient. To learn more about the PyAOS “stack” shown in the diagram -below (i.e. the collection of libraries that are typically used for data -analysis and visualisation in the atmosphere and ocean sciences), check -out the overview of the PyAOS -stack at the PyAOS community site.
+work with either the xarray or iris libraries. These +libraries are built on top of more generic data science libraries like +numpy and matplotlib, to make the types of analysis we do faster and +more efficient. To learn more about the PyAOS “stack” shown in the +diagram below (i.e. the collection of libraries that are typically used +for data analysis and visualisation in the atmosphere and ocean +sciences), check out the overview of the PyAOS stack at +the PyAOS community site.Python distributions for data science
@@ -482,19 +482,20 @@Objectives
most popular data science libraries and their dependencies pre-installed, and some also come with a package manager to assist with installing additional libraries that weren’t pre-installed. Today the -most popular distribution for data science is Anaconda, which comes +most popular distribution for data science is Anaconda, which comes with a package (and environment) manager called conda.Introducing conda
-According to the latest -documentation, Anaconda comes with over 250 of the most widely used -data science libraries (and their dependencies) pre-installed. In -addition, there are several thousand more libraries available via the -
+conda install
command, which can be executed using the Bash -Shell or Anaconda Prompt (Windows only). It is also possible to install -packages using the Anaconda Navigator graphical user interface.According to the latest documentation, +Anaconda comes with over 300 of the most widely used data science +libraries (and their dependencies) pre-installed. In addition, there are +several thousand more libraries available via the Anaconda Public +Repository, which can be installed by running the +
conda install
command at the Bash Shell or Anaconda Prompt +(Windows only). It is also possible to install packages using the +Anaconda Navigator graphical user interface.
If you don’t want to install the entire Anaconda distribution, you -can install Miniconda instead. It -essentially comes with conda and nothing else.
+can install Miniconda +instead. It essentially comes with conda and nothing else.MinicondaAnaconda Cloud website, where the community can contribute conda installation packages. This is critical because many of our libraries have a small user base, which -means they’ll never make it into the top few thousand data science -libraries supported by Anaconda. +means they’ll never make it into the Anaconda Public Repository.
You can search Anaconda Cloud to find the command needed to install the package. For instance, here is the search result for the
iris
package:As you can see, there are often multiple versions of the same package -up on Anaconda Cloud. To try and address this duplication problem, conda-forge has been launched, -which aims to be a central repository that contains just a single -(working) version of each package on Anaconda Cloud. You can therefore -expand the selection of packages available via -
+up on Anaconda Cloud. To try and address this duplication problem, conda-forge has been launched, which +aims to be a central repository that contains just a single (working) +version of each package on Anaconda Cloud. You can therefore expand the +selection of packages available viaconda install
beyond the chosen few thousand by adding the -conda-forge channel:conda install
beyond +the chosen few thousand by adding the conda-forge channel:-BASH
@@ -572,7 +571,7 @@BASH
For these particular lessons we will use
xarray
, but all the same tasks could be performed withiris
. We’ll also -installdask
+installdask
(xarray
uses this for parallel processing),netCDF4
(xarray
requires this to read netCDF files),cartopy
(to help with geographic plot projections),cmocean
(for @@ -611,7 +610,7 @@Creating separate environmentsIf you’ve got multiple data science projects on the go, installing all your packages in the same conda environment can get a little messy. (By default they are installed in the root/base environment.) It’s -therefore common practice to create +therefore common practice to create separate conda environments for the various projects you’re working on.
For instance, we could create an environment called @@ -673,28 +672,13 @@
BASH
$ conda env create -f pyaos-lesson.yml
For ease of sharing the YAML file, it can be uploaded to your account -at the Anaconda Cloud website,
- -so that others can re-create the environment by simply refering to -your Anaconda username:
-The ease with which others can recreate your environment (on any operating system) is a huge breakthough for reproducible research.
To delete the environment:
-BASHThe
>>>
prompt indicates that you are now talking to the Python interpreter.A more powerful alternative to the default Python interpreter is -IPython (Interactive Python). The online -documentation outlines all the special features that come with -IPython, but as an example, it lets you execute bash shell commands -without having to exit the IPython interpreter:
+IPython (Interactive Python). The online documentation outlines +all the special features that come with IPython, but as an example, it +lets you execute bash shell commands without having to exit the IPython +interpreter:$ ipython Python 3.7.1 (default, Dec 14 2018, 13:28:58) Type 'copyright', 'credits' or 'license' for more information @@ -757,11 +741,11 @@
directory using the Bash Shell: -BASHdata-carpentry
+BASH
-+$ cd ~/Desktop/data-carpentry -$ jupyter notebook &
$ cd ~/Desktop/data-carpentry +$ jupyter notebook &
(The
@@ -776,9 +760,8 @@&
allows you to come back and use the bash shell without closing your notebook first.)BASHJupyterLab
-The Jupyter team have recently launched JupyterLab -which combines the Jupyter Notebook with many of the features common to -an IDE.
+If you like Jupyter Notebooks you might want to try JupyterLab, which combines +the Jupyter Notebook with many of the features common to an IDE.
Show me the solution
-The setup -menu at the top of the page contains drop-down boxes explaining how -to install the Python libraries using the Bash Shell or Anaconda -Navigator.
+The software +installation instructions explain how to install the Python +libraries using the Bash Shell or Anaconda Navigator.
Launch a Jupyter Notebook
Once your notebook is open, import
-xarray
,cartopy
,matplotlib
and numpy
using the following Python commands:+PYTHON
-+import xarray as xr -import cartopy.crs as ccrs -import matplotlib.pyplot as plt -import numpy as np
import xarray as xr +import cartopy.crs as ccrs +import matplotlib.pyplot as plt +import numpy as np
(Hint: Hold down the shift and return keys to execute a code cell in a Jupyter Notebook.)
@@ -1148,8 +1130,7 @@PYTHON< plt.show()
The default colorbar used by matplotlib is
viridis
. It -used to bejet
, but that was changed a couple of years ago -in response to the #endtherainbow +used to bejet
, but that was changed in response to the #endtherainbow campaign.Putting all the code together (and reversing viridis so that wet is purple and dry is yellow)…
@@ -1233,7 +1214,7 @@Season selection
Rather than plot the annual climatology, edit the code so that it plots the June-August (JJA) season.
-(Hint: the groupby functionality can be used to group +
(Hint: the
groupby
functionality can be used to group all the data into seasons prior to averaging over the time axis)PYTHON< -
+@@ -4573,7 +4554,7 @@PYTHON @@ -1704,8 +1685,8 @@
Objectives
the task (as we have for plotting the precipitation climatology), that code can be transferred to a Python script so that it can be executed at the command line. It’s likely that your data processing workflows will -include command line utilities from the CDO and NCO projects in addition -to Python code, so the command line is the natural place to manage your +include command line utilities from the CDO and NCO projects in addition to +Python code, so the command line is the natural place to manage your workflows (e.g. using shell scripts or make files).In general, the first thing that gets added to any Python script is the following:
@@ -2922,9 +2903,9 @@Checking out with GitScientist’s nightmareTypes of errors @@ -4311,8 +4292,8 @@
Testing and continuous integrationResearch -Software Engineering With Python has a chapter on -testing that is well worth a read. +Software Engineering With Python has a chapter on testing +that is well worth a read.
PYTHON< -
+The basic configuration command at the top of the
main
function @@ -4617,7 +4598,7 @@PYTHON< -
+PYTHON @@ -4652,7 +4633,7 @@
plot_precipitation_climatology.py
Show me the solution
-+PYTHON @@ -5080,7 +5061,7 @@
PYTHON<
Handling different image formats
-The
plt.savefig
+The
plt.savefig
documentation provides information on the metadata keys accepted by PNG, PDF, EPS and PS image formats.Using that information as a guide, add a new function called @@ -5115,7 +5096,7 @@
PYTHON< -
+The new function could read as follows:
@@ -5248,7 +5229,7 @@PYTHON< -
+The beginning of the script would need to be updated to import the
@@ -5291,7 +5272,7 @@cmdline_provenance
library:plot_precipitation_climatology.py
Show me the solution
-+@@ -5928,7 +5909,7 @@PYTHON @@ -5573,7 +5554,7 @@
Data download
Instructors teaching this lesson can download the CNRM-CM6-1-HR daily precipitation data from the Earth System Grid Federation (ESGF). See the -instructor +instructor notes for details. Since it is a very large download (45 GB), learners are not expected to download the data. (None of the exercises at the end of the lesson require downloading the data.)
@@ -5900,8 +5881,8 @@Writing your own Dask-aware functionsxarray (e.g. an interpolation routine from the SciPy library) we’d first need to use the
apply_ufunc
ormap_blocks
function to make -those operations “Dask-aware”. The xarray -tutorial from SciPy 2020 explains how to do this. +those operations “Dask-aware”. There’s an xarray +tutorial that explains how to do this.Alternatives to Dask
Show me the solution
-+diff --git a/instructor/03-functions.html b/instructor/03-functions.html index 8d854cc..4a3080c 100644 --- a/instructor/03-functions.html +++ b/instructor/03-functions.html @@ -576,7 +576,7 @@@@ -778,13 +760,13 @@Other options include writing a loop to process each netCDF file one at a time or a series of sub-regions one at a time.
@@ -5960,7 +5941,7 @@Find out what you’re working with
Show me the solution
-+@@ -750,10 +733,9 @@@@ -660,10 +644,10 @@Run the following two commands in your notebook to setup the client and then type
client
to view the display of diff --git a/index.html b/index.html index b085440..5d686e0 100644 --- a/index.html +++ b/index.html @@ -465,7 +465,7 @@Troubeshooting
Your workshop instructor may also ask that you install the python -packages introduced in the first +packages introduced in the first lesson ahead of time. You can do this via the command line or by using the Anaconda Navigator:
diff --git a/instructor-notes.html b/instructor-notes.html index 1287c08..2d46944 100644 --- a/instructor-notes.html +++ b/instructor-notes.html @@ -427,9 +427,9 @@@@ -491,18 +492,16 @@Instructor Notes
At the beginning of the workshop, participants are required to -download a number of data files (instructions at the setup -page). In the first -lesson, they are then required to install some python libraries +download a number of data files (instructions at the setup +page). In the first +lesson they are then required to install some python libraries (
jupyter
,xarray
,cmocean
, etc). Both these tasks can be problematic at venues with slow wifi, so it is often a good idea to ask participants to download the data and install @@ -451,10 +451,10 @@Instructor Notes
-The setup +
The setup page gives details of the software installation instructions that can provided to participants.
-You can also send the helper +
You can also send the helper lesson check to helpers prior to the workshop, so that they can test that all the software and code is working correctly.
diff --git a/instructor/01-conda.html b/instructor/01-conda.html index 64c110e..9278796 100644 --- a/instructor/01-conda.html +++ b/instructor/01-conda.html @@ -411,14 +411,14 @@Objectives
task.For reading, writing and analysing data stored in the netCDF file format, atmosphere and ocean scientists will typically do most of their -work with either the xarray or iris libraries. These libraries -are built on top of more generic data science libraries like numpy and -matplotlib, to make the types of analysis we do faster and more -efficient. To learn more about the PyAOS “stack” shown in the diagram -below (i.e. the collection of libraries that are typically used for data -analysis and visualisation in the atmosphere and ocean sciences), check -out the overview of the PyAOS -stack at the PyAOS community site.
+work with either the xarray or iris libraries. These +libraries are built on top of more generic data science libraries like +numpy and matplotlib, to make the types of analysis we do faster and +more efficient. To learn more about the PyAOS “stack” shown in the +diagram below (i.e. the collection of libraries that are typically used +for data analysis and visualisation in the atmosphere and ocean +sciences), check out the overview of the PyAOS stack at +the PyAOS community site.Python distributions for data science
Now that we’ve identified the Python libraries we might want to use, @@ -432,17 +432,18 @@
Objectives
most popular data science libraries and their dependencies pre-installed, and some also come with a package manager to assist with installing additional libraries that weren’t pre-installed. Today the -most popular distribution for data science is Anaconda, which comes +most popular distribution for data science is Anaconda, which comes with a package (and environment) manager called conda.Introducing conda
-According to the latest -documentation, Anaconda comes with over 250 of the most widely used -data science libraries (and their dependencies) pre-installed. In -addition, there are several thousand more libraries available via the -
+conda install
command, which can be executed using the Bash -Shell or Anaconda Prompt (Windows only). It is also possible to install -packages using the Anaconda Navigator graphical user interface.According to the latest documentation, +Anaconda comes with over 300 of the most widely used data science +libraries (and their dependencies) pre-installed. In addition, there are +several thousand more libraries available via the Anaconda Public +Repository, which can be installed by running the +
conda install
command at the Bash Shell or Anaconda Prompt +(Windows only). It is also possible to install packages using the +Anaconda Navigator graphical user interface.
If you don’t want to install the entire Anaconda distribution, you -can install Miniconda instead. It -essentially comes with conda and nothing else.
+can install Miniconda +instead. It essentially comes with conda and nothing else.MinicondaAnaconda Cloud website, where the community can contribute conda installation packages. This is critical because many of our libraries have a small user base, which -means they’ll never make it into the top few thousand data science -libraries supported by Anaconda. +means they’ll never make it into the Anaconda Public Repository.
You can search Anaconda Cloud to find the command needed to install the package. For instance, here is the search result for the
iris
package:As you can see, there are often multiple versions of the same package -up on Anaconda Cloud. To try and address this duplication problem, conda-forge has been launched, -which aims to be a central repository that contains just a single -(working) version of each package on Anaconda Cloud. You can therefore -expand the selection of packages available via -
+up on Anaconda Cloud. To try and address this duplication problem, conda-forge has been launched, which +aims to be a central repository that contains just a single (working) +version of each package on Anaconda Cloud. You can therefore expand the +selection of packages available viaconda install
beyond the chosen few thousand by adding the -conda-forge channel:conda install
beyond +the chosen few thousand by adding the conda-forge channel:-BASH
@@ -516,7 +515,7 @@BASH
For these particular lessons we will use
xarray
, but all the same tasks could be performed withiris
. We’ll also -installdask
+installdask
(xarray
uses this for parallel processing),netCDF4
(xarray
requires this to read netCDF files),cartopy
(to help with geographic plot projections),cmocean
(for @@ -555,7 +554,7 @@Creating separate environmentsIf you’ve got multiple data science projects on the go, installing all your packages in the same conda environment can get a little messy. (By default they are installed in the root/base environment.) It’s -therefore common practice to create +therefore common practice to create separate conda environments for the various projects you’re working on.
For instance, we could create an environment called @@ -617,28 +616,13 @@
BASH
$ conda env create -f pyaos-lesson.yml
For ease of sharing the YAML file, it can be uploaded to your account -at the Anaconda Cloud website,
- -so that others can re-create the environment by simply refering to -your Anaconda username:
-The ease with which others can recreate your environment (on any operating system) is a huge breakthough for reproducible research.
To delete the environment:
-BASHThe
>>>
prompt indicates that you are now talking to the Python interpreter.A more powerful alternative to the default Python interpreter is -IPython (Interactive Python). The online -documentation outlines all the special features that come with -IPython, but as an example, it lets you execute bash shell commands -without having to exit the IPython interpreter:
+IPython (Interactive Python). The online documentation outlines +all the special features that come with IPython, but as an example, it +lets you execute bash shell commands without having to exit the IPython +interpreter:$ ipython Python 3.7.1 (default, Dec 14 2018, 13:28:58) Type 'copyright', 'credits' or 'license' for more information @@ -697,11 +681,11 @@
directory using the Bash Shell: -BASHdata-carpentry
+BASH
-+$ cd ~/Desktop/data-carpentry -$ jupyter notebook &
$ cd ~/Desktop/data-carpentry +$ jupyter notebook &
(The
@@ -716,9 +700,8 @@&
allows you to come back and use the bash shell without closing your notebook first.)BASHJupyterLab
-The Jupyter team have recently launched JupyterLab -which combines the Jupyter Notebook with many of the features common to -an IDE.
+If you like Jupyter Notebooks you might want to try JupyterLab, which combines +the Jupyter Notebook with many of the features common to an IDE.
Show me the solution
-The setup -menu at the top of the page contains drop-down boxes explaining how -to install the Python libraries using the Bash Shell or Anaconda -Navigator.
+The software +installation instructions explain how to install the Python +libraries using the Bash Shell or Anaconda Navigator.
Launch a Jupyter Notebook
Once your notebook is open, import
-xarray
,cartopy
,matplotlib
and numpy
using the following Python commands:+PYTHON
-+import xarray as xr -import cartopy.crs as ccrs -import matplotlib.pyplot as plt -import numpy as np
import xarray as xr +import cartopy.crs as ccrs +import matplotlib.pyplot as plt +import numpy as np
(Hint: Hold down the shift and return keys to execute a code cell in a Jupyter Notebook.)
diff --git a/instructor/02-visualisation.html b/instructor/02-visualisation.html index 81da780..c2aa3a1 100644 --- a/instructor/02-visualisation.html +++ b/instructor/02-visualisation.html @@ -628,8 +628,7 @@PYTHON< plt.show()
The default colorbar used by matplotlib is
viridis
. It -used to bejet
, but that was changed a couple of years ago -in response to the #endtherainbow +used to bejet
, but that was changed in response to the #endtherainbow campaign.Putting all the code together (and reversing viridis so that wet is purple and dry is yellow)…
@@ -713,7 +712,7 @@Season selection
Rather than plot the annual climatology, edit the code so that it plots the June-August (JJA) season.
-(Hint: the groupby functionality can be used to group +
(Hint: the
groupby
functionality can be used to group all the data into seasons prior to averaging over the time axis)PYTHON< -
+@@ -970,7 +970,7 @@PYTHON diff --git a/instructor/04-cmdline.html b/instructor/04-cmdline.html index 8e72129..02f965b 100644 --- a/instructor/04-cmdline.html +++ b/instructor/04-cmdline.html @@ -390,8 +390,8 @@
Objectives
the task (as we have for plotting the precipitation climatology), that code can be transferred to a Python script so that it can be executed at the command line. It’s likely that your data processing workflows will -include command line utilities from the CDO and NCO projects in addition -to Python code, so the command line is the natural place to manage your +include command line utilities from the CDO and NCO projects in addition to +Python code, so the command line is the natural place to manage your workflows (e.g. using shell scripts or make files).In general, the first thing that gets added to any Python script is the following:
diff --git a/instructor/05-git.html b/instructor/05-git.html index 72ff7a8..0090768 100644 --- a/instructor/05-git.html +++ b/instructor/05-git.html @@ -1016,9 +1016,9 @@Checking out with GitScientist’s nightmareTypes of errors
There are essentially two kinds of errors that can arise in Python: @@ -712,8 +712,8 @@
Testing and continuous integrationResearch -Software Engineering With Python has a chapter on -testing that is well worth a read. +Software Engineering With Python has a chapter on testing +that is well worth a read.
PYTHON< -
+The basic configuration command at the top of the
main
function @@ -1014,7 +1014,7 @@PYTHON< -
+PYTHON @@ -1049,7 +1049,7 @@
plot_precipitation_climatology.py
Show me the solution
-+PYTHON diff --git a/instructor/09-provenance.html b/instructor/09-provenance.html index 1e4141a..4b8aeb1 100644 --- a/instructor/09-provenance.html +++ b/instructor/09-provenance.html @@ -572,7 +572,7 @@
PYTHON<
Handling different image formats
-The
plt.savefig
+The
plt.savefig
documentation provides information on the metadata keys accepted by PNG, PDF, EPS and PS image formats.Using that information as a guide, add a new function called @@ -607,7 +607,7 @@
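As a sketch of what the exercise involves: PNG files accept arbitrary key/value text metadata via the `metadata` argument to `savefig`, while PDF, EPS and PS each accept only the specific keys listed in the documentation. The key and file name below are illustrative:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])

# For PNG, any text key is accepted; other formats would need
# format-specific keys (e.g. "Creator" for PDF)
outfile = os.path.join(tempfile.gettempdir(), "example_plot.png")
fig.savefig(outfile, metadata={"History": "created by plot_precipitation_climatology.py"})
```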
PYTHON< -
+The new function could read as follows:
@@ -740,7 +740,7 @@PYTHON< -
+The beginning of the script would need to be updated to import the
@@ -783,7 +783,7 @@cmdline_provenance
library:plot_precipitation_climatology.py
Show me the solution
-+@@ -756,7 +756,7 @@PYTHON diff --git a/instructor/10-large-data.html b/instructor/10-large-data.html index 23c8070..b8502cd 100644 --- a/instructor/10-large-data.html +++ b/instructor/10-large-data.html @@ -401,7 +401,7 @@
Data download
Instructors teaching this lesson can download the CNRM-CM6-1-HR daily precipitation data from the Earth System Grid Federation (ESGF). See the -instructor +instructor notes for details. Since it is a very large download (45 GB), learners are not expected to download the data. (None of the exercises at the end of the lesson require downloading the data.)
@@ -728,8 +728,8 @@Writing your own Dask-aware functionsxarray (e.g. an interpolation routine from the SciPy library) we’d first need to use the
apply_ufunc
ormap_blocks
function to make -those operations “Dask-aware”. The xarray -tutorial from SciPy 2020 explains how to do this. +those operations “Dask-aware”. There’s an xarray +tutorial that explains how to do this.Alternatives to Dask
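A minimal sketch of `apply_ufunc`, using a toy numpy function in place of a real SciPy routine (the array and function here are made up for illustration):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.array([[1.0, 2.0], [3.0, 4.0]]), dims=("lat", "lon"))

def double(values):
    # a plain numpy function that knows nothing about xarray or Dask
    return values * 2

# apply_ufunc wraps the numpy function so it can be applied to xarray
# objects; for Dask-chunked input you would also pass dask="parallelized"
# and output_dtypes=[da.dtype] to keep the computation lazy
doubled = xr.apply_ufunc(double, da)
print(doubled.values)
```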
Show me the solution
-+@@ -1545,7 +1526,7 @@@@ -841,13 +823,13 @@Other options include writing a loop to process each netCDF file one at a time or a series of sub-regions one at a time.
@@ -788,7 +788,7 @@Find out what you’re working with
Show me the solution
-+Run the following two commands in your notebook to set up the client and then type
client
to view the display of diff --git a/instructor/aio.html b/instructor/aio.html index 70c8696..0a74947 100644 --- a/instructor/aio.html +++ b/instructor/aio.html @@ -462,14 +462,14 @@Objectives
task.For reading, writing and analysing data stored in the netCDF file format, atmosphere and ocean scientists will typically do most of their -work with either the xarray or iris libraries. These libraries -are built on top of more generic data science libraries like numpy and -matplotlib, to make the types of analysis we do faster and more -efficient. To learn more about the PyAOS “stack” shown in the diagram -below (i.e. the collection of libraries that are typically used for data -analysis and visualisation in the atmosphere and ocean sciences), check -out the overview of the PyAOS -stack at the PyAOS community site.
+work with either the xarray or iris libraries. These +libraries are built on top of more generic data science libraries like +numpy and matplotlib, to make the types of analysis we do faster and +more efficient. To learn more about the PyAOS “stack” shown in the +diagram below (i.e. the collection of libraries that are typically used +for data analysis and visualisation in the atmosphere and ocean +sciences), check out the overview of the PyAOS stack at +the PyAOS community site.Python distributions for data science
@@ -485,19 +485,20 @@Objectives
most popular data science libraries and their dependencies pre-installed, and some also come with a package manager to assist with installing additional libraries that weren’t pre-installed. Today the -most popular distribution for data science is Anaconda, which comes +most popular distribution for data science is Anaconda, which comes with a package (and environment) manager called conda.Introducing conda
-According to the latest -documentation, Anaconda comes with over 250 of the most widely used -data science libraries (and their dependencies) pre-installed. In -addition, there are several thousand more libraries available via the -
+conda install
command, which can be executed using the Bash -Shell or Anaconda Prompt (Windows only). It is also possible to install -packages using the Anaconda Navigator graphical user interface.According to the latest documentation, +Anaconda comes with over 300 of the most widely used data science +libraries (and their dependencies) pre-installed. In addition, there are +several thousand more libraries available via the Anaconda Public +Repository, which can be installed by running the +
conda install
command using the Bash Shell or Anaconda Prompt +(Windows only). It is also possible to install packages using the +Anaconda Navigator graphical user interface.@@ -534,8 +535,8 @@Miniconda
If you don’t want to install the entire Anaconda distribution, you -can install Miniconda instead. It -essentially comes with conda and nothing else.
+can install Miniconda +instead. It essentially comes with conda and nothing else.MinicondaAnaconda Cloud website, where the community can contribute conda installation packages. This is critical because many of our libraries have a small user base, which -means they’ll never make it into the top few thousand data science -libraries supported by Anaconda. +means they’ll never make it into the Anaconda Public Repository.
You can search Anaconda Cloud to find the command needed to install the package. For instance, here is the search result for the
iris
package:As you can see, there are often multiple versions of the same package -up on Anaconda Cloud. To try and address this duplication problem, conda-forge has been launched, -which aims to be a central repository that contains just a single -(working) version of each package on Anaconda Cloud. You can therefore -expand the selection of packages available via -
+up on Anaconda Cloud. To try and address this duplication problem, conda-forge has been launched, which +aims to be a central repository that contains just a single (working) +version of each package on Anaconda Cloud. You can therefore expand the +selection of packages available viaconda install
beyond the chosen few thousand by adding the -conda-forge channel:conda install
beyond +the chosen few thousand by adding the conda-forge channel:-BASH
@@ -575,7 +574,7 @@BASH
For these particular lessons we will use
xarray
, but all the same tasks could be performed withiris
. We’ll also -installdask
+installdask
(xarray
uses this for parallel processing),netCDF4
(xarray
requires this to read netCDF files),cartopy
(to help with geographic plot projections),cmocean
(for @@ -614,7 +613,7 @@Creating separate environmentsIf you’ve got multiple data science projects on the go, installing all your packages in the same conda environment can get a little messy. (By default they are installed in the root/base environment.) It’s -therefore common practice to create +therefore common practice to create separate conda environments for the various projects you’re working on.
For instance, we could create an environment called @@ -676,28 +675,13 @@
BASH
$ conda env create -f pyaos-lesson.yml
For ease of sharing the YAML file, it can be uploaded to your account -at the Anaconda Cloud website,
- -so that others can re-create the environment by simply refering to -your Anaconda username:
The ease with which others can recreate your environment (on any operating system) is a huge breakthrough for reproducible research.
To delete the environment:
-BASHThe
>>>
prompt indicates that you are now talking to the Python interpreter.A more powerful alternative to the default Python interpreter is -IPython (Interactive Python). The online -documentation outlines all the special features that come with -IPython, but as an example, it lets you execute bash shell commands -without having to exit the IPython interpreter:
+IPython (Interactive Python). The online documentation outlines +all the special features that come with IPython, but as an example, it +lets you execute bash shell commands without having to exit the IPython +interpreter:$ ipython Python 3.7.1 (default, Dec 14 2018, 13:28:58) Type 'copyright', 'credits' or 'license' for more information @@ -760,11 +744,11 @@
directory using the Bash Shell: -BASHdata-carpentry
+BASH
-+$ cd ~/Desktop/data-carpentry -$ jupyter notebook &
$ cd ~/Desktop/data-carpentry +$ jupyter notebook &
(The
@@ -779,9 +763,8 @@&
allows you to come back and use the bash shell without closing your notebook first.)BASHJupyterLab
-The Jupyter team have recently launched JupyterLab -which combines the Jupyter Notebook with many of the features common to -an IDE.
+If you like Jupyter Notebooks you might want to try JupyterLab, which combines +the Jupyter Notebook with many of the features common to an IDE.
Show me the solution
-The setup -menu at the top of the page contains drop-down boxes explaining how -to install the Python libraries using the Bash Shell or Anaconda -Navigator.
+The software +installation instructions explain how to install the Python +libraries using the Bash Shell or Anaconda Navigator.
Launch a Jupyter Notebook
Once your notebook is open, import
-xarray
, cartopy
,matplotlib
andnumpy
using the following Python commands:+PYTHON
-+import xarray as xr -import cartopy.crs as ccrs -import matplotlib.pyplot as plt -import numpy as np
import xarray as xr +import cartopy.crs as ccrs +import matplotlib.pyplot as plt +import numpy as np
(Hint: Hold down the shift and return keys to execute a code cell in a Jupyter Notebook.)
@@ -1152,8 +1134,7 @@PYTHON< plt.show()
The default colorbar used by matplotlib is
viridis
. It -used to bejet
, but that was changed a couple of years ago -in response to the #endtherainbow +used to bejet
, but that was changed in response to the #endtherainbow campaign.Putting all the code together (and reversing viridis so that wet is purple and dry is yellow)…
@@ -1237,7 +1218,7 @@Season selection
Rather than plot the annual climatology, edit the code so that it plots the June-August (JJA) season.
-(Hint: the groupby functionality can be used to group +
(Hint: the
groupby
functionality can be used to group all the data into seasons prior to averaging over the time axis)PYTHON< -
+@@ -4583,7 +4564,7 @@PYTHON @@ -1710,8 +1691,8 @@
Objectives
the task (as we have for plotting the precipitation climatology), that code can be transferred to a Python script so that it can be executed at the command line. It’s likely that your data processing workflows will -include command line utilities from the CDO and NCO projects in addition -to Python code, so the command line is the natural place to manage your +include command line utilities from the CDO and NCO projects in addition to +Python code, so the command line is the natural place to manage your workflows (e.g. using shell scripts or make files).In general, the first thing that gets added to any Python script is the following:
@@ -2929,9 +2910,9 @@Checking out with GitScientist’s nightmareTypes of errors @@ -4321,8 +4302,8 @@
Testing and continuous integrationResearch -Software Engineering With Python has a chapter on -testing that is well worth a read. +Software Engineering With Python has a chapter on testing +that is well worth a read.
PYTHON< -
+The basic configuration command at the top of the
main
function @@ -4627,7 +4608,7 @@PYTHON< -
+PYTHON @@ -4662,7 +4643,7 @@
plot_precipitation_climatology.py
Show me the solution
-+PYTHON @@ -5091,7 +5072,7 @@
PYTHON<
Handling different image formats
-The
plt.savefig
+The
plt.savefig
documentation provides information on the metadata keys accepted by PNG, PDF, EPS and PS image formats.Using that information as a guide, add a new function called @@ -5126,7 +5107,7 @@
PYTHON< -
+The new function could read as follows:
@@ -5259,7 +5240,7 @@PYTHON< -
+The beginning of the script would need to be updated to import the
@@ -5302,7 +5283,7 @@cmdline_provenance
library:plot_precipitation_climatology.py
Show me the solution
-+@@ -5940,7 +5921,7 @@PYTHON @@ -5585,7 +5566,7 @@
Data download
Instructors teaching this lesson can download the CNRM-CM6-1-HR daily precipitation data from the Earth System Grid Federation (ESGF). See the -instructor +instructor notes for details. Since it is a very large download (45 GB), learners are not expected to download the data. (None of the exercises at the end of the lesson require downloading the data.)
@@ -5912,8 +5893,8 @@Writing your own Dask-aware functionsxarray (e.g. an interpolation routine from the SciPy library) we’d first need to use the
apply_ufunc
ormap_blocks
function to make -those operations “Dask-aware”. The xarray -tutorial from SciPy 2020 explains how to do this. +those operations “Dask-aware”. There’s an xarray +tutorial that explains how to do this.Alternatives to Dask
Show me the solution
-+Other options include writing a loop to process each netCDF file one at a time or a series of sub-regions one at a time.
@@ -5972,7 +5953,7 @@Find out what you’re working with
Show me the solution
-+Run the following two commands in your notebook to set up the client and then type
client
to view the display of diff --git a/instructor/index.html index b107b17..e3dc679 100644 --- a/instructor/index.html +++ b/instructor/index.html @@ -561,7 +561,7 @@Troubleshooting
Your workshop instructor may also ask that you install the python -packages introduced in the first +packages introduced in the first lesson ahead of time. You can do this via the command line or by using the Anaconda Navigator:
diff --git a/instructor/instructor-notes.html b/instructor/instructor-notes.html index 459a031..cdb6390 100644 --- a/instructor/instructor-notes.html +++ b/instructor/instructor-notes.html @@ -429,9 +429,9 @@Instructor Notes
At the beginning of the workshop, participants are required to -download a number of data files (instructions at the setup -page). In the first -lesson, they are then required to install some python libraries +download a number of data files (instructions at the setup +page). In the first +lesson they are then required to install some python libraries (
jupyter
,xarray
,cmocean
, etc). Both these tasks can be problematic at venues with slow wifi, so it is often a good idea to ask participants to download the data and install @@ -453,10 +453,10 @@Instructor Notes
-The setup +
The setup page gives details of the software installation instructions that can provided to participants.
-You can also send the helper +
You can also send the helper lesson check to helpers prior to the workshop, so that they can test that all the software and code is working correctly.
Package management
diff --git a/md5sum.txt b/md5sum.txt index ee682d5..7175555 100644 --- a/md5sum.txt +++ b/md5sum.txt @@ -5,17 +5,17 @@ "helper_lesson_check.md" "1d1b53176140a057b588969f9f60fa61" "site/built/helper_lesson_check.md" "2024-07-25" "index.md" "0e62af27fdd16f8c6f2c9e1fef67e750" "site/built/index.md" "2024-07-25" "paper.md" "32ace9442c642b5e25ce659e7ae9df72" "site/built/paper.md" "2024-07-25" -"episodes/01-conda.md" "a6686033d7b73c398bf6f559e29b0065" "site/built/01-conda.md" "2024-07-25" -"episodes/02-visualisation.md" "1e0da181aa29d53a614a7358d5379434" "site/built/02-visualisation.md" "2024-07-25" +"episodes/01-conda.md" "6145214a159aca5d93d996ce060b6fc2" "site/built/01-conda.md" "2024-07-25" +"episodes/02-visualisation.md" "a12f74e55346ec75d7fc4a945ab028c5" "site/built/02-visualisation.md" "2024-07-25" "episodes/03-functions.md" "5f76d28ee961529d9fdf812a01cc5e81" "site/built/03-functions.md" "2024-07-25" -"episodes/04-cmdline.md" "ba7a163f8c93824a88a42045723108c6" "site/built/04-cmdline.md" "2024-07-25" -"episodes/05-git.md" "6b10244dccd7f145ba1ad6c915ac06b8" "site/built/05-git.md" "2024-07-25" +"episodes/04-cmdline.md" "b4b598fae925dc8a9ec46a8cafae2048" "site/built/04-cmdline.md" "2024-07-25" +"episodes/05-git.md" "eb8f096013c9345e773be2150f3a83c2" "site/built/05-git.md" "2024-07-25" "episodes/06-github.md" "3ae16cc2084d7dca01805a73eabc556a" "site/built/06-github.md" "2024-07-25" "episodes/07-vectorisation.md" "04d52cf7cd3a849c406c5c96cca4671c" "site/built/07-vectorisation.md" "2024-07-25" -"episodes/08-defensive.md" "4e709227290c2bffb21d737b55cb67f9" "site/built/08-defensive.md" "2024-07-25" -"episodes/09-provenance.md" "91a9a5b5e34b594ca49d77c319264eb8" "site/built/09-provenance.md" "2024-07-25" -"episodes/10-large-data.md" "909dba426cb9409b6124181662c31af4" "site/built/10-large-data.md" "2024-07-25" -"instructors/instructor-notes.md" "858a8aea24f020d8732ac71b2601f2e6" "site/built/instructor-notes.md" "2024-07-25" +"episodes/08-defensive.md" 
"ffb30e3794f41e1ccabd50fee6c93550" "site/built/08-defensive.md" "2024-07-25" +"episodes/09-provenance.md" "c7e54cd796b333c02084d0b774bae734" "site/built/09-provenance.md" "2024-07-25" +"episodes/10-large-data.md" "25882b71d089110a254d7265cbd1ae62" "site/built/10-large-data.md" "2024-07-25" +"instructors/instructor-notes.md" "1ce510d0cb6a52e613cda477b6fad382" "site/built/instructor-notes.md" "2024-07-25" "learners/reference.md" "4e0dcbc7892af6f9610d44d356e66617" "site/built/reference.md" "2024-07-25" -"learners/setup.md" "d86e41df032ef642f234e5014b28178a" "site/built/setup.md" "2024-07-25" +"learners/setup.md" "67749e8099ace68108310de95ec0a636" "site/built/setup.md" "2024-07-25" "profiles/learner-profiles.md" "60b93493cf1da06dfd63255d73854461" "site/built/learner-profiles.md" "2024-07-25" diff --git a/pkgdown.yml b/pkgdown.yml index e91f2e0..3516c96 100644 --- a/pkgdown.yml +++ b/pkgdown.yml @@ -2,4 +2,4 @@ pandoc: 3.1.11 pkgdown: 2.1.0 pkgdown_sha: ~ articles: {} -last_built: 2024-07-25T19:28Z +last_built: 2024-07-25T20:44Z