Lectures, Data, and Exercises for Environmental Data Science at UCSB

environmental-data-science/envdatasci

Environmental Data Science

Course Summary

This course provides an introduction to the principles of environmental physics and their application to the ecological sciences, with a focus on programming and data analysis in Python. Course activities will use data analysis to quantify environmental patterns and processes. Emphasis will be placed on developing coding skills in Python and applying these skills to environmental and biophysical problems.

Course Goals

  1. To develop expertise in the Python programming language and the use of Python's data science stack to effectively store, manipulate, and gain insight into environmental data.

  2. To be able to apply this understanding to characterize data on environmental patterns and processes at varying spatial and temporal scales.

  3. To use data to model environmental processes of energy and mass transfer.

Course Format

Students will learn the principles of Python programming and environmental data science by working largely independently through weekly Python-based course materials. Readings will be assigned for both programming and disciplinary content related to weekly themes. At least once a week, we will meet as a group to introduce and discuss concepts. In addition, students will have the opportunity to schedule weekly one-on-one check-ins with the instructional team.

How to use this repository.

If you are a student in G136:

  1. Log in to the G136 JupyterHub Server.

  2. Clone this repository to your server instance.

    • Open a Terminal in your JupyterLab instance. (Instructions)

    • Type git clone followed by the URL for this repository, which is https://github.com/environmental-data-science/envdatasci.git.

      The entire command will look like this:

      jovyan@jupyter-USERNAME:~$ git clone https://github.com/environmental-data-science/envdatasci

      Note: In the line above, jovyan@jupyter-USERNAME:~$ is your terminal prompt, where USERNAME is your UCSB ID. On other systems, the command prompt is something like > or $. To keep these directions general, I will use $ to represent the command prompt throughout our docs. The key point is that you do not type the prompt as part of the command.

    • Press Enter. A local clone of the class repository will be created in your JupyterLab instance.

       $ git clone https://github.com/environmental-data-science/envdatasci
       > Cloning into envdatasci...
       > remote: Counting objects: 10, done.
       > remote: Compressing objects: 100% (8/8), done.
       > remote: Total 10 (delta 1), reused 10 (delta 1)
       > Unpacking objects: 100% (10/10), done.
      

      You will now have a new local directory in your instance called envdatasci/, which contains all of the course materials. Before proceeding, we need to make sure that your instance has all the Python libraries that the course materials require. We will use a Python installation utility called pip to update your instance with the required libraries.

  3. Use pip to install required libraries.

    • In your open terminal, change directory into the newly-created envdatasci folder.

      $ cd envdatasci

    • There is a text file called requirements.txt in this folder. You can page through this file using the more command.

      $ more requirements.txt

    • The file contains a list of Python modules. We will be using these modules in the course, so we need to make sure they are installed in your JupyterLab instance. This is easy to do with the pip command:

      $ pip install -r requirements.txt

    • Type the above command and press Enter. You will see a lot of output as pip reads each line of the requirements.txt file, determines which library (and library version) is on that line, and then installs that specific version if it is needed. The command also tracks down any dependencies that each new library requires and installs those too.

      Note: Most of the libraries in requirements.txt should already be installed, in which case pip will report back Requirement already satisfied for almost every line.
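To preview what pip will report before running the install, the check it performs on each line can be approximated in Python. The following is a minimal sketch, not how pip itself is implemented. It assumes that requirements.txt contains simple name==version lines, and the helper names parse_requirements, installed_version, and describe are hypothetical. It also assumes Python 3.8 or newer for importlib.metadata; on Python 3.7 it falls back to the importlib_metadata backport, which would need to be installed separately.

```python
import re
from pathlib import Path


def parse_requirements(text):
    """Return (name, pinned_version_or_None) for each requirement line.

    Assumption: lines look like `name==version` or just `name`.
    Blank lines, comments, and pip options (lines starting with `-`) are skipped.
    """
    pins = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        match = re.match(r"([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:==\s*([^\s;]+))?", line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins


def installed_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Assumption: on Python 3.7 the importlib_metadata backport is available.
        from importlib_metadata import PackageNotFoundError, version
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def describe(name, pinned, found):
    """Summarize one requirement in roughly the way pip reports it."""
    if found is None:
        return f"{name}: not installed"
    if pinned is None or pinned == found:
        return f"{name}: already satisfied ({found})"
    return f"{name}: installed {found}, requirements.txt pins {pinned}"


if __name__ == "__main__":
    path = Path("requirements.txt")
    if path.exists():
        for name, pinned in parse_requirements(path.read_text()):
            print(describe(name, pinned, installed_version(name)))
    else:
        print("requirements.txt not found; run this from the envdatasci directory")
```

Run from inside the envdatasci directory, this prints one line per requirement. It only reads the environment and never installs anything, so pip install -r requirements.txt is still the step that makes changes.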

Local Installation (for Instructors or non-students)

  1. Install Conda & Git.

    • macOS: Use Homebrew

      $ brew install anaconda

      $ brew install git

    • Windows: ??

    • Linux: Use homebrew??

  2. Create a conda environment.

    $ conda create -n envdatasci python=3.7.3

    Note: We are using Python version 3.7.3 in this class. That may change in the future, but for now it matches the Python version in the Docker images that LSIT uses to build JupyterHub deployments.

  3. Activate the conda environment.

    $ conda activate envdatasci

  4. Install pip into the local conda environment.

    $ conda install pip

  5. Clone the repository to your local machine and cd into the class repo directory

    $ git clone https://github.com/environmental-data-science/envdatasci

    $ cd envdatasci

  6. Add additional libraries to your conda environment using pip.

    $ pip install -r requirements.txt

    Note: We are using pip to manage dependencies within this conda environment. Using pip and the requirements.txt file keeps local installations consistent with our installations on the JupyterHub server, so that the working environment on our local machines matches the working environment on JupyterHub.

    Note: If you add a package to your local environment that is used in any of the course materials, you must run pip freeze > requirements.txt, commit the updated file, and push the commit to our repo.
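Before regenerating requirements.txt, it can help to see which locally installed packages are not yet listed in it. The sketch below is one possible way to do that, under the assumption that both requirements.txt and the output of pip freeze use name==version lines. The helper names pinned_names and missing_from_requirements are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def pinned_names(text):
    """Map a normalized package name to its full `name==version` line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if "==" in line and not line.startswith("#"):
            # Lower-case and treat `_` like `-` so spelling variants compare equal.
            name = line.split("==", 1)[0].strip().lower().replace("_", "-")
            pins[name] = line
    return pins


def missing_from_requirements(freeze_text, requirements_text):
    """Return the freeze lines whose package is absent from requirements.txt."""
    frozen = pinned_names(freeze_text)
    required = pinned_names(requirements_text)
    return [frozen[name] for name in sorted(frozen) if name not in required]


if __name__ == "__main__":
    req_path = Path("requirements.txt")
    if req_path.exists():
        # Ask the pip that belongs to the active interpreter for its package list.
        result = subprocess.run(
            [sys.executable, "-m", "pip", "freeze"],
            capture_output=True, text=True, check=True,
        )
        for line in missing_from_requirements(result.stdout, req_path.read_text()):
            print("not in requirements.txt:", line)
    else:
        print("requirements.txt not found; run this from the envdatasci directory")
```

Because pip freeze lists every installed package, including dependencies pulled in by other packages, the output may contain more entries than the one package you added by hand. Treat it as a list to review before committing the updated file.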
