Python oriented toward data analysis

rustil/lecture-python

 
 


Introduction to Python for Data Analysis

This repository contains the material for a lecture in a data science university degree offered at Université Clermont-Auvergne (UCA). It is mostly based on a NumPy tutorial that I gave at the Laboratoire de Physique de Clermont in April 2019. No prerequisite knowledge is assumed, but familiarity with a programming language may be useful. It also helps to know some basic mathematics, such as simple vector operations and statistics.

General scope of the lecture

The main goal of this lecture is to make people familiar with Python and data analysis tools so that they can extend their knowledge on their own. Practical exercises and small projects are also proposed to provide a few working examples of data manipulation at different levels of complexity. The full lecture material is available in PDF here.

What is this lecture? A basic and practical introduction to Python, together with some of the most important data analysis tools, namely NumPy, matplotlib and pandas.

What is this lecture not? Neither a formal introduction to Python nor an extensive demonstration of all the features available in the tools mentioned above.

Content of the lecture

There is a lot of information in this lecture. To help you focus on the important aspects, each chapter starts with a list of expected skills that you should take away, ranked in three levels: basic, medium, expert.

1. Practical Introduction to Python. This first section is dedicated to the basic object types and operations in Python. Functions will also be described, but object-oriented programming will not be covered -- online notebook

2. Introduction to NumPy. The differences between the usual Python objects and NumPy objects will be introduced -- online notebook

3. Three tools to know. This section gives a glimpse of the matplotlib, pandas and SciPy packages, which allow powerful data analysis -- online notebook

4. Multidimensional data manipulation. Non-trivial operations on multidimensional data using the full power of NumPy. Most of these operations can be performed with existing tools, but it is instructive to do them once with native NumPy -- online notebook

5. Introduction to image processing. Very first steps of image processing (definition, plotting, operations), including the application of basic filters (adding noise, sharpening, edge detection) -- online notebook
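
As a small taste of the difference between usual Python objects and NumPy objects mentioned in chapter 2, here is a minimal sketch (not taken from the notebooks; the variable names are only illustrative) showing that the same operators behave differently on a list and on an array:

```python
import numpy as np

values = [1, 2, 3]          # a plain Python list
array = np.array(values)    # the same numbers as a NumPy array

# '*' repeats a list, but multiplies an array element by element
doubled_list = values * 2
doubled_array = array * 2

# '+' concatenates lists, but adds arrays element by element
summed_list = values + values
summed_array = array + array

print("list  * 2:", doubled_list)
print("array * 2:", doubled_array)
print("list  + list :", summed_list)
print("array + array:", summed_array)
```

Element-wise behaviour is what makes it possible to write numerical code on arrays without explicit loops.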

Other practical examples. Depending on the remaining time (and people's tastes), we can go through different topics among the following. Some of them can also be used as projects carried out by students.

  • Fourier analysis
  • Principal component analysis (PCA)
  • Random Forest regression
  • Gaussian processes
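
To give an idea of what the first topic might look like in practice, here is a minimal sketch of a Fourier analysis with NumPy. It is not taken from the lecture material; the sampling rate and the 5 Hz test signal are assumptions chosen only for illustration.

```python
import numpy as np

sample_rate = 100                    # samples per second (hypothetical value)
signal_freq = 5                      # frequency of the test sine wave in Hz (hypothetical value)

# One second of data sampled at regular intervals
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * signal_freq * t)

# Real-input FFT and the frequency associated with each output bin
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)

# The bin with the largest amplitude gives the dominant frequency
dominant = freqs[np.argmax(np.abs(spectrum))]
print("Dominant frequency (Hz):", dominant)
```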

How to get prepared

1. Get familiar with Python. I would recommend two links: the w3school tutorial (both basic and complete) and https://www.learnpython.org (code can be run directly in your web browser).

2. Install Python with Anaconda. To run Python on your own machine, you need to install it. I would recommend Anaconda for this, which also includes jupyter-notebook.

3. Install git. This is version control software, which can be installed by following these instructions. The whole repository can be cloned using the git clone https://github.com/rmadar/lecture-python command.

4. Get familiar with notebooks. They provide a nice environment combining code, notes and plots, which is very powerful for learning something and playing with it. You can check out this video or this post.

Prerequisites to run notebooks

  • Python 3
  • matplotlib
  • NumPy
  • pandas
  • SciPy
  • Pillow

Alternatively, use the full list of dependencies in environment.yml to build a working conda environment.
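
A quick way to check whether the packages listed above are available in the current environment is sketched below. This helper is not part of the repository; the `find_missing` function and the `required` mapping are hypothetical names introduced for illustration. Note that Pillow is imported under the name `PIL`.

```python
import importlib.util

# Display name -> import name for the packages listed above
# (Pillow is imported as "PIL")
required = {
    "matplotlib": "matplotlib",
    "NumPy": "numpy",
    "pandas": "pandas",
    "SciPy": "scipy",
    "Pillow": "PIL",
}


def find_missing(packages):
    """Return the display names of the packages that cannot be found."""
    return [
        name
        for name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


missing = find_missing(required)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If the script reports missing packages, they can be installed with conda or pip before opening the notebooks.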
