Introduction to Machine Learning & Deep Learning

Description

The term machine learning, or statistical learning, refers to the science of automatically detecting patterns in data. It is widely used in tasks that require extracting information from large data sets. Examples include spam detection, detection of fraudulent credit card transactions, face recognition in digital cameras, and voice command recognition by personal assistants on smartphones. Machine learning is also widely used in scientific domains such as bioinformatics, medicine, and astronomy. What these applications share is that, because of the complexity of the patterns to be detected, a human developer cannot provide an explicit, detailed specification of how the task should be executed.

Objectives

This course provides a thorough grounding in the methods, techniques, and algorithms of machine learning. By the end of this course, students should be able to describe the main concepts underlying machine learning, including, for instance:

  • (a) what learning is
  • (b) how a machine can learn
  • (c) what kinds of problems can be solved with machine learning approaches
  • (d) how to formalize them as machine learning problems
  • (e) how to compare and evaluate the performance of different machine learning approaches, and
  • (f) how to apply machine learning methods to different use cases, using Python, Pandas, and scikit-learn, among others
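
As a rough illustration of objectives (d) to (f), the sketch below frames a task as a supervised classification problem and compares two model families by cross-validated accuracy using scikit-learn. The synthetic dataset, the choice of models, and all parameter values are assumptions made for this example only; they are not taken from the course material.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Formalize the task: a feature matrix X and binary labels y.
# Synthetic data stands in for a real use case (assumption).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Two candidate approaches to compare (illustrative choices).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# Evaluate each model with 5-fold cross-validation and keep the mean accuracy.
scores = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    scores[name] = fold_scores.mean()
    print(f"{name}: mean accuracy = {scores[name]:.3f}")
```

Using the same folds and the same metric for every model is what makes the comparison fair; other metrics such as precision or recall can be requested through the `scoring` argument.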

Textbooks

Grading scheme

  • A practical session carried out in groups of two students.
  • One individual project, used for individual evaluation at the end of the course. It must address an IoT use case.

Lectures

  1. Machine Learning

    • Introduction to machine learning
    • Computational foundations:
      • Using Python, Anaconda, Jupyter Notebooks
      • Scientific Computing with NumPy, SciPy, and Matplotlib
      • Exploratory data analysis (EDA), data processing, and machine learning with scikit-learn
      • Reproducible machine learning pipeline with Docker
  2. Regression: predicting house prices

    • Introduction
    • Regression, gradient descent
    • Assessing performance, error types, and bias/variance trade-off
    • Overfitting, regularized regression, ridge regression
    • Lasso regression, cross-validation
  3. Classification: sentiment analysis

    • Introduction
    • Logistic regression
    • Tree-based methods:
      • decision trees
      • overfitting in decision trees
    • Precision, recall, and ROC curve
    • Ensemble methods:
      • Boosting
      • Bagging
      • Random Forests
  4. Clustering and similarity: retrieving documents

    • Introduction to clustering
    • kNN methods for classification and regression
    • k-means
    • Hierarchical clustering
  5. Dimensionality reduction

    • Feature selection and extraction
    • Principal component analysis (PCA)
  6. Recommender systems: recommending products

    • Introduction to recommender systems
    • Performance metrics
  7. Neural networks

    • Perceptron
    • Multilayer Perceptron
    • Support Vector Machines (SVM)
  8. Deep learning: image classification

    • Introduction
    • Single and multilayer networks
    • Convolutional neural network (CNN)

Resources