With the recent revolution in big data, data science, and deep learning, enterprises ranging from Fortune 100 companies to startups across the world are hungry for data scientists and machine learning scientists who can extract actionable insight from the vast amounts of data they collect. Over the past couple of years, deep learning has gained traction in many application areas and has become an essential tool in the data scientist’s toolbox. In this course, students will develop a clear understanding of the big data cloud platform, technical skills in data science and machine learning, and especially the motivation for and use cases of deep learning through hands-on exercises. We will also cover the “art” of data science and machine learning, guiding participants through a typical agile data science project flow, common pitfalls in data science and machine learning, and the soft skills needed to communicate effectively with business stakeholders. This course will prepare statisticians to be successful data scientists and deep learning scientists in various industries and business sectors.
The big data platform, data science, and deep learning overviews are specifically designed for an audience with a statistics education background. The data science workflow, pitfalls, and soft skills are highlighted through real-world data science and machine learning problems. The Databricks Community Edition cloud platform will be used throughout the training course for hands-on sessions covering:
(1) Big data platform using Spark through the R sparklyr package;
(2) Introduction to Deep Neural Networks, Convolutional Neural Networks, and Recurrent Neural Networks and their applications;
(3) Deep learning examples using TensorFlow through the R keras package.
The primary audiences for this course are:
(1) Statisticians in traditional industry sectors such as manufacturing, pharmaceuticals, and banking;
(2) Statisticians in government agencies;
(3) Statistical researchers in universities;
(4) Graduate students in statistics departments.
The prerequisites are an MS-level education in statistics and entry-level knowledge of R. No software needs to be installed on students’ laptops; the cloud platform is accessed through a browser such as Chrome or Firefox with an internet connection.
Topic | Time |
---|---|
Introduction to Data Science | 08:40 - 09:10 |
Deep Learning 1 | 09:10 - 10:10 |
Tea Break | 10:10 - 10:30 |
Deep Learning 2 & 3 | 10:30 - 12:00 |
Lunch | 12:00 - 13:00 |
Big Data Cloud Platform and Hands-on | 13:00 - 13:45 |
Deep Learning 1 Hands-on | 13:45 - 14:30 |
Tea Break | 14:30 - 14:50 |
Deep Learning 2 & 3 Hands-on | 14:50 - 15:50 |
Soft Skills and Project Cycle | 15:50 - 16:15 |
Q&A | 16:15 - 16:30 |
- Course homepage: https://course2019.netlify.com/
- Databricks free Community Edition account
- Perceptron notebook
- Adaline notebook
- Feedforward neural network notebook
- Convolutional neural network notebook
- Recurrent neural network notebook
- Big Data Platform notebook
- Data preprocessing notebook
- Data wrangling notebook
- Industry recommendations for academic data science programs
- Deep Learning with R, François Chollet with J. J. Allaire, ISBN-13: 978-1617295546 (2018)
- Python Machine Learning, Sebastian Raschka, ISBN-13: 978-1787125933 (2018)
- https://keras.rstudio.com/
- http://spark.rstudio.com/
- https://databricks.com/spark/about
- https://github.com/onnx/onnx