UBC-MDS/525_group_24

Machine Learning Models in the Cloud to Predict Daily Rainfall in Australia

DSCI 525 - Group 24 @ University of British Columbia, March 30, 2021

About

In this project, we will build and deploy ensemble machine learning models in the cloud to predict daily rainfall in Australia from a large dataset (~6 GB) (data source is here). The features are the outputs of different climate models, and the target is the observed rainfall. The purpose of the project is to gain exposure to working with larger datasets and to achieve the learning objectives of each of the following four milestones:

Milestone 1: Get the data from the web using an API, process it, and convert it to an efficient file format.
Milestone 2: Move the data to the cloud, set up the cloud infrastructure, and build a machine learning model.
Milestone 3: Set up distributed infrastructure using Spark and run the machine learning model on Spark.
Milestone 4: Deploy the machine learning model in the cloud so that other consumers can use it.
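To make the modelling setup concrete, here is a minimal sketch of the feature/target framing described above: each feature is one climate model's output, and the target is the observed rainfall. It is not the project's model. It only averages the climate model outputs per day as a naive ensemble baseline and scores it with RMSE. The numbers, the helper names `rmse` and `mean_ensemble`, and the choice of RMSE are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return sqrt(mean((p - o) ** 2 for p, o in zip(predicted, observed)))


def mean_ensemble(rows):
    """Average the climate model outputs for each day (one row per day)."""
    return [mean(row) for row in rows]


# Toy values in mm/day, invented for illustration only (not project data).
# Each row is one day; each column is one hypothetical climate model's output.
features = [
    [0.0, 1.0, 2.0],
    [4.0, 6.0, 5.0],
    [10.0, 8.0, 12.0],
]
observed = [1.0, 5.0, 9.0]  # hypothetical observed rainfall for the same days

ensemble = mean_ensemble(features)
print("ensemble prediction:", ensemble)
print("ensemble RMSE:", rmse(ensemble, observed))
```

A baseline like this gives a reference error that a trained ensemble model should improve on. On the full dataset the same idea would need a dataframe or Spark implementation rather than plain Python lists.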

Report

Milestone 1: A summary of observations and a discussion of the challenges encountered are documented in this notebook_1.

Milestone 2: A summary of moving the data to the cloud and wrangling it for machine learning is documented in this notebook_2.

Milestone 3: A summary of the machine learning model building results is documented here.

Milestone 4: A summary of the API deployment results is documented here.

License

The analysis materials for “Machine Learning Models in the Cloud to Predict Daily Rainfall in Australia” are licensed under the MIT License (Copyright (c) 2020 Master of Data Science at the University of British Columbia). If you re-use or re-mix the analysis or the materials used in this project, please provide attribution and a link to this repository.

The data used to create the “Daily Rainfall over NSW, Australia” data set are freely available under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence.

Contributors

| Contributor Name | GitHub Username |
| --- | --- |
| Huanhuan Li | huan-ds |
| Nash Makhija | nashmakh |
| Nicholas Wu | nichowu |
