README

This data mining repository contains the coding assignments designed by Dr. Yao-Yi Chiang for the course DSCI553 (INF553). All of the code emphasizes using Spark to handle massive data.

The professor allowed students to post their code on GitHub; this code will be used for plagiarism checking in the future. Therefore, anyone currently taking this course is strongly discouraged from viewing the code.


The repository consists of six parts:

  • A warm-up for Spark
  • SON algorithm for Frequent Itemsets
  • Recommender System
  • Girvan-Newman algorithm for Community Detection
  • BFR algorithm for Clustering
  • Streaming