Authors: Abheek Basu, Christine Huang, Ege Sagduyu
Final Project for STAT 571: Modern Data Mining
Abstract: In this paper, we analyze soccer data from 1888 to 2014 in order to discover trends and develop a match outcome classifier. We attempt to answer two main questions: how did soccer’s evolution as a sport impact scoring patterns, and would a classifier based on the Elo rating system produce reliable, above-average predictions?
Most of our statistical analyses are contained in the associated R files, whereas textcleaning.py contains the clean-up, parsing, and graphing.
If you are interested in our process, please check out the PDF file.
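
For readers unfamiliar with the Elo rating system mentioned in the abstract, the following is a minimal Python sketch of the standard Elo formulas and of one simple way to turn ratings into a match prediction. It is not the code from this project: the function names, the K-factor of 20, and the draw margin of 0.05 are all illustrative assumptions, and the classifier in the R files may differ in every detail.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for team A against team B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=20.0):
    """Return both teams' new ratings after one match.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for an A loss.
    k=20.0 is an assumed K-factor, not a value taken from the project.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def predict_outcome(rating_a, rating_b, draw_margin=0.05):
    """Hypothetical classifier: predict 'A', 'B', or 'draw' from ratings alone.

    draw_margin is an assumed threshold around 0.5 inside which a draw is predicted.
    """
    p = expected_score(rating_a, rating_b)
    if p > 0.5 + draw_margin:
        return "A"
    if p < 0.5 - draw_margin:
        return "B"
    return "draw"
```

For example, two teams both rated 1500 each have an expected score of 0.5, so a win moves the winner to 1510 and the loser to 1490 under the assumed K-factor.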