We implemented a Bayesian spam filter in Python.
We first built the vanilla Bayesian filter, trained on the data provided, and then extended it in three directions:
- Stemming using Porter's algorithm
- Cosine normalisation of feature weights
- Feature selection using WEKA
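
As a rough illustration of the baseline and of one extension, the sketch below shows a minimal multinomial naive Bayes filter with add-one smoothing, plus a standalone helper for cosine normalisation. It is not our actual code: the names `tokenize`, `cosine_normalise` and `NaiveBayesFilter`, the whitespace tokeniser, the smoothing scheme and the `"spam"`/`"ham"` labels are all assumptions made for the example. Stemming and the WEKA feature selection are left out, and the normalisation helper is not wired into the classifier.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenisation. A Porter stemmer
    # could be applied to each token at this point.
    return text.lower().split()


def cosine_normalise(weights):
    # Scale a {feature: weight} mapping to unit Euclidean length.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0:
        return dict(weights)
    return {f: w / norm for f, w in weights.items()}


class NaiveBayesFilter:
    LABELS = ("spam", "ham")  # hypothetical label names

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_score(self, text, label):
        # log P(label) + sum of log P(token | label), with add-one smoothing.
        # Assumes every label has at least one training document.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for token in tokenize(text):
            score += math.log((counts[token] + 1) / denom)
        return score

    def classify(self, text):
        # Pick the label with the highest log posterior (up to a constant).
        return max(self.LABELS, key=lambda label: self.log_score(text, label))
```

A usage example on made-up messages:

```python
spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
print(spam_filter.classify("buy cheap pills"))
```

Working in log space avoids floating-point underflow when many small token probabilities are multiplied together.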