1.2 ML vs Rule-Based Systems

Notes

The difference between ML and Rule-Based systems is explained with the example of a spam filter.

Traditional rule-based systems rely on a hand-written set of characteristics (keywords, email length, etc.) that identify an email as spam or not. Because spam emails keep changing over time, the rules have to be updated continually. As the system grows, the code becomes too complex to maintain, which makes the approach intractable.
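As a rough illustration of what such a rule-based filter might look like, here is a minimal sketch. The keywords, the length cutoff, and the function name `is_spam_rule_based` are hypothetical and chosen only for this example.

```python
# Hypothetical hand-written rules; the keywords and the length cutoff are
# illustrative assumptions, not values taken from the course.
SPAM_KEYWORDS = ("free money", "winner", "click here")
MIN_BODY_LENGTH = 20


def is_spam_rule_based(subject: str, body: str) -> bool:
    """Return True if any hand-written rule flags the email as spam."""
    text = f"{subject} {body}".lower()

    # Rule 1: the email contains a known spam keyword.
    if any(keyword in text for keyword in SPAM_KEYWORDS):
        return True

    # Rule 2: the body is suspiciously short.
    if len(body) < MIN_BODY_LENGTH:
        return True

    return False


print(is_spam_rule_based("You are a WINNER", "Claim your prize today, it is waiting for you."))
print(is_spam_rule_based("Meeting notes", "Here are the notes from today's project meeting."))
```

Every new spam pattern means another keyword or another `if` branch, which is how the maintenance burden described above builds up.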

ML can be used to solve this problem with the following steps:

1. Get data

Emails from the user's spam folder and inbox give examples of spam and non-spam.

2. Define and calculate features

Rules/characteristics from rule-based systems can be used as a starting point to define features for the ML model. The value of the target variable for each email can be defined based on where the email was obtained from (spam folder or inbox).

Each email can be encoded (converted) to the values of its features and target.
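The encoding step could be sketched roughly as follows. The four features and the helper names `extract_features` and `encode_email` are assumptions made for illustration; a real feature set would be derived from the actual rules and data.

```python
def extract_features(subject: str, body: str) -> list[int]:
    """Turn one email into a list of binary features (hypothetical choices)."""
    text = f"{subject} {body}".lower()
    return [
        int(len(subject) < 10),   # subject shorter than 10 characters
        int(len(body) > 1000),    # body longer than 1000 characters
        int("deposit" in text),   # mentions "deposit"
        int("free" in text),      # mentions "free"
    ]


def encode_email(subject: str, body: str, folder: str) -> tuple[list[int], int]:
    """Encode an email as (features, target); target is 1 for the spam folder."""
    target = 1 if folder == "spam" else 0
    return extract_features(subject, body), target


features, target = encode_email(
    "Hi", "Please send a deposit to claim your free gift", "spam"
)
print(features, target)  # [1, 0, 1, 1] 1
```

Each former rule becomes one column of the feature vector, and the folder the email came from supplies the target.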

3. Train and use the model

A machine learning algorithm can then be applied to the encoded emails to build a model that predicts whether a new email is spam. The model outputs a probability that the email is spam, so making a decision requires choosing a threshold (for example 0.5): emails with a probability at or above the threshold are classified as spam, the rest as not spam.
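To make the train-then-threshold idea concrete, here is a minimal sketch that uses logistic regression trained with plain gradient descent. The notes do not name a specific algorithm, so logistic regression is an assumption here, as are the toy dataset (two hypothetical binary features), the hyperparameters, and all function names.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features) -> float:
    """Predicted probability that an encoded email is spam."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic_regression(X, y, learning_rate=0.5, epochs=1000):
    """Fit weights and bias with full-batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, target in zip(X, y):
            error = predict_proba(weights, bias, features) - target
            for j, value in enumerate(features):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def is_spam(weights, bias, features, threshold=0.5) -> bool:
    """Turn the predicted probability into a decision using a threshold."""
    return predict_proba(weights, bias, features) >= threshold


# Toy training set: each row is [mentions_deposit, mentions_free] (hypothetical).
X_train = [[1, 1], [1, 0], [0, 1], [0, 0], [0, 0], [0, 0]]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = spam folder, 0 = inbox

weights, bias = train_logistic_regression(X_train, y_train)

print(is_spam(weights, bias, [1, 1]))  # True
print(is_spam(weights, bias, [0, 0]))  # False
```

Raising the threshold makes the filter more conservative about flagging spam, while lowering it catches more spam at the cost of more false alarms. In practice a library implementation would normally replace the hand-written training loop.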

⚠️ The notes are written by the community. If you see an error here, please create a PR with a fix.

Notes from Peter Ernicke

Navigation

Machine Learning Zoomcamp course
Lesson 1: Introduction to Machine Learning
Previous: Introduction to Machine Learning
Next: Supervised Machine Learning
