This project predicts whether a message is Spam or Not Spam using a Long Short-Term Memory (LSTM) network.
The predictions are served through an API which is run in a Docker container.
The dataset for this project is taken from Kaggle. Download it and place it in the "src/main/data/" directory.
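As a quick sanity check after downloading, you can load the CSV with pandas. This is only a sketch: the column names ("v1" for the label, "v2" for the text) are assumptions based on the common Kaggle SMS spam dataset layout and may differ in your copy.

```python
# Sketch: load and inspect the downloaded dataset.
# Column names "v1"/"v2" are assumptions; adjust to match your file's headers.
import pandas as pd

df = pd.read_csv("src/main/data/spam.csv", encoding="latin-1")
df = df[["v1", "v2"]].rename(columns={"v1": "label", "v2": "message"})
print(df.head())
print(df["label"].value_counts())  # e.g. counts of "ham" vs "spam"
```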
This project starts a server where the API runs.
To start, move to the docker directory by running
cd docker/
After that, run the following commands. First, build the Docker image using Docker Compose:
docker compose build
Then, run the application inside the Docker container:
docker compose up
After that, the API will be available at http://0.0.0.0:8000/predict
In the form, you can type a message and it will predict whether the message is spam or not.
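You can also call the endpoint programmatically instead of using the form. The sketch below is hypothetical: the JSON field name "message" and the response shape are assumptions, so check the API code for the actual request schema.

```python
# Sketch: send a message to the prediction endpoint.
# The request/response field names are assumptions, not the confirmed schema.
import requests

resp = requests.post(
    "http://0.0.0.0:8000/predict",
    json={"message": "Congratulations! You won a free prize, click here."},
)
resp.raise_for_status()
print(resp.json())  # e.g. {"prediction": "spam"}
```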
The LSTM model is trained using the train.py file located at "src/main/train.py".
Additionally, if the MLflow server is up and running, the metrics and hyperparameters are tracked and saved.
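For reference, this is roughly what MLflow tracking looks like. It is a minimal sketch: the tracking URI, experiment name, and parameter/metric names are illustrative assumptions, not the exact values used in train.py.

```python
# Sketch: track hyperparameters and metrics with MLflow.
# URI, experiment name, and logged names/values are placeholders.
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")  # assumed MLflow server address
mlflow.set_experiment("spam-lstm")

with mlflow.start_run():
    mlflow.log_param("epochs", 10)
    mlflow.log_param("batch_size", 32)
    # placeholder value; in train.py this would come from model evaluation
    mlflow.log_metric("val_accuracy", 0.0)
```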
To train the model, follow these steps:
cd src
python3 main/train.py --source main/data/spam.csv --pipe_path main/trained_pipe --model_path main/trained_models
The "source", "pipe_path", and "model_path" are mandatory arguments to train the model. There are additional optional arguments to specify the hyperparameters to be used. These can be found in the "train.py" file.
Install the required Python libraries from the "requirements.txt" file.
- Containerize the application ☑
- Set up CI/CD pipeline ☑
- Deploy on Heroku