Nneji123/Lung-Cancer-Detection

Lung Cancer Detection App built with Streamlit, FastAPI and Docker

An end-to-end Machine Learning Project to detect if someone has Lung Cancer or not. Built with FastAPI, Streamlit and Docker.

Problem Statement

Lung cancer is a type of cancer that begins in the lungs and most often occurs in people who smoke. Two major types of lung cancer are non-small cell lung cancer and small cell lung cancer. Causes of lung cancer include smoking, second-hand smoke, exposure to certain toxins and family history. Symptoms include a cough (often with blood), chest pain, wheezing and weight loss. These symptoms often don't appear until the cancer is advanced. Treatments vary but may include surgery, chemotherapy, radiation therapy, targeted drug therapy and immunotherapy.

This Streamlit app uses a Machine Learning API built with FastAPI to predict lung cancer in patients from criteria such as age, gender, blood pressure, smoking, coughing, allergies and fatigue.

The machine learning model used for this app was deployed as an API using the FastAPI framework and then accessed through a frontend interface with Streamlit.
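
The way a frontend talks to such an API can be sketched as a plain HTTP call. The snippet below is a minimal sketch using only the Python standard library. The `/predict` path, the base URL and the field names in the payload are assumptions made for illustration; they are not taken from this repository, so check the API documentation for the real schema.

```python
import json
import urllib.request


def build_prediction_request(base_url: str, features: dict) -> urllib.request.Request:
    """Build a JSON POST request for a hypothetical /predict endpoint."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",  # assumed path, not from the repo
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, features: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response."""
    request = build_prediction_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical payload; the real field names are defined by the API schema.
example_features = {"age": 60, "gender": 1, "smoking": 1, "coughing": 1, "fatigue": 0}
request = build_prediction_request("http://localhost:8000", example_features)
```

Calling `predict("http://localhost:8000", example_features)` would send the request, provided the API is running and exposes a matching endpoint.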

The App can be viewed through this link

The API and its documentation can be viewed here or here

Data Preparation

The original dataset contains 16 columns and 310 rows. The "GENDER" and "LUNG_CANCER" columns hold object (string) data, while the remaining columns are integers.

The data was then cleaned and processed for modelling by changing the following:

  • The values "M" and "F" in the "GENDER" column were converted to 1 and 0 respectively.
  • The values "YES" and "NO" in the "LUNG_CANCER" column were converted to 1 and 0 respectively.
  • The values "2" and "1" in the rest of the columns were converted to 1 and 0 respectively.

The processed dataset was then saved as processed_lung_cancer.csv
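
The three conversions listed above can be sketched without any third-party dependency. This is a minimal sketch, not the project's actual preprocessing code: the sample rows are invented, the column names other than GENDER and LUNG_CANCER are assumptions, and AGE is assumed to be a plain integer that is left unchanged.

```python
import csv
import io

# Tiny made-up sample in the shape described above (not real project data).
# Column names other than GENDER and LUNG_CANCER are assumptions.
RAW_CSV = """GENDER,AGE,SMOKING,COUGHING,LUNG_CANCER
M,69,1,2,YES
F,59,2,1,NO
"""

# Columns that are not 1/2-coded; AGE is assumed to be a plain integer.
NON_BINARY = {"GENDER", "AGE", "LUNG_CANCER"}


def preprocess_row(row: dict) -> dict:
    """Apply the three conversions to one CSV row."""
    out = {}
    for column, value in row.items():
        if column == "GENDER":
            out[column] = 1 if value == "M" else 0
        elif column == "LUNG_CANCER":
            out[column] = 1 if value == "YES" else 0
        elif column in NON_BINARY:
            out[column] = int(value)  # kept as-is, only parsed
        else:
            out[column] = 1 if int(value) == 2 else 0  # 2 -> 1, 1 -> 0
    return out


rows = [preprocess_row(r) for r in csv.DictReader(io.StringIO(RAW_CSV))]
```

To process a real file, the same function could be applied to `csv.DictReader(open(path, newline=""))` and the result written back out with `csv.DictWriter`.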

Original Dataset

Processed Dataset

Modelling

In this project I tested six classification algorithms:

  • Logistic Regression
  • Decision Tree
  • Random Forest
  • XGBoost
  • Gradient Boosting Classifier
  • Support Vector Classifier

The final model used for the API was the Gradient Boosting Classifier, which had an accuracy score of 0.94.
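
For reference, accuracy is the fraction of predictions that match the true labels. The sketch below assumes candidate models were ranked by that metric, which the text implies but does not state. The model names, labels and predictions are toy values invented for illustration and are not results from this project.

```python
def accuracy(y_true: list, y_pred: list) -> float:
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels and predictions, invented for illustration only.
y_true = [1, 1, 0, 1, 0, 1, 1, 0]
predictions = {
    "model_a": [1, 1, 0, 1, 0, 1, 0, 0],
    "model_b": [1, 0, 0, 1, 1, 1, 0, 0],
}

# Score every candidate and keep the name of the highest-scoring one.
scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
best_model = max(scores, key=scores.get)
```

On a small dataset, a single accuracy figure depends noticeably on how the train/test split was drawn, so it is worth reading such a score alongside the split size.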

Preview

API Demo

fastapi

Streamlit App Demo

streamlit

How to run API and Streamlit App on Google Colab:

💻 Running the API on Google Colab

Google Colab is the easiest way to demo or test the API. To run it there, do the following:

  1. Clone the repository.
  2. Open a Google Colab instance and upload the Lung Cancer Prediction.ipynb file to that instance.
  3. Run every cell through to the last one. The final cell should output a link containing "ngrok", where you can view the API.

💻 Running the Streamlit App on Google Colab

The Streamlit App can also be viewed using Google Colab by doing the following:

  1. Upload the "streamlit_app.py" and "requirements.txt" files to your instance on Google Colab.
  2. Install the requirements by running:
!pip install -r requirements.txt
  3. Install pyngrok in your instance:
!pip install pyngrok
  4. Run the following code in your instance:
from pyngrok import ngrok
public_url = ngrok.connect("8501")
public_url
  5. You can then view the Streamlit app from your Google Colab instance by running:
!streamlit run /content/streamlit_app.py & npx localtunnel --port 8501

Running on Local Machine 💻

Since multiple containers communicate with each other, I created a bridge network called AIservice. For testing, a docker-compose.yml file is included so that the API and the Streamlit app can run simultaneously as Docker containers. To run the API and the Streamlit app on your local machine, do the following:

  1. Clone the repository to your local machine
  2. Install docker and docker-compose if you haven't
  3. Open a bash/cmd in the directory and run:
docker network create AIservice
  4. Then run this command
docker-compose up -d --build
  5. After the above steps have been carried out you can now view the documentation of the API and also the Streamlit app.

To visit the FastAPI documentation go to http://localhost:8000 with a web browser.

To visit the Streamlit UI, visit http://localhost:8501.
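
To confirm that both containers are reachable before opening a browser, a small check like the one below may help. It is a sketch using only the standard library; the helper names `is_up` and `check_services` are hypothetical, and the two URLs are the ones given above.

```python
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if anything answers HTTP at the URL, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return False


SERVICES = {
    "FastAPI": "http://localhost:8000",
    "Streamlit": "http://localhost:8501",
}


def check_services() -> dict:
    """Map each service name to whether its URL currently responds."""
    return {name: is_up(url) for name, url in SERVICES.items()}
```

Running `print(check_services())` after `docker-compose up` shows which of the two services responds. A `False` entry usually means the container is still starting or has exited.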

Logs can be inspected via:

docker-compose logs

The docker-compose method can also be used to deploy the API and Streamlit app on Heroku (using Dockhero, which is not free) or on cloud services such as Microsoft Azure, Amazon Web Services or Google Cloud Platform.

Deployment

The API has been deployed on Heroku using the Dockerfile.

💻 Deploying the API

Assuming you have git and the Heroku CLI installed, carry out the following steps:

  1. Clone the repository
git clone https://github.com/Nneji123/Lung-Cancer-Prediction.git
  2. Change the working directory
cd Lung-Cancer-Prediction
  3. Create the heroku app
heroku create your-app-name

Replace your-app-name with the name of your choosing.

  4. Set the heroku cli git remote to that app
heroku git:remote -a your-app-name
  5. Set the heroku stack setting to container
heroku stack:set container
  6. Push to heroku
git push heroku main

💻 Deploying the Streamlit App to Streamlit Cloud

The Streamlit app was deployed on Streamlit Cloud and accesses the API deployed on Heroku. To deploy the app on Streamlit Cloud, do the following:

  1. Fork this repository to your GitHub account.
  2. Create a Streamlit account and then navigate to https://streamlit.io/cloud
  3. Create a new app, choose the repository you forked and the "streamlit_app.py" file, and then click deploy.

After the app has been built on the cloud you should then be able to view your app right away!