Chest_Xray_normal_pneumonia_classification

The purpose of this project is to detect pneumonia (bacterial or viral) from chest X-ray images using classification methods. We developed it as part of a Machine Learning course during our studies.

We have created the following models:

Convolutional Neural Network (CNN)
Backpropagation algorithm
K-Nearest Neighbors (KNN)
Recurrent Neural Networks (RNN)
AdaBoost
SVM
Random Forest

The main goal was to better understand how the models work, not to achieve high accuracy. The models were developed in Jupyter Notebook.

Dataset

The dataset was taken from: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

The dataset contains 5,000+ JPEG images, split into three sets:

Train
Test
Validation

Each set contains two classes of images:

Normal - depicts clear lungs
Pneumonia - depicts pneumonia from viral pathogens or bacterial pathogens
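
To check the split and class sizes described above, a small helper like the following can count the images in each folder. This is a minimal sketch: the folder names `train`, `val`, `test`, `NORMAL` and `PNEUMONIA` and the `.jpeg` extension are assumptions about how the downloaded dataset is laid out on disk, and `count_images` is a hypothetical helper, not code from the notebooks.

```python
from pathlib import Path


def count_images(root, splits=("train", "val", "test"),
                 classes=("NORMAL", "PNEUMONIA")):
    """Return {split: {class: number of .jpeg files}} under `root`.

    The split and class folder names are assumptions about the layout.
    """
    counts = {}
    for split in splits:
        counts[split] = {}
        for cls in classes:
            folder = Path(root) / split / cls
            # A missing folder is reported as zero images.
            if folder.is_dir():
                counts[split][cls] = sum(1 for _ in folder.glob("*.jpeg"))
            else:
                counts[split][cls] = 0
    return counts
```

Calling `count_images("chest_xray")` (with a hypothetical root folder name) returns a nested dictionary that makes the NORMAL/PNEUMONIA ratio of each set easy to inspect.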

Imaging examples:

The normal chest X-ray (left panel) depicts clear lungs without any areas of abnormal opacification in the image. Bacterial pneumonia (middle) typically exhibits a focal lobar consolidation, in this case in the right upper lobe (white arrows), whereas viral pneumonia (right) manifests with a more diffuse "interstitial" pattern in both lungs.

Results:

CNN:

Backpropagation:

KNN:

RNN:

AdaBoost:

SVM:

Random Forest:

Main challenges we faced:

The biggest challenge in this project was the imbalance of the dataset.

The number of X-ray images for NORMAL and PNEUMONIA cases was not split 50%/50% in the training and test sets.
The class ratio is not consistent across the sets. The NORMAL/PNEUMONIA ratio was around 1:3 in the training set, so we rebalanced it to 1:1. The ratio is 1:1 in the validation set and around 1:1.67 in the test set.
The dataset is relatively small, which may lead to overfitting and low prediction accuracy on the test set.
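
One possible way to bring a 1:3 training set to 1:1 is random undersampling of the larger class. The sketch below illustrates that idea only: it is an assumption about how the rebalancing could be done, not necessarily the method used in the notebooks, and `undersample` is a hypothetical helper.

```python
import random


def undersample(samples, labels, seed=0):
    """Randomly drop samples so every class is as large as the rarest one."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is cut down to the size of the smallest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))

    # Shuffle so the classes are not grouped together in the output.
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [y for _, y in balanced]
```

Undersampling discards data, which matters for a dataset that is already small. Oversampling the NORMAL class or using class weights are alternatives with the opposite trade-off.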
We fit the data with five different algorithms. The first is a CNN (convolutional neural network), which handles images well because a kernel slides across the image and shares its weights at every position. This algorithm gave the highest accuracy, so it was the best model.
The second is backpropagation: a fully connected neural network that improves its weights from its errors (it decreases or increases each weight according to the gradient of the error), so it can also learn the data well.
The third is the AdaBoost classifier. For this method we reworked the data, using the values (1, -1), so the algorithm could handle it better (this algorithm does not use a neural network).
The fourth is KNN (K-Nearest Neighbors), which takes an image, finds its 3 closest neighbors, and assigns it the most common label among them.
The last is an RNN (recurrent neural network). We did our best with this algorithm, but it could not handle the data well.
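
The KNN step with 3 neighbors can be sketched in a few lines of plain Python. This is an illustration under assumptions: Euclidean distance and a majority vote are the common choices but are not confirmed by the notebooks, and `knn_predict` is a hypothetical helper operating on already flattened image vectors.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    # Keep the labels of the k closest training vectors.
    nearest = [label for _, label in distances[:k]]
    # The most frequent label among the neighbors is the prediction.
    return Counter(nearest).most_common(1)[0][0]
```

With two classes, an odd `k` such as 3 avoids tied votes.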
It was challenging to find the learning rate that gives the best results. With some values we got high accuracy on the training data but much lower accuracy on the test data (overfitting), so we searched for a better value.
Finding the best normalized image size (150, 150), so that we do not lose features that are important for classification.
Choosing a number of hidden layers for backpropagation that can learn the data without overfitting.
The Random Forest model builds trees and makes decisions based on features. Because we are working with images, we do not have explicit features, so we resized each image to 150x150 and, since it is a two-dimensional array, flattened it into a one-dimensional array of size 22500. This let us feed the pixels to the model as its features.
The SVM model accepts a vector, but we have images, so we turned each image into a vector of pixels and trained the model on that vector. It was difficult to decide whether to use the raw pixel values, divide them by 255 (giving values between 0 and 1), or use a histogram of the pixel values.
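
The three input representations discussed for Random Forest and SVM can be compared with a short sketch. It assumes a grayscale image stored as a list of rows with integer pixel values from 0 to 255. The helper names and the choice of 16 histogram bins are illustrative assumptions, not taken from the notebooks.

```python
def flatten(image):
    """Turn a 2-D image (list of rows) into a 1-D list of pixels."""
    return [pixel for row in image for pixel in row]


def scale(vector):
    """Map raw 0..255 pixel values to the range 0..1."""
    return [pixel / 255.0 for pixel in vector]


def histogram(vector, bins=16):
    """Count how many pixels fall into each of `bins` intensity ranges."""
    width = 256 // bins
    counts = [0] * bins
    for pixel in vector:
        # Clamp the index so the brightest pixels land in the last bin.
        counts[min(pixel // width, bins - 1)] += 1
    return counts
```

A 150x150 image flattens to 150 * 150 = 22500 values. The histogram is much shorter (one value per bin) but discards where in the image each intensity occurs, which is the trade-off behind the decision described above.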

NadeemJazmawe/Chest_Xray_normal_pneumonia_classification
