Histopathologic Cancer Detection: detecting cancer in histopathologic scans of lymph node sections.
This project uses the dataset from here. For performance evaluation, I split the train folder 80/20 into training and validation sets.
The model reaches over 90% accuracy on the validation set.
If you want to make a Kaggle submission, you can add the remaining 20% (the validation set) back to the training set and retrain, which may improve the results. As the graph below shows, both bias and variance are relatively high, so a more complex model could also help.
Validation is the blue line, training is the orange one.
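
The 80/20 split described above can be sketched as follows. This is a minimal illustration, not the code this project uses: the function name `split_ids`, the fixed seed, and the sample ids are all assumptions made for the example. In practice the ids would come from train_labels.csv.

```python
import random


def split_ids(ids, val_fraction=0.2, seed=42):
    """Shuffle ids reproducibly and return (train_ids, val_ids).

    Hypothetical helper: val_fraction=0.2 gives the 80/20 split.
    """
    shuffled = list(ids)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Made-up ids standing in for the image ids in train_labels.csv.
sample_ids = [f"img_{i:03d}" for i in range(100)]
train_ids, val_ids = split_ids(sample_ids)
print(len(train_ids), len(val_ids))
```

Fixing the seed keeps the validation set identical between runs, so accuracy figures from different training runs stay comparable.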
Instructions:
- Install the dependencies listed in official/requirements.txt
- export PYTHONPATH="$PYTHONPATH:/path/to/HistopathologicCancerDetection"
- Download the dataset and extract train.zip under official/histopathicC/train/ and train_labels.csv.zip under official/histopathicC/
- python3 convertTiftoJpeg.py to convert images from .tif to .jpeg format (change the path inside the file)
- python3 createDataset.py to generate tfrecords
- python3 Simple_model.py to train the model
- tensorboard --logdir="path/to/Simple_model_model" to view graphs
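
For reference, the .tif to .jpeg conversion step could look roughly like the sketch below. It assumes the Pillow library is installed and is not necessarily what convertTiftoJpeg.py does; the function name `convert_tif_to_jpeg`, its arguments, and the JPEG quality setting are hypothetical.

```python
from pathlib import Path

from PIL import Image  # assumption: Pillow is available


def convert_tif_to_jpeg(src_dir, dst_dir, quality=95):
    """Convert every .tif file in src_dir to a .jpeg file in dst_dir.

    Hypothetical helper; returns the number of images converted.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    converted = 0
    for tif_path in sorted(src_dir.glob("*.tif")):
        with Image.open(tif_path) as img:
            # JPEG has no alpha channel, so force plain RGB before saving.
            rgb = img.convert("RGB")
            rgb.save(dst_dir / (tif_path.stem + ".jpeg"), "JPEG", quality=quality)
        converted += 1
    return converted
```

A call such as `convert_tif_to_jpeg("official/histopathicC/train", "path/to/output")` would write one .jpeg per input .tif; the output path here is a placeholder. JPEG is lossy, so a high quality value limits the compression artifacts introduced before training.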