Merge pull request #6 from owczr/develop
 Updates and small improvements
owczr authored Mar 12, 2024
2 parents eafad6f + 84a5fb1 commit 0941d71
Showing 47 changed files with 2,251 additions and 238 deletions.
3 changes: 2 additions & 1 deletion .amlignore
@@ -6,4 +6,5 @@ notebooks/
docs/
.pytest_cache/
.github/

logs/
*.log
3 changes: 2 additions & 1 deletion .gitignore
@@ -4,4 +4,5 @@ test/
LIDC-IDRI/
.vscode/
__pycache__/
.env
.env
logs/
47 changes: 39 additions & 8 deletions README.md
@@ -1,19 +1,50 @@
# Lung Cancer Detection

## Table of Contents
- [About](#about)
- [Usage](#usage)
- [License](#license)
## Table of Contents
1. [About](#about)
2. [Project Structure](#project-structure)
3. [Usage](#usage)
4. [License](#license)

## About
Lung Cancer Detection is a project made as part of Engineers Thesis *"Applications of artificial intellingence in oncology on computer tomography dataset"* by **Jakub Owczarek**, under the guidance of Thesis Advisor dr. hab. inz **Mariusz Mlynarczuk** prof. AGH.
Lung Cancer Detection is a project created as part of the Engineer's Thesis *"Applications of artificial intelligence in oncology on computer tomography dataset"* by **Jakub Owczarek**, under the guidance of thesis advisor dr hab. inż. **Mariusz Mlynarczuk**, prof. AGH.
<br>

The goal of this projet is to process the [LIDC-IDRI](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=1966254) dataset and measure the performence of deep learning models pre-trained on Image Net by using transfer learning methods.
The goal of this project is to process the [LIDC-IDRI](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=1966254) dataset and evaluate the performance of deep learning models pre-trained on ImageNet by leveraging transfer learning.
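
As a rough illustration of this setup, a minimal transfer-learning sketch with `tf.keras` (illustrative only, not the project's actual builder code; the 3-channel input assumes grayscale CT slices replicated across channels):

```python
import tensorflow as tf

# minimal transfer-learning sketch: reuse ImageNet weights and
# train only a small classification head on top
base = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base.trainable = False  # freeze the pre-trained convolutional backbone

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary: nodule vs. no nodule
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```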

## Usage
## Project Structure
This repository contains the following directories; a short sketch of how the main components fit together follows the list:

TODO: Fill in how to use this project locally and on Azure ML
- *docs* - contains markdown files with more specific descriptions of the project components
- *notebooks* - contains Jupyter Notebooks that were used for experiments, analysis, visualizations, etc.
- *scripts* - this directory is the actual workhorse and contains two notable subdirectories:

- *azure* - contains scripts for Azure Virtual Machine and Azure Machine Learning
- *local* - contains scripts that were used for local development

- *src* - contains main components of the project:

- *azure* - contains utilities specific to Azure services
- *dataset* - contains `DatasetLoader` component used to feed data during model training
- *model* - contains model builder and director classes
- *preprocessing* - contains classes used for LIDC-IDRI dataset preprocessing
- *config.py* - some constants used throughout the project

- *tests* - contains a few tests for the project components
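
Based on the components listed above (and their use in `fine_tune.py` later in this diff), a hedged sketch of how they fit together; the `"mobilenet"` builder key and the dataset path are placeholders:

```python
# hedged wiring of the documented components; the builder key and dataset
# path are placeholders, the class and function names come from this repository
from src.model.director import ModelDirector
from src.dataset.dataset_loader import DatasetLoader
from src.config import BUILDERS, RANDOM_SEED

builder = BUILDERS["mobilenet"]()      # pick a builder from the registry in src/config
model = ModelDirector(builder).make()  # the director assembles the final Keras model

loader = DatasetLoader("data/processed/train")  # placeholder path
loader.set_seed(RANDOM_SEED)                    # reproducible shuffling
dataset = loader.get_dataset()                  # dataset consumed by model.fit
```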

## Usage
This project was created with Azure in mind, and the main scripts are therefore meant to be run on Azure.

![usage_img](docs/assets/usage.png)

### 1. Preprocessing
1. The first step is to download the LIDC-IDRI dataset onto an Azure Virtual Machine. The `azure/virtual_machine/download_dataset.sh` script is meant for this task.
2. Next, the dataset is preprocessed into a format suitable for supervised deep learning model training. The `azure/virtual_machine/process_dataset.py` script handles this. The same directory also contains `train_test_split.py`, which should be used to split the processed data (an illustrative sketch follows this list).
3. Finally, the preprocessed dataset can be uploaded to Azure Blob Storage with the `upload_dataset_2.sh` script. There is also an `upload_dataset.sh` script, but it doesn't use the `azcopy` utility and is too slow.
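
A hedged sketch of what the train/test split over the processed files could look like (illustrative only; the real logic lives in `train_test_split.py`, and its actual layout, seed, and ratio may differ):

```python
import os
import random
import shutil

# illustrative split of processed files into train/ and test/ subdirectories;
# the directory layout, seed, and 80/20 ratio are assumptions
random.seed(42)

processed_dir = "LIDC-IDRI/processed"  # placeholder path
files = sorted(os.listdir(processed_dir))
random.shuffle(files)

split = int(0.8 * len(files))
for subset, subset_files in (("train", files[:split]), ("test", files[split:])):
    os.makedirs(os.path.join(processed_dir, subset), exist_ok=True)
    for name in subset_files:
        shutil.move(
            os.path.join(processed_dir, name),
            os.path.join(processed_dir, subset, name),
        )
```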

### 2. Model training
1. With the preprocessed dataset in Azure Blob Storage, the Virtual Machine is no longer necessary. From this dataset an Azure Machine Learning data asset can be created and used during model training.
2. The actual model training is launched with the `run_training_job.py` script under `scripts/azure/machine_learning`. This script creates a job on AML that builds, compiles, and trains the desired model (see the sketch below).
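
A minimal sketch of what submitting such a job with the `azure-ai-ml` SDK can look like; all workspace identifiers, data asset names, the environment, and the compute target are placeholders, and `run_training_job.py` may differ in detail:

```python
# hedged sketch of creating an AML training job; all identifiers are placeholders
from azure.ai.ml import MLClient, command, Input
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

job = command(
    code=".",  # repository root with src/ and scripts/
    command="python scripts/azure/machine_learning/fine_tune.py --model mobilenet "
            "--train ${{inputs.train}} --test ${{inputs.test}}",
    inputs={
        "train": Input(type="uri_folder", path="azureml:lidc-train:1"),  # assumed data asset
        "test": Input(type="uri_folder", path="azureml:lidc-test:1"),
    },
    environment="azureml:tensorflow-env:1",  # assumed environment
    compute="gpu-cluster",                   # assumed compute target
    experiment_name="lung-cancer-detection",
)

ml_client.jobs.create_or_update(job)
```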

## License
This project is licensed under the MIT License - see the LICENSE.md file for details.
Binary file added docs/assets/usage.png
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
342 changes: 342 additions & 0 deletions notebooks/results.ipynb

Large diffs are not rendered by default.

notebooks/segmentation.ipynb
@@ -9,7 +9,7 @@
},
{
"cell_type": "code",
"execution_count": 148,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -28,9 +28,22 @@
},
{
"cell_type": "code",
"execution_count": 150,
"execution_count": 4,
"metadata": {},
"outputs": [],
"outputs": [
{
"ename": "PermissionError",
"evalue": "[Errno 13] Permission denied: '/home/student/Repositories/lung-cancer-detection/LIDC-IDRI/CT/test/LIDC-IDRI-0001/01-01-2000-NA-NA-30178/3000566.000000-NA-03192/1-040.dcm'",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mPermissionError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m/home/jakub/Repositories/lung-cancer-detection/notebooks/segmentation.ipynb Cell 4\u001b[0m line \u001b[0;36m2\n\u001b[1;32m <a href='vscode-notebook-cell:/home/jakub/Repositories/lung-cancer-detection/notebooks/segmentation.ipynb#W3sZmlsZQ%3D%3D?line=0'>1</a>\u001b[0m dicom_path \u001b[39m=\u001b[39m \u001b[39m\"\u001b[39m\u001b[39m/home/student/Repositories/lung-cancer-detection/LIDC-IDRI/CT/test/LIDC-IDRI-0001/01-01-2000-NA-NA-30178/3000566.000000-NA-03192/1-040.dcm\u001b[39m\u001b[39m\"\u001b[39m\n\u001b[0;32m----> <a href='vscode-notebook-cell:/home/jakub/Repositories/lung-cancer-detection/notebooks/segmentation.ipynb#W3sZmlsZQ%3D%3D?line=1'>2</a>\u001b[0m dcm \u001b[39m=\u001b[39m pydicom\u001b[39m.\u001b[39mdcmread(dicom_path)\n",
"File \u001b[0;32m~/.conda/envs/cancer/lib/python3.11/site-packages/pydicom/filereader.py:1002\u001b[0m, in \u001b[0;36mdcmread\u001b[0;34m(fp, defer_size, stop_before_pixels, force, specific_tags)\u001b[0m\n\u001b[1;32m 1000\u001b[0m caller_owns_file \u001b[39m=\u001b[39m \u001b[39mFalse\u001b[39;00m\n\u001b[1;32m 1001\u001b[0m logger\u001b[39m.\u001b[39mdebug(\u001b[39m\"\u001b[39m\u001b[39mReading file \u001b[39m\u001b[39m'\u001b[39m\u001b[39m{0}\u001b[39;00m\u001b[39m'\u001b[39m\u001b[39m\"\u001b[39m\u001b[39m.\u001b[39mformat(fp))\n\u001b[0;32m-> 1002\u001b[0m fp \u001b[39m=\u001b[39m \u001b[39mopen\u001b[39m(fp, \u001b[39m'\u001b[39m\u001b[39mrb\u001b[39m\u001b[39m'\u001b[39m)\n\u001b[1;32m 1003\u001b[0m \u001b[39melif\u001b[39;00m fp \u001b[39mis\u001b[39;00m \u001b[39mNone\u001b[39;00m \u001b[39mor\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mhasattr\u001b[39m(fp, \u001b[39m\"\u001b[39m\u001b[39mread\u001b[39m\u001b[39m\"\u001b[39m) \u001b[39mor\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39mhasattr\u001b[39m(fp, \u001b[39m\"\u001b[39m\u001b[39mseek\u001b[39m\u001b[39m\"\u001b[39m):\n\u001b[1;32m 1004\u001b[0m \u001b[39mraise\u001b[39;00m \u001b[39mTypeError\u001b[39;00m(\u001b[39m\"\u001b[39m\u001b[39mdcmread: Expected a file path or a file-like, \u001b[39m\u001b[39m\"\u001b[39m\n\u001b[1;32m 1005\u001b[0m \u001b[39m\"\u001b[39m\u001b[39mbut got \u001b[39m\u001b[39m\"\u001b[39m \u001b[39m+\u001b[39m \u001b[39mtype\u001b[39m(fp)\u001b[39m.\u001b[39m\u001b[39m__name__\u001b[39m)\n",
"\u001b[0;31mPermissionError\u001b[0m: [Errno 13] Permission denied: '/home/student/Repositories/lung-cancer-detection/LIDC-IDRI/CT/test/LIDC-IDRI-0001/01-01-2000-NA-NA-30178/3000566.000000-NA-03192/1-040.dcm'"
]
}
],
"source": [
"dicom_path = \"/home/student/Repositories/lung-cancer-detection/LIDC-IDRI/CT/test/LIDC-IDRI-0001/01-01-2000-NA-NA-30178/3000566.000000-NA-03192/1-040.dcm\"\n",
"dcm = pydicom.dcmread(dicom_path) "
@@ -45,7 +58,7 @@
},
{
"cell_type": "code",
"execution_count": 151,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -83,7 +96,7 @@
},
{
"cell_type": "code",
"execution_count": 200,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -112,7 +125,7 @@
},
{
"cell_type": "code",
"execution_count": 201,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -146,7 +159,7 @@
},
{
"cell_type": "code",
"execution_count": 202,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -162,7 +175,7 @@
},
{
"cell_type": "code",
"execution_count": 203,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -198,7 +211,7 @@
},
{
"cell_type": "code",
"execution_count": 204,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -235,7 +248,7 @@
},
{
"cell_type": "code",
"execution_count": 221,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -310,7 +323,7 @@
},
{
"cell_type": "code",
"execution_count": 222,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -348,7 +361,16 @@
},
{
"cell_type": "code",
"execution_count": 223,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"steps += [segmented_lungs = image * mask]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -363,7 +385,12 @@
}
],
"source": [
"fig, axes = plt.subplots(nrows=1, ncols=len(steps), figsize=(20, 15))\n",
"from itertools import chain\n",
"\n",
"\n",
"fig, axes = plt.subplots(nrows=2, ncols=len(steps) // 2, figsize=(20, 15))\n",
"\n",
"axes = list(chain.from_iterable(axes))\n",
"\n",
"for step, ax in zip(steps, axes):\n",
" ax.imshow(step, cmap=\"bone\")"
Expand All @@ -379,7 +406,7 @@
},
{
"cell_type": "code",
"execution_count": 224,
"execution_count": null,
"metadata": {},
"outputs": [
{
150 changes: 150 additions & 0 deletions scripts/azure/machine_learning/fine_tune.py
@@ -0,0 +1,150 @@
import os
import logging
from datetime import datetime

import click
import mlflow
import numpy as np
import tensorflow as tf
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

from src.model.director import ModelDirector
from src.dataset.dataset_loader import DatasetLoader
from src.config import (
    RANDOM_SEED,
    EARLY_STOPPING_CONFIG,
    REDUCE_LR_CONFIG,
    MODELS,
    BUILDERS,
    CALLBACKS,
    METRICS,
    config_logging
)

config_logging()
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("azure")


def get_compiled_model(model, optimizer, loss):
    # build the requested architecture through the builder/director pair
    builder = BUILDERS[model]()

    director = ModelDirector(builder)
    model_nn = director.make()
    logger.info(f"Built model_nn with {str(builder)}")

    # map the CLI choices to Keras classes, instantiated with default arguments
    optimizer_cls = {
        "adam": tf.keras.optimizers.Adam,
        "sgd": tf.keras.optimizers.SGD,
    }[optimizer]()

    loss_cls = {
        "binary_crossentropy": tf.keras.losses.BinaryCrossentropy,
        "categorical_crossentropy": tf.keras.losses.CategoricalCrossentropy,
    }[loss]()

    metrics = [metric() for metric in METRICS]

    model_nn.compile(optimizer=optimizer_cls, loss=loss_cls, metrics=metrics, run_eagerly=False)
    logger.info("Compiled model")

    return model_nn


def get_compiled_distributed_model(model, optimizer, loss):
    strategy = tf.distribute.MultiWorkerMirroredStrategy()

    # the model must be built and compiled inside the strategy scope
    # so that its variables are mirrored across workers
    with strategy.scope():
        model_nn = get_compiled_model(model, optimizer, loss)

    return model_nn


@click.command()
@click.option(
    "--model", type=click.Choice(MODELS), default="mobilenet", help="Model to train"
)
@click.option(
    "--train", type=click.Path(exists=True), help="Path to the training dataset"
)
@click.option("--test", type=click.Path(exists=True), help="Path to the test dataset")
@click.option(
    "--optimizer",
    type=click.Choice(["adam", "sgd"]),
    default="adam",
    help="Optimizer to use",
)
@click.option(
    "--loss",
    type=click.Choice(["binary_crossentropy", "categorical_crossentropy"]),
    default="binary_crossentropy",
    help="Loss function to use",
)
@click.option("--epochs", type=click.INT, default=10, help="Number of epochs to train for")
@click.option("--batch_size", type=click.INT, default=64, help="Batch size for dataset loaders")
@click.option("--job_name", type=click.STRING, help="Azure Machine Learning job name")
@click.option("--distributed", is_flag=True, help="Use distributed strategy")
def run(model, train, test, optimizer, loss, epochs, batch_size, job_name, distributed):
    mlflow.set_experiment("lung-cancer-detection")
    mlflow_run = mlflow.start_run(run_name=f"train_{model}_{datetime.now().strftime('%Y%m%d%H%M%S')}")

    mlflow.log_param("optimizer", optimizer)
    mlflow.log_param("loss", loss)
    mlflow.log_param("epochs", epochs)
    mlflow.log_param("batch_size", batch_size)
    mlflow.log_param("random_seed", RANDOM_SEED)

    logger.info(f"Started training run at {datetime.now()}")
    logger.info(
        f"Run parameters - optimizer: {optimizer}, loss: {loss}"
    )

    if not distributed:
        model_nn = get_compiled_model(model, optimizer, loss)
    else:
        model_nn = get_compiled_distributed_model(model, optimizer, loss)

    # seed both loaders so train/test shuffling is reproducible
    train_loader = DatasetLoader(train)
    test_loader = DatasetLoader(test)

    train_loader.set_seed(RANDOM_SEED)
    test_loader.set_seed(RANDOM_SEED)

    train_dataset = train_loader.get_dataset()
    test_dataset = test_loader.get_dataset()
    logger.info("Loaded train and test datasets")

    history = model_nn.fit(train_dataset, epochs=epochs, callbacks=CALLBACKS)
    logger.info("Trained model")

    # log the per-epoch training curves to MLflow
    for metric, values in history.history.items():
        for step, value in enumerate(values):
            mlflow.log_metric(f"{metric}", value, step=step)

    results = model_nn.evaluate(test_dataset, return_dict=True)
    logger.info("Evaluated model")

    for metric, value in results.items():
        mlflow.log_metric(f"Final {metric}", value)

    logger.info(f"Finished training at {datetime.now()}")

    try:
        mlflow.tensorflow.save_model(
            model=model_nn,
            path=os.path.join(job_name, model),
        )
    except TypeError as e:
        logger.error(f"Saving model raised an error:\n{e}")

    mlflow.tensorflow.log_model(
        model=model_nn,
        registered_model_name=model,
        artifact_path=model,
    )

    mlflow.end_run()


if __name__ == "__main__":
    run()  # pylint: disable=no-value-for-parameter
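
For reference, a hedged sketch of exercising this script's CLI from Python using click's built-in test runner. The import path assumes `scripts` is importable as a package, the dataset paths are placeholders (they must exist, since the options use `click.Path(exists=True)`), and the Azure/TensorFlow dependencies must be installed:

```python
# hypothetical invocation of fine_tune.py's CLI; import path and dataset
# paths are assumptions for illustration
from click.testing import CliRunner

from scripts.azure.machine_learning.fine_tune import run

runner = CliRunner()
result = runner.invoke(
    run,
    [
        "--model", "mobilenet",
        "--train", "data/processed/train",  # placeholder path
        "--test", "data/processed/test",    # placeholder path
        "--optimizer", "adam",
        "--loss", "binary_crossentropy",
        "--epochs", "10",
        "--batch_size", "64",
        "--job_name", "fine-tune-local",
    ],
)
print(result.exit_code, result.output)
```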