This is the official repository for our state-of-the-art approach to monocular depth estimation in surgical vision, as presented in our paper...
- Transferring Relative Monocular Depth to Surgical Vision with Temporal Consistency
Charlie Budd, Tom Vercauteren.
[ MICCAI, arXiv ]
First, install our package...
pip install git+https://github.com/charliebudd/transferring-relative-monocular-depth-to-surgical-vision
Then download one of our model weights from the release tab of this repo. We recommend our best performer, depthanything-sup-temp.pt. The model may then be used as follows...
import torch
from torchvision.io import read_image
from torchvision.transforms.functional import resize
import matplotlib.pyplot as plt

from trmdsv import load_model

# load the model together with its matching resize and normalisation transforms
model, resize_for_model, normalise_for_model = load_model("depthanything", "weights/path.pt", "cuda")
model.eval()

# read the image and scale it to the [0, 1] range
image = read_image("surgical_image.png").cuda() / 255.0
original_size = image.shape[-2:]

# add a batch dimension and prepare the image for the model
image_for_model = normalise_for_model(resize_for_model(image.unsqueeze(0)))

# predict relative depth and resize it back to the original resolution
with torch.no_grad():
    depth = model(image_for_model)
depth = resize(depth, original_size)

# display the input image and the predicted depth side by side
plt.subplot(121).axis("off")
plt.imshow(image.cpu().permute(1, 2, 0))
plt.subplot(122).axis("off")
plt.imshow(depth.cpu().squeeze())
plt.show()
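If you would rather save the prediction than display it, matplotlib can write the single-channel depth map straight to disk (the filename and colour map here are just examples)...

plt.imsave("depth.png", depth.squeeze().cpu().numpy(), cmap="magma")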
To recreate our results, first clone this repository, move into it, and install the requirements...
git clone https://github.com/charliebudd/transferring-relative-monocular-depth-to-surgical-vision
cd transferring-relative-monocular-depth-to-surgical-vision
pip install -r requirements.txt
As the meta-dataset created for this project uses data with a mix of licenses, we are not able to redistribute our dataset. To recreate the dataset, you will first need to download all the input datasets. Each dataset has its own access procedure; the starting point for each can be found at the following links...
- ROBUST-MIS (Raw data): https://www.synapse.org/Synapse:syn18779624/wiki/591266
- Cholec80 (videos): http://camma.u-strasbg.fr/datasets/
- EndoVis2017: https://endovissub2017-roboticinstrumentsegmentation.grand-challenge.org/
- EndoVis2018: https://endovissub2018-roboticscenesegmentation.grand-challenge.org/
- KidneyBoundary: https://endovissub2017-kidneyboundarydetection.grand-challenge.org/
- Hamlyn (selected datasets only, see below): https://hamlyn.doc.ic.ac.uk/vision/
- StereoMIS: https://zenodo.org/records/7727692
- Serv-CT: https://www.ucl.ac.uk/interventional-surgical-sciences/weiss-open-research/weiss-open-data-server/serv-ct
- SCARED: https://endovissub2019-scared.grand-challenge.org/

For the Hamlyn dataset, a download script is provided to make the process easier...
bash download_hamlyn.sh
Download each dataset and place it in a shared directory. The directory should look like this...
Datasets
|
|____Cholec80
| |____videos
|
|____EndoVis2017
| |____test
| |____train
|
|____EndoVis2018
| |____test
| |____train
|
|____Hamlyn
| |____dataset4
| ⋮
|
|____KidneyBoundary
| |____kidney_dataset_1
| ⋮
|
|____ROBUST-MIS
| |____Raw data
|
|____SCARED
| |____dataset_1
| ⋮
|
|____SERV-CT
| |____Experiment_1
| |____Experiment_2
|
|____StereoMIS
| |____P1
| ⋮
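Before running the preprocessing step, it can be worth checking that every expected dataset folder is in place. The snippet below is not part of this repository, just a minimal sketch using the folder names from the layout above...

from pathlib import Path

# folder names expected in the shared dataset directory (matching the layout above)
expected = [
    "Cholec80", "EndoVis2017", "EndoVis2018", "Hamlyn", "KidneyBoundary",
    "ROBUST-MIS", "SCARED", "SERV-CT", "StereoMIS",
]

root = Path("Datasets")
missing = [name for name in expected if not (root / name).is_dir()]
print("Missing dataset folders:", missing if missing else "none")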
Now run the provided data preprocessing script to generate our meta-dataset...
python data_preprocess.py --input-directory Datasets
You should now be ready to run the training script...
python train.py --experiment-name my-trained-model --model depthanything --train-mode sup temp
This will finetune depthanything using a combination of normal supervision (sup) and temporal consistency self-supervision (temp). You can then compare your finetuned model against the original depthanything model using the evaluation script...
python evaluate.py --model depthanything
python evaluate.py --model depthanything --weights outputs/my-trained-model/best_weights_validation.pt
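For intuition on the temp training mode: the aim is for depth predictions on neighbouring frames of a surgical video to agree once the arbitrary scale and shift of relative depth are normalised away. The snippet below is only an illustrative sketch of that idea (it ignores camera and scene motion) and is not the objective implemented in train.py...

import torch

def normalise(depth, eps=1e-6):
    # remove the per-frame scale and shift ambiguity of relative depth
    # using the median and mean absolute deviation of each prediction
    d = depth.flatten(1)
    shift = d.median(dim=1, keepdim=True).values
    scale = (d - shift).abs().mean(dim=1, keepdim=True) + eps
    return ((d - shift) / scale).view_as(depth)

def temporal_consistency_loss(depth_a, depth_b):
    # penalise disagreement between normalised predictions on two neighbouring frames
    return (normalise(depth_a) - normalise(depth_b)).abs().mean()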