Bounding Box as an Additional Input #2611
Unanswered
rorybennett asked this question in Q&A
Hello All!
I have a small problem that I have "solved" in one way, but I honestly don't know whether my method is a valid solution, and I would like to solve it in a more elegant manner.
The Problem
I have a grayscale ultrasound dataset of 19 images. Each image contains two objects that need to be segmented. I have tested out-of-the-box nnU-Net and achieved decent Dice scores (75% - 94%).
I then thought I would try adding a bounding box as an additional input to the network to improve the Dice scores (similar to MedSAM), which has turned out to be a little more difficult than I expected.
Attempted Solution
My first attempt at including a bounding box was to add a second input image containing a mask of the bounding box, assigned to channel 0001 alongside the original image in channel 0000 (see image below: the left image is the original input image 0000 and the right image is the mask 0001). This seems to have worked quite well, with the Dice scores increasing substantially (93% - 97%).
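For readers who want to reproduce the extra channel, here is a minimal sketch of how such a box mask could be derived from a ground-truth segmentation with NumPy. It is not the code used for the results above. The function name `bbox_mask_from_segmentation` is hypothetical, and the sketch assumes a 2D integer label map with background 0 and one tight box per non-zero label. Writing the result to disk as the 0001 channel file is left out.

```python
import numpy as np


def bbox_mask_from_segmentation(seg: np.ndarray) -> np.ndarray:
    """Return a binary mask that is 1 inside the tight bounding box of each label.

    Assumption: `seg` is a 2D integer label map where 0 is background.
    """
    mask = np.zeros(seg.shape, dtype=np.uint8)
    for label in np.unique(seg):
        if label == 0:
            continue  # skip background
        rows, cols = np.nonzero(seg == label)
        r0, r1 = rows.min(), rows.max()
        c0, c1 = cols.min(), cols.max()
        # Fill the axis-aligned box (inclusive of the max row/column).
        mask[r0:r1 + 1, c0:c1 + 1] = 1
    return mask


# Toy example: two small objects on an 8x8 image.
seg = np.zeros((8, 8), dtype=np.uint8)
seg[1:3, 1:4] = 1      # object 1, already rectangular
seg[5, 5] = 2          # object 2, T-shaped
seg[6, 4:7] = 2
box_mask = bbox_mask_from_segmentation(seg)
```

The resulting `box_mask` has the same shape as the image, so it could be stored next to the 0000 file as a second channel.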
The Question
Is there a more "nnU-Net" way of solving this, for example by creating custom trainers/dataloaders? I have looked at the trainers and dataloaders and have no idea where to begin, or whether it is even possible.
MedSAM allows for a random perturbation of the bounding box at training time (the ground truth segmentation is read, the bounding box coordinates are determined, and the perturbation is applied), which is what I would like to emulate. At the moment, the only way I can think of doing this is to create the 0001 channel file with the perturbation already applied, before preprocessing is run.
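To make the read, box, perturb sequence concrete, here is a rough NumPy sketch. The function name `perturbed_bbox_mask`, the `max_shift` parameter, and the choice of an independent uniform integer jitter on each box edge are all assumptions for illustration. This is not MedSAM's exact scheme, and it does not show where such a step would hook into nnU-Net's trainers or dataloaders.

```python
import numpy as np


def perturbed_bbox_mask(seg, max_shift=5, rng=None):
    """Build a box mask from `seg`, jittering each box edge by up to `max_shift` pixels.

    Assumptions: `seg` is a 2D integer label map with background 0, and each
    edge is shifted independently by a uniform integer in [-max_shift, max_shift].
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = seg.shape
    mask = np.zeros(seg.shape, dtype=np.uint8)
    for label in np.unique(seg):
        if label == 0:
            continue  # skip background
        rows, cols = np.nonzero(seg == label)
        # One random offset per edge: top, bottom, left, right.
        dr0, dr1, dc0, dc1 = rng.integers(-max_shift, max_shift + 1, size=4)
        # Clip to the image so the box never leaves the frame.
        r0 = int(np.clip(rows.min() + dr0, 0, height - 1))
        r1 = int(np.clip(rows.max() + dr1, 0, height - 1))
        c0 = int(np.clip(cols.min() + dc0, 0, width - 1))
        c1 = int(np.clip(cols.max() + dc1, 0, width - 1))
        # Guard against inverted boxes when the object is smaller than the jitter.
        r0, r1 = min(r0, r1), max(r0, r1)
        c0, c1 = min(c0, c1), max(c0, c1)
        mask[r0:r1 + 1, c0:c1 + 1] = 1
    return mask


# Toy example: one 10x10 object, box edges jittered by at most 3 pixels.
seg = np.zeros((32, 32), dtype=np.uint8)
seg[10:20, 12:22] = 1
jittered = perturbed_bbox_mask(seg, max_shift=3, rng=np.random.default_rng(0))
tight = perturbed_bbox_mask(seg, max_shift=0)
```

Calling the function once per training sample would give a different box each time, whereas calling it once before preprocessing bakes a single perturbation into the 0001 file, which is the limitation described above.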