Hi,
I am confused about how the Detection Decoder works.
In the module's output, what are the 2 labels/classes?
According to the paper, the 1st and 2nd channels of the prediction output give the confidence that an object of interest is present at a particular location.
What are the objects of interest: car/road?
Fig. 3 shows 3 crossed gray cells: are those the cells in the "I don't care" area?
Is it expected that the top of the image (the sky) is not treated as an "I don't care" area?
The last 4 channels are the bounding box coordinates (x0, y0, h, w).
Are those coordinates at the scale of the input image, or at the scale of the (39x12) feature map?
What is the "delta prediction" (the residue)? Is it the correction to be applied to the coarse bounding box estimate (from the prediction)?
What is the difference between the outputs of the Segmentation Decoder and the Detection Decoder? I understand that the Segmentation Decoder outputs a mask for the 2 classes, but I would have thought that the Detection Decoder outputs the coordinates of the bounding boxes.
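To make the questions above concrete, here is a minimal sketch of the output layout as I currently read it. Everything in it is an assumption for illustration, not code from the repository: the names `split_cell` and `make_dummy_output` are made up, the grid is taken to be 12 rows by 39 columns, and each cell is taken to hold 6 values (2 confidences, then x0, y0, h, w). Whether the box values are in image pixels or in feature-map units is exactly what I am asking, so the sketch does not rescale them.

```python
# Assumed layout (from the questions above, not from the repository):
# a 12 x 39 grid of cells, 6 channels per cell:
# 2 confidence values followed by x0, y0, h, w.
GRID_H, GRID_W, CHANNELS = 12, 39, 6


def make_dummy_output(fill=0.0):
    """Build a placeholder output as a nested list [row][col][channel]."""
    return [[[fill] * CHANNELS for _ in range(GRID_W)] for _ in range(GRID_H)]


def split_cell(cell):
    """Split one cell's 6 values into (confidences, box).

    confidences: the first 2 channels
    box: the last 4 channels, read as (x0, y0, h, w); units are unknown here
    """
    if len(cell) != CHANNELS:
        raise ValueError(f"expected {CHANNELS} values per cell, got {len(cell)}")
    return tuple(cell[:2]), tuple(cell[2:])


# Fill one cell with made-up numbers and read it back.
output = make_dummy_output()
output[3][10] = [0.1, 0.9, 5.0, 2.0, 1.5, 3.0]
confidences, box = split_cell(output[3][10])
```

If this reading is wrong (for example, if the 4 box channels are offsets relative to the cell rather than absolute coordinates), that is the part I would like clarified.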
Thank you