Accuracy benchmark and inference speed #4
Hi,

Yes, the code actually works :) But tracking in general is only as good as your detection priors. The main purpose of my research was to build a multiple object tracker that could exploit temporal dependencies to track occluded objects. Training for both tasks together was to test the hypothesis that certain features might be better suited for tracking. For general tracking, I'd say tune your detector on the dataset first and freeze it before training the tracker part. My dataset was focused on occlusion, and there were a lot of things I also wanted to try, but hardware is a limitation for me right now. Maybe we could collaborate on this one, if that's feasible.

From what I have seen, training is also very sensitive to the ratio of the tracking and detection losses, which you would need to experiment with yourself. You could also try bidirectional LSTMs.

As far as speed goes, I haven't done any rigorous analysis, but I imagine 10 fps should be easily achievable with a 1080, since it is basically a regression and does not use any RPNs, which are slow. Additionally, increasing the temporal stride will give you better tracker speed: compare 3 forward passes for 4 frames with 2 forward passes for 6 frames. And if you parallelize the pre- and post-processing/decoding of frames, like YOLO's original C implementation does, it should be quite easy to reach 10 fps.

Regards
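The "parallelize the pre- and post-processing" idea above can be sketched as a producer/consumer split: one thread decodes and preprocesses frames while the main thread runs inference, similar in spirit to Darknet's C pipeline. The `preprocess`, `infer`, and `postprocess` callables here are placeholders for whatever the actual model uses.

```python
import queue
import threading

def run_pipeline(frames, infer, preprocess, postprocess, buffer=8):
    """Decode/preprocess frames on a worker thread while inference runs
    on the main thread, so CPU frame handling overlaps GPU compute."""
    q = queue.Queue(maxsize=buffer)
    SENTINEL = object()

    def producer():
        for f in frames:
            q.put(preprocess(f))  # CPU-side work happens off the main thread
        q.put(SENTINEL)

    threading.Thread(target=producer, daemon=True).start()

    results = []
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        results.append(postprocess(infer(item)))
    return results
```

The bounded queue keeps memory flat if decoding runs ahead of inference; in practice postprocessing could be pushed to a third thread the same way.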
Great, thanks for the thorough reply! Yeah, I agree that starting with a pre-trained detector and only fine-tuning the tracker makes sense. I am not 100% sure from the code yet, but it looks like the part between the detector and the tracker is differentiable, so turning learning back on for the detector after fine-tuning the tracker would also help, allowing the whole system to be trained end-to-end. Yep, I see your task; occlusion is difficult. Also, in terms of the detector, I am using YOLO v3, which has 3 output heads (I mostly have small objects, and v3 has heads at 3 different layers). If you want to improve your results overall, the detector does all the hard work, so improving the detector is very important.

I will let you know how my development goes; if anything, I could submit a PR.
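The freeze-then-unfreeze schedule could look something like this. It is framework-agnostic and operates on anything with `name`/`trainable` attributes (e.g. `model.layers` in Keras); the `detector_` name prefix is an assumption, so match it to the real layer names, and remember to recompile a Keras model after toggling flags.

```python
def set_detector_trainable(layers, trainable, detector_prefix="detector_"):
    """Toggle the `trainable` flag on layers whose name starts with a
    (hypothetical) detector prefix; returns the names that were toggled.
    In Keras, call model.compile(...) afterwards for the change to apply."""
    toggled = []
    for layer in layers:
        if layer.name.startswith(detector_prefix):
            layer.trainable = trainable
            toggled.append(layer.name)
    return toggled
```

Typical usage: freeze with `set_detector_trainable(model.layers, False)` while fine-tuning the tracker, then unfreeze (often with a lower learning rate) for the end-to-end phase.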
So I first wanted to train a detector with KerasYOLO.py. I think I changed all the right things (anchors, labels, etc.), but I get nonsense low-confidence predictions after training, even though the loss steadily decreases. Also, looking at the individual loss terms, the WH term seems to be several orders of magnitude larger than the rest; I tried scaling the OBJ term, but I had to use extremely large values, which doesn't really make sense. Would you know what the problem is? I've used code from keras-yolo2 before, but I'm not sure what changes you have made.
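For what it's worth, a simple way to reason about the imbalance is to keep an explicit per-term weight and log each term separately, rather than inflating one scale to huge values. The weights below are illustrative defaults, not the repo's actual values; also note the YOLO papers predict sqrt(w), sqrt(h), which compresses the WH term's range at the source.

```python
def scaled_yolo_terms(loss_xy, loss_wh, loss_obj, loss_cls,
                      coord_scale=1.0, obj_scale=5.0, class_scale=1.0):
    """Weighted sum of the four YOLO loss terms. If loss_wh dominates by
    orders of magnitude, lowering coord_scale (or switching to sqrt
    width/height targets) is usually saner than a huge obj_scale."""
    return (coord_scale * (loss_xy + loss_wh)
            + obj_scale * loss_obj
            + class_scale * loss_cls)
```

Logging the four raw terms each epoch makes it obvious whether the WH blow-up comes from the loss formulation or from bad anchor/label scaling.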
Hi,
Great code, thanks a lot!
I had 2 questions:
Accuracy
Just wanted to know whether the code actually works and tracking does improve detection.
Speed of tracking
I am looking into using MultiObjDetTracker in my project. My project requires real-time processing; would you know how fast the inference time of the tracking part (excluding detection) of the MultiObjDetTracker is on a GPU (1050 or 1080)? I would imagine it would be <20% of the detector's, since it has only 2 layers, even though the LSTM layer might take more time. I am aiming for ~10 fps.

Thanks!
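In case it helps, this is roughly how I'd measure the tracker-only latency myself; `fn` stands in for the tracker's forward pass, and the warmup runs are there to amortize GPU kernel setup before timing starts.

```python
import time

def benchmark(fn, x, warmup=5, iters=50):
    """Return (avg_seconds_per_call, calls_per_second) for fn(x)."""
    for _ in range(warmup):
        fn(x)  # untimed warmup passes
    t0 = time.perf_counter()
    for _ in range(iters):
        fn(x)
    dt = (time.perf_counter() - t0) / iters
    return dt, 1.0 / dt
```

Running it on the tracker alone versus detector+tracker would show whether the <20% guess holds.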