question about traffic light #35

Open
EcustBoy opened this issue May 31, 2023 · 7 comments

Comments

@EcustBoy

Hi, author. As I understand it, the TCP model takes the raw image and a few measurement signals as input and does not use intermediate perception results. How, then, does it learn traffic light information? If it is trained only on expert trajectory samples, isn't the traffic light too small in the front view for the model to learn the "red-stop, green-start" behavior?

Also, does the training dataset size have a crucial impact on how well the model learns to understand traffic lights? Are there any ablation experiments on this?

@penghao-wu
Collaborator

Yes, it learns the "red-stop, green-start" from the expert demonstrations. And I think the current camera setup could capture the traffic light information. But you can also try to add another camera with an explicit traffic light detection module to enhance its ability similar to LAV.

Most of the training routes contain junctions with traffic lights, so the traffic light related data is abundant. I think the dataset size is important to learn rules about the traffic light, but we do not have such ablations.
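
Whether the camera setup can capture a traffic light can be roughly estimated with a pinhole projection. The sketch below is only a back-of-the-envelope illustration: the image width, field of view, and light housing height are hypothetical values chosen for the example, not settings confirmed in this thread.

```python
import math


def projected_size_px(object_size_m, distance_m, image_width_px, horizontal_fov_deg):
    """Approximate on-image size (in pixels) of an object facing the camera.

    Uses a simple pinhole model with square pixels and no lens distortion.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = image_width_px / (2.0 * math.tan(math.radians(horizontal_fov_deg) / 2.0))
    return focal_px * object_size_m / distance_m


# Hypothetical camera and object parameters (assumptions for illustration only).
IMAGE_WIDTH_PX = 900
HORIZONTAL_FOV_DEG = 100.0
LIGHT_HEIGHT_M = 1.0

for distance_m in (10.0, 20.0, 40.0):
    size = projected_size_px(LIGHT_HEIGHT_M, distance_m, IMAGE_WIDTH_PX, HORIZONTAL_FOV_DEG)
    print(f"{distance_m:>5.1f} m -> about {size:.1f} px tall")
```

Under these assumed numbers the light spans a few tens of pixels near the stop line and shrinks in proportion to distance, which is the trade-off behind the suggestion of an extra camera or an explicit detection module.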

@EcustBoy
Author

EcustBoy commented May 31, 2023

> Yes, it learns the "red-stop, green-start" from the expert demonstrations. And I think the current camera setup could capture the traffic light information. But you can also try to add another camera with an explicit traffic light detection module to enhance its ability similar to LAV.
>
> Most of the training routes contain junctions with traffic lights, so the traffic light related data is abundant. I think the dataset size is important to learn rules about the traffic light, but we do not have such ablations.

Thanks for your reply. Right now I train only on my own small dataset (about 75K samples), and I do not feed the image to the planner decoder directly. I think this is the main reason my model can't learn to understand traffic lights. :-)

I'm going to try designing a front-view feature extraction network similar to TCP's. It seems the ego car can learn the "red-stop, green-start" behavior as long as I feed the raw image to a simple network and train on a relatively large dataset, without any complicated design, right? Many thanks for your answer.

@penghao-wu
Collaborator

So currently what is the input to your planner decoder if you do not feed the image features to it?

@EcustBoy
Author

> So currently what is the input to your planner decoder if you do not feed the image features to it?

Actually, I input (1) embedding features for other cars and map detections, which are output by the front backbone and detection head, and (2) some ego car state (including the command waypoint and speed). So I think I shouldn't rely only on the intermediate features; it seems the raw image is also needed.

@penghao-wu
Collaborator

Yes, you need to include information containing traffic light information (like raw images or traffic light detection results) as input.
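
One way to see why this is needed: if two samples look identical to the planner but the expert acted differently, no function of those inputs can fit both. The toy check below is a hypothetical illustration of that point; the field names, values, and the helper `conflicting_inputs` are made up for the example and do not come from TCP or from the dataset discussed here.

```python
from collections import defaultdict


def conflicting_inputs(samples, feature_keys):
    """Return planner inputs that map to more than one expert action."""
    actions_by_input = defaultdict(set)
    for sample in samples:
        key = tuple(sample[k] for k in feature_keys)
        actions_by_input[key].add(sample["expert_action"])
    # Keep only inputs for which the expert behaved in different ways.
    return {key: acts for key, acts in actions_by_input.items() if len(acts) > 1}


# Hypothetical samples: same detections and ego state, different light state.
samples = [
    {"lead_vehicle": False, "command": "straight", "speed": 0.0,
     "light": "red", "expert_action": "stop"},
    {"lead_vehicle": False, "command": "straight", "speed": 0.0,
     "light": "green", "expert_action": "go"},
]

without_light = conflicting_inputs(samples, ["lead_vehicle", "command", "speed"])
with_light = conflicting_inputs(samples, ["lead_vehicle", "command", "speed", "light"])

print("ambiguous inputs without light state:", len(without_light))
print("ambiguous inputs with light state:", len(with_light))
```

With only vehicle/map detections and ego state, the "stop" and "go" samples collapse onto the same input, so more data alone would not resolve the ambiguity. Adding any signal that carries the light state (raw image features or detection results) separates them.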

@EcustBoy
Author

EcustBoy commented Jun 1, 2023

> Yes, you need to include information containing traffic light information (like raw images or traffic light detection results) as input.

Hi, author. I read your code again and noticed that you use a pretrained resnet34 to extract image features.

I'd like to ask: is a pretrained image backbone necessary if I only want to get traffic light information from the front view? To limit the network size, perhaps a shallow custom-designed network would be enough? I'm not sure whether you've made such a comparison.

@penghao-wu
Collaborator

I think a shallow network would suffice if you have direct supervision on the traffic light states.
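
"Direct supervision" here can be read as adding a labeled traffic light state to the training loss, alongside the imitation term. The framework-free sketch below shows one way such a combined loss could be computed for a single sample. The class set, the L1 waypoint term, and `aux_weight` are assumptions made for illustration; this is not TCP's actual loss.

```python
import math

# Hypothetical label set for the auxiliary head (an assumption, not from TCP).
LIGHT_CLASSES = ("none", "red", "yellow", "green")


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one sample, using the log-sum-exp trick."""
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target_index]


def l1_waypoint_loss(predicted, expert):
    """Mean absolute error over all (x, y) waypoint coordinates."""
    diffs = [abs(p - e)
             for pred_wp, exp_wp in zip(predicted, expert)
             for p, e in zip(pred_wp, exp_wp)]
    return sum(diffs) / len(diffs)


def total_loss(pred_wp, expert_wp, light_logits, light_label, aux_weight=0.5):
    """Imitation term plus a weighted traffic light classification term."""
    imitation = l1_waypoint_loss(pred_wp, expert_wp)
    auxiliary = cross_entropy(light_logits, light_label)
    return imitation + aux_weight * auxiliary


# Toy usage: two waypoints and an uninformative light prediction for a red light.
loss = total_loss(
    pred_wp=[(0.0, 0.0), (1.0, 1.0)],
    expert_wp=[(0.0, 1.0), (1.0, 1.0)],
    light_logits=[0.0, 0.0, 0.0, 0.0],
    light_label=LIGHT_CLASSES.index("red"),
)
print(f"total loss: {loss:.4f}")
```

The auxiliary term gives the image branch a gradient signal that depends only on the light state, which is why a small network may be enough in that setting. Without it, the light state has to be inferred indirectly from the expert's stop-and-go behavior.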
