bag play command segfaults #110
Comments
You could run the binary under valgrind/strace and see if that provides extra info.
Good idea. I was able to get the backtrace by running python3 under GDB and then running the bag launcher script from there.
Here is the full backtrace. It looks like the segfault might be happening in the destructor of the Player.
This adds a patch that moves the `std::shared_ptr` holding a reference to a `rclcpp::SharedLibrary` from `rclcpp::GenericPublisher` to `rclcpp::PublisherBase`. The shared library in question contains type support code for the publisher's message type. In the standard ROS build, two separate shared objects are used: one is represented by the mentioned `shared_ptr`-wrapped `SharedLibrary` instance, the other is never unloaded. At the same time, code from the latter library is required during destruction of a `PublisherBase` instance. The rules_ros2 build creates a single library, which is unloaded when `GenericPublisher` is destroyed (the `shared_ptr`'s reference count goes to zero). When `GenericPublisher`'s parent class, `PublisherBase`, is destroyed afterwards, it tries to call code from the already unloaded library, which causes the segfault. Moving the `shared_ptr` to `PublisherBase` makes sure the shared library stays loaded as long as required. We further extend the existing rosbag test to also exercise the `play` command. This fix is similar to what was done for PR#47. See the discussion there for more details. Fixes mvukov#110
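The destruction-order problem described above can be illustrated with a small standalone sketch. This is not the rclcpp code or the actual patch: `FakeLibrary`, `BaseBefore`, `DerivedBefore`, `BaseAfter`, `DerivedAfter`, `run_before`, and `run_after` are hypothetical names, and the "library" only records load/unload events in a log instead of mapping real code. It relies on the C++ rule that a derived class's members are destroyed before the base class destructor runs, while a base class's own members are destroyed only after its destructor body has finished.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Log = std::vector<std::string>;

// Hypothetical stand-in for a loaded shared library: it only logs events.
struct FakeLibrary {
  explicit FakeLibrary(Log& log) : log_(log) { log_.push_back("load"); }
  ~FakeLibrary() { log_.push_back("unload"); }
  Log& log_;
};

// Layout before the fix: the derived class owns the library handle.
struct BaseBefore {
  explicit BaseBefore(Log& log) : log_(log) {}
  // Stands in for a base destructor that calls into library code.
  ~BaseBefore() { log_.push_back("base dtor uses library code"); }
  Log& log_;
};

struct DerivedBefore : BaseBefore {
  DerivedBefore(Log& log, std::shared_ptr<FakeLibrary> lib)
      : BaseBefore(log), lib_(std::move(lib)) {}
  std::shared_ptr<FakeLibrary> lib_;  // released before ~BaseBefore runs
};

// Layout after the fix: the base class owns the library handle.
struct BaseAfter {
  BaseAfter(Log& log, std::shared_ptr<FakeLibrary> lib)
      : log_(log), lib_(std::move(lib)) {}
  ~BaseAfter() { log_.push_back("base dtor uses library code"); }
  Log& log_;
  std::shared_ptr<FakeLibrary> lib_;  // released after the destructor body
};

struct DerivedAfter : BaseAfter {
  using BaseAfter::BaseAfter;
};

// Creates and destroys one object, returning the recorded event order.
Log run_before() {
  Log log;
  { DerivedBefore pub(log, std::make_shared<FakeLibrary>(log)); }
  return log;
}

Log run_after() {
  Log log;
  { DerivedAfter pub(log, std::make_shared<FakeLibrary>(log)); }
  return log;
}
```

In the "before" layout the log shows the unload happening ahead of the base destructor, which in a real single-library build would mean calling into unmapped code. In the "after" layout the base destructor runs while the handle is still held.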
Just wanted to ping you @ahans, but I see that you're already working on this :) Very nice!
This is very similar to the issue we had with PR#47. In addition to the subscription path, we also need to patch the publisher path. I created a PR that does just that. Btw, the segfault also happens when run under Bazel. You can tell by looking at the return code. For some reason Bazel swallows the segfault message. The same also happened for the recorder test in PR#47.
Thank you! I pulled in the fix and confirmed that I no longer see the play command segfaulting :)
Steps to reproduce:
Running the above command results in a segfault. The segfault occurs at the very end of playback, just before the bag playback finishes.
Some more context:
Following #96, I created a wrapper script that replaces all relative paths passed to the bag command with absolute paths. The wrapper script then finds the runfiles directory and cd's into it before invoking the underlying bag command.
This works OK as a workaround, except for the bag play command. For some reason the bag play command segfaults.
If I run the same command using `bazel run chatter:bag -- play [path to bag file]`, then it does not segfault. So there seems to be something different about the build environment compared to the environment when running the output binary.