
Scaled jtc #1191

Draft · wants to merge 32 commits into master
Conversation

@fmauch (Contributor) commented Jul 3, 2024

This is my take 2 on #301.

I plan to finish this by Friday, @firesurfer feel free to poke me :-)

Currently missing

  • tests for scaled execution
  • There are some ToDos in the documentation:
    • Algorithmics
    • Handling tolerances

Things that will get changed, as discussed in the latest working group meeting:

  • Allowing scaling factors above 1 might get postponed to a later point.
  • Report the scaling factor in the controller state.
  • Use a custom datatype for the scaling_factor subscriber (std_msgs are deprecated).
  • Change the subscription to transient_local. This will require documentation, as a standard `ros2 topic pub` will not work with it out of the box (see the QoS sketch below).
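For illustration, a transient_local subscription could look roughly like the sketch below; the topic name and the std_msgs::msg::Float64 placeholder are assumptions only, since the custom message type is exactly what is still being decided:

```cpp
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/float64.hpp>

// transient_local keeps the last published sample for late-joining subscribers,
// which is why a plain `ros2 topic pub` (volatile by default) will not match it.
rclcpp::Subscription<std_msgs::msg::Float64>::SharedPtr create_scaling_subscription(
  const rclcpp::Node::SharedPtr & node)
{
  auto qos = rclcpp::QoS(1).transient_local();
  return node->create_subscription<std_msgs::msg::Float64>(
    "~/speed_scaling_input", qos,
    [](const std_msgs::msg::Float64::SharedPtr msg) {
      // Hand msg->data over to the realtime update loop, e.g. via a realtime buffer.
      (void)msg;
    });
}
```

Publishing manually would then need a matching durability override, e.g. `ros2 topic pub --qos-durability transient_local --qos-depth 1 /<controller>/speed_scaling_input std_msgs/msg/Float64 "{data: 0.5}"` (topic and type again placeholders).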

fmauch and others added 13 commits July 3, 2024 17:38
This adds a scaling factor between 0 and 1 to the JTC so that the trajectory
time inside the controller is extended accordingly. A value of 0.5 means
that trajectory execution will take twice as long as the trajectory specifies.

The scaling factor itself is read from the hardware for now.
Co-authored-by: Christoph Fröhlich <christophfroehlich@users.noreply.github.com>
This avoids writing the explicit conversion by hand

Internally that basically does:
static_cast<rcl_duration_value_t>(static_cast<long double>(rcl_duration_.nanoseconds) * scale_ld)
…etting of scaling factor

Co-authored-by: Manuel M <mamueluth@gmail.com>
That was changed in a previous commit
used

The current implementation only works correctly when a position
interface is used.

When the hardware in fact slows down the robot (as UR does), velocity interfaces
should also be scaled correctly, but that's rather an edge case right now.
It is accessed from the RT thread only, anyway.
@fmauch (Contributor, Author) commented Jul 3, 2024

Thoughts on tests I should implement (a rough sketch of the first one is below):

  • Execution with a scaling factor of 0.5 takes twice as long (and does not fail when no tolerances are set)
  • Execution with scaling < 1 and a goal time tolerance leads to an early halt and failed execution
  • Execution with scaling < 1 and path tolerances set should not fail (if goal_time_tolerance is large enough)
  • Execution with scaling > 1 should not fail independent of the tolerances set <-- is that true?
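As a rough, self-contained sketch of the first check; the helper below only simulates the period accumulation, the real test would drive the existing JTC test fixture instead:

```cpp
#include <gtest/gtest.h>

// Hypothetical stand-in for trajectory execution: it only simulates how the
// controller would advance the trajectory time by period * scaling_factor per
// update cycle and counts the cycles needed to cover a 1 s trajectory.
static int cycles_to_finish(double scaling_factor)
{
  const double period_s = 0.01;              // 100 Hz update rate
  const double trajectory_duration_s = 1.0;  // last waypoint at 1 s
  double trajectory_time_s = 0.0;
  int cycles = 0;
  while (trajectory_time_s < trajectory_duration_s) {
    trajectory_time_s += period_s * scaling_factor;
    ++cycles;
  }
  return cycles;
}

TEST(ScaledJtcSketch, HalfScalingRoughlyDoublesExecutionTime)
{
  const double ratio =
    static_cast<double>(cycles_to_finish(0.5)) / static_cast<double>(cycles_to_finish(1.0));
  // With a scaling factor of 0.5 the controller needs about twice as many
  // update cycles, i.e. execution takes about twice as long in wall time.
  EXPECT_NEAR(ratio, 2.0, 0.05);
}
```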

@firesurfer (Contributor):

Hi @fmauch, great to see this being worked on again. I added some comments to the code.

Regarding the tests:

> Execution with a scaling factor of 0.5 takes twice as long (and does not fail when no tolerances are set)

Makes sense.

> Execution with scaling < 1 and a goal time tolerance leads to an early halt and failed execution

Isn't this what we patched in UniversalRobots/Universal_Robots_ROS2_Driver#882? As far as I can tell you also backported that patch. (And I still think the goal time should also be scaled ;) )

> Execution with scaling > 1 should not fail independent of the tolerances set <-- is that true?

Same as above. From heavily using this controller in our system, I am of the opinion that in most use cases it makes sense (and is what a user would expect) that the goal time is also scaled. But perhaps I understood your suggestion wrong.

@fmauch (Contributor, Author) commented Jul 4, 2024

> From heavily using this controller in our system, I am of the opinion that in most use cases it makes sense (and is what a user would expect) that the goal time is also scaled.

Thank you for your input. That's why I put my brain dump on this here, so we can discuss it :-) And that's also why I didn't write the documentation on that yet.

I'll wrap my head around this a little more and try to get a better feeling for the current behavior, then come back to this.

Comment on lines +6 to +10

control_msgs:
  type: git
  url: https://github.com/fmauch/control_msgs.git
  version: scaled_jtc

@fmauch (Contributor, Author):

Suggested change:

control_msgs:
  type: git
  url: https://github.com/fmauch/control_msgs.git
  version: scaled_jtc

@firesurfer mentioned this pull request Jul 4, 2024
@firesurfer (Contributor):

@fmauch

> Execution with scaling factor of 0.5 takes twice as long.

Yes, we sample the trajectory half as fast.

> Execution with scaling factor of 2 should take 50% of the time.

Yes, we sample the trajectory twice as fast.

> Execution with scaling factor of 0 should take as long as expected

I would call this paused. I have the feeling this is one of the major use cases of this controller.

> Independent of goal and path tolerances the action should not fail -- the goal time is scaled, as well.

In case the trajectory finishes within the (scaled) goal time, it should not fail -> yes. Otherwise the combination of a scaled trajectory + a set goal time tolerance would always fail.
Is there a specific use case you have in mind that would not be covered by that?

@fmauch (Contributor, Author) commented Jul 5, 2024

My concern is mainly this:

Suppose you have an application and you execute a trajectory on a robot that does speed scaling. The trajectory gets slowed down for safety reasons, which means it finishes later than originally requested. This will slow down the whole application, which will no longer behave as programmed. This might be considered a fault.

One could argue that

  1. Dependent follow-up actions should be triggered once their post-conditions are met (the motion action finished) and not by time.
  2. If execution time is critical, that can be monitored in a wrapper. MoveIt!, for example, does that by default.

I just quickly discussed this with a colleague and he also agreed that the behavior we currently have is probably the expected one. He also emphasized the second point.

So, I guess, I'll just leave that result standing. As I said, I can definitely live with it. And it's probably even better / more appropriate than just "I can live with that".

@fmauch (Contributor, Author) commented Jul 5, 2024

Update on the tests:

  • Testing is currently difficult in this repository, see Test failures: subscription already associated with a wait set #1101
  • There is also a crash in one of the tests; I will open a separate issue for that.
  • Running the current tests produces an error in almost all tests (see the sketch below for what triggers this exception):
    unknown file: Failure
    C++ exception with description "can't subtract times with different time sources [3 != 1]" thrown in the test body.
    
  • There is currently an issue with the goal time tolerance. The action goal's tolerance always overwrites the one from the parameter. I'll add a PR for that.
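For reference, a minimal reproduction of that exception type (not taken from the test code itself): subtracting two rclcpp::Time values backed by different clock types throws, and the bracketed numbers are the rcl clock-type enum values (RCL_STEADY_TIME = 3, RCL_ROS_TIME = 1):

```cpp
#include <rclcpp/time.hpp>

int main()
{
  rclcpp::Time ros_time(0, 0, RCL_ROS_TIME);        // clock type 1
  rclcpp::Time steady_time(0, 0, RCL_STEADY_TIME);  // clock type 3

  // Throws std::runtime_error with the message quoted in the test output above:
  // "can't subtract times with different time sources [3 != 1]"
  auto diff = steady_time - ros_time;
  (void)diff;
  return 0;
}
```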

@christophfroehlich (Contributor):

> testing is currently difficult in this repository, see [Test failures: subscription already associated with a wait set #1101](https://github.com/ros-controls/ros2_controllers/issues/1101)

Over the last weeks, I have compiled ros2_control + controllers on jammy + humble to run the tests.

return false;
}

RCLCPP_INFO(get_node()->get_logger(), "New scaling factor will be %f", scaling_factor);
Contributor:

I think it should be possible to disable this logging message. Someone could have the idea of continuously streaming the scaling factor, and this would lead to spamming the log/terminal.

@fmauch (Contributor, Author):

We could gate that output with an equality check against the current value and add a throttle to it. I think that might be a better approach than adding yet another config value.
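A minimal sketch of that combination, for illustration; `scaling_factor_` (the currently active value) is a placeholder member name and the throttle period is arbitrary:

```cpp
// Only log when the requested value actually differs from the active one, and
// additionally throttle the output so a continuously streaming publisher
// cannot spam the log/terminal.
if (scaling_factor != scaling_factor_)
{
  RCLCPP_INFO_THROTTLE(
    get_node()->get_logger(), *get_node()->get_clock(), 1000,  // at most once per second
    "New scaling factor will be %f", scaling_factor);
}
```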

Contributor:

And/or use a different child logger that can be silenced separately.

@firesurfer (Contributor):

> Update on the tests:
>
> • testing is currently difficult in this repository, see [Test failures: subscription already associated with a wait set #1101](https://github.com/ros-controls/ros2_controllers/issues/1101)
> • There is also a crash in one of the tests, I will open a separate issue with that
> • Running the current tests produces an error in almost all tests:
>   ```
>   unknown file: Failure
>   C++ exception with description "can't subtract times with different time sources [3 != 1]" thrown in the test body.
>   ```

So what about the remaining issues, especially regarding the tests? From what I can tell so far:

  1. It does not build on rolling because the updated control_msgs is not used there. It would probably make sense to merge the control_msgs change first. Given that the scaled JTC should/will definitely go upstream, there should be no reason not to merge it as soon as possible.
  2. On Jazzy, where the build apparently succeeds, there are multiple unrelated tests failing.

Regarding the tests: @christophfroehlich I am not sure if your suggestion works. I had to explicitly make 1-2 small fixes to some ros2_control packages in order to compile ros2_control on iron. I would guess it's the same for humble. Is there currently any work in progress to get the tests working on rolling again?

@fmauch I guess given that you can't run any of the tests at the moment you didn't start to implement any of the new tests we discussed before?

> There is currently an issue with the goal time tolerance. The action goal's tolerance always overwrites the one from the parameter. I'll add a PR for that.

Is there already an issue for that? Do we need to discuss it in a separate issue?

@christophfroehlich (Contributor):

@firesurfer we have a CI job for that; it currently builds, but some tests fail / are flaky:
https://github.com/ros-controls/ros2_control_ci/actions/runs/9866959705

@fmauch (Contributor, Author) commented Jul 16, 2024

> @fmauch I guess given that you can't run any of the tests at the moment you didn't start to implement any of the new tests we discussed before?

Precisely. #1206 could be the PR removing this barrier.

> Is there already an issue for that? Do we need to discuss it in a separate issue?

Sorry for not answering earlier: see #1192.

So, I'll probably start implementing tests tomorrow, either locally rebased on top of #1206 or on a merged master :-)

@fmauch (Contributor, Author) commented Jul 18, 2024

I was sick yesterday, but I'll try my best to squeeze in the tests by the end of the week.

mergify bot commented Jul 21, 2024

This pull request is in conflict. Could you fix it @fmauch?

@firesurfer (Contributor) commented Aug 6, 2024

@fmauch what's the current state?

@fmauch (Contributor, Author) commented Aug 8, 2024

> @fmauch what's the current state?

I've spent some time last week (or the week before?) trying to get the tests running correctly.

With that I realized that I basically ran into #697. The JTC currently samples the trajectory at the time given to the update loop. In the scaled version, however, we basically sample at $t+\Delta t$. That is a conceptual difference. In the latest working group meeting we decided that I will change the JTC accordingly, as this is actually the "proper" approach.

Another issue related to that is that we move forward in the trajectory only based on the controller periods combined with the speed scaling (which again differs from the current method, which is why I will have to rewrite the tests). We can't really use the time directly, since the time we have spent in the trajectory has been scaled, so we have to sum up the scaled periods (see the sketch below).
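For illustration, a sketch of that idea (the names are made up, not the actual JTC internals): the controller keeps its own trajectory time that only advances by the scaled period, and samples there instead of at the absolute update time.

```cpp
#include <rclcpp/duration.hpp>

// Advance the internal trajectory time by the scaled period. With a scaling
// factor of 0.5 the trajectory time grows at half the rate of wall time, so
// execution takes twice as long. Sampling would then happen at
// trajectory_start + time_in_trajectory (+ one more scaled period for t + dt).
void advance_trajectory_time(
  rclcpp::Duration & time_in_trajectory, const rclcpp::Duration & period, double scaling_factor)
{
  time_in_trajectory = time_in_trajectory + period * scaling_factor;
}
```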

This will also get interesting when it comes to trajectory replacement, I think, but I haven't looked into that yet.

So, in summary: it looks like the devil is in the details and this still requires some implementation work. While looking at that I did some refactoring which I haven't pushed yet, as I want to land it together with working basic tests to make sure the refactoring actually behaves as expected. As I will be on vacation for the next three weeks I don't know how much time I will find for that, though. But I will try.

@fmauch (Contributor, Author) commented Aug 12, 2024

I'm planning to split things up into separate PRs to keep the scope of each change smaller, especially when it comes to adapting the tests.

  • Use $t+\Delta t$ in the JTC.
  • Calculate the sample time based upon the passed periods instead of the absolute time. This is a requirement for the scaled JTC, since we have to sum up the scaled periods to move forward in the trajectory.
  • Add speed scaling. This will basically only add the interfaces for speed scaling and change things so that the period by which the trajectory is moved forward isn't necessarily the complete last update cycle (a rough sketch of reading such an interface follows below).
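As an illustration of the third point, reading the scaling factor from an optional hardware state interface could look roughly like this; the interface name and the fallback behavior are assumptions, not the final design:

```cpp
#include <string>
#include <vector>

#include <hardware_interface/loaned_state_interface.hpp>

// Look for a speed-scaling state interface exported by the hardware and fall
// back to 1.0 (unscaled execution) if none is available.
double read_scaling_factor(
  const std::vector<hardware_interface::LoanedStateInterface> & state_interfaces)
{
  for (const auto & interface : state_interfaces)
  {
    if (interface.get_name() == "speed_scaling/speed_scaling_factor")
    {
      return interface.get_value();
    }
  }
  return 1.0;
}
```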

I'll leave out trajectory replacement for now, since that shall be implemented at a later point anyway.

It might still make sense to merge them all in one PR, but I'll at least separate development like that.

@christophfroehlich (Contributor):

> It might still make sense to merge them all in one PR, but I'll at least separate development like that.

Ping me if you want me to create a feature/scaled_jtc branch in this repo for your individual PRs. However, the first two points seem to be somewhat independent.

mergify bot commented Aug 26, 2024

This pull request is in conflict. Could you fix it @fmauch?

@firesurfer (Contributor):
@fmauch

> feel free to poke me :-)

Any updates? :D

@fmauch (Contributor, Author) commented Sep 14, 2024

> Any updates? :D

I almost finished the t+dT change on https://github.com/fmauch/ros2_controllers/tree/jtc_tdt. This change still makes tests fail, which I have to investigate: fmauch@f453049#diff-8dd2135dc1f4abe6382cc7671ef9ce939196b2d8adf18ac3a77b5c0fddf1a25fR287

@fmauch (Contributor, Author) commented Oct 7, 2024

Update: #1306 implements sampling at $t + \Delta t$. Furthermore, ros-controls/ros2_control#1774 should fix the incorrect controller periods reported to the controller.

With this, the next step would be to propagate through the trajectory based on the accumulated periods instead of the absolute time. I expect it will be quite a lot of work to make all tests comply with that.

I will update this PR once #1306 is merged.

@fmauch (Contributor, Author) commented Oct 14, 2024

Another update:

Edit:

Note: You will need ros-controls/ros2_control#1774 in order to get valid period times.
