Fuzzy-matching Trajectory Cache Injectable Traits refactor 🔥🔥 #2941

methylDragon · 2024-07-31T08:38:47Z

As requested in the original PR, this PR refactors the TrajectoryCache to allow users to inject their own behaviors (which will allow them to cache on and sort by any arbitrary feature, as long as it can be represented as a series of warehouse_ros columns).

Depends on:

Add TF Buffer accessor to MoveGroupInterface #2944

Builds on top of:

Implement fuzzy-matching Trajectory Cache 🔥 #2838

TODOs:

Fix integration-test
Formatting pass (will be done after review)
Fix tutorials
Fix bugs

Preamble

I apologize that this PR is even larger than the one it builds on top of. Most of the added lines are docstrings or tests, and boilerplate to support the behavior injection pattern this refactor is concerned with.

On the other hand, the average file length has decreased, so the code is MUCH more modular and hopefully easy to digest.

I can't split up this PR into multiple smaller ones since technically speaking, in order to preserve cache functionality, all the feature extractors and cache insert policies introduced in this PR will need to go in together.

I would suggest looking at the tests to aid in review (they run the gamut of unit and integration tests).

You can also build and run the example:

Add Trajectory Cache Example For Refactor moveit2_tutorials#940

PS: If the size is still prohibitive, and we are okay with having partial implementations live in moveit2 while reviews are pending, let me know and I will split up the PR into smaller PRs (though I suspect at that point, that a logical number of splits might end up being somewhere close to 5-10 PRs.)

Navigating This PR

Since this PR is so large, here is a proposed order for comprehending the PR.

Fully read this PR description and the example code in the description
Build and run the demo to convince yourself that the cache works (instructions in that PR)
- Add Trajectory Cache Example For Refactor moveit2_tutorials#940
Look at the new interfaces introduced
- features/features_interface.hpp, cache_insert_policies/cache_insert_policy_interface.hpp
Then look at their implementations and tests
- features/, cache_insert_policies/
Then look at the main TrajectoryCache class
- trajectory_cache.hpp, trajectory_cache.cpp
Then tie it all together by looking at the example usage of the classes in this PR in the demo code in Add Trajectory Cache Example For Refactor moveit2_tutorials#940

Additionally, all docstrings are filled, including file ones, as appropriate. So hopefully any clarificatory questions would have already been pre-empted and answered.

Description

This PR builds on top of the fuzzy-matching Trajectory Cache introduced in:

Implement fuzzy-matching Trajectory Cache 🔥 #2838
Once that is merged, the diff for this PR should reduce significantly.

The implementation in that PR coupled the cache tightly with assumptions about what features to extract and sort by (i.e., a specific set of start/goal constraints, and pruning by execution time.)

This means that users who might want to store different features or a subset of those features, or who might want to fetch and prune on different features (e.g., minimum jerk, path length, etc.) will be unable to.

This PR refactors the cache to allow users to inject their own feature extractors and pruning/insertion logic!

This is done by introducing two new abstract classes that can be injected into the cache methods, acting a lot like class "traits":

FeaturesInterface<>: Governs what query/metadata items to extract and append to the warehouse_ros query.
CacheInserterInterface<>: Governs the pruning and insertion logic.

For more details on FeaturesInterface, see the Docstrings: https://github.com/moveit/moveit2/blob/cc0feb37cf423076e133523ccdbbf3038b84a01e/moveit_ros/trajectory_cache/include/moveit/trajectory_cache/features/features_interface.hpp

Some notes:

I decided to not go with lambdas, because there should be tight coupling between the query and metadata insertion logic for a particular feature.
Similarly, cache insertion logic heavily benefits from being stateful, and coupling those chunks of logic together.

Example

In other words, before. the cache was used like this:

auto traj_cache = std::make_shared<TrajectoryCache>(node);
traj_cache->init(/*db_host=*/":memory:", /*db_port=*/0, /*exact_match_precision=*/1e-6);

auto fetched_trajectory =
    traj_cache->fetchBestMatchingTrajectory(*move_group_interface, robot_name, motion_plan_req_msg,
                                            _cache_start_match_tolerance, _cache_goal_match_tolerance,
                                            /*sort_by=*/"execution_time_s", /*ascending=*/true);

if (fetched_trajectory)
{
  // Great! We got a cache hit
  // Do something with the plan.
}
else
{
  // Plan... And put it for posterity!
  traj_cache->insertTrajectory(
      *interface, robot_name, std::move(plan_req_msg), std::move(res->result.trajectory),
      rclcpp::Duration(res->result.trajectory.joint_trajectory.points.back().time_from_start).seconds(),
      res->result.planning_time, /*delete_worse_trajectories=*/true);
}

Now the cache is used like this:

auto traj_cache = std::make_shared<TrajectoryCache>(node);
traj_cache->init(/*db_host=*/":memory:", /*db_port=*/0, /*exact_match_precision=*/1e-6);

std::vector<std::unique_ptr<FeaturesInterface<MotionPlanRequest>>> features;
features.emplace_back(std::make_unique<WorkspaceFeatures>());
features.emplace_back(std::make_unique<StartStateJointStateFeatures>(start_tolerance));
// ...

auto fetched_trajectory =
    traj_cache->fetchBestMatchingTrajectory(move_group, robot_name, motion_plan_req_msg,
                                            /*features=*/features,
                                            /*sort_by=*/TrajectoryCache::getDefaultSortFeature(),
                                            /*ascending=*/true);

// Or more simply, if you want the same feature set as before the refactor, instead of painfully listing the features one by one:
// Type: std::vector<std::unique_ptr<FeaturesInterface<MotionPlanRequest>>>
auto default_features = TrajectoryCache::getDefaultFeatures(_cache_start_match_tolerance, _cache_goal_match_tolerance);

if (fetched_trajectory)
{
  // Great! We got a cache hit
  // Do something with the plan.
}
else
{
  // Plan... And put it for posterity!
  //
  // NOTE: Now instead of passing a trajectory, pass the plan result,
  // it'll contain the execution time and planning time we need!
  //
  // cache_inserter is a CacheInserterInterface<MotionPlanRequest, MotionPlanResponse, msg::RobotTrajectory>
  // It will tell the trajectory cache:
  //   - how to fetch "matching entries"
  //   - how to determine if they should be pruned,
  //   - how to determine when to insert the candidate cache entry
  //   - and what metadata to attach 
  //
  // additional_features allows a user to further add more metadata features for use with fetching
  // though they will not be considered by the cache_inserter
  traj_cache->insertTrajectory(
      move_group, robot_name, std::move(plan_req_msg), std::move(plan),
      /*cache_inserter=*/BestSeenExecutionTimePolicy(),
      /*prune_worse_trajectories=*/true, /*additional_features=*/{});
}

See the motion plan request features here: 79b7f95

The Feature Contract

Users may use an instance of FeaturesInterface<> to fetch a cache entry only if it was supported by the CacheInserterInterface<> instance that they used (or on insert, the feature was added in additional_features).

I could not think of a way to create a coupling between uses of the cache inserters and the features. This is the cost of generality and allowing users to inject arbitrary logic into the cache.

As such, users must take care to look at the docs of the cache inserter to see what features can be used with them.

(This can be mitigated by adding helper methods to get "standard" bundles of features and a "standard" CacheInserter.)

Bonus

I added new features to the default feature extractors set and cleaned up some utilities!

There are now FeaturesInterface<> implementations that can handle path and trajectory constraints!
Multiple goal constraints are also handled (at the cost of increased cardinality.)

I also added "atomic" features that wrap the basic ops you can do with warehouse_ros, to allow users to specify their own metadata to tag cache entries with.

Here: cc0feb3

Pre-Existing Support

The package now provides some starter implementations that covers most general cases of motion planning.

For more information, see the implementations of:

FeaturesInterface
CacheInsertPolicyInterface

Cache Keying Features

The following are features of the plan request and response that you can key the cache on.

These support separately configurable fuzzy lookup on start and goal conditions!
Additionally, these features "canonicalize" the inputs to reduce the cardinality of the cache, increasing the chances of cache hits. (e.g., restating poses relative to the planning frame).

Supported Features:

"Start"
- WorkspaceFeatures: Planning workspace
- StartStateJointStateFeatures: Starting robot joint state
"Goal"
- MaxSpeedAndAccelerationFeatures: Max velocity, acceleration, and cartesian speed limits
- GoalConstraintsFeatures: Planning request goal_constraints
  - This includes ALL joint, position, and orientation constraints (but not constraint regions)!
- PathConstraintsFeatures: Planning request path_constraints
- TrajectoryConstraintsFeatures: Planning request trajectory_constraints

Additionally, support for user-specified features are provided for query-only or cache metadata tagging constant features.

Similar support exists for the cartesian variants of these.

Cache Insert and Pruning Policies

The following are cache insertion and pruning policies to govern when cache entries are inserted, and how they are (optionally) pruned.

Supported Cache Insert Policies:

BestSeenExecutionTimePolicy: Only insert best seen execution time, optionally prune on best execution time.
AlwaysInsertNeverPrunePolicy: Always insert, never prune

Caveat

The increased functionality is now no longer 100% covered. But I tried adding tests where I had time to. I am unfortunately running out of time to iterate on this, so let's be targeted with the improvements!

Build/Test

source /opt/ros/rolling/setup.bash
sudo rosdep init
rosdep update

mkdir -p traj_cache_ws/src
cd traj_cache_ws/src
git clone https://github.com/methyldragon/moveit2.git -b ch3/trajectory-cache-refactor
for repo in moveit2/moveit2.repos $(f="moveit2/moveit2_rolling.repos"; test -r $f && echo $f); do vcs import < "$repo"; done

cd ..
rosdep install --from-paths src --ignore-src --rosdistro rolling -y -r
colcon build --packages-up-to moveit_ros_trajectory_cache --cmake-args -DCMAKE_BUILD_TYPE=Release

source install/setup.bash
colcon test --packages-select moveit_ros_trajectory_cache

Precommit with:

pre-commit run --all

methylDragon · 2024-08-01T07:32:22Z

Quick question: clang-format/tidy is erroneously editing the template parameters.

How do I get around it?
Related bug: llvm/llvm-project#46097

Is it acceptable to throw in NOLINT directives?

stephanie-eng · 2024-08-02T03:30:21Z

You'd be far from the first use-case of NOLINT, but would recommend judicious use of it.

sea-bass · 2024-12-29T13:43:45Z

Latest commit seems to look good -- I am rerunning CI a few times just to confirm. Thanks for working on this, @methylDragon!

sea-bass · 2024-12-29T13:48:09Z

Ah, the 2nd run of CI failed in the rolling/clang-tidy job, actually.

I'd be a bit wary of merging with these flaky tests. Any ideas on how to improve this behavior? And what do you think was merged upstream to be causing this?

Maybe #3143?

flaky tests

methylDragon · 2024-12-29T16:23:52Z

Ah, the 2nd run of CI failed in the rolling/clang-tidy job, actually.

I'd be a bit wary of merging with these flaky tests. Any ideas on how to improve this behavior? And what do you think was merged upstream to be causing this?

Maybe #3143?

It's the ros2_control_node itself that's getting a segfault though. I doubt this is related to moveit :/

The test this PR introduces basically checks if the nodes involved with the launchfile all terminated nicely, and also had the appropriate number of PASS printouts. (No way around it otherwise since move_group requires the robot node and ros2_control to work)

Test is failing because ros2_control's node segfaults and returns a bad exit code. This is a different failure mode from the one my commit fixed.

sea-bass · 2024-12-29T19:07:52Z

It's the ros2_control_node itself that's getting a segfault though. I doubt this is related to moveit :/

Test is failing because ros2_control's node segfaults and returns a bad exit code. This is a different failure mode from the one my commit fixed.

Those ros2_control segfaults in integration tests are unfortunately false alarms in all these tests -- there is an actual assertion failing here: https://github.com/moveit/moveit2/actions/runs/12533008564/job/34958234846?pr=2941#step:11:10051

    [test_best_seen_execution_time_policy_with_move_group-8] /home/runner/work/moveit2/moveit2/.work/target_ws/src/moveit2/moveit_ros/trajectory_cache/test/cache_insert_policies/test_best_seen_execution_time_policy_with_move_group.cpp:170: Failure
    [test_best_seen_execution_time_policy_with_move_group-8] Value of: ret
    [test_best_seen_execution_time_policy_with_move_group-8]   Actual: false
    [test_best_seen_execution_time_policy_with_move_group-8] Expected: true
    [test_best_seen_execution_time_policy_with_move_group-8] BestSeenExecutionTimePolicy: Empty joint trajectory points.
    [test_best_seen_execution_time_policy_with_move_group-8] 
    [test_best_seen_execution_time_policy_with_move_group-8] [  FAILED  ] MoveGroupFixture.BestSeenExecutionTimePolicyWorks (5209 ms)

methylDragon · 2024-12-30T04:40:55Z

It's the ros2_control_node itself that's getting a segfault though. I doubt this is related to moveit :/
Test is failing because ros2_control's node segfaults and returns a bad exit code. This is a different failure mode from the one my commit fixed.

Those ros2_control segfaults in integration tests are unfortunately false alarms in all these tests -- there is an actual assertion failing here: https://github.com/moveit/moveit2/actions/runs/12533008564/job/34958234846?pr=2941#step:11:10051
    [test_best_seen_execution_time_policy_with_move_group-8] /home/runner/work/moveit2/moveit2/.work/target_ws/src/moveit2/moveit_ros/trajectory_cache/test/cache_insert_policies/test_best_seen_execution_time_policy_with_move_group.cpp:170: Failure
    [test_best_seen_execution_time_policy_with_move_group-8] Value of: ret
    [test_best_seen_execution_time_policy_with_move_group-8]   Actual: false
    [test_best_seen_execution_time_policy_with_move_group-8] Expected: true
    [test_best_seen_execution_time_policy_with_move_group-8] BestSeenExecutionTimePolicy: Empty joint trajectory points.
    [test_best_seen_execution_time_policy_with_move_group-8] 
    [test_best_seen_execution_time_policy_with_move_group-8] [  FAILED  ] MoveGroupFixture.BestSeenExecutionTimePolicyWorks (5209 ms)

Ah! I see, I think this is an error on my part while writing the tests.
Let's try with 6b19476

Almost to the finish line! Thanks for the prompt reviews 🙇

Signed-off-by: methylDragon <[email protected]>

sea-bass · 2024-12-30T06:03:20Z

Seems the lingering flaky failures are now based on exit codes. You could consider trying something like this:

a401e13

Signed-off-by: methylDragon <[email protected]>

methylDragon · 2024-12-30T12:59:28Z

Seems the lingering flaky failures are now based on exit codes. You could consider trying something like this:

a401e13

Hmm, okay. Less ideal since it feels like papering over the issue, but okay since technically the segfaults are unrelated. 03e0dc4

sea-bass · 2024-12-30T15:10:25Z

Aw man, latest run has a real failure in the test again...

methylDragon · 2024-12-30T21:32:38Z

Oh man, let me continue investigating. Sorry for the repeated back and forths..

Signed-off-by: methylDragon <[email protected]>

methylDragon · 2024-12-30T22:25:45Z

d78fd22 is the only thing I can think of, otherwise I will need to const define any trajectories in the test (which will be quite wordy)

EDIT: Looks like the CI failure is a different flaky test now (pilz). Unfortunately my local setup makes it difficult to replicate CI, but I did run it a couple times and didn't see the same failure we saw before.

sea-bass

Seems fine now -- in 5 tries, only got another exit code related failure on a Pilz test, which is for sure not linked to this PR.

Thanks for all the hard work!

methylDragon · 2024-12-31T00:41:39Z

Seems fine now -- in 5 tries, only got another exit code related failure on a Pilz test, which is for sure not linked to this PR.

Thanks for all the hard work!

Thanks for the in-depth reviews! They helped test stability and fixing some minor bugs in the code.

After this is merged the only thing left for this entire feature is the tutorial:

Add Trajectory Cache Example For Refactor moveit2_tutorials#940

Pinging @sjahr for re-approval (from stale review) and merge!

sea-bass · 2024-12-31T00:56:48Z

After this is merged the only thing left for this entire feature is the tutorial:

Add Trajectory Cache Example For Refactor moveit2_tutorials#940

Note that we've since branched Jazzy off of main and are only landing bug fixes; however, in the tutorials repo the main branch is currently for both Jazzy and Rolling. So we can either:

Add a disclaimer to your specific tutorial to use rolling/main from source
Branch a Jazzy branch of the tutorials / build a separate docs page for Jazzy (the right, but more annoying thing to do)

Pinging @sjahr for re-approval (from stale review) and merge!

Sebastian, feel free to merge when/if you approve after the updates.

methylDragon · 2024-12-31T01:55:20Z

Add a disclaimer to your specific tutorial to use rolling/main from source

Branch a Jazzy branch of the tutorials / build a separate docs page for Jazzy (the right, but more annoying thing to do)

Not sure how to do the latter, so I'm just gonna do the former 😬

sea-bass · 2024-12-31T01:58:11Z

I meant more that the maintainers could help do the branching if this will be the jump-off point for the tutorials as well. But I'm still on the fence since this feature is very much at the "leaf" level of MoveIt.

sjahr

@methylDragon @sea-bass Thanks for your work to bring this over the finish line ❤️

methylDragon force-pushed the ch3/trajectory-cache-refactor branch 12 times, most recently from 91b48e8 to cc0feb3 Compare August 1, 2024 05:50

methylDragon mentioned this pull request Aug 1, 2024

Add TF Buffer accessor to MoveGroupInterface #2944

Merged

methylDragon changed the title ~~(DO NOT MERGE) Ch3/trajectory cache refactor~~ (WIP) Pluggable Fuzzy Matching TrajectoryCache refactor Aug 1, 2024

methylDragon changed the title ~~(WIP) Pluggable Fuzzy Matching TrajectoryCache refactor~~ (WIP) Pluggable fuzzy-matching Trajectory Cache refactor Aug 1, 2024

methylDragon mentioned this pull request Aug 1, 2024

Implement fuzzy-matching Trajectory Cache 🔥 #2838

Merged

methylDragon changed the title ~~(WIP) Pluggable fuzzy-matching Trajectory Cache refactor~~ (WIP) Fuzzy-matching Trajectory Cache Traits refactor Aug 1, 2024

methylDragon changed the title ~~(WIP) Fuzzy-matching Trajectory Cache Traits refactor~~ (WIP) Fuzzy-matching Trajectory Cache Injectable Traits refactor Aug 1, 2024

methylDragon force-pushed the ch3/trajectory-cache-refactor branch from 79b7f95 to e64cad8 Compare August 1, 2024 07:29

methylDragon force-pushed the ch3/trajectory-cache-refactor branch 9 times, most recently from 77789f3 to 1feb211 Compare August 3, 2024 10:11

sea-bass previously approved these changes Dec 29, 2024

View reviewed changes

Mitigate test flakiness

0c4d4d5

Signed-off-by: methylDragon <[email protected]>

methylDragon force-pushed the ch3/trajectory-cache-refactor branch from 6b19476 to 0c4d4d5 Compare December 30, 2024 04:51

Allow -11 for move_group gtest fixture

03e0dc4

Signed-off-by: methylDragon <[email protected]>

methylDragon requested a review from sjahr December 30, 2024 12:59

Make execution times in test deterministic

d78fd22

Signed-off-by: methylDragon <[email protected]>

sea-bass approved these changes Dec 31, 2024

View reviewed changes

sjahr approved these changes Jan 3, 2025

View reviewed changes

Merge branch 'main' into ch3/trajectory-cache-refactor

6b74a24

sjahr enabled auto-merge January 3, 2025 14:31

sjahr added this pull request to the merge queue Jan 3, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 3, 2025

sea-bass added this pull request to the merge queue Jan 4, 2025

Merged via the queue into moveit:main with commit a12f327 Jan 4, 2025
8 of 9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fuzzy-matching Trajectory Cache Injectable Traits refactor 🔥🔥 #2941

Fuzzy-matching Trajectory Cache Injectable Traits refactor 🔥🔥 #2941

methylDragon commented Jul 31, 2024 •

edited

Loading

methylDragon commented Aug 1, 2024 •

edited

Loading

stephanie-eng commented Aug 2, 2024

sea-bass commented Dec 29, 2024 •

edited

Loading

sea-bass commented Dec 29, 2024 •

edited

Loading

methylDragon commented Dec 29, 2024 •

edited

Loading

sea-bass commented Dec 29, 2024

methylDragon commented Dec 30, 2024 •

edited

Loading

sea-bass commented Dec 30, 2024

methylDragon commented Dec 30, 2024

sea-bass commented Dec 30, 2024

methylDragon commented Dec 30, 2024

methylDragon commented Dec 30, 2024 •

edited

Loading

sea-bass left a comment •

edited

Loading

methylDragon commented Dec 31, 2024 •

edited

Loading

sea-bass commented Dec 31, 2024

methylDragon commented Dec 31, 2024

sea-bass commented Dec 31, 2024

sjahr left a comment

Fuzzy-matching Trajectory Cache Injectable Traits refactor 🔥🔥 #2941

Fuzzy-matching Trajectory Cache Injectable Traits refactor 🔥🔥 #2941

Conversation

methylDragon commented Jul 31, 2024 • edited Loading

Preamble

Navigating This PR

Description

Example

The Feature Contract

Bonus

Pre-Existing Support

Cache Keying Features

Cache Insert and Pruning Policies

Caveat

Build/Test

methylDragon commented Aug 1, 2024 • edited Loading

stephanie-eng commented Aug 2, 2024

sea-bass commented Dec 29, 2024 • edited Loading

sea-bass commented Dec 29, 2024 • edited Loading

methylDragon commented Dec 29, 2024 • edited Loading

sea-bass commented Dec 29, 2024

methylDragon commented Dec 30, 2024 • edited Loading

sea-bass commented Dec 30, 2024

methylDragon commented Dec 30, 2024

sea-bass commented Dec 30, 2024

methylDragon commented Dec 30, 2024

methylDragon commented Dec 30, 2024 • edited Loading

sea-bass left a comment • edited Loading

Choose a reason for hiding this comment

methylDragon commented Dec 31, 2024 • edited Loading

sea-bass commented Dec 31, 2024

methylDragon commented Dec 31, 2024

sea-bass commented Dec 31, 2024

sjahr left a comment

Choose a reason for hiding this comment

methylDragon commented Jul 31, 2024 •

edited

Loading

methylDragon commented Aug 1, 2024 •

edited

Loading

sea-bass commented Dec 29, 2024 •

edited

Loading

sea-bass commented Dec 29, 2024 •

edited

Loading

methylDragon commented Dec 29, 2024 •

edited

Loading

methylDragon commented Dec 30, 2024 •

edited

Loading

methylDragon commented Dec 30, 2024 •

edited

Loading

sea-bass left a comment •

edited

Loading

methylDragon commented Dec 31, 2024 •

edited

Loading