-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix the execution order of entites in executors to favor insertion order #2537
base: rolling
Are you sure you want to change the base?
Conversation
Signed-off-by: William Woodall <[email protected]>
Signed-off-by: William Woodall <[email protected]>
for (auto it = update_to.begin(); it != update_to.end(); ) { | ||
for (auto it = update_to.begin_ordered(); it != update_to.end_ordered(); ) { | ||
if (update_from.count(it->first) == 0) { | ||
auto entity = it->second.entity.lock(); | ||
if (entity) { | ||
on_removed(entity); | ||
} | ||
it = update_to.erase(it); | ||
it = update_to.erase_ordered(it); | ||
} else { | ||
++it; | ||
} | ||
} | ||
for (auto it = update_from.begin(); it != update_from.end(); ++it) { | ||
for (auto it = update_from.begin_ordered(); it != update_from.end_ordered(); ++it) { | ||
if (update_to.count(it->first) == 0) { | ||
auto entity = it->second.entity.lock(); | ||
if (entity) { | ||
on_added(entity); | ||
} | ||
update_to.insert(*it); | ||
bool inserted = update_to.insert(*it); | ||
// Should never be false, so this is a defensive check, mark unused too | ||
// in order to avoid a warning in release builds. | ||
assert(inserted); | ||
RCUTILS_UNUSED(inserted); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the important part of the changes that affect the order of execution when tied, because this is what controls the order in which they are added to the wait set in the case of the wait set based executors, i.e. everything except the EventsExecutor
.
/// Type of the map used for random access | ||
using MapType = std::unordered_map<const EntityKeyType *, CollectionEntry<EntityValueType>>; | ||
|
||
/// Type of the vector for insertion order access | ||
// Note, we cannot use typename MapType::value_type because it makes the first | ||
// item in the pair const, which prevents copy assignment of the pair, which | ||
// prevents std::vector::erase from working later... | ||
using VectorType = std::vector<std::pair<const EntityKeyType *, | ||
CollectionEntry<EntityValueType>>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This detail was annoying, the MapType::value_type makes the first argument of the pair const (would be const EntityKeyType * const
in this case), which prevents the use of the operator=
on the pair, which prevents std::vector::erase
from working...
Anyway, this is essentially a std::vector<typename MapType::value_type>
except for that detail...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just store the iterator returned by std::map::insert ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because they can be invalidated by future insertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also considered storing the pointer (which is the key for the map) only, but then when iterating in order you need to do a lookup in the map for each one you want to use, which is made redundant if you just store the information twice. In fact this conclusion is what several of the "ordered dict in c++" projects I looked at online ended up using.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because they can be invalidated by future insertions.
Nope, not for std::map :
https://en.cppreference.com/w/cpp/container/map/insert
No iterators or references are invalidated. If the insertion is successful, pointers and references to the element obtained while it is held in the node handle are invalidated, and pointers and references obtained to that element before it was extracted become valid.(since C++17)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the point is to use the unordered_map
so that you get the better look up performance (Constant on average, worst case linear in the size of the container.
for unordered_map
vs Logarithmic in the size of the container.
for map
), and in that case I interpret it to mean that rehashing will invalidate them:
If after the operation the new number of elements is greater than old max_load_factor() * bucket_count() a rehashing takes place.
If rehashing occurs (due to the insertion), all iterators are invalidated. Otherwise (no rehashing), iterators are not invalidated. If the insertion is successful, pointers and references to the element obtained while it is held in the node handle are invalidated, and pointers and references obtained to that element before it was extracted become valid.(since C++17)
-- https://en.cppreference.com/w/cpp/container/unordered_map/insert
Which is also the conclusion of this SO answer (which I was using to try and confirm my interpretation): https://stackoverflow.com/a/54004916
For what it's worth, I considered using std::map
and its iterators too.
However, the conclusion I came to is that whether it's a vector of iterators (pointers) or a vector of triplets of pointers, the impact is not much different on memory, even at relatively large scales. Instead, I thought preserving the cpu time optimizations was better for the executor, given the feedback we've consistently gotten, but that's a generalization.
auto remove_entities = [](auto & collection) -> size_t { | ||
size_t removed = 0; | ||
for (auto it = collection.begin(); it != collection.end(); ) { | ||
for (auto it = collection.begin_ordered(); it != collection.end_ordered(); ) { | ||
if (it->second.entity.expired()) { | ||
++removed; | ||
it = collection.erase(it); | ||
it = collection.erase_ordered(it); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was not strictly necessary, but I reasoned it was slightly faster to provide the iterator for erasure to the vector and look up the iterator for the map, rather than the other way around, which is what erase_ordered
does.
collection.insert( | ||
bool inserted = collection.insert( | ||
{ | ||
this->notify_waitable_.get(), | ||
{this->notify_waitable_, weak_group_ptr} | ||
}); | ||
// Explicitly ignore if the notify waitable was not inserted because that means | ||
// it was already inserted, which happens initially as it is explicitly added | ||
// in the constructor as well as every time the collection is reset, so on | ||
// the first reset there is a second insertion attempt. | ||
// We could check before trying to insert, but that would require a "find" call | ||
// on each refresh, which is expensive, and otherwise it would require additional | ||
// state in this class to detect the initial case where it is added twice. | ||
// Therefore we just insert and ignore it if it fails (the only way it fails | ||
// is when a duplicate is inserted). | ||
RCUTILS_UNUSED(inserted); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an oddity I found with the events executor, where it was always inserting the notify waitable twice on one occasion (first refresh after construction), and it wasn't checking the return value of insert, which if we had checked would have occasionally been false meaning it wasn't inserted. This didn't end up mattering, but it did feel weird and broke some things when I added the vector storage along side the map.
Anyway I landed on this approach where we continue to add it twice sometimes, but we just ignore it when it wasn't inserted as the fact that it has been inserted is important, not the order in which it was, at least in the specific case of this notify waitable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI @mauropasse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this was a bug.
I opened a PR to fix it #2564
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks for checking up on it. I guess I'll wait for that pr to merge and then update this. I have to check into timer execution order on Windows for these new tests anyways.
As I've been looking into how all of this works, it seems to me that the bulk of the complexity associated with the
As the |
I don't think that's actually true. The SingleThreadedExecutor does actually consider the callback groups, but in a very limited fashion. There's actually a test that disables individual callback groups explicitly, and checks that executors do not execute from them. Now whether or not that's useful isn't clear to me, but as it stands CallbackGroups are a non-optional part of the executor related APIs, and for better worse we cannot just ignore them. Now we could change that, but it would require changing some of the semantics, but that's not something we can do in a stable release, in my opinion. But we can consider it for the future. From a technical angle, there's nothing that requires that any executor, including the two you mentioned, use the Entity Collection classes, it was just convenient. However, in my experience you cannot remove much if anything that's there and still handle ownership mechanics correctly, e.g. nodes hold pointers to callback groups which holds pointers to entities and you have to handle gracefully removing them when the user lets any go out of scope, but you have to hold ownership in the executor while using them, etc... So feel free to try and optimize them, but just be aware that there be dragons, lol. Also, despite the name, the |
Signed-off-by: William Woodall <[email protected]>
As discussed in the client library WG, an important feature (that is currently achieved through callback groups) is to allow to add only some of the node's entities to an executor
We usually decided to not focus too much on the overhead of insertion/deletion (compared to the overhead of the fast-loop: selection and execution of entities). |
Signed-off-by: William Woodall <[email protected]>
I'm not sure if we only ever insert/delete when the user adds or removes something, e.g. copying or updating a collection can happen even when the user hasn't added or removed anything, but maybe I'm wrong about that. We have two collections in the executor base class which are swapped (sort of) at specific intervals: rclcpp/rclcpp/src/rclcpp/executor.cpp Lines 683 to 717 in 3bc364a
Anyway, maybe @mjcarroll can expand on that. This is still something I'm going to investigate before pushing to merge this. |
Well, rebuilding the collection only when something changes was the purpose of the "notifier waitable" and the "entities collection" classes, and that's how they were used in the events executor and static single threaded executor. |
Signed-off-by: William Woodall <[email protected]>
That was my hope, but I wanted to confirm which I am going to do soon hopefully. |
That is the behavior unless something was done incorrectly. If the set of the entities in the collection stays constant, the waitset should not be rebuilt. |
I closed #2536 in favor of this pr (it was just the test, with no fix), but cross-posting here for visibility since there were some good discussions on that pr too. |
@wjwwood did you have a chance to check if the collection was correctly functioning? i.e. it's rebuilt only when something changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall lgtm, just a minor comment for the test.
"reverse call order", | ||
{ | ||
{"waitable", {false, reverse, forward}}, | ||
{"subscription", {false, reverse, forward}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we skip this test, i think the same test is executed twice.
{"subscription", {false, reverse, forward}}, | |
{"subscription", {true, reverse, forward}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked again, it doesn't seem like we're doing the same test twice (not exactly), so maybe you could explain why you think that?
If we were to disable this test, then we should do {true, {},{}}
also, as the call order vs expected order don't matter if it is skipped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think call_order
is not used in the test of the subscription
(maybe cannot be used), so it always check against expected_execution_order
which is forward
. in that case, we are doing the same test twice?
I can confirm that the insertions only occur when entities are created or destroyed, so the overhead of this fix should be minimal. I just added print statements during insertion as well as some to show what steps of the test we're in:
You can see that it does some insertions when the callback group is added, but not between spins, which was expected. |
Signed-off-by: William Woodall <[email protected]>
I tested it with print statements, and I started to add a unit test, but checking it as a unit test became pretty difficult, since it's not just testing a unit, but rather two working together (with different results for each pair), e.g. STE with the So unfortunately I'll have to leave that for others or future me to investigate. |
if (update_to.count(it->first) == 0) { | ||
auto entity = it->second.entity.lock(); | ||
if (entity) { | ||
on_added(entity); | ||
} | ||
update_to.insert(*it); | ||
bool inserted = update_to.insert(*it); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why insert an outdated entity? I guess this should be in the if(entity) ?
@@ -72,34 +73,47 @@ void update_entities( | |||
std::function<void(const typename CollectionType::EntitySharedPtr &)> on_removed | |||
) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here is a potential bug. If between two call of update_entities the order of elements in update_from changes, the order in update_to will not change. I am not really sure if this might happen though.
In specific circumstances, multiple entities (of the same kind) in an executor can be ready at the same time and the executor must arbitrarily decide which to execute first, and in Iron and before, this fell back to the order in which they were added, but in Jazzy we (inadvertently) changed this and it became related to the hash order, subtly changing the order of execution that some folks had come to rely on, see: #2532
While this ordering was not guaranteed, this pr aims to restore the previous behavior for consistency and to provide a less surprising result, but the order is still not a guarantee nor is it the same for all executor types (e.g. the EventsExecutor is consistent, but is instead based on the trigger order of the entities, not their insertion/addition order).
The new test is also in a separate pr here: #2536 so it can be more easily back ported to previous versions. I think it should pass on them, but I still need to test it. See that pr for updates.
This approach will not work for backporting as it breaks ABI, but I believe I could make a version that doesn't break ABI (it will be as ugly as past non-ABI breaking changes). I just want to make sure we want to follow through with this before committing resources to that attempt.
This pr does incur some additional overhead, but I believe it should be relatively low, considering what we're storing. It does, however, slow down insertion and erasure from the collections which could be a concern in the tight loop of spin. I will do some more experimentation related to this, but the memory overhead of storing the entity pointers of the collection twice (once as an unordered map for fast random access and once in a vector for maintaining insertion order) should be relatively small. It does increase the memory footprint by 2x, but the footprint was small to begin with (three pointers per entity).
I looked at using existing "ordered maps" but all of them (that I could see) either provided the desired API but lost the fast lookup of the hash or did what I did here and maintained the information twice in two different data structures.
So I think this is a reasonable approach, but I'm up for discussion on alternatives.