[FMV] Multi Versioned Inlining #71714
Comments
cc: @ilinpv |
For FMV the caller and callee attributes need to be compatible. This may not be true for the default implementation. |
Thanks for bringing this up. In the worst case, when nothing can be picked, just bail out (and issue a remark), as it is probably not worth doing.
I wonder how much of this should be added to the ACLE to make sure all toolchains behave the same way. |
This sounds like an issue with inlining in general, and might not be specific to this FMV optimization. |
I think there's a choice to be made here as to how to manage partial resolution when the caller and callee have different target_version attributes in their respective cohorts:
The simplest answer is probably to declare that FMV'd callers will directly call the FMV'd callee with the same feature set, if such a callee exists. Otherwise, dispatch through the resolver. If we wanted to be precise, the correct best choice is to take the feature sets of the caller and callee, and take their intersection, selecting the result with the highest priority. Either one of these options can reveal a behavioral change compared to dispatching through the resolver on platforms that support a feature enabled in a callee's target_version when the corresponding caller's target_version doesn't exist. |
... and given that, IMO the "simple answer" is the best one, and should be how this gets codified into the ACLE. |
This "simple" answer would be wrong in some cases. If a function foo with "sve" and "default" versions calls a function bar with "sve2", "sve" and "default" versions, then you can't remove the indirection from the sve version of foo, because you don't know whether the correct version of bar is the "sve" one or the "sve2" one. Let's use For a caller version
The second of these points is checking that the resolver wouldn't select any previous callee version, and the first point is checking that the resolver wouldn't skip over the candidate callee version. Aside from this, there's also a requirement to check that the function can't be interposed by a different implementation at link or load time; this is similar to the checks required to establish whether inlining is safe. Updated 2024-04-02: The last references to |
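To make the foo/bar counterexample concrete, here is a small brute-force model; it is a sketch, not LLVM's implementation. It assumes a resolver that picks the first version, in priority order, whose required features are all present, and it encodes "sve2" as requiring "sve" as well. The names `resolve`, `all_cpus`, `static_target`, `FOO` and `BAR` are hypothetical, and the interposition check mentioned above is not modelled.

```python
from itertools import combinations

# Versions are listed highest priority first: (name, required features).
# Assumption for illustration: "sve2" also requires "sve".
FOO = [("sve", frozenset({"sve"})), ("default", frozenset())]
BAR = [("sve2", frozenset({"sve", "sve2"})),
       ("sve", frozenset({"sve"})),
       ("default", frozenset())]

def resolve(versions, cpu):
    """Model of a resolver: first version whose features the CPU provides."""
    for name, feats in versions:
        if feats <= cpu:
            return name
    raise ValueError("version list has no default")

def all_cpus(features):
    """Yield every subset of the given features as a possible CPU."""
    feats = sorted(features)
    for r in range(len(feats) + 1):
        for combo in combinations(feats, r):
            yield frozenset(combo)

def static_target(caller_versions, caller_name, callee_versions):
    """Return the callee version that is selected on every CPU where the
    resolver selects `caller_name`, or None if it is not unique."""
    universe = set().union(*(f for _, f in caller_versions + callee_versions))
    targets = {resolve(callee_versions, cpu)
               for cpu in all_cpus(universe)
               if resolve(caller_versions, cpu) == caller_name}
    return targets.pop() if len(targets) == 1 else None
```

In this model `static_target(FOO, "sve", BAR)` returns `None`, because the sve version of foo runs both on CPUs with sve2 and on CPUs without it, while `static_target(FOO, "default", BAR)` returns `"default"`.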
Sure, but it isn't "wrong" if we're re-defining the behavior of the spec. Agreed re: optimizing according to the current spec.
Yes, good point! |
This doesn't seem right to me. An optimization (which is what we are talking about here) should not alter the behavior of the original program, therefore I don't believe there is anything to document in ACLE about it. That said we can still bypass the resolver as long as:
|
@andrewcarlotti reading your comment more carefully I believe I now understand what you meant. Can you confirm the behavior with #87939 adheres to your criteria? Just look at the test file. Cheers |
… callers ... when there is a callee with a matching feature set, and no other higher priority callee. This optimization helps the inliner see past the ifunc+resolver to the callee that we know it will always land on. This is a conservative implementation of: llvm/llvm-project#71714
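One reading of the conservative rule in that commit message can be sketched as follows. This is an assumption about what "matching feature set, and no other higher priority callee" means, not a description of the actual patch; `conservative_target` and the feature encoding are hypothetical.

```python
def conservative_target(caller_features, callee_versions):
    """callee_versions is ordered highest priority first: (name, features).

    Bypass the resolver only when the highest-priority callee version
    requires exactly the caller's features. Whenever the caller version is
    running, those features are present, so a first-match resolver would
    pick that callee version. Returns None to keep the indirect call.
    """
    top_name, top_features = callee_versions[0]
    if top_features == frozenset(caller_features):
        return top_name
    return None

# Hypothetical callee cohort; "sve2" is encoded as also requiring "sve".
bar = [("sve2", frozenset({"sve", "sve2"})),
       ("sve", frozenset({"sve"})),
       ("default", frozenset())]
```

With this cohort, a caller version requiring `{"sve", "sve2"}` resolves to `"sve2"`, while a caller version requiring only `{"sve"}` gets `None`, since a higher-priority callee version might still be selected at run time.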
When calling an ACLE Multi Versioned function from another that is also versioned, we should statically resolve that specific callsite to allow inlining.
Proposed semantics: take the caller's feature set, consider all callee versions whose features are a subset of it, sort them by priority the same way the resolver would, pick the highest-priority one, and emit a remark explaining which one was picked. Worst case, pick the default implementation of the callee.
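As a rough model of these proposed semantics (a sketch under stated assumptions, not compiler code): the function name `pick_callee`, the numeric priorities and the remark text are all hypothetical, and the default version is modelled as requiring no features.

```python
def pick_callee(caller_features, callee_versions):
    """callee_versions: list of (name, priority, features).

    Keep the callee versions whose features are all in the caller's
    feature set, order them by priority (higher wins), and return the
    chosen name together with a remark. The default version has an empty
    feature set, so it is always a candidate and acts as the fallback.
    """
    caller_features = frozenset(caller_features)
    usable = [v for v in callee_versions if v[2] <= caller_features]
    usable.sort(key=lambda v: v[1], reverse=True)
    name = usable[0][0]
    remark = f"statically resolved call to callee version '{name}'"
    return name, remark

# Hypothetical callee cohort with made-up priority numbers.
callee = [("default", 0, frozenset()),
          ("simd", 1, frozenset({"simd"})),
          ("sve", 2, frozenset({"sve"}))]
```

Here `pick_callee({"simd"}, callee)` selects `"simd"`, and a caller with no matching features falls back to `"default"`. As the later comments in this thread point out, this choice can differ from what the resolver would select at run time.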
I.e. given:
We should resolve some of the call sites in the IR to the equivalent of:
so that the simd version of caller can be optimized to: