-
Notifications
You must be signed in to change notification settings - Fork 246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: Projected winrt::consume_ methods will nullptr crash if the underlying QueryInterface call fails #1442
Conversation
As a binary size test case I built a private copy of WinUI3. The binary with cppwinrt usage is Microsoft.UI.Xaml.Controls.dll. The binary size for that actually went down (not up). 5,775,872 to 5,636,096 (-139,776 bytes aka -2.4%). |
Compared to what? The previous commit in the cppwinrt repo or some previous release? |
Compared to |
Technically the FH4 work that Raymond alluded to is about folding identical catch handlers rather than call sites. A reduction here for |
… the temporary alive until the semicolon. Remove that macro from where I need to call the check method to fix this crash
With the latest iteration (with the fix to keep the lifetime valid on the casted-to object) the binary size difference is close to zero. With the changes in this PR the XAML controls DLL is 10KB smaller than before (well below 1% diff). |
In prior investigations, I found there are types that have pathological inline behavior. Types like ValueSet, for instance, don't have any methods of their own - it's a flavored IMap<>. Someone accidentally doing the obvious thing: ValueSet v{};
v.Insert(L"kittens", winrt::box_value(7u));
v.Insert(L"puppies", winrt::box_value(2.0f));
v.Insert(L"muffins", winrt::box_value(L"blueberry")); ... ended up with three inlined QueryInterface calls from Any way to either merge those three QI sequences (tear off interface caching?) or at least not inline them (so the QI + call Insert is an invocable thing) would produce really great size enhancements. The code's already doing a bunch of "call" anyhow, we aren't saving much by inlining the QI+Call into the parent frame. (edited to add) I guess we're saving something by not faulting in more code to run, as the QI+Call are inlined directly. And we're adding some amount of exception handling overhead on a per-frame basis that might not have been there. But we're taking away some amount of temporary management (that QI comes paired with a Release of the temporary...) from the parent frame. Overall, reducing the inlining seems like a win. |
Make sure to also run the validation with the "real build line" - things like PGO can be super duper smart and decide to either inline or fold based on "other data." I tried fixing winrt::hstring to be less inlined only to find that the PGO optimizer was going to already de-inline all the calls on its own. |
Before I worked on FH4, I worked with MSVC on an optimization to treat certain calls to Ultimately, types like |
…if it were a nuget package
…ctly taking ownership of error info instead of re-originating
903190b
to
46f64de
Compare
With the latest iteration of code Microsoft.UI.Xaml.Controls.dll is 6,634,560 bytes (-141,312bytes, -2.4%, compared to baseline). That is back to where it was with the failfast approach. |
@@ -546,6 +546,28 @@ namespace winrt::impl | |||
} | |||
return result; | |||
} | |||
|
|||
template <typename T> | |||
WINRT_IMPL_NOINLINE void check_cast_result(T* from WINRT_IMPL_SOURCE_LOCATION_ARGS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't need to be a template. Can just take a void*
. This allows the different instantiations to be folded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. If I spin another PR I'll make that change.
@@ -546,6 +546,28 @@ namespace winrt::impl | |||
} | |||
return result; | |||
} | |||
|
|||
template <typename T> | |||
WINRT_IMPL_NOINLINE void check_cast_result(T* from WINRT_IMPL_SOURCE_LOCATION_ARGS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this have to be WINRT_IMPL_NOINLINE inline
to avoid ODR problems?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What sort of ODR problems? The only variance based on compiler options is the SOURCE_LOCATION macro and that already has global ODR guards.
The noinline
is important because it is how the code bloat is avoided by having lots of exception throwing locations spewed out into containing code.
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if code gen would be better as
WINRT_IMPL_NOINLINE [[noreturn]] void throw_failed_cast(hresult code WINRT_IMPL_SOURCE_LOCATION_ARGS)
{
throw hresult_error(code, take_ownership_from_abi WINRT_IMPL_SOURCE_LOCATION_FORWARD);
}
inline void check_cast_result(hresult code WINRT_IMPL_SOURCE_LOCATION_ARGS)
{
if (code < 0) throw_failed_cast(code WINRT_IMPL_SOURCE_LOCATION_FORWARD);
}
where we have a new try_as_reason that returns both a pointer and an hresult.
void* result{};
hresult code = ptr->QueryInterface(guid_of<To>(), &result);
return { wrap_as_result<To>(result), code };
…hen the type is already a match (#1448) Why is this change being made? I have been doing some local builds of an internal component against the latest cppwinrt.exe, which includes the fixes in #1442. I have noticed some modest (1%) binary size grown when comparing the April 2024 release and the latest. Using SizeBench to analyze binary size differences I determined that the inlined code gen seems to explain the difference. This change eliminates that increase, at least for the binary I'm testing against, and even shrinks it a bit further. Briefly summarize what changed The various winrt::impl::consume_THING methods get heavily inlined in release builds. The current code gen has some casts for when the types don't match (and is an AddRef/Release when they do match). The small amount of new cast result checking ends up at many call sites and slowly adds up a bit. I think we can do better in the cases where the types already match. This seems to have proven true in practice. Using if constexpr to determine if the type is a match allows us to skip any casts when this is true. In fact that is a net improvement over the original baseline because we don't need to AddRef/Release either. We can directly call the appropriate method. When a cast is necessary the code gen is identical to the previous baseline. The QueryInterface call is unavoidable and so is checking the result. The code writer format string is definitely starting to creak under the weight of many arguments. I don't want to refactor that as part of this PR but it would be good to simplify in the future. Also included is late feedback from the previous PR from @oldnewthing. The check_cast_result method can take void* instead of a template argument because it only null checks it. In my local measurements the optimizer already folded them so it makes no binary size difference but it is still a nice change to take. How was this change tested? Ran the full suite of cppwinrt tests locally. I also debugged into Windows.Foundation.IStringable.ToString for cases where it both is and is not the correct type. They run the expected code paths. I also checked the binary size of a large example to confirm the expected reduction. The disassembly for a Release build similarly matches expectations.
Why is this change being made?
The generated projection code for interfaces that the metadata declares as required for a runtimeclass assume that QueryInterface never fails. Assuming the metadata is correct in the first place, this is a valid assumption for inproc calls.
However, for cross-process calls the QI can fail even if the metadata is correct and the class really does implement all of the required interfaces. It can fail with E_ACCESSDENIED and a variety of other RPC error codes.
When this happens there is a nullptr crash in the generated consume method. This can be very painful to debug because the HRESULT is lost by the time it crashes.
Briefly summarize what changed
This set of changes fixes the crash by detecting the QueryInterface error and throwing an exception when that occurs during one of these required casts. The
try_as
method was changed to capture COM error context when the QI fails. The code gen surroundingWINRT_IMPL_STUB
was changed to save the result into a temporary variable and then pass it to a newcheck_cast_result
method.check_cast_result
is markednoinline
so that the binary size impact of throwing exceptions is limited to a single function instead of inlining into high-volume generated code.If the cast succeeded then nothing happens. If the cast failed, returning null, then the stored COM exception is retrieved. Assuming it is available the HRESULT is pulled out of it and it is thrown using
check_result
. This then propagates like any other exception. Callers are free to try and catch it or let it go uncaught and crash. Now they have the choice.I also added a new file of test code that exercises this code path. The test_component IDL declares a runtimeclass that implements
IStringable
. And then the implementation fails to implementIStringable
. WhenToString
is called on this object it hits the failure path. The cppwinrt code gen will not allow this to happen so I had to directly usewinrt::implements
.How was this change tested?
Besides the new test cases I also wrote a little console app that crashes this way. I built and ran it using the latest stable cppwinrt as well as my private new one. As expected the debugger blame is far more useful with these changes.
IDL:
main.cpp:
Before:
FAILURE_BUCKET_ID: ACCESS_VIOLATION_c0000005_CppwinrtQIFailureCrashRepro.exe!winrt::impl::consume_Windows_Foundation_IStringable_winrt::CrashRepro::LiesAboutInheritance_::ToString
After:
FAILURE_BUCKET_ID: STOWED_EXCEPTION_80004002_CppwinrtQIFailureCrashRepro.exe!winrt::check_cast_result_winrt::impl::abi_winrt::Windows::Foundation::IStringable,void_::type_*_
Note that the HRESULT ends up in the signature and the blame says that the cast result to IStringable is what failed.
Closes #1441