-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Windows] SearchHost.exe may freeze caused by FileUtil::CreateDirectory #1076
Comments
This commit affects only mozc_tip{32,64}.dll build with debug mode. There must be no behavior change in release build. Currently 'Mozc_tsf_ui.log' is created only when NDEBUG is defined. With this commit we stop creating it even when NDEBUG is defined. This addresses a crash issue in debug builds discussed in google#1077. See google#856 about why absl::LocalTimeZone cannot be used in Windows right now. Note that this commit may also help us diagnose google#1076, where Windows.Storage.OneCore.dll looks to be intercepting certain Win32 file I/O API calls in AppContainer processes then trigger RoInitialize as needed. Creating 'Mozc_tsf_ui.log' only in debug builds can make our debugging more complicated.
This commit affects only mozc_tip{32,64}.dll build with debug mode. There must be no behavior change in release build. Currently 'Mozc_tsf_ui.log' is created only when NDEBUG is defined. With this commit we stop creating it even when NDEBUG is defined. This addresses a crash issue in debug builds discussed in google#1077. See google#856 about why absl::LocalTimeZone cannot be used in Windows right now. Note that this commit may also help us diagnose google#1076, where Windows.Storage.OneCore.dll looks to be intercepting certain Win32 file I/O API calls in AppContainer processes then trigger RoInitialize as needed. Creating 'Mozc_tsf_ui.log' only in debug builds can make our debugging more complicated.
This attemps to avoid the deadlock issue discussed in google#1076. From what I can observe on Windows 11 23H2 (22631.4317), it looks to be dangerous for us to call File I/O Win32 APIs such as CreateFile from AppContainer processes unless we are confident that the target file/dir is accessible to the AppContainer process. When such an access is not allowed at the ACL level, the debugger shows that Windows.Storage.OneCore.dll tries to forward it to a broker process (e.g. via BrokeredCreateFile2). The issue is that Windows.Storage.OneCore.dll is a WinRT component, which means that just calling CreateFile() results in dyanamically invoking RoInitialize() internally, which can confuse TSF runtime as we have already seen in google#856. To summarize, the safest option is to ensure that the target file/dir is always accessible to AppContainer so that Windows.Storage.OneCore.dll will not be involved if the file/dir is opened with CreateFile() from MozcTip{32,64}.dll. If this commit is not sufficient to address google#1076, we then need to take care of other File I/O APIs such as CreateDirectoryW() and GetFileAttributes().
This attempts to avoid the deadlock issue discussed in google#1076. From what I can observe on Windows 11 23H2 (22631.4317), it looks to be dangerous for us to call File I/O Win32 APIs such as CreateFile from AppContainer processes unless we are confident that the target file/dir is accessible to the AppContainer process. When such an access is not allowed at the ACL level, the debugger shows that Windows.Storage.OneCore.dll tries to forward it to a broker process (e.g. via BrokeredCreateFile2). The issue is that Windows.Storage.OneCore.dll is a WinRT component, which means that just calling CreateFile() results in dynamically invoking RoInitialize() internally, which can confuse TSF runtime as we have already seen in google#856. To summarize, the safest option is to ensure that the target file/dir is always accessible to AppContainer so that Windows.Storage.OneCore.dll will not be involved if the file/dir is opened with CreateFile() from MozcTip{32,64}.dll. If this commit is not sufficient to address google#1076, we then need to take care of other File I/O APIs such as CreateDirectoryW() and GetFileAttributes().
This attempts to avoid the deadlock issue discussed in google#1076. Our current hypothesis to explain google#1076 is that "broadFileSystemAccess" capability given to SearchHost.exe is playing an interesting role. When TIP DLL calls CreateFileW API to open "config1.db", the process itself does not have sufficient permission to open it. Then "broadFileSystemAccess" capability will take place and Windows.Storage.OneCore.dll will attempts brokered file access after initializing the thread with RoInitialize() when not yet initialized. If this happens in a certain situation, TSF runtime gets confused and may start re-initializing Mozc TIP. 1. TSF calls back into Mozc TIP 2. Mozc TIP calls CreateFileW() 3. The system invokes RoInitialize() before returning from CreateFileW() 4. TSF calls back into Mozc TIP before returning from RoInitialize() 5. Now Mozc TIP is handling a reentrant callback. Given that whether the target app has "broadFileSystemAccess" capability or not is completely out of Mozc TIP's control, the safest option would be to ensure that the target file/dir is always accessible to AppContainer processes. As the first step, this commit makes 'config1.db' readable from AppContainer processes. There are two cases when the file permission of 'config1.db' is updated. A. When the installer is installing a new version of Mozc. B. When 'config1.db' is updated by mozc_server or mozc_tool. If this commit is not sufficient to address google#1076, we then need to take care of other cases such as calling GetFileAttributes() against the user profile directory.
This attempts to avoid the deadlock issue discussed in google#1076. Our current hypothesis to explain google#1076 is that "broadFileSystemAccess" capability given to SearchHost.exe is playing an interesting role. When TIP DLL calls CreateFileW API to open "config1.db", the process itself does not have sufficient permission to open it. Then "broadFileSystemAccess" capability will take place and Windows.Storage.OneCore.dll will attempts brokered file access after initializing the thread with RoInitialize() when not yet initialized. If this happens in a certain situation, TSF runtime gets confused and may start re-initializing Mozc TIP. 1. TSF calls back into Mozc TIP 2. Mozc TIP calls CreateFileW() 3. The system invokes RoInitialize() before returning from CreateFileW() 4. TSF calls back into Mozc TIP before returning from RoInitialize() 5. Now Mozc TIP is handling a reentrant callback. Given that whether the target app has "broadFileSystemAccess" capability or not is completely out of Mozc TIP's control, the safest option would be to ensure that the target file/dir is always accessible to AppContainer processes. As the first step, this commit makes 'config1.db' readable from AppContainer processes. There are two cases when the file permission of 'config1.db' is updated. A. When the installer is installing a new version of Mozc. B. When 'config1.db' is updated by mozc_server or mozc_tool. If this commit is not sufficient to address google#1076, we then need to take care of other cases such as calling GetFileAttributes() against the user profile directory.
While I still cannot reproduce the issue locally, based on the information I got my current hypothesis is that When Mozc TIP DLL calls
The on-going issue looks to be a deadlock due to non-reentrant lock.
To confirm that
|
This attempts to avoid the deadlock issue discussed in google#1076. Our current hypothesis to explain google#1076 is that "broadFileSystemAccess" capability given to SearchHost.exe is playing an interesting role. When TIP DLL calls CreateFileW API to open "config1.db", the process itself does not have sufficient permission to open it. Then "broadFileSystemAccess" capability will take place and Windows.Storage.OneCore.dll will attempts brokered file access after initializing the thread with RoInitialize() when not yet initialized. If this happens in a certain situation, TSF runtime gets confused and may start re-initializing Mozc TIP. 1. TSF calls back into Mozc TIP 2. Mozc TIP calls CreateFileW() 3. The system invokes RoInitialize() before returning from CreateFileW() 4. TSF calls back into Mozc TIP before returning from RoInitialize() 5. Now Mozc TIP is handling a reentrant callback. Given that whether the target app has "broadFileSystemAccess" capability or not is completely out of Mozc TIP's control, the safest option would be to ensure that the target file/dir is always accessible to AppContainer processes. As the first step, this commit makes 'config1.db' readable from AppContainer processes. There are two cases when the file permission of 'config1.db' is updated. A. When the installer is installing a new version of Mozc. B. When 'config1.db' is updated by mozc_server or mozc_tool. If this commit is not sufficient to address google#1076, we then need to take care of other cases such as calling GetFileAttributes() against the user profile directory.
One of the biggest challenge while investigating google#1076 is that it has been reliably reproducible only on certain users' environments. One idea to accelerate the debugging in such a scenario is to ship debug symbol file for Mozc TIP DLL (e.g. mozc_tip64.dll) to users. Then we can ask the reporter to use tools like Process Explorer and Process Hacker to dump the call stack of the thread that got stuck. With this commit we actually start deploying mozc_tip64.dll.pdb to users. The storage impact looks to be acceptable overall as long as we strip private symbols [1]. Here are actual values taken in my local environment. * Mozc64.msi: + 584 kB * mozc_tip64.dll.pdb: + 6,116 kB Note that this commit does not fully take care of Bazel build. While the symbol file is actually deployed, there remain the following known issues only in Bazel build. * private symbols are not yet stripped out. * the symbol file name embedded in mozc_tip64.dll is mozc_tip.pdb rather than mozc_tip64.dll.pdb. [1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/public-and-private-symbols
This attempts to avoid the deadlock issue discussed in google#1076. Our current hypothesis to explain google#1076 is that "broadFileSystemAccess" capability given to SearchHost.exe is playing an interesting role. When TIP DLL calls CreateFileW API to open "config1.db", the process itself does not have sufficient permission to open it. Then "broadFileSystemAccess" capability will take place and Windows.Storage.OneCore.dll will attempts brokered file access after initializing the thread with RoInitialize() when not yet initialized. If this happens in a certain situation, TSF runtime gets confused and may start re-initializing Mozc TIP. 1. TSF calls back into Mozc TIP 2. Mozc TIP calls CreateFileW() 3. The system invokes RoInitialize() before returning from CreateFileW() 4. TSF calls back into Mozc TIP before returning from RoInitialize() 5. Now Mozc TIP is handling a reentrant callback. Given that whether the target app has "broadFileSystemAccess" capability or not is completely out of Mozc TIP's control, the safest option would be to ensure that the target file/dir is always accessible to AppContainer processes. As the first step, this commit makes 'config1.db' readable from AppContainer processes. There are two cases when the file permission of 'config1.db' is updated. A. When the installer is installing a new version of Mozc. B. When 'config1.db' is updated by mozc_server or mozc_tool. If this commit is not sufficient to address google#1076, we then need to take care of other cases such as calling GetFileAttributes() against the user profile directory.
* This is a temporary workaround to avoid freeze of the host application. * #1076 PiperOrigin-RevId: 685979550
This commit affects only mozc_tip{32,64}.dll build with debug mode. There must be no behavior change in release build. Currently 'Mozc_tsf_ui.log' is created only when NDEBUG is defined. With this commit we stop creating it even when NDEBUG is defined. This addresses a crash issue in debug builds discussed in #1077. See #856 about why absl::LocalTimeZone cannot be used in Windows right now. Note that this commit may also help us diagnose #1076, where Windows.Storage.OneCore.dll looks to be intercepting certain Win32 file I/O API calls in AppContainer processes then trigger RoInitialize as needed. Creating 'Mozc_tsf_ui.log' only in debug builds can make our debugging more complicated. PiperOrigin-RevId: 686025719
One of the biggest challenge while investigating google#1076 is that it has been reliably reproducible only on certain users' environments. One idea to accelerate the debugging in such a scenario is to ship debug symbol file for Mozc TIP DLL (e.g. mozc_tip64.dll) to users. Then we can ask the reporter to use tools like Process Explorer and Process Hacker to dump the call stack of the thread that got stuck. With this commit we actually start deploying mozc_tip64.dll.pdb to users. The storage impact looks to be acceptable overall as long as we strip private symbols [1]. Here are actual values taken in my local environment. * Mozc64.msi: + 584 kB * mozc_tip64.dll.pdb: + 6,116 kB Note that this commit does not fully take care of Bazel build. While the symbol file is actually deployed, there remain the following known issues only in Bazel build. * private symbols are not yet stripped out. * the symbol file name embedded in mozc_tip64.dll is mozc_tip.pdb rather than mozc_tip64.dll.pdb. [1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/public-and-private-symbols
This attempts to avoid the deadlock issue discussed in #1076. Our current hypothesis to explain #1076 is that "broadFileSystemAccess" capability given to SearchHost.exe is playing an interesting role. When TIP DLL calls CreateFileW API to open "config1.db", the process itself does not have sufficient permission to open it. Then "broadFileSystemAccess" capability will take place and Windows.Storage.OneCore.dll will attempts brokered file access after initializing the thread with RoInitialize() when not yet initialized. If this happens in a certain situation, TSF runtime gets confused and may start re-initializing Mozc TIP. 1. TSF calls back into Mozc TIP 2. Mozc TIP calls CreateFileW() 3. The system invokes RoInitialize() before returning from CreateFileW() 4. TSF calls back into Mozc TIP before returning from RoInitialize() 5. Now Mozc TIP is handling a reentrant callback. Given that whether the target app has "broadFileSystemAccess" capability or not is completely out of Mozc TIP's control, the safest option would be to ensure that the target file/dir is always accessible to AppContainer processes. As the first step, this commit makes 'config1.db' readable from AppContainer processes. There are two cases when the file permission of 'config1.db' is updated. A. When the installer is installing a new version of Mozc. B. When 'config1.db' is updated by mozc_server or mozc_tool. If this commit is not sufficient to address #1076, we then need to take care of other cases such as calling GetFileAttributes() against the user profile directory. PiperOrigin-RevId: 686325184
One of the biggest challenge while investigating #1076 is that it has been reliably reproducible only on certain users' environments. One idea to accelerate the debugging in such a scenario is to ship debug symbol file for Mozc TIP DLL (e.g. mozc_tip64.dll) to users. Then we can ask the reporter to use tools like Process Explorer and Process Hacker to dump the call stack of the thread that got stuck. With this commit we actually start deploying mozc_tip64.dll.pdb to users. The storage impact looks to be acceptable overall as long as we strip private symbols [1]. Here are actual values taken in my local environment. * Mozc64.msi: + 584 kB * mozc_tip64.dll.pdb: + 6,116 kB Note that this commit does not fully take care of Bazel build. While the symbol file is actually deployed, there remain the following known issues only in Bazel build. * private symbols are not yet stripped out. * the symbol file name embedded in mozc_tip64.dll is mozc_tip.pdb rather than mozc_tip64.dll.pdb. [1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/public-and-private-symbols PiperOrigin-RevId: 686360089
There are some reports that SearchHost.exe freezes with Mozc built with
BRANDING=GoogleJapaneseInput
.Through bisect investigations, I figured out 68e99df is the change of this issue. It modified FileUtil::CreateDirectory.
We should update FileUtil::CreateDirectory again to address the problem.
As a tricky point, this issue is reported only on
BRANDING=GoogleJapaneseInput
andchannel_dev=0
.References:
The text was updated successfully, but these errors were encountered: