Closed Bug 1507102 Opened 6 years ago Closed 6 years ago

Crash in MessageBuilder::WriteElement spiking from users of WPS Office

Categories

(Core :: Disability Access APIs, defect)

63 Branch
All
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr60 --- wontfix
firefox63 + wontfix
firefox64 + wontfix
firefox65 --- wontfix

People

(Reporter: philipp, Unassigned)

References

Details

(Keywords: crash, csectype-wildptr, regression)

Crash Data

This bug was filed from the Socorro interface and is
report bp-789d0acf-6be3-4356-ab0f-f08ff0181114.
=============================================================

Top 10 frames of crashing thread:

0 uiautomationcore.dll MessageBuilder::WriteElement 
1 uiautomationcore.dll MessageBuilder::WriteTraverseStateOut 
2 uiautomationcore.dll RemoteUiaNodeStub::Incoming_Find 
3 uiautomationcore.dll RemoteUiaNodeStub::OnMessage 
4 uiautomationcore.dll InvokeOnCorrectContext_Callback 
5 uiautomationcore.dll ProcessIncomingRequest 
6 uiautomationcore.dll HookBasedServerConnectionManager::HookCallback 
7 uiautomationcore.dll HookUtil<&HookBasedClientConnection::HookCallback 
8 uiautomationcore.dll HandleHookMessage 
9 uiautomationcore.dll HookUtil<&HookBasedClientConnection::HookCallback 

=============================================================

This crash signature is spiking today among Windows 7 users on zh-CN builds. In the first couple of hours since the issue appeared, over 2000 crash reports have already been processed (64% of all browser crashes from Chinese builds).

Apparently most affected users are in a11y mode due to processes related to WPS Office (https://en.wikipedia.org/wiki/WPS_Office):

Accessibility client facet
1 	wpscloudsvr.exe|10.1.0.7668 	1377 	60.32 %
2 	wps.exe|11.1.0.7932 	635 	27.81 %
3 	wps.exe|11.1.0.7949 	77 	3.37 %
4 	wps.exe|11.1.0.7967 	55 	2.41 %
5 	wpscloudsvr.exe|10.1.0.7670 	45 	1.97 %
6 	wpscloudsvr.exe|10.1.0.7643 	26 	1.14 %
7 	wps.exe|11.1.0.7940 	9 	0.39 %
8 	wpscloudsvr.exe|10.1.0.6206 	3 	0.13 %
9 	wpscenter.exe|10.1.0.7643 	2 	0.09 %
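
To see how the crashes split across the WPS executables regardless of version, the facet rows can be grouped by executable name. The following is a minimal sketch that works only on the numbers copied from the table above; the function name `crashes_by_executable` is made up for illustration and is not part of Socorro.

```python
from collections import Counter

# Rows copied from the "Accessibility client facet" table above:
# (client string as reported, crash count).
FACET_ROWS = [
    ("wpscloudsvr.exe|10.1.0.7668", 1377),
    ("wps.exe|11.1.0.7932", 635),
    ("wps.exe|11.1.0.7949", 77),
    ("wps.exe|11.1.0.7967", 55),
    ("wpscloudsvr.exe|10.1.0.7670", 45),
    ("wpscloudsvr.exe|10.1.0.7643", 26),
    ("wps.exe|11.1.0.7940", 9),
    ("wpscloudsvr.exe|10.1.0.6206", 3),
    ("wpscenter.exe|10.1.0.7643", 2),
]


def crashes_by_executable(rows):
    """Sum crash counts per executable, ignoring the version suffix."""
    totals = Counter()
    for client, count in rows:
        # Client strings have the form "name.exe|version".
        exe, _, _version = client.partition("|")
        totals[exe.lower()] += count
    return totals


totals = crashes_by_executable(FACET_ROWS)
for exe, count in totals.most_common():
    print(f"{exe:20s} {count:5d}")
```

Grouped this way, wpscloudsvr.exe accounts for 1451 of the 2229 crashes listed in the facet, wps.exe for 776, and wpscenter.exe for 2.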

I'm unsure what is behind the uptick in these crashes right now, though it coincides with Microsoft's monthly patch day.
Hector, this crash spike seems to be exclusively with zh-CN builds. The buildid in crash reports is the same as the regular Firefox but I don't know if the buildid changes for builds provided through Mozilla Online, I do notice though that the majority of installs have *@mozillaonline.com extensions installed in their profile.

Could your team investigate these crashes as they appear to affect only zh-CN builds? Thanks!
Flags: needinfo?(bzhao)
Alexander, these crashes seem related to accessibility APIs, could you also investigate the reason why this is happening? Thanks
Flags: needinfo?(surkov.alexander)
According to feedback I collected from some Chinese users, they all have WPS and 360 installed (in different versions) and are all on Windows 7. The symptom is that Firefox opens normally but crashes after a while.

Most of these users recently updated one of the two programs. Because the versions differ, I am currently unable to determine which program caused the crash. Checking 'Prevent accessibility services from accessing your browser' in Firefox Options resolves the problem.
(In reply to Pascal Chevrel:pascalc from comment #1)
> Hector, this crash spike seems to be exclusively with zh-CN builds. The
> buildid in crash reports is the same as the regular Firefox but I don't know
> if the buildid changes for builds provided through Mozilla Online, I do
> notice though that the majority of installs have *@mozillaonline.com
> extensions installed in their profile.

The Firefox for desktop builds distributed by the Beijing office are partner repacks created as part of release automation [1]; they have the same buildid as vanilla Firefox.

The Firefox for Android builds we distribute are compiled with patches [2] and have a different buildid.

> 
> Could your team investigate these crashes as they appear to affect only
> zh-CN builds? Thanks!

Yuehang Xu just posted his initial findings in comment 3.

[1]: https://tools.taskcluster.net/groups/eAQCnoHFQtKgrtaNXSZTYA/tasks/RKH2qOQCSuGhkPfsKyYUnw/details
[2]: https://github.com/MozillaOnline/gecko-dev/commits/cn-release
Flags: needinfo?(bzhao)
According to feedback from more users, the problem appears to be related to WPS, not 360. However, some users said that they had received a patch that Microsoft pushed recently.

Some Windows 10 users also reported this problem, but for them the symptom is that Firefox stops responding (a hang rather than a crash). The problem goes away after disabling the accessibility service.

I am trying to contact the WPS staff. They want to know what our accessibility APIs are and whether there is relevant documentation.
WPS is still troubleshooting the cause; in the meantime they have temporarily rolled back the file version. We can watch whether the crash rate trends downward.
Aaron, maybe you could take a look?
Flags: needinfo?(aklotz)
Let's ask Jamie first.
Flags: needinfo?(surkov.alexander)
Flags: needinfo?(jteh)
Flags: needinfo?(aklotz)
This is a stability issue that doesn't need to be treated as a security bug. The top 50 frames of the stack are outside Firefox.
Group: core-security
(In reply to yxu from comment #5)
> I am trying to contact the WPS staff. They want to know what our
> accessibility APIs are and whether there is relevant documentation.

When we speak of accessibility APIs, we're referring to Microsoft Active Accessibility (MSAA):
https://docs.microsoft.com/en-us/windows/desktop/winauto/microsoft-active-accessibility
and UI Automation:
https://docs.microsoft.com/en-us/windows/desktop/winauto/uiauto-clientsoverview
According to information in crash reports, it looks like WPS might be using both of these. (AccessibilityInProcClient is 0x400, UNKNOWN, instead of UIAUTOMATION, suggesting that the initial request was not from UI Automation. However, the crash occurs when UIAutomationCore is trying to process a request.)

If WPS aren't aware of what this is, they probably shouldn't be using it. Furthermore, if disabling the accessibility service in Firefox has no noticeable effect on these users, that would also suggest there's no good reason for WPS to be using this. If that's the case, I'd suggest we may wish to start by blocking WPS as an accessibility client, which may or may not fix the issue (depending on whether there's some other client on these systems).
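
As a rough illustration of what blocking WPS as an accessibility client could look like, here is a minimal sketch of a name-based check. It is not Gecko's actual mechanism: the blocklist contents are just the executable names seen in the crash facet, and `BLOCKED_A11Y_CLIENTS` and `should_block_a11y_client` are hypothetical names chosen for this example.

```python
# Hypothetical blocklist: executable names taken from the
# "Accessibility client facet" data reported in this bug.
BLOCKED_A11Y_CLIENTS = frozenset({"wps.exe", "wpscloudsvr.exe", "wpscenter.exe"})


def should_block_a11y_client(client: str) -> bool:
    """Return True if the client string names a blocked executable.

    `client` uses the "name.exe|version" form from the crash-report
    facet; the version part is optional and is ignored here.
    """
    exe = client.partition("|")[0].strip().lower()
    return exe in BLOCKED_A11Y_CLIENTS


print(should_block_a11y_client("wps.exe|11.1.0.7932"))
print(should_block_a11y_client("some-screen-reader.exe"))
```

Matching on the executable name alone blocks every version, which fits the data here since several WPS versions appear in the facet. As noted above, this would not help if some other client on these systems is the one triggering the crash.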

Looking at the stack, there seems to be reentry into UI Automation here. My guess is that a UIA client is querying for information about an element in a remote content process, which results in a cross-process COM call, but then another UIA call comes in while that COM call is being serviced. Perhaps UIA can't handle reentry? It's hard to know for sure, though, and if that's the case, I'm surprised we haven't seen this kind of thing before. I also don't know why a client would make two simultaneous calls.
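
The reentry guess above can be pictured with a toy model. The sketch below is not UI Automation or COM code; `ToyProvider` and the request names are invented for illustration. It only shows the general pattern: if a handler keeps pumping messages while it waits on a nested call, a second request can be dispatched before the first one has finished.

```python
from collections import deque


class ToyProvider:
    """Toy stand-in for a message-driven server; not UI Automation itself."""

    def __init__(self):
        self.pending = deque()   # requests waiting to be dispatched
        self.active_depth = 0    # number of handlers currently on the stack
        self.max_depth = 0       # deepest nesting observed so far
        self.log = []

    def post(self, request):
        self.pending.append(request)

    def pump(self):
        """Dispatch queued requests, as a message loop would."""
        while self.pending:
            self.handle(self.pending.popleft())

    def handle(self, request):
        self.active_depth += 1
        self.max_depth = max(self.max_depth, self.active_depth)
        self.log.append(f"enter {request} (depth {self.active_depth})")
        if request == "find":
            # Simulated cross-process call: while waiting for the reply,
            # the thread keeps pumping messages, so a queued request is
            # dispatched before this handler has finished.
            self.pump()
        self.log.append(f"leave {request}")
        self.active_depth -= 1


provider = ToyProvider()
provider.post("find")
provider.post("get_property")
provider.pump()
print("\n".join(provider.log))
```

In this model the second request is entered at depth 2, inside the still-running `find` handler. Any state that the outer handler assumed was stable across its nested call could then be changed underneath it, which is the kind of situation that might explain a crash in code that does not expect reentry.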
Flags: needinfo?(jteh)
Crashes seem to have gone down over the last week. We don't have any other dot release planned for 63, and there doesn't seem to be anything actionable on our side that we could uplift, so wontfix for 63.
I think this is wontfix for 64 as well.

James, what is the right next step here? Anything actionable?
Flags: needinfo?(jteh)
These crashes have decreased significantly. They're no longer primarily caused by WPS; only 3.69% of crashes (35 crashes) in the last week were caused by WPS. That suggests that there has been a fix in WPS or Windows. Given that this issue was originally filed for the spike caused by WPS, I'm closing this. Note that blocking UI Automation until we have a proper UI Automation implementation (bug 1420276) might mitigate the remaining crashes, WPS and non-WPS alike.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(jteh)
Resolution: --- → WORKSFORME