Crash in mozilla::a11y::IDSet::GetID (MSAA id exhaustion)
Categories
(Core :: Disability Access APIs, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | wontfix |
firefox-esr60 | --- | wontfix |
firefox-esr68 | --- | wontfix |
firefox-esr78 | 81+ | verified |
firefox53 | --- | wontfix |
firefox54 | --- | wontfix |
firefox55 | --- | wontfix |
firefox56 | --- | wontfix |
firefox57 | --- | wontfix |
firefox64 | --- | wontfix |
firefox65 | --- | wontfix |
firefox66 | + | wontfix |
firefox67 | --- | wontfix |
firefox68 | --- | wontfix |
firefox69 | --- | wontfix |
firefox78 | --- | wontfix |
firefox79 | --- | wontfix |
firefox80 | --- | verified |
People
(Reporter: philipp, Assigned: Jamie)
References
(Regression)
Details
(Keywords: crash, regression, Whiteboard: a11y:crash-win )
Crash Data
Attachments
(2 files)
545 bytes,
text/html
|
Details | |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-esr78+
|
Details | Review |
Comment 1•7 years ago
|
||
Comment 2•7 years ago
|
||
Updated•7 years ago
|
Comment 3•7 years ago
|
||
Comment 4•7 years ago
|
||
Comment 5•7 years ago
|
||
Comment 6•7 years ago
|
||
Comment 7•7 years ago
|
||
Updated•7 years ago
|
Updated•7 years ago
|
Updated•7 years ago
|
Updated•7 years ago
|
Updated•7 years ago
|
Updated•6 years ago
|
Comment 9•6 years ago
|
||
Assignee | ||
Comment 10•6 years ago
|
||
Comment 11•6 years ago
|
||
Comment 12•6 years ago
|
||
Comment 13•6 years ago
|
||
Comment 14•6 years ago
|
||
Comment 16•6 years ago
|
||
No useful comments, will check URLs for some clues.
Assignee | ||
Comment 17•6 years ago
|
||
I don't think we have enough data to take any action here. As per earlier comments, this would happen if there are more than 2 ^ 24 accessible objects in a single process. Either:
- Some users really do have this many accessibles. This would mean that there are at least this many visible DOM nodes (attached to document and not display: none). Do we have any data on this?
- There is a leak somewhere preventing DOM nodes from being cleaned up and thus their associated accessibles.
- There is a leak somewhere preventing dead accessibles from being completely destroyed.
Without any clue as to how to reproduce this, I don't know how we can figure out which of these it is.
Comment 18•6 years ago
|
||
(In reply to James Teh [:Jamie] from comment #17)
I don't think we have enough data to take any action here. As per earlier comments, this would happen if there are more than 2 ^ 24 accessible objects in a single process. Either:
- Some users really do have this many accessibles. This would mean that there are at least this many visible DOM nodes (attached to document and not display: none). Do we have any data on this?
- There is a leak somewhere preventing DOM nodes from being cleaned up and thus their associated accessibles.
- There is a leak somewhere preventing dead accessibles from being completely destroyed.
Without any clue as to how to reproduce this, I don't know how we can figure out which of these it is.
I wonder if mccr8 or smaug or hsivonen have numbers/thoughts on the above?
Comment 19•6 years ago
|
||
It is certainly plausible that a page could leak a few million nodes. Comment 9 does sound like something that could do that. We've seen infinite scrolling news feed kind of things do that in the past.
Comment 20•6 years ago
|
||
Maybe this is a WONTFIX?
Assignee | ||
Comment 21•6 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #19)
It is certainly plausible that a page could leak a few million nodes. Comment 9 does sound like something that could do that. We've seen infinite scrolling news feed kind of things do that in the past.
When you refer to leaked nodes, are these attached to the document and visible (not display: none)? Accessibles should get destroyed if the node is not attached and/or display: none.
Comment 22•6 years ago
|
||
Well, mostly when I've seen it they aren't in the document, but I could imagine a page that just gets longer and longer as more things are loaded in.
Comment 23•6 years ago
|
||
(In reply to justinpulliam from comment #9)
This crashed a tab for one of my web app's testers. My web app is a
continuous stream of live data that is presented by adding server formatted
html to the page using element.innerHTML = new server data. The server data
has around 20 html elements, and it is updated 5-10 times a second.The user had only a couple of other non-demanding tabs open. He wasn't using
FF to surf other than for testing the web app. He had FF open for awhile,
maybe a few days. I will try to make a test case that reliably triggers this
crash.
Any update on a test case or whether your app can be accessed via so public URL? Can you clarify if you used innerHTML to add nodes or to replace nodes?
Comment 24•6 years ago
|
||
2 ^ 24 nodes takes over 2GB on 64bit system + then all the other memory usage. But sounds still possible.
Does a11y keep Accessible objects alive also for the DOM trees in background tabs?
Assignee | ||
Comment 25•6 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #24)
Does a11y keep Accessible objects alive also for the DOM trees in background tabs?
Yes, as some clients (notably screen readers) need them in order to keep their own models up to date.
Updated•6 years ago
|
Comment 26•6 years ago
|
||
This is a pretty high crash volume in beta 66. Tracking for 66 but from jimm's comments in triage it may be a difficult problem relating to e10s.
Updated•6 years ago
|
Reporter | ||
Comment 27•6 years ago
|
||
the user at https://support.mozilla.org/en-US/questions/1247107 had a reproducible case (unfortunately it looked like an internal app).
Assignee | ||
Comment 28•6 years ago
|
||
It'd be good to know what that app does when it adds new data. Does it just keep adding more elements to the document or does it remove old elements? If a document contains enough nodes to cause this crash, I would think it would consume a huge amount of memory, even with accessibility disabled.
Comment 29•6 years ago
|
||
Hi everyone,
I received an email from philipp that you had interest in what caused the daily crash in my Firefox.
I really do not know which "browser window/tab" caused the crashes, if any. We have as many as 10 tabs/windows running 24/7. I had given a photo of what they look like here: https://support.mozilla.org/en-US/questions/1247107
All I know is that the solution was to activate the "prevent accessibility services from accessing your browser" setting. I do not really understand what this setting does. There have been no more crashes after activating this setting. So, it's kind of like a "magic button" to me (press the magic button and the problem is solved). But all my tabs continue to function as usual and are not affected by this setting.
I would be happy to provide more information, but I think I may need some guidance on what information to provide.
Thanks for helping to solve my daily crash.
Thanks,
Jonah.
+65-98350243
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
Assignee | ||
Comment 31•6 years ago
|
||
(In reply to jonahyongntu from comment #29)
I received an email from philipp that you had interest in what caused the daily crash in my Firefox.
Thanks so much for following up and for your willingness to help.
All I know is that the solution was to activate the "prevent accessibility services from accessing your browser" setting. I do not really understand what this setting does.
Normally, accessibility services are used by people with disabilities; e.g. a screen reader used by blind users. I assume you don't have any users with disabilities requiring such tools, as preventing accessibility services would cause Firefox to be inaccessible to them. Unfortunately, some other apps (security software, anti-virus software, malware, etc.) use Firefox's accessibility framework as well. My guess is that some app on your system was doing this. While not entirely related, I'd be curious to know what app is triggering this on your system. You can determine this as follows:
- Uncheck "Prevent accessibility services" in Firefox Options and restart Firefox.
- Go to this address using the address bar: about:support
- Look for "Accessibility Instantiator" and include the information specified there in your reply.
I would be happy to provide more information, but I think I may need some guidance on what information to provide.
With regard to the tabs you have open, do any of them continually add new information to the document without clearing previous information? That is, the amount of information in the document continues to increase the longer it is open (and the further you can scroll back the longer it is open)? Or does the information on the page simply update (rather than adding more content to the bottom)?
Thanks again for your help.
Assignee | ||
Comment 33•6 years ago
|
||
Since this needs an owner, I'll "own" this, but it's important to note that right now, it isn't actionable without more data/a reproduceable test case.
Comment 34•6 years ago
|
||
(In reply to James Teh [:Jamie] from comment #33)
Since this needs an owner, I'll "own" this, but it's important to note that right now, it isn't actionable without more data/a reproduceable test case.
should it be a tracked bug then?
Assignee | ||
Comment 35•6 years ago
|
||
It's being tracked because of the high crash volume as per comment 26. I guess the hope is that we'll somehow acquire data that will allow us to implement a fix. I very much doubt this is going to happen at this point, though.
Comment 36•6 years ago
|
||
Hi James,
Sorry. The system needs to be running 24/7, so I do not want to "undo the solution" by unchecking "prevent accessibility services".
But I can answer the questions regarding how the tabs work.
None of the tabs get new information added to them continually:
https://prod-cdn.sumo.mozilla.net/uploads/images/2019-01-20-18-07-51-3e20d4.png
The "charts using divs and pngs" receive data for about 5 seconds, and then just stay there. There is a javascript to "reload the page" once every 5 minutes.
The webmap, gif animations and remaining tabs are pretty static too. They just load once and then sit there forever.
Updated•6 years ago
|
Comment 37•6 years ago
|
||
Wontfix for the releases in flight as this is currently stalled.
Updated•5 years ago
|
Comment 38•5 years ago
|
||
This is a test case that crashes my browser (both 70.0.1 and compiled from source) at that assertion in less than an hour.
The relevant section of my about:support is
Accessibility
Activated true
Prevent Accessibility 0
Accessible Handler Used false
Accessibility Instantiator UNKNOWN|
Library Versions
Assignee | ||
Comment 39•4 years ago
|
||
When an Accessible is shut down, there might still be external references to it, so we keep the instance alive.
We don't want to reuse the id until all references are released, so we only release the id in the destructor (which runs when all references are released).
Previously, we were clearing the id variable in Shutdown, which meant we couldn't release the id in the destructor!
This meant we always leaked ids, never reusing them.
Now, we don't clear the id in Shutdown.
Comment 40•4 years ago
|
||
Comment 41•4 years ago
|
||
bugherder |
Comment 42•4 years ago
|
||
Nicely done!
Comment 43•4 years ago
|
||
Too late to get it into 79.0 at this point, but I think we may want to consider getting it onto ESR78 eventually given the high crash volume and maybe even a 79.0.1 dot release.
Assignee | ||
Comment 44•4 years ago
|
||
I'd really like this to bake on nightly and perhaps even beta for a few weeks at least. While I believe the patch itself to be safe, it causes ids to be reused. That's precisely the original intention... but the previous (non-reuse) behaviour existed for years and there's a slight chance we might see some fallout. For example, if an object dies, it's slightly possible its id might get reused before a client has even had a chance to process the fact that the old object died, particularly in an asynchronous, coalescing world.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 45•4 years ago
|
||
I've reproduced the crash signatures with Fx 72.0a1 on Windows 10, helped by the information from comment 38 (after running the browser for about 45m).
I did not longer reproduced this particular crash with the latest builds RC 80.0 and 81.0a1 (Windows 10), but after around 1h20m the browser crashed with the OOM signature, which due to the nature of the test case I think should be expected.
James, is this expected indeed, should I consider this issue verified ( note that I've checked with several profiles without encountering the mozilla::a11y::IDSet::GetI crash at all) or should it be something else to look for?
Assignee | ||
Comment 46•4 years ago
|
||
(In reply to Anca Soncutean [:Anca], Desktop Release QA from comment #45)
I've reproduced the crash signatures with Fx 72.0a1 on Windows 10, helped by the information from comment 38 (after running the browser for about 45m).
Were you running an accessibility client? What did the Accessibility section of about:support show for you?
I did not longer reproduced this particular crash with the latest builds RC 80.0 and 81.0a1 (Windows 10), but after around 1h20m the browser crashed with the OOM signature, which due to the nature of the test case I think should be expected.
The test case gets rid of old nodes when it adds new ones, so seeing OOM here concerns me. It'd be interesting to know whether you see OOM (though it might take longer) with the accessibility.force_disabled pref set to 1.
should I consider this issue verified ( note that I've checked with several profiles without encountering the mozilla::a11y::IDSet::GetI crash at all)
I think we can consider this verified.
or should it be something else to look for?
If you can't repro the OOM with accessibility.force_disabled set to 1, even with a longer time period, we should file anew bug in disability access APIs. Thanks!
Comment 47•4 years ago
|
||
Hi Jamie, did you want to nominate this for ESR78 approval now or wait for another cycle? Note that we'll be migrating ESR68 users to ESR78 with the upcoming 78.3esr release.
Assignee | ||
Comment 48•4 years ago
|
||
Comment on attachment 9165607 [details]
Bug 1368270: When shutting down Windows AccessibleWraps, don't clear the id.
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Crashes.
- User impact if declined: Crashes with accessibility enabled on pages which create a lot of nodes.
- Fix Landed on Version: 80
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Straightforward change which results in the behaviour that was originally intended. Baked on beta, nightly and release with no reported issues.
- String or UUID changes made by this patch: None
Updated•4 years ago
|
Comment 49•4 years ago
|
||
Comment on attachment 9165607 [details]
Bug 1368270: When shutting down Windows AccessibleWraps, don't clear the id.
crash fix, approved for 78.3
Updated•4 years ago
|
Comment 50•4 years ago
|
||
bugherder uplift |
Comment 51•4 years ago
|
||
As already mentioned in comment 45, I’ve reproduced the crash signatures with Fx 72.0a1 on Windows 10. I could no longer reproduced this crash (or the OOM signature I’ve ran into when attempting to verified this issue the first time). Issue verified fixed with Nightly 82.0a1 (20200909213959), RC 80.0.1 (20200831163820) and ESR 78.3.0 (treeherder build - 20200909191438) on Windows 10.
Description
•