Closed
Bug 1338908
Opened 8 years ago
Closed 8 years ago
Crash in mozilla::EventListenerManager::HandleEventSubType
Categories
(Core :: DOM: Events, defect, P1)
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: philipp, Unassigned)
Details
(6 keywords, Whiteboard: [sec-triage-backlog])
Crash Data
This bug was filed from the Socorro interface and is
report bp-d7eb9da5-c37a-4449-b427-352602170212.
=============================================================
Crashing Thread (0)
Frame Module Signature Source
0 xul.dll mozilla::EventListenerManager::HandleEventSubType(mozilla::EventListenerManager::Listener*, nsIDOMEvent*, mozilla::dom::EventTarget*) dom/events/EventListenerManager.cpp:1112
1 xul.dll mozilla::EventListenerManager::HandleEventInternal(nsPresContext*, mozilla::WidgetEvent*, nsIDOMEvent**, mozilla::dom::EventTarget*, nsEventStatus*) dom/events/EventListenerManager.cpp:1286
2 xul.dll mozilla::EventTargetChainItem::HandleEventTargetChain(nsTArray<mozilla::EventTargetChainItem>&, mozilla::EventChainPostVisitor&, mozilla::EventDispatchingCallback*, mozilla::ELMCreationDetector&) dom/events/EventDispatcher.cpp:358
3 xul.dll mozilla::EventDispatcher::Dispatch(nsISupports*, nsPresContext*, mozilla::WidgetEvent*, nsIDOMEvent*, nsEventStatus*, mozilla::EventDispatchingCallback*, nsTArray<mozilla::dom::EventTarget*>*) dom/events/EventDispatcher.cpp:711
4 @0x1200187
this crash signature has been around for a while, but it seems to regress in terms of numbers in firefox 51 - it is affecting all versions of windows there & happens mostly in the content process. also russian locale builds seem to be disproportionately high affected.
Correlations for Firefox Release
(79.14% in signature vs 02.01% overall) address = 0x4
(56.89% in signature vs 11.15% overall) useragent_locale = ru [71.50% vs 14.61% if startup_crash = null]
(79.52% in signature vs 35.94% overall) reason = EXCEPTION_ACCESS_VIOLATION_READ
(34.89% in signature vs 65.49% overall) Module "iertutil.dll" = true [26.73% vs 75.65% if platform_pretty_version = Windows 10]
(34.77% in signature vs 66.55% overall) Module "wininet.dll" = true [28.71% vs 77.05% if platform_pretty_version = Windows 10]
(31.35% in signature vs 66.16% overall) Module "dxva2.dll" = true [45.54% vs 79.80% if platform_pretty_version = Windows 10]
(31.35% in signature vs 65.99% overall) Module "mf.dll" = true [45.54% vs 79.70% if platform_pretty_version = Windows 10]
(31.35% in signature vs 65.96% overall) Module "evr.dll" = true [45.54% vs 79.70% if platform_pretty_version = Windows 10]
(33.25% in signature vs 66.74% overall) Module "mfplat.dll" = true [46.53% vs 80.12% if platform_pretty_version = Windows 10]
(06.95% in signature vs 39.09% overall) e10s_cohort = disqualified-test [33.74% vs 58.46% if process_type = null]
(79.39% in signature vs 31.41% overall) process_type = content [100.0% vs 84.17% if startup_crash = null]
(66.37% in signature vs 87.67% overall) Module "psapi.dll" = true [73.27% vs 94.45% if platform_pretty_version = Windows 10]
Updated•8 years ago
|
Component: Untriaged → DOM: Events
Comment 1•8 years ago
|
||
Quick thought: possibly related to bug 1304874?
| Reporter | ||
Comment 2•8 years ago
|
||
it's generally spiking up in 51 - also when you discount ru users (so that may one factor playing into the issue, but not the only one): http://bit.ly/2kQpdhV
Comment 3•8 years ago
|
||
Looks like this is usually a startup crash and there are about 1000 crashes per week on release.
Worth investigating for 53/54 if there is anything actionable here.
Comment 4•8 years ago
|
||
Stone, please help investigate this.
Flags: needinfo?(sshih)
Priority: -- → P1
Updated•8 years ago
|
Assignee: nobody → sshih
Flags: needinfo?(sshih)
Comment 5•8 years ago
|
||
Found several crashes happened when browsing e.mail.ru. Added some assertions to test FF with it but still can't reproduce it.
Comment 6•8 years ago
|
||
(In reply to Ming-Chou Shih [:stone] from comment #5)
> Found several crashes happened when browsing e.mail.ru. Added some
> assertions to test FF with it but still can't reproduce it.
Hi Mike, would it be possible to have your team's assistance in investigating the website implementation? Our current investigation so far points that there seems something suspicious while browsing the inbox of e.mail.ru.
Hi Florin, would you or your team be able to help us get STRs for reproducing this?
Thank you very much!
Flags: needinfo?(miket)
Flags: needinfo?(florin.mezei)
Comment 7•8 years ago
|
||
Moving the ni to Andrei to see if we can find some time for investigation. Note that the team is busy with the release for 52 currently, so this will likely not happen very soon.
Flags: needinfo?(florin.mezei) → needinfo?(andrei.vaida)
Comment 8•8 years ago
|
||
Is the crash address always 0 + offset?
Is aListener null?
Comment 9•8 years ago
|
||
Steps to reproduce the crash would be helpful, my Russian isn't great.
Flags: needinfo?(miket)
Comment 10•8 years ago
|
||
I can’t reproduce this issue using Firefox 51.0.1(Build Id: 20170125094131,ru), Firefox 52.0 esr (Build Id:20170303022339, ru) and Firefox 53.0b1 (Build Id:20170307064827, ru) on Windows 10 64 bit (GPU:Intel HD Graphics 530), Windows 10 64 bit (GPU: ATI Radeon HD 3000) and Windows 7 x86 (GPU: AMD Radeon HD 6450).
I have spent most of the time trying to reproduce this issue using Firefox 51.0.1 (ru), since it is spiking in 51 build and it is affecting mostly the ru users.
I have further investigated if this crash is reproducible using the websites where this has occurred frequently (e.mail.ru and deutsche-bank.de) but I didn’t managed to reproduce it. I have also tried to reproduce this issue using Hola VPN proxy but failed to do so.
Flags: needinfo?(andrei.vaida)
Comment 11•8 years ago
|
||
sec-high UAF
Of the crashes in the last week, there are many with random addresses, and around a dozen with clear UAF addresses, in addition to the more-common 0x4.
Crashes in 52bNN, 52esr, and 51esr(??!) in addition to loads in 51. all on the same source line
EventListenerHolder listenerHolder(aListener->mListener.Clone());
Last changes by Xidorn in bug 1287706
Another bug possibly involved higher up the stack was bug 1236979 (dholbert)
Keywords: csectype-uaf,
sec-high
| Reporter | ||
Updated•8 years ago
|
Group: core-security
Updated•8 years ago
|
status-firefox55:
--- → affected
status-firefox-esr45:
--- → unaffected
Comment 12•8 years ago
|
||
FWIW, I was looking at this a bit and couldn't see anything obvious. Crashes seemed to be related to XPCOM event listeners, so perhaps some C++ caller has bad lifetime managing for some listener?
Updated•8 years ago
|
Group: core-security → dom-core-security
Comment 13•8 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #8)
> Is the crash address always 0 + offset?
> Is aListener null?
It doesn't seem so. mListener is the first field in struct Listener, so if aListener is nullptr, the crash should happen at 0x0 rather than 0x4.
Also, if aListener is nullptr, this crash should happen earlier lines like [1], unless "Maybe::ptr()" returns something weird.
So the EXCEPTION_ACCESS_VIOLATION_READ cases seem more likely to be that the pointer stored in mListener is 0x4, or its 0x0 but the null-check in NS_IF_ADDREF is removed by the compiler.
[1] https://hg.mozilla.org/releases/mozilla-release/annotate/327e081221b0/dom/events/EventListenerManager.cpp#l1260
Comment 14•8 years ago
|
||
This error happens almost only on x86 Windows, so it could be that something is mis-optimized somehow.
(In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #13)
> It doesn't seem so. mListener is the first field in struct Listener, so if
> aListener is nullptr, the crash should happen at 0x0 rather than 0x4.
Not so fast.
mListener is an EventListenerHolder, which is a CallbackObjectHolder. CallbackObjectHolder derives from CallbackObjectHolderBase. CallbackObjectHolderBase needs to have nonzero size because C++ requires all types to have nonzero size (probably 1 in this case), so CallbackObjectHolder most likely stores (on 32-bit) its actual data at an offset of 0x4 (for alignment).
Comment 16•8 years ago
|
||
There is a static_assert in CallbackObjectHolder [1] to ensure that it has the size of a pointer, so it is guaranteed that CallbackObjectHolderBase doesn't contribute to the size of the object.
[1] https://dxr.mozilla.org/mozilla-central/rev/528e9dbbb882db0b32792d44b5be9cc539afa1a8/dom/bindings/CallbackObject.h#394-397
Comment 17•8 years ago
|
||
All EXCEPTION_ACCESS_VIOLATION_EXEC crashes (which are those related to XPCOMCallback mentioned by smaug in comment 12) come from nsDocument::DispatchContentLoadedEvents() [1], so it might be that there is a broken XPCOM listener setup for DOMContentLoaded event.
EXCEPTION_ACCESS_VIOLATION_READ cases don't seem to have pattern on what event would trigger.
I guess there are actually two different kinds of crashes happen to share the same signature, and we probably should analyse them separately.
[1] https://hg.mozilla.org/releases/mozilla-release/annotate/327e081221b0/dom/base/nsDocument.cpp#l4987
Comment 18•8 years ago
|
||
About the signature 'EXCEPTION_ACCESS_VIOLATION_READ to address 0x04', using release build 52 to debug, found something about crash in [1]
1. aListener (EventListenerManager::Listener) is not null
2. mPtrBits of CallbackObjectHolder (mListener) is not null (stores the address of the callback object)
3. The vtbl of the callback object pointer is null
4. When calling .Clone() [2], we are jumping to the address of AddRef (vtbl[1] = 0x04)
Looks like the callback object is destroyed.
[1] http://searchfox.org/mozilla-central/rev/ef0b6a528e0e7b354ddf8cdce08392be8b8ca679/dom/events/EventListenerManager.cpp#1102
[2] http://searchfox.org/mozilla-central/rev/ef0b6a528e0e7b354ddf8cdce08392be8b8ca679/dom/bindings/CallbackObject.h#449
Comment 19•8 years ago
|
||
(In reply to Ming-Chou Shih [:stone] from comment #18)
> 2. mPtrBits of CallbackObjectHolder (mListener) is not null (stores the
> address of the callback object)
2. mPtrBits of CallbackObjectHolder (mListener) is not 0
> 3. The vtbl of the callback object pointer is null
3. The vtbl of the callback object is null
Comment 20•8 years ago
|
||
Although the crash reports hint that it may be caused by a dead listener object, still can't find which object caused it.
Updated•8 years ago
|
Updated•8 years ago
|
Keywords: testcase-wanted
Updated•8 years ago
|
Updated•8 years ago
|
Whiteboard: [sec-triage-backlog]
Comment 21•8 years ago
|
||
This is an assigned P1 bug without activity in two weeks.
If you intend to continue working on this bug for the current release/iteration/sprint, remove the 'stale-bug' keyword.
Otherwise we'll reset the priority of the bug back to '--' on Monday, August 28th.
Keywords: stale-bug
Comment 22•8 years ago
|
||
Checked the graph and the crashes decreased around Mar19. 66.21% of crashes happened on the build 20170125094131. The second high build is 20170627155318 (3.58%).
So far I have no idea about next action. Reset the assignee.
Assignee: sshih → nobody
Updated•8 years ago
|
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
Updated•7 years ago
|
Group: dom-core-security
Updated•3 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•