Crash in nsGlobalWindow::DispatchDOMWindowCreated

RESOLVED FIXED in Firefox -esr52

Status

()

defect
--
critical
RESOLVED FIXED
3 years ago
4 months ago

People

(Reporter: kanru, Assigned: freesamael)

Tracking

(4 keywords)

52 Branch
mozilla55
Unspecified
Android
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox-esr45 wontfix, firefox52 wontfix, firefox-esr5253+ fixed, firefox53+ verified, firefox54+ fixed, firefox55+ fixed)

Details

(Whiteboard: [adv-main53+][adv-esr52.1+], crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

This bug was filed from the Socorro interface and is 
report bp-96207d91-81aa-41d0-bdef-7a2012170215.
=============================================================
FennecAndroid only crash.

I can always reproduce this when trying to open http://www.lastampa.it/edizioni/milano on my phone.

Looks like UAF so marked as security, feel free to make it public otherwise.

Tentatively file under DOM. Hsinyi, do you have anyone can take a look?
Flags: needinfo?
Flags: needinfo? → needinfo?(htsai)
All the Fennec crashes and most of the Desktop ones are near-null reads and don't look like security problems.

I did see two Desktop crashes that crashed on an exec violation on a random address.
bp-5dfaf4ff-8e7b-4349-b7fa-b71822170130
bp-01684e17-21a3-430d-b260-bf7d92170127

I don't see evidence of the UAF-poisoning marker (addresses with e5e5 in them) but this could be a UAF nonetheless. Certainly corrupted memory somehow.

I can reproduce the crash with my Fennec 52.0b5 build bp-924b966f-bace-402e-bbcb-3c4b32170216
We should jump on this while we have a reproducible case -- who knows when that site will change. Who should we get on this bug?
Group: core-security → dom-core-security
Flags: needinfo?(continuation)
Keywords: testcase
Samael might be able to shed some light here.
Flags: needinfo?(htsai) → needinfo?(sawang)
This is pretty confusing.  Are we crashing because "this" is null or something?  That would be ... pretty odd in this case.
So I can reproduce it locally. 

gdb shows that somehow mDoc turns to nullptr after nsContentUtils::DispatchChromeEvent call [1], so that the following mDoc->NodePrincipal() causes a crash. I'll try to figure it out.
Assignee: nobody → sawang
Flags: needinfo?(sawang)
(In reply to Samael Wang [:freesamael][:sawang] from comment #4)
> So I can reproduce it locally. 
> 
> gdb shows that somehow mDoc turns to nullptr after
> nsContentUtils::DispatchChromeEvent call [1], so that the following
> mDoc->NodePrincipal() causes a crash. I'll try to figure it out.

[1] http://searchfox.org/mozilla-central/rev/cac6cb6a10afb8ebb2ecfbeeedaff7c66f57dd75/dom/base/nsGlobalWindow.cpp#3258
There was an XHR triggered nested event loop, and docshell was destroyed within that, causing mDoc being cleared.

#0  nsGlobalWindow::DropOuterWindowDocs (this=0xc66c6000) at /home/freesamael/Repos/gecko-unified/dom/base/nsGlobalWindow.cpp:1785
#1  0xdb1c433e in nsGlobalWindow::DetachFromDocShell (this=0xc66c6000) at /home/freesamael/Repos/gecko-unified/dom/base/nsGlobalWindow.cpp:3400
#2  0xdcff38d5 in nsDocShell::Destroy (this=0xc540f400) at /home/freesamael/Repos/gecko-unified/docshell/base/nsDocShell.cpp:5844
#3  0xdb2a6c5f in nsFrameLoader::DestroyDocShell (this=0xc92f7940) at /home/freesamael/Repos/gecko-unified/dom/base/nsFrameLoader.cpp:2114
#4  0xdb2c8347 in nsFrameLoaderDestroyRunnable::Run (this=0xc7fd6a40) at /home/freesamael/Repos/gecko-unified/dom/base/nsFrameLoader.cpp:2052
#5  0xda547f9e in nsThread::ProcessNextEvent (this=0xe086d0e0, aMayWait=true, aResult=0xedd3a4bf) at /home/freesamael/Repos/gecko-unified/xpcom/threads/nsThread.cpp:1264
#6  0xda549ec1 in NS_ProcessNextEvent (aThread=0xe086d0e0, aMayWait=true) at /home/freesamael/Repos/gecko-unified/xpcom/threads/nsThreadUtils.cpp:389
#7  0xda54a5fc in nsThread::Shutdown (this=0xcac0dd50) at /home/freesamael/Repos/gecko-unified/xpcom/threads/nsThread.cpp:1015
#8  0xdb9e81f9 in applyImpl<nsIThread, nsresult (nsIThread::*)()> (args=..., m=<optimized out>, o=<optimized out>)
    at /home/freesamael/Repos/gecko-build/obj-android/dist/include/nsThreadUtils.h:855
#9  apply<nsIThread, nsresult (nsIThread::*)()> (this=0xc660ba50, m=<optimized out>, o=<optimized out>)
    at /home/freesamael/Repos/gecko-build/obj-android/dist/include/nsThreadUtils.h:862
#10 mozilla::detail::RunnableMethodImpl<nsCOMPtr<nsIThread>, nsresult (nsIThread::*)(), true, false>::Run (this=0xc660ba30)
    at /home/freesamael/Repos/gecko-build/obj-android/dist/include/nsThreadUtils.h:890
#11 0xda547f9e in nsThread::ProcessNextEvent (this=0xe086d0e0, aMayWait=true, aResult=0xedd3a5ef) at /home/freesamael/Repos/gecko-unified/xpcom/threads/nsThread.cpp:1264
#12 0xda549ec1 in NS_ProcessNextEvent (aThread=0xe086d0e0, aMayWait=true) at /home/freesamael/Repos/gecko-unified/xpcom/threads/nsThreadUtils.cpp:389
#13 0xdc20c449 in mozilla::dom::XMLHttpRequestMainThread::SendInternal (this=0xc662ec80, aBody=<optimized out>)
    at /home/freesamael/Repos/gecko-unified/dom/xhr/XMLHttpRequestMainThread.cpp:2982
#14 0xdc213931 in mozilla::dom::XMLHttpRequestMainThread::Send (this=0xc662ec80, aRv=...) at /home/freesamael/Repos/gecko-unified/dom/xhr/XMLHttpRequestMainThread.h:313
#15 0xdc213972 in mozilla::dom::XMLHttpRequestMainThread::Send (this=0xc662ec80, aCx=0xe082d000, aString=..., aRv=...)
    at /home/freesamael/Repos/gecko-unified/dom/xhr/XMLHttpRequestMainThread.h:357
#16 0xdb7bb040 in mozilla::dom::XMLHttpRequestBinding::send (cx=0xe082d000, obj=..., self=0xc662ec80, args=...)
    at /home/freesamael/Repos/gecko-build/obj-android/dom/bindings/XMLHttpRequestBinding.cpp:783
#17 0xdb9ad640 in mozilla::dom::GenericBindingMethod (cx=0xe082d000, argc=1, vp=0xedd3a978) at /home/freesamael/Repos/gecko-unified/dom/bindings/BindingUtils.cpp:2959
So I finally got a stacktrace which could possibly explain the whole story [1]. Basically during the frameloader loading process, a mutation observer removes the frame and the frameloader destroys in the middle of loading process.

I think we should prevent frameloader being destroyed during ReallyStartLoading(), maybe by queuing the StartDestroy() call until ReallyStartLoading() finishes. Suggestions are welcome if this doesn't sound right. I'll also try to make an artificial test to reproduce this issue.

[1] https://pastebin.mozilla.org/8979483
The unchecked use of mDoc after running arbitrary script in nsGlobalWindow::DispatchDOMWindowCreated is clearly a bug and should be fixed.

For the rest, I'm not sure what the right thing is.  It looks like fundamentally we ask for a UA override, land in http://searchfox.org/mozilla-central/rev/cac6cb6a10afb8ebb2ecfbeeedaff7c66f57dd75/mobile/android/chrome/content/browser.js#3234 and it wants a window.  So we create one and that fires the event, which triggers the mutation observers (on the page?).

But why is that mutation observer triggering happening?  What mutation observer notifications got queued while not being in a microtask to start with here?  :(
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #8)
> For the rest, I'm not sure what the right thing is.  It looks like
> fundamentally we ask for a UA override, land in
> http://searchfox.org/mozilla-central/rev/
> cac6cb6a10afb8ebb2ecfbeeedaff7c66f57dd75/mobile/android/chrome/content/
> browser.js#3234 and it wants a window.  So we create one and that fires the
> event, which triggers the mutation observers (on the page?).

I did check the URL passed in nsDocShell::LoadURI, which was http://www.lastampa.it/modulo/meteo/url/meteo-milano.url?2017021603, an iframe in that website, so I think it's caused by web content.
(In reply to Samael Wang [:freesamael][:sawang] from comment #9)
> (In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #8)
> > For the rest, I'm not sure what the right thing is.  It looks like
> > fundamentally we ask for a UA override, land in
> > http://searchfox.org/mozilla-central/rev/
> > cac6cb6a10afb8ebb2ecfbeeedaff7c66f57dd75/mobile/android/chrome/content/
> > browser.js#3234 and it wants a window.  So we create one and that fires the
> > event, which triggers the mutation observers (on the page?).
> 
> I did check the URL passed in nsDocShell::LoadURI, which was
> http://www.lastampa.it/modulo/meteo/url/meteo-milano.url?2017021603, an
> iframe in that website, so I think it's caused by web content.

never mind I was confused...
Flags: needinfo?(continuation)
Posted patch v1 (obsolete) — Splinter Review
Did you perhaps test what happens if there is { AutoMicroTask mt; } in html parser before the call which triggers all this stuff?
(In reply to Olli Pettay [:smaug] (pto-ish for couple of days) from comment #12)
> Did you perhaps test what happens if there is { AutoMicroTask mt; } in html
> parser before the call which triggers all this stuff?

I couldn't produce a test case to trigger the crash right from RunFlushLoop, and that website in comment 0 no longer reproduce the issue either so I couldn't find a way to test this part :(
I thought it's not UAF? The data member was explicitly set to nullptr.
Comment on attachment 8839883 [details] [diff] [review]
v1

Hi Olli,

Do you think it's sufficient fix & test? Wonder if there's other better way to test it.
Attachment #8839883 - Flags: review?(bugs)
Comment on attachment 8839883 [details] [diff] [review]
v1

this we should definitely do, but could you file a followup bug to figure out microtask handling here.

But, you don't ever remove the observer, so the test ends up leaking.
Attachment #8839883 - Flags: review?(bugs) → review-
Attachment #8839883 - Attachment description: Check for mDoc after DispatchChromeEvent → v1
Attachment #8839883 - Attachment is obsolete: true
Comment on attachment 8841478 [details] [diff] [review]
Check for mDoc after DispatchChromeEvent

Should the follow-up bug be "consider adding microtask checkpoint before EndDocUpdate"? I'm not sure what exactly we want to achieve for the microtask part.
Attachment #8841478 - Flags: review?(bugs)
The mutation observer added by the web page shouldn't run when the chrome only DOMWindowCreated event is dispatched but earlier.
Attachment #8841478 - Flags: review?(bugs) → review+
The vast majority of these crashes are null derefs, and that's what's being fixed. Let's get that out of the way. The exec violations could be something else and are pretty rare.
Comment on attachment 8841478 [details] [diff] [review]
Check for mDoc after DispatchChromeEvent

Approval Request Comment
[Feature/Bug causing the regression]: Not a regression.
[User impact if declined]: Crash.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: The web content in comment 0 no longer reproduce the crash so I'm not sure how it can be considered verified in nightly, but it's covered in automated test and the patch is a simple safe change.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: It adds only a nullptr check to prevent crash.
[String changes made/needed]: None.
Attachment #8841478 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/ebe86d0f658a
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Is there any reason not to request Beta/ESR52 approval on this as well? AFAICT, this crash affects those branches as well.
Flags: needinfo?(sawang)
Comment on attachment 8841478 [details] [diff] [review]
Check for mDoc after DispatchChromeEvent

(In reply to Ryan VanderMeulen [:RyanVM] from comment #24)
> Is there any reason not to request Beta/ESR52 approval on this as well?
> AFAICT, this crash affects those branches as well.

I wasn't sure when I should request for beta/esr (hasn't uplift anything before), but would be good if we can do that.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: Crash
Fix Landed on Version: 55
Risk to taking this patch (and alternatives if risky): Low risk as it only adds a nullptr check.
String or UUID changes made by this patch: None
Flags: needinfo?(sawang)
Attachment #8841478 - Flags: approval-mozilla-esr52?
Attachment #8841478 - Flags: approval-mozilla-beta?
Comment on attachment 8841478 [details] [diff] [review]
Check for mDoc after DispatchChromeEvent

Fix a crash. Aurora54+.
Attachment #8841478 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8841478 [details] [diff] [review]
Check for mDoc after DispatchChromeEvent

Crash fix, looks low risk, we're in early beta so let's uplift.
Attachment #8841478 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Group: dom-core-security → core-security-release
Depends on: 1346471
Tested on Firefox 53 Beta 2 with: 
- Huawei MediaPad M2 (Android 5.1.1);
- Asus ZenPad 8 (Android 6.0.1);
- Lenovo A536 (Android 4.4.2).
 
 Following the steps from description and some exploratory testing I didn't manage to crash the Beta build so mark as verified on 53.
See Also: → 1347097
Release managers, ESR52 has been waiting for approval for a month here. Can one of you approve this or deny it? See comment 25
Flags: needinfo?(rkothari)
Flags: needinfo?(lhenry)
Flags: needinfo?(jcristau)
I think we were hoping for a resolution on bug 1346471.
Flags: needinfo?(lhenry)
(In reply to Julien Cristau [:jcristau] from comment #32)
> I think we were hoping for a resolution on bug 1346471.

That hasn't been updated since March 27 and it is April 5 now and we're running out of time in this release.
Whiteboard: [adv-main53+]
Comment on attachment 8841478 [details] [diff] [review]
Check for mDoc after DispatchChromeEvent

ok let's get this crash fix in 52.1.0esr despite the test failure, so it's in sync with 53.
Flags: needinfo?(rkothari)
Flags: needinfo?(jcristau)
Attachment #8841478 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Whiteboard: [adv-main53+] → [adv-main53+][adv-esr52.1+]
Group: core-security-release
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.