Closed Bug 894215 Opened 11 years ago Closed 6 years ago

"Assertion failure: window" [@ nsDOMWindowList::IndexedGetter] with adoptNode, navigation

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla62

Tracking Flags:

Tracking

Status

firefox-esr52

---

wontfix

firefox-esr60

---

wontfix

firefox56

---

wontfix

firefox57

---

wontfix

firefox58

---

wontfix

firefox60

---

wontfix

firefox61

---

wontfix

firefox62

---

fixed

People

(Reporter: jruderman, Assigned: bzbarsky)

Details

(Keywords: assertion, sec-moderate, testcase, Whiteboard: [post-critsmash-triage][adv-main62+])

Attachments

(7 files, 1 obsolete file)

testcase 11 years ago Jesse Ruderman 922 bytes, text/html		Details
stack 11 years ago Jesse Ruderman 14.81 KB, text/plain		Details
alt stack 11 years ago Jesse Ruderman 16.85 KB, text/plain		Details
Testcase using blob URIs instead of data:, for simplicity 6 years ago Boris Zbarsky [:bzbarsky] 960 bytes, text/html		Details
part 1. Change nsSHEntryShared to inherit from nsStubMutationObserver 6 years ago Boris Zbarsky [:bzbarsky] 3.95 KB, patch	smaug : review+	Details \| Diff \| Splinter Review
part 2. Make sure we don't bfcache pages that get mutated after we capture their state 6 years ago Boris Zbarsky [:bzbarsky] 6.49 KB, patch		Details \| Diff \| Splinter Review
part 2. Fix the document "disallow bfcaching" mechanism to work without asserting 6 years ago Boris Zbarsky [:bzbarsky] 4.15 KB, patch	smaug : review+	Details \| Diff \| Splinter Review
part 3. Make sure we don't bfcache pages that get mutated after we capture their state 6 years ago Boris Zbarsky [:bzbarsky] 6.61 KB, patch	smaug : review+	Details \| Diff \| Splinter Review

Jesse Ruderman

Reporter

Description

•

11 years ago

Attached file testcase — Details

Assertion failure: window, at dom/base/nsDOMWindowList.cpp:102

Jesse Ruderman

Reporter

Comment 1

•

11 years ago

Attached file stack — Details

Jesse Ruderman

Reporter

Comment 2

•

11 years ago

Attached file alt stack — Details

Without the last part of the testcase, it usually hits the same assert from ssi_getSelectedPageStyle in SessionStore.jsm.  But that's on a longer timer.

Boris Zbarsky [:bzbarsky]

Assignee

Comment 3

•

11 years ago

Hrm.  We can add a null-check, but why do we have a child docshell that has a null window???

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 4

•

11 years ago

So we're about to destroy the docshell we access, so GetInterface returns null.
But why do we have such docshell as a child when we call w[0]...

Flags: needinfo?(bugs)

Boris Zbarsky [:bzbarsky]

Assignee

Comment 5

•

11 years ago

Right, that was the question.  Why are we about to destroy a docshell that we have as a child?

Al Billings [:abillings - ex-MoCo]

Comment 6

•

11 years ago

Does anyone have any suggestion of a security rating for this?

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Keywords: sec-high

Andrew McCreight [:mccr8]

Comment 7

•

11 years ago

Olli, should this be sec-moderate until we know that this is anything besides a null crash?

Andrew McCreight [:mccr8]

Comment 8

•

11 years ago

I'm going to reduce this to sec-moderate as it looks like just a null deref.  Feel free to increase it if you want.

Keywords: sec-high → sec-moderate

Bobby Holley (:bholley)

Comment 9

•

10 years ago

This doesn't crash for me anymore, so I'd hazard a guess that this was fixed by Window WebIDL or something similar. Jesse, can you confirm that the testcase no longer works?

Flags: needinfo?(jruderman)

Boris Zbarsky [:bzbarsky]

Assignee

Comment 10

•

10 years ago

> so I'd hazard a guess that this was fixed by Window WebIDL or something similar

Unlikely.  The stack shown is the one we'd still take for indexed access on a window, and the same MOZ_ASSERT is still there...

Jesse Ruderman

Reporter

Comment 11

•

9 years ago

I can still reproduce this bug.

Flags: needinfo?(jruderman)

BMO Automation

Updated

•

9 years ago

Group: core-security → dom-core-security

Ryan VanderMeulen [:RyanVM]

Comment 12

•

7 years ago

This still reproduces on trunk. Note that the testcase requires you to set the "security.data_uri.unique_opaque_origin" and "security.data_uri.block_toplevel_data_uri_navigations" prefs to false now due to the restrictions on data URIs that were put in place.

Has Regression Range: --- → no

status-firefox56: --- → wontfix

status-firefox57: --- → wontfix

status-firefox58: --- → fix-optional

status-firefox-esr52: --- → wontfix

Boris Zbarsky [:bzbarsky]

Assignee

Comment 13

•

7 years ago

OK, so I think I know what's going on.  The sequence of events is as follows:

1) We load a page, with a subframe.

2) We start loading a new page.

3) We call nsDocShell::Embed on the new page's content viewer.  This calls
   SetupNewViewer, which calls nsDocShell::DestroyChildren etc.  This nulls
   out the parent pointer on the subframe docshell but it's still around,
   because we are planning to put the first page into bfcache.  It _also_
   captures the list of child docshells and stores it in the SHEntry
   (see nsDocShell::CaptureState).

4) This code runs:

    document.adoptNode(page1root);

   The adoptNode call removes page1root from the DOM of its document,
   which is the document that we are navigating away from.  At this point
   it has _not_ yet been "fully" placed in bfcache, because Destroy() hasn't
   been called on the viewer.  Specifically, the SHEntry has been stored
   on the documentviewer, but we have not called
   nsSHEntryShared::SetContentViewer yet.  So we are not yet watching for mutations
   to drop the document from bfcache.  Before returning from the adoptNode call,
   we end up calling Destroy() on the subframe's docshell (via
   nsDocument::MaybeInitializeFinalizeFrameLoaders being called from a
   scriptblocker coming off the stack).  This nulls out its mScriptGlobal pointer,
   which is why we have a docshell that has a null window.

5) This code runs:

    w.history.back();

   When back() is called, that lands in nsDocShell::InternalLoad, which does:

           nsCOMPtr<nsISHEntry> viewerEntry;
           prevViewer->GetHistoryEntry(getter_AddRefs(viewerEntry));
           if (viewerEntry == aSHEntry) {
             // Make sure this viewer ends up in the right place
             mContentViewer->SetPreviousViewer(nullptr);
             prevViewer->Destroy();
           }

   This Destroy() call places prevViewer's document in the bfcache.  Then we
   keep going with the back() navigation and load it from bfcache.

6) When going back via bfcache we grab the set of child shells from the shentry.
   This was stored back in step 3, so contains the now-torn-down docshell.

Upshot: The fix for bug 292962 didn't actually fix it very well.  There's this time period while a new page is loading when we have already grabbed some bfcache state but don't have the mutation observers set up yet, and if the old page is mutated during this time, bad things will happen.

At least the "holds weak refs to docshells" bit got fixed in bug 319551, so we get a null-deref here, not a UAF....

Anyway, we should fix things so we drop bfcache in this case.  That _may_ be enough, since even nsSHEntryShared::RemoveFromBFCacheAsync (which is what we do on mutations) calls DropPresentationState() sync unless it fails to dispatch the runnable.  And DropPresentationState() is what clears out the child shell list in the shentry.

Boris Zbarsky [:bzbarsky]

Assignee

Comment 14

•

6 years ago

Attached file Testcase using blob URIs instead of data:, for simplicity — Details

Boris Zbarsky [:bzbarsky]

Assignee

Comment 15

•

6 years ago

Attached patch part 1. Change nsSHEntryShared to inherit from nsStubMutationObserver — Details — Splinter Review

Attachment #8976778 - Flags: review?(bugs)

Boris Zbarsky [:bzbarsky]

Assignee

Updated

•

6 years ago

Assignee: nobody → bzbarsky

Status: NEW → ASSIGNED

Boris Zbarsky [:bzbarsky]

Assignee

Comment 16

•

6 years ago

Attached patch part 2. Make sure we don't bfcache pages that get mutated after we capture their state (obsolete) — Details — Splinter Review

Attachment #8976779 - Flags: review?(bugs)

Boris Zbarsky [:bzbarsky]

Assignee

Comment 17

•

6 years ago

Comment on attachment 8976779 [details] [diff] [review]
part 2.  Make sure we don't bfcache pages that get mutated after we capture their state

This part still has some issues I'm investigating...

Attachment #8976779 - Flags: review?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

6 years ago

Attachment #8976778 - Flags: review?(bugs) → review+

Boris Zbarsky [:bzbarsky]

Assignee

Comment 18

•

6 years ago

Attached patch part 2. Fix the document "disallow bfcaching" mechanism to work without asserting — Details — Splinter Review

There are two issues being fixed here.  First, if DisallowBFCaching is called
before CanSavePresentation, we should really return false from
CanSavePresentation.  Otherwise we'll end up doing a bunch of state-capturing
work for no reason.

Second, if DisallowBFCaching is called between CanSavePresentation and
nsDocumentViewer::Destroy, we need to actually tear down the viewer state.
What we do right now is avoid putting the viewer into the SHEntry, but still
not tear down its presshell and so forth, which leads to asserts in
~nsDocumentViewer when this case is hit.

Attachment #8977079 - Flags: review?(bugs)

Boris Zbarsky [:bzbarsky]

Assignee

Comment 19

•

6 years ago

Attached patch part 3. Make sure we don't bfcache pages that get mutated after we capture their state — Details — Splinter Review

Attachment #8977080 - Flags: review?(bugs)

Boris Zbarsky [:bzbarsky]

Assignee

Updated

•

6 years ago

Attachment #8976779 - Attachment is obsolete: true

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 20

•

6 years ago

Comment on attachment 8977079 [details] [diff] [review]
part 2.  Fix the document "disallow bfcaching" mechanism to work without asserting

ahaa, 8 line context wasn't enough for DocumentViewer.

Attachment #8977079 - Flags: review?(bugs) → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 21

•

6 years ago

Comment on attachment 8977080 [details] [diff] [review]
part 3.  Make sure we don't bfcache pages that get mutated after we capture their state

I wonder if we could have reused the MutationObserver session history has. But this isn't too complicated either.

Attachment #8977080 - Flags: review?(bugs) → review+

Boris Zbarsky [:bzbarsky]

Assignee

Comment 22

•

6 years ago

> I wonder if we could have reused the MutationObserver session history has.

Hmm.  I guess with the new setup where I just disallow bfcaching on the document (as opposed to the earlier attempt where it did something else) we could do that...  The logic might still end up more complicated and harder to follow, though.  In particular, we would need to add the observer when we set up the shentry, then remove it while we sanitize, then readd it.

Boris Zbarsky [:bzbarsky]

Assignee

Comment 23

•

6 years ago

   https://hg.mozilla.org/integration/mozilla-inbound/rev/131a40749cb981542c1e84b570d8d24eff17f399
   https://hg.mozilla.org/integration/mozilla-inbound/rev/4df8820be2aacf5d3cb7a0cc84fdc3acab0015d5
   https://hg.mozilla.org/integration/mozilla-inbound/rev/7f98616d79149624aeb956624af1d9714fda2117

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 24

•

6 years ago

https://hg.mozilla.org/mozilla-central/rev/131a40749cb9
https://hg.mozilla.org/mozilla-central/rev/4df8820be2aa
https://hg.mozilla.org/mozilla-central/rev/7f98616d7914

Group: dom-core-security → core-security-release

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

status-firefox62: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla62

Ryan VanderMeulen [:RyanVM]

Comment 25

•

6 years ago

This grafts cleanly to both Beta and ESR60, though I'm not sure what the urgency is here for uplifting given how old the bug is. Is this something we should consider or just let it ride the trains?

status-firefox58: fix-optional → wontfix

status-firefox60: --- → wontfix

status-firefox61: --- → affected

status-firefox-esr60: --- → affected

Flags: needinfo?(bzbarsky)

Flags: in-testsuite?

Boris Zbarsky [:bzbarsky]

Assignee

Comment 26

•

6 years ago

Given the limited severity of the bug and the risk of any changes to this code, I don't think we should be uplifting anything here.

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Assignee

Updated

•

6 years ago

Depends on: 1462885

Ryan VanderMeulen [:RyanVM]

Updated

•

6 years ago

status-firefox61: affected → wontfix

status-firefox-esr60: affected → wontfix

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Updated

•

6 years ago

No longer depends on: 1462885

Andrei Vaida [:avaida]

Updated

•

6 years ago

Flags: qe-verify+

Whiteboard: [post-critsmash-triage]

Emil Ghitta, QA [:emilghitta]

Comment 27

•

6 years ago

Hi Jesse,

Unfortunately I didn't managed to reproduce this issue by using an affected build (tried on the latest release 61, and with old asan builds).

I tried to reproduce this issue by modifying the prefs Ryan provided in Comment 12, but without luck.

Could you help us by verifying if this issue is fixed in Fx62?.

Thanks a lot!

Flags: needinfo?(jruderman)

Emil Ghitta, QA [:emilghitta]

Comment 28

•

6 years ago

It seems that Jesse is unavailable at this time (per abillings). Removing the qe-verify flag for now, as QA cannot reproduce the issue and therefore can't confirm whether it's fixed. 

Perhaps someone else could give it a try?

Flags: qe-verify+

Flags: needinfo?(jruderman)

Al Billings [:abillings - ex-MoCo]

Updated

•

6 years ago

Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main62+]

Daniel Veditz [:dveditz]

Updated

•

5 years ago

Group: core-security-release

You need to log in before you can comment on or make changes to this bug.