Closed Bug 1358056 Opened 8 years ago Closed 8 years ago

Assertion failure: aAssociationMode == NotOwnedByDocument, at StyleSheet.cpp:579

Categories

(Core :: CSS Parsing and Computation, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox-esr52 --- unaffected
firefox53 --- wontfix
firefox54 --- fixed
firefox55 + fixed

People

(Reporter: cbook, Assigned: bzbarsky)

References

()

Details

(Keywords: assertion)

Attachments

(3 files)

Attached file crash stack
Assertion failure: aAssociationMode == NotOwnedByDocument, at c:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/src/layout/style/StyleSheet.cpp:579

found by bughunter and reproduced with latest debug windows trunk build from tinderbox.

Steps to reproduce:
-> Load 
http://guro.kumc.or.kr/popup/popDoctorInfo.do?mode=A&&DR_NO=7117
--> Assertion failure after 21 seconds
xidorn, emilio: can you take a look
Flags: needinfo?(xidorn+moz)
Flags: needinfo?(emilio+bugs)
The association mode stuff was added in bug 1332353.
Flags: needinfo?(bzbarsky)
I can't seem to reproduce with a debug Mac build from last night's inbound tip.  :(

What was the exact rev for the build mentioned as reproducing in comment 0?
Flags: needinfo?(bzbarsky) → needinfo?(cbook)
Anyway, looking at the stack, we're cycle-collecting a CSSStyleSheet, which removes it from the inner.  That calls into StyleSheetInfo::RemoveSheet.  That calls ChildSheetListBuilder::ReparentChildList with the next sheet in the inner as aPrimarySheet.  That calls SetAssociatedDocument on each child with aPrimarySheet's document and document association mode.

If the assert is then firing, that means aPrimarySheet->mDocument is null but aPrimarySheet->mDocumentAssociationMode is not OwnedByDocument, which is bogus.

OK, so how did aPrimarySheet get into that state?  If it were via SetAssociatedDocument(), then we should have gotten an assert at that point.

What can set mDocumentAssociationMode?  It's set by SetAssociatedDocument and by the StyleSheet constructors.

What can set mDocument?  It's set by the following:

1)  SetAssociatedDocument
2)  The StyleSheet constructors. 
3)  CSSStyleSheet::ReparseSheet
4)  StyleSheet::UnlinkInner
5)  StyleSheet::UnparentChildren
6)  StyleSheet::AppendStyleSheet

* SetAssociatedDocument clearly couldn't have been the way this got messed up (it would assert).
* The StyleSheet constructors both set the mDocumentAssociationMode to NotOwnedByDocument,
  so couldn't have lead to this state. 
* CSSStyleSheet::ReparseSheet looks buggy to me, in a preexisting way: it only nulls out the
  document on "child", not on child's _descendants_.  It needs to be calling SetAssociatedDocument!
* StyleSheet::UnlinkInner is arguably buggy in the same way.  It's a bit less clear,
  because presumably we're unlinking all those child sheets too.  But unlinking order is not
  guaranteed, and if we unlink here and one of our child sheets is a non-first sheet for its
  own inner and then we destroy the first sheet, we could get into the situation described
  above, I think.  I think we do want to SetAssociatedDocument here too.  In particular, it's
  possible that we're going to get told we're not document-associated from the document's unlink,
  but it just hasn't run yet, so we _can_ in fact still be document-associated here.
* StyleSheet::UnparentChildren is more of the same: nulls mDocument but not deeply.
  It's called from the CSSStyleSheet destructor.  I'm pretty sure that by this point
  mDocument better be null anyway...
* StyleSheet::AppendStyleSheet sets mDocument to whatever our mDocument is.  It doesn't set
  mDocumentAssociationMode.  It also doesn't set mDocument deeply.  That's probably buggy;
  it should SetAssociatedDocument instead.
Attachment #8860122 - Flags: review?(cam)
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
I think that should help, but hard to tell without being able to reproduce.
Flags: needinfo?(xidorn+moz)
Flags: needinfo?(emilio+bugs)
I reproduced with a Nightly debug build from this afternoon on my macbook. I started the browser and loaded the url and using the developer tools/web console opened a tab and then did a setTimeout('opener.document.location.reload()', 30000) in the new tab. This ran for quite some time before I decided to see if setting the never remember history pref would help. I changed the pref to never remember history then clicked the restart button. On the restart the assertion appeared.

The never remember history thing might not be required. Perhaps only restarting/shutting down is required after some period of reloading the page.
Attached file assertion stack
Right, that's showing shutdown cc.  I just tried a few times, and still not managing to reproduce.  But there's a good chance this is dependent on the order in which things get unlinked in cc, which is fairly random....
yeah i also used yesterdays rev / build from tinderbox so nothing special with that build :(
Flags: needinfo?(cbook)
I could obviously create try builds with the patch for people to attempt to reproduce in, but I'm not sure it would mean much if it _didn't_ reproduce in them.  :(
Comment on attachment 8860122 [details] [diff] [review]
Fix stylesheet handling of associated documents in various edge cases

Review of attachment 8860122 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/style/CSSStyleSheet.cpp
@@ +1030,5 @@
>    for (StyleSheet* child = GetFirstChild(); child; ) {
>      NS_ASSERTION(child->mParent == this, "Child sheet is not parented to this!");
>      StyleSheet* next = child->mNext;
>      child->mParent = nullptr;
> +    child->SetAssociatedDocument(nullptr, NotOwnedByDocument);

I guess we should be doing something similar in ServoStyleSheet::ParseSheet.  Filed bug 1358993.
Attachment #8860122 - Flags: review?(cam) → review+
Pushed by bzbarsky@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/38740b783dda
Fix stylesheet handling of associated documents in various edge cases.  r=heycam
https://hg.mozilla.org/mozilla-central/rev/38740b783dda
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Can this ride the 55 train or should we consider backporting it to 54?
Blocks: 1332353
Flags: needinfo?(bzbarsky)
Backport consideration might not be a bad idea.  I think the patch is reasonably safe...
Flags: needinfo?(bzbarsky)
Comment on attachment 8860122 [details] [diff] [review]
Fix stylesheet handling of associated documents in various edge cases

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1332353 added the assertions, but bug
   851892 is the thing that depends on the assertions.
[User impact if declined]:  Possible crashes during cycle collection due to
   dereferencing dead pointers.
[Is this code covered by automated tests?]:  Apparently not well enough.  The
   problem is that the timing has to work just right.
[Has the fix been verified in Nightly?]: Not yet.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: In my opinion, no.
[Why is the change risky/not risky?]: 
[String changes made/needed]: None.
Attachment #8860122 - Flags: approval-mozilla-aurora?
Comment on attachment 8860122 [details] [diff] [review]
Fix stylesheet handling of associated documents in various edge cases

54 is on Beta now :)
Attachment #8860122 - Flags: approval-mozilla-aurora? → approval-mozilla-beta?
Comment on attachment 8860122 [details] [diff] [review]
Fix stylesheet handling of associated documents in various edge cases

Fix an assertion and a potential crash. Beta54+. Should be in 54 beta 3.
Attachment #8860122 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
https://hg.mozilla.org/releases/mozilla-beta/rev/cb989af0ee37
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #17)

> [Is this code covered by automated tests?]:  Apparently not well enough.  The
>    problem is that the timing has to work just right.
> [Needs manual test from QE? If yes, steps to reproduce]: No.

Setting qe-verify- based on Boris' assessment on manual testing.
Flags: qe-verify-
Actually, it would be good if the people who could reproduce this bug initially could try with this fix.  I forgot about that part.
Flags: needinfo?(cbook)
Flags: needinfo?(bob)
Bughunter could not reproduce on Beta or Nightly. I tested locally on Fedora 25 x86_64 with builds I did last evening and found

Release/53 9 out of 20 times.
Beta/54, Nightly/55 0 out of 20 times.
Flags: needinfo?(bob)
Great, thank you!
(In reply to Bob Clary [:bc:] from comment #23)
> Bughunter could not reproduce on Beta or Nightly. I tested locally on Fedora
> 25 x86_64 with builds I did last evening and found
> 
> Release/53 9 out of 20 times.
> Beta/54, Nightly/55 0 out of 20 times.

same here, did manual testing and was not able to reproduce the crash
Flags: needinfo?(cbook)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: