Closed Bug 1384162 Opened 7 years ago Closed 7 years ago

Stylo: SIGSEGEV @ mozilla::StyleSetHandle::Ptr::AddDocStyleSheet, with failure in MOZ_RELEASE_ASSERT(IsServo())

Categories

(Core :: CSS Parsing and Computation, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 --- unaffected
firefox56 --- fixed

People

(Reporter: kubrick, Assigned: xidorn)

References

Details

(Keywords: crash)

Crash Data

Attachments

(6 files, 1 obsolete file)

British Airways has on its check-in page a "print boarding pass" feature. It shows the boarding pass in an i-frame and a print button that prints that i-frame.
With servo css enabled this always results in a crash (works fine without servo css).
Crash reports from reporter:
bp-2bbbfc55-ab91-4c6f-9561-0dd640170725
bp-490ec41f-a2c1-4f0c-8f42-ea5650170725
bp-f6119122-4eb0-4665-9c82-a444c0170725
Crash Signature: bp-2bbbfc55-ab91-4c6f-9561-0dd640170725 bp-490ec41f-a2c1-4f0c-8f42-ea5650170725 bp-f6119122-4eb0-4665-9c82-a444c0170725 → [@ mozilla::StyleSetHandle::Ptr::AddDocStyleSheet]
Thank you for the report!

If you can still visit the affected page: could you try saving the affected page (as "Web Page: Complete" to try to capture all of the resources), and see if the saved copy reproduces the bug?

If the saved copy does reproduce the crash as well, that could help immensely with trying to debug this.  (Perhaps best not to post the saved copy as an attachment here, in case it has personal information, perhaps you could email it to a developer working on stylo -- like xidorn or emilio or heycam -- and I'm sure one of them would be sure to keep its details private.)

(I just briefly tried printing a simple page with an iframe, with stylo enabled...
 data:text/html,<iframe src="http://example.org">
...but I didn't hit any problems, so there must be something more complex going on here.)
Flags: needinfo?(kubrick)
Keywords: crash
Summary: Stylo: SIGSEGEV @ mozilla::StyleSetHandle::Ptr::AddDocStyleSheet → Stylo: SIGSEGEV @ mozilla::StyleSetHandle::Ptr::AddDocStyleSheet, with failure in MOZ_RELEASE_ASSERT(IsServo())
OK, I just reproduced something like this, twice, with a fresh profile and a relatively simple testcase -- so maybe save-as-complete isn't necessary after all.

My crash reports:
 bp-7ebd4888-bf7d-423b-b658-62a5f0170725
 bp-1c014818-dd2d-4593-b27e-374970170725

My rough STR:
 0. Start out with stylo disabled.
 1. Load a page with an iframe.
 2. In a separate tab, flip layout.css.servo.enabled to "true"
 3. Back in the tab with your testcase, set the iframe's "src" attribute to https://example.org
document.getElementsByTagName("iframe")[0].setAttribute("src", "http://example.org")
 4. Print the testcase to a file.

I'll post a testcase to make this simpler, too. My first crash report here was with a simple data URL; my second is with the testcase that I'm about to attach. I can't reproduce 100%, though -- I'm not sure if it's just hard to trigger, or if there's some extra step that I haven't noticed/captured yet.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I found Xidorn Quan GPG key so I'll send the saved page in an encrypted email anyway, in case it helps. Thanks!
Flags: needinfo?(kubrick)
(In reply to Francois Guerraz from comment #4)
> I found Xidorn Quan GPG key so I'll send the saved page in an encrypted
> email anyway, in case it helps. Thanks!

Great, thank you!

(BTW: my crash reports are in mozilla::StyleSetHandle::Ptr::PrependStyleSheet rather than AddDocStyleSheet, but the rest of the stack is the same and it's the same failing assertion, so I think it's still the same issue.)

Francois, new question for you: how recently had you flipped the layout.css.servo.enabled about:config pref, when you hit this bug?  Was it very recent? (like, after the page had partially loaded?) I'm guessing that might've been the case, based on the assertion, but it's kind of a shot in the dark.  So I've included that in my tentative STR (comment 3) but I'm not 100% sure whether it's necessary.
Here's my testcase for comment 3. My second crash report,  bp-1c014818-dd2d-4593-b27e-374970170725, was from me using this testcase (locally).
No, I had enabled the preference and restarted the browser just in case, then after a while completed the check in procedure on ba.co.uk. So no chance it was partially loaded.
I can reproduce this issue with the page Francois sent to me. It happens with a stylo-by-default local build, so it is unrelated to switching the flag. The assertion matches that in the crash report.

I'll look into this.

Thanks for sending me that page. That helps a lot.
STR:
1. open this testcase
2. do print preview (or just print to file)

It crashes reliably with the same stack.
Hmmm, actually no script is needed at all.
Attachment #8890174 - Attachment is obsolete: true
So the problem here is that, for the Servo-styled document, there is a Gecko stylesheet... How can this happen?

And interestingly, the stylesheet in nsIDocument::mStyleSheets doesn't have owning node, while the <style> element doesn't have stylesheet connected.
When we print, we create a clone of the document, with nsIDocument::CreateStaticClone.  Is it possible we clone the document, it gets a different StyleBackendType, but we add the same style sheets (with mismatching StyleBackendType) from the primary document?
That would be weird as well. Why would the original document being a Gecko-styled document when Stylo is enabled? Maybe the backend type isn't updated?
OK so script is necessary here.

When we get the content document of the iframe synchronously, it is the document which before any navigation happens. This document seems to be a Gecko-styled document even with Stylo enabled. And if we write things into it, this document would not navigate away, so now we have a Gecko-styled document inside a Servo-styled document.

It is not cleared to me the detailed mechanism for this (and I suppose this is what the spec says). I guess what we probably need to do is to ensure the initial document (before any navigation) is a Servo-styled document as well.
So in this case, nsIDocument::UpdateStyleBackendType() sets the backend to Gecko because there is no mDocumentContainer. And there is no document container because UpdateStyleBackendType is called before we actually set the container.

Code-wise, in nsDocShell::CreateAboutBlankContentViewer, we call into nsContentDLF::CreateBlankDocument at [1], and several lines after that, we set the container.

However, in nsContentDLF::CreateBlankDocument, we create and insert several elements (html, head, body) into the blank document at [2] before returning. Inserting new element into a document triggers UpdateStyleBackendType() via UnsetRestyleFlagsIfGecko() in Element::BindToTree at [3].

I haven't had any idea what is the best approach to fix this issue. Probably we can add some check to nsINode::UnsetRestyleFlagsIfGecko to make it not do anything when the backend is not yet set, but I'm not sure whether that is safe.

[1] https://searchfox.org/mozilla-central/rev/8a61c71153a79cda2e1ae7d477564347c607cc5f/docshell/base/nsDocShell.cpp#8189
[2] https://searchfox.org/mozilla-central/rev/8a61c71153a79cda2e1ae7d477564347c607cc5f/layout/build/nsContentDLF.cpp#318
[3] https://searchfox.org/mozilla-central/rev/8a61c71153a79cda2e1ae7d477564347c607cc5f/dom/base/Element.cpp#1626
Alternatively, we can de-XPCOM the createBlankDocument function, and then move some of the logic from nsDocShell::CreateAboutBlankContentViewer into the nsContentDLF::CreateBlankDocument function.
Assignee: nobody → xidorn+moz
Comment on attachment 8890283 [details]
Bug 1384162 part 1 - DeCOMtaminate nsContentDLF::CreateBlankDocument.

https://reviewboard.mozilla.org/r/161396/#review166822

r=me given the next set of commits that address the comments I was about to type here.  ;)
Attachment #8890283 - Flags: review?(bzbarsky) → review+
Comment on attachment 8890284 [details]
Bug 1384162 part 2 - Mondernize nsContentDLF::CreateBlankDocument.

https://reviewboard.mozilla.org/r/161398/#review166826

r=me
Attachment #8890284 - Flags: review?(bzbarsky) → review+
Comment on attachment 8890285 [details]
Bug 1384162 part 3 - Move document container setting into nsContentDLF::CreateBlankDocument before adding any nodes.

https://reviewboard.mozilla.org/r/161400/#review166828

r=me
Attachment #8890285 - Flags: review?(bzbarsky) → review+
Comment on attachment 8890286 [details]
Bug 1384162 part 4 - Add test to ensure that blank document uses the same backend as its parent document and update test expectation.

https://reviewboard.mozilla.org/r/161402/#review166830

r=me on the new test, but the wpt thing doesn't make sense to me.

::: testing/web-platform/meta/cssom-view/matchMedia.xht.ini:8
(Diff revision 2)
>    expected: TIMEOUT
>    [window.matchMedia exists]
>      expected: FAIL
>  
>    [MediaQueryList.matches for "(max-width: 199px), all and (min-width: 200px)"]
> -    expected: FAIL
> +    expected:

Does this part really belong in this diff?  If it's fixed by one of the earlier patches, it should belong in there; if it's not related to this bug, it shouldn't be here at all.
Attachment #8890286 - Flags: review?(bzbarsky) → review+
Comment on attachment 8890286 [details]
Bug 1384162 part 4 - Add test to ensure that blank document uses the same backend as its parent document and update test expectation.

https://reviewboard.mozilla.org/r/161402/#review166830

> Does this part really belong in this diff?  If it's fixed by one of the earlier patches, it should belong in there; if it's not related to this bug, it shouldn't be here at all.

That should be in part 3 then. I just get used to having all test expectation changes in one commit...
Pushed by xquan@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d2b2de8c314b
part 1 - DeCOMtaminate nsContentDLF::CreateBlankDocument. r=bz
https://hg.mozilla.org/integration/autoland/rev/e0a391cfa6d6
part 2 - Mondernize nsContentDLF::CreateBlankDocument. r=bz
https://hg.mozilla.org/integration/autoland/rev/558d00335f14
part 3 - Move document container setting into nsContentDLF::CreateBlankDocument before adding any nodes. r=bz
https://hg.mozilla.org/integration/autoland/rev/0b3e4b5b8e4d
part 4 - Add test to ensure that blank document uses the same backend as its parent document and update test expectation. r=bz
Depends on: 1388031
No longer depends on: 1388031
Blocks: 1383845
You need to log in before you can comment on or make changes to this bug.