Closed Bug 1389300 Opened 4 years ago Closed 4 years ago

stylo: Crash in nsRuleNode::nsRuleNode

(Core :: CSS Parsing and Computation, defect, P1)

Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 --- fixed

(Reporter: mccr8, Assigned: bholley)
(Blocks 1 open bug)
(Keywords: crash, reproducible)
(2 files)

This bug was filed from the Socorro interface and is report bp-5763fa78-9707-49e7-a7f6-d36fb0170810.

This is a Windows top crash, on the August 9 Nightly. However, this is from a single installation. But... I was able to actually reproduce this crash, by visiting one of the URLs in the crash reports. Many of the crash reports included a similar URL.

The site this is crashing on is the Kelley Blue Book site: 

Just go to the site and click around a bit, scroll, resize the window maybe, and I was able to reproduce the crash twice. I had to enter a zip code first to interact with the site.

These crashes are all null derefs.
This crash is present on many branches, but it first showed up on Nightly in the 8-9 build.

Here's the change set for that Nightly:

Emilio has a lot of patches in that range. Any ideas about this crash?
Flags: needinfo?(emilio+bugs)
I suspect this is due to
Flags: needinfo?(emilio+bugs)
I'll investigate this one today.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #2)
> I suspect this is due to

The fact that it's a null deref makes this senseless I think... And I haven't been able to repro actually.

Andrew, was this for sure with stylo enabled? That should be a Gecko-only path as far as I can tell.
Flags: needinfo?(continuation)
(In reply to Emilio Cobos Álvarez [:emilio] from comment #4)
> Andrew, was this for sure with stylo enabled? That should be a Gecko-only
> path as far as I can tell.

I have layout.css.servo.enabled set to true. I'm on OSX.

I'm seeing stuff like "experiments":{"pref-flip-quantum-css-style-r1-1381147":{"branch":"stylo"} in the crash reports. I don't know if that indicates Stylo is enabled or not.
Flags: needinfo?(continuation)
I can still reproduce this on an 8-11 build. It doesn't happen immediately, but I can do it fairly quickly. I can bisect this today and see if that helps.
(In reply to Andrew McCreight [:mccr8] from comment #5)
> I'm seeing stuff like
> "experiments":{"pref-flip-quantum-css-style-r1-1381147":{"branch":"stylo"}
> in the crash reports. I don't know if that indicates Stylo is enabled or not.

Yes. That means you have been enlisted in the Stylo Nightly experiment, even if you hadn't manually set the layout.css.servo.enabled pref.
Keywords: reproducible
Priority: -- → P2
This is the #3 Windows topcrash in Nightly 20170811100330, with 54 occurrences.
Priority: P2 → P1
I bisected locally using mozregression, and got it down to this range:
There are at least 4 servo commits in that range.
I'm going to try to bisect further with local builds.
OS: Mac OS X → All
MozReview-Commit-ID: 8Decj2cxySY
Attachment #8897080 - Flags: review?(cam)
mccr8 caught this in a debug build, and hit an assertion with the following stack. We IRC-debugged things to determine that the caller was using a different style backend than the callee.

    Thread 1 "Web Content" received signal SIGSEGV, Segmentation fault.
    0x00007fffe8986e77 in mozilla::DeclarationBlock::AsGecko (this=<optimized out>)
        at /home/amccreight/mc/obj-dbg.noindex/dist/include/mozilla/DeclarationBlockInlines.h:15
    15      MOZ_DEFINE_STYLO_METHODS(DeclarationBlock, css::Declaration, ServoDeclarationBlock)
    (gdb) bt
    #0  0x00007fffe8986e77 in mozilla::DeclarationBlock::AsGecko (this=<optimized out>)
        at /home/amccreight/mc/obj-dbg.noindex/dist/include/mozilla/DeclarationBlockInlines.h:15
    #1  nsHTMLCSSStyleSheet::ElementRulesMatching (this=<optimized out>, aPresContext=0x7fffb959a000,
        aElement=<optimized out>, aRuleWalker=0x7fffffff7090)
        at /home/amccreight/mc/layout/style/nsHTMLCSSStyleSheet.cpp:72
    #2  0x00007fffe89b340e in EnumRulesMatching<ElementRuleProcessorData> (
        aProcessor=0x7ffff7110540 <_IO_2_1_stderr_>, aData=0x7ffff7111770 <_IO_stdfile_2_lock>)
        at /home/amccreight/mc/layout/style/nsStyleSet.cpp:796
    #3  0x00007fffe89b2ac4 in nsStyleSet::FileRules (this=<optimized out>,
        aCollectorFunc=0x7fffe89b3404 <EnumRulesMatching<ElementRuleProcessorData>(nsIStyleRuleProcessor*, void*)>,
        aData=<optimized out>, aElement=0x7fffba4e8700, aRuleWalker=<optimized out>)
        at /home/amccreight/mc/layout/style/nsStyleSet.cpp:1174
    #4  0x00007fffe89b3278 in nsStyleSet::ResolveStyleForInternal (this=<optimized out>, aElement=0x7fffba4e8700,
        aParentContext=<optimized out>, aTreeMatchContext=..., aAnimationFlag=(unknown: 4160345920))
        at /home/amccreight/mc/layout/style/nsStyleSet.cpp:1354
    #5  0x00007fffe89b309f in nsStyleSet::ResolveStyleFor (this=0x7fffc35743a0, aElement=0x7fffba4e8700,
        aParentContext=0x0, aTreeMatchContext=...) at /home/amccreight/mc/layout/style/nsStyleSet.cpp:1390
    #6  nsStyleSet::ResolveStyleFor (this=0x7ffff7111770 <_IO_stdfile_2_lock>, aElement=0x7fffba4e8700,
        aParentContext=0x0) at /home/amccreight/mc/layout/style/nsStyleSet.cpp:1337
    #7  0x00007fffe893e3f1 in nsStyleSet::ResolveStyleFor (this=0x7fffc35743a0, aElement=0x7fffba4e8700,
        aParentContext=<optimized out>) at /home/amccreight/mc/layout/style/nsStyleSet.h:125
    #8  (anonymous namespace)::StyleResolver::ResolveWithAnimation (aStyleSet=0x7fffc35743a0,
        aElement=<optimized out>, aType=mozilla::CSSPseudoElementType::NotPseudo, aParentContext=<optimized out>,
        aStyleType=nsComputedDOMStyle::eAll, this=<optimized out>, aInDocWithShell=<optimized out>)
        at /home/amccreight/mc/layout/style/nsComputedDOMStyle.cpp:470
    #9  nsComputedDOMStyle::DoGetStyleContextNoFlush (aElement=0x7fffba4e8700, aPseudo=<optimized out>, aPresShell=
        0x7fffc43f3000, aStyleType=nsComputedDOMStyle::eAll, aAnimationFlag=nsComputedDOMStyle::eWithAnimation)
        at /home/amccreight/mc/layout/style/nsComputedDOMStyle.cpp:692
    #10 0x00007fffe893e2ae in nsComputedDOMStyle::GetStyleContextNoFlush (
        aElement=0x7ffff7111770 <_IO_stdfile_2_lock>, aPseudo=<optimized out>, aPresShell=0x7fffc43f3000,
        aStyleType=nsComputedDOMStyle::eAll) at /home/amccreight/mc/layout/style/nsComputedDOMStyle.h:104
    #11 nsComputedDOMStyle::DoGetStyleContextNoFlush (aElement=0x7fffb72d5b20, aPseudo=<optimized out>,
        aPresShell=0x7fffc43f3000, aStyleType=nsComputedDOMStyle::eAll,
        at /home/amccreight/mc/layout/style/nsComputedDOMStyle.cpp:683
    #12 0x00007fffe893dfb0 in nsComputedDOMStyle::GetStyleContextNoFlush (aElement=0x7fffb72d5b20, aPseudo=0x0,
        aPresShell=0x0, aStyleType=nsComputedDOMStyle::eAll)
        at /home/amccreight/mc/layout/style/nsComputedDOMStyle.h:104
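
For anyone unfamiliar with the AsGecko() frame at the top of that stack, here is a minimal sketch of a tag-checked downcast of that general shape. It assumes a base class that records which backend created the object. The names GeckoDeclaration and ServoDeclaration and the use of plain assert are illustrative stand-ins, not the actual expansion of MOZ_DEFINE_STYLO_METHODS.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the real Gecko classes.
enum class StyleBackendType { None, Gecko, Servo };

class GeckoDeclaration;
class ServoDeclaration;

class DeclarationBlock {
public:
  bool IsGecko() const { return mType == StyleBackendType::Gecko; }
  bool IsServo() const { return mType == StyleBackendType::Servo; }

  // Checked downcasts: each one is only valid when the tag matches.
  GeckoDeclaration* AsGecko();
  ServoDeclaration* AsServo();

protected:
  explicit DeclarationBlock(StyleBackendType aType) : mType(aType) {}

private:
  const StyleBackendType mType;
};

class GeckoDeclaration final : public DeclarationBlock {
public:
  GeckoDeclaration() : DeclarationBlock(StyleBackendType::Gecko) {}
};

class ServoDeclaration final : public DeclarationBlock {
public:
  ServoDeclaration() : DeclarationBlock(StyleBackendType::Servo) {}
};

inline GeckoDeclaration* DeclarationBlock::AsGecko() {
  // Fires when a Gecko-only code path is handed a Servo-backed object.
  assert(IsGecko() && "caller assumed the Gecko backend");
  return static_cast<GeckoDeclaration*>(this);
}

inline ServoDeclaration* DeclarationBlock::AsServo() {
  // Fires when a Servo-only code path is handed a Gecko-backed object.
  assert(IsServo() && "caller assumed the Servo backend");
  return static_cast<ServoDeclaration*>(this);
}
```

In this sketch, calling AsGecko() on a ServoDeclaration trips the assertion in a debug build. That matches the kind of failure described above, where a Gecko-only path such as nsHTMLCSSStyleSheet::ElementRulesMatching received data cached by the other backend.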
Assignee: nobody → bobbyholley
I contemplated writing a crash test but it would depend on the precise criteria we use to select the backend, which will change quickly and soon go away entirely.
I wasn't able to reproduce the crash with Bobby's patch applied.
Comment on attachment 8897080 [details] [diff] [review]
Don't mix style backend types in nsComputedDOMStyle. v1

Review of attachment 8897080 [details] [diff] [review]:

::: layout/style/nsComputedDOMStyle.cpp
@@ +586,5 @@
>      presShell = aPresShell;
>      if (!presShell)
>        return nullptr;
> +
> +

Nit: probably don't need two blank lines here.
Attachment #8897080 - Flags: review?(cam) → review+
Our current machinery for enabling stylo requires a docshell - if there isn't
one, we default to the Gecko style system.

When getComputedStyle operates on an element without a presshell, it uses the
caller's presshell instead. If the element has previously been styled with
one style system (but no longer has a presshell), and the caller uses a
different style backend, using the caller's style system can cause crashes when
we pull bits of cached data off the DOM (like cached style attributes).

So we want to throw when window.getComputedStyle(element) is called for a
(window, element) pair with different style backends (which is what the next
patch in this bug does).

However, that causes a few failures where stylo-backed documents try to do
getComputedStyle on an XHR document (which, without a docshell, will use the
gecko style system).

So this patch does some work to propagate the creator's style backend into
various docshell-less documents. This should allow both chrome (which uses gecko)
and content (which uses stylo) to use getComputedStyle on the response document
for XHRs they create.

Note that the second patch in this bug will make
chromeWin.getComputedStyle(contentObj) throw. If we discover code that does
that, we can just make it invoke the content's getComputedStyle method over Xrays.

MozReview-Commit-ID: 5OsmHJKq5Ui
Attachment #8897586 - Flags: review?(cam)
Attachment #8897586 - Flags: review?(bugs)
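
A rough sketch of the two ideas in the commit message above, using a toy model rather than the real Gecko types. It assumes a simplified Document struct, a hypothetical NewDataDocument() helper standing in for the docshell-less creation paths, and a hypothetical CanComputeStyle() predicate where the real patch would throw. The order in which the inherited backend and the docshell check are consulted is also an assumption.

```cpp
#include <cassert>

enum class StyleBackendType { None, Gecko, Servo };

// Hypothetical, heavily simplified document model.
struct Document {
  bool hasDocShell = false;
  // What the docshell-based stylo check would report, if there is a docshell.
  bool styloEnabledForDocShell = false;
  // Backend propagated from the creator; None means "nothing inherited".
  StyleBackendType inheritedBackend = StyleBackendType::None;

  StyleBackendType GetStyleBackendType() const {
    if (inheritedBackend != StyleBackendType::None) {
      return inheritedBackend;
    }
    if (hasDocShell && styloEnabledForDocShell) {
      return StyleBackendType::Servo;
    }
    // No docshell and nothing inherited: fall back to Gecko.
    return StyleBackendType::Gecko;
  }
};

// Stand-in for creating a docshell-less document (e.g. an XHR response).
// The new document takes on the creator's backend when a creator is known.
inline Document NewDataDocument(const Document* aCreator) {
  Document doc;
  if (aCreator) {
    doc.inheritedBackend = aCreator->GetStyleBackendType();
  }
  return doc;
}

// Stand-in for the getComputedStyle guard: false means "would throw".
inline bool CanComputeStyle(const Document& aCallerDoc,
                            const Document& aElementDoc) {
  return aCallerDoc.GetStyleBackendType() == aElementDoc.GetStyleBackendType();
}
```

Under these assumptions, a stylo-backed content document that creates an XHR response document gets a Servo-backed result and passes the guard, while a Gecko-backed chrome caller reaching into that same document is rejected.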
Comment on attachment 8897586 [details] [diff] [review]
Inherit style backend into NS_NewDOMDocument. v1

>+  // Try to inherit a style backend.
>+  auto styleBackend = StyleBackendType::None;
>+  nsCOMPtr<nsPIDOMWindowInner> window = do_QueryInterface(mScriptHandlingObject);
>+  if (window && window->GetDoc()) {
Nit, GetExtantDoc()

>+    styleBackend = window->GetDoc()->GetStyleBackendType();

>+  auto styleBackend = StyleBackendType::None;
>+  nsCOMPtr<nsPIDOMWindowInner> window = do_QueryInterface(global);
>+  if (window && window->GetDoc()) {

>+    styleBackend = window->GetDoc()->GetStyleBackendType();

>   XMLHttpRequestMainThread();
>   void Construct(nsIPrincipal* aPrincipal,
>                  nsIGlobalObject* aGlobalObject,
>                  nsIURI* aBaseURI = nullptr,
>                  nsILoadGroup* aLoadGroup = nullptr)
>   {
>     MOZ_ASSERT(aPrincipal);
>-    MOZ_ASSERT_IF(nsCOMPtr<nsPIDOMWindowInner> win = do_QueryInterface(
>-      aGlobalObject), win->IsInnerWindow());
>+    nsCOMPtr<nsPIDOMWindowInner> win = do_QueryInterface(aGlobalObject);
>+    if (win) {
>+      MOZ_ASSERT(win->IsInnerWindow());
>+      if (win->GetDoc()) {
>+        mStyleBackend = win->GetDoc()->GetStyleBackendType();
and here
>+  StyleBackendType mStyleBackend;
Hmm, this isn't initialized always. Initialize in ctor.
Attachment #8897586 - Flags: review?(bugs) → review+
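
To illustrate the "Initialize in ctor" point: a small sketch, assuming a hypothetical Request class in place of XMLHttpRequestMainThread and a pointer parameter in place of the window lookup. Since Construct() only assigns the member when a creator is found, the constructor has to give it a defined value first.

```cpp
#include <cassert>

enum class StyleBackendType { None, Gecko, Servo };

// Hypothetical stand-in for a class with a conditionally assigned member.
class Request {
public:
  // Give the member a defined value up front, so reading it is safe even
  // when Construct() never assigns it.
  Request() : mStyleBackend(StyleBackendType::None) {}

  // Same shape as the patch's Construct(): only overwrite the member when
  // a creator backend is actually available.
  void Construct(const StyleBackendType* aCreatorBackend) {
    if (aCreatorBackend) {
      mStyleBackend = *aCreatorBackend;
    }
  }

  StyleBackendType StyleBackend() const { return mStyleBackend; }

private:
  StyleBackendType mStyleBackend;
};
```

Without the initializer in the constructor, the Construct(nullptr) path would leave mStyleBackend indeterminate, and reading it later would be undefined behavior.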
Comment on attachment 8897586 [details] [diff] [review]
Inherit style backend into NS_NewDOMDocument. v1

Review of attachment 8897586 [details] [diff] [review]:

::: gfx/thebes/gfxSVGGlyphs.cpp
@@ +361,5 @@
>      nsCOMPtr<nsIPrincipal> principal = NullPrincipal::Create();
> +    // XXXbholley: The style backend here probably doesn't matter, since the
> +    // document isn't reachable by content and it never gets styled directly.

Not sure what you mean by "directly", but the document does get styled, in gfxSVGGlyphsDocument::SetupPresentation.  But you're right it doesn't matter for the purpose of outside content reaching into this document.
Attachment #8897586 - Flags: review?(cam) → review+
Thanks for the reviews!
Pushed by
Inherit style backend into NS_NewDOMDocument. r=smaug,r=heycam
Don't mix style backend types in nsComputedDOMStyle. r=heycam
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Marking status-firefox56=fix-optional because there are only about 11 crash reports with this signature from Beta 56, compared to 300+ on Nightly 57.
(In reply to Chris Peterson [:cpeterson] from comment #23)
> Marking status-firefox56=fix-optional because there are only about 11 crash
> reports with this signature from Beta 56, compared to 300+ on Nightly 57.

Are we running the beta experiment on 56 yet? This would only happen if stylo was enabled.
Flags: needinfo?(cpeterson)
Was this a regression from something landed in the 8-9 Nightly build? If that's the case, and the regressor was uplifted, then it might not be in a beta yet.
We are not running the experiment in Beta 56 yet, but we have some Beta users who have manually enabled Stylo. We can uplift this crash fix to Beta if it's not risky to non-Stylo code.
Flags: needinfo?(cpeterson)
Comment on attachment 8897080 [details] [diff] [review]
Don't mix style backend types in nsComputedDOMStyle. v1

Approval Request Comment
[Feature/Bug causing the regression]: N/A
[User impact if declined]: Possible crashes on the stylo 56 beta experiment.
[Is this code covered by automated tests?]: No.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: STR in comment 0. Given that this is stylo-experiment-only, manual verification probably not required. 
[List of other uplifts needed for the feature/fix]: Part 1 + Part 2 in this bug.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: Mostly just affects stylo. The "Inherit style backend" patch touches code that runs in non-stylo, but only to propagate the style backend, which only has an effect for stylo.
[String changes made/needed]: None.
Attachment #8897080 - Flags: approval-mozilla-beta?
Comment on attachment 8897080 [details] [diff] [review]
Don't mix style backend types in nsComputedDOMStyle. v1

Fix a stylo crash. Beta56+.
Attachment #8897080 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I ended up having to back this out from Beta. The first issue was Rust panics in Stylo reftests:

Emilio was able to point me at bug 1388319 for those, and indeed they went away after uplifting that one-liner patch with a=bustage.

However, after that, Stylo reftests were hitting svg-as-image failures:

Per IRC discussion with Emilio, it sounds like this is due to bug 1377158 not being on Beta. That's a lot to uplift for a bustage fix, so I had to resort to backing out for the time being.
Flags: needinfo?(bobbyholley)
Ok. I guess we'll see the extent to which it crops up on the beta experiment and whether we want to invest the time on the uplift.
Flags: needinfo?(bobbyholley)
status-firefox56=fix-optional because we will only uplift this fix if many Beta 56 users are affected.
I haven't seen this crash signature on Beta 56 yet.
Depends on: 1398619