stylo: Crash in nsRuleNode::nsRuleNode

RESOLVED FIXED in Firefox 57

Status

()

defect
P1
critical
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: mccr8, Assigned: bholley)

Tracking

(Blocks 1 bug, {crash, reproducible})

unspecified
mozilla57
Unspecified
All
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox55 unaffected, firefox56 unaffected, firefox57 fixed)

Details

(crash signature)

Attachments

(2 attachments)

(Reporter)

Description

2 years ago
This bug was filed from the Socorro interface and is 
report bp-5763fa78-9707-49e7-a7f6-d36fb0170810.
=============================================================

This is a Windows top crash, on the August 9 Nightly. However, this is from a single installation. But... I was able to actually reproduce this crash, by visiting one of the URLs in the crash reports. Many of the crash reports included a similar URL.

The site this is crashing on is the Kelly Blue Book site:

https://www.kbb.com/toyota/highlander/2017/styles/?stylename=le&intent=buy-new 

Just go to the site and click around a bit, scroll, resize the window maybe, and I was able to reproduce the crash twice. I had to enter a zip code first to interact with the site.

These crashes are all null derefs.
(Reporter)

Comment 1

2 years ago
This crash is present on many branches, but it first showed up on Nightly in the 8-9 build.

Here's the change set for that Nightly: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a921bfb8a2cf3db4d9edebe9b35799a3f9d035da&tochange=4c5fbf49376351679dcc49f4cff26c3c2e055ccc

Emilio has a lot of patches in that range. Any ideas about this crash?
Flags: needinfo?(emilio+bugs)
I suspect this is due to https://hg.mozilla.org/mozilla-central/rev/40807a516e5e.
Flags: needinfo?(emilio+bugs)
I'll investigate this one today.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #2)
> I suspect this is due to
> https://hg.mozilla.org/mozilla-central/rev/40807a516e5e.

The fact that it's a null deref makes this senseless I think... And I haven't been able to repro actually.

Andrew, was this for sure with stylo enabled? That should be a Gecko-only path as far as I can tell.
Flags: needinfo?(continuation)
(Reporter)

Comment 5

2 years ago
(In reply to Emilio Cobos Álvarez [:emilio] from comment #4)
> Andrew, was this for sure with stylo enabled? That should be a Gecko-only
> path as far as I can tell.

I have layout.css.servo.enabled set to true. I'm on OSX.

I'm seeing stuff like "experiments":{"pref-flip-quantum-css-style-r1-1381147":{"branch":"stylo"} in the crash reports. I don't know if that indicates Stylo is enabled or not.
Flags: needinfo?(continuation)
(Reporter)

Comment 6

2 years ago
I can still reproduce this on an 8-11 build. It doesn't happen immediately, but I can do it fairly quickly. I can bisect this today and see if that helps.
(In reply to Andrew McCreight [:mccr8] from comment #5)
> I'm seeing stuff like
> "experiments":{"pref-flip-quantum-css-style-r1-1381147":{"branch":"stylo"}
> in the crash reports. I don't know if that indicates Stylo is enabled or not.

Yes. That means you have been enlisted in the Stylo Nightly experiment, even if you hadn't manually set the layout.css.servo.enabled pref.
Keywords: reproducible
Priority: -- → P2
This is the #3 Windows topcrash in Nightly 20170811100330, with 54 occurrences.
(Assignee)

Updated

2 years ago
Priority: P2 → P1
(Reporter)

Comment 9

2 years ago
I bisected locally using mozregression, and got it down to this range:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=47248637eafa9a38dade8dc3aa6c4736177c8d8d&tochange=fde1450a4368d04e97174e2eb00fb48901179857
There are at least 4 servo commits in that range.
(Reporter)

Comment 10

2 years ago
I'm going to try to bisect further with local builds.
(Reporter)

Updated

2 years ago
OS: Mac OS X → All
(Assignee)

Comment 11

2 years ago
MozReview-Commit-ID: 8Decj2cxySY
Attachment #8897080 - Flags: review?(cam)
(Assignee)

Comment 12

2 years ago
mccr8 caught this in a debug build, and hit an assertion with the following stack. We IRC-debugged things to determine that the caller was using a different style backend than the callee.

    Thread 1 "Web Content" received signal SIGSEGV, Segmentation fault.
    0x00007fffe8986e77 in mozilla::DeclarationBlock::AsGecko (this=<optimized out>)
        at /home/amccreight/mc/obj-dbg.noindex/dist/include/mozilla/DeclarationBlockInlines.h:15
    15      MOZ_DEFINE_STYLO_METHODS(DeclarationBlock, css::Declaration, ServoDeclarationBlock)
    (gdb) bt
    #0  0x00007fffe8986e77 in mozilla::DeclarationBlock::AsGecko (this=<optimized out>)
        at /home/amccreight/mc/obj-dbg.noindex/dist/include/mozilla/DeclarationBlockInlines.h:15
    #1  nsHTMLCSSStyleSheet::ElementRulesMatching (this=<optimized out>, aPresContext=0x7fffb959a000,
        aElement=<optimized out>, aRuleWalker=0x7fffffff7090)
        at /home/amccreight/mc/layout/style/nsHTMLCSSStyleSheet.cpp:72
    #2  0x00007fffe89b340e in EnumRulesMatching<ElementRuleProcessorData> (
        aProcessor=0x7ffff7110540 <_IO_2_1_stderr_>, aData=0x7ffff7111770 <_IO_stdfile_2_lock>)
        at /home/amccreight/mc/layout/style/nsStyleSet.cpp:796
    #3  0x00007fffe89b2ac4 in nsStyleSet::FileRules (this=<optimized out>,
        aCollectorFunc=0x7fffe89b3404 <EnumRulesMatching<ElementRuleProcessorData>(nsIStyleRuleProcessor*, void*)>,
        aData=<optimized out>, aElement=0x7fffba4e8700, aRuleWalker=<optimized out>)
        at /home/amccreight/mc/layout/style/nsStyleSet.cpp:1174
    #4  0x00007fffe89b3278 in nsStyleSet::ResolveStyleForInternal (this=<optimized out>, aElement=0x7fffba4e8700,
        aParentContext=<optimized out>, aTreeMatchContext=..., aAnimationFlag=(unknown: 4160345920))
        at /home/amccreight/mc/layout/style/nsStyleSet.cpp:1354
    #5  0x00007fffe89b309f in nsStyleSet::ResolveStyleFor (this=0x7fffc35743a0, aElement=0x7fffba4e8700,
        aParentContext=0x0, aTreeMatchContext=...) at /home/amccreight/mc/layout/style/nsStyleSet.cpp:1390
    #6  nsStyleSet::ResolveStyleFor (this=0x7ffff7111770 <_IO_stdfile_2_lock>, aElement=0x7fffba4e8700,
        aParentContext=0x0) at /home/amccreight/mc/layout/style/nsStyleSet.cpp:1337
    #7  0x00007fffe893e3f1 in nsStyleSet::ResolveStyleFor (this=0x7fffc35743a0, aElement=0x7fffba4e8700,
        aParentContext=<optimized out>) at /home/amccreight/mc/layout/style/nsStyleSet.h:125
    #8  (anonymous namespace)::StyleResolver::ResolveWithAnimation (aStyleSet=0x7fffc35743a0,
        aElement=<optimized out>, aType=mozilla::CSSPseudoElementType::NotPseudo, aParentContext=<optimized out>,
        aStyleType=nsComputedDOMStyle::eAll, this=<optimized out>, aInDocWithShell=<optimized out>)
        at /home/amccreight/mc/layout/style/nsComputedDOMStyle.cpp:470
    #9  nsComputedDOMStyle::DoGetStyleContextNoFlush (aElement=0x7fffba4e8700, aPseudo=<optimized out>, aPresShell=
        0x7fffc43f3000, aStyleType=nsComputedDOMStyle::eAll, aAnimationFlag=nsComputedDOMStyle::eWithAnimation)
        at /home/amccreight/mc/layout/style/nsComputedDOMStyle.cpp:692
    #10 0x00007fffe893e2ae in nsComputedDOMStyle::GetStyleContextNoFlush (
        aElement=0x7ffff7111770 <_IO_stdfile_2_lock>, aPseudo=<optimized out>, aPresShell=0x7fffc43f3000,
        aStyleType=nsComputedDOMStyle::eAll) at /home/amccreight/mc/layout/style/nsComputedDOMStyle.h:104
    #11 nsComputedDOMStyle::DoGetStyleContextNoFlush (aElement=0x7fffb72d5b20, aPseudo=<optimized out>,
        aPresShell=0x7fffc43f3000, aStyleType=nsComputedDOMStyle::eAll,
        aAnimationFlag=nsComputedDOMStyle::eWithAnimation)
        at /home/amccreight/mc/layout/style/nsComputedDOMStyle.cpp:683
    #12 0x00007fffe893dfb0 in nsComputedDOMStyle::GetStyleContextNoFlush (aElement=0x7fffb72d5b20, aPseudo=0x0,
        aPresShell=0x0, aStyleType=nsComputedDOMStyle::eAll)
        at /home/amccreight/mc/layout/style/nsComputedDOMStyle.h:104
(Assignee)

Updated

2 years ago
Assignee: nobody → bobbyholley
(Assignee)

Comment 13

2 years ago
I contemplated writing a crash test but it would depend on the precise criteria we use to select the backend, which will change quickly and soon go away entirely.
(Reporter)

Comment 14

2 years ago
I wasn't able to reproduce the crash with Bobby's patch applied.
Comment on attachment 8897080 [details] [diff] [review]
Don't mix style backend types in nsComputedDOMStyle. v1

Review of attachment 8897080 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/style/nsComputedDOMStyle.cpp
@@ +586,5 @@
>      presShell = aPresShell;
>      if (!presShell)
>        return nullptr;
> +
> +

Nit: probably don't need two blank lines here.
Attachment #8897080 - Flags: review?(cam) → review+
(Assignee)

Comment 16

2 years ago
Our current machinery for enabling stylo requires a docshell - if there isn't
one, we default to the Gecko style system.

When getComputedStyle operates on an element without a presshell, it uses the
caller's presshell instead. If the element has previously been styled with
one style system (but no longer has a presshell), and the caller uses a
different style backend, using the caller's style system can cause crashes when
we pull bits of cached data off the DOM (like cached style attributes).

So we want to throw when window.getComputedStyle(element) is called for a
(window, element) pair with different style backends (which is what the next
patch in this bug does).

However, that causes a few failures where stylo-backed documents try to do
getComputedStyle on an XHR document (which, without a docshell, will use the
gecko style system).

So this patch does some work to propagate the creator's style backend into
various docshell-less documents. This should allow both chrome (which uses gecko)
and content (which uses stylo) to use getComputedStyle on the response document
for XHRs they create.

Note that the second patch in this bug will make
chromeWin.getComputedStyle(contentObj) throw. If we discover code that does
that, we can just make it invoke the content's getComputedStyle method over Xrays.

MozReview-Commit-ID: 5OsmHJKq5Ui
Attachment #8897586 - Flags: review?(cam)
Attachment #8897586 - Flags: review?(bugs)
Comment on attachment 8897586 [details] [diff] [review]
Inherit style backend into NS_NewDOMDocument. v1

>+  // Try to inherit a style backend.
>+  auto styleBackend = StyleBackendType::None;
>+  nsCOMPtr<nsPIDOMWindowInner> window = do_QueryInterface(mScriptHandlingObject);
>+  if (window && window->GetDoc()) {
Nit, GetExtantDoc()

>+    styleBackend = window->GetDoc()->GetStyleBackendType();
ditto


>+  auto styleBackend = StyleBackendType::None;
>+  nsCOMPtr<nsPIDOMWindowInner> window = do_QueryInterface(global);
>+  if (window && window->GetDoc()) {
ditto

>+    styleBackend = window->GetDoc()->GetStyleBackendType();
ditto

>   XMLHttpRequestMainThread();
> 
>   void Construct(nsIPrincipal* aPrincipal,
>                  nsIGlobalObject* aGlobalObject,
>                  nsIURI* aBaseURI = nullptr,
>                  nsILoadGroup* aLoadGroup = nullptr)
>   {
>     MOZ_ASSERT(aPrincipal);
>-    MOZ_ASSERT_IF(nsCOMPtr<nsPIDOMWindowInner> win = do_QueryInterface(
>-      aGlobalObject), win->IsInnerWindow());
>+    nsCOMPtr<nsPIDOMWindowInner> win = do_QueryInterface(aGlobalObject);
>+    if (win) {
>+      MOZ_ASSERT(win->IsInnerWindow());
>+      if (win->GetDoc()) {
GetExtantDoc()
>+        mStyleBackend = win->GetDoc()->GetStyleBackendType();
and here
...
> 
>+  StyleBackendType mStyleBackend;
Hmm, this isn't initialized always. Initialize in ctor.
Attachment #8897586 - Flags: review?(bugs) → review+
Comment on attachment 8897586 [details] [diff] [review]
Inherit style backend into NS_NewDOMDocument. v1

Review of attachment 8897586 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxSVGGlyphs.cpp
@@ +361,5 @@
>  
>      nsCOMPtr<nsIPrincipal> principal = NullPrincipal::Create();
>  
> +    // XXXbholley: The style backend here probably doesn't matter, since the
> +    // document isn't reachable by content and it never gets styled directly.

Not sure what you mean by "directly", but the document does get styled, in gfxSVGGlyphsDocument::SetupPresentation.  But you're right it doesn't matter for the purpose of outside content reaching into this document.
Attachment #8897586 - Flags: review?(cam) → review+
(Assignee)

Comment 19

2 years ago
Thanks for the reviews!

Comment 20

2 years ago
Pushed by bholley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f10462fa4e89
Inherit style backend into NS_NewDOMDocument. r=smaug,r=heycam
https://hg.mozilla.org/integration/autoland/rev/c5c135b17ec1
Don't mix style backend types in nsComputedDOMStyle. r=heycam

Comment 22

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/f10462fa4e89
https://hg.mozilla.org/mozilla-central/rev/c5c135b17ec1
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Marking status-firefox56=fix-optional because there are only about 11 crash reports with this signature from Beta 56, compared to 300+ on Nightly 57.
(Assignee)

Comment 24

2 years ago
(In reply to Chris Peterson [:cpeterson] from comment #23)
> Marking status-firefox56=fix-optional because there are only about 11 crash
> reports with this signature from Beta 56, compared to 300+ on Nightly 57.

Are we running the beta experiment on 56 yet? This would only happen if stylo was enabled.
Flags: needinfo?(cpeterson)
(Reporter)

Comment 25

2 years ago
Was this a regression from something landed in the 8-9 Nightly build? If that's the case, and the regressor was uplifted, then it might not be in a beta yet.
We are not running the experiment in Beta 56 yet, but we have some Beta users who have manually enabled Stylo. We can uplift this crash fix to Beta if it's not risky to non-Stylo code.
Flags: needinfo?(cpeterson)
(Assignee)

Comment 27

2 years ago
Comment on attachment 8897080 [details] [diff] [review]
Don't mix style backend types in nsComputedDOMStyle. v1

Approval Request Comment
[Feature/Bug causing the regression]: N/A
[User impact if declined]: Possible crashes on the stylo 56 beta experiment.
[Is this code covered by automated tests?]: No.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: STR in comment 0. Given that this is stylo-experiment-only, manual verification probably not required. 
[List of other uplifts needed for the feature/fix]: Part 1 + Part 2 in this bug.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: Mostly just affects stylo. The "Inherit style backend" patch touches code that runs in non-stylo, but only to propagate the style backend, which only has an effect for stylo.
[String changes made/needed]: None.
Attachment #8897080 - Flags: approval-mozilla-beta?
Comment on attachment 8897080 [details] [diff] [review]
Don't mix style backend types in nsComputedDOMStyle. v1

Fix a stylo crash. Beta56+.
Attachment #8897080 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I ended up having to back this out from Beta. The first issue was Rust panics in Stylo reftests:
https://treeherder.mozilla.org/logviewer.html#?job_id=124127205&repo=mozilla-beta

Emilio was able to point me at bug 1388319 for those, and indeed they went away after uplifting that one-liner patch with a=bustage.

However, after that, Stylo reftests were hitting svg-as-image failures:
https://treeherder.mozilla.org/logviewer.html#?job_id=124171318&repo=mozilla-beta

Per IRC discussion with Emilio, it sounds like this is due to bug 1377158 not being on Beta. That's a lot to uplift for a bustage fix, so I had to resort to backing out for the time-being.

https://hg.mozilla.org/releases/mozilla-beta/rev/b25f3c5ec90497426da80de98f38d53282b835ba
Flags: needinfo?(bobbyholley)
(Assignee)

Comment 31

2 years ago
Ok. I guess we'll see the extent to which it crops up on the beta experiment and whether we want to invest the time on the uplift.
Flags: needinfo?(bobbyholley)
status-firefox56=fix-optional because we will only uplift this fix if many Beta 56 users are affected.
I haven't seen this crash signature on Beta 56 yet.

Updated

2 years ago
Depends on: 1398619
You need to log in before you can comment on or make changes to this bug.