Closed Bug 1391444 Opened 7 years ago Closed 7 years ago

stylo: Crash in core::option::expect_failed | geckoservo::glue::Servo_ResolveStyle on Battle.net websites

Categories

(Core :: CSS Parsing and Computation, defect, P1)

Unspecified
All
defect

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 --- fixed

People

(Reporter: mccr8, Assigned: emilio)

References

(Blocks 2 open bugs, )

Details

(Keywords: crash, regression, topcrash)

Crash Data

This bug was filed from the Socorro interface and is 
report bp-676f32c2-a5ed-409e-939c-fc24c0170816.
=============================================================

A patch for this signature landed in bug 1390179, but this crash continues.

There are a suspiciously large number of URLs for Battle.net in the crash reports, such as: https://us.battle.net//d3/en/blog/20976068

I haven't been able to reproduce myself yet.
Summary: servo: Crash in core::option::expect_failed | geckoservo::glue::Servo_ResolveStyle on Battle.net websites → stylo: Crash in core::option::expect_failed | geckoservo::glue::Servo_ResolveStyle on Battle.net websites
Yeah, this signature basically means "we screwed up somewhere and forgot to style this element". STR would be very useful.
Some other URLs:
https://eu.battle.net/forums/fr/overwatch/
https://us.battle.net/d3/en/
https://www.battlenet.com.cn/shop/zh/

I have a Battle.net account, so I can check later today if I can trigger this crash when I'm logged in.
Nightly 57 x64 20170817100132 @ Debian Testing
bp-ad8868d9-f77b-4ebc-9188-8baad0170817
bp-c0f77938-3158-43e0-83af-2fe120170817
Got it 2 times. Comments section. Without being logged in.
Still searching for clear STR.
Crash Signature: [@ core::option::expect_failed | geckoservo::glue::Servo_ResolveStyle] → [@ core::option::expect_failed | geckoservo::glue::Servo_ResolveStyle] [@ mozalloc_abort | abort | core::option::expect_failed | geckoservo::glue::Servo_ResolveStyle ]
OS: Windows 7 → All
I was able to reproduce with latest Linux Nightly, with a fresh profile with these STR:
 (0) Visit about:config & toggle layout.css.servo.enabled to true
 (1) Visit https://us.battle.net/forums/en/overwatch/
 (2) Log in with battle.net username/password.
 (3) If you haven't crashed yet, reload.

I was also able to reproduce in a different fresh profile by simply visiting https://us.battle.net/forums/en/overwatch/ (without logging in) and simply reloading a few times.  So login may help trigger it, but doesn't seem to be required.

Sample crash reports from a few minutes of testing:
bp-c3a5ba6b-fd86-4f93-88b4-6fdf80170817
bp-20ca823b-050f-4a09-962c-c6e760170817
bp-de5a7536-f020-424b-825b-ab5a50170817
bp-4718d78d-1981-4077-b709-855f10170817
bp-6d8bc1ff-ba32-40a2-92e1-f98770170817
bp-1289d14a-07cc-461c-ae28-fe4b60170817
bp-4e0655da-349e-4f17-b228-f23b80170817
bp-a34d5bc3-6ca3-4093-ba3a-bff990170817
bp-d78ee087-67dc-412e-a2f4-873900170817
bp-fa778b70-b39d-43ea-b1f3-8bd140170817
I also managed to reproduce this issue (without being logged in) on this page:
   https://us.battle.net/d3/en/item/circle-of-nailujs-evol
There's a column on the right side that says "See also legendary rings", and I clicked on one of the links there to another page, and then I repeated that process 3 or 4 times and it crashed: bp-138304ed-bea2-4060-8d10-c5a8e0170817
1. https://us.battle.net/login/en/
2. Press F5 fast.
?
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #6)
> 1. https://us.battle.net/login/en/
> 2. Press F5 fast.

Sorry: often! (not too fast :D)
URLs containing battle.net first show up in this signature on the 20170814100258 build, so it is some kind of recent regression. Here's the new patches in that build: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e928c65095ed32151bbe6e462eda3e8c79edca9f&tochange=df9beb781895fcd0493c21e95ad313e0044515ec
P1 because there are over 600 crash reports, including a dozen from Beta 56.
Keywords: regression, topcrash
(In reply to Chris Peterson [:cpeterson] from comment #9)
> P1 because there are over 600 crash reports, including a dozen from Beta 56.

This signature is extremely generic, so this may not be the same issue as what we are seeing on beta. The beta crashes included some URLs, but no Battle.net ones. Plus I don't know if anything that landed in the last few days could have been uplifted to beta and shipped to users yet.
(In reply to Andrew McCreight [:mccr8] from comment #8)
> URLs containing battle.net first show up in this signature on the
> 20170814100258 build, so it is some kind of recent regression. Here's the
> new patches in that build:
> https://hg.mozilla.org/mozilla-central/
> pushloghtml?fromchange=e928c65095ed32151bbe6e462eda3e8c79edca9f&tochange=df9b
> eb781895fcd0493c21e95ad313e0044515ec

I suspect the patches from bug 1389790.

But all this means is that we have dirty bits on display: none content (or similar) somewhere, which is a bug on its own...
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #6)
> 1. https://us.battle.net/login/en/
> 2. Press F5 often.

An approach? Don't know.
I did not get a crash so far when javascript.enabled=false, but I see an unstyled page for a second sometimes (it's real xhtml). I haven't seen that unstyled second with an enabled javascript yet.
And I know such an unstyled second (just press F5 fast and often enough) from https://fastaim.de (xhtml too, but has deferred scripts), but I don't get a crash there.
This is the #5 Windows topcrash in Nightly 20170818100226.
I suspect the fishiness with dirty bits mentioned in bug 1383332 are the cause, and also the fix suggested there is the correct fix. I can try to get the correct fix landed separately, will post a patch.
(so I don't forget)
Flags: needinfo?(emilio+bugs)
I submitted https://github.com/servo/servo/pull/18173 to fix this. I couldn't reproduce with those patches, but I couldn't repro on nightly either, so...
Flags: needinfo?(emilio+bugs)
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #6)
> 1. https://us.battle.net/login/en/
> 2. Press F5 often

Nightly 57 x64 20170821100350 @ Debian Testing (without your patch)
Something has changed: Socorro now gets an "Internal Server Error", but only on (my) tab crashes of battle.net.
Filed bug 1392412.
Assigning to Emilio because he landed a possible fix.
Assignee: nobody → emilio+bugs
I can still reproduce this with my STR from comment 5, on the latest Nightly that has the fix that was landed. I showed bholley how to reproduce it, too.
It's hard to reproduce this on debug build since I get SIGPIPE (bug 767815?).  Unfotunately I did not use rr so, I don't know what's going on there but anyway, what I knew is:

1) The unstyled element is <script>, its parent is <body>
2) The element is not "WAS_RESTYLED".
3) oldStyleContext is null, so we try to grab the style here [1]

[1] https://hg.mozilla.org/mozilla-central/file/b911a4c97fde/layout/base/ServoRestyleManager.cpp#l754
(In reply to Hiroyuki Ikezoe (:hiro) from comment #21)
> It's hard to reproduce this on debug build since I get SIGPIPE (bug
> 767815?).  Unfotunately I did not use rr so, I don't know what's going on
> there but anyway, what I knew is:
> 
> 1) The unstyled element is <script>, its parent is <body>
> 2) The element is not "WAS_RESTYLED".

This is somewhat confusing.  I meant wasRestyled value in ProcessPostTraversal, not WAS_RESTYLED.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #21)
> It's hard to reproduce this on debug build since I get SIGPIPE (bug
> 767815?).  Unfotunately I did not use rr so, I don't know what's going on
> there but anyway, what I knew is:
> 
> 1) The unstyled element is <script>, its parent is <body>
> 2) The element is not "WAS_RESTYLED".
> 3) oldStyleContext is null, so we try to grab the style here [1]
> 
> [1]
> https://hg.mozilla.org/mozilla-central/file/b911a4c97fde/layout/base/
> ServoRestyleManager.cpp#l754

I remove that "grab the old style if the element is unstyled" in bug 1392170. Can you repro with the first patch in that bug?
Flags: needinfo?(hikezoe)
OK, I will try the patch, but please don't expect too much since it took an hour to reproduce the crash locally. :)  (At first glance the patch should work fine though)
Confirmed that the first patch for bug 1392170 fixes the crash that I saw in comment 21. In the case of the crash in comment 21, I got a warning 'Trying to get change hint from unstyled element' in Servo_TakeChangeHint(). With the first patch I got the same warning, but the crash did not occur. :)
Depends on: 1392170
Flags: needinfo?(hikezoe)
Yeah, also, note that the patch would make the crash not appear, but that's just a wallpaper really, there's an underlying bug for which we're trying to recurse into unstyled stuff. I think I know what may be going wrong, I'll test something.
So, just tried with the STR that were posted on Servo a few hours ago, and on a debug build I get the assertion from bug 1384232 on the root... Which kind of pinpoints where things are going wrong... I'll investigate that test-case a bit, since it may explain the underlying reason.
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #6)
> 1. https://us.battle.net/login/en/
> 2. Press F5 often

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a78ccb11ab9768e7325f7345a882de8f4fcae2e3 (Linux 64 debug)
RUST_BACKTRACE=1 RUST_LOG=debug Downloads/tmp/firefox/firefox -P --no-remote &> 2nd-newprofile2.log
I previously created a new "battlenet2" profile and enabled stylo.
https://cloud.terrax.net/s/mRF5kipFEtO0trU
After repeatedly pressing F5 I got a tab crash, so I clicked on X to close Nightly.
When the file size didn't grow anymore after about half a minute I cancelled everything with Ctrl+C.
I'm hoping this either helps you or that you tell me how to capture it correctly.
I did the exact steps I did for producing these crash signatures in comment 19.
So, what I think it's happening is that we don't traverse unstyled elements on the animation only restyle, but we're finding them on the post-traversal, and we're trying to style them.

Hiro, does this look like something that could happen?
Flags: needinfo?(hikezoe)
That would explain the crash Jan Andre is seeing. I also got https://crash-stats.mozilla.com/report/index/a4bed079-847e-43d7-8b62-c661b0170823, which is inside innerText, and given there's cross-compartment wrapper stuff in there, I suspect we're using the wrong shell somehow, but it's not clear to me how that is possible.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #29)
> So, what I think it's happening is that we don't traverse unstyled elements
> on the animation only restyle, but we're finding them on the post-traversal,
> and we're trying to style them.
> 
> Hiro, does this look like something that could happen?

After bug 1388031, we process normal traversal even for throttled animations flush, so if it happens, that means we skip traversing the unstyled elements during normal traversal too. I think it's unlikely but...
Flags: needinfo?(hikezoe)
But yes, it could happen if there is no element which has normal dirty bit.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #32)
> But yes, it could happen if there is no element which has normal dirty bit.

Yeah, I think this shouldn't happen... We propagate the dirty bit on content insertion, unless the parent is unstyled too, but that case should work the same way.
I did check the log in comment 28.

The element in question does not have dirty bit. Also the root node does not have the dirty bit either. But the element is traversed in post traversal, that means the element has NODE_DESCENDANTS_NEED_FRAMES. 

Anyway, I can't reproduce the warning "Trying to get change hint from unstyled element" any more on current autoland (d1c70c20e7b5).  Can someone still able to get the warning?
See Also: → 1393637
I could reproduce the warning locally, filed bug 1393637.
There is no report since buildid: 20170824100243. I am confident bug 1392170 fixed this crash.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Oops, I was wrong. I was seeing the crash report from a link in stylo Wiki, from the link I see no new reports, but actually there are a bunch of new reports. 

https://crash-stats.mozilla.com/report/index/e9e3c5e2-3745-4ce8-81cd-b06fb0170827

This is a different case from what I've been investigating though.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
:mccr8, can you see URLs in new crash reports other than battle.net?  I think battle.net case has been fixed by bug 1392170.
Flags: needinfo?(continuation)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #38)
> :mccr8, can you see URLs in new crash reports other than battle.net?  I
> think battle.net case has been fixed by bug 1392170.

I see a couple of URLs from ebay:
https://www.ebay.com/myb/PurchaseHistory
https://www.ebay.com/myb/BidsOffers
https://www.ebay.com/myb/WatchList

Nothing else shows up more than once.
Flags: needinfo?(continuation)
Anyways, I agree that this looks to be fixed for Battle.net, based on the lack of any reports with battle.net as the URL since your fix landed. You should probably file a new bug for new crashes to make things less confusing.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
See Also: → 1394560
Thank you!  Filed bug 1394560.
Target Milestone: --- → mozilla57
(In reply to Chris Peterson [:cpeterson] from comment #9)
> P1 because there are over 600 crash reports, including a dozen from Beta 56.

Those crashes reported on Bete56 seemed irrelevant to the one in Battle.net.
They should be covered in bug 1394560.
You need to log in before you can comment on or make changes to this bug.