Closed
Bug 1405878
Opened 7 years ago
Closed 7 years ago
stylo: Tab Crashes in geckoservo::glue::Servo_ResolveStyle
Categories
(Core :: CSS Parsing and Computation, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla58
People
(Reporter: nachtigall, Assigned: bzbarsky)
References
Details
Crash Data
Attachments
(5 files)
6.27 KB,
text/plain
|
Details | |
59 bytes,
text/x-review-board-request
|
Details | |
6.29 KB,
text/plain
|
Details | |
282 bytes,
application/xml
|
Details | |
4.84 KB,
patch
|
hsivonen
:
review+
Sylvestre
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
I can see a reproducable crash on my network's router admin page. Here are a few crash reports from this: * https://crash-stats.mozilla.com/report/index/d2aed79f-6c5f-499f-ac6f-1c3ca0171004 * https://crash-stats.mozilla.com/signature/?product=Firefox&signature=mozalloc_abort%20%7C%20abort%20%7C%20core%3A%3Aoption%3A%3Aexpect_failed%20%7C%20geckoservo%3A%3Aglue%3A%3AServo_ResolveStyle#summary * https://crash-stats.mozilla.com/report/index/63090648-9762-4732-a88e-eff9f0170930#tab-details I could add more this this happens reproducable. mozregression says: 64:23.40 INFO: Last good revision: ebd299dd19d6f53ad7a22b55bd283911f77e0a88 64:23.40 INFO: First bad revision: 3265cc2501b6d23f2ece3cc27eaa6d87401bc7bc 64:23.40 INFO: Pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ebd299dd19d6f53ad7a22b55bd283911f77e0a88&tochange=3265cc2501b6d23f2ece3cc27eaa6d87401bc7bc This is on a private LAN's router webconfig page so I cannot share it. FWIW, this is on table that's get rendered very slowly by the router row by row. If I just save the page (including assets like CSS/JS) with Chrome or an older working Firefox and then open this page with the crashing Firefox then it works. So it seems to me that this is caused by the the slow way in which the page is sent to the browser by the not so cpu-strong router. Due to the pushlog I set NI and bug dependency - please correct in case I might be wrong.
Comment 1•7 years ago
|
||
Stylo panic. bholley, can you find somebody to take a quick look?
Flags: needinfo?(bobbyholley)
Comment 2•7 years ago
|
||
CCing some people who might be able to help. We're crashing trying to resolve styles on an unstyled element in nsCSSFrameConstructor::InitializeSelectFrame, in case that rings any bells. Jens - can you reproduce in a debug build of firefox? If so, setting the env vars: |STYLO_THREADS=1 RUST_LOG=geckoservo=debug,style::traversal=debug| will produce a log that very often will help us debug the issue. Alternatively, you could try taking a recording of the traffic using mitmproxy, which we could then play back on your end.
Flags: needinfo?(bobbyholley)
Priority: -- → P2
Comment 3•7 years ago
|
||
Yeah, maybe something funny about how select frames work. They have definitely some custom paths in the FC, but not sure without a test-case, or an idea of one... Seeing the HTML & CSS of the page or something would be very valuable.
Component: Panning and Zooming → CSS Parsing and Computation
I think it would be easiest if I would dig a hole into our firewall and send someone (or two people) a temporary login to the local page. So you can get access to this page in order to easily reproduce yourself. Maybe someone can send me a private mail and then I can communicate the login via an encrypted channel (PGP, Signal or WhatsApp would be fine). Drop me a line at nachtigall__AT__web__DOT__de
Comment 5•7 years ago
|
||
Ok. Assuming you're in Germany, working with emilio probably makes sense, since he's in the Berlin office and is our most prolific stylo regression fixer in any case.
Flags: needinfo?(emilio)
Comment 6•7 years ago
|
||
On it. Cool part is I can't repro on Nightly. I can repro very consistently in current dev-edition 57.0b5. So time to find a regression range I guess...
Assignee: nobody → emilio
Flags: needinfo?(emilio)
Comment 7•7 years ago
|
||
Err, I'm wrong, it does still repro on nightly (took a bit more), investigating a bit more...
Comment 8•7 years ago
|
||
Ok, so we're creating an HTML <option> element from the XML parser in: * http://searchfox.org/mozilla-central/rev/8efd128b48cdae3a6c6862795ce618aa1ca1c0b4/dom/xml/nsXMLContentSink.cpp#981 By the time we get there, the parent <select> element is already styled and laid out. That call doesn't notify at all so there's no way for the style system to notice that it has to style that element. It's not clear to me how this is supposed to work, I'll try to build a test-case.
Updated•7 years ago
|
Summary: Tab Crashes in geckoservo::glue::Servo_ResolveStyle → stylo: Tab Crashes in geckoservo::glue::Servo_ResolveStyle
Comment 9•7 years ago
|
||
(In reply to Emilio Cobos Álvarez [:emilio] from comment #8) > Ok, so we're creating an HTML <option> element from the XML parser in: > > * > http://searchfox.org/mozilla-central/rev/ > 8efd128b48cdae3a6c6862795ce618aa1ca1c0b4/dom/xml/nsXMLContentSink.cpp#981 > > By the time we get there, the parent <select> element is already styled and > laid out. That call doesn't notify at all so there's no way for the style > system to notice that it has to style that element. Pretty similar to what I saw in bug 1393637.
Comment 10•7 years ago
|
||
There's no notification, and layout and style have definitely run (on the parent) by then. Indeed the only reason we then (later) try to construct frames for the child is because we reframe the whole table that contains it.
Comment 11•7 years ago
|
||
Henri, can you help here? I'm not familiar with how the parser operates. Once we've run layout on an element, what guarantees that layout will notice that a children has been inserted in that subtree, given the parser doesn't notify[1]? [1]: http://searchfox.org/mozilla-central/rev/8efd128b48cdae3a6c6862795ce618aa1ca1c0b4/dom/xml/nsXMLContentSink.cpp#981
Flags: needinfo?(hsivonen)
Comment hidden (mozreview-request) |
Comment 13•7 years ago
|
||
So, the patch above fixes the problem for me, as expected, I guess. But I suspect it's not the right fix, given nothing else except the doc element insertion in that file seems to notify? In any case, Jens, can you confirm (when builds are ready) that a build with this patch applied like: https://treeherder.mozilla.org/#/jobs?repo=try&revision=caa513c47f968c62bd04a7902521e5959c6f650f Fixes the crash for you too? I'm still trying to construct a test-case. Trying something naive with PHP + usleep isn't working as well as I'd hoped...
Flags: needinfo?(nachtigall)
Comment 14•7 years ago
|
||
FWIW, I don't suggest you run a custom build directly, because that may harm your profile. It would probably be better to use mozregression to run single build with the given commit on try.
Comment 15•7 years ago
|
||
So boris pointed at me at how this is supposed to work, and indeed my patch is wrong. But knowing a bit more the setup (the parser notifying in CloseTag() and FlushTags()), made me find the bug. In particular, we're flushing content _after_ popping the select element from the stack, but before checking mNotifyLevel for the children. That FlushTags overrides mNotifyLevel, which means we never notify the children. This is an nsXMLContentSink bug per se, not a stylo bug, fwiw... ni? Boris since he wanted to take a closer look at this.
Flags: needinfo?(hsivonen)
Reporter | ||
Comment 17•7 years ago
|
||
(In reply to Emilio Cobos Álvarez [:emilio] from comment #13) > In any case, Jens, can you confirm (when builds are ready) that a build with > this patch applied like: > > > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=caa513c47f968c62bd04a7902521e5959c6f650f > > Fixes the crash for you too? Yes, using this build (I've done `mozregression --repo try --launch caa513c47f968c62bd04a7902521e5959c6f650f`) the crash is fixed.
Flags: needinfo?(nachtigall)
Assignee | ||
Comment 18•7 years ago
|
||
So ideally we would fix bug 363688 or something. The HTML sink doesn't have this deferred-notification setup at all; see the minimal work nsHtml5TreeOpExecutor::FlushPendingNotifications does. In the meantime, the XML sink is buggy as heck with <select>; this bug is just triggering some of that. For example: <html xmlns="http://www.w3.org/1999/xhtml"> <select> <option>Hello there</option> </select> </html> does not show the option when loaded as XML, precisely because of Emilio's stack and failing to notify on the option... Anyway, I think the right fix here is to make sure we do that whole "notify on the kids if needed" bit after we pop the stack, but before we do anything else that can cause reentry and mess up our state. I just did some historical digging, and this is what the HTML sink used to do too, which is pretty hopeful risk-wise. ;) See http://searchfox.org/mozilla-central/rev/836aa0b2844d25a457699424800efcbb3e4b53d8/content/html/document/src/nsHTMLContentSink.cpp#1325-1362 which does the notification bits right after popping the stack and before it starts calling DoneAddingChildren on anything. I'll take this, since I'd want a second opinion on the review anyway and Emilio has enough on his plate right now.
Assignee: emilio → bzbarsky
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 19•7 years ago
|
||
Assignee | ||
Comment 20•7 years ago
|
||
Once we call DoneAddingChildren, random code of various sorts will run, which can flush our notification state. If that happens before we've notified on our kids, but after we've popped the element we're closing off the element stack, we will fail to ever notify on the kids. MozReview-Commit-ID: Ei7v5OobX8R
Attachment #8916123 -
Flags: review?(hsivonen)
Assignee | ||
Comment 21•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3cf84b3012f1de8c1756d2a1deac2f776864a41e
Assignee | ||
Comment 22•7 years ago
|
||
[Tracking Requested - why for this release]: Stylo crash regression. This is a longstanding XML-parsing bug (e.g. the reftest fails pre-stylo), but with stylo it violates even more invariants.
tracking-firefox57:
--- → ?
tracking-firefox58:
--- → ?
Updated•7 years ago
|
Attachment #8916123 -
Flags: review?(hsivonen) → review+
Comment 23•7 years ago
|
||
Based on Comment 22 I think we should + this for 57/58. Please nominate for uplift to beta after the fix lands on central.
status-firefox57:
--- → affected
status-firefox58:
--- → affected
Comment 24•7 years ago
|
||
Pushed by bzbarsky@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/f144e1434312 Make sure to notify for our kids, if needed, before calling DoneAddingChildren in the XML content sink. r=hsivonen
Assignee | ||
Comment 25•7 years ago
|
||
Comment on attachment 8916123 [details] [diff] [review] Make sure to notify for our kids, if needed, before calling DoneAddingChildren in the XML content sink Approval Request Comment [Feature/Bug causing the regression]: Stylo, sort of, but the actual bug has been around for a long time. [User impact if declined]: (Safe) crashes able to be triggered by XML involving <html:select>. [Is this code covered by automated tests?]: Yes. [Has the fix been verified in Nightly?]: Yes. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: Yes. [Why is the change risky/not risky?]: This is changing the XML parser, which is not exactly a simple area of the code. I'm somewhat confident about the fix, in the sense that it aligns the behavior better with what the HTML parser used to have, which was much more extensively tested than the XML parser. But I do not have super-high confidence that I am not introducing other bugs (including correctness and security bugs) in the process... If it were not for the stylo dependency where this causes crashes on code that used to not crash, I would not even consider suggesting this for uplift. [String changes made/needed]:
Attachment #8916123 -
Flags: approval-mozilla-beta?
Comment 26•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f144e1434312
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Updated•7 years ago
|
Crash Signature: [@ mozalloc_abort | abort | core::option::expect_failed | geckoservo::glue::Servo_ResolveStyle]
Comment 27•7 years ago
|
||
Comment on attachment 8916123 [details] [diff] [review] Make sure to notify for our kids, if needed, before calling DoneAddingChildren in the XML content sink Fix a crash, taking it.
Attachment #8916123 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 28•7 years ago
|
||
Should be in 57b8
Comment 29•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/e1004148e064
Flags: in-testsuite+
Comment 30•7 years ago
|
||
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #25) > [Is this code covered by automated tests?]: Yes. > [Has the fix been verified in Nightly?]: Yes. > [Needs manual test from QE? If yes, steps to reproduce]: No. Setting qe-verify- based on Boris's assessment on manual testing needs and the fact that this fix has automated coverage.
Flags: qe-verify-
You need to log in
before you can comment on or make changes to this bug.
Description
•