Closed Bug 1716708 Opened 4 years ago Closed 4 years ago

8.22 - 7.09% a11yr dhtml.html / a11yr dhtml.html + 2 more (Linux) regression on Mon June 14 2021

Categories

(Core :: Layout, defect)

defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox-esr78 --- unaffected
firefox89 --- unaffected
firefox90 --- unaffected
firefox91 --- affected

People

(Reporter: Bebe, Assigned: MatsPalmgren_bugz)

References

(Regression)

Details

(4 keywords)

Attachments

(2 files)

Perfherder has detected a talos performance regression from push 47b4452c0a6025476db68b16a2448b5752ce7562. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Suite Test Platform Options Absolute values (old vs new)
8% a11yr dhtml.html linux1804-64-shippable-qr e10s stylo webrender 474.26 -> 513.26
8% a11yr dhtml.html linux1804-64-shippable-qr e10s stylo webrender-sw 462.28 -> 499.21
8% a11yr dhtml.html linux1804-64-shippable-qr e10s stylo webrender 477.57 -> 515.32
7% a11yr dhtml.html linux1804-64-shippable e10s stylo 463.48 -> 496.34

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(mats)

== Change summary for alert #30379 (as of Wed, 16 Jun 2021 05:38:26 GMT) ==

Regressions:

Ratio Suite Test Platform Options Absolute values (old vs new)
27% wikipedia LastVisualChange macosx1015-64-shippable-qr warm webrender 986.67 -> 1,256.67
25% wikipedia LastVisualChange macosx1014-64-shippable-qr cold webrender 2,670.00 -> 3,343.33
24% wikipedia LastVisualChange macosx1014-64-shippable-qr warm webrender 1,246.67 -> 1,546.67
22% wikipedia LastVisualChange macosx1015-64-shippable-qr cold webrender 2,026.67 -> 2,476.67
19% wikipedia LastVisualChange linux1804-64-shippable-qr warm webrender 1,356.67 -> 1,610.00
18% wikipedia LastVisualChange linux1804-64-shippable warm 1,326.67 -> 1,566.67
16% wikipedia LastVisualChange linux1804-64-shippable cold 2,656.67 -> 3,090.00
16% facebook loadtime macosx1014-64-shippable-qr cold webrender 2,346.67 -> 2,722.04
15% wikipedia LastVisualChange linux1804-64-shippable-qr cold webrender 2,710.00 -> 3,120.00
13% facebook LastVisualChange macosx1014-64-shippable-qr cold webrender 2,225.00 -> 2,513.33
... ... ... ... ... ...
6% google-search dcf macosx1014-64-shippable-qr cold webrender 405.50 -> 429.00
5% facebook dcf macosx1014-64-shippable-qr cold webrender 839.23 -> 881.67
4% microsoft LastVisualChange macosx1014-64-shippable-qr cold webrender 1,216.67 -> 1,270.00
3% microsoft loadtime macosx1014-64-shippable-qr cold webrender 1,020.62 -> 1,053.62
3% office LastVisualChange macosx1014-64-shippable-qr cold webrender 3,510.00 -> 3,610.00

Improvements:

Ratio Suite Test Platform Options Absolute values (old vs new)
46% imgur fnbpaint macosx1015-64-shippable-qr warm webrender 149.38 -> 80.29
30% wikipedia loadtime windows10-64-shippable-qr warm webrender 1,001.00 -> 698.79
29% wikipedia loadtime windows10-64-shippable-qr warm webrender 990.79 -> 702.62
26% wikipedia loadtime windows10-64-shippable warm 973.54 -> 718.58
22% wikipedia fnbpaint windows10-64-shippable-qr warm webrender 1,007.58 -> 782.38
... ... ... ... ... ...
3% nytimes SpeedIndex linux1804-64-shippable-qr warm webrender 882.12 -> 856.33

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=30379

Set release status flags based on info from the regressing bug 1542807

This is the a11y test: https://searchfox.org/mozilla-central/source/testing/talos/talos/tests/a11y/dhtml.html
It creates 6000 dynamically inserted <li>s amongst other things. 8% seems a bit much but some regression is probably to be expected given that we now actually create anonymous content for these and then go through normal frame construction, which now creates an inline frame + text frame vs just an nsBulletFrame before.

For the various painting tests in comment 2 - I don't really know what these tests are measuring but there seems to be about as many regressions as there are improvements. Would it be possible that we simply moved some execution time from one place to another?
I guess there could be some performance difference between the old nsBulletFrame display items and the new nsTextFrame ones, but I would expect that difference to be negligible.

Flags: needinfo?(mats)
Attached file testcase

I've tested this in a local Opt build on Linux and I can confirm that there's a ~8% regression before and after bug 1542807. This test inserts 18000 list items, half with list-style-type: decimal and half with disc.
The regression also occurs with visibility:hidden on the ancestor, so we can exclude a painting regression.
My build is built with --disable-accessibility so we can exclude any a11y code too.

When I enable the li { list-style-type: none; } rule in this test, then I see a significant performance improvement instead, which I assume comes from no longer creating an nsBulletFrame for that case.

So the regression appears to be in frame construction or reflow related to the ::marker. I'll continue to investigate and see if I can pinpoint the exact cause...

Assignee: nobody → mats
Attached image Firefox Profiler stack

I ran the Firefox Profiler on the before/after builds and they have pretty much identical profiles in terms of called functions, so there's no unexpected calls there. It's completely dominated by reflow, in particular nsBlockFrame::RecoverFloats[For] in both cases. (this is for the testcase attached above)

If I instead save that generated markup as a static HTML file and load that instead then it loads pretty much instantly, so the bad performance is caused by the 3000 DOM mutations which presumably reflows the entire document each time(?) If I put display:flow-root on the <div> then the test takes about a quarter of the time it takes without it.

So, if I add these rules to the testcase:

ul ::marker { content: counter(list-item,disc); }
ol ::marker { content: counter(list-item,decimal) "."; }

then I see the exact same perf regression in the old build (without the changes in bug 1542807). IOW, this regression is what we should expect for the original test.

So the regression is that we used to create a single nsBulletFrame for a ::marker created from list-style-type, but now we create a block frame containing a child text frame. Reflowing the latter is slower.

Given that bug 1542807 fixes a bunch of outstanding ::marker layout bugs and is a desired architectural change we simply have to live with this regression. Also, it's not going to visibly affect any real web pages unless they have a ton of list items and make lot's of DOM/style changes that causes reflows.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX

Fwiw, fixing bug 406512 will likely help make this faster...

Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: