Closed Bug 1808824 Opened 3 years ago Closed 3 years ago

[gsheets pageload] initialpaint delay and nsHtml5ExecutorReflusher can cause huge delays to FCP

Categories

(Core :: DOM: HTML Parser, defect)

defect

Tracking

()

RESOLVED FIXED
110 Branch
Tracking Status
firefox110 --- fixed

People

(Reporter: tcampbell, Assigned: smaug)

References

Details

(Keywords: perf-alert)

Attachments

(1 file)

In the test-linux1804-64-shippable-qr/opt-browsertime-tp6-bytecode-firefox-google-slides job, the FCP values are bimodal with several hundred ms difference in the values. The initialpaint delay sometimes comes slightly too late and we miss the vsync. Then the nsHtml5ExecutorReflusher runs 400ms worth of scripts without allowing any pauses between for vsync events meaning that FCP must wait.

Profile URL: https://share.firefox.dev/3Isbdq7

I'm not familiar enough to know what our options are here. Is there some notion of a suppressed paint and pending vsync so that we could process the paint after only some of the scripts are flushed?

Here is the fast version of it: https://share.firefox.dev/3WOSYiT
The PresShell::sPaintSuppressionCallback runnable happens before the VSync, which means the paint happens before the 400 ms of nsHtml5ExecutorReflusher.

If I'm understanding the code right, the nsHtml5ExecutorReflusher is able to yield, but it uses the content.sink.perf_parse_time pref for timeout which is set to 360ms which would explain where this 400ms delay (360ms + 57ms for final script execution in bundle). This value was chosen at least a decade ago, so maybe our thinking should be updated here.

Changing that pref might or might not regress Speedindex. We do throttle painting a bit after FCP but before page load since that helps with speedindex, especially on react heavy sites.

With pref value 60000 Amazon speedindex improves ~20%, Reddit regresses 16%. Most sites pretty much are unaffected.
Now testing with 30000.

30000 looks better. I guess I should land that change.

Attachment #9311649 - Attachment description: WIP: Bug 1808824 - decrease content.sink.perf_parse_time → Bug 1808824 - decrease content.sink.perf_parse_time
Component: Performance → DOM: HTML Parser
Assignee: nobody → smaug
Status: NEW → ASSIGNED
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f8f690df2c32 decrease content.sink.perf_parse_time r=peterv,perftest-reviewers,sparky

Backed out for causing crashtest failures

Backout link: https://hg.mozilla.org/integration/autoland/rev/0b74dca33cb0b069ae2c120f72cad1949de685ca

Push with failures

Failure log

 INFO - [5640, Main Thread] WARNING: XPCOM_MEM_BLOAT_LOG is set, disabling native allocations.: file /builds/worker/checkouts/gecko/tools/profiler/core/platform.cpp:339
[task 2023-01-11T21:19:49.728Z] 21:19:49     INFO - REFTEST TEST-UNEXPECTED-FAIL | parser/htmlparser/tests/crashtests/1747514.html | load failed: timed out after 300000 ms waiting for 'load' event for file:///Z:/task_167346402046941/build/tests/reftest/tests/parser/htmlparser/tests/crashtests/1747514.html
[task 2023-01-11T21:19:49.729Z] 21:19:49     INFO - REFTEST INFO | Saved log: START file:///Z:/task_167346402046941/build/tests/reftest/tests/parser/htmlparser/tests/crashtests/1747514.html
Flags: needinfo?(smaug)

That is certainly not a type of test I was expecting to fail. Looking

Attachment #9311649 - Attachment description: Bug 1808824 - decrease content.sink.perf_parse_time → Bug 1808824 - decrease content.sink.perf_parse_time, r=peterv
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/69270e156ce2 decrease content.sink.perf_parse_time, r=peterv,perftest-reviewers,sparky
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 110 Branch

(In reply to Pulsebot from comment #7)

Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f8f690df2c32
decrease content.sink.perf_parse_time r=peterv,perftest-reviewers,sparky

== Change summary for alert #36725 (as of Thu, 12 Jan 2023 20:21:44 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
16% buzzfeed fcp linux1804-64-shippable-qr fission warm webrender 114.29 -> 96.00 Before/After
13% amazon FirstVisualChange macosx1015-64-shippable-qr bytecode-cached cold fission webrender 570.00 -> 493.33 Before/After
12% amazon PerceptualSpeedIndex macosx1015-64-shippable-qr bytecode-cached cold fission webrender 596.92 -> 522.33 Before/After
12% amazon SpeedIndex macosx1015-64-shippable-qr bytecode-cached cold fission webrender 601.25 -> 527.33 Before/After
12% amazon ContentfulSpeedIndex macosx1015-64-shippable-qr bytecode-cached cold fission webrender 592.50 -> 520.08 Before/After
5% google-slides LastVisualChange linux1804-64-shippable-qr cold fission webrender 3,983.33 -> 3,796.67 Before/After

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=36725

Keywords: perf-alert
Regressions: 1810948
Regressions: 1832914
Flags: needinfo?(smaug)

(In reply to Sandor Molnar[:smolnar] from comment #8)

Backed out for causing crashtest failures
(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #9)
That is certainly not a type of test I was expecting to fail. Looking

Late-breaking afterthought here: I think this crashtest timeout is due to layout thrashing from parsing a giant document, with the parser (now) yielding to relayout every 30ms, as of this commit. In debug builds, we don't make much progress in these incremental batches of parsing, and the inefficiency of doing things in pieces (at least for the resulting small layouts) makes the total load-duration much longer.

This came across my radar when investigating a different test timeout in bug 1860872, but I think it's likely the explanation for this crashtest, too.

I suspect we can avoid the issue by increasing the value of this pref (content.sink.perf_parse_time) specifically for the affected tests, to allow the parser (and resulting layout(s)) to be more efficient.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: