Very long reflows on Wikipedia Barack Obama page
Categories: Core :: Layout: Columns, defect
Performance Impact: medium
People: Reporter: bas.schouten, Unassigned
Details: 5 keywords
Attachments: 2 files
Comment 1•4 years ago
I wonder if this is a regression from stuff like bug 1647332.
Comment 2•4 years ago
[Tracking Requested - why for this release]: Severe perf regression in some Wikipedia pages.
Tentatively moving to Layout: Columns, given Bas says this is a regression from 78, and that matches the time with bug 1647332.
Comment 3•4 years ago
Bizarrely, the perf graph shows an improvement for the regressing bug 🙃.
Comment 4•4 years ago
yeah, this is probably an edge case that may happen in some but not all pages or such.
Comment 5•4 years ago (Reporter)
(In reply to Emilio Cobos Álvarez (:emilio) from comment #4)
> yeah, this is probably an edge case that may happen in some but not all pages or such.
We suspect this may also be related to the way we capture and replay. Most Wikipedia pages I looked at real quick (to be fair, n=3) were showing disproportionately large reflow times when compared to the overall page complexity.
Comment 6•4 years ago
Bug 1658198 Comment 22 has an explanation of why a page having a column container is slow. However, I don't see that the Wikipedia page has font-size: 0 or line-height: 0, though.
Bas, could you try the build with my patch in 1658198 applied, and see if the performance is improved?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=62b2260c92fb2708e09cc1c35df4d8f56de7a40b
In general, for Wikipedia pages that have a large "Notes and references" section (like Obama's page), finding the best column balancing height can be expensive, and it can still take several reflow iterations even after the effort in bug 1647332 and bug 1647520. Bug 575614 also has some discussion regarding the performance of multi-column layout.
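As a rough illustration of why balancing can need several reflow passes, here is a minimal Python sketch. It is not Gecko's code: it assumes the content is a flat list of unbreakable item heights, and that trying a candidate column height costs one full pass over all items (standing in for one reflow). The names `columns_needed` and `find_balanced_height` are made up for this example; the real nsColumnSetFrame logic is more involved.

```python
from typing import List, Tuple


def columns_needed(item_heights: List[int], column_height: int) -> int:
    """Greedily fill columns of at most `column_height` and count them.

    Stands in for one trial reflow: every item is visited once.
    Assumes `column_height` is at least as tall as the tallest item.
    """
    columns, used = 1, 0
    for height in item_heights:
        if used + height > column_height:
            # Current column is full; start a new one.
            columns += 1
            used = 0
        used += height
    return columns


def find_balanced_height(item_heights: List[int], column_count: int) -> Tuple[int, int]:
    """Binary-search the smallest column height that fits in `column_count` columns.

    Returns (height, number_of_trial_passes).
    """
    if not item_heights:
        return 0, 0
    lo = max(item_heights)  # a column cannot be shorter than the tallest item
    hi = sum(item_heights)  # a single column always fits everything
    trials = 0
    while lo < hi:
        mid = (lo + hi) // 2
        trials += 1
        if columns_needed(item_heights, mid) <= column_count:
            hi = mid  # fits: try a shorter column
        else:
            lo = mid + 1  # overflows: columns must be taller
    return lo, trials
```

In this sketch, balancing a few hundred equal-height items into two columns takes about a dozen trial passes, each touching every item. That repeated work is the kind of cost a long references list incurs, under the simplifying assumptions above.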
Comment 8•4 years ago (Reporter)
Not really. The column display on the left with the languages still has a 700ms delay to display vs Chrome with that build. Fwiw, I was able to reproduce this across 4 different machines and across release, beta and nightly :). So it should be easy to verify.
Comment 9•4 years ago (Reporter)
To make matters worse, in some cases this reflow happens before first paint. See a profile from release here: https://share.firefox.dev/3axrPK7
Comment 10•4 years ago
We're building the 80 release candidate today; I can track this, but won't block on it.
Comment 11•4 years ago
I'm trying to use mozregression to capture a profile on my end. I'm looking at the second (longest) nsColumnSetFrame::Reflow in both profiles in the "Flame graph" tab.
- Firefox 78 (2020-05-15) https://share.firefox.dev/3kTfcO9. The first and second nsColumnSetFrame::Reflow take 19ms and 592ms.
- Firefox 81 (2020-08-16) https://share.firefox.dev/348C9XN. The first and second nsColumnSetFrame::Reflow take 64ms and 319ms. (Not sure why the green posix_fallocate in the graphics category takes nearly 1 minute on current nightly.)
I profiled both builds a few times, and the longest nsColumnSetFrame::Reflow takes roughly the same amount of time each time, about 5xx ms (2020-05-15) and 3xx ms (2020-08-16). I'm sure Firefox can be slower than Chrome at multicol layout, but I'm skeptical that the slowness is because of bug 1647332.
Bas, could you help take a look at the profiles I captured, and see if I misinterpreted the data?
Comment 12•4 years ago (Reporter)
(In reply to Ting-Yu Lin [:TYLin] (UTC-7) from comment #11)
> I'm trying to use mozregression to capture a profile on my end. I'm looking at the second (longest) nsColumnSetFrame::Reflow in both profiles in the "Flame graph" tab.
> - Firefox 78 (2020-05-15) https://share.firefox.dev/3kTfcO9. The first and second nsColumnSetFrame::Reflow take 19ms and 592ms.
> - Firefox 81 (2020-08-16) https://share.firefox.dev/348C9XN. The first and second nsColumnSetFrame::Reflow take 64ms and 319ms. (Not sure why the green posix_fallocate in the graphics category takes nearly 1 minute on current nightly.)
> I profiled both builds a few times, and the longest nsColumnSetFrame::Reflow takes roughly the same amount of time each time, about 5xx ms (2020-05-15) and 3xx ms (2020-08-16). I'm sure Firefox can be slower than Chrome at multicol layout, but I'm skeptical that the slowness is because of bug 1647332.
> Bas, could you help take a look at the profiles I captured, and see if I misinterpreted the data?
What's up with that 1s rasterize on the second profile?
Mind you, I was looking at Windows, which doesn't show these weirdly long reflows. It's pretty hard to compare your first and second profile, but most certainly both have really long reflows (I didn't see a really long reflow in 78, but I only ran 78 a couple of times, so it could be a coincidence). That said, the 78 reflow on the whole is 'considerably faster' than the 81 reflow (n=1, and we appear to be seeing different frames in both profiles, so this could be a coincidence).
As for bug 1647332, I'm not claiming it's necessarily related to this :). That was Emilio's first guess I think. But this reflow duration is most definitely problematic. I don't know enough about layout to make a guess as to another cause.
Comment 13•4 years ago
Thanks Bas.
I just tried profiling on macOS, and I don't see the weirdly long paint of posix_fallocate.
Yeah, I agree the long reflow of Wikipedia pages is a known problem because of our current implementation for finding the best column block-size. That motivated me to implement heycam's ideas to make it faster in bug 1647332, although we still have plenty of room to improve it.
> As for bug 1647332, I'm not claiming it's necessarily related to this :). That was Emilio's first guess I think. But this reflow duration is most definitely problematic. I don't know enough about layout to make a guess as to another cause.
Per the above, I'll remove the tracking flags and bug 1647332 from the "Regressed by" field.
Comment 14•4 years ago
Pretty sure Bas tested this on an earlier build (78-ish) and didn't see the same issue on his Windows machine. It's probably worth mozregression'ing this to see what made this worse, given the size of the issue on Windows. I don't know if Wikipedia serves different markup/CSS depending on UA and that somehow affects this, but either way it's probably worth getting a better idea of what's going on here.
Comment 15•4 years ago
> Firefox 81 (2020-08-16) https://share.firefox.dev/348C9XN. The first and second nsColumnSetFrame::Reflow take 64ms and 319ms. (Not sure why the green posix_fallocate in the graphics category takes nearly 1 minute on current nightly.)
FYI, the posix_fallocate shown in the profiler is bug 1658847.
Comment 16•4 years ago
Hi, I'm happy to help run mozregression. Can I get some STR in order to do so?
Thanks!
Best,
Clara.
Comment 17•4 years ago
(In reply to Clara Guerrero from comment #16)
> Hi, I'm happy to help run mozregression. Can I get some STR in order to do so?
I think Bas is best placed to give STR here.
Comment 18•4 years ago (Reporter)
(In reply to :Gijs (he/him) from comment #17)
> (In reply to Clara Guerrero from comment #16)
> > Hi, I'm happy to help run mozregression. Can I get some STR in order to do so?
> I think Bas is best placed to give STR here.
Basically just open the Obama Wikipedia page. In the 'bad' case the page loads slower; in particular, the column on the left with the languages, for example, comes in later. Fwiw this seems to be somehow load-order dependent or something along those lines, as it doesn't appear to 'always' happen; it sort of comes and goes. Which means it's possible I just got lucky when testing on an older version.
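Because the slow load comes and goes, a regression hunt would have to load the page several times per build before calling that build good. The Python sketch below illustrates that idea only. It is not how mozregression is implemented, and `looks_bad`, `bisect_builds`, and the `run_once` callback (one page load that reports whether the slow reflow was seen) are hypothetical names. It also assumes a single good-to-bad transition between the two starting dates, which this bug has not established.

```python
from datetime import date, timedelta
from typing import Callable


def looks_bad(build: date, run_once: Callable[[date], bool], attempts: int = 5) -> bool:
    """Call a build bad if ANY of several page loads shows the slow reflow.

    An intermittent problem can hide on a single load, so one clean run
    is not enough evidence that a build is good.
    """
    return any(run_once(build) for _ in range(attempts))


def bisect_builds(good: date, bad: date, is_bad: Callable[[date], bool]) -> date:
    """Return the first bad build date between a known-good and known-bad date.

    Assumes `good` really is good, `bad` really is bad, and that builds
    switch from good to bad exactly once in between.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid  # regression is at or before mid
        else:
            good = mid  # regression is after mid
    return bad
```

A caller would pass something like `lambda build: looks_bad(build, run_once)` as `is_bad`. More attempts per build lower the chance of wrongly marking a bad build as good, at the cost of a slower bisection.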
Comment 19•4 years ago
Can you please confirm the first attempt reflects the issue, and second attempt loads fine?
Best,
Clara
Comment 20•4 years ago (Reporter)
(In reply to Clara Guerrero from comment #19)
> Created attachment 9176388 [details]
> left column(1).webm
> Can you please confirm the first attempt reflects the issue, and second attempt loads fine?
> Best,
> Clara
Indeed, that's what it looked like for me!
Comment 21•4 years ago
So, I'm trying to get a regression range, but I noticed that Chrome Version 85.0.4183.121 (Official Build) (64-bit) also behaves as shown in my video (Internet Explorer version 11.1082.18362.0 and Microsoft Edge 44.18362.449.0 don't show this behaviour, though). Please confirm if it's a good idea for me to try to obtain a range with mozregression.
Best,
Clara
Comment 22•4 years ago
I think if you can find a build that has the "good" behaviour from comment 20 then it may be worth running mozregression.
Comment 23•4 years ago
Sorry for being late to reply. I agree with Gijs. If we used to perform better but ended up behaving like other browsers, it is still helpful to identify the regressor.
Comment 25•1 year ago
Yes, we are still seeing very slow reflows on this site.
This one includes a 749ms reflow on a cold page load:
https://share.firefox.dev/3QFlnrc
This looks to be one of the reasons why our performance relative to Chrome is not good on this site: https://faraday.basschouten.com/mozilla/Pageload/details.html?os=linux
This try push includes results and profiles for all platforms:
https://treeherder.mozilla.org/jobs?repo=try&selectedTaskRun=LeXlSApURpOSPaBkqrz3nQ.0&tier=1%2C2%2C3&revision=354e1b0b7958c8815fd284f3861a534bb0b051eb
Comment 26•1 year ago
The Performance Impact Calculator has determined this bug's performance impact to be high. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.
Platforms: [x] Windows [x] Linux
Page load impact: Some
Websites affected: Major
[x] Able to reproduce locally
Comment 27•1 year ago
The severity field for this bug is set to S3. However, the Performance Impact field flags this bug as having a high impact on the performance.
:TYLin, could you consider increasing the severity of this performance-impacting bug? Alternatively, if you think the performance impact is lower than previously assessed, could you request a re-triage from the performance team by setting the Performance Impact flag to "?"?
For more information, please visit BugBot documentation.
Comment 28•1 year ago
(In reply to Ting-Yu Lin [:TYLin] (UTC-8) from comment #6)
> Bug 1658198 Comment 22 has an explanation of why a page having a column container is slow. However, I don't see that the Wikipedia page has font-size: 0 or line-height: 0, though.
> Bas, could you try the build with my patch in 1658198 applied, and see if the performance is improved?
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=62b2260c92fb2708e09cc1c35df4d8f56de7a40b
> In general, for Wikipedia pages that have a large "Notes and references" section (like Obama's page), finding the best column balancing height can be expensive, and it can still take several reflow iterations even after the effort in bug 1647332 and bug 1647520. Bug 575614 also has some discussion regarding the performance of multi-column layout.
Would this patch still be potentially helpful?
Comment 29•1 year ago
Re comment 25:
In the profile, I see we spend a lot of time in nsColumnSetFrame::FindBestBalanceBSize. I improved the column balancing performance a bit a few years ago, and I don't have other ideas to further improve it at the moment.
Re comment 26:
Performance Impact Calculator deems this bug's performance impact to be high because Wikipedia is a major site. However, not all Wikipedia pages have hundreds of list items in the References section like Barack Obama's page. Spending ~1 second of reflow time in column balancing is bad, but it is not bad enough to become a noticeable jank or delay imho.
Andrew, do you feel the performance impact is still high per my explanation above?
Re comment 28:
The patch in the try run has landed in Bug 1658198.
Comment 30•1 year ago
(In reply to Ting-Yu Lin [:TYLin] (UTC-8) from comment #29)
> Re comment 25:
> In the profile, I see we spend a lot of time in nsColumnSetFrame::FindBestBalanceBSize. I improved the column balancing performance a bit a few years ago, and I don't have other ideas to further improve it at the moment.
> Re comment 26:
> Performance Impact Calculator deems this bug's performance impact to be high because Wikipedia is a major site. However, not all Wikipedia pages have hundreds of list items in the References section like Barack Obama's page. Spending ~1 second of reflow time in column balancing is bad, but it is not bad enough to become a noticeable jank or delay imho.
> Andrew, do you feel the performance impact is still high per my explanation above?
Yes, that's a fair point, not all of wikipedia.org exhibits this performance discrepancy.
I've re-run it and it comes in at medium.
Because this particular page is part of our pageload tests we see the results very frequently.