Closed Bug 1806095 Opened 2 years ago Closed 2 years ago

5.93 - 4.31% welcome ContentfulSpeedIndex / welcome ContentfulSpeedIndex + 4 more (Linux, OSX, Windows) regression on Wed December 14 2022

Categories

(Firefox :: Messaging System, defect, P1)

defect

Tracking

()

RESOLVED FIXED
110 Branch
Iteration:
110.1 - Dec 12 - Dec 23
Tracking Status
firefox-esr102 --- unaffected
firefox108 --- unaffected
firefox109 --- unaffected
firefox110 --- fixed

People

(Reporter: aesanu, Assigned: aminomancer)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(2 files, 1 obsolete file)

Perfherder has detected a browsertime performance regression from push 9456d51620516727a0141b266d55db32b0792ea4. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
6% welcome ContentfulSpeedIndex macosx1015-64-shippable-qr fission warm webrender 1,219.17 -> 1,291.42
6% welcome ContentfulSpeedIndex linux1804-64-shippable-qr fission warm webrender 1,143.00 -> 1,210.00 Before/After
5% welcome ContentfulSpeedIndex windows10-64-shippable-qr cold fission webrender 1,160.88 -> 1,221.33 Before/After
5% welcome ContentfulSpeedIndex linux1804-64-shippable-qr cold fission webrender 1,198.17 -> 1,259.50 Before/After
5% welcome ContentfulSpeedIndex windows10-64-shippable-qr fission warm webrender 1,158.33 -> 1,216.42 Before/After
4% welcome ContentfulSpeedIndex macosx1015-64-shippable-qr cold fission webrender 1,257.11 -> 1,311.33

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(shughes)

Set release status flags based on info from the regressing bug 1804726

Makes sense, I'll investigate if anything can be done to recover a few percents

Flags: needinfo?(shughes)
Assignee: nobody → shughes
Status: NEW → ASSIGNED

I've been discussing with Punam and we're wondering if the added complexity in about:welcome & perf regressions are worth it for what is essentially a "nice to have" enhancement for the progress bar. (Especially since it leaves us with an extra state hook to maintain.) There are some refactors in the regressing patch that are really handy, so I'm wondering if just unhooking the animations from the forward/back state of the browser would simplify things enough to fix the perf regression. I'm going to run some raptor tests on both the regressing patch and this patch and see what the range is.

As a last resort, we could always revert back to the margin: -1 fix that was on the step indicators, and try and pull out the cleanup/refactor changes from https://phabricator.services.mozilla.com/D164310.

The perf test in question doesn't actually trigger the history transition AFAICT, so idk if removing it would save any time. The history transitions were actually just a happy side effect of adding the previousOrder state hook, which was only necessary in order to animate the progress bar (so it doesn't come into existence with its target index).

I'm not sure what the performance hit is caused by, the CSS changes or the additional state hook. But I don't think it can be the history transitions themselves, as history.pushState is not called unless we advance screens, and the perf test doesn't do that. I thought it might have just been an oversight I missed in the hook, but fixing that made no difference in the results.

I didn't really like the margin: -1px solution because it meant progress step 2 technically begins to the left of the end of step 1.

I actually think the best way to simplify this (and the way I'd probably try to approach this if starting from scratch) would be to put the progress bar in the container, not in the screen, so that it's not re-rendered between screens at all. That would spare the extra renders and eliminate the need for the previousOrder hook. Generally reduce the number of props per component too.

Another possible optimization might be to switch to using an actual <progress> element

Iteration: --- → 110.1 - Dec 12 - Dec 23
Priority: -- → P1
Severity: -- → S3

Barret and alexandrui from perftest ran a side-by-side SpeedTest comparison to see how the patch is affecting the perceived load times/responsiveness. Here's the technical explanation if anyone is curious

I was going to resolve wontfix since the regression is acceptable (appears not to represent any difference in the time until the progress bar appears or until it finishes animating). However, in the process of going over all of this with a fine-toothed comb, I do have some minor adjustments I think are worth making, not for the ContentfulSpeedIndex but for general performance and neatness.

The latest revision of the patch (D164927) can be scrapped I think — it does improve CSI, because it stops React from reinserting DOM elements that may not need to be reinserted for every screen. But that change has some costs I don't think are worth paying, like needing to key the screen's child elements that do need to be reinserted, which is generally frowned upon.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → NEW
Attachment #9308754 - Attachment is obsolete: true

Avoid an unnecessary hook trigger and clean up the handling of the start
screen behavior.

(In reply to Shane Hughes [:aminomancer] from comment #6)

Created attachment 9308993 [details]
SpeedIndex comparison (parent on the left, regressor on the right)

Barret and alexandrui from perftest ran a side-by-side SpeedTest comparison to see how the patch is affecting the perceived load times/responsiveness. Here's the technical explanation if anyone is curious

@aminomancer do we know if these numbers are for warm or cold page load? Regressing patch has quite a bit of changes that touches css and states, do we know what could have caused the regression , will be good to capture those learnings thanks

(In reply to Punam Dahiya [:pdahiya] from comment #9)

@aminomancer do we know if these numbers are for warm or cold page load? Regressing patch has quite a bit of changes that touches css and states, do we know what could have caused the regression , will be good to capture those learnings thanks

Good question - the video was produced with the side-by-side tool with the test browsertime-first-install-firefox-welcome, which is a subtest of browsertime-first-install, which runs in chimera mode, meaning it does both warm and cold. In my testing with the normal browsertime parameter, it opens the app and tests about:welcome within that session 25 times closes the app, and repeats 25 times. So I think it's effectively 25 cold runs and 600 warm runs.

Anyway, my takeaway from both local testing and testing on try was that the difference between before/after is the same for warm and cold. Which makes sense now that I have read up on this metric. My understanding from the documentation is that the only distinction between warm and cold pageload variants is that the browser is restarted between each successive iteration in the cold variant, while the warm variant just opens new tabs (so either way, the aboutwelcome content starts fresh). I inferred that we didn't get any other alerts (loadtime for example) since the changes don't meaningfully impact I/O throughput, being pretty modest in terms of bytes.

The real mystery is why ContentfulSpeedIndex is increased while PerceptualSpeedIndex is not. The test is set up to alert on PerceptualSpeedIndex. ContentfulSpeedIndex is supposed to tell you when all the text is visible, while PerceptualSpeedIndex is supposed to tell you when visual progress stops. So how can you cause the text to appear later without causing visual changes at the same time that would extend the visual progress curve? Looking at the raw numbers I thought perhaps it was some kind of delay in Fluent translation. But the video shows that nothing is noticeably different. It's pretty intriguing

What is it from the regressing bug that is causing this regression? Is it something with the additional useState or useEffect or the new component or css changes? And is the attached patch trying to address it and how or it's just some cleanup that shouldn't affect it?

(In reply to Ed Lee :Mardak from comment #11)

What is it from the regressing bug that is causing this regression? Is it something with the additional useState or useEffect or the new component or css changes?

I'll post an update when I have figured it out. React hooks are the most plausible. The DOM and CSS changes should actually improve performance, that was one of the motivations for consolidating the progress "steps" into a single bar element. It means we have only 1 element instead of multiple, and more importantly means we can animate width instead of transform, which should be better in pretty much every way. And it does slightly improve the other speed indices. But this one particular ContentfulSpeedIndex is 4-6% higher. It doesn't necessarily mean 4-6% slower, and the video seems to indicate that it's not slower. Based on my understanding of this metric, it could simply mean that there are 4-6% more frames with text on them. Which might just mean that it's stuttering less.

If the text really is coming in later than before, in some way that doesn't affect the other indices like PerceptualSpeedIndex, then it has to be due to the React hooks, as the CSS and DOM changes only affect elements with no text content. The issue with the additional state/effect hooks is that they should already be executed by the first paint. Which as you can see from the video, finishes at the same time in both instances. Not 4-6% later at least. However...

And is the attached patch trying to address it and how or it's just some cleanup that shouldn't affect it?

I initially thought the attached patch shouldn't impact the ContentfulSpeedIndex, it might have some minuscule positive performance impact on navigations, but really it's just cleaning up some unnecessary stuff. But now that I think about it, in the regressing patch, I moved the effect hook that implements the history navigation behavior. I had to move it below the variables it depends on.

But that means it comes after a couple of other hooks. And this may be important because, in addition to setting up the popstate listeners that implement navigation from within the page, it also set the starting screen index. Which is necessary so user can navigate back from about:home to about:welcome and land on the last screen. Moving this lower might mean that the screen index is updated from history a bit later than before, and that the first transition (from nothing to screen 1) is triggered a bit earlier than before. For whatever that's worth...

One problem with this hook was that it updates the screen index no matter what, whether we have an index stored in history state or not. In fact, it doesn't need to update the screen index at all. We already set the screen index based on props.startScreen when instantiating the state. I didn't realize this at first, but in my attached patch, I added a check so that we only set the screen index in the hook if there is a history state. Which there isn't, on the first screen. So this may actually help. Of course, this is entirely based on the assumption that the order of React hooks somehow affects the rendering of text without affecting the visual progress (which would show up in PerceptualSpeedIndex, which isn't alerting).

(In reply to Shane Hughes [:aminomancer] from comment #12)

more importantly means we can animate width instead of transform, which should be better in pretty much every way

Why do you say that? The original transitions were carefully selected to be performant in that transforms can GPU accelerated.

(In reply to Ed Lee :Mardak from comment #13)

Why do you say that? The original transitions were carefully selected to be performant in that transforms can GPU accelerated.

Hmm, do you mean transitions for something else? As far as I know we only had transform transitions on the progress bar for about a week. Prior to bug 1794702, they had a kind of opacity transition where the whole bar would fade in and out between each screen. My original plan was to animate each segment individually from 0 to 100% x scale, which was live for a week, but then bug 1804098 was reported. So I gave each progress segment a -1px margin to band-aid that issue — which is not a terrible solution all things considered, it has only a very tiny effect on the animation smoothness. That maybe could even be mitigated with the right cubic-bezier timing function, but I preferred to avoid transform entirely if possible.

Idk exactly how this works technically, but transform does subpixel rounding differently somehow, which in my experience may or may not be visually discernible depending on the css to device pixel ratio (bug 1804098 being one example). Sometimes the anti-aliasing artifacts interface poorly with normal layout, e.g. you get a 1px semitransparent edge overlapping the parent element. Most importantly, when transform is animated from no transform to any transform, there's often an awkward jump at the beginning/end of the transition, I think also depending on css to device pixel ratio.

In this case when the progress bar stopped transitioning, it would jump from having anti-aliased edges to being a regular, perfect rectangle consistent with the rest of the layout. IIRC it was much worse in RTL, because we have these transform: rotateY(180deg) rules on the body and the screen. I think the one on the body is intended to keep the noodles bidirectionally consistent, and the one on the screen is intended to "cancel out" the one on the body so the text isn't flipped. And at a glance they do cancel out, but it's equivalent to just saying transform: rotateY(0deg), i.e. not the same as transform: none (I confirmed this distinction is actually documented in the spec, not an aberration).

Maybe there are others, but my own personal workaround for this is to make sure the element always has a transform value, I suspect that works because it means it never "jumps" from the transform coordinate system to the normal layout. So in the first patch I gave the progress bar scale: 1 (baseline) to resolve that, but then it's always applied, so now the 1px lines between the segments are no longer just subpixel rounding discrepancies that exist during the transition but go away when it ends, now they're basically permanent. So the negative margins become necessary.

I opted for transform at the time because animating width on the individual segments would cause them all to flex, all the layout changes would be slow or require awkward CSS to prevent. But removing all the segments would allow me to use width, which is more consistent with how it's done elsewhere in the browser, it's simpler, it doesn't result in the creation of a transform context, no subpixel rounding issues at all, the bar always perfectly aligns with everything laid out adjacently, etc.

All that said, most of the issues with transform are a result of there being multiple segments. With only one segment, it seems less problematic. Should be possible to simply animate the transform property and do transform: scaleX(${ratio}) instead of width: ${ratio * 100}%. With the exception of the 1px layout gaps, the rendering issues I've mentioned are quite subtle, so most people probably won't notice them. It might be worth testing but I'd be surprised if that's our culprit, because if transitioning the width property itself is actually less performant, I believe we should be getting an alert on PerceptualSpeedIndex, not ContentfulSpeedIndex (I could be wrong, that's just my takeaway from reading the documentation on browsertime)

Pushed by shughes@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/049cca6cd5d7 Fix oversights in about:welcome progress bar. r=emcminn

(In reply to Punam Dahiya [:pdahiya] from comment #9)

(In reply to Shane Hughes [:aminomancer] from comment #6)

Created attachment 9308993 [details]
SpeedIndex comparison (parent on the left, regressor on the right)

Barret and alexandrui from perftest ran a side-by-side SpeedTest comparison to see how the patch is affecting the perceived load times/responsiveness. Here's the technical explanation if anyone is curious

@aminomancer do we know if these numbers are for warm or cold page load? Regressing patch has quite a bit of changes that touches css and states, do we know what could have caused the regression , will be good to capture those learnings thanks

The table in comment 0 indicates which regressions were on cold page load vs warm page load

The regression is caused by the animation. ContentfulSpeedIndex scores are exacerbated by animations, due to its implementation.

ContentfulSpeedIndex (or CSI) is our own metric based on top of SpeedIndex. For SpeedIndex, we capture video output and compute "visual complete %" for each frame. Then we generate a graph and look at the area above the curve, which is an approximation of how close the page is to its finished rendering vs the time it takes to load. The mdn article has a graph demonstating this.

For CSI, we take the video we would run SpeedIndex on and run an edge-detection algorithm on each frame. Both images and text should be highlighted by edge detection, so CSI gives an approximation of how much "content" (text/images) is close to its final render (with edge detection).

If a page has no animation and the text renders top-to-bottom, then the edge detection will pick out the text on the page as it renders and the %VisuallyComplete will increase linearly. However, if there is an animation, text and images will be moving around the screen for the duration of the animation and %VisuallyComplete will not progress linearly during that time. Instead, it will likely hang around a lower bound and then it will jump up once the animation is complete. Because it was not increasing during that period, the CSI will be higher becuase there is more area. That is why we are seeing a regression here after introducing the animation.

If you take a look at the video in comment 6, we can see that the actual visual load characteristics are not impacted and that the SpeedIndex is largely unchanged.

Status: NEW → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 110 Branch
See Also: → 1807970
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: