Closed Bug 1933020 Opened 1 year ago Closed 1 year ago

TART on Windows11 has become Bimodal since 13Nov2024 (should disable CFR / callouts more comprehensively in perf tests)

Categories

(Firefox :: Messaging System, defect, P1)

Desktop
All
defect
Points:
1

Tracking

()

RESOLVED FIXED
135 Branch
Iteration:
135.2 - Dec 9 - Dec 20
Tracking Status
firefox135 --- fixed

People

(Reporter: mayankleoboy1, Assigned: emcminn)

References

Details

(Whiteboard: [fxp][operational])

Attachments

(1 file)

The pushlog and perfherder suggest this started with the patch from bug 1904945.

TART is meant to test tab animation speed. The bimodality would suggest that some of the time that has become slower / jankier, probably because some background thing is taking longer and messing with the test - or potentially because something has changed tabstrip layout bits.

Emily, would you mind checking a mach try perf compare for tart on Windows between a trypush that is just autoland ref https://hg.mozilla.org/integration/autoland/rev/67040e0843c8d1b5d0dd9f6d5801f24950557889 and a backout directly on top of that ref, to see if that would actually "fix" the regression? It doesn't make a lot of sense to me that that change would impact tart but I suppose it's possible we're missing something. The trypush would help clarify if that revision actually changed something material (even if it's just something odd like pushing the size of FeatureCalloutMessages.sys.mjs over a certain threshold) or it's some other kind of fluke?

Although that is the rev pointed to by the link from comment 0, there are some flex layout changes not long before it (bug 1930561 and bug 1930427) that would be more plausible suspects if some layout thing actually got slower...

Flags: needinfo?(emcminn)

See also https://treeherder.mozilla.org/perfherder/alerts?id=42620 which was generated, but was deemed to be "due to infra changes"

(In reply to Mayank Bansal from comment #2)

See also https://treeherder.mozilla.org/perfherder/alerts?id=42620 which was generated, but was deemed to be "due to infra changes"

My apologies but "Due to infra changes" was incorrectly put by me, please dis-regard that, sorry for any confusion

Copying message from https://bugzilla.mozilla.org/show_bug.cgi?id=1931420#c5

Hey :emcminn,

Just wanted to let you know something as an FYI from our end on the perftesting team
Bug 1904945 caused a regression on the tart performance tests: https://treeherder.mozilla.org/perfherder/alerts?id=42620
This patch which limited the feature seems to have reverted the regression though: https://treeherder.mozilla.org/perfherder/alerts?id=42773&hideDwnToInv=0

This is just to say when you expand the feature you may cause a regression to tart again so just as a heads up to check the perf test before you land

Hope this saves you a future headache :)

Since it sounds like limiting the audience for the landed message fixed the regression, is it safe to close this? We're going to do further experimentation before expanding the feature again, so I can make sure any perf issues get addressed before we do that.

Flags: needinfo?(emcminn)

It looks like there is a windows issue related to bi-modality didn't go away after your patch but the mac ones did, I'm sorry for the confusion when I typed my message on Monday, I had missed the windows alert.

I think I'm pretty confused at this point. :-)

I would not expect most performance tests to run onboarding / message system notifications. They are meant to evaluate the speed of basic operations (opening windows, opening tabs, opening devtools, session restore, etc.). That includes tart.

So I think really my question is how was it possible for message system stuff to affect tart, and can we fix that more holistically (ie disable whatever is tripping these messages in the middle of performance testing, by preffing more stuff off in the default performance user profile). That would avoid future message system stuff causing perf alerts for which they're not directly responsible. We could do that here (ie morph this bug) or in a separate bug but I think we should make sure we've disentangled the two.

The flip side is that it would be nice if we caught how message system stuff affected new profile startup, performance wise. But that would probably want/need a dedicated test of sorts. I don't think we have one today - IIRC ts_paint which is meant to be about startup explicitly ignores first startup on the profile, because so much stuff has to be initialized. So I think this could be a lower-priority follow-up: to add a performance test that evaluates first startup on new profiles, including messaging system "stuff".

Does that make sense?

Flags: needinfo?(emcminn)

So the odd thing is that the message that was tripping the perf test is explicitly not a new profile/first startup message; it was targeted at existing users after previousSessionEnd (along with some other targeting criteria.) The one targeted at new profiles is what's left in tree and it doesn't seem to be causing trouble!

I'm surprised a testing profile would be meeting the targeting criteria anyway, but I suppose we could fix it by preffing off "browser.newtabpage.activity-stream.asrouter.userprefs.cfr.features" && "browser.newtabpage.activity-stream.asrouter.userprefs.cfr.addons" which are the prefs that control whether we show any feature callouts at all. (Assuming that the problem has to do with the callout actually rendering and not something about the time it takes to evaluate the targeting expression.) I don't know if that would be considered heavy handed though.

Flags: needinfo?(emcminn)
Severity: -- → S3
Priority: -- → P3

(In reply to Emily McMinn :emcminn from comment #8)

So the odd thing is that the message that was tripping the perf test is explicitly not a new profile/first startup message; it was targeted at existing users after previousSessionEnd (along with some other targeting criteria.) The one targeted at new profiles is what's left in tree and it doesn't seem to be causing trouble!

Most perf tests ignore first startup, and/or even do a startup, wait for a bit, shut down, and then re-start to do the test properly without first-startup-on-new-profile code interfering with the test results. I bet tart is the same (though haven't checked).

I'm surprised a testing profile would be meeting the targeting criteria anyway, but I suppose we could fix it by preffing off "browser.newtabpage.activity-stream.asrouter.userprefs.cfr.features" && "browser.newtabpage.activity-stream.asrouter.userprefs.cfr.addons" which are the prefs that control whether we show any feature callouts at all. (Assuming that the problem has to do with the callout actually rendering and not something about the time it takes to evaluate the targeting expression.) I don't know if that would be considered heavy handed though.

That sounds right - https://searchfox.org/mozilla-central/source/testing/profiles/common/user.js has a bunch of other cfr stuff already turned off.

Let's use this bug to disable this stuff for perf tests.

Severity: S3 → --
Component: Performance → Messaging System
OS: Unspecified → All
Priority: P3 → --
Product: Testing → Firefox
Hardware: Unspecified → Desktop
Summary: TART on Windows11 has become Bimodal since 13Nov2024 → TART on Windows11 has become Bimodal since 13Nov2024 (should disable CFR / callouts more comprehensively in perf tests)
Assignee: nobody → emcminn
Status: NEW → ASSIGNED
Attachment #9443026 - Attachment description: WIP: Bug 1933020 - Disable ASRouter CFRs for perf tests → Bug 1933020 - Disable ASRouter CFRs for perf tests
Pushed by emcminn@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e394eb174f60 Disable ASRouter CFRs for perf tests r=perftest-reviewers,sparky
Points: --- → 1
Priority: -- → P1
Iteration: --- → 135.2 - Dec 9 - Dec 20
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 135 Branch

Bimodality on the TART graphs seems to be gone now.

Whiteboard: [fxp] → [fxp][operational]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: