Open Bug 1689373 Opened 3 years ago Updated 6 months ago

[meta] Provide plan for using Mann-Whitney-U alongside student's t-test

Categories

(Tree Management :: Perfherder, task)

Tracking

(Not tracked)

People

(Reporter: igoldan, Unassigned)

References

(Depends on 2 open bugs, Blocks 1 open bug)

Details

(Keywords: meta)

All our tests currently use Student's t-test to generate performance alerts. For our data this algorithm is generally inferior to Mann-Whitney-U (MWU), in that it cannot reliably alert on data that is noisy, multimodal and/or has outliers.
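
To illustrate the outlier case, here is a minimal sketch in plain Python. The data is synthetic, the function names (`welch_t`, `mann_whitney_u`, `mwu_z`) are made up for this example, and none of it is Perfherder's actual code. It assumes Welch's variant of the t-test and a normal approximation for MWU without tie correction, both of which are simplifications.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs (x, y) with x > y count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def mwu_z(a, b):
    """z-score of U under the normal approximation (no tie correction)."""
    n1, n2 = len(a), len(b)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (mann_whitney_u(a, b) - mu) / sigma


# Synthetic replicates: "after" is uniformly lower (better) than "before".
before = list(range(110, 120))
clean_after = list(range(100, 110))
# Same improvement, but one replicate is replaced by an extreme outlier.
noisy_after = clean_after[:-1] + [500]

for label, after in (("clean", clean_after), ("with outlier", noisy_after)):
    print(label, "t =", round(welch_t(before, after), 2),
          "z(U) =", round(mwu_z(before, after), 2))
```

A single extreme replicate inflates the variance and drags the mean, so the t statistic collapses. The rank-based U statistic only changes by the pairs that involve that one point.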

The downside of fully enabling MWU is that it will generate far more alerts than our perf sheriffs are able to investigate.

Let's come up with a plan for integrating MWU as a new alerting strategy & enable it only for Fenix tests. We should be able to run either of these strategies or both with minimal changes.
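
One possible shape for "either of these strategies or both" is a small registry of strategy functions plus a per-platform setting. This is only a sketch under stated assumptions: `STRATEGIES`, `ALERT_CONFIG`, `detect`, the platform keys and both threshold values are hypothetical and do not reflect how Perfherder is actually structured or configured.

```python
import math
from statistics import mean, variance


def t_test_alert(before, after, threshold=2.0):
    """Alert when |Welch t| exceeds the threshold (value is illustrative only)."""
    se = math.sqrt(variance(before) / len(before) + variance(after) / len(after))
    if se == 0:
        return mean(before) != mean(after)
    return abs(mean(before) - mean(after)) / se > threshold


def mwu_alert(before, after, z_threshold=1.96):
    """Alert when the normal-approximation z-score of U exceeds the threshold."""
    n1, n2 = len(before), len(after)
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0
            for x in before for y in after)
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return abs(u - n1 * n2 / 2) / sigma > z_threshold


# Hypothetical registry: adding a strategy means adding one entry here.
STRATEGIES = {"t_test": t_test_alert, "mwu": mwu_alert}

# Hypothetical per-platform setting: Fenix runs both, everything else keeps t-test.
ALERT_CONFIG = {"fenix": ["t_test", "mwu"]}
DEFAULT_STRATEGIES = ["t_test"]


def detect(platform, before, after):
    """Run every strategy configured for the platform; return name -> alert flag."""
    names = ALERT_CONFIG.get(platform, DEFAULT_STRATEGIES)
    return {name: STRATEGIES[name](before, after) for name in names}


# Synthetic data: a consistent improvement hidden behind one outlier.
before = list(range(110, 120))
after = list(range(100, 109)) + [500]
print(detect("fenix", before, after))
print(detect("desktop", before, after))
```

With this layout, switching a platform between strategies is a one-line change to the setting, and running both side by side makes it possible to compare alert volume before committing to MWU.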

Jira link: https://jira.mozilla.com/browse/FXP-1472

If this is successful, I am hopeful that it can be expanded to cover other tests including Desktop.

I don't yet know if Mann-Whitney-U would detect this change, but consider this change (a memory cache experiment):

18.84% improvement to mean loadtime over 45 runs.
But not picked up as statistically significant.

Summary: Provide plan for using Mann-Whitney-U on Fenix tests → [meta] Provide plan for using Mann-Whitney-U on Fenix tests
Keywords: meta
Summary: [meta] Provide plan for using Mann-Whitney-U on Fenix tests → [meta] Provide plan for using Mann-Whitney-U alongside student-T test
Summary: [meta] Provide plan for using Mann-Whitney-U alongside student-T test → [meta] Provide plan for using Mann-Whitney-U alongside student-T Test
Summary: [meta] Provide plan for using Mann-Whitney-U alongside student-T Test → [meta] Provide plan for using Mann-Whitney-U alongside student's t-test
Depends on: 1689584
Depends on: 1689586

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #1)

> If this is successful, I am hopeful that it can be expanded to cover other tests including Desktop.

I believe this is the intention: that each test (mobile, desktop) can have its own alert setup (either MWU or Student's t-test). I've rephrased the summary.

> I don't yet know if Mann-Whitney-U would detect this change, but consider this change (a memory cache experiment):
> 18.84% improvement to mean loadtime over 45 runs.
> But not picked up as statistically significant.

That looks like an outlier. According to :ekyle's report, MWU should be able to detect the change.
