Closed Bug 1821791 Opened 2 years ago Closed 2 years ago

Use kmeans instead of gaussian filter in extra-summary-methods

Categories

(Testing :: Raptor, task, P1)

Tracking

(firefox113 fixed)

RESOLVED FIXED
113 Branch
Tracking Status
firefox113 --- fixed

People

(Reporter: sparky, Assigned: sparky)

References

(Regressed 1 open bug)

Details

Attachments

(2 files)

Our existing filtering method, which uses a gaussian filter, isn't working as intended. It's not exactly a bug, but when there are too many outliers that we want to remove, they increase the std. dev. along with the mean. This makes it difficult to remove the outlier points.
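To make the problem concrete, here is a minimal sketch using the cold `loadtime` replicates from the PERFHERDER_DATA output in this bug. The `gaussian_filter` helper and its 2-sigma cutoff are assumptions for illustration only; the actual rule used by the existing filter is not stated here.

```python
import statistics

# "loadtime" replicates copied from the cold pageload suite in this bug's
# PERFHERDER_DATA output.
loadtime = [
    51986, 3051, 3054, 1034, 51716, 42369, 2949, 1047, 1268, 597,
    3203, 507, 1064, 51343, 4225, 42624, 1260, 1057, 997, 51822,
    1307, 1081, 1020, 1085, 2906,
]


def gaussian_filter(values, n_sigma=2.0):
    """Keep points within n_sigma std. devs. of the mean.

    Hypothetical helper: the 2-sigma cutoff is an assumed threshold,
    not necessarily what the existing filter uses.
    """
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [v for v in values if abs(v - mu) <= n_sigma * sigma]


kept = gaussian_filter(loadtime)
print(f"mean={statistics.mean(loadtime):.2f}")
print(f"stdev={statistics.pstdev(loadtime):.2f}")
print(f"kept {len(kept)} of {len(loadtime)} points")
```

Under that assumed cutoff, all 25 points are kept: the six slow replicates pull the mean up to 12982.88 and inflate the std. dev. enough that even the largest value stays inside the 2-sigma band.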

Instead, given that we're almost always dealing with multi-modal data in some way, we should use a k-means filter that searches for 2 clusters. This way, we'll always be able to remove at least one of the offending modes, as long as it doesn't take up too much of the data and the difference between the modes is large enough.

Example of this issue: https://treeherder.mozilla.org/jobs?repo=autoland&revision=d59b76766f0dca1e3e1fb4227a5110b1f4f58f11&group_state=expanded&selectedTaskRun=Oz-BqKwzQz2HyUzDD_Ls3A.0

Running locally using k-means:

$ ./mach raptor -t wikia --browsertime-existing-results "/home/sparky/Downloads/browsertime-results(49)/browsertime-results" --browsertime-visualmetrics --extra-summary-methods geomean --chimera

...

21:24:11     INFO -  loadtime (geomean): Filtering out 6 data points found in minor_group of data with mean 48643.333333333336 vs. 1721.6842105263158 in major group
21:24:11     INFO -  cpuTime (geomean): Filtering out 6 data points found in minor_group of data with mean 27647.166666666668 vs. 12903.631578947368 in major group
21:24:11     INFO -  LastVisualChange (geomean): Filtering out 6 data points found in minor_group of data with mean 14460.0 vs. 4048.4210526315787 in major group
21:24:11     INFO -  SpeedIndex (geomean): Filtering out 6 data points found in minor_group of data with mean 4631.333333333333 vs. 1225.7894736842106 in major group
21:24:11     INFO -  PerceptualSpeedIndex (geomean): Filtering out 6 data points found in minor_group of data with mean 4416.5 vs. 1181.0526315789473 in major group


21:24:11     INFO -  perftest-output Info: PERFHERDER_DATA: {"framework": {"name": "browsertime"}, "suites": [{"name": "wikia", "type": "pageload", "extraOptions": ["fission", "cold", "webrender"], "tags": ["fission", "cold", "webrender"], "lowerIsBetter": true, "unit": "ms", "alertThreshold": 2.0, "subtests": [{"name": "ContentfulSpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [2047, 1937, 2399, 1930, 1882, 1967, 2006, 2015, 1818, 1749, 1810, 1586, 1785, 2549, 1956, 1943, 1951, 1960, 1970, 2525, 1951, 1992, 1993, 2065, 2067], "value": 1960}, {"name": "ContentfulSpeedIndex (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [2047, 1937, 2399, 1930, 1882, 1967, 2006, 2015, 1818, 1749, 1810, 1586, 1785, 2549, 1956, 1943, 1951, 1960, 1970, 2525, 1951, 1992, 1993, 2065, 2067], "value": 1983.5}, {"name": "FirstVisualChange", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [400, 240, 240, 240, 240, 240, 240, 240, 200, 240, 240, 240, 240, 200, 320, 280, 240, 240, 240, 240, 240, 240, 240, 360, 240], "value": 240}, {"name": "FirstVisualChange (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [400, 240, 240, 240, 240, 240, 240, 240, 200, 240, 240, 240, 240, 200, 320, 280, 240, 240, 240, 240, 240, 240, 240, 360, 240], "value": 249.7}, {"name": "LastVisualChange", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [15160, 4320, 5200, 4080, 14720, 14960, 4160, 4160, 3840, 3600, 3840, 3200, 3720, 14400, 4160, 13920, 4000, 4080, 4080, 13600, 3960, 4120, 4040, 4200, 4160], "value": 4160}, {"name": "LastVisualChange (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [15160, 4320, 5200, 4080, 14720, 14960, 4160, 4160, 3840, 3600, 3840, 3200, 3720, 14400, 4160, 
13920, 4000, 4080, 4080, 13600, 3960, 4120, 4040, 4200, 4160], "value": 4032.0}, {"name": "PerceptualSpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [4500, 1383, 1524, 1186, 4345, 4394, 1184, 1196, 1054, 1019, 1139, 929, 1090, 4646, 1255, 4218, 1138, 1177, 1184, 4396, 1120, 1203, 1184, 1276, 1199], "value": 1196}, {"name": "PerceptualSpeedIndex (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [4500, 1383, 1524, 1186, 4345, 4394, 1184, 1196, 1054, 1019, 1139, 929, 1090, 4646, 1255, 4218, 1138, 1177, 1184, 4396, 1120, 1203, 1184, 1276, 1199], "value": 1174.7}, {"name": "SpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [4776, 1187, 1375, 1244, 4615, 4728, 1313, 1256, 1164, 1115, 1163, 1016, 1129, 4853, 1266, 4478, 1237, 1224, 1236, 4338, 1215, 1253, 1247, 1318, 1332], "value": 1253}, {"name": "SpeedIndex (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [4776, 1187, 1375, 1244, 4615, 4728, 1313, 1256, 1164, 1115, 1163, 1016, 1129, 4853, 1266, 4478, 1237, 1224, 1236, 4338, 1215, 1253, 1247, 1318, 1332], "value": 1222.9}, {"name": "cpuTime", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [28410, 13485, 18855, 12904, 28233, 26317, 11694, 13046, 11700, 10925, 13357, 11686, 13012, 27600, 14100, 27394, 11637, 12638, 13073, 27929, 11604, 13034, 12986, 13346, 12087], "value": 13046}, {"name": "cpuTime (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [28410, 13485, 18855, 12904, 28233, 26317, 11694, 13046, 11700, 10925, 13357, 11686, 13012, 27600, 14100, 27394, 11637, 12638, 13073, 27929, 11604, 13034, 12986, 13346, 12087], "value": 12816.7}, {"name": "dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": 
false, "replicates": [565, 389, 383, 377, 380, 412, 409, 389, 490, 494, 477, 243, 433, 294, 363, 377, 415, 537, 499, 353, 401, 458, 388, 425, 416], "value": 405.0}, {"name": "dcf (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [565, 389, 383, 377, 380, 412, 409, 389, 490, 494, 477, 243, 433, 294, 363, 377, 415, 537, 499, 353, 401, 458, 388, 425, 416], "value": 408.6}, {"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [362, 203, 142, 193, 181, 195, 205, 190, 188, 173, 193, 168, 191, 176, 184, 192, 193, 202, 195, 176, 189, 190, 196, 197, 199], "value": 191.5}, {"name": "fcp (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [362, 203, 142, 193, 181, 195, 205, 190, 188, 173, 193, 168, 191, 176, 184, 192, 193, 202, 195, 176, 189, 190, 196, 197, 199], "value": 192.4}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [363, 204, 178, 194, 182, 196, 206, 191, 189, 174, 194, 169, 192, 177, 188, 193, 194, 203, 196, 176, 190, 191, 198, 198, 200], "value": 192.5}, {"name": "fnbpaint (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [363, 204, 178, 194, 182, 196, 206, 191, 189, 174, 194, 169, 192, 177, 188, 193, 194, 203, 196, 176, 190, 191, 198, 198, 200], "value": 195.3}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [51986, 3051, 3054, 1034, 51716, 42369, 2949, 1047, 1268, 597, 3203, 507, 1064, 51343, 4225, 42624, 1260, 1057, 997, 51822, 1307, 1081, 1020, 1085, 2906], "value": 1287.5}, {"name": "loadtime (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [51986, 3051, 3054, 1034, 51716, 42369, 2949, 1047, 1268, 597, 3203, 507, 1064, 51343, 4225, 
42624, 1260, 1057, 997, 51822, 1307, 1081, 1020, 1085, 2906], "value": 1438.0}]}, {"name": "wikia", "type": "pageload", "extraOptions": ["fission", "webrender", "warm"], "tags": ["fission", "webrender", "warm"], "lowerIsBetter": true, "unit": "ms", "alertThreshold": 2.0, "subtests": [{"name": "ContentfulSpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [661, 600, 584, 644, 655, 712, 614, 624, 651, 568, 652, 620, 614, 669, 611, 596, 647, 662, 715, 654, 601, 611, 653, 650, 652], "value": 647}, {"name": "ContentfulSpeedIndex (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [661, 600, 584, 644, 655, 712, 614, 624, 651, 568, 652, 620, 614, 669, 611, 596, 647, 662, 715, 654, 601, 611, 653, 650, 652], "value": 635.8}, {"name": "FirstVisualChange", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [120, 80, 120, 160, 120, 160, 120, 120, 120, 120, 120, 160, 160, 160, 80, 120, 160, 160, 160, 160, 120, 120, 120, 120, 120], "value": 120}, {"name": "FirstVisualChange (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [120, 80, 120, 160, 120, 160, 120, 120, 120, 120, 120, 160, 160, 160, 80, 120, 160, 160, 160, 160, 120, 120, 120, 120, 120], "value": 128.9}, {"name": "LastVisualChange", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [1280, 1200, 1120, 1200, 1240, 1360, 1160, 1200, 1240, 1080, 1280, 1120, 1120, 1240, 1200, 1120, 1200, 1240, 1360, 1200, 1160, 1200, 1280, 1240, 1280], "value": 1200}, {"name": "LastVisualChange (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [1280, 1200, 1120, 1200, 1240, 1360, 1160, 1200, 1240, 1080, 1280, 1120, 1120, 1240, 1200, 1120, 1200, 1240, 1360, 1200, 1160, 1200, 1280, 1240, 1280], "value": 1210.8}, {"name": 
"PerceptualSpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [399, 346, 361, 411, 393, 444, 377, 379, 390, 354, 393, 397, 396, 419, 350, 365, 412, 416, 442, 414, 366, 372, 394, 390, 399], "value": 393}, {"name": "PerceptualSpeedIndex (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [399, 346, 361, 411, 393, 444, 377, 379, 390, 354, 393, 397, 396, 419, 350, 365, 412, 416, 442, 414, 366, 372, 394, 390, 399], "value": 390.3}, {"name": "SpeedIndex", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [429, 375, 387, 439, 423, 478, 405, 407, 421, 382, 424, 419, 421, 450, 383, 393, 442, 448, 475, 442, 398, 404, 427, 414, 431], "value": 421}, {"name": "SpeedIndex (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [429, 375, 387, 439, 423, 478, 405, 407, 421, 382, 424, 419, 421, 450, 383, 393, 442, 448, 475, 442, 398, 404, 427, 414, 431], "value": 419.9}, {"name": "cpuTime", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [8404, 9307, 9330, 8625, 9003, 8810, 8635, 8660, 8529, 9354, 8695, 9316, 9106, 8764, 8414, 9636, 8725, 8802, 8740, 9513, 9245, 8861, 8518, 8775, 8818], "value": 8802}, {"name": "cpuTime (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [8404, 9307, 9330, 8625, 9003, 8810, 8635, 8660, 8529, 9354, 8695, 9316, 9106, 8764, 8414, 9636, 8725, 8802, 8740, 9513, 9245, 8861, 8518, 8775, 8818], "value": 8896.9}, {"name": "dcf", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [141, 147, 146, 175, 139, 119, 153, 174, 196, 154, 178, 141, 136, 180, 151, 144, 142, 157, 167, 141, 145, 194, 152, 193, 148], "value": 151.5}, {"name": "dcf (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": 
"ms", "shouldAlert": false, "replicates": [141, 147, 146, 175, 139, 119, 153, 174, 196, 154, 178, 141, 136, 180, 151, 144, 142, 157, 167, 141, 145, 194, 152, 193, 148], "value": 155.3}, {"name": "fcp", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [74, 69, 63, 81, 71, 79, 64, 73, 73, 76, 67, 77, 73, 78, 77, 77, 76, 75, 76, 76, 79, 70, 77, 74, 63], "value": 75.5}, {"name": "fcp (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [74, 69, 63, 81, 71, 79, 64, 73, 73, 76, 67, 77, 73, 78, 77, 77, 76, 75, 76, 76, 79, 70, 77, 74, 63], "value": 73.4}, {"name": "fnbpaint", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [106, 104, 102, 118, 109, 118, 106, 107, 104, 108, 105, 109, 104, 108, 113, 113, 109, 108, 116, 107, 115, 107, 107, 104, 103], "value": 107.5}, {"name": "fnbpaint (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [106, 104, 102, 118, 109, 118, 106, 107, 104, 108, 105, 109, 104, 108, 113, 113, 109, 108, 116, 107, 115, 107, 107, 104, 103], "value": 108.3}, {"name": "loadtime", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [225, 218, 263, 262, 228, 236, 224, 266, 224, 272, 262, 270, 265, 221, 225, 301, 224, 271, 225, 226, 259, 223, 227, 227, 225], "value": 227.5}, {"name": "loadtime (geomean)", "lowerIsBetter": true, "alertThreshold": 2.0, "unit": "ms", "shouldAlert": false, "replicates": [225, 218, 263, 262, 228, 236, 224, 266, 224, 272, 262, 270, 265, 221, 225, 301, 224, 271, 225, 226, 259, 223, 227, 227, 225], "value": 241.7}]}], "application": {"name": "firefox", "version": "112.0a1"}}

This patch changes the filtering method from a gaussian filter to a k-means filter that should be more suitable to our needs. See this bug comment: https://bugzilla.mozilla.org/show_bug.cgi?id=1821791#c0

With kmeans from scipy, we tell it to search for 2 groups. From there, we check whether one of the groups comprises no more than 40% of the total size. If there is such a group, we check whether the difference in the means is 200%. If it is, we throw out the group that has the least amount of data in it.
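The steps above can be sketched roughly as follows. This is not the patch itself: to stay dependency-free it swaps scipy's kmeans for a small hand-written 1-D two-means (Lloyd's algorithm seeded at the min and max), and the names `two_means_1d`, `kmeans_filter`, `max_minor_share`, and `min_mean_ratio` are hypothetical. It also assumes "200%" means the larger mean is at least 2x the smaller one, which is consistent with the cpuTime line in the log above (27647 vs. 12903).

```python
from statistics import mean


def two_means_1d(values, max_iter=100):
    """Label each value 0 or 1 using Lloyd's algorithm with k=2.

    Stand-in for scipy's kmeans; centers start at the min and max.
    """
    lo, hi = float(min(values)), float(max(values))
    labels = [0] * len(values)
    for _ in range(max_iter):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if not group1:  # all values are identical
            break
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        new_lo, new_hi = mean(group0), mean(group1)
        if (new_lo, new_hi) == (lo, hi):  # converged
            break
        lo, hi = new_lo, new_hi
    return labels


def kmeans_filter(values, max_minor_share=0.40, min_mean_ratio=2.0):
    """Drop the smaller group if it is small enough and far enough away."""
    labels = two_means_1d(values)
    groups = [[v for v, lab in zip(values, labels) if lab == k] for k in (0, 1)]
    minor, major = sorted(groups, key=len)
    if not minor or len(minor) / len(values) > max_minor_share:
        return list(values)
    small, large = sorted((mean(minor), mean(major)))
    if small <= 0 or large / small < min_mean_ratio:
        return list(values)
    return major


# Cold "loadtime" replicates from this bug's PERFHERDER_DATA output.
loadtime = [
    51986, 3051, 3054, 1034, 51716, 42369, 2949, 1047, 1268, 597,
    3203, 507, 1064, 51343, 4225, 42624, 1260, 1057, 997, 51822,
    1307, 1081, 1020, 1085, 2906,
]
filtered = kmeans_filter(loadtime)
print(f"removed {len(loadtime) - len(filtered)} points; "
      f"major group mean is {mean(filtered)}")
```

On this data the sketch removes the same 6 points reported in the loadtime log line, leaving 19 points with a mean of about 1721.68. Data whose two groups have means less than 2x apart, such as the warm `loadtime` replicates, is returned unchanged.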

This fixes an issue where outliers that skewed the standard deviation and the mean too much would prevent us from removing them.

Assignee: nobody → gmierz2
Status: NEW → ASSIGNED
Pushed by gmierz2@outlook.com:
https://hg.mozilla.org/integration/autoland/rev/f7da71e7b1f9
Use kmeans filter in raptor extra-summary-methods. r=AlexandruIonescu,perftest-reviewers
https://hg.mozilla.org/integration/autoland/rev/3363ac0afc0d
Remove the extra summary metrics using mean. r=perftest-reviewers,AlexandruIonescu
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 113 Branch