Closed Bug 1510400 Opened 4 years ago Closed 4 years ago
Promote new tp6-3, 4, 5, and 6 tests to tier 1
Once we're satisfied with the results for the tests from bug 1503990 we should promote them to tier 1.
A few things to check/do:
- Review how the new test suites (tp6-3, tp6-4, tp6-5, and tp6-6) have been performing as tier 2 on mozilla-central. Are they consistently green, or are there issues to solve first (e.g. frequent intermittent failures)?
- Does the data look consistent and stable (how noisy is it compared to the other tp6* suites)?
- Trigger each new test suite with --gecko-profile and ensure the profiling job works and the resulting profiles can be opened in perf-html.io from the link in Treeherder.
- Update the Raptor wiki with the test details (Bug 1505788).
- Add the raw HTML pages (from playing back each new mitmproxy recording) to the perf-automation repo (also Bug 1505788).
Assignee: nobody → rwood
Status: NEW → ASSIGNED
Triggered gecko-profiling jobs on Raptor tp6-3,4,5, and 6 on a central rev here: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=raptor%2Ctp6&revision=d99bf39f5223abe554a629a947eee344c3e9e29e
I'd like to land a fix for intermittent Bug 1502032 first; it is just a "[taskcluster:error] Task aborted - max run time exceeded" failure (the overall job time needs to be increased). Also note that these new jobs (and the existing tp6-1 and tp6-2 jobs) fail on Google Chrome linux64. This is a known issue (Bug 1495903), but we always leave Raptor Google Chrome jobs as tier 2 anyway, so that won't prevent us from moving to tier 1 for Firefox.
Depends on: 1502032
Looking at data noise for these new Raptor tp6-3, 4, 5, and 6 jobs on mozilla-central: Bing (tp6-4) and Reddit (tp6-6) seem to fluctuate the most, at least from the limited numbers I looked at. :jmaher, can you please remind me how I can get the noise factor for specific test jobs? I want to be sure Bing and Reddit aren't too noisy for automated regression detection. If I do a perfherder compare of one mozilla-central job with others from the last 2 days (perfherder default) and narrow it down to 'raptor-tp6-bing-firefox' there's no noise factor available. Maybe this isn't the right method... https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=mozilla-central&newRevision=d99bf39f5223abe554a629a947eee344c3e9e29e&framework=10&filter=raptor-tp6-bing-firefox&selectedTimeRange=172800
I compare against the same build, typically with 6+ data points for the revision. I have retriggered for this build: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=tp6%2Cfirefox&group_state=expanded&revision=a89d378954538f7fe0cad49681b409e80c3f8a0f&selectedJob=214657334 and in the near future we will see data to look at here: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=a89d378954538f7fe0cad49681b409e80c3f8a0f&newProject=mozilla-central&newRevision=a89d378954538f7fe0cad49681b409e80c3f8a0f&framework=1
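For reference, the idea of judging noise from several retriggers of one revision can be sketched as a relative standard deviation over the replicate values. This is only an illustration: the helper name `relative_stddev` and the sample numbers are hypothetical, and it is not how Perfherder itself computes the figures in its compare view.

```python
from statistics import mean, stdev


def relative_stddev(values):
    """Return the sample standard deviation as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two data points")
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean is zero; relative deviation is undefined")
    return stdev(values) / avg * 100.0


# Hypothetical replicate values (ms) from six retriggers of one job on
# one revision. These are made-up numbers, not real tp6 results.
replicates = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0]
print(f"{relative_stddev(replicates):.2f}%")  # prints "1.41%"
```

A suite whose replicates give a much larger percentage than the other tp6 suites on the same build would be the one to look at more closely before relying on automated regression detection.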
Thanks :jmaher for the compare view link! https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=a89d378954538f7fe0cad49681b409e80c3f8a0f&newProject=mozilla-central&newRevision=a89d378954538f7fe0cad49681b409e80c3f8a0f&framework=10&filter=tp6&showOnlyComparable=1 I don't see anything too concerning: there are a few high standard deviations on some configurations, but they mostly look like outliers to me. I think we should go ahead and upgrade these to tier 1.
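The same-revision compare links used in this thread follow a regular pattern, so building one for another revision can be sketched as below. The function name `same_revision_compare_url` is hypothetical; the query parameter names and base URL are copied from the links above, and that URL layout may have changed since this bug was filed.

```python
from urllib.parse import urlencode

# Base of the compare view, as it appears in the links in this bug.
PERFHERDER_COMPARE = "https://treeherder.mozilla.org/perf.html#/compare"


def same_revision_compare_url(project, revision, framework, test_filter=None):
    """Build a compare link that compares a revision against itself."""
    params = [
        ("originalProject", project),
        ("originalRevision", revision),
        ("newProject", project),
        ("newRevision", revision),
        ("framework", framework),
    ]
    if test_filter:
        # Optional substring filter on test names, e.g. "tp6".
        params.append(("filter", test_filter))
    params.append(("showOnlyComparable", 1))
    return f"{PERFHERDER_COMPARE}?{urlencode(params)}"


url = same_revision_compare_url(
    "mozilla-central",
    "a89d378954538f7fe0cad49681b409e80c3f8a0f",
    framework=10,
    test_filter="tp6",
)
print(url)
```

With the arguments shown, this reproduces the compare link quoted in the comment above.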
Summary: Promote tp6 tests to tier 1 → Promote new tp6-3, 4, 5, and 6 tests to tier 1
Pushed by email@example.com: https://hg.mozilla.org/integration/autoland/rev/a2d7fd0f5139 Promote new tp6-3, 4, 5, and 6 tests to tier 1; r=jmaher