Closed Bug 1583364 Opened 3 months ago Closed 15 days ago

Store test runtime information by manifest

Categories

(Testing :: General, task, P2)

Version 3
task

Tracking

(firefox72 fixed)

RESOLVED FIXED
mozilla72
Tracking Status
firefox72 --- fixed

People

(Reporter: ahal, Assigned: ahal)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

Attachments

(6 files)

In order to support --chunk-by-runtime, we have a local copy of test runtime data checked into the tree:
https://searchfox.org/mozilla-central/source/testing/runtimes/

But there are so many tests that the full data set is too large to store. So we take shortcuts, e.g. storing only the Nth percentile of slowest tests.

Instead, we should compute the average runtime of each test manifest. This would be a much smaller data set to store and allow us much greater accuracy.
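
To illustrate the idea, here is a minimal sketch of averaging runtimes per manifest. It assumes the raw data arrives as (manifest path, duration in seconds) pairs, one per observed run of a manifest. The function name, the input shape and the sample paths are all hypothetical; this is not the writeruntimes script.

```python
from collections import defaultdict
from statistics import mean


def average_manifest_runtimes(samples):
    """Average the observed durations for each manifest.

    `samples` is an iterable of (manifest_path, duration_seconds) pairs.
    This input shape is an assumption made for the sketch.
    """
    by_manifest = defaultdict(list)
    for manifest, duration in samples:
        by_manifest[manifest].append(duration)
    # One number per manifest, however many tests the manifest contains.
    return {m: round(mean(d), 2) for m, d in sorted(by_manifest.items())}


# Hypothetical sample data: two runs of one manifest, one run of another.
samples = [
    ("dom/base/test/mochitest.ini", 120.0),
    ("dom/base/test/mochitest.ini", 130.0),
    ("toolkit/content/tests/browser/browser.ini", 45.5),
]
runtimes = average_manifest_runtimes(samples)
```

The result holds one entry per manifest rather than one per test, which is where the size reduction comes from.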

The true motivation for filing this bug is to support bug 1583353. If the decision task is going to start performing chunking, the algorithm needs to be fast and accurate.

Priority: -- → P3
Assignee: nobody → ahal
Status: NEW → ASSIGNED
Priority: P3 → P2

Build flavors are defined in 'python/mozbuild/mozbuild/testing.py'.

This change is needed by D52729 but it's also a good way to tell which suites
are integrated into the TestManifestBackend in the build system. So I'm landing
it here instead.

Depends on D53030

The main motivation here was to gain access to the mach environment for the
future refactor.

Depends on D53698

The script should just do the thing that we want. Providing options just
increases the chance of user error. I don't see any need to specify either of
these things.

Depends on D53699

The new format will be:

{ <path/to/manifest.ini>: <average duration> }
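
As a rough sketch of what reading and writing this format could look like: a flat JSON object mapping manifest paths to average durations. The helper names and the sample paths below are hypothetical, and the real files may be organized differently.

```python
import json


def dump_runtimes(runtimes):
    """Serialize a {manifest path: average duration} mapping to JSON text."""
    return json.dumps(runtimes, indent=2, sort_keys=True)


def load_runtimes(text):
    """Parse JSON text back into a {manifest path: average duration} mapping."""
    return {str(path): float(duration) for path, duration in json.loads(text).items()}


# Hypothetical manifests and durations (in seconds).
example = {
    "dom/base/test/mochitest.ini": 125.0,
    "toolkit/content/tests/browser/browser.ini": 45.5,
}
text = dump_runtimes(example)
roundtrip = load_runtimes(text)
```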

Depends on D53700

As a side-effect this will also update runtime data for all suites using
'--chunk-by-runtime'.

Depends on D53701

Attachment #9109759 - Attachment description: Bug 1583364 - Update testing/runtimes/writeruntimes script to write info at the manifest level → Bug 1583364 - Update testing/runtimes/writeruntimes script to write info at the manifest level, r?gbrown
Attachment #9109760 - Attachment description: Bug 1583364 - Generate 'manifest-runtimes.json' and update mochitest harness to use it → Bug 1583364 - Generate 'manifest-runtimes.json' and update mochitest harness to use it, r?gbrown

Here's my try push:
https://treeherder.mozilla.org/#/jobs?repo=try&duplicate_jobs=visible&revision=0f960ea72ed73599c2debf5ae4b9bb48db936df0

While it looks like there are perma-fails, they are actually all fission tasks, which only run on mozilla-central. Those tasks look just as orange over there, so I don't think the failures are related to this change.

Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/38314b60638c
Create a 'build_flavor' key mapping 'moztest.resolve.TEST_SUITES' to their build flavor, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/20a16191cee2
Convert testing/runtimes/writeruntimes.py to a 'mach python' script, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/fd9f3064ec85
Remove ability to specify platforms/e10s in testing/runtimes/writeruntimes, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/6ce87f7cc6f8
Update testing/runtimes/writeruntimes script to write info at the manifest level, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/ed4d544f3db4
Generate 'manifest-runtimes.json' and update mochitest harness to use it, r=gbrown
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b22b8ed60c0c
[manifestparser] Fix regression to ChunkByManifest filter. r=gbrown

Backed out for making Bug 1593402 near permafail on Linux x64 debug fission.

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&selectedJob=277767523&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception&tochange=cf114f3b74940ba766a8a7e9651a84bc91768da4&searchStr=linux%2Cx64%2Cdebug%2Cmochitests%2Cwith%2Cfission%2Cenabled%2Ctest-linux64%2Fdebug-mochitest-browser-chrome-fis-e10s-4%2Cm-fis%28bc4%29&fromchange=46ac60d88588b50b77bb0f300cf78a72d60390a4

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=277767523&repo=autoland

Backout link: https://hg.mozilla.org/integration/autoland/rev/81d8250a66ad5cf3dd3a5b6a97ef88aa8598c672

[task 2019-11-23T01:37:21.814Z] 01:37:21 INFO - TEST-UNEXPECTED-FAIL | toolkit/mozapps/extensions/test/browser/browser_about_debugging_link.js | A promise chain failed to handle a rejection: this.transport is null - stack: send@resource://devtools/server/debugger-server-connection.js:89:5
[task 2019-11-23T01:37:21.814Z] 01:37:21 INFO - writeError@resource://devtools/shared/protocol/Actor.js:98:15
[task 2019-11-23T01:37:21.814Z] 01:37:21 INFO - generateRequestHandlers/</handler/</<@resource://devtools/shared/protocol/Actor.js:187:30
[task 2019-11-23T01:37:21.814Z] 01:37:21 INFO - promise callback*generateRequestHandlers/</handler/<@resource://devtools/shared/protocol/Actor.js:187:14
[task 2019-11-23T01:37:21.815Z] 01:37:21 INFO - _queueResponse@resource://devtools/shared/protocol/Actor.js:107:28
[task 2019-11-23T01:37:21.815Z] 01:37:21 INFO - handler@resource://devtools/shared/protocol/Actor.js:183:14
[task 2019-11-23T01:37:21.815Z] 01:37:21 INFO - onPacket@resource://devtools/server/debugger-server-connection.js:378:58
[task 2019-11-23T01:37:21.815Z] 01:37:21 INFO - send/<@resource://devtools/shared/transport/local-transport.js:70:25
[task 2019-11-23T01:37:21.816Z] 01:37:21 INFO - exports.makeInfallible/<@resource://devtools/shared/ThreadSafeDevToolsUtils.js:111:22
[task 2019-11-23T01:37:21.816Z] 01:37:21 INFO - DevToolsUtils.executeSoon*exports.executeSoon@resource://devtools/shared/DevToolsUtils.js:62:21
[task 2019-11-23T01:37:21.816Z] 01:37:21 INFO - send@resource://devtools/shared/transport/local-transport.js:58:21
[task 2019-11-23T01:37:21.816Z] 01:37:21 INFO - send@resource://devtools/shared/protocol/Front.js:198:30
[task 2019-11-23T01:37:21.817Z] 01:37:21 INFO - request@resource://devtools/shared/protocol/Front.js:216:10
[task 2019-11-23T01:37:21.817Z] 01:37:21 INFO - generateRequestMethods/</frontProto[name]@resource://devtools/shared/protocol/Front/FrontClassWithSpec.js:49:19
[task 2019-11-23T01:37:21.817Z] 01:37:21 INFO - getTarget/this._targetFrontPromise<@resource://devtools/shared/fronts/descriptors/process.js:74:40
[task 2019-11-23T01:37:21.817Z] 01:37:21 INFO - getTarget@resource://devtools/shared/fronts/descriptors/process.js:91:7
[task 2019-11-23T01:37:21.819Z] 01:37:21 INFO - listAllWorkers@resource://devtools/shared/fronts/root.js:107:52
[task 2019-11-23T01:37:21.819Z] 01:37:21 INFO - async*listWorkers@resource://devtools/client/aboutdebugging/src/modules/client-wrapper.js:153:36
[task 2019-11-23T01:37:21.820Z] 01:37:21 INFO - requestWorkers/<@resource://devtools/client/aboutdebugging/src/actions/debug-targets.js:300:31
[task 2019-11-23T01:37:21.820Z] 01:37:21 INFO - thunk/</<@resource://devtools/client/shared/redux/middleware/thunk.js:15:9
[task 2019-11-23T01:37:21.821Z] 01:37:21 INFO - dispatch@resource://devtools/client/shared/vendor/redux.js:755:18
[task 2019-11-23T01:37:21.821Z] 01:37:21 INFO - onWorkersUpdated@resource://devtools/client/aboutdebugging/src/middleware/debug-target-listener.js:23:11
[task 2019-11-23T01:37:21.821Z] 01:37:21 INFO - emit@resource://devtools/shared/event-emitter.js:190:24
[task 2019-11-23T01:37:21.822Z] 01:37:21 INFO - emit@resource://devtools/shared/event-emitter.js:271:18
[task 2019-11-23T01:37:21.822Z] 01:37:21 INFO - onPacket@resource://devtools/shared/protocol/Front.js:252:13
[task 2019-11-23T01:37:21.823Z] 01:37:21 INFO - onPacket@resource://devtools/shared/client/debugger-client.js:583:13
[task 2019-11-23T01:37:21.823Z] 01:37:21 INFO - send/<@resource://devtools/shared/transport/local-transport.js:70:25
[task 2019-11-23T01:37:21.823Z] 01:37:21 INFO - exports.makeInfallible/<@resource://devtools/shared/ThreadSafeDevToolsUtils.js:111:22
[task 2019-11-23T01:37:21.824Z] 01:37:21 INFO - DevToolsUtils.executeSoon*exports.executeSoon@resource://devtools/shared/DevToolsUtils.js:62:21
[task 2019-11-23T01:37:21.824Z] 01:37:21 INFO - send@resource://devtools/shared/transport/local-transport.js:58:21
[task 2019-11-23T01:37:21.825Z] 01:37:21 INFO - send@resource://devtools/server/debugger-server-connection.js:89:20
[task 2019-11-23T01:37:21.826Z] 01:37:21 INFO - onProcessListChanged@resource://devtools/server/actors/root.js:562:15
[task 2019-11-23T01:37:21.827Z] 01:37:21 INFO - observe@resource://devtools/server/actors/process.js:73:12
[task 2019-11-23T01:37:21.828Z] 01:37:21 INFO - Rejection date: Sat Nov 23 2019 01:37:11 GMT+0000 (Coordinated Universal Time) - false == true - JS frame :: resource://testing-common/PromiseTestUtils.jsm :: assertNoUncaughtRejections :: line 265
[task 2019-11-23T01:37:21.828Z] 01:37:21 INFO - Stack trace:
[task 2019-11-23T01:37:21.829Z] 01:37:21 INFO - resource://testing-common/PromiseTestUtils.jsm:assertNoUncaughtRejections:265
[task 2019-11-23T01:37:21.830Z] 01:37:21 INFO - chrome://mochikit/content/browser-test.js:Tester_execTest/<:1100
[task 2019-11-23T01:37:21.831Z] 01:37:21 INFO - chrome://mochikit/content/browser-test.js:Tester_execTest:1104
[task 2019-11-23T01:37:21.831Z] 01:37:21 INFO - chrome://mochikit/content/browser-test.js:nextTest/<:932
[task 2019-11-23T01:37:21.832Z] 01:37:21 INFO - chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<:805
[task 2019-11-23T01:37:21.835Z] 01:37:21 INFO - Leaving test bound testAboutDebugging
[task 2019-11-23T01:37:21.837Z] 01:37:21 INFO - GECKO(4114) | JavaScript error: resource://gre/actors/BrowserElementParent.jsm, line 81: TypeError: browser is null
[task 2019-11-23T01:37:21.837Z] 01:37:21 INFO - GECKO(4114) | [Parent 4114, Main Thread] WARNING: 'error.Failed()', file /builds/worker/workspace/build/src/dom/ipc/JSWindowActor.cpp, line 198
[task 2019-11-23T01:37:21.838Z] 01:37:21 INFO - GECKO(4114) | JavaScript error: , line 0: NS_ERROR_UNEXPECTED:
[task 2019-11-23T01:37:21.839Z] 01:37:21 INFO - Console message: [JavaScript Error: "TypeError: browser is null" {file: "resource://gre/actors/BrowserElementParent.jsm" line: 81}]
[task 2019-11-23T01:37:21.840Z] 01:37:21 INFO - Console message: [JavaScript Error: "NS_ERROR_UNEXPECTED: "]

Flags: needinfo?(ahal)

This change re-arranges which test manifests run in which chunks, so if it causes the test to become perma-fail then that test has some pretty bad isolation issues. For context, we restart the browser and clobber the profile between each test manifest. So if changing which test manifests run in a chunk somehow affects this test, it likely means the test depends on artifacts written to disk outside of the profile. Fwiw I did notice this failure on try, but it looked extremely frequent on central as well, so I figured it was ok.
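
To show why updating the runtime data moves manifests between chunks, here is a small sketch that balances manifests across chunks by runtime. It uses a greedy longest-first heuristic as a stand-in and is not the actual chunking code in manifestparser. The function name and manifest names are made up.

```python
import heapq


def chunk_by_runtime(runtimes, total_chunks):
    """Greedily assign manifests to chunks, slowest manifest first.

    Each manifest goes to whichever chunk currently has the smallest
    total runtime. Illustrative heuristic only.
    """
    heap = [(0.0, index) for index in range(total_chunks)]
    heapq.heapify(heap)
    chunks = [[] for _ in range(total_chunks)]
    # Sort by descending runtime, breaking ties by name for determinism.
    for manifest in sorted(runtimes, key=lambda m: (-runtimes[m], m)):
        load, index = heapq.heappop(heap)
        chunks[index].append(manifest)
        heapq.heappush(heap, (load + runtimes[manifest], index))
    return chunks


# Hypothetical runtimes before and after a data update.
old = {"a.ini": 100, "b.ini": 60, "c.ini": 50, "d.ini": 10}
new = dict(old, **{"d.ini": 70})

before = chunk_by_runtime(old, 2)  # [['a.ini', 'd.ini'], ['b.ini', 'c.ini']]
after = chunk_by_runtime(new, 2)   # [['a.ini', 'c.ini'], ['d.ini', 'b.ini']]
```

Only the runtime of d.ini changed, yet c.ini ends up sharing a chunk with a different set of manifests. A test that relies on state left behind by its chunk neighbours can start failing for that reason alone.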

This change is blocking some pretty major work (that ironically aims to improve our scheduling and intermittent story), and there's nothing to be done in this particular bug w.r.t. individual tests. So we'll need to either:

  1. Fix the test
  2. Tolerate the increased intermittent rate
  3. Disable the test

I'll add the other bug as a blocker here and needinfo you and the owners of the test so we can come to a consensus on how to proceed.

Flags: needinfo?(ahal)
Depends on: 1593402
Attachment #9110964 - Attachment is obsolete: true
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6a56539a8db9
Create a 'build_flavor' key mapping 'moztest.resolve.TEST_SUITES' to their build flavor, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/15c40a30f38e
Convert testing/runtimes/writeruntimes.py to a 'mach python' script, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/f3a314683f00
Remove ability to specify platforms/e10s in testing/runtimes/writeruntimes, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/4a9458903055
Update testing/runtimes/writeruntimes script to write info at the manifest level, r=gbrown
https://hg.mozilla.org/integration/autoland/rev/5659244be63a
Generate 'manifest-runtimes.json' and update mochitest harness to use it, r=gbrown
Attachment #9110964 - Attachment is obsolete: false
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c4bfcbd7eb22
[manifestparser] Fix regression to ChunkByManifest filter, r=gbrown