Bug 1232305 (autophone-Mdm)

Autophone - tracking bug for Mochitest DOM Media failures

NEW
Unassigned

Status

defect
4 years ago
2 years ago

People

(Reporter: bc, Unassigned)

Tracking

(Depends on 47 bugs, {meta})

Details

(Whiteboard: [leave open])

Attachments

(7 attachments)

Bug 1214812 fixed the unit tests and Autophone so that the unit tests can run again. We were only able to enable the Mochitest DOM Media (Mdm) test on Nexus S Android 2.3 due to failures on the other Android versions.

This bug will track the current failures so that we have a single place to follow the status on all of our devices.

These bugs are a result of two try runs:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a99d064ef982&exclusion_profile=false&selectedJob=14568918

Nexus S Android 2.3 API 10
Nexus 4 Android 4.2 API 17
Nexus 5 Android 4.4 API 19
Nexus 7 Android 4.3 API 18

https://treeherder.allizom.org/#/jobs?repo=try&exclusion_profile=false&group_state=expanded&author=bclary@mozilla.com&selectedJob=15074643

Nexus One Android 2.3 API 10
Samsung Galaxy S3 Android 4.0 API 15
Nexus 6 Android 5.1 API 22
Nexus 9 Android 5.0 API 21
Depends on: 1232308
Depends on: 1232309
Depends on: 1232312
Depends on: 1232313
Depends on: 1232318
Depends on: 1232323
Depends on: 1232334
Depends on: 1232335
Depends on: 1232336
No longer depends on: 896723
Depends on: 1272874
Depends on: 1272875
Depends on: 1272876
Depends on: 1272877
No longer depends on: 1272877
Depends on: 1272877
Alias: autophone-Mdm
Depends on: 1317362
To investigate the totality of failures, I temporarily used this patch so that the test run is not terminated when 4 timeouts are detected.
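The effect of lifting the timeout limit can be illustrated with a minimal Python sketch. This is not the Autophone or mochitest harness code: `run_tests`, `run_one`, and `max_timeouts` are hypothetical names, and the default of 4 simply mirrors the number mentioned above.

```python
from typing import Callable, Dict, Iterable, Optional


def run_tests(
    tests: Iterable[str],
    run_one: Callable[[str], str],
    max_timeouts: Optional[int] = 4,
) -> Dict[str, str]:
    """Run tests in order and record each status.

    Stops early once `max_timeouts` tests have timed out.
    Passing None disables the limit so every test is attempted.
    """
    results: Dict[str, str] = {}
    timeouts = 0
    for name in tests:
        status = run_one(name)  # hypothetical callback returning e.g. "PASS" or "TIMEOUT"
        results[name] = status
        if status == "TIMEOUT":
            timeouts += 1
            if max_timeouts is not None and timeouts >= max_timeouts:
                break  # give up on the remaining tests
    return results
```

With the limit in place, a run in which every test times out reports only the first four tests; with `max_timeouts=None` the remaining failures become visible as well, which is the point of the temporary patch.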
Posted file status.log
Running the tests locally, I found the following errors in a series of test runs where I was attempting to mark dom/media/test/mochitest.ini with the appropriate fail-if and skip-if qualifiers. Unfortunately, this does not explicitly include crashes, since they do not appear in the logs by default.

I used the following devices:

Android  Android  Device
Version  SDK
=======  =======  ============
4.0      15       samsungs-gs3
4.2      17       nexus-4
4.4      19       nexus-5
6.0      23       nexus-6p
7.1      25       pixel

The only production devices missing from this list are the nexus-6 and nexus-9.
This patch makes Mdm run green with some intermittent oranges. I attempted to make the annotations granular down to the SDK and build type level, but the intermittent nature of the timeouts and crashes made it nearly impossible to do so in a reasonable time frame. I ended up using toolkit == 'android' when a test needed a skip-if on a sufficient number of SDK levels. As it is, this took dozens of local runs as well as a number of runs in production on try. I didn't try to quantify the exact reason for each skip since it varied from run to run, from device to device, from build type to build type and apparently with the phase of the moon.
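To make the trade-off between a per-SDK skip-if and a blanket toolkit == 'android' concrete, here is a minimal Python sketch. It is not the real manifest parser: `matches` and `skipped_devices` are hypothetical helpers that understand only `key == 'value'` clauses joined by `||`, and the `android_version` key holding the SDK level as a string is an assumption. The device names and SDK levels come from the table above.

```python
from typing import Dict, List

# Device name -> Android SDK level, taken from the table of local devices.
DEVICES: Dict[str, str] = {
    "samsungs-gs3": "15",
    "nexus-4": "17",
    "nexus-5": "19",
    "nexus-6p": "23",
    "pixel": "25",
}


def matches(condition: str, values: Dict[str, str]) -> bool:
    """Evaluate a tiny subset of skip-if syntax: `key == 'value'` clauses joined by `||`."""
    for clause in condition.split("||"):
        key, _, expected = clause.partition("==")
        expected = expected.strip().strip("'\"")
        if str(values.get(key.strip())) == expected:
            return True
    return False


def skipped_devices(condition: str) -> List[str]:
    """Return the devices on which a test carrying this skip-if would be skipped."""
    return [
        name
        for name, sdk in DEVICES.items()
        # `android_version` as the manifest key for the SDK level is an assumption.
        if matches(condition, {"toolkit": "android", "android_version": sdk})
    ]
```

For example, `skipped_devices("android_version == '15' || android_version == '17'")` selects only the samsungs-gs3 and nexus-4, while `skipped_devices("toolkit == 'android'")` selects every device. The blanket form is simpler to maintain but gives up coverage on devices where the test might have passed.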

In addition to adding this bug number to each fail-if/skip-if annotation, I searched for open timeout bugs for each test and annotated the line with them as well.

My last two runs on try:

https://treeherder.mozilla.org/#/jobs?repo=try&author=bclary@mozilla.com&fromchange=1701b3f2dd07090b38c4c475cd0c92358439837d&group_state=expanded&exclusion_profile=false&tochange=4b0d198b464338dfe638f6bdf14567b7bd0c3ea1

Opt builds run in approximately 5-6 minutes, while debug builds take a bit longer but usually less than 10 minutes. My local debug builds took so long because my mozconfigs disabled optimization on debug. Fixing that helped improve the turnaround time significantly, fwiw.

We will definitely be able to run both opt and debug on mozilla-central and perhaps for some devices on inbound and autoland, but that is not entirely clear yet. I'll file a follow-up bug for enabling these on mozilla-central next.

Once this is in production, I intend to look for crashes by selectively removing skip-if annotations and running the tests on try, then file follow-up bugs on the failures.
Attachment #8843441 - Flags: review?(gbrown)
Assignee: nobody → bob
Comment on attachment 8843441 [details] [diff] [review]
bug-1232305-Mdm.patch

Review of attachment 8843441 [details] [diff] [review]:
-----------------------------------------------------------------

Wow, that's a lot of annotations! I hate to see a manifest that cluttered, and I wonder if pressing on with this will create a false sense of dom/media coverage on autophone. Perhaps it would be "better" to not run dom/media on autophone until more tests run green? Or use a separate white-list manifest for dom/media-on-autophone??

But I trust bc has the best perspective on the overall picture here...fine with me to proceed as-is.
Attachment #8843441 - Flags: review?(gbrown) → review+
I tried to determine a proper subset of tests once before and, after burning countless hours, gave up. Waiting didn't help the situation, and we have languished with no real improvement and no real coverage at all. I believe Blake and John are working on media and will be interested in fixing the bugs and tests so that we can improve the story here.

Keeping everything in one manifest will help since we can easily turn on sets of tests and submit to try without the overhead of maintaining separate jobs/manifests and will also have the added "benefit" of encountering any interactions with the existing tests at the same time.

I plan to enable tests on try by "tag" initially and file a single bug for all of the related failures, to avoid creating too many related bugs that may share the same root causes. I believe this approach will be more manageable than the one-bug-per-test-file approach we have attempted in the past. The developers can file additional dependent bugs as they feel necessary to track the work required.

Thanks!
Whiteboard: leave open
Pushed by bclary@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c7695823f811
Disable failing Mochitest DOM Media tests on Android, r=gbrown.
I got totally green runs for my local devices on two opt/debug runs from the same local build, but had to tweak the patch several times to get super green on production try.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d50cafc1533c0a9e4a942bec7b676bed84a292b7&group_state=expanded&exclusion_profile=false

I'm sure there will be other intermittent orange failures but I think the rate will be small enough. Once we get these tests running in production we can work on filing bugs and getting the underlying product or test issues fixed so we can improve reliability and increase test coverage.
Attachment #8844944 - Flags: review?(gbrown)
Comment on attachment 8844944 [details] [diff] [review]
bug-1232305-followup.patch

Review of attachment 8844944 [details] [diff] [review]:
-----------------------------------------------------------------

Good looking try runs there - nice!
Attachment #8844944 - Flags: review?(gbrown) → review+
Pushed by bclary@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/102f898ac465
Disable failing Mochitest DOM Media tests on Android - follow up 1, r=gbrown
Blocks: 1345566
Depends on: 1346631
Depends on: 1347102
Depends on: 1347756
Depends on: 1347953
Depends on: 1347954
Depends on: 1348140
Depends on: 1348141
Depends on: 1348386
Depends on: 1348581
Depends on: 1348594
Depends on: 1348595
Depends on: 1350024
Depends on: 1350507
Depends on: 1351019
Depends on: 1351317
Depends on: 1351550
Depends on: 1352483
Depends on: 1352485
Depends on: 1352655
Depends on: 1352928
Depends on: 1353896
Bob, 
Alastor (irc: alwu) from the TPE media team will help on this. Please cc him on the bugs related to media.
Thanks.
The first patch made the merge but this one did not. I would like to get the branches synced up so we eliminate as much orange as possible on aurora/beta that we have already identified.
Whiteboard: leave open → [leave open][checkin-needed-aurora][checkin-needed-beta]
https://hg.mozilla.org/releases/mozilla-aurora/rev/ff3090de2879
Whiteboard: [leave open][checkin-needed-aurora][checkin-needed-beta] → [leave open][checkin-needed-beta]
Depends on: 1354346
Depends on: 1354435
Depends on: 1354506
Depends on: 1356169
Depends on: 1356373
Depends on: 1356740
Depends on: 1356741
Depends on: 1358046
Depends on: 1358271
Depends on: 1358637
Depends on: 1358640
Depends on: 1359063
Assignee: bob → nobody
Depends on: 1360908
Depends on: 1361903
Depends on: 1362793
Depends on: 1362794
Depends on: 1362796
Depends on: 1363207
Depends on: 1363210
Depends on: 1363781
Depends on: 1363784
Depends on: 1363786
Depends on: 1363787
Depends on: 1363791
Depends on: 1364256
Depends on: 1330522
No longer depends on: 1330522
Depends on: 1366506
Depends on: 1367109
Depends on: 1367998
Depends on: 1369718
Depends on: 1369721
Depends on: 1369723
Depends on: 1370114
Depends on: 1370115
Depends on: 1370283
Depends on: 1372687
Component: Autophone → General
Product: Testing → Tracking
Version: unspecified → ---
Depends on: 1381912
Depends on: 1387434
Depends on: 1387435
Depends on: 1392747
Depends on: 1392945
Depends on: 1392946
Depends on: 1392949
Depends on: 1392952
Depends on: 1393866
Depends on: 1393867
Depends on: 1394482
Depends on: 1394848
Depends on: 1395149
Depends on: 1395150
Depends on: 1395626