Closed Bug 1426822 Opened 2 years ago Closed 11 months ago

Intermittent TEST-UNEXPECTED-FAIL | remoteautomation.py | application timed out after 370 seconds with no output

Categories

(Testing :: Mochitest, defect, P5)

Version 3
defect

Tracking

(firefox63 fixed, firefox64 fixed)

RESOLVED FIXED
mozilla64
Tracking Status
firefox63 --- fixed
firefox64 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: gbrown)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disable-recommended])

Attachments

(1 file)

Filed by: apavel [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=153098208&repo=autoland

https://queue.taskcluster.net/v1/task/AaAgZOByST-0f_j5RWNprw/runs/0/artifacts/public/logs/live_backing.log

[task 2017-12-22T08:33:12.099Z] 08:33:12     INFO -  Checking for xpcshell processes...
[task 2017-12-22T08:33:13.739Z] 08:33:13     INFO -  SUITE-START | Running 243 tests
[task 2017-12-22T08:33:13.740Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug1025933.html
[task 2017-12-22T08:33:13.741Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug1025933.html | took 0ms
[task 2017-12-22T08:33:13.742Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug1037687.html
[task 2017-12-22T08:33:13.743Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug1037687.html | took 0ms
[task 2017-12-22T08:33:13.744Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug1101364.html
[task 2017-12-22T08:33:13.744Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug1101364.html | took 1ms
[task 2017-12-22T08:33:13.744Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug444546.html
[task 2017-12-22T08:33:13.744Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug444546.html | took 0ms
[task 2017-12-22T08:33:13.744Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug475156.html
[task 2017-12-22T08:33:13.745Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug475156.html | took 1ms
[task 2017-12-22T08:33:13.745Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug482935.html
[task 2017-12-22T08:33:13.745Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug482935.html | took 0ms
[task 2017-12-22T08:33:13.745Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug503473.html
[task 2017-12-22T08:33:13.745Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug503473.html | took 1ms
[task 2017-12-22T08:33:13.746Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug503481.html
[task 2017-12-22T08:33:13.747Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug503481.html | took 0ms
[task 2017-12-22T08:33:13.747Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug503481b.html
[task 2017-12-22T08:33:13.748Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug503481b.html | took 1ms
[task 2017-12-22T08:33:13.749Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug505783.html
[task 2017-12-22T08:33:13.749Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug505783.html | took 0ms
[task 2017-12-22T08:33:13.750Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug560780.html
[task 2017-12-22T08:33:13.750Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug560780.html | took 1ms
[task 2017-12-22T08:33:13.751Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug574596.html
[task 2017-12-22T08:33:13.751Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug574596.html | took 1ms
[task 2017-12-22T08:33:13.752Z] 08:33:13     INFO -  TEST-START | dom/base/test/test_bug590812.html
[task 2017-12-22T08:33:13.752Z] 08:33:13     INFO -  TEST-SKIP | dom/base/test/test_bug590812.html | took 0ms
[task 2017-12-22T08:33:17.958Z] 08:33:17     INFO -  runtests.py | Failed to copy /builds/worker/workspace/build/tests/mochitest/hyphenation to profile
[task 2017-12-22T08:33:18.482Z] 08:33:18     INFO -  pk12util: PKCS12 IMPORT SUCCESSFUL
[task 2017-12-22T08:33:19.178Z] 08:33:19     INFO -  MochitestServer : launching [u'/builds/worker/workspace/build/hostutils/host-utils-58.0a1.en-US-linux-x86_64/xpcshell', '-g', '/builds/worker/workspace/build/hostutils/host-utils-58.0a1.en-US-linux-x86_64', '-f', '/builds/worker/workspace/build/hostutils/host-utils-58.0a1.en-US-linux-x86_64/components/httpd.js', '-e', "const _PROFILE_PATH = '/tmp/tmprQXsyj.mozrunner'; const _SERVER_PORT = '8854'; const _SERVER_ADDR = '10.0.2.2'; const _TEST_PREFIX = undefined; const _DISPLAY_RESULTS = false;", '-f', '/builds/worker/workspace/build/tests/mochitest/server.js']
[task 2017-12-22T08:33:19.178Z] 08:33:19     INFO -  runtests.py | Server pid: 1176
[task 2017-12-22T08:33:19.186Z] 08:33:19     INFO -  runtests.py | Websocket server pid: 1179
[task 2017-12-22T08:33:19.216Z] 08:33:19     INFO -  runtests.py | SSL tunnel pid: 1182
[task 2017-12-22T08:33:37.259Z] 08:33:37     INFO -  runtests.py | Running with e10s: False
[task 2017-12-22T08:33:37.260Z] 08:33:37     INFO -  runtests.py | Running tests: start.
[task 2017-12-22T08:34:00.810Z] 08:34:00     INFO -  INFO | automation.py | Application pid: 1008
[task 2017-12-22T08:46:33.193Z] 08:46:33     INFO -  Browser unexpectedly found running. Killing...
[task 2017-12-22T08:46:33.197Z] 08:46:33     INFO -  TEST-INFO | started process screentopng
[task 2017-12-22T08:46:35.570Z] 08:46:35     INFO -  TEST-INFO | screentopng: exit 0
[task 2017-12-22T08:46:42.706Z] 08:46:42     INFO -  org.mozilla.fennec_aurora still alive after SIGABRT: waiting...
[task 2017-12-22T08:46:48.083Z] 08:46:48     INFO -  org.mozilla.fennec_aurora still alive after SIGABRT: waiting...
[task 2017-12-22T08:46:54.037Z] 08:46:54  WARNING -  TEST-UNEXPECTED-FAIL | remoteautomation.py | application timed out after 370 seconds with no output
[task 2017-12-22T08:46:54.037Z] 08:46:54     INFO -  INFO | automation.py | Application ran for: 0:13:16.775655
[task 2017-12-22T08:46:54.038Z] 08:46:54     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpS6Rqh9pidlog
[task 2017-12-22T08:46:54.608Z] 08:46:54     INFO -  Contents of /data/anr/traces.txt:
[task 2017-12-22T08:46:54.608Z] 08:46:54     INFO -  ----- pid 1008 at 2017-12-22 00:34:43 -----
[task 2017-12-22T08:46:54.609Z] 08:46:54     INFO -  Cmd line: org.mozilla.fennec_aurora
[task 2017-12-22T08:46:54.611Z] 08:46:54     INFO -  JNI: CheckJNI is on; workarounds are off; pins=0; globals=173

[task 2017-12-22T08:47:08.797Z] 08:47:08     INFO -  mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/FkpdKAm2RS26-oDGy3c6FQ/artifacts/public/build/target.crashreporter-symbols.zip
[task 2017-12-22T08:47:25.383Z] 08:47:25     INFO -  mozcrash Copy/paste: /builds/worker/workspace/build/linux64-minidump_stackwalk /tmp/tmpDnk_kp/29761241-c689-f9ab-a69d-893e01673307.dmp /tmp/tmpDWXIGo
[task 2017-12-22T08:48:41.346Z] 08:48:41     INFO -  mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/29761241-c689-f9ab-a69d-893e01673307.dmp
[task 2017-12-22T08:48:41.352Z] 08:48:41     INFO -  mozcrash Saved app info as /builds/worker/workspace/build/blobber_upload_dir/29761241-c689-f9ab-a69d-893e01673307.extra
[task 2017-12-22T08:48:41.385Z] 08:48:41  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ libc.so + 0x1b014]
[task 2017-12-22T08:48:41.388Z] 08:48:41     INFO -  Crash dump filename: /tmp/tmpDnk_kp/29761241-c689-f9ab-a69d-893e01673307.dmp
[task 2017-12-22T08:48:41.391Z] 08:48:41     INFO -  Operating system: Android
[task 2017-12-22T08:48:41.394Z] 08:48:41     INFO -                    0.0.0 Linux 2.6.29-gea477bb #1 Wed Sep 26 11:04:45 PDT 2012 armv7l
[task 2017-12-22T08:48:41.395Z] 08:48:41     INFO -  CPU: arm
[task 2017-12-22T08:48:41.397Z] 08:48:41     INFO -       ARMv7 ARM Cortex-A8 features: swp,half,thumb,fastmult,vfpv2,edsp,neon,vfpv3
[task 2017-12-22T08:48:41.399Z] 08:48:41     INFO -       1 CPU
[task 2017-12-22T08:48:41.401Z] 08:48:41     INFO -  GPU: UNKNOWN
[task 2017-12-22T08:48:41.403Z] 08:48:41     INFO -  Crash reason:  SIGABRT
[task 2017-12-22T08:48:41.405Z] 08:48:41     INFO -  Crash address: 0x519
[task 2017-12-22T08:48:41.407Z] 08:48:41     INFO -  Process uptime: not available
[task 2017-12-22T08:48:41.410Z] 08:48:41     INFO -  Thread 0 (crashed)
See Also: → 1414495
Duplicate of this bug: 1308560
Depends on: 1416125
There have been 32 failures in the last week.
The failures occur on android-em-4-3-armv7-api16 / opt & debug with a single exception for linux64 / pgo.

Summary: Intermittent TEST-UNEXPECTED-FAIL | remoteautomation.py | application timed out after 370 seconds with no output
Recent log file and snippet with the failure:
https://treeherder.mozilla.org/logviewer.html#?job_id=187963072&repo=autoland&lineNumber=1934

[task 2018-07-13T02:21:36.386Z] 02:21:36     INFO -  adb launch_application: am start -W -n org.mozilla.fennec_aurora/org.mozilla.gecko.BrowserApp -a android.intent.action.VIEW --es env9 MOZ_CRASHREPORTER_NO_REPORT=1 --es env8 R_LOG_DESTINATION=stderr --es args '-no-remote -profile /sdcard/tests/profile//' --es env3 DISABLE_UNSAFE_CPOW_WARNINGS=1 --es env2 R_LOG_VERBOSE=1 --es env1 XPCOM_DEBUG_BREAK=stack --es env0 MOZ_CRASHREPORTER=1 --es env7 MOZ_LOG_FILE=/sdcard/tests/mozlog/moz.log --es env6 MOZ_CRASHREPORTER_SHUTDOWN=1 --es env5 MOZ_IN_AUTOMATION=1 --es env4 MOZ_DISABLE_NONLOCAL_CONNECTIONS=1 --es env13 MOZ_HIDE_RESULTS_TABLE=1 --es env12 R_LOG_LEVEL=6 --es env11 MOZ_PROCESS_LOG=/tmp/tmp0qmHdupidlog --es env10 NO_EM_RESTART=1 -d 'http://mochi.test:8888/tests?autorun=1&closeWhenDone=1&logFile=%2Fsdcard%2Ftests%2Flogs%2Fmochitest.log&fileLevel=INFO&consoleLevel=INFO&hideResultsTable=1&manifestFile=tests.json&dumpOutputDirectory=%2Fsdcard%2Ftests'
[task 2018-07-13T02:21:43.407Z] 02:21:43     INFO -  INFO | automation.py | Application pid: 1809
[task 2018-07-13T02:28:23.706Z] 02:28:23     INFO -  Browser unexpectedly found running. Killing...
[task 2018-07-13T02:28:23.706Z] 02:28:23     INFO -  TEST-INFO | started process screentopng
[task 2018-07-13T02:28:24.058Z] 02:28:24     INFO -  TEST-INFO | screentopng: exit 0
[task 2018-07-13T02:28:38.606Z] 02:28:38  WARNING -  TEST-UNEXPECTED-FAIL | remoteautomation.py | application timed out after 370 seconds with no output
[task 2018-07-13T02:28:38.606Z] 02:28:38     INFO -  INFO | automation.py | Application ran for: 0:07:02.325555
[task 2018-07-13T02:28:38.607Z] 02:28:38     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmp0qmHdupidlog
[task 2018-07-13T02:28:38.918Z] 02:28:38     INFO -  Contents of /data/anr/traces.txt:
[task 2018-07-13T02:28:38.919Z] 02:28:38     INFO -  ----- pid 1809 at 2018-07-12 19:28:24 -----
[task 2018-07-13T02:28:38.919Z] 02:28:38     INFO -  Cmd line: org.mozilla.fennec_aurora

:ahal, could you please have a look? Thank you.
Flags: needinfo?(ahal)
Whiteboard: [stockwell needswork]
Logcat artifacts for most of the failures show:

07-12 03:29:30.394  1093  1109 I GeckoDump: TEST-UNEXPECTED-FAIL: manifestLibrary.js | error parsing http://mochi.test:8888/tests.json (TypeError: RunSet is undefined)
07-12 03:29:30.394  1093  1109 E GeckoConsole: [JavaScript Error: "TypeError: RunSet is undefined" {file: "http://mochi.test:8888/tests/SimpleTest/setup.js" line: 262}]
07-12 03:29:30.394  1093  1109 E GeckoConsole: hookupTests@http://mochi.test:8888/tests/SimpleTest/setup.js:262:3
07-12 03:29:30.394  1093  1109 E GeckoConsole: parseTestManifest@http://mochi.test:8888/manifestLibrary.js:36:5
07-12 03:29:30.394  1093  1109 E GeckoConsole: getTestManifest/req.onload@http://mochi.test:8888/manifestLibrary.js:49:11
Oh, and before that...

07-12 03:29:26.314  1093  1109 E GeckoConsole: [JavaScript Error: "ReferenceError: SpecialPowers is not defined" {file: "http://mochi.test:8888/tests/SimpleTest/setup.js" line: 108}]
07-12 03:29:26.314  1093  1109 E GeckoConsole: @http://mochi.test:8888/tests/SimpleTest/setup.js:108:3
Correlation with bug 1451513? (speculative)
(In reply to Geoff Brown [:gbrown] from comment #31)
> Correlation with bug 1451513? (speculative)

I see the same JavaScript Errors in the logcat for

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4941a739bc17dcc0c93edfcb426980845a9aa8c9&selectedJob=187304914

and I do not see those errors in any jobs prior to 1451513. (There were certainly earlier "370 seconds without output" failures, but not with this frequency, and not with these SpecialPowers / RunSet errors.)

:aswan - Can you have a look?
Blocks: 1451513
Flags: needinfo?(ahal) → needinfo?(aswan)
Looks like we have a race where the tests can start running before SpecialPowers is fully loaded.  I guess one option is to fix this inside the addons manager by adding new code that would let us block startup until test extensions are loaded.  It would be nice to avoid that if possible but, to be honest, I've had a hard time sorting out all the permutations of mochitest flavors and different platforms and exactly how tests get launched in each case.  Perhaps some dedicated time with somebody who thoroughly understands all the mochitest scenarios to figure out an overall strategy would be useful.  Would that be Joel?  Geoff?  Somebody else?
Flags: needinfo?(aswan)
(In reply to Andrew Swan [:aswan] (on PTO until 7/23) from comment #33)
> Perhaps some dedicated time with somebody who thoroughly
> understands all the mochitest scenarios to figure out an overall strategy
> would be useful.  Would that be Joel?  Geoff?  Somebody else?

The bit that I know is that marionette is used to trigger mochitests on desktop, but not on android (initially because marionette was not enabled on android, later because marionette was not enabled on android/opt and currently because marionette is not enabled in android on beta or release).

:jmaher - More thoughts? We should probably try to get this fixed (or back-out??) before Andrew returns...possible?
Flags: needinfo?(jmaher)
this is a tough one, but something we need to solve.  If we can prove that we can run reliably mochitest* on android with marionette, then I would vote for that- we could let this ride the trains or push to get marionette on mozilla-beta.  I suspect that will take a while, possibly a backout for a week or so until we figure this out would be best.
Flags: needinfo?(jmaher)
Update:
There have been 121 failures in the last week.
The failures occur on android-em-4-3-armv7-api16 / opt & debug with 2 exceptions for android-em-4-2-x86 / opt.
Recent log file: https://treeherder.mozilla.org/logviewer.html#?job_id=189382340&repo=mozilla-inbound&lineNumber=1729
Failures started up again on July 31, consistent with https://bugzilla.mozilla.org/show_bug.cgi?id=1451513#c30.

Same symptoms as before:

https://treeherder.mozilla.org/logviewer.html#?job_id=191672164&repo=mozilla-inbound&lineNumber=1634

https://taskcluster-artifacts.net/F729sjuDRNmFe2rUiwRDwQ/0/public/test_info//logcat-emulator-5554.log

08-02 12:27:31.038  1071  1087 E GeckoConsole: [JavaScript Error: "ReferenceError: SpecialPowers is not defined" {file: "http://mochi.test:8888/tests/SimpleTest/setup.js" line: 108}]
08-02 12:27:31.038  1071  1087 E GeckoConsole: @http://mochi.test:8888/tests/SimpleTest/setup.js:108:3
...
08-02 12:27:35.328  1071  1087 I GeckoDump: TEST-UNEXPECTED-FAIL: manifestLibrary.js | error parsing http://mochi.test:8888/tests.json (TypeError: RunSet is undefined)
08-02 12:27:35.328  1071  1087 E GeckoConsole: [JavaScript Error: "TypeError: RunSet is undefined" {file: "http://mochi.test:8888/tests/SimpleTest/setup.js" line: 262}]
08-02 12:27:35.328  1071  1087 E GeckoConsole: hookupTests@http://mochi.test:8888/tests/SimpleTest/setup.js:262:3
08-02 12:27:35.328  1071  1087 E GeckoConsole: parseTestManifest@http://mochi.test:8888/manifestLibrary.js:36:5
08-02 12:27:35.328  1071  1087 E GeckoConsole: getTestManifest/req.onload@http://mochi.test:8888/manifestLibrary.js:49:11

Failure frequency is about the same / might be a little less than previously.

:aswan - Sorry to be the bearer of bad news...
Flags: needinfo?(aswan)
I can dig in, but can you explain (or point me to an overview of) how these extensions are loaded when we're running mochitest on android?  Do we use testing/mochitest/runtests.py or something else?
Flags: needinfo?(aswan) → needinfo?(gbrown)
(In reply to Andrew Swan [:aswan] from comment #48)

Thanks! 

I will try to write up more details tomorrow, but the top-level view is:

 - Android mochitests are run with testing/mochitest/runtestsremote.py, which leverages runtests.py

 - extensions are included in the profile built at

https://dxr.mozilla.org/mozilla-central/rev/3cb90f16402bc5e1203e2771dc93553b8377fa40/testing/mochitest/runtestsremote.py#228

https://dxr.mozilla.org/mozilla-central/rev/3cb90f16402bc5e1203e2771dc93553b8377fa40/testing/mochitest/runtests.py#1876

 - that profile is built on the host, then pushed to the device before each browser start:

https://dxr.mozilla.org/mozilla-central/rev/3cb90f16402bc5e1203e2771dc93553b8377fa40/testing/mochitest/runtestsremote.py#246
That's helpful, thanks.  So a major difference between desktop and android is that on android we just drop the test extensions in the profile and flip preferences to allow them to be sideloaded while on desktop we install the extensions at runtime using marionette.  With the sideload technique there's no way to find out when the extension has been fully loaded so I suspect there's a race where we try to start executing tests before the extension is ready.  As of this patch I don't believe we have a similar race on desktop:
https://hg.mozilla.org/mozilla-central/rev/01d7fd373ccb

Do you know offhand if there's a reason we can't install the test extensions using marionette on android?  It looks like the marionette driver currently can only install addons from the filesystem:
https://searchfox.org/mozilla-central/source/testing/marionette/addon.js#24

but we could easily generalize that to use getInstallForURL() to install the test extensions over the network (ie, from the host where android tests are launched from)
(In reply to Andrew Swan [:aswan] from comment #50)
> Do you know offhand if there's a reason we can't install the test extensions
> using marionette on android?  It looks like the marionette driver currently
> can only install addons from the filesystem:
> https://searchfox.org/mozilla-central/source/testing/marionette/addon.js#24
> 
> but we could easily generalize that to use getInstallForURL() to install the
> test extensions over the network (ie, from the host where android tests are
> launched from)

I have not tried it, but I suspect it is possible to install test extensions using marionette on android. In addition to the getInstallFromURL() idea, we could easily have the harness push addons to the device filesystem and have marionette take it from there.

BUT, recall comment 37: Marionette is only enabled on Android in non-release builds. So a marionette-only solution would not allow us to run android mochitests on beta or release branches.

:snorp -- Could marionette be enabled on release builds? Would this technique work in other apps, like TestRunnerActivity?
Flags: needinfo?(gbrown) → needinfo?(snorp)
Duplicate of this bug: 1309088
If using marionette isn't feasible, a few other options come to mind:
1. There's a method in the Firefox debugging protocol for installing extensions, we could use that
2. We could make sideloaded extensions load synchronously at startup.  We don't want to do that in general, so it would have to be some sort of preference that's only applied during tests.

If possible, I'd really prefer to use Marionette or option #1 above, but I don't really know if/why those would be difficult.
Jim, did I remember right that you were investigating some races with addon loading? This seems related to that.
Flags: needinfo?(snorp) → needinfo?(nchen)
I don't remember working on that.
Flags: needinfo?(nchen)
Let's try this again...

Andrew's preferred method for resolving this frequently failing intermittent is to use marionette to install test extensions. We cannot do that currently because marionette is disabled on beta-and-release builds (and of course, we want to continue to run tests on release builds). I think there have been concerns about bloat and security which have prevented marionette from being enabled on release builds, but I'm not clear on the specific concerns, or whether those apply today.

:snorp -- Could marionette be enabled on release builds? Would this technique work in other apps, like TestRunnerActivity?

(If marionette is not an option, there are other ideas for resolving this. See comment 54.)
Flags: needinfo?(snorp)
Depends on: 1484220
(In reply to Geoff Brown [:gbrown] (pto Aug 20-Aug 24) from comment #62)
> :snorp -- Could marionette be enabled on release builds? Would this
> technique work in other apps, like TestRunnerActivity?

We are looking at this in bug 1484220.
Flags: needinfo?(snorp)
Update:
There have been 132 failures in the last week.
Almost all of the failures are on android-em-4-3-armv7-api16 / debug & opt.

Summary: Intermittent TEST-UNEXPECTED-FAIL | remoteautomation.py | application timed out after 370 seconds with no output

Depends on Bug 1484220
Update:

There have been 118 failures in the last week.

The most affected platform / build: android-em-4-3-armv7-api16 / debug.

Recent relevat log file: https://treeherder.mozilla.org/logviewer.html#?job_id=197151819&repo=mozilla-inbound&lineNumber=1531

Summary: Intermittent TEST-UNEXPECTED-FAIL | remoteautomation.py | application timed out after 370 seconds with no output
There have been 99 failures in the last week
It fails almost exclusively on android-em-4-3-armv7-api16 / debug & opt.
Summary: Intermittent TEST-UNEXPECTED-FAIL | remoteautomation.py | application timed out after 370 seconds with no output

Recent relevant log file:
https://treeherder.mozilla.org/logviewer.html#?job_id=198546312&repo=mozilla-inbound&lineNumber=1812
:aswan - With bug 1484220 resolved, I think it is possible to proceed with your idea of comment 50 - use marionette to install test extensions on Android. Is that something you could take on?
Flags: needinfo?(aswan)
I can try but I 1) am unfamiliar with all the python code that drives mochitests and 2) have my hands full with other things at the moment.  If nobody else is available, I'll take a shot, but I won't be able to get to it for at least a few weeks.
Flags: needinfo?(aswan)
Thanks. I might know some of the python code better, but I'm in the same boat -- hands full for a while. We'll see how it goes.
See Also: → 1492553
There have been 110 failures within the last 7 days, all of them on Android 4.3 API16+ debug

Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=201293911&repo=mozilla-central&lineNumber=1799

[task 2018-09-24T23:15:25.029Z] 23:15:25     INFO -  TEST-INFO | started process screentopng
[task 2018-09-24T23:15:25.388Z] 23:15:25     INFO -  TEST-INFO | screentopng: exit 0
[task 2018-09-24T23:15:40.044Z] 23:15:40  WARNING -  TEST-UNEXPECTED-FAIL | remoteautomation.py | application timed out after 370 seconds with no output
[task 2018-09-24T23:15:40.045Z] 23:15:40     INFO -  INFO | automation.py | Application ran for: 0:07:09.120200
[task 2018-09-24T23:15:40.045Z] 23:15:40     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpDeJBsopidlog
[task 2018-09-24T23:15:40.566Z] 23:15:40     INFO -  /data/tombstones does not exist; tombstone check skipped
[task 2018-09-24T23:15:41.490Z] 23:15:41     INFO -  mozcrash Copy/paste: /usr/local/bin/linux64-minidump_stackwalk /tmp/tmpwOkunW/55e74b60-a609-9098-8726-bd1879b9762e.dmp /builds/worker/workspace/build/symbols
[task 2018-09-24T23:15:53.227Z] 23:15:53     INFO -  mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/55e74b60-a609-9098-8726-bd1879b9762e.dmp
[task 2018-09-24T23:15:53.228Z] 23:15:53     INFO -  mozcrash Saved app info as /builds/worker/workspace/build/blobber_upload_dir/55e74b60-a609-9098-8726-bd1879b9762e.extra
[task 2018-09-24T23:15:53.233Z] 23:15:53  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ libc.so + 0x1c3dc]
Assignee: nobody → gbrown
I reproduced this failure in Android 4.3 mochitest-chrome on try with extra logging and found that the load event was intermittently missed at:

https://dxr.mozilla.org/mozilla-central/rev/d03b538b6b417ba892d0a92fd693945b741246e1/testing/mochitest/browser-test.js#24

so testInit() was never called - oops.

Upon reflection, I cannot think of a good reason to wait for load + MozAfterPaint and could not discover the answer from code archaeology back to 2015. My new strategy for sync'ing up redirect.html and browser-test.js is simple: Start listening for the custom event as soon as possible in browser-tests.js, and keep sending it from redirect.html until the redirect succeeds. That seems to work very well indeed, and I see no new problems:

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=fa13ba263b8544b3a9f0698abc60755085ea790b


(I still think it would be great to use marionette instead so that the same strategy and code is used for all platforms, but I'm not feeling up to that challenge today. We can always tackle that in another bug.)
Attachment #9012007 - Flags: review?(jmaher)
Comment on attachment 9012007 [details] [diff] [review]
simplify mochitest initialization

Review of attachment 9012007 [details] [diff] [review]:
-----------------------------------------------------------------

can we use marionette on android these days? I would prefer to land this and move forward with that in a different thread if it is worthy of a discussion or experiment.
Attachment #9012007 - Flags: review?(jmaher) → review+
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/22d45dac40c8
Avoid intermittent failure to initialize android mochitests; r=jmaher
https://hg.mozilla.org/mozilla-central/rev/22d45dac40c8
Status: NEW → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Recent failure log from m-c: https://treeherder.mozilla.org/logviewer.html#?job_id=201903410&repo=mozilla-central&lineNumber=1542
Status: RESOLVED → REOPENED
Flags: needinfo?(gbrown)
Resolution: FIXED → ---
Target Milestone: mozilla64 → ---
It looks like the fix is working, but there is another case to consider.
Flags: needinfo?(gbrown)
gbrown: Should I close this bug as Resolved Fixed and file a new bug for that failure?
Flags: needinfo?(gbrown)
(In reply to Eliza Balazs [:ebalazs_] from comment #108)
> gbrown: Should I close this bug as Resolved Fixed and file a new bug for
> that failure?

That would be great - thanks!
Flags: needinfo?(gbrown)
Closing this as fixed and bug https://bugzilla.mozilla.org/show_bug.cgi?id=1494657 was created for the new failure.
Status: REOPENED → RESOLVED
Closed: 11 months ago11 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
See Also: → 1494657
See Also: → 1499890
You need to log in before you can comment on or make changes to this bug.