Bug 663657 (Intermittent Android "command timed out: 2400 seconds without output, attempting to kill") is at the top of Orange Factor this week, and most of the failures are in dom/imptests (mochitest-5). The frequency of dom/imptests timeouts seemed to increase around Nov 28, just when bug 778011 landed and turned on some additional tests in dom/imptests. That patch was backed out, but the failures persist.
Trying to determine a subset of tests causing the failures:
Baseline: https://tbpl.mozilla.org/?tree=Try&rev=e070a321f236
Some disabled: https://tbpl.mozilla.org/?tree=Try&rev=04a4eb657563
More disabled: https://tbpl.mozilla.org/?tree=Try&rev=2c8e23f9cc1e
Just a couple more disabled: https://tbpl.mozilla.org/?tree=Try&rev=27e549d10f33
Almost there? https://tbpl.mozilla.org/?tree=Try&rev=4e18b2f3e490
Skipping a few mochitest-8 tests seems to resolve a less-frequent but similar problem in m8. But mochitest-5 feels like a moving target: I keep disabling the tests where the hang occurs, and then it occurs earlier. ** I see now that the hang occurs on the very last test in m5 ** or just before (which could just be a truncated log). So probably all the tests are completing, but then we are hanging on shutdown. Maybe it would be better to identify a regression range and try a backout or two? http://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=663657&tree=trunk&startday=2013-11-01&endday=2013-12-09 suggests we might start looking around Nov 27.
My try runs have failed to identify a set of tests that can be disabled to avoid these timeouts. I am pretty sure that mochitest-5 is hanging on shutdown. I wonder if that's related to bug 924622 - a shutdown crash that happened occasionally during the try runs.
I am not finding time to follow up on this. Ideas for sorting this out:
 - resolve known shutdown crashes/hangs like bug 924622
 - disable subsets of tests in M5 to see if a particular test is to blame
 - identify a regression range and try some backouts
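The "disable subsets of tests" idea above can be sketched as a bisection. This is only an illustration: `find_culprit` and the `run_hangs` callback are hypothetical names, not part of any Mozilla harness, and the sketch assumes that exactly one test triggers the hang and that a run reports the hang reliably. Neither assumption is confirmed here; the hang is intermittent, so a real `run_hangs` would have to repeat each run several times, and the earlier try runs suggest no single test may be to blame.

```python
from typing import Callable, Optional, Sequence


def find_culprit(
    tests: Sequence[str],
    run_hangs: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Bisect a test list down to the one test whose presence causes a hang.

    run_hangs(subset) is a hypothetical callback that runs only `subset`
    and returns True if the run hung. Assumes a single, deterministic culprit.
    """
    tests = list(tests)
    if not tests or not run_hangs(tests):
        # Nothing to blame: the full set does not hang.
        return None
    while len(tests) > 1:
        mid = len(tests) // 2
        first_half = tests[:mid]
        # Keep whichever half still reproduces the hang.
        if run_hangs(first_half):
            tests = first_half
        else:
            tests = tests[mid:]
    return tests[0]
```

If the hang really is in shutdown rather than in a specific test, every subset would hang (or none would, depending on timing), and this procedure would not converge on anything meaningful.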
Assignee: gbrown → nobody
Summary: Frequent timeout in Android mochitest-5 during dom/imptests → Frequent timeout in Android mochitest-5 during dom/imptests (hang on shutdown)
Try runs with more logging show that this is caused by a shutdown hang. The logs often look like there is a hang in the middle of a test, but that's just because the log isn't being flushed. If we enable devicemanager debugging, you can see the logcat being retrieved at the end of the test run and get the full picture.

https://tbpl.mozilla.org/?tree=Try&rev=7955825807ab
https://tbpl.mozilla.org/php/getParsedLog.php?id=32567931&tree=Try&full=1#error0

There seems to be an AsyncShutdown problem during the profile-before-change phase:

17:27:45 INFO - 01-05 16:33:21.374 I/Gecko ( 2236): WARNING: A phase completion condition is taking too long to complete. Condition: AddonManager: shutting down providers Phase: profile-before-change
17:27:45 INFO - 01-05 16:33:21.374 I/Gecko ( 2236): WARNING: A phase completion condition is taking too long to complete. Condition: OS.File: flush I/O queued before profile-before-change Phase: profile-before-change

:dteller -- Can you have a look?
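For anyone scanning other logs for the same symptom, here is a minimal sketch that extracts the slow AsyncShutdown conditions from log text. It assumes the warning has exactly the shape quoted in this comment ("Condition: ... Phase: ..."); the real logcat layout may differ, and `slow_conditions` is a hypothetical helper name, not an existing tool.

```python
import re
from typing import List, Tuple

# Matches the AsyncShutdown warning as quoted in this bug. The wording and
# field order are taken from the quoted log only, so treat them as assumptions.
WARNING_RE = re.compile(
    r"WARNING: A phase completion condition is taking too long to complete\.\s*"
    r"Condition: (?P<condition>.+?)\s+Phase: (?P<phase>\S+)"
)


def slow_conditions(log_text: str) -> List[Tuple[str, str]]:
    """Return (phase, condition) pairs for each slow-condition warning found."""
    return [
        (m.group("phase"), m.group("condition"))
        for m in WARNING_RE.finditer(log_text)
    ]
```

Run over the two lines quoted above, this would report the AddonManager and OS.File conditions, both in the profile-before-change phase.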
Generally, AsyncShutdown doesn't cause the problem, it detects it. I have a patch somewhere to add more logging to AsyncShutdown.
I'll push the patch here: bug 957123.
We have not seen more of these shutdown hangs reported in bug 663657 since 2014-01-09 -- the problem may be resolved.
This does not seem to be a problem any longer, at least not reported this way. (There is another shutdown hang in bug 918759 -- best to track it there.)
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WORKSFORME