Closed Bug 948600 Opened 11 years ago Closed 10 years ago

Frequent timeout in Android mochitest-5 during dom/imptests (hang on shutdown)

Categories

(Testing :: General, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: gbrown, Unassigned)

References

Details

Bug 663657 (Intermittent Android "command timed out: 2400 seconds without output, attempting to kill") is at the top of Orange Factor this week, and most of the failures are in dom/imptests (mochitest-5).

There seemed to be an increase in the frequency of dom/imptests timeouts around Nov 28, just when bug 778011 landed and turned on some additional tests in dom/imptests. That patch was backed out, but the failures persist.
Skipping a few mochitest-8 tests seems to resolve a less-frequent but similar problem in m8.

But mochitest-5 feels like a moving target: I keep disabling the tests where the hang occurs, and then it occurs earlier. ** I see now that the hang occurs on the very last test in m5 ** or just before (which could just be a truncated log). So probably all the tests are completing, but then we are hanging on shutdown.


Maybe it would be better to identify a regression range and try a backout or two?

http://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=663657&tree=trunk&startday=2013-11-01&endday=2013-12-09 suggests we might start looking around Nov 27.
See Also: → 924622
My try runs have failed to identify a set of tests that can be disabled to avoid these timeouts. I am pretty sure that mochitest-5 is hanging on shutdown. I wonder if that's related to bug 924622 - a shutdown crash that happened occasionally during the try runs.
I am not finding time to follow up on this.

Ideas for sorting this out:
 - resolve known shutdown crashes/hangs like bug 924622
 - disable subsets of tests in M5 to see if a particular test is to blame
 - identify a regression range and try some backouts
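
The "disable subsets of tests" idea amounts to a bisection over the M5 test list. A rough sketch, under assumptions that do not come from this bug: exactly one test is to blame, and `run_hangs` is a hypothetical callback that runs the given subset (e.g. on try) and reports whether the run hung. Neither name exists in the harness.

```python
def find_culprit(tests, run_hangs):
    """Narrow a list of tests down to the one whose presence causes a hang.

    Assumptions (not from the bug): exactly one test is responsible, and
    run_hangs(subset) reliably returns True when that test is in subset.
    """
    candidates = list(tests)
    # If the full list does not hang, there is nothing to bisect.
    if not candidates or not run_hangs(candidates):
        return None
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Keep whichever half still reproduces the hang.
        candidates = first if run_hangs(first) else second
    return candidates[0]
```

Since the timeouts here are intermittent, a real `run_hangs` would have to repeat each subset several times before trusting a "no hang" result. If the hang is really on shutdown, no single test may be responsible and the bisection will not converge on anything meaningful.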
Assignee: gbrown → nobody
Summary: Frequent timeout in Android mochitest-5 during dom/imptests → Frequent timeout in Android mochitest-5 during dom/imptests (hang on shutdown)
See Also: → 918759
Try runs with more logging show that this is caused by a shutdown hang. 

The logs often look like there is a hang in the middle of a test, but that's just because the log isn't being flushed. With devicemanager debugging enabled, you can see the logcat being retrieved at the end of the test run and get the full picture.

https://tbpl.mozilla.org/?tree=Try&rev=7955825807ab

https://tbpl.mozilla.org/php/getParsedLog.php?id=32567931&tree=Try&full=1#error0

There seems to be an AsyncShutdown problem during the profile-before-change phase:

17:27:45     INFO - 01-05 16:33:21.374 I/Gecko   ( 2236): WARNING: A phase completion condition is taking too long to complete. Condition: AddonManager: shutting down providers Phase: profile-before-change
17:27:45     INFO - 01-05 16:33:21.374 I/Gecko   ( 2236): WARNING: A phase completion condition is taking too long to complete. Condition: OS.File: flush I/O queued before profile-before-change Phase: profile-before-change
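
For checking many logs for the same pattern, a minimal sketch of pulling the blocked condition and phase out of these warning lines might look like the following. It assumes the warning text has exactly the shape quoted above; `find_blocked_conditions` is a made-up helper name, not part of the test harness or of AsyncShutdown.

```python
import re

# Matches the AsyncShutdown warning as it appears in the logcat excerpt above.
# Assumption: the wording of the warning is the same in every log examined.
WARNING_RE = re.compile(
    r"WARNING: A phase completion condition is taking too long to complete\. "
    r"Condition: (?P<condition>.+?) Phase: (?P<phase>\S+)"
)


def find_blocked_conditions(log_lines):
    """Return (phase, condition) pairs for each AsyncShutdown warning found."""
    found = []
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            found.append((match.group("phase"), match.group("condition")))
    return found
```

Run over the two lines above, this would report both the AddonManager and the OS.File conditions as stuck in profile-before-change.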

:dteller -- Can you have a look?
Flags: needinfo?(dteller)
Generally, AsyncShutdown doesn't cause the problem; it detects it. I have a patch somewhere to add more logging to AsyncShutdown.
Flags: needinfo?(dteller)
We have not seen more of these shutdown hangs reported in bug 663657 since 2014-01-09 -- the problem may be resolved.
This does not seem to be a problem any longer, at least not reported this way. (There is another shutdown hang in bug 918759 -- best to track it there.)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME