Frequent timeout in Android mochitest-5 during dom/imptests (hang on shutdown)

RESOLVED WORKSFORME

Status

Testing
General
RESOLVED WORKSFORME
4 years ago
4 years ago

People

(Reporter: gbrown, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

4 years ago
Bug 663657 (Intermittent Android "command timed out: 2400 seconds without output, attempting to kill") is at the top of Orange Factor this week, and most of the failures are in dom/imptests (mochitest-5).

There seemed to be an increase in the frequency of dom/imptests timeouts around Nov 28, just when bug 778011 landed and turned on some additional tests in dom/imptests. That patch was backed out, but the failures persist.
(Reporter)

Comment 1

4 years ago
Trying to determine a subset of tests causing the failures:

Baseline:      https://tbpl.mozilla.org/?tree=Try&rev=e070a321f236
Some disabled: https://tbpl.mozilla.org/?tree=Try&rev=04a4eb657563
More disabled: https://tbpl.mozilla.org/?tree=Try&rev=2c8e23f9cc1e
(Reporter)

Comment 2

4 years ago
Just a couple more disabled: https://tbpl.mozilla.org/?tree=Try&rev=27e549d10f33
(Reporter)

Comment 5

4 years ago
Skipping a few mochitest-8 tests seems to resolve a less-frequent but similar problem in m8.

But mochitest-5 feels like a moving target: I keep disabling the tests where the hang occurs, and then the hang occurs earlier. ** I see now that the hang occurs on the very last test in m5 ** or just before it (which could just be a truncated log). So all the tests are probably completing, and then we hang on shutdown.
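
One way to check the "hang is always at the end" observation across many logs is to compare the last test started in each log against the last test scheduled for the chunk. The following is only a sketch: it assumes the log marks each test with a "TEST-START | <path>" line, and the test paths and the helper names (last_started_test, stopped_at_end) are hypothetical, not part of any harness.

```python
import re

# Assumption: each test start is logged as "... TEST-START | <test path>".
TEST_START_RE = re.compile(r"TEST-START \| (?P<path>\S+)")


def last_started_test(log_lines):
    """Return the path of the last test the log shows starting, or None."""
    last = None
    for line in log_lines:
        match = TEST_START_RE.search(line)
        if match:
            last = match.group("path")
    return last


def stopped_at_end(log_lines, scheduled_tests):
    """True when the last test seen in the log is the last one scheduled."""
    if not scheduled_tests:
        return False
    return last_started_test(log_lines) == scheduled_tests[-1]


# Hypothetical data, for illustration only.
SCHEDULED = ["/tests/example/test_one.html", "/tests/example/test_two.html"]
LOG = [
    "100 INFO TEST-START | /tests/example/test_one.html",
    "101 INFO TEST-END | /tests/example/test_one.html",
    "102 INFO TEST-START | /tests/example/test_two.html",
]

if __name__ == "__main__":
    print(last_started_test(LOG))
    print(stopped_at_end(LOG, SCHEDULED))
```

A result of True for most failing logs would point at shutdown rather than at any one test. A result of False is ambiguous, since the log may simply be truncated.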


Maybe it would be better to identify a regression range and try a backout or two?

http://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=663657&tree=trunk&startday=2013-11-01&endday=2013-12-09 suggests we might start looking around Nov 27.
(Reporter)

Updated

4 years ago
See Also: → bug 924622
(Reporter)

Comment 6

4 years ago
My try runs have failed to identify a set of tests that can be disabled to avoid these timeouts. I am pretty sure that mochitest-5 is hanging on shutdown. I wonder if that's related to bug 924622 - a shutdown crash that happened occasionally during the try runs.
(Reporter)

Comment 7

4 years ago
I am not finding time to follow up on this.

Ideas for sorting this out:
 - resolve known shutdown crashes/hangs like bug 924622
 - disable subsets of tests in M5 to see if a particular test is to blame
 - identify a regression range and try some backouts
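
The "disable subsets" idea amounts to a bisection over the test list. A minimal sketch follows, assuming a hypothetical callback hangs(subset) that reports whether a run of just that subset times out. No such callback exists in the harness; in practice each call would be a try push, probably repeated several times because the failure is intermittent.

```python
def find_culprit(tests, hangs):
    """Bisect `tests` down to one test for which `hangs` stays true.

    `hangs` is a hypothetical callback: hangs(subset) -> bool.
    Returns None when no single test can be isolated.
    """
    candidates = list(tests)
    if not candidates or not hangs(candidates):
        return None
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first, second = candidates[:mid], candidates[mid:]
        if hangs(first):
            candidates = first
        elif hangs(second):
            candidates = second
        else:
            # Neither half reproduces the hang on its own, so the cause is
            # an interaction between tests or is not test-specific at all.
            return None
    return candidates[0]


# Hypothetical test names, for illustration only.
TESTS = ["test_a.html", "test_b.html", "test_c.html", "test_d.html"]

if __name__ == "__main__":
    # One test is to blame: bisection finds it.
    print(find_culprit(TESTS, lambda subset: "test_c.html" in subset))
    # Only the full run hangs: bisection gives up and returns None.
    print(find_culprit(TESTS, lambda subset: len(subset) == len(TESTS)))
```

The None case matters here: if the hang is in shutdown rather than in a test, disabling subsets will never isolate it.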
Assignee: gbrown → nobody
Summary: Frequent timeout in Android mochitest-5 during dom/imptests → Frequent timeout in Android mochitest-5 during dom/imptests (hang on shutdown)
(Reporter)

Updated

4 years ago
See Also: → bug 918759
(Reporter)

Comment 8

4 years ago
Try runs with more logging show that this is caused by a shutdown hang. 

The logs often look like there is a hang in the middle of a test, but that's just because the log isn't being flushed. If we enable devicemanager debugging, you can see the logcat being retrieved at the end of the test run and get the full picture.

https://tbpl.mozilla.org/?tree=Try&rev=7955825807ab

https://tbpl.mozilla.org/php/getParsedLog.php?id=32567931&tree=Try&full=1#error0

There seems to be an AsyncShutdown problem during the profile-before-change phase:

17:27:45     INFO - 01-05 16:33:21.374 I/Gecko   ( 2236): WARNING: A phase completion condition is taking too long to complete. Condition: AddonManager: shutting down providers Phase: profile-before-change
17:27:45     INFO - 01-05 16:33:21.374 I/Gecko   ( 2236): WARNING: A phase completion condition is taking too long to complete. Condition: OS.File: flush I/O queued before profile-before-change Phase: profile-before-change
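
To see which shutdown conditions show up across many logs, the warnings can be extracted with a small script. This is a sketch only: the regular expression is inferred from the two lines quoted above, and the function name slow_shutdown_conditions is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the two warning lines quoted above; other builds
# may word the message differently.
WARNING_RE = re.compile(
    r"WARNING: A phase completion condition is taking too long to complete\. "
    r"Condition: (?P<condition>.+?) Phase: (?P<phase>\S+)\s*$"
)


def slow_shutdown_conditions(lines):
    """Return a (phase, condition) pair for every matching warning line."""
    found = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            found.append((match.group("phase"), match.group("condition")))
    return found


SAMPLE = [
    "17:27:45     INFO - 01-05 16:33:21.374 I/Gecko   ( 2236): WARNING: "
    "A phase completion condition is taking too long to complete. "
    "Condition: AddonManager: shutting down providers "
    "Phase: profile-before-change",
    "17:27:45     INFO - 01-05 16:33:21.374 I/Gecko   ( 2236): WARNING: "
    "A phase completion condition is taking too long to complete. "
    "Condition: OS.File: flush I/O queued before profile-before-change "
    "Phase: profile-before-change",
]

if __name__ == "__main__":
    counts = Counter(slow_shutdown_conditions(SAMPLE))
    for (phase, condition), count in counts.items():
        print(f"{phase}: {condition} (x{count})")
```

Counting conditions over a batch of failing logs would show whether the same blockers recur in every hang or vary from run to run.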

:dteller -- Can you have a look?
Flags: needinfo?(dteller)

Generally, AsyncShutdown doesn't cause the problem; it detects it. I have a patch somewhere to add more logging to AsyncShutdown.
Flags: needinfo?(dteller)

I'll push the patch here: bug 957123.
(Reporter)

Comment 11

4 years ago
We have not seen more of these shutdown hangs reported in bug 663657 since 2014-01-09 -- the problem may be resolved.
(Reporter)

Comment 12

4 years ago
This does not seem to be a problem any longer, at least not reported this way. (There is another shutdown hang in bug 918759 -- best to track it there.)
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WORKSFORME