Investigate timeout handling issues with B2G xpcshell tests

RESOLVED INVALID

Status

Testing
XPCShell Harness
RESOLVED INVALID
5 years ago
14 days ago

People

(Reporter: Paolo, Unassigned)

Tracking

Trunk
All
Gonk (Firefox OS)
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Description

5 years ago
Problems with the outer timeout handling may be preventing us from getting the
test log when an inner xpcshell timeout fires. I found this while investigating
B2G failures in bug 869144:

https://tbpl.mozilla.org/php/getParsedLog.php?id=24774410&tree=Mozilla-Inbound#error0

This may or may not be related to recent changes in how xpcshell timeouts are
handled (made in the bugs listed as dependencies).
Specifically, it looks like the changes in bug 597064 don't actually work for the B2G runner, since we're getting this error:
01:09:29     INFO -  Traceback (most recent call last):
01:09:29     INFO -    File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
01:09:29     INFO -      self.run()
01:09:29     INFO -    File "/usr/lib/python2.7/threading.py", line 755, in run
01:09:29     INFO -      self.function(*self.args, **self.kwargs)
01:09:29     INFO -    File "/builds/slave/test/build/tests/xpcshell/runxpcshelltests.py", line 889, in <lambda>
01:09:29     INFO -      testTimer = Timer(HARNESS_TIMEOUT, lambda: self.testTimeout(name, proc.pid))
01:09:29     INFO -  AttributeError: 'NoneType' object has no attribute 'pid'

From this line:
http://mxr.mozilla.org/mozilla-central/source/testing/xpcshell/runxpcshelltests.py#889
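One possible reading of the traceback above is that the timer's lambda looks up `proc` only when the timer fires, and at that point `proc` is still `None` for the B2G runner. The sketch below is not the harness code; it only illustrates a guard against that failure mode. `FakeProc`, `make_timeout_callback`, `get_proc` and `on_timeout` are hypothetical names introduced for this illustration.

```python
import threading


class FakeProc:
    """Hypothetical stand-in for a launched test process."""

    def __init__(self, pid):
        self.pid = pid


def make_timeout_callback(get_proc, on_timeout, name):
    """Build a timer callback that tolerates a missing process handle.

    get_proc is called when the timer fires, so it sees the current
    handle (which may still be None) instead of raising AttributeError.
    """
    def callback():
        proc = get_proc()
        # Assumption: a None handle means there is no local pid to report.
        pid = proc.pid if proc is not None else None
        on_timeout(name, pid)
    return callback


calls = []
state = {"proc": None}  # no process handle has been assigned yet
callback = make_timeout_callback(
    lambda: state["proc"],
    lambda name, pid: calls.append((name, pid)),
    "test_example.js",
)

# Fire once through a real Timer while the handle is still None.
timer = threading.Timer(0.01, callback)
timer.start()
timer.join()

# Fire again directly once a handle exists.
state["proc"] = FakeProc(1234)
callback()
```

With a guard like this, the timeout handler still runs and can log the failure even when there is no pid to kill; whether passing `None` through is acceptable depends on what `testTimeout` does with the pid.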
Blocks: 966990
I'm not sure whether this is related to what Paolo was seeing, but I'm seeing a case where we crash and the harness then fails to detect a timeout. Ultimately, mozharness is what kills the job (without checking for minidumps):

https://tbpl.mozilla.org/php/getParsedLog.php?id=33989315&tree=Mozilla-Inbound
The problem I'm seeing stems from the fact that the assertion happens between tests, after the previous testTimeout timer is canceled and before the next testTimeout timer is created. It looks like, in addition to the per-test timers implemented in bug 597064, we also need some kind of global output timeout (à la mozprocess). This seems like a separate issue from what Paolo filed, so I'll file a new bug.
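As a rough illustration of the global output timeout idea, here is a minimal watchdog sketch. It is not mozprocess's implementation and not a proposed patch; `OutputWatchdog`, `note_output` and the timing values are assumptions made for this example. The watchdog is re-armed whenever output is seen, so a hang between tests (when no per-test timer is active) would still trigger it.

```python
import threading
import time


class OutputWatchdog:
    """Hypothetical helper: calls on_timeout if no output is noted
    for `timeout` seconds, regardless of which test is running."""

    def __init__(self, timeout, on_timeout):
        self.timeout = timeout
        self.on_timeout = on_timeout
        self._lock = threading.Lock()
        self._timer = None

    def _arm(self):
        # Caller must hold self._lock.
        self._timer = threading.Timer(self.timeout, self.on_timeout)
        self._timer.daemon = True
        self._timer.start()

    def start(self):
        with self._lock:
            self._arm()

    def note_output(self):
        """Reset the countdown; call this for every line of output."""
        with self._lock:
            if self._timer is None:
                return  # not started, or already stopped
            self._timer.cancel()
            self._arm()

    def stop(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


fired = threading.Event()
watchdog = OutputWatchdog(0.5, fired.set)
watchdog.start()

# Simulated output arriving regularly: the watchdog keeps being reset.
for _ in range(3):
    time.sleep(0.05)
    watchdog.note_output()
fired_while_active = fired.is_set()

# Simulated hang between tests: no more output, so the watchdog fires.
fired_after_hang = fired.wait(5.0)
watchdog.stop()
```

In a real harness the `on_timeout` callback would presumably kill the process and check for minidumps, which is the step that is skipped when mozharness has to kill the job instead.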
B2G is gone.
Status: NEW → RESOLVED
Last Resolved: 28 days ago
Resolution: --- → INVALID