Closed Bug 919353 Opened 11 years ago Closed 11 years ago

need stack traces when aborting on timeouts in b2g emulator mochitests

Categories

(Release Engineering :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: karlt, Assigned: ahal)

References

Details

Attachments

(2 files, 1 obsolete file)

In https://tbpl.mozilla.org/php/getParsedLog.php?id=28100440&tree=Mozilla-Inbound#error0
for example, the application is killed with signal 9, which won't produce a stack trace.

I don't know whether there is a mismatch of timeouts, so that a child script that would use the appropriate signal is not reaching the timeout, or whether the child script is failing to detect the timeout, or whether the child script is not passing the output to be seen by the parent reporting "command timed out: 1200 seconds without output, attempting to kill", or whether there is no child script and the "parent" is using the wrong signal.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Even when bug 904839 is solved, we'll still need a better signal:

http://hg.mozilla.org/mozilla-central/annotate/f97307cb4c95/testing/mozbase/mozprocess/mozprocess/processhandler.py#l111
Blocks: 916135
Status: RESOLVED → REOPENED
Depends on: 904839
Resolution: DUPLICATE → ---
Apparently the b2g harness doesn't share much in common with the desktop harness anymore, but in the desktop harness we explicitly handle timeouts and kill to get a stack:
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#933
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#1064
Once [1] lands, the easy fix is to just add self.killAndGetStack(). The longer but proper fix is to massage B2G back into the desktop harness now that it no longer depends on automation.py.

[1] https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=917252&attachment=809432
Depends on: 917252, 917750
What is the status of this bug?
I added a call to check for crashes in bug 930025 when the harness times out. The original log isn't up any more so I'm not sure how to check if my patch did anything or not.

Though based on comment 2 that might not be enough? Is the fix here just to change signal.SIGKILL to signal.SIGABRT (https://github.com/mozilla/mozbase/blob/master/mozprocess/mozprocess/processhandler.py#L131)?
I'd rather have this handled by mozcrash by fixing bug 890026 and working out the right integration, but using SIGABRT there is probably good enough to start getting stacks.
I agree the mozcrash integration is the best way forward. Though this patch should get things working in the short term.

I don't think we want to always use SIGABRT, so this patch sets it up so it is only called with SIGABRT when a timeout occurs. This has no affect on Windows (at least until we depend on python 2.7 and can use os.kill everywhere).
Assignee: nobody → ahalberstadt
Status: REOPENED → ASSIGNED
Attachment #822353 - Flags: review?(ted)
Oops, forgot the SIG prefix on SIGABRT
Attachment #822353 - Attachment is obsolete: true
Attachment #822353 - Flags: review?(ted)
Attachment #822360 - Flags: review?(ted)
While I think we should still take that patch, it won't actually help us in the B2G case. This is because the SIGABRT is being sent to adb in this instance and not the b2g process on the device. We'll need call devicemanager.killProcess or something similar.
Depends on: 931078
Once bug 931078 lands, this is all that's needed to get minidumps (tested locally). This means we can either discard the previous patch or land it anyway (as I'd argue we should be sending SIGABRT on timeouts in the general case anyway).

I don't really care either way.
Attachment #822428 - Flags: review?(ted)
Attachment #822428 - Flags: review?(ted) → review+
https://hg.mozilla.org/mozilla-central/rev/5e2dc8523572
Status: ASSIGNED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
:karlt, I think you should be seeing stack traces on timeouts now. If that's not the case, let me know so we can continue investigating.
Sounds great, thank you very much.
Comment on attachment 822360 [details] [diff] [review]
0001-Bug-919353-mozprocess-should-use-SIGABRT-on-process-.patch

So per that other bug, can we just make the B2G harness kill with SIGABRT in an onTimeout handler?
Comment on attachment 822360 [details] [diff] [review]
0001-Bug-919353-mozprocess-should-use-SIGABRT-on-process-.patch

I think we want to go in the direction we did with bug 932349 instead.
Attachment #822360 - Flags: review?(ted) → review-
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: