Closed Bug 829650 Opened 12 years ago Closed 12 years ago

Talos runs mozcrash more than once on the same minidump

Categories

(Testing :: Talos, defect)

ARM
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: jmaher)

References

Details

Attachments

(1 file)

(Note spotted this on runs before the recent deploy, so unrelated) On: https://tbpl.mozilla.org/php/getParsedLog.php?id=18716694&tree=Mozilla-Inbound Summary: { mozcrash ERROR | PROCESS-CRASH | ttest.py | application crashed [@ nssCertificate_Destroy] Thread 4 (crashed) mozcrash ERROR | PROCESS-CRASH | ttest.py | application crashed [@ nssCertificate_Destroy] Thread 4 (crashed) talosError: 'stack found after process termination (org.mozilla.fennec: terminated by testAgent.killProcess, plugin-container: terminated by testAgent.killProcess, crashreporter: terminated by testAgent.killProcess)' } If you look at both of the crash entries, they are using the same minidump (37a2d562-34e3-3f6a-2613a290-6a79a19b.dmp). I'm presuming we didn't clean up successfully after the first run, so low and behold find a minidump after the next run too. In ttest.py::cleanupAndCheckForCrashes() we have: { if browser_config['remote'] == True: # cleanup dumps on remote self._hostproc.removeDirectory(minidumpdir) } So we should be cleaning up surely?
here is what is happening: we have a function cleanupAndCheckForCrashes() where we raise an error: http://hg.mozilla.org/build/talos/file/tip/talos/ttest.py#l168 The problem is we call this from: http://hg.mozilla.org/build/talos/file/tip/talos/ttest.py#l380 and when the error is raised, we fall into the 'except' clause where we call cleanupAndCheckForCrashes() again! to fix this should we have a finally clause for cleanup, etc...?
Could we just get rid of the raises at http://hg.mozilla.org/build/talos/file/tip/talos/ttest.py#l168 ? This would not only solve this problem, but also mean we don't get the redundant "stack found after process termination" lines now that we have PROCESS-CRASH working correctly. I don't know how this would affect the result of the run (I haven't looked yet), but it would be good if crashes turned the run orange instead of red - which presumably would be done by stopping the exception and handling in buildbot like we do for the other suites?
(I think making the crashes appear orange would go some way to making devs realise it's an in-product problem not an automation problem, for these many many crashes)
if we have a crash it should be orange, since it is not infra, machine, network related. to get rid of the double crash, a finally wouldn't work, we could set a state variable such that it would indicate if we have processedCrashes and to ignore future processing. More likely we should remove all crash files from the device and local host once we are done processing them.
I wonder how to make talos turn orange? TEST-UNEXPECTED-FAIL ?
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #701856 - Flags: review?(jhammel)
Comment on attachment 701856 [details] [diff] [review] remove the crash from the device if we process it (1.0) lgtm!
Attachment #701856 - Flags: review?(jhammel) → review+
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Depends on: 832442
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: