Sisyphus - Crash Automation - workers should not exit unless explicitly terminated

RESOLVED FIXED

Status

Testing
Sisyphus
8 years ago
8 years ago

People

(Reporter: bc, Assigned: bc)

Tracking

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Assignee)

Description

8 years ago
Created attachment 437050 [details] [diff] [review]
patch v1

Workers should try to restart if they have unrecoverable problems rather than just terminate. This will help in automating the management of multiple workers.
Attachment #437050 - Flags: review?(ctalbert)
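
The restart-instead-of-exit behavior described above can be sketched roughly as follows. This is only an illustration, not the code in patch v1: the names run_forever, do_work, and WorkerTerminated are hypothetical, and the exponential backoff is an added assumption to avoid a tight failure loop.

```python
import time


class WorkerTerminated(Exception):
    """Hypothetical signal that the worker was explicitly told to stop."""


def run_forever(do_work, max_backoff=60.0, sleep=time.sleep):
    """Call do_work repeatedly, restarting after any unexpected error.

    Returns the number of restarts once the worker is explicitly terminated.
    """
    restarts = 0
    backoff = 1.0
    while True:
        try:
            do_work()
            backoff = 1.0  # a clean pass resets the delay
        except (WorkerTerminated, KeyboardInterrupt):
            # Explicit termination is the only way out of the loop.
            return restarts
        except Exception as exc:
            restarts += 1
            print(f"worker failed ({exc!r}); restarting in {backoff:.0f}s")
            sleep(backoff)
            backoff = min(backoff * 2, max_backoff)
```

The sleep function is a parameter so the loop can be exercised without real delays.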

Comment 1

8 years ago
Comment on attachment 437050 [details] [diff] [review]
patch v1

This looks good. While thinking about what the worker would hit when it restarts, I started thinking about rogue Firefox processes and Firefox subprocesses (as we talked about today). Does it make sense for the worker to do some kind of ps -A | grep Firefox and kill -9 anything it finds during this restart step? Or is that logic handled by the sisyphus system itself?

r+, contingent on resolving that ^ issue.
Attachment #437050 - Flags: review?(ctalbert) → review+
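
For illustration, the name-based "ps and grep" idea raised in this comment might look like the sketch below. The function name matching_pids and its parameters are hypothetical; it only selects PIDs from ps output and leaves the actual kill to the caller. It skips the worker's own PID so the worker cannot select itself.

```python
import os


def matching_pids(ps_output, needle="firefox", own_pid=None):
    """Return PIDs whose command contains needle (case-insensitive).

    Expects one "PID COMMAND" pair per line; other lines are ignored.
    """
    own_pid = os.getpid() if own_pid is None else own_pid
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # header or malformed line
        pid, command = int(fields[0]), fields[1]
        if pid != own_pid and needle.lower() in command.lower():
            pids.append(pid)
    return pids
```

On a POSIX system the input could come from something like subprocess.run(["ps", "-A", "-o", "pid=", "-o", "comm="], capture_output=True, text=True).stdout, and each returned PID could then be passed to os.kill. Matching by name is broad: it would also hit any unrelated Firefox instance on the machine.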
(Assignee)

Comment 2

8 years ago
Created attachment 438680 [details] [diff] [review]
killTest patch

I started working on a patch to do that. The previous killTest might accidentally kill the worker, and it only tried to kill processes when there was an indication that something had gone wrong. If the worker was unaware of the stuck processes, it wouldn't handle them. This patch kills anything below the build directory, either before a build is performed or before a test runs, so I think it will cover the bases you outlined. It currently has the build directory hard-wired. I should be able to get the build directory another way, so it's not quite ready for prime time. It *appears* to be working in testing though. Another thing it could do better is flag the case where a process is still running after the test *should have* completed.
Attachment #438680 - Flags: feedback?(ctalbert)
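
A minimal sketch of the "kill anything below the build directory" approach, assuming the caller already has a list of (pid, executable path) pairs. This is not the killTest patch itself; is_below, find_victims, and kill_all are hypothetical names, and the build directory is passed in rather than hard-wired. kill_all relies on SIGKILL and is therefore POSIX-only.

```python
import os
import signal


def is_below(path, directory):
    """True if path lies inside directory once symlinks are resolved."""
    path = os.path.realpath(path)
    directory = os.path.realpath(directory)
    try:
        return os.path.commonpath([path, directory]) == directory
    except ValueError:
        return False  # e.g. paths on different drives


def find_victims(processes, build_dir, own_pid=None):
    """Select PIDs whose executable lives below build_dir.

    The worker's own PID is always excluded so it cannot kill itself.
    """
    own_pid = os.getpid() if own_pid is None else own_pid
    return [
        pid
        for pid, exe in processes
        if pid != own_pid and exe and is_below(exe, build_dir)
    ]


def kill_all(pids, sig=None):
    """Send sig (SIGKILL by default) to each PID, ignoring ones already gone."""
    sig = signal.SIGKILL if sig is None else sig
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # process exited between listing and killing
```

Selecting by path rather than by process name keeps the kill scoped to binaries from the build under test, and the comparison is done on whole path components so a sibling directory with a similar prefix is not matched.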
(Assignee)

Updated

8 years ago
Attachment #438680 - Attachment is patch: true
Attachment #438680 - Attachment mime type: application/octet-stream → text/plain

Comment 3

8 years ago
Comment on attachment 438680 [details] [diff] [review]
killTest patch

This looks like the right approach to me.
Attachment #438680 - Flags: feedback?(ctalbert) → feedback+
(Assignee)

Comment 4

8 years ago
http://hg.mozilla.org/qa/sisyphus/rev/f07b83b552ff
(Assignee)

Comment 5

8 years ago
Created attachment 438994 [details] [diff] [review]
killTest patch v2

Same as v1, with the addition of calling killTest when exiting or restarting the worker. This still has the hard-coded path, but since we are rethinking the build issues, let's go with it as is for now.
Attachment #438994 - Flags: review?(ctalbert)
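
One way to guarantee a cleanup call on every exit or restart path is a try/finally wrapper, sketched here as a context manager. The name cleanup_on_leave and the kill_test callable are hypothetical stand-ins; the actual patch may wire killTest in differently.

```python
from contextlib import contextmanager


@contextmanager
def cleanup_on_leave(kill_test):
    """Run kill_test when the block ends, whether normally or by exception."""
    try:
        yield
    finally:
        kill_test()
```

Wrapping each pass of the worker's main loop in "with cleanup_on_leave(kill_test):" would run the cleanup both when the worker exits and when an error triggers a restart.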

Updated

8 years ago
Attachment #438994 - Flags: review?(ctalbert) → review+
(Assignee)

Comment 6

8 years ago
http://hg.mozilla.org/qa/sisyphus/rev/7f13b712bfdf
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED