Closed Bug 416863 Opened 15 years ago Closed 15 years ago

Move talos process management code to killableprocess


(Release Engineering :: General, defect, P3)



(Not tracked)



(Reporter: mikeal, Unassigned)




(1 file, 4 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9b4pre) Gecko/2008020904 Minefield/3.0b4pre
Build Identifier: 

There have been some talos issues where the windows machines can't properly kill the firefox process.

killableprocess is a python module with various contributors that has been passed around for a few years that acts as a drop in replacement for subprocess but keeps a handle of any child processes and should be able to kill them when the time comes. It has a good amount of code that specifically deals with windows process issues of this nature.

Reproducible: Always

Steps to Reproduce:
Attached patch killableprocess patch 1 (obsolete) — Splinter Review
Attachment #302634 - Flags: review?
Assignee: nobody → mrogers
Ever confirmed: true
Comment on attachment 302634 [details] [diff] [review]
killableprocess patch 1

review request
Attachment #302634 - Flags: review? → review?(anodelman)
Comment on attachment 302634 [details] [diff] [review]
killableprocess patch 1

Seems to be missing something.  This is the error I get when I attempt to do a "python --debug --noisy sample.config".

Traceback (most recent call last):
  File "", line 63, in <module>
    import ttest
  File "c:\mozilla\testing\performance\talos\", line 58, in <module>
    import killableprocess, subprocess
  File "c:\mozilla\testing\performance\talos\", line 73, in <module>
    import winprocess
ImportError: No module named winprocess
Attachment #302634 - Flags: review?(anodelman) → review-
Attached patch killableprocess patch 2 (obsolete) — Splinter Review
Whoops. Forgot to include in the last patch.
Attachment #302634 - Attachment is obsolete: true
Attachment #302706 - Flags: review?(anodelman)
Comment on attachment 302706 [details] [diff] [review]
killableprocess patch 2

Are there some extra dependencies that I should know about?

For python 2.4 (which the talos boxes run with)
Traceback (most recent call last):
  File "", line 63, in ?
    import ttest
  File "c:\mozilla\testing\performance\talos\", line 58, in ?
    import killableprocess, subprocess
  File "c:\mozilla\testing\performance\talos\", line 73, in ?
    import winprocess
  File "c:\mozilla\testing\performance\talos\", line 37, in ?
    from ctypes import c_void_p, POINTER, sizeof, Structure, windll, WinError, WINFUNCTYPE
ImportError: No module named ctypes

With python 2.5 (just for fun):
  File "c:\mozilla\testing\performance\talos\", line 223, in runCommand
    return Popen(cmd, **kwargs)
  File "c:\Python25\lib\", line 593, in __init__
    errread, errwrite)
  File "c:\mozilla\testing\performance\talos\", line 121, in _execute_child
    startupinfo.wShowWindow = winprocess.SW_HIDE
AttributeError: 'module' object has no attribute 'SW_HIDE'
Attachment #302706 - Flags: review?(anodelman) → review-
Maybe we should hold off on these patches till you get your winxp cd and can get through some of these initial pitfalls yourself. :)
Yeah, not being able to test on windows is incredibly annoying.

This module does require ctypes, which was added to standard lib in Python 2.5 as you can see. I thought that all the boxes running buildbot stuff were on 2.5, but I guess I was wrong.

That error on 2.5 is interesting, I hadn't thought to weed through killableprocess and winprocess for issues as it's in use by so many other people but looking at that code it's an obvious bug, unfortunately I can't see where SW_HIDE is suppose to be imported from until I get a windows VM and can look through the ctypes module.

If this is going to be 2.5 dependent then it is going to be pretty hard to deploy - not that we shouldn't consider it if it gives us some good wins stability wise.  Just that I don't relish the idea of having to update 30 machines.

I recall there being some other reason that we weren't using 2.5 having to do with buildbot... I'd need to do some research on that.
Alternatively, we could install the ctypes module on all the windows boxes.
Big Update.

I have a new patch I will be uploading momentarily, but refrain from reviewing it until I work on testing that the patch takes care of crashreporter.

The way killableprocess works is that I sets various options on all operating systems to tie a particular process, and any new processes that it spawns, to a group pid ( or equivalent on windows ). 

The patch has required ripping out all of the old "lookup process by name and kill it" code with new code that uses the process handler. In earlier testing Alice said that this may not be picking up the crashreporter, I don't know the innards of how the crashreporter works but I think this may be do to the fact that the crashreporter is launched in some way that it is not tied to the same group id.

I think the most reliable fix we can have for all edge cases is to simply stick back some of the "lookup process by name and kill it" code to be run _after_ the new killableprocess code. At that point we'll be doing all that we can feasibly do to kill open firefox and crashreporter processes.

The patch I currently have is not complete but I'll be sticking it in the bug for good measure, it should be solid in all the cases except killing the crash reporter.
Attached patch killable patch 3 (obsolete) — Splinter Review
Not for review. There still may be an issue killing crashreporter.
Attachment #302706 - Attachment is obsolete: true
Attached patch killable patch 4 (obsolete) — Splinter Review
The previous patch had tons of windows endline changes. This one is cleaned up.
Attachment #306060 - Attachment is obsolete: true
cool! looks like a good start on it. I noticed one thing at the start of the patch:

? killable_patch4.diff

I think you've got a diff file in your cvs directory. Should remove those when generating diffs.
Priority: -- → P2
Final patch.. hopefully.

Tested on all platforms.
Attachment #306165 - Attachment is obsolete: true
Attachment #307108 - Flags: review?(anodelman)
Component: Testing → Release Engineering
Product: Core →
Version: unspecified → other
QA Contact: testing → release
Are we still going to take this, or did the other buildbot patch fix this?

Priority: P2 → P3
This patch may be made unnecessary by bug 420216 which will handle killing of all subprocesses at the buildbot/twisted level.
Comment on attachment 307108 [details] [diff] [review]
killableprocess patch 5

I think that the base talos code has changed enough that this patch needs a refresh to be reviewable.
Attachment #307108 - Flags: review?(anodelman) → review-
Assignee: mrogers → nobody
Negated by bug 420216 comment 20
Closed: 15 years ago
Resolution: --- → WONTFIX
Bug#420216 seems to do good enough for us.
Component: Release Engineering: Talos → Release Engineering
Product: → Release Engineering
You need to log in before you can comment on or make changes to this bug.