Closed Bug 464191 Opened 11 years ago Closed 11 years ago

ssltunnel process persists after test run

Categories

(Testing :: Mochitest, enhancement)

enhancement
Not set

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.9.2a1

People

(Reporter: sayrer, Assigned: bjarne)

References

Details

(Keywords: fixed1.9.1, Whiteboard: [fixed1.9.1b4])

Attachments

(1 file, 11 obsolete files)

8.76 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
I was running mochitest on my machine today, and I've ended up with a bunch of ssltunnel processes. This is on linux.
Summary: sstunnel process persists after test run → ssltunnel process persists after test run
Assignee: nobody → honzab
Status: NEW → ASSIGNED
I was trying to autorun mochitest (a subset and complete) with same parameters as on tinderboxes (--autorun --console-level=INFO --close-when-done) and I cannot reproduce it. I've been testing on virtual machine with CentOS 5 installation similar to one on tinderboxes and python 2.5.1. What is your configuration and runtests.py arguments?
Attached patch Simple python 2.3+ patch (obsolete) — Splinter Review
The problem is that automation.py forks a shell which in turn forks the ssltunnel-executable. The pid handled by automation.py is the pid of the shell, but killing it does not kill child-processes (i.e. the ssltunnel-executable).

The provided patch handles this by setting a process-group for the shell (which is inherited by the ssltunnel-executable), and then kills the whole process-group instead of just the shell.

NOTE : This uses functionality introduced in Python 2.3. There is a comment about the need to support pre-2.3 Python in the class Process in automation.py, but I'm not sure how pertinent this is anymore.

Another way to avoid the problem of two processes might be to make the shell do an "exec ssltunnel" instead of just executing it like a normal program. I have not tried this though, so I don't know if there are pitfalls...
Cool, Bjarne, any theory why it could not be consistently reproducible on mac os x 10.4 and centos machine?
(In reply to comment #3)
> Cool, Bjarne, any theory why it could not be consistently reproducible on mac
> os x 10.4 and centos machine?

Probably (I'm guessing here!) because centos and mac os use a proper bash-shell for system-shell. On default Ubuntu it's just a link to dash, which is a lot less sophisticated shell.

We should resolve the question about using Python 2.3., but I don't know who should make such a decision - maybe you can add relevant people to the cc-list or request a review by the right guy?
Attachment #350752 - Attachment description: Simple 2.3+ patch → Simple python 2.3+ patch
Attachment #350752 - Flags: review?(jwalden+bmo)
We've already committed to Python 2.4 or newer in bug 427750. In fact, we should rewrite a bunch of automation.py to use the subprocess module in light of this.
(In reply to comment #5)
> We've already committed to Python 2.4 or newer in bug 427750. In fact, we
> should rewrite a bunch of automation.py to use the subprocess module in light
> of this.

Yes, the subprocess module would be another way to deal with this. However, I would vote against rewriting automation.py just for the sake of rewriting. :)
I think the amount of code removal we could perform would be worth it, but I'm also not volunteering. I suspect someone will do it eventually while fixing another bug.
Comment on attachment 350752 [details] [diff] [review]
Simple python 2.3+ patch

I recognize this problem description from some previous Mochitest modification -- anybody else have time to find the bug for that?  I think it might have been related to the xpcshell process for the server, but memory fails me.  We should figure out some way to prevent this problem from manifesting itself yet again in future code -- better documentation, somewhere?  I don't have good ideas yet, and I'm not helped by remembering the exact details of this problem the last time we hit it.

Looking at this change, first, I don't recall it being the one used the last time around.  Second, the docs I can find on setpgid say the call to it is racing against the shell actually being started, or at least that's how I'm interpreting them.  (Whether the race is a problem in practice is another question.)  Third, the availability of setpgid is Unix only, so you'd need to guard against it with IS_UNIX or whatever we used as the appropriate constant.

The way we did this last time is what we really should be doing here, tho, as far as I can remember, so we need to at least dig that up and check that it's not appropriate before reverting to setpgid here.
Attachment #350752 - Flags: review?(jwalden+bmo) → review-
...which actually does suggest switching to subprocess as the solution.  :-)
Bjarne is going to take a deeper look at this.
Assignee: honzab.moz → bjarne
Attached patch patch v1.0 using subprocess (obsolete) — Splinter Review
This builds on patch v2.2 from bug #456001.

I have successfully done a full, clean build plus mochitests on Ubuntu 8,04, but it needs testing on Windows, Mac, Centos etc.
Attachment #350752 - Attachment is obsolete: true
Attachment #351534 - Flags: review?(jwalden+bmo)
Tested on win xp, centos5 and mac 10.4. Building, runtests.py is working, ssltunnel is killed on all platforms.

Think left to do:
stdin=inputdata is wrong; stdin expects file and inputdata (designed to be as simple as possible to pass an input) is a string. currently inputdata is used just for CA manual re-generation what is more then rare. anyway, this is obviously wrong to be left in the patch as is. there is relatively easy way to make it work:
   1. use StringIO (what I originally tried but failed because StringIO doesn't in all cases behave as a file, see bug 428009 comment 42)
   2. or better, use tempfile.TemporaryFile() as was used for windows version of process run; I found it reliable

Now I will run profile guided builds to check it still works and feedback again ASAP.
Updated to handle comments about inputdata from Honza. Using communicate() to pass the given string to certutil, which should also work on Windows.
Attachment #351534 - Attachment is obsolete: true
Attachment #351614 - Flags: review?(jwalden+bmo)
Attachment #351534 - Flags: review?(jwalden+bmo)
Yep, this patch looks quit good. I again tested mochitest and generation of the ca cert (to test communicate). All works. I unfortunately cannot tell if PGO works because PGO build is broken on my mac and linux machine, it occasionally fails even w/o the patch, have yet to test on win. Maybe it is a known problem?
And one more thing: we will probably have to land patch for bug 456001 with this one in a signle changeset or fold them together. Other possibility is to flip them - land this first.
Attachment #351614 - Flags: review?(jwalden+bmo) → review-
Comment on attachment 351614 [details] [diff] [review]
patch v1.1 using subprocess with communicate()

>diff -r 303d387cff21 build/pgo/automation.py.in

>+import subprocess

I'm not sure when it happened, but these imports should be alphabetical unless this order is needed to not break functionality (which I doubt).  Please re-alphabetize these imports when you add the subprocess one here.


>-           "Process",
>+           "KillableSubprocess",

It's good to rename to catch any forgotten renamings, but KillableSubprocess is rather long to type, and the "Killable" penalty on every use just to remind the user that the .kill() method exists isn't worth paying.  Would it be a problem to use "Subprocess", or perhaps even keep the "Process" name?


>+class KillableSubprocess(subprocess.Popen):
>+  "Represents a subprocess which is killable (can be stopped by parent-process)"

"by the parent process", and put a period at the end of this.


>   def kill(self):
>-    "Kills this process."
>-    try:
>-      if not IS_WIN32:
>-        os.kill(self._process.pid, signal.SIGKILL)
>-      else:
>-        import subprocess
>-        pid = "%i" % self.pid
>-        process = subprocess.Popen(["taskkill", "/F", "/PID", pid])
>-        process.wait()
>-    except:
>-      pass
>+    if IS_WIN32:
>+      import ctypes
>+      ctypes.windll.kernel32.TerminateProcess(int(self._handle), -1)
>+    else:
>+      os.kill(self.pid, signal.SIGTERM)

Why are you changing the method of killing the process?  I don't think you should do this, at the very least not in this bug.  I've heard horror stories about TerminateProcess not actually killing Windows processes in a clean manner similar to Unix, so I'm doubly leery of this change.


>-  status = Process(certutil, ["-N", "-d", profileDir, "-f", pwfilePath], environment()).wait()
>+  status = subprocess.Popen(
>+                  [certutil, "-N","-d", profileDir, "-f", pwfilePath],
>+                  env=environment()).wait()

Assemble the command and arguments into an array on one line, then pass that to the constructor call in another.  Avoid going over 80 characters per line as a general rule.

Put one space on each side of the = in a keyword argument definition.


>     if ext == ".ca":
>-      Process(certutil, ["-A", "-i", os.path.join(CERTS_DIR, item), "-d", profileDir, "-f", pwfilePath, "-n", root, "-t", "CT,,"], environment()).wait()
>+      subprocess.Popen(
>+                  [certutil, "-A", "-i", os.path.join(CERTS_DIR, item), "-d", profileDir, "-f", pwfilePath, "-n", root, "-t", "CT,,"],
>+                  env=environment()).wait()

Same things here.


>     if ext == ".client":
>-      Process(pk12util, ["-i", os.path.join(CERTS_DIR, item), "-w", pwfilePath, "-d", profileDir], environment()).wait()
>+      subprocess.Popen(
>+                  [pk12util, "-i", os.path.join(CERTS_DIR, item), "-w", pwfilePath, "-d", profileDir],
>+                  env=environment()).wait()

And here.


>     # start ssltunnel to provide https:// URLs capability
>     ssltunnel = DIST_BIN + "/ssltunnel" + BIN_SUFFIX
>-    ssltunnelProcess = Process(ssltunnel, [os.path.join(CERTS_DIR, "ssltunnel.cfg")], environment())
>+    ssltunnelProcess = KillableSubprocess(
>+                           [ssltunnel, os.path.join(CERTS_DIR, "ssltunnel.cfg")],
>+                           env=environment())

And here.


>-  args = []
>+  args = [cmd]

Rename |args| to |command| or |cmdline| or something like that, since it's no longer actually just the arguments.


>-  proc = Process(cmd, args, env = environment(env))
>+  proc = subprocess.Popen(args, env = environment(env))

This is the style of whitespace in env-setting, if it wasn't clear.


>diff -r 303d387cff21 build/pgo/genpgocert.py.in

>+import subprocess

Is this import necessary?  I don't think it is, but I'm not sure.


> def runUtil(util, args, inputdata = None):
>-  proc = automation.Process(util, args, automation.environment(), inputdata)
>-  return proc.wait()
>+  cmdline = [util]
>+  cmdline.extend(args)
>+  proc = subprocess.Popen(cmdline, env=automation.environment(), stdin=subprocess.PIPE)
>+  if inputdata:
>+    proc.communicate(inputdata)
>+    return proc.returncode
>+  else:
>+    return proc.wait()

An else: after a return is meaningless; get rid of the else and unindent the |return proc.wait()|.


>diff -r 303d387cff21 testing/mochitest/runtests.py.in

>     xpcshell = automation.DIST_BIN + "/" + "xpcshell";
>-    self._process = automation.Process(xpcshell, args, env = env)
>+
>+    cmd = [xpcshell]
>+    cmd.extend(args)
>+
>+    self._process = automation.KillableSubprocess(cmd, env=env)

Don't bother with the |xpcshell| temporary; just create the array with that value in it directly.
(In reply to comment #17)
> (From update of attachment 351614 [details] [diff] [review])
> >diff -r 303d387cff21 build/pgo/automation.py.in
> 
> >+import subprocess
> 
> I'm not sure when it happened, but these imports should be alphabetical unless
> this order is needed to not break functionality (which I doubt).  Please
> re-alphabetize these imports when you add the subprocess one here.

Ok.

> 
> >-           "Process",
> >+           "KillableSubprocess",
> 
> It's good to rename to catch any forgotten renamings, but KillableSubprocess is
> rather long to type, and the "Killable" penalty on every use just to remind the
> user that the .kill() method exists isn't worth paying.  Would it be a problem
> to use "Subprocess", or perhaps even keep the "Process" name?

I absolutely disagree. :)

The only reason for this subclass is to introduce the kill() method. Thus, the only reason to use the subclass is that you plan to kill the process explicitly. For all other usage the standard library-class is sufficient and should be used IMO. Making explicit the fact that this is a killable process discourages habitual use of the subclass (or even worse, usage because it's quicker to write "Process" than "subprocess.Popen").

However, I'm a pragmatic guy and will of-course change this if you insist. :)

> >+class KillableSubprocess(subprocess.Popen):
> >+  "Represents a subprocess which is killable (can be stopped by parent-process)"
> 
> "by the parent process", and put a period at the end of this.

Ok. I guess that in fact any process with sufficient privs and the object-reference will be able to kill this subprocess (not just the parent). Would this phrase be ok ?

Represents a subprocess with a kill() method.


> >   def kill(self):
> >-    "Kills this process."
> >-    try:
> >-      if not IS_WIN32:
> >-        os.kill(self._process.pid, signal.SIGKILL)
> >-      else:
> >-        import subprocess
> >-        pid = "%i" % self.pid
> >-        process = subprocess.Popen(["taskkill", "/F", "/PID", pid])
> >-        process.wait()
> >-    except:
> >-      pass
> >+    if IS_WIN32:
> >+      import ctypes
> >+      ctypes.windll.kernel32.TerminateProcess(int(self._handle), -1)
> >+    else:
> >+      os.kill(self.pid, signal.SIGTERM)
> 
> Why are you changing the method of killing the process?  I don't think you
> should do this, at the very least not in this bug.  I've heard horror stories
> about TerminateProcess not actually killing Windows processes in a clean manner
> similar to Unix, so I'm doubly leery of this change.

Ok...  probably safer to stick to to the old, battle-hardened way. Will fix

> >-  status = Process(certutil, ["-N", "-d", profileDir, "-f", pwfilePath], environment()).wait()
> >+  status = subprocess.Popen(
> >+                  [certutil, "-N","-d", profileDir, "-f", pwfilePath],
> >+                  env=environment()).wait()
> 
> Assemble the command and arguments into an array on one line, then pass that to
> the constructor call in another.  Avoid going over 80 characters per line as a
> general rule.

Ok.

> Put one space on each side of the = in a keyword argument definition.

Ok.

> >-  args = []
> >+  args = [cmd]
> 
> Rename |args| to |command| or |cmdline| or something like that, since it's no
> longer actually just the arguments.

Yup - make sense.

> >+import subprocess
> 
> Is this import necessary?  I don't think it is, but I'm not sure.

I'll check but I think it's used in runUtil()...

> > def runUtil(util, args, inputdata = None):
> >-  proc = automation.Process(util, args, automation.environment(), inputdata)
> >-  return proc.wait()
> >+  cmdline = [util]
> >+  cmdline.extend(args)
> >+  proc = subprocess.Popen(cmdline, env=automation.environment(), stdin=subprocess.PIPE)
> >+  if inputdata:
> >+    proc.communicate(inputdata)
> >+    return proc.returncode
> >+  else:
> >+    return proc.wait()
> 
> An else: after a return is meaningless; get rid of the else and unindent the
> |return proc.wait()|.

According to specs, communicate() waits for process-termination before it returns. I'm not sure how calling wait() after communicate() will work, which is why I return the returncode in this case. Maybe you know?

> >diff -r 303d387cff21 testing/mochitest/runtests.py.in
> 
> >     xpcshell = automation.DIST_BIN + "/" + "xpcshell";
> >-    self._process = automation.Process(xpcshell, args, env = env)
> >+
> >+    cmd = [xpcshell]
> >+    cmd.extend(args)
> >+
> >+    self._process = automation.KillableSubprocess(cmd, env=env)
> 
> Don't bother with the |xpcshell| temporary; just create the array with that
> value in it directly.

Ok.
(In reply to comment #18)
> (In reply to comment #17)
> > >-           "Process",
> > >+           "KillableSubprocess",
> > 
> > It's good to rename to catch any forgotten renamings, but KillableSubprocess is
> > rather long to type, and the "Killable" penalty on every use just to remind the
> > user that the .kill() method exists isn't worth paying.  Would it be a problem
> > to use "Subprocess", or perhaps even keep the "Process" name?
> 
> I absolutely disagree. :)
> 
> The only reason for this subclass is to introduce the kill() method. Thus, the
> only reason to use the subclass is that you plan to kill the process
> explicitly. For all other usage the standard library-class is sufficient and
> should be used IMO. Making explicit the fact that this is a killable process
> discourages habitual use of the subclass (or even worse, usage because it's
> quicker to write "Process" than "subprocess.Popen").
> 
> However, I'm a pragmatic guy and will of-course change this if you insist. :)
> 

I also vote for just "Process".

> > >   def kill(self):
> > >-    "Kills this process."
> > >-    try:
> > >-      if not IS_WIN32:
> > >-        os.kill(self._process.pid, signal.SIGKILL)
> > >-      else:
> > >-        import subprocess
> > >-        pid = "%i" % self.pid
> > >-        process = subprocess.Popen(["taskkill", "/F", "/PID", pid])
> > >-        process.wait()
> > >-    except:
> > >-      pass
> > >+    if IS_WIN32:
> > >+      import ctypes
> > >+      ctypes.windll.kernel32.TerminateProcess(int(self._handle), -1)
> > >+    else:
> > >+      os.kill(self.pid, signal.SIGTERM)
> > 
> > Why are you changing the method of killing the process?  I don't think you
> > should do this, at the very least not in this bug.  I've heard horror stories
> > about TerminateProcess not actually killing Windows processes in a clean manner
> > similar to Unix, so I'm doubly leery of this change.
> 
> Ok...  probably safer to stick to to the old, battle-hardened way. Will fix
> 

I was testing this new code on xp sp3 machine and it worked for me. But I remembered I was looking for a true reliable way to kill processes a long time and this new one was not one of them. But I may not recall quit correctly. However, I vote for to leave this code unchanged.

> > > def runUtil(util, args, inputdata = None):
> > >-  proc = automation.Process(util, args, automation.environment(), inputdata)
> > >-  return proc.wait()
> > >+  cmdline = [util]
> > >+  cmdline.extend(args)
> > >+  proc = subprocess.Popen(cmdline, env=automation.environment(), stdin=subprocess.PIPE)
> > >+  if inputdata:
> > >+    proc.communicate(inputdata)
> > >+    return proc.returncode
> > >+  else:
> > >+    return proc.wait()
> > 
> > An else: after a return is meaningless; get rid of the else and unindent the
> > |return proc.wait()|.
> 
> According to specs, communicate() waits for process-termination before it
> returns. I'm not sure how calling wait() after communicate() will work, which
> is why I return the returncode in this case. Maybe you know?
> 

Jeff means something else, change it this way:

  if inputdata:
    proc.communicate(inputdata)
    return proc.returncode

  return proc.wait()

Clear?
And good news, on windows (just on it) PGO build and run 'python profileserver.py' works ok. I could not build on any other platform but not because of this new patch.
And yet one more thing: Bjarn, move this patch under the patch from 456001 (it means build my patch for that bug on top of yours, it would be simpler to check in those two patches. Sorry I told you before to do it in reverse order... Didn't realize.)
(In reply to comment #21)
> And yet one more thing: Bjarn, move this patch under the patch from 456001 (it
> means build my patch for that bug on top of yours, it would be simpler to check
> in those two patches. Sorry I told you before to do it in reverse order...
> Didn't realize.)

Then you would have to change your patch for 456001 since it uses the old Process-object for reading client-certs. But if this is ok with you, I don't really mind.
> > > > def runUtil(util, args, inputdata = None):
> > > >-  proc = automation.Process(util, args, automation.environment(), inputdata)
> > > >-  return proc.wait()
> > > >+  cmdline = [util]
> > > >+  cmdline.extend(args)
> > > >+  proc = subprocess.Popen(cmdline, env=automation.environment(), stdin=subprocess.PIPE)
> > > >+  if inputdata:
> > > >+    proc.communicate(inputdata)
> > > >+    return proc.returncode
> > > >+  else:
> > > >+    return proc.wait()
> > > 
> > > An else: after a return is meaningless; get rid of the else and unindent the
> > > |return proc.wait()|.
> > 
> > According to specs, communicate() waits for process-termination before it
> > returns. I'm not sure how calling wait() after communicate() will work, which
> > is why I return the returncode in this case. Maybe you know?
> > 
> 
> Jeff means something else, change it this way:
> 
>   if inputdata:
>     proc.communicate(inputdata)
>     return proc.returncode
> 
>   return proc.wait()
> 
> Clear?

Yes - I see what you mean. Far from my coding-style, though. :)
(In reply to comment #22)
> Then you would have to change your patch for 456001 since it uses the old
> Process-object for reading client-certs. But if this is ok with you, I don't
> really mind.

That's true. You just have to remove it from your patch queue.
Blocks: 456001
Updated to honor comments from reviewer and from Honza.

In this rewrite, all usage of the subprocess-module is encapsulated in automation.Process. Process now adds a kill() method, a default value for env, and it also passes input to the process if necessary when calling wait(). With this level of changes I don't have a problem using a name like "Process", nor to encourage using it instead of the library-object. (So now we won't have to spend time discussing that issue... ;) )

Note that this patch does *not* depend on any patches from bug #456001, as requested by Honza.
Attachment #351614 - Attachment is obsolete: true
Attachment #351880 - Flags: review?(jwalden+bmo)
Comment on attachment 351880 [details] [diff] [review]
Patch v1.2 using subprocess with communicate()

Am I the only person who thinks it's good not to wrap the APIs of subprocess.Popen?  Wrapping them means more pain learning their API, versus simply loading standard Python docs to see what to do.  I'd prefer the original thing of just adding a kill() method; am I alone in thinking this?
True is that in our scripts e only use constructor, wait() and kill(). So, there is actually no reason to have other methods of Popen. Turing process to aggregate it is not that bad idea. Also it would solve dirty code in wait() with inputstring.
(In reply to comment #26)
> (From update of attachment 351880 [details] [diff] [review])
> Am I the only person who thinks it's good not to wrap the APIs of
> subprocess.Popen?  Wrapping them means more pain learning their API, versus
> simply loading standard Python docs to see what to do.  I'd prefer the original
> thing of just adding a kill() method; am I alone in thinking this?

I'll leave this decision to you, Jeff. But following up on Honza's comments, I want to point out that there are two (non-contradictory) ways to see Process : 1) as an abstraction of our particular requirements for subprocesses, and 2) as enhancements to Popen making it more convenient to use for us.

If you prefer the first view, the case is clear. Using Popen is just an implementation-detail and the API for Process is a result of our needs. No reason to mix them up.

However, I assume you prefer the latter view, with which I agree. As you see from the code, the differences are (1) the kill() method and (2) a keyword string-parameter in the constructor which, if provided (3) causes wait() to pass it to the subprocess before waiting for termination. These changes to the API are pretty simple, IMHO, and in all other aspects Process behaves as Popen. The constructor can be made more compatible with Popen if required. I found the separation btw cmd and args to be convenient and better aligned with the old Process-class, and certain parameters are only valid on Unix. But it's simple to add missing parameters to the constructor and pass them to Popen.__init()__.

Finally, please keep in mind that we are talking about a small piece of the build/test system. Important enough, clearly, but hardly essential functionality used by large number of programmers. :) So let's be pragmatic and get this fixed - if you want to revert to just add kill() that's ok. Just say so.
Yes, please revert to just adding kill().  It occurs to me we can add a communicateAndWait() method that does what your modified wait() does, as a convenient helper function, and then avoid modifying the existing interface -- fel free to do that if you think people would find it useful.
Very well...  here's a version with only kill() added. I really, really think this subclass should be renamed to something like "KillableProcess", but never mind.
Attachment #351880 - Attachment is obsolete: true
Attachment #352706 - Flags: review?(jwalden+bmo)
Attachment #351880 - Flags: review?(jwalden+bmo)
Comment on attachment 352706 [details] [diff] [review]
V1.1.1 patch just adding kill() method

As it turns out, Python 2.6 adds .kill() and .terminate() methods.  No idea if either has the semantics of process killing on Unix wrt children.  But whatever; if we ever reach a point where 2.6 is something we can use, we can remove this then.

Thanks for being patient with my requests here!
Attachment #352706 - Flags: review?(jwalden+bmo) → review+
Bjarne, can you generate a final version of the patch with a commit message and your username included?  If you're using mq, qrefresh -U -m "message" should do the trick.  Once you've uploaded that someone can actually check it in.
And please make sure you are working on updated repository. I had some problems applying the patches.
Applies cleanly on my repository (trunk) updated 5 minutes ago.
Attachment #352706 - Attachment is obsolete: true
Attachment #353391 - Flags: review?(jwalden+bmo)
Depends on: 427750
Comment on attachment 353391 [details] [diff] [review]
V1.1.2 taking into account latest comments

Sigh, I really thought we were golden with the metadata changes, but several more things...

>Changed some mochitest-scipts to use subprocess-module

Commit messages are supposed to include who reviewed the patch, the bug number, a summary of what the problem is being fixed (generally the bug summary, possibly with tweaks to encode implicit knowledge of the product/component, as below), and optionally what was done to fix it if it's simple.  I personally prefer something like this (with capitalization/spelling/punctuation fixes, note):

Bug 464191 - ssltunnel process persists after Mochitest run.  r=jwalden


>diff -r 79136a417a3a build/pgo/genpgocert.py.in

> def runUtil(util, args, inputdata = None):
>-  proc = automation.Process(util, args, automation.environment(), inputdata)
>-  return proc.wait()
>+  if (inputdata):
>+    proc = subprocess.Popen([util] + args, env = automation.environment(), stdin = subprocess.PIPE)
>+    proc.communicate(inputdata)
>+    return proc.returncode
>+  return subprocess.Popen([util] + args, env = automation.environment()).wait()

I'd prefer if you used automation.Process here, might as well be consistent even if we don't need the extra kill functionality.  Also, no parens around the condition in the if there.


But perhaps most importantly, since the require-2.(x>3) change hasn't happened yet (see dep bug for reason), we can't actually check this in yet anyway even if it were good to go, so there's no rush to get these things fixed immediately -- if you have other things to work on until this happens, feel free to do those first if you want.
Attachment #353391 - Flags: review?(jwalden+bmo) → review-
(In reply to comment #35)
> Commit messages are supposed to include who reviewed the patch, the bug number,
> a summary of what the problem is being fixed (generally the bug summary,
> possibly with tweaks to encode implicit knowledge of the product/component, as
> below), and optionally what was done to fix it if it's simple.  I personally
> prefer something like this (with capitalization/spelling/punctuation fixes,
> note):
> 
> Bug 464191 - ssltunnel process persists after Mochitest run.  r=jwalden

Sorry - will fix. Should read guidelines more carefully...

> >diff -r 79136a417a3a build/pgo/genpgocert.py.in
> 
> > def runUtil(util, args, inputdata = None):
> >-  proc = automation.Process(util, args, automation.environment(), inputdata)
> >-  return proc.wait()
> >+  if (inputdata):
> >+    proc = subprocess.Popen([util] + args, env = automation.environment(), stdin = subprocess.PIPE)
> >+    proc.communicate(inputdata)
> >+    return proc.returncode
> >+  return subprocess.Popen([util] + args, env = automation.environment()).wait()
> 
> I'd prefer if you used automation.Process here, might as well be consistent
> even if we don't need the extra kill functionality.

Just here? Or consistently also in automation.py?

Observe that genpgocert.py still has to import subprocess due to the PIPE-constant in the constructor. Patch v1.2 avoids this import (making genpgocert.py independent of subprocess) by passing inputdata in the constructor instead, delegating the "if (inputdata)"-part to automation.Process.

Moreover, if we decide to consistently use automation.Process, why don't we just fall back to patch v1.2 which really is just an abstraction of our needs from a process? :)

> Also, no parens around the
> condition in the if there.

Right... fixed.
Attached patch V1.1.3 (obsolete) — Splinter Review
Updated according to comment #35 except for the point about consistently using automation.Process instead of subprocess.Popen, ref. question in comment #36.
Attachment #353391 - Attachment is obsolete: true
Attachment #354802 - Flags: review?(jwalden+bmo)
Bjarne, personally I would like you to change as little code as possible. Leaving usage of automation.Process where it already is is a good way to do. What if we in future decide to advance Process class again somehow or we find out that subprocess.Popen is buggy in some way for us? Then we have to find all calls to subprocess.Popen you made by this patch and change it again to use automation.Process (to revert your changes). I really don't understand why you insist on using directly subprocess.Popen. It makes the existing code inconsistent IMHO.
I normally avoid using subclasses if I don't need the functionality introduced in the subclass. Call it a desire for simplicity, maybe even asceticism... ;)

If we expect subprocess to be a candidate for replacement (or perhaps unavailable on some platform we are supposed to support) we should IMHO re-consider the approach suggested in comment #25 which is more suitable for that scenario (please read the whole comment to see why this is so). If we trust subprocess to be a stable component of Python I don't see why it shouldn't be used for what it's made for.

Nevertheless, I'll change everything to use automation.Process if you guys really want that. It's not how I would have done it, but I feel this defect has received more attention than it deserves by now - it's about time to bury it. :)
Version using (our own) automation.Process instead of (the standard) subprocess.Popen. See comment #39 for reasons why I would prefer using V1.1.3 instead.
Attachment #355548 - Flags: review?(jwalden+bmo)
Depends on: 467595
Comment on attachment 355548 [details] [diff] [review]
V1.1.4 using automation.Process instead of subprocess.Popen

Now we wait on infra upgrades for mobile... :-(
Attachment #355548 - Flags: review?(jwalden+bmo) → review+
Well, nobody is running Mochitest in scratchbox, so I don't think you really need to block on it. I would prefer to have the Python configure check in so we can use things like this and warn people upfront, but we already have some Python 2.4isms around.
Well, if you're fine with it, I'm fine with it.  :-)  Will land when tree reopens...
Keywords: checkin-needed
See last sentence of previous comment.  :-)
Keywords: checkin-needed
Blocks: 466080
What is the status here? This bug is marked as blocking bug 466080 which is a 3.1 blocker.
Looks like this landed:
http://hg.mozilla.org/mozilla-central/rev/82a1d612dde8
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Oops, sorry. A more careful look would have shown that it was landed (and backed out) twice.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
There's been some bitrotting with recent Mochitest churn, this is the latest thing I have that works.

The remaining problem with this patch is that build/leaktest.py.in expects to receive the output of the subprocess spawned for the application through the logging interface in Python.  This switch causes process output to no longer be logged, so leak-log comparison fails and turns the leak-testing boxes red.  I attempted to fix this by adding an appOutput = None default parameter to runApp and then setting it appropriately in leaktest.py and using it inside runApp as the value of the stdout argument to the Process call (and setting stderr appropriately as well to redirect that into stdout), but I didn't have luck getting it to work.  Bjarne, can you pick this up and fix that problem?  The following command will run leak tests, and then you can examine build.log to see what was generated.

cd $OBJDIR/_leaktest/ && python leaktest.py -l build.log
Attachment #355548 - Attachment is obsolete: true
Try this one. The stream-field in FileHandler is undocumented AFAIK, but works nicely for me.
Attachment #357857 - Attachment is obsolete: true
Attachment #358003 - Flags: review?(jwalden+bmo)
Comment on attachment 358003 [details] [diff] [review]
Jeffs patch redirecting stdout/stderr from the app to the log

This is, er, rather evil.

But I don't actually care all that much about that, although we'll be more fragile because of it -- what matters is that we're no longer displaying the output of the app process when we do this.  Please print the process's output before exiting so we don't regress that.
Attachment #358003 - Flags: review?(jwalden+bmo) → review-
(In reply to comment #50)
> (From update of attachment 358003 [details] [diff] [review])
> This is, er, rather evil.
> 
> But I don't actually care all that much about that, although we'll be more
> fragile because of it -- what matters is that we're no longer displaying the
> output of the app process when we do this.  Please print the process's output
> before exiting so we don't regress that.

You mean you want a tee-like functionality? Stdout/err should go both to console and log?
Doesn't have to be tee-like -- if there's an -l argument just dump the contents of the file when runApp returns, and if there isn't one just store output in a tempfile and dump its contents when runApp returns.  All that matters is that the output of the app be stored to the file and that the same output somehow gets in tinderbox logs.
Attached patch Simplified version (obsolete) — Splinter Review
This should be simpler and less evil.
Attachment #358003 - Attachment is obsolete: true
Attachment #358152 - Flags: review?(jwalden+bmo)
Comment on attachment 358152 [details] [diff] [review]
Simplified version

I'm not entirely happy about losing the output of other processes spawned along the way, e.g. ssltunnel and friends, which might print useful error messages if they fail to work, but yeah, this is probably safest.

But one last thing I noticed in passing:

>+      os.kill(self.pid, signal.SIGTERM)

Don't change kill behavior on non-Windows from SIGKILL to SIGTERM, please; keep it the way it was before using SIGKILL.  Post a patch with that change and I'll land it ASAP.  Thanks!
Attachment #358152 - Flags: review?(jwalden+bmo) → review+
Attachment #358152 - Attachment is obsolete: true
Attachment #358160 - Flags: review?(jwalden+bmo)
(In reply to comment #54)
> I'm not entirely happy about losing the output of other processes spawned along
> the way, e.g. ssltunnel and friends, which might print useful error messages if
> they fail to work, but yeah, this is probably safest.

You could try to move the for-line loop into Process itself and make all Process-objects redirect their stdout/stderr to the Python logger. But it would involve a heavier wrapping of Popen, which has been discussed before.

I won't have time in the short term for anything like that but feel free to register a new (low-pri) defect and assign it to me.
Comment on attachment 358160 [details] [diff] [review]
sigkill instead of sigterm
[Checkin: See comment 60 & 64]

(In reply to comment #56)
> You could try to move the for-line loop into Process itself and make all
> Process-objects redirect their stdout/stderr to the Python logger.

That's been done before, and it's what's done currently; the problem is then we have to manually spawn threads to deal with all that, and it's just a mess that I'd rather avoid.
Attachment #358160 - Flags: review?(jwalden+bmo) → review+
Depends on: 475163
Will this be landing on the 1.9.1 branch. If so, how soon do we expect it to land?

I'm asking because I need to fix TB's use of automation.py and the answer will let me know how I need to looking at fixing it.
Depends on: 475455
I think it'd be good to land it there; Bjarne, can you make a 191 version?  My qimport of the raw-rev for the checkin to m-c had patch-apply failures, but I didn't have time to check how substantial any of them were.
Target Milestone: --- → mozilla1.9.2a1
Version: unspecified → Trunk
Marking fixed since it is, on trunk at least...
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Any chance to get back reasonable stdout handling?
I need to see the stdout of firefox-bin immediately, not after some 
random time.

(Especially I need to see when cycle collector runs and when 
 user-interaction-active and user-interaction-inactive
 notifications are fired and when focus changes.)
Depends on: 476185
Attachment #354802 - Attachment is obsolete: true
Attachment #354802 - Flags: review?(jwalden+bmo)
(In reply to comment #62)
> I need to see the stdout of firefox-bin immediately, not after some 
> random time.

Filed as bug 476185.

***

V.Fixed, (locally) per(/after) bug 475455.
Status: RESOLVED → VERIFIED
Whiteboard: [needs 1.9.1 landing]
Comment on attachment 358160 [details] [diff] [review]
sigkill instead of sigterm
[Checkin: See comment 60 & 64]


http://hg.mozilla.org/releases/mozilla-1.9.1/rev/c93733ecb763

After fixing context for:
{
unable to find 'build/pgo/automation.py.in' for patching
5 out of 5 hunks FAILED

patching file testing/mochitest/runtests.py.in
Hunk #1 FAILED at 216
1 out of 1 hunks FAILED
}
Attachment #358160 - Attachment description: sigkill instead of sigterm → sigkill instead of sigterm [Checkin: See comment 60 & 64]
Flags: in-testsuite-
Keywords: fixed1.9.1
Whiteboard: [needs 1.9.1 landing] → [fixed1.9.1b4]
You need to log in before you can comment on or make changes to this bug.