Closed
Bug 597064
Opened 14 years ago
Closed 12 years ago
xpcshell harness should handle timeouts internally, fail tests (like mochitest or reftest harnesses)
Categories
(Testing :: XPCShell Harness, defect)
Testing
XPCShell Harness
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla23
People
(Reporter: jdm, Assigned: jdm)
References
Details
(Keywords: sheriffing-P1, Whiteboard: [mentor=jdm][good first bug][lang=python])
Attachments
(1 file, 13 obsolete files)
19.31 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
<ted> because the xpcshell harness doesn't know about timeouts
<ted> we should fix that
<ted> it's buildbot forcibly killing the process
Updated•14 years ago
|
Summary: Timeout errors don't show any test output → xpcshell harness should handle timeouts internally, fail tests (like mochitest or reftest harnesses)
Assignee | ||
Updated•14 years ago
|
Assignee: nobody → josh
Assignee | ||
Comment 1•14 years ago
|
||
This is harder than simply adding a timer to head.js, unfortunately. IPC tests complicate things in that if a child is in an infinite loop the parent process will wait forever for it to finish. There's no way around that besides adding a watchdog into the shell, which I'm investigating. The significantly easier alternative is to go back to my original plan of using a timer from runxpcshelltests.py.
Assignee | ||
Comment 2•14 years ago
|
||
Assignee | ||
Updated•14 years ago
|
Attachment #476555 -
Attachment is obsolete: true
Assignee | ||
Comment 3•14 years ago
|
||
Assignee | ||
Updated•14 years ago
|
Attachment #477731 -
Attachment is obsolete: true
Assignee | ||
Comment 4•14 years ago
|
||
Now with 100% more qref
Assignee | ||
Updated•14 years ago
|
Attachment #477735 -
Flags: review?(ted.mielczarek)
Comment 5•14 years ago
|
||
Comment on attachment 477735 [details] [diff] [review]
Add timeout logic to xpcshell test runner.
>diff --git a/testing/xpcshell/runxpcshelltests.py b/testing/xpcshell/runxpcshelltests.py
>--- a/testing/xpcshell/runxpcshelltests.py
>+++ b/testing/xpcshell/runxpcshelltests.py
>+ def testTimeout(self, test, proc):
>+ global timedOut
You probably don't need to use a global here, just use a member variable, self.timedOut. Shouldn't be any synchronization issues since you're only setting it here.
>+ print "TEST-UNEXPECTED-FAIL | %s | Test timed out" % test
>+ timedOut = True
>+ os.killpg(os.getpgid(proc.pid), signal.SIGINT)
killpg won't work on Windows. You'll need to do something else. Also, for consistency with other harnesses, kill it with SIGABRT on Linux. In fact, you might just want to refactor this utility function into automationutils.py and use it instead:
http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#697
>+ if not interactive:
>+ testTimer = Timer(120.0, lambda: self.testTimeout(test, proc))
>+ testTimer.start()
You probably want to check debuggerInfo as well here, and not bother with this when a debugger of any kind is attached.
>+
> try:
> print "TEST-INFO | %s | running test ..." % test
>
> proc = self.launchProcess(cmdH + cmdT + self.xpcsRunArgs,
> stdout=pStdout, stderr=pStderr, env=self.env, cwd=testdir)
>
> # Allow user to kill hung subprocess with SIGINT w/o killing this script
> # - don't move this line above launchProcess, or child will inherit the SIG_IGN
>@@ -482,16 +495,18 @@ class XPCShellTests(object):
> # |stderr == None| as |pStderr| was either |None| or redirected to |stdout|.
> stdout, stderr = self.communicate(proc)
> signal.signal(signal.SIGINT, signal.SIG_DFL)
>
> if interactive:
> # Not sure what else to do here...
> return True
>
>+ testTimer.cancel()
If you wind up checking debuggerInfo you'll need to check it again here so you don't touch this uninitialized.
r-, needs some work, but a decent start.
Attachment #477735 -
Flags: review?(ted.mielczarek) → review-
Updated•14 years ago
|
Severity: normal → major
OS: Linux → All
Hardware: x86 → All
Version: unspecified → Trunk
Assignee | ||
Comment 8•13 years ago
|
||
This is a great bug for a new contributor with python knowledge to take on. The initial patch I submitted shows where the relevant changes are necessary; if you fix it up according to Ted's review comments in comment 5, we could get this checked in and make a number of people really happy.
Whiteboard: [mentor=jdm]
Assignee | ||
Comment 9•13 years ago
|
||
https://developer.mozilla.org/en/Writing_xpcshell-based_unit_tests has information on how to run xpcshell tests. Just add an infinite loop to an existing test and run that to ensure your changes are doing the right thing.
Assignee | ||
Updated•13 years ago
|
Whiteboard: [mentor=jdm][goodfirstbug] → [mentor=jdm][good first bug][lang=python]
Comment 10•13 years ago
|
||
I am willing to take on this bug. I will get started tomorrow.
Comment 11•13 years ago
|
||
I see that this bug has a dependency, should I try to fix the dependency, or wait until its fixed?
Assignee | ||
Comment 12•13 years ago
|
||
No, that dependency was not actually blocking this bug.
Assignee: josh → astropiloto
Assignee | ||
Comment 13•13 years ago
|
||
The reason I used killpg was to deal with subprocesses of xpcshell, such as occur when executing IPC tests (ie. any tests in unit_ipc test directories). I'm inclined to say that for the purposes of this bug getting resolved quickly, we should ignore that problem and save it for a followup.
Comment 14•13 years ago
|
||
Ok, will do. Everything else should be included right?
Assignee | ||
Comment 15•13 years ago
|
||
Yes, all of Ted's earlier comments should be addressed. You can look at using the routines from automation.py to replace the use of killpg in this patch for the time being.
Comment 16•13 years ago
|
||
Yes, that was what I was planning to do.
Comment 17•13 years ago
|
||
While running the code I got this error:
TEST-INFO | c:\Users\dorothy\firefox-src\mozilla-central\obj-i686-pc-mingw32\_te
sts\xpcshell\netwerk\test\unit\test_resumable_channel.js | running test ...
Exception in thread Thread-131:
Traceback (most recent call last):
File "c:\mozilla-build\python\lib\threading.py", line 552, in __bootstrap_inne
r
self.run()
File "c:\mozilla-build\python\lib\threading.py", line 756, in run
self.function(*self.args, **self.kwargs)
File "c:/Users/dorothy/firefox-src/mozilla-central/testing/xpcshell/runxpcshel
ltests.py", line 672, in <lambda>
testTimer = Timer(120.0, lambda: self.testTimeout(test, proc))
AttributeError: 'XPCShellTests' object has no attribute 'testTimeout'
I am unfamiliar with Timer and lambda, however this is the original unedited code, so what am I doing wrong?
Assignee | ||
Comment 18•13 years ago
|
||
That error suggests to me that you didn't apply this block from my original patch properly:
>+ def testTimeout(self, test, proc):
>+ global timedOut
>+ print "TEST-UNEXPECTED-FAIL | %s | Test timed out" % test
>+ timedOut = True
>+ os.killpg(os.getpgid(proc.pid), signal.SIGINT)
Assignee | ||
Comment 19•13 years ago
|
||
Specifically, it should be a method of the XPCShellTests class, at the same level of indentation as buildXpcsRunArgs.
Comment 20•13 years ago
|
||
That would do it. :)
Comment 21•13 years ago
|
||
(In reply to Josh Matthews [:jdm] from comment #15)
> Yes, all of Ted's earlier comments should be addressed. You can look at
> using the routines from automation.py to replace the use of killpg in this
> patch for the time being.
Should I copy this routine, or import it? If I am going to import, how?
Assignee | ||
Comment 22•13 years ago
|
||
It looks to me like you can just use |from automation import Automation| and create an instance of Automation where you need it. Check how other test suites call killAndGetStack to make sure you use it correctly.
Comment 23•13 years ago
|
||
I tried that, and it gives me "ImportError: No module named automation".
Assignee | ||
Comment 24•13 years ago
|
||
You probably need to add automation.py to http://mxr.mozilla.org/mozilla-central/source/testing/xpcshell/Makefile.in#63.
Comment 25•13 years ago
|
||
Historically we haven't had the xpcshell harness rely on automation.py directly. Instead, I created automationutils.py to store common functionality between xpcshell and automation.py. If it's not too much work, the right solution here might be to move that functionality from automation.py to automationutils.py.
Comment 26•13 years ago
|
||
Would I copy the function over to automationutils.py, or move it?
Comment 27•13 years ago
|
||
Move it. automation.py imports automationutils.py, so it should be fairly straightforward to fix the existing code to call it from automationutils instead.
Comment 28•13 years ago
|
||
I am having trouble copying it over. The build gives me an error "utilityPath is not defined.". I assume this is from an import in automation.py.in, but I looked around on mxr.mozilla.org and I couldn't find where it originates.
Assignee | ||
Comment 29•13 years ago
|
||
>739 def killAndGetStack(self, proc, utilityPath, debuggerInfo):
Given that utilityPath is an argument to the method that you're supposedly moving, something doesn't add up.
Comment 30•13 years ago
|
||
I think I was unclear, let me try again. When I call killAndGetStack from runxpcshelltests.py I get the error "utilityPath is not defined". This makes sense because utilityPath has not been assigned in runxpcshelltests.py. However, I am unsure what utilityPath should be assigned to, meaning what is the utilityPath? (Sorry for asking so many "stupid" questions. I have not participated in any large projects before, and this is kind of over my head. :) )
Assignee | ||
Comment 31•13 years ago
|
||
How are you calling killAndGetStack? What value are you passing for it? What value do existing callers of killAndGetStack pass to it?
Comment 32•13 years ago
|
||
Just for the record, I am moving dumpScreen http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#692 into automationutils.py because it is called by killAndGetStack.
Comment 33•13 years ago
|
||
I am having problems moving code from automation.py.in to automationutils.py. This is blocking further work.
Comment 34•13 years ago
|
||
Updated•13 years ago
|
Attachment #618817 -
Flags: feedback?(josh)
Comment 35•13 years ago
|
||
Comment on attachment 618817 [details] [diff] [review]
Patch for bug 597064 adding timeout for xpcshell test harness
automation.py.in functions should still live somewhere ;-<
Attachment #618817 -
Flags: feedback?(josh) → review-
Comment 36•13 years ago
|
||
Sorry about that. There was a problem with the diff file and I had to manually edit it. I meant to keep the section where it gets added to automationutils.py. Once again, sorry about that.
Assignee | ||
Comment 37•13 years ago
|
||
I have a couple other notes:
* killAndGetStack will only be called on non-mac and windows platforms, due to its indentation
* the bits where you check for NameErrors with certain variables is kind of weird. Could you explain why those are necessary?
* we probably need to check for testTimer existing before cancelling it
I will almost certainly have more comments once a fixed up version that includes the automationutils.py changes is included; please don't get discouraged by this! I'm excited that you're making progress here :)
Comment 38•13 years ago
|
||
Thanks for the feedback! I use the NameError stuff to check whether a variable exists. I know its probably not the best way, but that was how I was taught how to do it. And if I find a better way I could probably do that for testTimer as well. I am unsure what you want me to do with the first comment. Do you want me to get rid of the assignments for self.IS_MAC?
Assignee | ||
Comment 39•13 years ago
|
||
My point with the first comment is that right now, killAndGetStack only runs inside of one of the branches of the platform checks. You want to move it out of the linux check so that it runs regardless of the platform.
For the undefined variables, I suggest this: assign self.CRASHREPORTER in the XPCShellTests initialization and get rid of the bit before you create the timer. I'm not really sure what good the debuggerInfo one does, since right below it you check self.debuggerInfo. It looks like your code should work fine without it.
To do the testTimer check, just initialize testTimer to null before doing the checks to see if you should create it. That way you can just if |if testTimer:| later on when cancelling it.
Comment 40•13 years ago
|
||
After being sick for a long time, I am back and actively working on this bug.
Comment 41•13 years ago
|
||
I am moving this code here: http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#680 into the runxpcshelltest.py file. However that line needs this entire class: http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#160 Should I move that too?
Comment 42•12 years ago
|
||
still having trouble with OS recognition
Attachment #646994 -
Flags: feedback?
Assignee | ||
Comment 43•12 years ago
|
||
You'll need to be more specific than that if you want useful feedback, Alan. Also, if you could attach a diff of your changes instead of the whole file, that would be easier to work with.
Comment 44•12 years ago
|
||
Ok, sorry, it was late when I uploaded this. As I said before, OS recognition is not working. When I run the tests, and one times out, it should close it in the proper manner dependent on the OS. However, it simply runs the Mac close sequence (which is not under a if.. then.. statement), which causes errors on my Windows box. Somehow, (and this is why I am confused), my Windows box is not being recognized as such. Regarding a patch, I am not near my computer, but I will upload one ASAP. Once again, sorry for not being clear!
Assignee | ||
Comment 45•12 years ago
|
||
Sorry for taking so long to give you some feedback here; you probably need to use self.mozinfo in testTimeout (and similarly in trySetupNode).
Comment 46•12 years ago
|
||
No problem! I have tons of free time this week so I should (hopefully) get this done.
Updated•12 years ago
|
Attachment #646994 -
Flags: feedback?
Comment 47•12 years ago
|
||
This patch is NOT complete. For some reason, mozInfo doesn't have a OS property.
Attachment #646994 -
Attachment is obsolete: true
Attachment #655334 -
Flags: feedback?
Assignee | ||
Comment 48•12 years ago
|
||
Huh. I unfortunately have no clue why self.mozInfo.os isn't working. Try printing it out to see if you get an UNKOWN value, or if it really doesn't exist.
Comment 49•12 years ago
|
||
I tried printing it and got this:
Exception in thread Thread-1:
Traceback (most recent call last):
File "c:\mozilla-build\python\Lib\threading.py", line 552, in __bootstrap_inne
r
self.run()
File "c:\mozilla-build\python\Lib\threading.py", line 756, in run
self.function(*self.args, **self.kwargs)
File "c:/Users/Stan/firefox-src/mozilla-central/testing/xpcshell/runxpcshellte
sts.py", line 848, in <lambda>
testTimer = Timer(5.0, lambda: self.testTimeout(test, proc))
File "c:/Users/Stan/firefox-src/mozilla-central/testing/xpcshell/runxpcshellte
sts.py", line 609, in testTimeout
print self.mozInfo.os
AttributeError: 'dict' object has no attribute 'os'
Which I guess means it doesn't exist at all.
Comment 50•12 years ago
|
||
I got it, I had to not use self.
Comment 51•12 years ago
|
||
Attachment #618817 -
Attachment is obsolete: true
Attachment #655334 -
Attachment is obsolete: true
Attachment #655334 -
Flags: feedback?
Assignee | ||
Comment 52•12 years ago
|
||
Comment on attachment 661427 [details] [diff] [review]
proposed patch for bug 597064
Review of attachment 661427 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks for this, Alan! Assuming this works correctly, and continues to do so after the changes I propose, I think the next step should be to figure out how to avoid duplicating code from automation.py.in and move to sharing it instead. It's good that you have made these changes, since it's much clearer what actually needs to be shared now.
::: build/automation.py.in
@@ +17,5 @@
> import sys
> import threading
> import tempfile
> import sqlite3
> +import subprocess
Is this necessary?
@@ +157,5 @@
> "IS_DEBUG_BUILD",
> "DEFAULT_TIMEOUT",
> ]
>
> +
This definition needs to stay here since the rest of the file uses it.
::: netwerk/test/unit/test_307_redirect.js
@@ +1,1 @@
> +while(true){}
Another important test to run is one that iloops after doing some things that cause output (eg. do_check_true). We want to make sure we see any existing output when the test times out.
::: testing/xpcshell/runxpcshelltests.py
@@ +30,5 @@
>
> """ Control-C handling """
> gotSIGINT = False
> def markGotSIGINT(signum, stackFrame):
> + global gotSIGINT, timedOut
This change doesn't look like it should be necessary.
@@ +35,4 @@
> gotSIGINT = True
>
> class XPCShellTests(object):
> + class Process(subprocess.Popen):
If this needs to be in this file, it should probably be in the top level (ie. not a member of XPCShellTests). However, if we can avoid duplicating it and just refer to the original, that would be much better.
@@ +476,5 @@
> if self.nodeProc:
> self.log.info('Node SPDY server shutting down ...')
> # moz-spdy exits when its stdin reaches EOF, so force that to happen here
> self.nodeProc.communicate()
> + def killAndGetStack(self, proc, utilityPath, debuggerInfo):
Newline before this.
@@ +490,5 @@
> + if os.path.exists(crashinject) and subprocess.Popen([crashinject, str(proc.pid)]).wait() == 0:
> + return
> + #TODO: kill the process such that it triggers Breakpad on OS X (bug 525296)
> + self.log.info("Can't trigger Breakpad, just killing process")
> + proc.kill()
Newline after this.
@@ +613,5 @@
> + self.UNIXISH = True
> + self.IS_WIN32 = False
> + self.IS_Mac = False
> + self.killAndGetStack(proc, "../../dist/bin", None)
> + def dumpScreen(self, utilityPath):
Since we don't need to capture screenshots, just kill this whole method.
@@ +831,5 @@
> args.insert(0, '-d')
> + try:
> + debuggerInfo
> + except NameError:
> + debuggerInfo = None
Why is this necessary?
@@ +835,5 @@
> + debuggerInfo = None
> + try:
> + self.CRASHREPORTER
> + except AttributeError:
> + self.CRASHREPORTER = None
Likewise this. We should figure out what should actually be defining this value properly.
@@ +837,5 @@
> + self.CRASHREPORTER
> + except AttributeError:
> + self.CRASHREPORTER = None
> + if not interactive and not self.debuggerInfo:
> +
Kill this line.
@@ +838,5 @@
> + except AttributeError:
> + self.CRASHREPORTER = None
> + if not interactive and not self.debuggerInfo:
> +
> + testTimer = Timer(120.0, lambda: self.testTimeout(test, proc))
Just pass the test filename instead of the whole test object here.
@@ +868,5 @@
>
> if interactive:
> # Not sure what else to do here...
> return True
> + testTimer.cancel()
Wrap this in a |if testTimer| check.
Comment 53•12 years ago
|
||
(In reply to Josh Matthews [:jdm] from comment #52)
> Comment on attachment 661427 [details] [diff] [review]
> proposed patch for bug 597064
>
> Review of attachment 661427 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Thanks for this, Alan! Assuming this works correctly, and continues to do so
> after the changes I propose, I think the next step should be to figure out
> how to avoid duplicating code from automation.py.in and move to sharing it
> instead. It's good that you have made these changes, since it's much clearer
> what actually needs to be shared now.
>
> ::: build/automation.py.in
> @@ +17,5 @@
> > import sys
> > import threading
> > import tempfile
> > import sqlite3
> > +import subprocess
>
> Is this necessary?
>
It is necessary for the process class.
> @@ +157,5 @@
> > "IS_DEBUG_BUILD",
> > "DEFAULT_TIMEOUT",
> > ]
> >
> > +
>
> This definition needs to stay here since the rest of the file uses it.
>
> ::: netwerk/test/unit/test_307_redirect.js
> @@ +1,1 @@
> > +while(true){}
>
> Another important test to run is one that iloops after doing some things
> that cause output (eg. do_check_true). We want to make sure we see any
> existing output when the test times out.
>
I will do that.
> ::: testing/xpcshell/runxpcshelltests.py
> @@ +30,5 @@
> >
> > """ Control-C handling """
> > gotSIGINT = False
> > def markGotSIGINT(signum, stackFrame):
> > + global gotSIGINT, timedOut
>
> This change doesn't look like it should be necessary.
Will fix this, just something I forgot to get around to.
>
> @@ +35,4 @@
> > gotSIGINT = True
> >
> > class XPCShellTests(object):
> > + class Process(subprocess.Popen):
>
> If this needs to be in this file, it should probably be in the top level
> (ie. not a member of XPCShellTests). However, if we can avoid duplicating it
> and just refer to the original, that would be much better.
Ted told me that they are trying to get rid of automation.py.in http://logbot.glob.com.au/?c=mozilla%23introduction&s=26+Apr+2012&e=26+Apr+2012
>
> @@ +476,5 @@
> > if self.nodeProc:
> > self.log.info('Node SPDY server shutting down ...')
> > # moz-spdy exits when its stdin reaches EOF, so force that to happen here
> > self.nodeProc.communicate()
> > + def killAndGetStack(self, proc, utilityPath, debuggerInfo):
>
> Newline before this.
Will do.
>
> @@ +490,5 @@
> > + if os.path.exists(crashinject) and subprocess.Popen([crashinject, str(proc.pid)]).wait() == 0:
> > + return
> > + #TODO: kill the process such that it triggers Breakpad on OS X (bug 525296)
> > + self.log.info("Can't trigger Breakpad, just killing process")
> > + proc.kill()
>
> Newline after this.
Same.
>
> @@ +613,5 @@
> > + self.UNIXISH = True
> > + self.IS_WIN32 = False
> > + self.IS_Mac = False
> > + self.killAndGetStack(proc, "../../dist/bin", None)
> > + def dumpScreen(self, utilityPath):
>
> Since we don't need to capture screenshots, just kill this whole method.
Yep.
>
> @@ +831,5 @@
> > args.insert(0, '-d')
> > + try:
> > + debuggerInfo
> > + except NameError:
> > + debuggerInfo = None
>
> Why is this necessary?
see below
>
> @@ +835,5 @@
> > + debuggerInfo = None
> > + try:
> > + self.CRASHREPORTER
> > + except AttributeError:
> > + self.CRASHREPORTER = None
>
> Likewise this. We should figure out what should actually be defining this
> value properly.
How would I go about doing that?
>
> @@ +837,5 @@
> > + self.CRASHREPORTER
> > + except AttributeError:
> > + self.CRASHREPORTER = None
> > + if not interactive and not self.debuggerInfo:
> > +
>
> Kill this line.
Will do.
>
> @@ +838,5 @@
> > + except AttributeError:
> > + self.CRASHREPORTER = None
> > + if not interactive and not self.debuggerInfo:
> > +
> > + testTimer = Timer(120.0, lambda: self.testTimeout(test, proc))
>
> Just pass the test filename instead of the whole test object here.
Is that just test without the proc? or what would that be?
>
> @@ +868,5 @@
> >
> > if interactive:
> > # Not sure what else to do here...
> > return True
> > + testTimer.cancel()
>
> Wrap this in a |if testTimer| check.
Will do.
Comment 54•12 years ago
|
||
This fixes most of the problems that JDM brought up. Not fixed: global timedOut, moving Process class back (see previous comment).
Attachment #661427 -
Attachment is obsolete: true
Assignee | ||
Comment 55•12 years ago
|
||
Comment on attachment 661483 [details] [diff] [review]
2nd proposed patch for bug 597064
Review of attachment 661483 [details] [diff] [review]:
-----------------------------------------------------------------
::: testing/xpcshell/runxpcshelltests.py
@@ +9,5 @@
> from glob import glob
> from optparse import OptionParser
> from subprocess import Popen, PIPE, STDOUT
> +from tempfile import mkdtemp, gettempdir, mkstemp
> +import tempfile
I don't think these changes are required any longer.
@@ +20,4 @@
> from automationutils import *
>
> +
> +class Process(subprocess.Popen):
I don't think this is even necessary any more. Try removing it from here and reverting any changes you made to automation.py.in.
@@ +80,5 @@
>
> """ Control-C handling """
> gotSIGINT = False
> def markGotSIGINT(signum, stackFrame):
> + global gotSIGINT, timedOut
This is still present.
@@ +104,5 @@
> + self.IS_MAC = False
> + else:
> + self.UNIXISH = True
> + self.IS_WIN32 = False
> + self.IS_MAC = False
Let's just scrap this whole block and use the mozinfo.os bits below.
@@ +494,3 @@
>
> + if self.CRASHREPORTER and not debuggerInfo:
> + if self.UNIXISH:
mozinfo.os not in ['mac', 'win']
@@ +496,5 @@
> + if self.UNIXISH:
> + # ABRT will get picked up by Breakpad's signal handler
> + os.kill(proc.pid, signal.SIGABRT)
> + return
> + elif self.IS_WIN32:
mozinfo.os == 'win'
@@ +613,5 @@
> + def testTimeout(self, test, proc):
> + global timedOut
> + print "TEST-UNEXPECTED-FAIL | %s | Test timed out" % test
> + timedOut = True
> + self.haveDumpedScreen = False
Don't need this any longer.
@@ +614,5 @@
> + global timedOut
> + print "TEST-UNEXPECTED-FAIL | %s | Test timed out" % test
> + timedOut = True
> + self.haveDumpedScreen = False
> + self.killAndGetStack(proc, "../../dist/bin", None)
Any chance we can call Automation().killAndGetStack instead and not have to define it in this file?
@@ +787,5 @@
> args.insert(0, '-d')
> + try:
> + debuggerInfo
> + except NameError:
> + debuggerInfo = None
This does does not appear to be necessary to me. Is it?
@@ +791,5 @@
> + debuggerInfo = None
> + try:
> + self.CRASHREPORTER
> + except AttributeError:
> + self.CRASHREPORTER = None
The CRASHREPORTER stuff comes from http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#49, which is unfortunate since it's all about preprocessing. If we can get rid of killAndGetStack from this file, we can avoid the issue entirely.
@@ +793,5 @@
> + self.CRASHREPORTER
> + except AttributeError:
> + self.CRASHREPORTER = None
> + if not interactive and not self.debuggerInfo:
> + testTimer = Timer(120.0, lambda: self.testTimeout(test, proc))
The rest of the places that print things show that you can just use the |name| variable instead of the |test| one.
@@ +794,5 @@
> + except AttributeError:
> + self.CRASHREPORTER = None
> + if not interactive and not self.debuggerInfo:
> + testTimer = Timer(120.0, lambda: self.testTimeout(test, proc))
> + testTimer.start()
There's an extra level of indentation here.
Assignee | ||
Comment 56•12 years ago
|
||
And while I realize that ted said that there's a desire to stop using automation.py.in, that work does not seem to be progressing quickly. I think for the purposes of this patch we should focus on sharing as much as possible.
Assignee | ||
Comment 57•12 years ago
|
||
Bug 783727 just landed, so it might be pretty useful to use the psutil.Process type to launch the shell, and when killing it you can use proc.get_children(), kill all of those and then kill the original. This will address my original use of killpg which ted didn't like.
Comment 58•12 years ago
|
||
So I started using mq, and found that if I removed the patch, it builds fine, so it must be something I did. The problem is I have no idea whats causing the build to fail. When I get home I will post the error again.
Comment 59•12 years ago
|
||
I would like to continue working on this bug, but work may be a bit slower than normal. On the other hand, if someone wants this bug I am willing to let go of it bug in exchange for something a bit more trivial.
Comment 60•12 years ago
|
||
I don't see anyone else stepping up to do the work, so feel free to continue working at your own pace. Thanks for sticking with it!
Comment 61•12 years ago
|
||
Alright, I fixed everything except for using killAndGetStack from automation.py.in. I don't understand how to access it from runxpcshelltests.py. Some help would be appreciated. Thanks!
Assignee | ||
Comment 62•12 years ago
|
||
Did you try this suggestion?
>Any chance we can call Automation().killAndGetStack instead and not have to define
>it in this file?
Comment 63•12 years ago
|
||
Yes, I get this:
TEST-INFO | c:\Users\Stan\firefox-src\mozilla-central\obj-i686-pc-mingw32\_tests
\xpcshell\netwerk\test\unit\test_304_responses.js | running test ...
TEST-UNEXPECTED-FAIL | c:\Users\Stan\firefox-src\mozilla-central\obj-i686-pc-min
gw32\_tests\xpcshell\netwerk\test\unit\test_304_responses.js | Test timed out
Exception in thread Thread-1:
Traceback (most recent call last):
File "c:\mozilla-build\python\Lib\threading.py", line 552, in __bootstrap_inne
r
self.run()
File "c:\mozilla-build\python\Lib\threading.py", line 756, in run
self.function(*self.args, **self.kwargs)
File "c:/Users/Stan/firefox-src/mozilla-central/testing/xpcshell/runxpcshellte
sts.py", line 846, in <lambda>
testTimer = Timer(1.0, lambda: self.testTimeout(name, proc))
File "c:/Users/Stan/firefox-src/mozilla-central/testing/xpcshell/runxpcshellte
sts.py", line 578, in testTimeout
Automation.killAndGetStack(proc, "../../dist/bin", None)
NameError: global name 'Automation' is not defined
Assignee | ||
Comment 64•12 years ago
|
||
Add |from automation import Automation| after the rest of the imports, and make sure you're using |Automation().killAndGetStack(...)| (not the () after Automation).
Comment 65•12 years ago
|
||
I get a |ImportError: No module named automation| error.
Assignee | ||
Comment 66•12 years ago
|
||
Hmm, I was able to make it work by modifying some makefiles to use include the correct path, but there were some complications that arose (specifically, automation.py ends up running code that makes all output be repeated). Ted, do you have any input on how best to access killAndGetStack? I tried moving it into automationutils.py and take an Automation parameter, but I still need to end up importing automation which causes the double logging to occur.
Comment 67•12 years ago
|
||
Attachment #661483 -
Attachment is obsolete: true
Updated•12 years ago
|
Attachment #694017 -
Flags: feedback?(josh)
Assignee | ||
Comment 68•12 years ago
|
||
This is nearly ready. Unfortunately it breaks mach right now; I think fixing bug 794506 should make it work, since we rely on a python file that's in the objdir.
Assignee | ||
Updated•12 years ago
|
Assignee: astropiloto → josh
Assignee | ||
Updated•12 years ago
|
Attachment #694017 -
Attachment is obsolete: true
Attachment #694017 -
Flags: feedback?(josh)
Assignee | ||
Updated•12 years ago
|
Attachment #477735 -
Attachment is obsolete: true
Assignee | ||
Comment 69•12 years ago
|
||
qref
Assignee | ||
Updated•12 years ago
|
Attachment #694260 -
Attachment is obsolete: true
Comment 70•12 years ago
|
||
Is bug 794506 still a dependency? If so, any idea on when it will be fixed?
Updated•12 years ago
|
Keywords: sheriffing-P1
Comment 71•12 years ago
|
||
So now that bug 841713 is fixed, what do we have to do next?
Assignee | ||
Comment 72•12 years ago
|
||
I'm testing to see whether I was correct that running xpcshell tests under mach now works correctly.
Assignee | ||
Comment 73•12 years ago
|
||
May as well make sure that this solution is actually desired while waiting for my build to complete.
Attachment #730392 -
Flags: review?(ted)
Assignee | ||
Updated•12 years ago
|
Attachment #694261 -
Attachment is obsolete: true
Assignee | ||
Comment 74•12 years ago
|
||
Ah, it looks like we probably also want bug 794506. Then mach will have the right search paths to be able to find the preprocessed automation.py from $objdir/build.
Assignee | ||
Comment 75•12 years ago
|
||
Actually, it turns out that there is a way to fix the problem without waiting on proper virtualenv integration.
No longer depends on: 841713
Assignee | ||
Comment 76•12 years ago
|
||
Attachment #731193 -
Flags: review?(ted)
Assignee | ||
Updated•12 years ago
|
Attachment #730392 -
Attachment is obsolete: true
Attachment #730392 -
Flags: review?(ted)
Comment 77•12 years ago
|
||
Josh, how goes things here? IIUC, this blocks getting useful diagnostics on the various timeout failures which blocks un-hiding xpcshell tests on WinXP (bug 856027).
Flags: needinfo?(josh)
Assignee | ||
Comment 78•12 years ago
|
||
Don't blame me; It's still waiting on Ted's review.
Flags: needinfo?(josh)
Comment 80•12 years ago
|
||
So yeah, bug 859065 should fix the simple case where we just have an async test that fails to complete. This patch will still be necessary for more complex things, like IPC tests or things that are actually hanging the main thread.
Flags: needinfo?(ted)
Comment 81•12 years ago
|
||
Comment on attachment 731193 [details] [diff] [review]
Add timeout logic to xpcshell test runner.
Review of attachment 731193 [details] [diff] [review]:
-----------------------------------------------------------------
r- for a few things, but it's mostly ok.
::: build/automation.py.in
@@ +88,5 @@
> + while _log.handlers:
> + _log.removeHandler(_log.handlers[0])
> + handler = logging.StreamHandler(log)
> + _log.setLevel(logging.INFO)
> + _log.addHandler(handler)
I guess ideally we'd push this up into each of the harnesses that are using automation.py. It seems wrong for this module to be setting the log handler. Can you file a followup for that?
@@ +1020,3 @@
>
> def killAndGetStack(self, proc, utilityPath, debuggerInfo):
> """Kill the process, preferrably in a way that gets us a stack trace."""
Can you move this comment down to the new method, and mention here that this attempts to save a screenshot and then kill the process?
::: config/makefiles/xpcshell.mk
@@ +57,5 @@
>
> xpcshell-tests-remote: DM_TRANS?=adb
> xpcshell-tests-remote:
> $(PYTHON) -u $(topsrcdir)/testing/xpcshell/remotexpcshelltests.py \
> + -I$(testxpcobjdir) \
This...won't actually work. It's not running pythonpath.py.
::: testing/xpcshell/Makefile.in
@@ +13,5 @@
> include $(topsrcdir)/config/rules.mk
> +# We're installing to _tests/xpcshell, so this is the depth
> +# necessary for relative objdir paths.
> +TARGET_DEPTH = ../..
> +include $(topsrcdir)/build/automation-build.mk
I don't think you really need all this gunk. $objdir/build is already in sys.path, and that has a copy of automation.py that you can use. You're not launching an application with it, so you don't need the paths processed correctly. (TARGET_DEPTH just affects the path to the default binary etc.)
You would also need to adjust the xpcshell test packaging goop to stick $objdir/build/automation.py in the xpcshell dir in the test package, but I think that's cleaner than all this.
::: testing/xpcshell/runxpcshelltests.py
@@ +844,5 @@
> completeCmd = cmdH + cmdT + args
>
> + if not interactive and not self.debuggerInfo:
> + testTimer = Timer(120.0, lambda: self.testTimeout(name, proc))
> + testTimer.start()
I'm not wild about this using threads, but I guess we can clean this up in bug 771578.
@@ +869,5 @@
>
> if interactive:
> # Not sure what else to do here...
> return True
> + if testTimer:
This is going to error in the "interactive or debugger" case because testTimer is undefined. You need to assign testTimer = None or something like that outside of the first conditional you added.
Attachment #731193 -
Flags: review?(ted) → review-
Comment 82•12 years ago
|
||
(Sorry about the review delay, will do better for the next round.)
Assignee | ||
Comment 83•12 years ago
|
||
Rebased, with comments addressed. I've tested running individual tests under mach, as well as running check-one, check-interactive, and xpcshell-tests under make.
Attachment #746526 -
Flags: review?(ted)
Assignee | ||
Updated•12 years ago
|
Attachment #731193 -
Attachment is obsolete: true
Comment 84•12 years ago
|
||
Comment on attachment 746526 [details] [diff] [review]
Add timeout logic to xpcshell test runner.
Review of attachment 746526 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks for fixing this up.
::: config/makefiles/xpcshell.mk
@@ +39,5 @@
> # Execute all tests in the $(XPCSHELL_TESTS) directories.
> # See also testsuite-targets.mk 'xpcshell-tests' target for global execution.
> xpcshell-tests:
> $(PYTHON) -u $(topsrcdir)/config/pythonpath.py \
> + -I$(testxpcobjdir) \
I feel like we could probably get rid of pythonpath here, but we can punt that to a followup. (There's already a bug filed on getting rid of pythonpath.)
::: netwerk/test/unit/test_head.js
@@ +112,5 @@
> response = channel.getResponseHeader("Proxy-Authenticate");
> do_check_eq(response, "line 1\nline 2\nline 3");
> +
> + for (;;) {
> + }
I suspect you want to remove this before landing. :)
Attachment #746526 -
Flags: review?(ted) → review+
Assignee | ||
Comment 85•12 years ago
|
||
Comment 86•12 years ago
|
||
Backed out because of mochitest bustage:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a33030103ec4
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=da02b1166a5c
Comment 87•12 years ago
|
||
Not actually mochitest bustage, but postflight unification of Mac universal builds.
However, the timeouts in https://tbpl.mozilla.org/php/getParsedLog.php?id=22708325&tree=Mozilla-Inbound (test_cookies_read.js doesn't seem to be in the habit of timing out) and the inexplicable failure in https://tbpl.mozilla.org/php/getParsedLog.php?id=22707181&tree=Mozilla-Inbound mean you might also want to push to try and trigger a few dozen Windows debug and opt runs.
Assignee | ||
Comment 88•12 years ago
|
||
Assignee | ||
Comment 89•12 years ago
|
||
Assignee | ||
Comment 90•12 years ago
|
||
Comment 91•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
You need to log in
before you can comment on or make changes to this bug.
Description
•