597064 - xpcshell harness should handle timeouts internally, fail tests (like mochitest or reftest harnesses)

Assignee

Description

•

14 years ago

<ted> because the xpcshell harness doesn't know about timeouts <ted> we should fix that <ted> it's buildbot forcibly killing the process

(not currently active) Ted Mielczarek

Updated

•

14 years ago

Summary: Timeout errors don't show any test output → xpcshell harness should handle timeouts internally, fail tests (like mochitest or reftest harnesses)

Josh Matthews [:jdm]

Assignee

Updated

•

14 years ago

Assignee: nobody → josh

Josh Matthews [:jdm]

Assignee

Updated

•

14 years ago

Depends on: 597471

Josh Matthews [:jdm]

Assignee

Comment 1

•

14 years ago

This is harder than simply adding a timer to head.js, unfortunately. IPC tests complicate things in that if a child is in an infinite loop the parent process will wait forever for it to finish. There's no way around that besides adding a watchdog into the shell, which I'm investigating. The significantly easier alternative is to go back to my original plan of using a timer from runxpcshelltests.py.

Josh Matthews [:jdm]

Assignee

Comment 2

•

14 years ago

Attached patch Add timeout control to xpcshell harness v1. (obsolete) — Details — Splinter Review

Josh Matthews [:jdm]

Assignee

Updated

•

14 years ago

Attachment #476555 - Attachment is obsolete: true

Josh Matthews [:jdm]

Assignee

Comment 3

•

14 years ago

Attached patch Add timeout logic to xpcshell test runner. (obsolete) — Details — Splinter Review

Josh Matthews [:jdm]

Assignee

Updated

•

14 years ago

Attachment #477731 - Attachment is obsolete: true

Josh Matthews [:jdm]

Assignee

Comment 4

•

14 years ago

Attached patch Add timeout logic to xpcshell test runner. (obsolete) — Details — Splinter Review

Now with 100% more qref

Josh Matthews [:jdm]

Assignee

Updated

•

14 years ago

Attachment #477735 - Flags: review?(ted.mielczarek)

(not currently active) Ted Mielczarek

Comment 5

•

14 years ago

Comment on attachment 477735 [details] [diff] [review] Add timeout logic to xpcshell test runner. >diff --git a/testing/xpcshell/runxpcshelltests.py b/testing/xpcshell/runxpcshelltests.py >--- a/testing/xpcshell/runxpcshelltests.py >+++ b/testing/xpcshell/runxpcshelltests.py >+ def testTimeout(self, test, proc): >+ global timedOut You probably don't need to use a global here, just use a member variable, self.timedOut. Shouldn't be any synchronization issues since you're only setting it here. >+ print "TEST-UNEXPECTED-FAIL | %s | Test timed out" % test >+ timedOut = True >+ os.killpg(os.getpgid(proc.pid), signal.SIGINT) killpg won't work on Windows. You'll need to do something else. Also, for consistency with other harnesses, kill it with SIGABRT on Linux. In fact, you might just want to refactor this utility function into automationutils.py and use it instead: http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#697 >+ if not interactive: >+ testTimer = Timer(120.0, lambda: self.testTimeout(test, proc)) >+ testTimer.start() You probably want to check debuggerInfo as well here, and not bother with this when a debugger of any kind is attached. >+ > try: > print "TEST-INFO | %s | running test ..." % test > > proc = self.launchProcess(cmdH + cmdT + self.xpcsRunArgs, > stdout=pStdout, stderr=pStderr, env=self.env, cwd=testdir) > > # Allow user to kill hung subprocess with SIGINT w/o killing this script > # - don't move this line above launchProcess, or child will inherit the SIG_IGN >@@ -482,16 +495,18 @@ class XPCShellTests(object): > # |stderr == None| as |pStderr| was either |None| or redirected to |stdout|. > stdout, stderr = self.communicate(proc) > signal.signal(signal.SIGINT, signal.SIG_DFL) > > if interactive: > # Not sure what else to do here... > return True > >+ testTimer.cancel() If you wind up checking debuggerInfo you'll need to check it again here so you don't touch this uninitialized. r-, needs some work, but a decent start.

Attachment #477735 - Flags: review?(ted.mielczarek) → review-

Serge Gautherie (:sgautherie)

Updated

•

14 years ago

Severity: normal → major

OS: Linux → All

Hardware: x86 → All

Version: unspecified → Trunk

Josh Matthews [:jdm]

Assignee

Comment 8

•

13 years ago

This is a great bug for a new contributor with python knowledge to take on. The initial patch I submitted shows where the relevant changes are necessary; if you fix it up according to Ted's review comments in comment 5, we could get this checked in and make a number of people really happy.

Whiteboard: [mentor=jdm]

Josh Matthews [:jdm]

Assignee

Comment 9

•

13 years ago

https://developer.mozilla.org/en/Writing_xpcshell-based_unit_tests has information on how to run xpcshell tests. Just add an infinite loop to an existing test and run that to ensure your changes are doing the right thing.

cmtalbert

Updated

•

13 years ago

Whiteboard: [mentor=jdm] → [mentor=jdm][goodfirstbug]

Josh Matthews [:jdm]

Assignee

Updated

•

13 years ago

Whiteboard: [mentor=jdm][goodfirstbug] → [mentor=jdm][good first bug][lang=python]

Alan Williams [:aw1231]

Comment 10

•

13 years ago

I am willing to take on this bug. I will get started tomorrow.

Alan Williams [:aw1231]

Comment 11

•

13 years ago

I see that this bug has a dependency, should I try to fix the dependency, or wait until its fixed?

Josh Matthews [:jdm]

Assignee

Updated

•

13 years ago

No longer depends on: 597471

Josh Matthews [:jdm]

Assignee

Comment 12

•

13 years ago

No, that dependency was not actually blocking this bug.

Assignee: josh → astropiloto

Josh Matthews [:jdm]

Assignee

Comment 13

•

13 years ago

The reason I used killpg was to deal with subprocesses of xpcshell, such as occur when executing IPC tests (ie. any tests in unit_ipc test directories). I'm inclined to say that for the purposes of this bug getting resolved quickly, we should ignore that problem and save it for a followup.

Alan Williams [:aw1231]

Comment 14

•

13 years ago

Ok, will do. Everything else should be included right?

Josh Matthews [:jdm]

Assignee

Comment 15

•

13 years ago

Yes, all of Ted's earlier comments should be addressed. You can look at using the routines from automation.py to replace the use of killpg in this patch for the time being.

Alan Williams [:aw1231]

Comment 16

•

13 years ago

Yes, that was what I was planning to do.

Alan Williams [:aw1231]

Comment 17

•

13 years ago

While running the code I got this error: TEST-INFO | c:\Users\dorothy\firefox-src\mozilla-central\obj-i686-pc-mingw32\_te sts\xpcshell\netwerk\test\unit\test_resumable_channel.js | running test ... Exception in thread Thread-131: Traceback (most recent call last): File "c:\mozilla-build\python\lib\threading.py", line 552, in __bootstrap_inne r self.run() File "c:\mozilla-build\python\lib\threading.py", line 756, in run self.function(*self.args, **self.kwargs) File "c:/Users/dorothy/firefox-src/mozilla-central/testing/xpcshell/runxpcshel ltests.py", line 672, in <lambda> testTimer = Timer(120.0, lambda: self.testTimeout(test, proc)) AttributeError: 'XPCShellTests' object has no attribute 'testTimeout' I am unfamiliar with Timer and lambda, however this is the original unedited code, so what am I doing wrong?

Josh Matthews [:jdm]

Assignee

Comment 18

•

13 years ago

That error suggests to me that you didn't apply this block from my original patch properly: >+ def testTimeout(self, test, proc): >+ global timedOut >+ print "TEST-UNEXPECTED-FAIL | %s | Test timed out" % test >+ timedOut = True >+ os.killpg(os.getpgid(proc.pid), signal.SIGINT)

Josh Matthews [:jdm]

Assignee

Comment 19

•

13 years ago

Specifically, it should be a method of the XPCShellTests class, at the same level of indentation as buildXpcsRunArgs.

Alan Williams [:aw1231]

Comment 20

•

13 years ago

That would do it. :)

Alan Williams [:aw1231]

Comment 21

•

13 years ago

(In reply to Josh Matthews [:jdm] from comment #15) > Yes, all of Ted's earlier comments should be addressed. You can look at > using the routines from automation.py to replace the use of killpg in this > patch for the time being. Should I copy this routine, or import it? If I am going to import, how?

Josh Matthews [:jdm]

Assignee

Comment 22

•

13 years ago

It looks to me like you can just use |from automation import Automation| and create an instance of Automation where you need it. Check how other test suites call killAndGetStack to make sure you use it correctly.

Alan Williams [:aw1231]

Comment 23

•

13 years ago

I tried that, and it gives me "ImportError: No module named automation".

Josh Matthews [:jdm]

Assignee

Comment 24

•

13 years ago

You probably need to add automation.py to http://mxr.mozilla.org/mozilla-central/source/testing/xpcshell/Makefile.in#63.

(not currently active) Ted Mielczarek

Comment 25

•

13 years ago

Historically we haven't had the xpcshell harness rely on automation.py directly. Instead, I created automationutils.py to store common functionality between xpcshell and automation.py. If it's not too much work, the right solution here might be to move that functionality from automation.py to automationutils.py.

Alan Williams [:aw1231]

Comment 26

•

13 years ago

Would I copy the function over to automationutils.py, or move it?

(not currently active) Ted Mielczarek

Comment 27

•

13 years ago

Move it. automation.py imports automationutils.py, so it should be fairly straightforward to fix the existing code to call it from automationutils instead.

Alan Williams [:aw1231]

Comment 28

•

13 years ago

I am having trouble copying it over. The build gives me an error "utilityPath is not defined.". I assume this is from an import in automation.py.in, but I looked around on mxr.mozilla.org and I couldn't find where it originates.

Josh Matthews [:jdm]

Assignee

Comment 29

•

13 years ago

>739 def killAndGetStack(self, proc, utilityPath, debuggerInfo): Given that utilityPath is an argument to the method that you're supposedly moving, something doesn't add up.

Alan Williams [:aw1231]

Comment 30

•

13 years ago

I think I was unclear, let me try again. When I call killAndGetStack from runxpcshelltests.py I get the error "utilityPath is not defined". This makes sense because utilityPath has not been assigned in runxpcshelltests.py. However, I am unsure what utilityPath should be assigned to, meaning what is the utilityPath? (Sorry for asking so many "stupid" questions. I have not participated in any large projects before, and this is kind of over my head. :) )

Josh Matthews [:jdm]

Assignee

Comment 31

•

13 years ago

How are you calling killAndGetStack? What value are you passing for it? What value do existing callers of killAndGetStack pass to it?

Alan Williams [:aw1231]

Comment 32

•

13 years ago

Just for the record, I am moving dumpScreen http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#692 into automationutils.py because it is called by killAndGetStack.

Alan Williams [:aw1231]

Comment 33

•

13 years ago

I am having problems moving code from automation.py.in to automationutils.py. This is blocking further work.

Alan Williams [:aw1231]

Comment 34

•

13 years ago

Attached patch Patch for bug 597064 adding timeout for xpcshell test harness (obsolete) — Details — Splinter Review

Alan Williams [:aw1231]

Updated

•

13 years ago

Attachment #618817 - Flags: feedback?(josh)

Serge Gautherie (:sgautherie)

Comment 35

•

13 years ago

Comment on attachment 618817 [details] [diff] [review] Patch for bug 597064 adding timeout for xpcshell test harness automation.py.in functions should still live somewhere ;-<

Attachment #618817 - Flags: feedback?(josh) → review-

Alan Williams [:aw1231]

Comment 36

•

13 years ago

Sorry about that. There was a problem with the diff file and I had to manually edit it. I meant to keep the section where it gets added to automationutils.py. Once again, sorry about that.

Josh Matthews [:jdm]

Assignee

Comment 37

•

13 years ago

I have a couple other notes: * killAndGetStack will only be called on non-mac and windows platforms, due to its indentation * the bits where you check for NameErrors with certain variables is kind of weird. Could you explain why those are necessary? * we probably need to check for testTimer existing before cancelling it I will almost certainly have more comments once a fixed up version that includes the automationutils.py changes is included; please don't get discouraged by this! I'm excited that you're making progress here :)

Alan Williams [:aw1231]

Comment 38

•

13 years ago

Thanks for the feedback! I use the NameError stuff to check whether a variable exists. I know its probably not the best way, but that was how I was taught how to do it. And if I find a better way I could probably do that for testTimer as well. I am unsure what you want me to do with the first comment. Do you want me to get rid of the assignments for self.IS_MAC?

Josh Matthews [:jdm]

Assignee

Comment 39

•

13 years ago

My point with the first comment is that right now, killAndGetStack only runs inside of one of the branches of the platform checks. You want to move it out of the linux check so that it runs regardless of the platform. For the undefined variables, I suggest this: assign self.CRASHREPORTER in the XPCShellTests initialization and get rid of the bit before you create the timer. I'm not really sure what good the debuggerInfo one does, since right below it you check self.debuggerInfo. It looks like your code should work fine without it. To do the testTimer check, just initialize testTimer to null before doing the checks to see if you should create it. That way you can just if |if testTimer:| later on when cancelling it.

Alan Williams [:aw1231]

Comment 40

•

13 years ago

After being sick for a long time, I am back and actively working on this bug.

Alan Williams [:aw1231]

Comment 41

•

13 years ago

I am moving this code here: http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#680 into the runxpcshelltest.py file. However that line needs this entire class: http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#160 Should I move that too?

Alan Williams [:aw1231]

Comment 42

•

12 years ago

Attached file runxpcshelltests.py prototype (obsolete) — Details

still having trouble with OS recognition

Attachment #646994 - Flags: feedback?

Josh Matthews [:jdm]

Assignee

Comment 43

•

12 years ago

You'll need to be more specific than that if you want useful feedback, Alan. Also, if you could attach a diff of your changes instead of the whole file, that would be easier to work with.

Alan Williams [:aw1231]

Comment 44

•

12 years ago

Ok, sorry, it was late when I uploaded this. As I said before, OS recognition is not working. When I run the tests, and one times out, it should close it in the proper manner dependent on the OS. However, it simply runs the Mac close sequence (which is not under a if.. then.. statement), which causes errors on my Windows box. Somehow, (and this is why I am confused), my Windows box is not being recognized as such. Regarding a patch, I am not near my computer, but I will upload one ASAP. Once again, sorry for not being clear!

Josh Matthews [:jdm]

Assignee

Comment 45

•

12 years ago

Sorry for taking so long to give you some feedback here; you probably need to use self.mozinfo in testTimeout (and similarly in trySetupNode).

Alan Williams [:aw1231]

Comment 46

•

12 years ago

No problem! I have tons of free time this week so I should (hopefully) get this done.

Alan Williams [:aw1231]

Updated

•

12 years ago

Attachment #646994 - Flags: feedback?

Alan Williams [:aw1231]

Comment 47

•

12 years ago

Attached patch Almost complete (obsolete) — Details — Splinter Review

This patch is NOT complete. For some reason, mozInfo doesn't have a OS property.

Attachment #646994 - Attachment is obsolete: true

Attachment #655334 - Flags: feedback?

Josh Matthews [:jdm]

Assignee

Comment 48

•

12 years ago

Huh. I unfortunately have no clue why self.mozInfo.os isn't working. Try printing it out to see if you get an UNKOWN value, or if it really doesn't exist.

Alan Williams [:aw1231]

Comment 49

•

12 years ago

I tried printing it and got this: Exception in thread Thread-1: Traceback (most recent call last): File "c:\mozilla-build\python\Lib\threading.py", line 552, in __bootstrap_inne r self.run() File "c:\mozilla-build\python\Lib\threading.py", line 756, in run self.function(*self.args, **self.kwargs) File "c:/Users/Stan/firefox-src/mozilla-central/testing/xpcshell/runxpcshellte sts.py", line 848, in <lambda> testTimer = Timer(5.0, lambda: self.testTimeout(test, proc)) File "c:/Users/Stan/firefox-src/mozilla-central/testing/xpcshell/runxpcshellte sts.py", line 609, in testTimeout print self.mozInfo.os AttributeError: 'dict' object has no attribute 'os' Which I guess means it doesn't exist at all.

Alan Williams [:aw1231]

Comment 50

•

12 years ago

I got it, I had to not use self.

Alan Williams [:aw1231]

Comment 51

•

12 years ago

Attached patch proposed patch for bug 597064 (obsolete) — Details — Splinter Review

Attachment #618817 - Attachment is obsolete: true

Attachment #655334 - Attachment is obsolete: true

Attachment #655334 - Flags: feedback?

Josh Matthews [:jdm]

Assignee

Comment 52

•

12 years ago

Comment on attachment 661427 [details] [diff] [review] proposed patch for bug 597064 Review of attachment 661427 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for this, Alan! Assuming this works correctly, and continues to do so after the changes I propose, I think the next step should be to figure out how to avoid duplicating code from automation.py.in and move to sharing it instead. It's good that you have made these changes, since it's much clearer what actually needs to be shared now. ::: build/automation.py.in @@ +17,5 @@ > import sys > import threading > import tempfile > import sqlite3 > +import subprocess Is this necessary? @@ +157,5 @@ > "IS_DEBUG_BUILD", > "DEFAULT_TIMEOUT", > ] > > + This definition needs to stay here since the rest of the file uses it. ::: netwerk/test/unit/test_307_redirect.js @@ +1,1 @@ > +while(true){} Another important test to run is one that iloops after doing some things that cause output (eg. do_check_true). We want to make sure we see any existing output when the test times out. ::: testing/xpcshell/runxpcshelltests.py @@ +30,5 @@ > > """ Control-C handling """ > gotSIGINT = False > def markGotSIGINT(signum, stackFrame): > + global gotSIGINT, timedOut This change doesn't look like it should be necessary. @@ +35,4 @@ > gotSIGINT = True > > class XPCShellTests(object): > + class Process(subprocess.Popen): If this needs to be in this file, it should probably be in the top level (ie. not a member of XPCShellTests). However, if we can avoid duplicating it and just refer to the original, that would be much better. @@ +476,5 @@ > if self.nodeProc: > self.log.info('Node SPDY server shutting down ...') > # moz-spdy exits when its stdin reaches EOF, so force that to happen here > self.nodeProc.communicate() > + def killAndGetStack(self, proc, utilityPath, debuggerInfo): Newline before this. @@ +490,5 @@ > + if os.path.exists(crashinject) and subprocess.Popen([crashinject, str(proc.pid)]).wait() == 0: > + return > + #TODO: kill the process such that it triggers Breakpad on OS X (bug 525296) > + self.log.info("Can't trigger Breakpad, just killing process") > + proc.kill() Newline after this. @@ +613,5 @@ > + self.UNIXISH = True > + self.IS_WIN32 = False > + self.IS_Mac = False > + self.killAndGetStack(proc, "../../dist/bin", None) > + def dumpScreen(self, utilityPath): Since we don't need to capture screenshots, just kill this whole method. @@ +831,5 @@ > args.insert(0, '-d') > + try: > + debuggerInfo > + except NameError: > + debuggerInfo = None Why is this necessary? @@ +835,5 @@ > + debuggerInfo = None > + try: > + self.CRASHREPORTER > + except AttributeError: > + self.CRASHREPORTER = None Likewise this. We should figure out what should actually be defining this value properly. @@ +837,5 @@ > + self.CRASHREPORTER > + except AttributeError: > + self.CRASHREPORTER = None > + if not interactive and not self.debuggerInfo: > + Kill this line. @@ +838,5 @@ > + except AttributeError: > + self.CRASHREPORTER = None > + if not interactive and not self.debuggerInfo: > + > + testTimer = Timer(120.0, lambda: self.testTimeout(test, proc)) Just pass the test filename instead of the whole test object here. @@ +868,5 @@ > > if interactive: > # Not sure what else to do here... > return True > + testTimer.cancel() Wrap this in a |if testTimer| check.

Alan Williams [:aw1231]

Comment 53

•

12 years ago

(In reply to Josh Matthews [:jdm] from comment #52) > Comment on attachment 661427 [details] [diff] [review] > proposed patch for bug 597064 > > Review of attachment 661427 [details] [diff] [review]: > ----------------------------------------------------------------- > > Thanks for this, Alan! Assuming this works correctly, and continues to do so > after the changes I propose, I think the next step should be to figure out > how to avoid duplicating code from automation.py.in and move to sharing it > instead. It's good that you have made these changes, since it's much clearer > what actually needs to be shared now. > > ::: build/automation.py.in > @@ +17,5 @@ > > import sys > > import threading > > import tempfile > > import sqlite3 > > +import subprocess > > Is this necessary? > It is necessary for the process class. > @@ +157,5 @@ > > "IS_DEBUG_BUILD", > > "DEFAULT_TIMEOUT", > > ] > > > > + > > This definition needs to stay here since the rest of the file uses it. > > ::: netwerk/test/unit/test_307_redirect.js > @@ +1,1 @@ > > +while(true){} > > Another important test to run is one that iloops after doing some things > that cause output (eg. do_check_true). We want to make sure we see any > existing output when the test times out. > I will do that. > ::: testing/xpcshell/runxpcshelltests.py > @@ +30,5 @@ > > > > """ Control-C handling """ > > gotSIGINT = False > > def markGotSIGINT(signum, stackFrame): > > + global gotSIGINT, timedOut > > This change doesn't look like it should be necessary. Will fix this, just something I forgot to get around to. > > @@ +35,4 @@ > > gotSIGINT = True > > > > class XPCShellTests(object): > > + class Process(subprocess.Popen): > > If this needs to be in this file, it should probably be in the top level > (ie. not a member of XPCShellTests). However, if we can avoid duplicating it > and just refer to the original, that would be much better. Ted told me that they are trying to get rid of automation.py.in http://logbot.glob.com.au/?c=mozilla%23introduction&s=26+Apr+2012&e=26+Apr+2012 > > @@ +476,5 @@ > > if self.nodeProc: > > self.log.info('Node SPDY server shutting down ...') > > # moz-spdy exits when its stdin reaches EOF, so force that to happen here > > self.nodeProc.communicate() > > + def killAndGetStack(self, proc, utilityPath, debuggerInfo): > > Newline before this. Will do. > > @@ +490,5 @@ > > + if os.path.exists(crashinject) and subprocess.Popen([crashinject, str(proc.pid)]).wait() == 0: > > + return > > + #TODO: kill the process such that it triggers Breakpad on OS X (bug 525296) > > + self.log.info("Can't trigger Breakpad, just killing process") > > + proc.kill() > > Newline after this. Same. > > @@ +613,5 @@ > > + self.UNIXISH = True > > + self.IS_WIN32 = False > > + self.IS_Mac = False > > + self.killAndGetStack(proc, "../../dist/bin", None) > > + def dumpScreen(self, utilityPath): > > Since we don't need to capture screenshots, just kill this whole method. Yep. > > @@ +831,5 @@ > > args.insert(0, '-d') > > + try: > > + debuggerInfo > > + except NameError: > > + debuggerInfo = None > > Why is this necessary? see below > > @@ +835,5 @@ > > + debuggerInfo = None > > + try: > > + self.CRASHREPORTER > > + except AttributeError: > > + self.CRASHREPORTER = None > > Likewise this. We should figure out what should actually be defining this > value properly. How would I go about doing that? > > @@ +837,5 @@ > > + self.CRASHREPORTER > > + except AttributeError: > > + self.CRASHREPORTER = None > > + if not interactive and not self.debuggerInfo: > > + > > Kill this line. Will do. > > @@ +838,5 @@ > > + except AttributeError: > > + self.CRASHREPORTER = None > > + if not interactive and not self.debuggerInfo: > > + > > + testTimer = Timer(120.0, lambda: self.testTimeout(test, proc)) > > Just pass the test filename instead of the whole test object here. Is that just test without the proc? or what would that be? > > @@ +868,5 @@ > > > > if interactive: > > # Not sure what else to do here... > > return True > > + testTimer.cancel() > > Wrap this in a |if testTimer| check. Will do.

Alan Williams [:aw1231]

Comment 54

•

12 years ago

Attached patch 2nd proposed patch for bug 597064 (obsolete) — Details — Splinter Review

This fixes most of the problems that JDM brought up. Not fixed: global timedOut, moving Process class back (see previous comment).

Attachment #661427 - Attachment is obsolete: true

Josh Matthews [:jdm]

Assignee

Comment 55

•

12 years ago

Comment on attachment 661483 [details] [diff] [review] 2nd proposed patch for bug 597064 Review of attachment 661483 [details] [diff] [review]: ----------------------------------------------------------------- ::: testing/xpcshell/runxpcshelltests.py @@ +9,5 @@ > from glob import glob > from optparse import OptionParser > from subprocess import Popen, PIPE, STDOUT > +from tempfile import mkdtemp, gettempdir, mkstemp > +import tempfile I don't think these changes are required any longer. @@ +20,4 @@ > from automationutils import * > > + > +class Process(subprocess.Popen): I don't think this is even necessary any more. Try removing it from here and reverting any changes you made to automation.py.in. @@ +80,5 @@ > > """ Control-C handling """ > gotSIGINT = False > def markGotSIGINT(signum, stackFrame): > + global gotSIGINT, timedOut This is still present. @@ +104,5 @@ > + self.IS_MAC = False > + else: > + self.UNIXISH = True > + self.IS_WIN32 = False > + self.IS_MAC = False Let's just scrap this whole block and use the mozinfo.os bits below. @@ +494,3 @@ > > + if self.CRASHREPORTER and not debuggerInfo: > + if self.UNIXISH: mozinfo.os not in ['mac', 'win'] @@ +496,5 @@ > + if self.UNIXISH: > + # ABRT will get picked up by Breakpad's signal handler > + os.kill(proc.pid, signal.SIGABRT) > + return > + elif self.IS_WIN32: mozinfo.os == 'win' @@ +613,5 @@ > + def testTimeout(self, test, proc): > + global timedOut > + print "TEST-UNEXPECTED-FAIL | %s | Test timed out" % test > + timedOut = True > + self.haveDumpedScreen = False Don't need this any longer. @@ +614,5 @@ > + global timedOut > + print "TEST-UNEXPECTED-FAIL | %s | Test timed out" % test > + timedOut = True > + self.haveDumpedScreen = False > + self.killAndGetStack(proc, "../../dist/bin", None) Any chance we can call Automation().killAndGetStack instead and not have to define it in this file? @@ +787,5 @@ > args.insert(0, '-d') > + try: > + debuggerInfo > + except NameError: > + debuggerInfo = None This does does not appear to be necessary to me. Is it? @@ +791,5 @@ > + debuggerInfo = None > + try: > + self.CRASHREPORTER > + except AttributeError: > + self.CRASHREPORTER = None The CRASHREPORTER stuff comes from http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#49, which is unfortunate since it's all about preprocessing. If we can get rid of killAndGetStack from this file, we can avoid the issue entirely. @@ +793,5 @@ > + self.CRASHREPORTER > + except AttributeError: > + self.CRASHREPORTER = None > + if not interactive and not self.debuggerInfo: > + testTimer = Timer(120.0, lambda: self.testTimeout(test, proc)) The rest of the places that print things show that you can just use the |name| variable instead of the |test| one. @@ +794,5 @@ > + except AttributeError: > + self.CRASHREPORTER = None > + if not interactive and not self.debuggerInfo: > + testTimer = Timer(120.0, lambda: self.testTimeout(test, proc)) > + testTimer.start() There's an extra level of indentation here.

Josh Matthews [:jdm]

Assignee

Comment 56

•

12 years ago

And while I realize that ted said that there's a desire to stop using automation.py.in, that work does not seem to be progressing quickly. I think for the purposes of this patch we should focus on sharing as much as possible.

Josh Matthews [:jdm]

Assignee

Comment 57

•

12 years ago

Bug 783727 just landed, so it might be pretty useful to use the psutil.Process type to launch the shell, and when killing it you can use proc.get_children(), kill all of those and then kill the original. This will address my original use of killpg which ted didn't like.

Alan Williams [:aw1231]

Comment 58

•

12 years ago

So I started using mq, and found that if I removed the patch, it builds fine, so it must be something I did. The problem is I have no idea whats causing the build to fail. When I get home I will post the error again.

Alan Williams [:aw1231]

Comment 59

•

12 years ago

I would like to continue working on this bug, but work may be a bit slower than normal. On the other hand, if someone wants this bug I am willing to let go of it bug in exchange for something a bit more trivial.

(not currently active) Ted Mielczarek

Comment 60

•

12 years ago

I don't see anyone else stepping up to do the work, so feel free to continue working at your own pace. Thanks for sticking with it!

Alan Williams [:aw1231]

Comment 61

•

12 years ago

Alright, I fixed everything except for using killAndGetStack from automation.py.in. I don't understand how to access it from runxpcshelltests.py. Some help would be appreciated. Thanks!

Josh Matthews [:jdm]

Assignee

Comment 62

•

12 years ago

Did you try this suggestion? >Any chance we can call Automation().killAndGetStack instead and not have to define >it in this file?

Alan Williams [:aw1231]

Comment 63

•

12 years ago

Yes, I get this: TEST-INFO | c:\Users\Stan\firefox-src\mozilla-central\obj-i686-pc-mingw32\_tests \xpcshell\netwerk\test\unit\test_304_responses.js | running test ... TEST-UNEXPECTED-FAIL | c:\Users\Stan\firefox-src\mozilla-central\obj-i686-pc-min gw32\_tests\xpcshell\netwerk\test\unit\test_304_responses.js | Test timed out Exception in thread Thread-1: Traceback (most recent call last): File "c:\mozilla-build\python\Lib\threading.py", line 552, in __bootstrap_inne r self.run() File "c:\mozilla-build\python\Lib\threading.py", line 756, in run self.function(*self.args, **self.kwargs) File "c:/Users/Stan/firefox-src/mozilla-central/testing/xpcshell/runxpcshellte sts.py", line 846, in <lambda> testTimer = Timer(1.0, lambda: self.testTimeout(name, proc)) File "c:/Users/Stan/firefox-src/mozilla-central/testing/xpcshell/runxpcshellte sts.py", line 578, in testTimeout Automation.killAndGetStack(proc, "../../dist/bin", None) NameError: global name 'Automation' is not defined

Josh Matthews [:jdm]

Assignee

Comment 64

•

12 years ago

Add |from automation import Automation| after the rest of the imports, and make sure you're using |Automation().killAndGetStack(...)| (not the () after Automation).

Alan Williams [:aw1231]

Comment 65

•

12 years ago

I get a |ImportError: No module named automation| error.

Josh Matthews [:jdm]

Assignee

Comment 66

•

12 years ago

Hmm, I was able to make it work by modifying some makefiles to use include the correct path, but there were some complications that arose (specifically, automation.py ends up running code that makes all output be repeated). Ted, do you have any input on how best to access killAndGetStack? I tried moving it into automationutils.py and take an Automation parameter, but I still need to end up importing automation which causes the double logging to occur.

Alan Williams [:aw1231]

Comment 67

•

12 years ago

Attached patch 3rd proposed patch (obsolete) — Details — Splinter Review

Attachment #661483 - Attachment is obsolete: true

Alan Williams [:aw1231]

Updated

•

12 years ago

Attachment #694017 - Flags: feedback?(josh)

Josh Matthews [:jdm]

Assignee

Comment 68

•

12 years ago

Attached patch Add timeout logic to xpcshell test runner. (obsolete) — Details — Splinter Review

This is nearly ready. Unfortunately it breaks mach right now; I think fixing bug 794506 should make it work, since we rely on a python file that's in the objdir.

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Assignee: astropiloto → josh

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Attachment #694017 - Attachment is obsolete: true

Attachment #694017 - Flags: feedback?(josh)

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Attachment #477735 - Attachment is obsolete: true

Josh Matthews [:jdm]

Assignee

Comment 69

•

12 years ago

Attached patch Add timeout logic to xpcshell test runner. (obsolete) — Details — Splinter Review

qref

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Attachment #694260 - Attachment is obsolete: true

Alan Williams [:aw1231]

Comment 70

•

12 years ago

Is bug 794506 still a dependency? If so, any idea on when it will be fixed?

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Depends on: 841713

Dave Townsend [:mossop]

Updated

•

12 years ago

Blocks: 824022

Ed Morley [:emorley]

Updated

•

12 years ago

Keywords: sheriffing-P1

Alan Williams [:aw1231]

Comment 71

•

12 years ago

So now that bug 841713 is fixed, what do we have to do next?

Josh Matthews [:jdm]

Assignee

Comment 72

•

12 years ago

I'm testing to see whether I was correct that running xpcshell tests under mach now works correctly.

Josh Matthews [:jdm]

Assignee

Comment 73

•

12 years ago

Attached patch Add timeout logic to xpcshell test runner. (obsolete) — Details — Splinter Review

May as well make sure that this solution is actually desired while waiting for my build to complete.

Attachment #730392 - Flags: review?(ted)

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Attachment #694261 - Attachment is obsolete: true

Josh Matthews [:jdm]

Assignee

Comment 74

•

12 years ago

Ah, it looks like we probably also want bug 794506. Then mach will have the right search paths to be able to find the preprocessed automation.py from $objdir/build.

Josh Matthews [:jdm]

Assignee

Comment 75

•

12 years ago

Actually, it turns out that there is a way to fix the problem without waiting on proper virtualenv integration.

No longer depends on: 841713

Josh Matthews [:jdm]

Assignee

Comment 76

•

12 years ago

Attached patch Add timeout logic to xpcshell test runner. (obsolete) — Details — Splinter Review

Attachment #731193 - Flags: review?(ted)

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Attachment #730392 - Attachment is obsolete: true

Attachment #730392 - Flags: review?(ted)

Ryan VanderMeulen [:RyanVM]

Comment 77

•

12 years ago

Josh, how goes things here? IIUC, this blocks getting useful diagnostics on the various timeout failures which blocks un-hiding xpcshell tests on WinXP (bug 856027).

Flags: needinfo?(josh)

Josh Matthews [:jdm]

Assignee

Comment 78

•

12 years ago

Don't blame me; It's still waiting on Ted's review.

Flags: needinfo?(josh)

Ryan VanderMeulen [:RyanVM]

Comment 79

•

12 years ago

Feel free to transfer the needinfo in that case ;)

Flags: needinfo?(ted)

:Paolo Amadini

Updated

•

12 years ago

Depends on: 859065

(not currently active) Ted Mielczarek

Comment 80

•

12 years ago

So yeah, bug 859065 should fix the simple case where we just have an async test that fails to complete. This patch will still be necessary for more complex things, like IPC tests or things that are actually hanging the main thread.

Flags: needinfo?(ted)

(not currently active) Ted Mielczarek

Comment 81

•

12 years ago

Comment on attachment 731193 [details] [diff] [review] Add timeout logic to xpcshell test runner. Review of attachment 731193 [details] [diff] [review]: ----------------------------------------------------------------- r- for a few things, but it's mostly ok. ::: build/automation.py.in @@ +88,5 @@ > + while _log.handlers: > + _log.removeHandler(_log.handlers[0]) > + handler = logging.StreamHandler(log) > + _log.setLevel(logging.INFO) > + _log.addHandler(handler) I guess ideally we'd push this up into each of the harnesses that are using automation.py. It seems wrong for this module to be setting the log handler. Can you file a followup for that? @@ +1020,3 @@ > > def killAndGetStack(self, proc, utilityPath, debuggerInfo): > """Kill the process, preferrably in a way that gets us a stack trace.""" Can you move this comment down to the new method, and mention here that this attempts to save a screenshot and then kill the process? ::: config/makefiles/xpcshell.mk @@ +57,5 @@ > > xpcshell-tests-remote: DM_TRANS?=adb > xpcshell-tests-remote: > $(PYTHON) -u $(topsrcdir)/testing/xpcshell/remotexpcshelltests.py \ > + -I$(testxpcobjdir) \ This...won't actually work. It's not running pythonpath.py. ::: testing/xpcshell/Makefile.in @@ +13,5 @@ > include $(topsrcdir)/config/rules.mk > +# We're installing to _tests/xpcshell, so this is the depth > +# necessary for relative objdir paths. > +TARGET_DEPTH = ../.. > +include $(topsrcdir)/build/automation-build.mk I don't think you really need all this gunk. $objdir/build is already in sys.path, and that has a copy of automation.py that you can use. You're not launching an application with it, so you don't need the paths processed correctly. (TARGET_DEPTH just affects the path to the default binary etc.) You would also need to adjust the xpcshell test packaging goop to stick $objdir/build/automation.py in the xpcshell dir in the test package, but I think that's cleaner than all this. ::: testing/xpcshell/runxpcshelltests.py @@ +844,5 @@ > completeCmd = cmdH + cmdT + args > > + if not interactive and not self.debuggerInfo: > + testTimer = Timer(120.0, lambda: self.testTimeout(name, proc)) > + testTimer.start() I'm not wild about this using threads, but I guess we can clean this up in bug 771578. @@ +869,5 @@ > > if interactive: > # Not sure what else to do here... > return True > + if testTimer: This is going to error in the "interactive or debugger" case because testTimer is undefined. You need to assign testTimer = None or something like that outside of the first conditional you added.

Attachment #731193 - Flags: review?(ted) → review-

(not currently active) Ted Mielczarek

Comment 82

•

12 years ago

(Sorry about the review delay, will do better for the next round.)

Josh Matthews [:jdm]

Assignee

Comment 83

•

12 years ago

Attached patch Add timeout logic to xpcshell test runner. — Details — Splinter Review

Rebased, with comments addressed. I've tested running individual tests under mach, as well as running check-one, check-interactive, and xpcshell-tests under make.

Attachment #746526 - Flags: review?(ted)

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Attachment #731193 - Attachment is obsolete: true

(not currently active) Ted Mielczarek

Comment 84

•

12 years ago

Comment on attachment 746526 [details] [diff] [review] Add timeout logic to xpcshell test runner. Review of attachment 746526 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for fixing this up. ::: config/makefiles/xpcshell.mk @@ +39,5 @@ > # Execute all tests in the $(XPCSHELL_TESTS) directories. > # See also testsuite-targets.mk 'xpcshell-tests' target for global execution. > xpcshell-tests: > $(PYTHON) -u $(topsrcdir)/config/pythonpath.py \ > + -I$(testxpcobjdir) \ I feel like we could probably get rid of pythonpath here, but we can punt that to a followup. (There's already a bug filed on getting rid of pythonpath.) ::: netwerk/test/unit/test_head.js @@ +112,5 @@ > response = channel.getResponseHeader("Proxy-Authenticate"); > do_check_eq(response, "line 1\nline 2\nline 3"); > + > + for (;;) { > + } I suspect you want to remove this before landing. :)

Attachment #746526 - Flags: review?(ted) → review+

Josh Matthews [:jdm]

Assignee

Comment 85

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/da02b1166a5c

Josh Matthews [:jdm]

Assignee

Updated

•

12 years ago

Blocks: 869640

(no longer active)

Comment 86

•

12 years ago

Backed out because of mochitest bustage: https://hg.mozilla.org/integration/mozilla-inbound/rev/a33030103ec4 https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=da02b1166a5c

Phil Ringnalda (:philor)

Comment 87

•

12 years ago

Not actually mochitest bustage, but postflight unification of Mac universal builds. However, the timeouts in https://tbpl.mozilla.org/php/getParsedLog.php?id=22708325&tree=Mozilla-Inbound (test_cookies_read.js doesn't seem to be in the habit of timing out) and the inexplicable failure in https://tbpl.mozilla.org/php/getParsedLog.php?id=22707181&tree=Mozilla-Inbound mean you might also want to push to try and trigger a few dozen Windows debug and opt runs.

Josh Matthews [:jdm]

Assignee

Comment 88

•

12 years ago

https://tbpl.mozilla.org/?tree=Try&rev=d7a5a55a270f

Josh Matthews [:jdm]

Assignee

Comment 89

•

12 years ago

https://tbpl.mozilla.org/?tree=Try&rev=7b21d85582ce

Josh Matthews [:jdm]

Assignee

Comment 90

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/4a1cbbb07ce2

Ryan VanderMeulen [:RyanVM]

Comment 91

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/4a1cbbb07ce2

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla23

Ed Morley [:emorley]

Updated

•

12 years ago

Depends on: 870745

(not currently active) Ted Mielczarek

Updated

•

12 years ago

Depends on: 873053

(not currently active) Ted Mielczarek

Updated

•

12 years ago

Depends on: 874550

:Paolo Amadini

Updated

•

12 years ago

Blocks: 869638

:Paolo Amadini

Updated

•

12 years ago

Blocks: 889317

Add timeout control to xpcshell harness v1. 14 years ago Josh Matthews [:jdm] 6.70 KB, patch		Details \| Diff \| Splinter Review
Add timeout logic to xpcshell test runner. 14 years ago Josh Matthews [:jdm] 6.09 KB, patch		Details \| Diff \| Splinter Review
Add timeout logic to xpcshell test runner. 14 years ago Josh Matthews [:jdm] 5.41 KB, patch	ted : review-	Details \| Diff \| Splinter Review
Patch for bug 597064 adding timeout for xpcshell test harness 13 years ago Alan Williams [:aw1231] 9.66 KB, patch	sgautherie : review-	Details \| Diff \| Splinter Review
runxpcshelltests.py prototype 12 years ago Alan Williams [:aw1231] 40.10 KB, text/plain		Details
Almost complete 12 years ago Alan Williams [:aw1231] 12.28 KB, patch		Details \| Diff \| Splinter Review
proposed patch for bug 597064 12 years ago Alan Williams [:aw1231] 12.52 KB, patch		Details \| Diff \| Splinter Review
2nd proposed patch for bug 597064 12 years ago Alan Williams [:aw1231] 11.12 KB, patch		Details \| Diff \| Splinter Review
3rd proposed patch 12 years ago Alan Williams [:aw1231] 8.11 KB, patch		Details \| Diff \| Splinter Review
Add timeout logic to xpcshell test runner. 12 years ago Josh Matthews [:jdm] 17.46 KB, patch		Details \| Diff \| Splinter Review
Add timeout logic to xpcshell test runner. 12 years ago Josh Matthews [:jdm] 17.41 KB, patch		Details \| Diff \| Splinter Review
Add timeout logic to xpcshell test runner. 12 years ago Josh Matthews [:jdm] 17.13 KB, patch		Details \| Diff \| Splinter Review
Add timeout logic to xpcshell test runner. 12 years ago Josh Matthews [:jdm] 18.11 KB, patch	ted : review-	Details \| Diff \| Splinter Review
Add timeout logic to xpcshell test runner. 12 years ago Josh Matthews [:jdm] 19.31 KB, patch	ted : review+	Details \| Diff \| Splinter Review