Bug 372581 - Run unit tests on a pre-existing debug build
Status: RESOLVED FIXED
Product: Release Engineering
Classification: Other
Component: Other
Version: other
Hardware/OS: All / All
Priority: P2, Severity: normal
Assigned To: Chris AtLee [:catlee]
URL: http://tinderbox.mozilla.org/showbuil...
Duplicates: 464031
Depends on: 383136 410297 411108 421611 448802 463183 463578 486613 489273 493366 507424 510552 514242 516726 518634
Blocks: 279923 509731 397724 400083 472557 493237 498380 507540 508325 512592 517475 610565
Reported: 2007-03-04 08:18 PST by Ryan Jones
Modified: 2013-08-12 21:54 PDT
CC: 43 users


Attachments
[WIP] buildbot-configs for unittests on debug builds (5.80 KB, patch)
2009-07-10 12:34 PDT, Chris AtLee [:catlee]
no flags Details | Diff | Splinter Review
[WIP] buildbotcustom changes for running unittests on debug builds (1.95 KB, patch)
2009-07-10 12:37 PDT, Chris AtLee [:catlee]
no flags Details | Diff | Splinter Review
[WIP] buildbot-configs for unittests on debug builds (6.28 KB, patch)
2009-08-10 13:51 PDT, Chris AtLee [:catlee]
no flags Details | Diff | Splinter Review
[WIP] buildbotcustom changes for running unittests on debug builds (3.16 KB, patch)
2009-08-10 13:52 PDT, Chris AtLee [:catlee]
no flags Details | Diff | Splinter Review
Logs from mochitest-plain (6.37 KB, application/gzip)
2009-08-10 15:11 PDT, Chris AtLee [:catlee]
no flags Details
[WIP] buildbot-configs for unittests on debug builds (updated) (2.72 KB, patch)
2009-08-21 09:35 PDT, Lukas Blakk [:lsblakk] use ?needinfo
no flags Details | Diff | Splinter Review
[WIP] buildbotcustom changes for running unittests on debug builds (updated) (6.92 KB, patch)
2009-08-21 09:36 PDT, Lukas Blakk [:lsblakk] use ?needinfo
bhearsum: review-
Details | Diff | Splinter Review
Production buildbot-configs for unittests on debug builds (updated) (2.04 KB, patch)
2009-08-21 12:07 PDT, Lukas Blakk [:lsblakk] use ?needinfo
bhearsum: review-
Details | Diff | Splinter Review
Production buildbot-configs for unittests on debug builds - for all debug mozconfigs (3.26 KB, patch)
2009-08-24 14:42 PDT, Lukas Blakk [:lsblakk] use ?needinfo
bhearsum: review+
Details | Diff | Splinter Review
buildbotcustom changes for running unittests on debug builds - now as a config parameter (7.06 KB, patch)
2009-08-24 14:43 PDT, Lukas Blakk [:lsblakk] use ?needinfo
bhearsum: review-
Details | Diff | Splinter Review
buildbotcustom changes for running unittests on debug builds - take 3 (7.17 KB, patch)
2009-08-28 07:01 PDT, Lukas Blakk [:lsblakk] use ?needinfo
bhearsum: review-
Details | Diff | Splinter Review
buildbotcustom changes for running unittests on debug builds - take 4 (7.14 KB, patch)
2009-08-28 15:15 PDT, Lukas Blakk [:lsblakk] use ?needinfo
no flags Details | Diff | Splinter Review
buildbotcustom changes for running unittests on debug builds (10.51 KB, patch)
2009-09-16 06:53 PDT, Chris AtLee [:catlee]
bhearsum: review+
lukasblakk+bugs: review+
catlee: checked‑in+
Details | Diff | Splinter Review
buildbot-configs for unittests on debug builds (13.64 KB, patch)
2009-09-16 06:59 PDT, Chris AtLee [:catlee]
bhearsum: review-
lukasblakk+bugs: review+
Details | Diff | Splinter Review
buildbot-configs for unittests on debug builds, take 2 (7.81 KB, patch)
2009-09-16 07:33 PDT, Chris AtLee [:catlee]
bhearsum: review+
Details | Diff | Splinter Review
Increase server startup timeout to 3 minutes on debug builds (913 bytes, patch)
2009-09-18 14:00 PDT, Chris AtLee [:catlee]
ted: review+
ted: checked‑in+
Details | Diff | Splinter Review
buildbot-configs for unittests on debug builds, take 3 (9.82 KB, patch)
2009-09-23 09:20 PDT, Chris AtLee [:catlee]
bhearsum: review+
catlee: checked‑in+
Details | Diff | Splinter Review
Fixes for tests on debug builds (3.02 KB, patch)
2009-09-24 13:24 PDT, Chris AtLee [:catlee]
bhearsum: review-
Details | Diff | Splinter Review
Add --utility-path to runreftest.py (2.53 KB, patch)
2009-09-28 05:28 PDT, Ted Mielczarek [:ted.mielczarek]
benjamin: review+
ted: checked‑in+
Details | Diff | Splinter Review
Fixes for tests on debug builds (3.02 KB, patch)
2009-09-29 07:50 PDT, Chris AtLee [:catlee]
bhearsum: review+
Details | Diff | Splinter Review
Fixes for tests on debug builds, v3 (3.13 KB, patch)
2009-09-30 04:01 PDT, Chris AtLee [:catlee]
bhearsum: review+
catlee: checked‑in+
Details | Diff | Splinter Review

Description Ryan Jones 2007-03-04 08:18:19 PST
Per a discussion on IRC, it would be useful to run some unit tests on a tinderbox with debug mode enabled, for more test coverage.
Comment 2 Rob Campbell [:rc] (:robcee) 2007-06-26 08:37:23 PDT
There has been the occasional demand for this. It's definitely worth examining, whether chaining debug builds onto the existing slaves or creating new dedicated debug slaves.
Comment 3 Jeff Walden [:Waldo] (remove +bmo to email) 2007-06-26 10:02:00 PDT
Actually, it's a bit more important than that; assertions are compiled out of non-debug builds, so currently nothing ensures that no assertions are hit while running unit tests -- an explicit goal of bug 346922.  That change helps with catching assertions during patch creation (if the developer tests a debug build); it does not help with detecting assertions which are hit on platforms the developer doesn't use.
Comment 4 Ray Kiddy 2007-06-27 13:06:06 PDT
*** Bug 369809 has been marked as a duplicate of this bug. ***
Comment 5 Ray Kiddy 2007-06-27 13:06:58 PDT
Gee. What a phenomenal idea! :-)
Comment 6 Jeff Walden [:Waldo] (remove +bmo to email) 2007-07-31 12:08:19 PDT
I just got hit by bug 390324, which is a debug assertion that causes urlclassifier xpcshell tests to fail.  If we had debug tinderboxen running unit tests, this would have been caught before it had the chance to affect anyone other than the original developer.
Comment 7 Rob Campbell [:rc] (:robcee) 2007-08-01 08:43:15 PDT
ray, this has nothing to do with running nightlies with tests enabled. Debug build machines have been suggested before and aren't exactly a new idea.

Jeff: I saw this last night while helping dcamp debug. I'll see if we can get another machine spun up.
Comment 8 Jeff Walden [:Waldo] (remove +bmo to email) 2007-12-29 13:05:24 PST
Any progress on this?  Crashtests are supposed to also check for assertions, but in the run of them I just completed, I hit 150 assertions (23 unique) -- assertions which we can't make fatal without running tests in debug builds.
Comment 9 Chris Cooper [:coop] 2007-12-29 20:59:49 PST
We have 3 tinderboxes (so far) building debug builds and running the same tests as the unittest machines against them, while collecting leak and bloat data (bug 397724). These machines are named qm-leak* and are currently reporting to the MozillaTest tinderbox tree.

I haven't had any time yet to compare the failures these builds are reporting against the regular unittest machines on the Firefox tree. The logs are quite large, so any help is appreciated.
Comment 10 Jeff Walden [:Waldo] (remove +bmo to email) 2007-12-30 00:23:11 PST
I took some brief looks.


The assertion we're hitting in netwerk code related to content-type sniffing that's making xpcshell tests fail is the following:

###!!! ASSERTION: Content type should be known by now.: '!mContentType.IsEmpty()', file c:/slave_coop/trunk_winxp/mozilla/netwerk/streamconv/converters/nsUnknownDecoder.cpp, line 386
necko!nsUnknownDecoder::OnStopRequest+0x0000000000000043 (c:\slave_coop\trunk_winxp\mozilla\netwerk\streamconv\converters\nsunknowndecoder.cpp, line 242)
necko!nsStreamListenerTee::OnStopRequest+0x00000000000000A8 (c:\slave_coop\trunk_winxp\mozilla\netwerk\base\src\nsstreamlistenertee.cpp, line 66)
necko!nsHttpChannel::OnStopRequest+0x00000000000003CE (c:\slave_coop\trunk_winxp\mozilla\netwerk\protocol\http\src\nshttpchannel.cpp, line 4413)
necko!nsInputStreamPump::OnStateStop+0x00000000000000DE (c:\slave_coop\trunk_winxp\mozilla\netwerk\base\src\nsinputstreampump.cpp, line 577)
necko!nsInputStreamPump::OnInputStreamReady+0x0000000000000090 (c:\slave_coop\trunk_winxp\mozilla\netwerk\base\src\nsinputstreampump.cpp, line 401)
xpcom_core!nsInputStreamReadyEvent::Run+0x000000000000004A (c:\slave_coop\trunk_winxp\mozilla\xpcom\io\nsstreamutils.cpp, line 112)
xpcom_core!nsThread::ProcessNextEvent+0x00000000000001FA (c:\slave_coop\trunk_winxp\mozilla\xpcom\threads\nsthread.cpp, line 511)
xpcom_core!NS_InvokeByIndex_P+0x0000000000000027 (c:\slave_coop\trunk_winxp\mozilla\xpcom\reflect\xptcall\src\md\win32\xptcinvoke.cpp, line 102)
xpc3250!XPCWrappedNative::CallMethod+0x0000000000001313 (c:\slave_coop\trunk_winxp\mozilla\js\src\xpconnect\src\xpcwrappednative.cpp, line 2342)
xpc3250!XPC_WN_CallMethod+0x0000000000000181 (c:\slave_coop\trunk_winxp\mozilla\js\src\xpconnect\src\xpcwrappednativejsops.cpp, line 1480)
js3250!js_Invoke+0x0000000000000A2B (c:\slave_coop\trunk_winxp\mozilla\js\src\jsinterp.c, line 1023)
js3250!js_Interpret+0x000000000000C1F9 (c:\slave_coop\trunk_winxp\mozilla\js\src\jsinterp.c, line 3863)
js3250!js_Execute+0x0000000000000354 (c:\slave_coop\trunk_winxp\mozilla\js\src\jsinterp.c, line 1265)
js3250!JS_ExecuteScript+0x0000000000000057 (c:\slave_coop\trunk_winxp\mozilla\js\src\jsapi.c, line 4768)
xpcshell!ProcessFile+0x0000000000000100 (c:\slave_coop\trunk_winxp\mozilla\js\src\xpconnect\shell\xpcshell.cpp, line 660)
xpcshell!Process+0x000000000000009D (c:\slave_coop\trunk_winxp\mozilla\js\src\xpconnect\shell\xpcshell.cpp, line 739)
xpcshell!ProcessArgs+0x00000000000003E8 (c:\slave_coop\trunk_winxp\mozilla\js\src\xpconnect\shell\xpcshell.cpp, line 858)
xpcshell!main+0x000000000000088E (c:\slave_coop\trunk_winxp\mozilla\js\src\xpconnect\shell\xpcshell.cpp, line 1433)

The last time I ran xpcshell tests (mid-November?) I didn't hit this, but I just hit it when I ran them now.  This looks to be a regression; I'll do some regression-finding to figure out what happened here.


Second, many if not most of the failing mochitests look like they're doing focus-y things.  I've had (and am still having) troubles with the app window not being focused, and I wonder whether this is the same problem.  I don't have any idea why regular builds wouldn't hit it as well.
Comment 11 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2008-04-03 23:05:02 PDT
Reassigning to correct component to track setting up infrastructure like this.

Also, from comment#9, at least some of this work is now done. What else needs to be done here?
Comment 12 Serge Gautherie (:sgautherie) 2008-08-17 01:44:26 PDT
(In reply to comment #9)
> We have 3 tinderboxes (so far) building debug builds and running the same the
> tests as the unittest machines against them while collecting leak and bloat
> data (bug 397724). These machines are named qm-leak* and are currently
> reporting to the MozillaTest tinderbox tree.

Currently,

1)
<http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaTest>
doesn't seem to have such boxes (anymore).

2)
<http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox>
(now !?) has
"Linux mozilla-central leak test build"
"OS X 10.5.2 mozilla-central leak test build"
"WINNT 5.2 mozilla-central leak test build"

It would seem that they are debug builds and doing leak tests (only).
It would be good if they could do |make check| (or equivalent) too (as a next step).
(See bug 448802 for what needs fixing and then should not regress.)

3)
<http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox>
reads
"fxdbug-linux-tbox, bm-xserve11, fxdbug-win32-tb Leak tests on a debug compile. Builds continuously, tinderbox client, VM. Assertions are fatal (and thus cause orange)"

It seems these 3 names are obsolete?
At least, it doesn't say which page these boxes report to...
An update is needed!
Comment 13 Ted Mielczarek [:ted.mielczarek] 2008-08-17 10:50:20 PDT
http://tinderbox.mozilla.org/showbuilds.cgi?tree=LeakTest

Looks like a few of them fell off. Not surprising because nobody's really paying attention to them.
Comment 14 Serge Gautherie (:sgautherie) 2008-08-17 13:26:38 PDT
(In reply to comment #13)
> http://tinderbox.mozilla.org/showbuilds.cgi?tree=LeakTest

"Linux qm-leak-centos5-01 dep Leak+Unit test" is RED, but cycles fine.

"MacOSX qm-leak-macosx-01 dep Leak+Unit test" is YELLOW, with
"Started 2008/08/14 02:03, still building..
3 days, 11 hours, 10 minutes elapsed"

There doesn't seem to be a Windows box !?

It looks like they are (still) using cvs instead of hg !?
Comment 15 Chris Cooper [:coop] 2008-08-17 17:16:30 PDT
(In reply to comment #14)
> There doesn't seem to be a Windows box !?

The debug logs for a full unittest run with leak/bloat logging enabled on Windows are well over 30GB for a single run. Until recently, we didn't have sufficient disk space on the Windows VMs to be able to keep two sets of logs around to perform the required comparison between runs. Hence, no Windows data.

However, even now that we have a logging solution in place, these existing 2.0 boxes have been running for over a year and getting developers to care about their output continues to be like pulling teeth. All the current debug+leaktest+unittest machines are set up to test 1.8.0-era code. Testing 1.9 or mozilla-central seems like a better plan, but I know I'm not going to bother if no one is going to look at them. To date, no one has asked for updated coverage.

Someone needs to make a call about how valuable and necessary this type of debug information is. The releng team has no problem setting up these boxes if there is real interest, but we're not going to bend over backwards for a pet project that someone checks in on once a quarter.
Comment 16 Serge Gautherie (:sgautherie) 2008-08-17 18:16:04 PDT
(In reply to comment #15)
> Hence, no Windows data.

I only wanted to know whether it had "died".
The lack of a Windows box is not a problem, atm.

> Testing the 1.9 or mozilla-central seems like a better plan, but I know I'm not going to bother if no one is going to look at them. To date, no one has asked for updated coverage.

Yes, I think 1.9.1/m-c (only) is what would be needed, atm.
(In any case, 1.8.0 is obsolete.)

> Someone needs to make a call about how valuable and necessary this type of
> debug information is.

I am surely not the one to decide.
But test coverage has increased a lot lately, and debug+(leak)+unit seems like the next step, possibly not that far away (from GREEN).
Eventually, I would expect these boxes to move from "LeakTest" page to the main Firefox page.

Benjamin, Mike ?
Comment 17 Robert Strong [:rstrong] (use needinfo to contact me) 2008-08-18 14:16:34 PDT
I personally believe running unit tests on a debug build would be a *good thing*. Also, sdwilsh mentioned he found an assertion while writing a unit test that uncovered an underlying bug we wouldn't have easily found otherwise.
Comment 18 Serge Gautherie (:sgautherie) 2008-08-25 14:36:47 PDT
(In reply to comment #14)
> > http://tinderbox.mozilla.org/showbuilds.cgi?tree=LeakTest
> 
> "Linux qm-leak-centos5-01 dep Leak+Unit test" is RED, but cycles fine.

(No change: this box still cycles fully Red.)

> "MacOSX qm-leak-macosx-01 dep Leak+Unit test" is YELLOW, with
> "Started 2008/08/14 02:03, still building..
> 3 days, 11 hours, 10 minutes elapsed"

This box is "back" :-)

That build "never" ended ... now "11 days, 12 hours, 16 minutes elapsed";
but was eventually aborted as is.

Next build is Red:
http://tinderbox.mozilla.org/showlog.cgi?log=LeakTest/1219075259.1219084259.16531.gz
MacOSX qm-leak-macosx-01 dep Leak+Unit test on 2008/08/18 09:00:59

Now and then, there is a few Orange.
Comment 19 Serge Gautherie (:sgautherie) 2008-08-25 14:47:51 PDT
(In reply to comment #16)
> Yes, I think 1.9.1/m-c (only) is what would be needed, atm.
> (In any case, 1.8.0 is obsolete.)

Now that bug 448802 is fixed, (and while/so it stays that way,)
could these boxes (at least the Linux one, to start with) be recycled [pun intended ;-)] to 1.9.1/m-c builds ?

> Benjamin, Mike ?
Comment 20 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2008-10-03 08:26:15 PDT
I think what we need for developers to get these builds into a good state is a slightly different configuration than the one originally done.  In particular:

 * we should not make assertions fatal yet, but instead print the number of assertions to the waterfall and graph it on the graph server

 * we should only make leaks fatal on some of the tests, but we should print the leaks number for the other tests to the waterfall and graph it on the graph server

This would give us green tinderboxes that measure the things that we'd like to improve (and, when zero, turn into conditions that would make the tinderboxes orange).
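The counting step in this proposal can be sketched in a few lines. This is an illustrative sketch only (not from the actual buildbot patches attached to this bug): it scans a test log for the standard Gecko `###!!! ASSERTION:` banner, such as the one quoted in comment 10, and reports total and unique counts, the kind of number that would be printed to the waterfall and sent to the graph server.

```python
import re

# Standard Gecko assertion banner, e.g.:
# ###!!! ASSERTION: Content type should be known by now.: '!mContentType.IsEmpty()', file ..., line 386
ASSERTION_RE = re.compile(r"###!!! ASSERTION: (?P<msg>.*?): '(?P<cond>[^']*)'")

def count_assertions(log_text):
    """Return (total, unique-message) assertion counts for one test log."""
    messages = [m.group("msg") for m in ASSERTION_RE.finditer(log_text)]
    return len(messages), len(set(messages))

sample = """\
###!!! ASSERTION: Content type should be known by now.: '!mContentType.IsEmpty()', file nsUnknownDecoder.cpp, line 386
###!!! ASSERTION: Content type should be known by now.: '!mContentType.IsEmpty()', file nsUnknownDecoder.cpp, line 386
###!!! ASSERTION: unexpected state: 'mState == OK', file foo.cpp, line 1
"""
print(count_assertions(sample))  # → (3, 2)
```

A non-zero total would be reported rather than treated as failure, matching the "not fatal yet, but graphed" stance above.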
Comment 21 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-10-03 08:28:21 PDT
(In reply to comment #20)
> This would give us green tinderboxes that measure the things that we'd like to
> improve (and, when zero, turn into conditions that would make the tinderboxes
> orange).

Can I get that on a T-shirt?
Comment 22 Serge Gautherie (:sgautherie) 2008-10-03 11:24:53 PDT
(In reply to comment #20)
>  * we should not make assertions fatal yet, but instead print the number of
> assertions to the waterfall and graph it on the graph server

Sure, I think there is an environment variable to make them non-fatal ?
Displaying their number would be good, yes !

>  * we should only make leaks fatal on some of the tests, but we should print
> the leaks number for the other tests to the waterfall and graph it on the graph
> server

There's the '--leak-threshold' for mochitests;
I'm not sure if the additional "debug-only" leaks are checked by the log parser yet ?
Leak display is bug 456274.

> This would give us green tinderboxes that measure the things that we'd like to
> improve (and, when zero, turn into conditions that would make the tinderboxes
> orange).

I concur !
Comment 23 Jesse Ruderman 2008-10-03 12:31:59 PDT
Adding a graph or waterfall data seems suboptimal to me, because people have to monitor it and work backwards when they notice an increase.

I'd prefer something similar in spirit to --leak-threshold, where known bugs don't turn things orange, but new bugs do.  For example, there could be lists of "known assertions" for mochitest and reftest, and the goal could be to get each list empty.
Comment 24 Serge Gautherie (:sgautherie) 2008-10-03 17:50:39 PDT
(In reply to comment #23)
> Adding a graph or waterfall data seems suboptimal to me, because people have to
> monitor it and work backwards when they notice an increase.

The waterfall is "needed" so you don't have to open the log to find out what is wrong.
A graph would be less essential, but nice to have, e.g. to see when the results improved if that wasn't noticed immediately.

> I'd prefer something similar in spirit to --leak-threshold, where known bugs

"like leaks" seems just fine to me.

> don't turn things orange, but new bugs do.  For example, there could be lists
> of "known assertions" for mochitest and reftest, and the goal could be to get
> each list empty.

That's what I'm currently trying to do with leaks: the "list" being (meta-)bugs.
Comment 25 Jesse Ruderman 2008-10-04 00:51:20 PDT
I want Tinderboxes to turn orange if a given assertion stops happening, and a day later, another one starts happening.  Having Tinderboxes only check the number of assertions won't do that.  Having them check each assertion against a list of known assertions will.
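The scheme described here can be sketched as a simple set comparison. This is a hypothetical illustration of the idea (the names and list contents are invented, not part of any real harness): a run goes orange both when an assertion appears that is not on the known list and when a known assertion stops firing.

```python
# Hypothetical list of known assertions for one test suite; the stated goal
# would be to drive this list to empty over time.
KNOWN_ASSERTIONS = {
    "Content type should be known by now.",
    "unexpected state",
}

def check_assertions(observed_messages):
    """Return (unexpected, missing); either list being non-empty means orange."""
    observed = set(observed_messages)
    unexpected = sorted(observed - KNOWN_ASSERTIONS)  # new bugs
    missing = sorted(KNOWN_ASSERTIONS - observed)     # fixed (or shifted) bugs
    return unexpected, missing

print(check_assertions(["unexpected state", "brand new assert"]))
# unexpected=["brand new assert"], missing=["Content type should be known by now."]
```

Checking for missing entries is what distinguishes this from a pure count: one assertion disappearing and a different one appearing the next day leaves the count unchanged but still turns the tree orange.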
Comment 26 Serge Gautherie (:sgautherie) 2008-10-04 08:45:21 PDT
That really is (all) the same kind of story as the leaks (thresholds)...
And I would welcome a list of expected assertions with their expected numbers;

But I think this bug is about getting 1+ boxes,
and if we're now thinking about this kind of enhancements,
then I would say we agree we need such boxes :->

Enhancements can come along later.
Comment 27 Serge Gautherie (:sgautherie) 2008-10-29 20:49:12 PDT
CC Marcia, as we briefly spoke about it at "Mozilla Camp Europe 2008".
Comment 28 Ted Mielczarek [:ted.mielczarek] 2008-10-29 22:10:18 PDT
(In reply to comment #26)
> But I think this bug is about getting 1+ boxes,
> and if we're now thinking about this kind of enhancements,
> then I would say we agree we need such boxes :->
> 
> Enhancements can come along later.

The reality is that we had these boxes on 1.9.0, and they got ignored because they were permanently orange. These are not "enhancements", they're necessary developments to make these boxes useful for developers. Without a reasonable way to manage the assertions/leaks on these boxes, there's no point in setting them up.
Comment 29 Jeff Walden [:Waldo] (remove +bmo to email) 2008-11-03 22:21:46 PST
Bug 462992 reminded me of this problem...

So how about, for starters, we simply have a tinderbox (or a few for different platforms, but I'll take even one over the current none) that does debug builds but doesn't run any tests? Then, as we incrementally fix tests in particular harnesses, we can add test-harness runs one by one.

If you think doing this with no tests initially is pointless, I'm pretty sure it's not hard to fix the assertions |make check| fires such that we could use that test run as a starting point.  I need to do that so that I can run |make check| at toplevel locally anyway, so it's going to happen one way or another for the nth time very soon.
Comment 30 Ted Mielczarek [:ted.mielczarek] 2008-11-04 03:37:15 PST
The "leak test build" machines on the Firefox tinderbox are currently debug builds that just run the bloat tests.
Comment 31 Jeff Walden [:Waldo] (remove +bmo to email) 2008-11-04 11:00:34 PST
Sigh, I'm so mentally out of the loop right now. How about we get them added there, then? The log suggests it's buildbot, so getting xpcshell tests running there shouldn't be too complicated.
Comment 32 Tony Mechelynck [:tonymec] 2008-11-04 13:46:36 PST
(In reply to comment #30)
> The "leak test build" machines on the Firefox tinderbox are currently debug
> builds that just run the bloat tests.

Is there some place from which these debug binaries can be downloaded? I believe this would be useful for (QA etc.) testers who can't or don't know how to compile Mozilla applications but are ready either to test bugs which only happen in debug builds, or to use debug builds to get more info concerning their "ordinary" bugs.
Comment 33 Jeff Walden [:Waldo] (remove +bmo to email) 2008-11-04 14:12:41 PST
No; most of the tests are organized around the assumption of having a checked-out tree, although I expect we'll remove that assumption eventually.  In particular, the xpcshell tests that start asserting every so often require xpcshell, which isn't part of packaged browser downloads.
Comment 34 Ted Mielczarek [:ted.mielczarek] 2008-11-04 14:53:47 PST
(In reply to comment #32)
> Is there some place from which these debug binaries can be downloaded? I
> believe this would be useful for (QA etc.) testers who can't or don't know how
> to compile Mozilla applications but are ready either to test bugs which only
> happen in debug builds, or to use debug builds to get more info concerning
> their "ordinary" bugs.

No, these builds are not uploaded. See bug 400083.
Comment 35 Ted Mielczarek [:ted.mielczarek] 2008-11-06 10:18:53 PST
See also my proposal in bug 463455, which seems to mirror what Waldo is arguing for.
Comment 36 Ted Mielczarek [:ted.mielczarek] 2008-11-07 12:15:08 PST
Apparently we only have one assertion in TUnit currently, bug 463578. Once that's fixed, I think we should start running TUnit on the debug tinderboxes. TUnit doesn't take much time, so it's an easy win.
Comment 37 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-01-08 15:49:39 PST
This was discussed a bit more in:
http://groups.google.com/group/mozilla.dev.planning/browse_thread/thread/570b66a3e430c0fa

Getting the configuration we need running seems like it's a simple combination of existing pieces that we already do:  debug builds combined with our existing way of running the unit tests.  The one trick is that fatal assertions would need to be disabled for the reftest/crashtest and mochitest/mochichrome/mochibrowser tests (i.e., we'd need to override setting XPCOM_DEBUG_BREAK=stack-and-abort with XPCOM_DEBUG_BREAK=stack or something like that).  If that makes it easier, we could probably set XPCOM_DEBUG_BREAK=stack for all reftest and mochitest runs, since it wouldn't have any effect on non-debug builds.

Then, to get assertions down for reftest and mochitest-based tests, we can have the test harnesses themselves deal with making *unexpected* assertions cause orange.  For reftest, this is bug 472557.
Comment 38 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2009-01-21 14:16:58 PST
*** Bug 464031 has been marked as a duplicate of this bug. ***
Comment 39 Jeff Walden [:Waldo] (remove +bmo to email) 2009-03-11 18:12:25 PDT
The absence of debug tinderboxen running unit tests is directly responsible for me having to find bug 482861 manually.
Comment 40 Jesse Ruderman 2009-03-12 00:03:52 PDT
I've had the same experience several times.
Comment 41 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2009-03-12 02:26:31 PDT
(tweaking summary because we do not use dedicated machines anymore; we'd just be queuing up new jobs to the same shared pool-of-slaves)


* We currently produce 5 different types of builds: opt, full-debug, partial-debug, nightly and release builds.

* Historically, we only run unittests on the partial-debug builds, because we generate those partial-debug builds *during* the unittest run. See details here http://oduinn.com/2009/02/26/making-unittest-life-better . 

* Once bug#421611 is fixed, we will be able to run unittest separate from the partial-debug build and can try running unittests on other types of builds, like full-debug builds being requested here. At that point, I propose we try running unittests-on-full-debug builds, and post the results to MozillaTest to see if they are perm-orange, or reliably green handling assertions correctly. This would also allow developers & QA time to fix any last minute snafus and also confirm that there is no functionality being lost by turning off partial-debug builds. As soon as unittests-on-full-debug builds give same test results as unittests-on-partial-debug builds, we can move results of unittest-on-full-debug builds to production tinderbox pages and disable unittests-on-partial-debug builds completely. Does that seem reasonable?


* Would we be able to run those same unittests on non-debug builds? In the next phase, I'd like to also run unittests on nightly, and release, builds.
Comment 42 Ted Mielczarek [:ted.mielczarek] 2009-03-12 04:05:06 PDT
(In reply to comment #41)
> * Would we be able to run those same unittests on non-debug builds? In the next
> phase, I'd like to also run unittests on nightly, and release, builds.

Yes, and we should continue to run non-debug unit tests. Debug unit tests will be quite slow, so we'll want the non-debug ones for faster cycle time, even if they're lacking the leak checking.
Comment 43 Ted Mielczarek [:ted.mielczarek] 2009-03-19 13:17:53 PDT
Moved bug 460282 and bug 463605 from blocking bug 421611 to blocking this bug, since they ostensibly block the rollout of this to our hourly/nightly tinderbox builds, not the feature itself.
Comment 44 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2009-05-08 18:11:40 PDT
After talking with dbaron last week about this, he feels we should be able to run unittests on debug builds, as most/all of the assertion problems have been fixed by now.

Can we do an experiment of running unittests on debug builds, and posting results to http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaTest to see if they work? If yes, great. If not, at least developers will see the errors and be able to fix them.
Comment 45 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2009-05-08 18:19:03 PDT
(In reply to comment #42)
> (In reply to comment #41)
> > * Would we be able to run those same unittests on non-debug builds? In the next
> > phase, I'd like to also run unittests on nightly, and release, builds.
> 
> Yes, and we should continue to run non-debug unit tests. Debug unit tests will
> be quite slow, so we'll want the non-debug ones for faster cycle time, even if
> they're lacking the leak checking.
urgh... 

How much slower is "quite slow"? Is it so much slower as to justify the extra load of running unittests *twice* per o.s. per checkin?
Comment 46 Chris AtLee [:catlee] 2009-05-08 18:52:06 PDT
(In reply to comment #45)
> (In reply to comment #42)
> > (In reply to comment #41)
> > > * Would we be able to run those same unittests on non-debug builds? In the next
> > > phase, I'd like to also run unittests on nightly, and release, builds.
> > 
> > Yes, and we should continue to run non-debug unit tests. Debug unit tests will
> > be quite slow, so we'll want the non-debug ones for faster cycle time, even if
> > they're lacking the leak checking.
> urgh... 
> 
> How much slower is "quite slow"? Is it so much slower as to justify the extra
> load of running unittests *twice* per o.s. per checkin?

My understanding was that we wanted to end up doing unittests on debug builds *and* on optimized builds.  Especially since we want to be running unittests on our nightly (optimized) builds.
Comment 47 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-05-08 20:22:07 PDT
(In reply to comment #44)
> After talking with dbaron last week about this, he feels we should be able to
> run unittests on debug builds, as most/all of the assertion problems have been
> fixed by now.

To clarify, it's not that the assertions are fixed.  It's that the reftests, crashtests, and mochitests should be run on these unit test machines in a configuration where assertions are not fatal.  (But I have a plan for making *unexpected* assertions turn the tree orange, but that doesn't require help from you guys.)  I think the make check / xpcshell tests should be run with  assertions being fatal (and thus causing orange), although I'm not sure (Waldo would know, and you can always try and see what happens).  I think that configuration may well even be the default for those tests.
Comment 48 Jeff Walden [:Waldo] (remove +bmo to email) 2009-05-09 00:16:30 PDT
(In reply to comment #45)
> How much slower is "quite slow"? Is it so much slower as to justify the extra
> load of running unittests *twice* per o.s. per checkin?

The slowdown is entirely justifiable.  I can point to at least one clear security bug which would never have made its way into builds if we were doing debug builds and running tests on them, and I don't doubt there are others I haven't seen.

(In reply to comment #47)
The check/xpcshell targets by default run with the abort-on-assert setting of the XPCOM_DEBUG_BREAK environment variable, which only affects debug builds and not what tinderbox currently produces.  We regress every so often into something that asserts, but it's infrequent and usually gets batted down in short order when I complain about it.  Most people don't run them in debug builds, but enough do that I see no reason why any effort should be made to make them not run with assertions being fatal.
Comment 49 Ted Mielczarek [:ted.mielczarek] 2009-05-09 05:40:41 PDT
(In reply to comment #45)
> How much slower is "quite slow"? Is it so much slower as to justify the extra
> load of running unittests *twice* per o.s. per checkin?

I'm not sure Waldo answered the right question here. Debug unit tests are clearly valuable (as the number of comments here alone might attest). However, even if we get them running, we should continue to run unit tests on release builds, as a) that's what we *ship to users*, and b) the speed difference will be large, and the release builds will provide faster results to developers.
Comment 50 Jesse Ruderman 2009-05-09 08:39:09 PDT
Once unit tests can be run on Tinderboxen that don't do their own builds (and split up), we'll have the option of making the debug unit tests faster than release unit tests by throwing more machines at them ;)
Comment 51 Lukas Blakk [:lsblakk] use ?needinfo 2009-06-01 16:52:23 PDT
Futuring for now while we work on running unittest on packaged builds.  Once the dependent bugs are closed this will come back to the foreground.
Comment 52 Chris AtLee [:catlee] 2009-07-10 12:34:32 PDT
Created attachment 387929 [details] [diff] [review]
[WIP] buildbot-configs for unittests on debug builds
Comment 53 Chris AtLee [:catlee] 2009-07-10 12:37:51 PDT
Created attachment 387931 [details] [diff] [review]
[WIP] buildbotcustom changes for running unittests on debug builds
Comment 54 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2009-07-30 17:04:52 PDT
While these are good to have, I don't believe these are truly blocking, so I'm removing them from the dependent-bug list:

460282  Audit source tree to ensure there's no code built into the product conditionally on ENABLE_TESTS
463605 make Mac OS X packaging use a packaging manifest
472557 make individual reftests fail when they assert

If these *should* be blocking dependent bugs, please re-insert them, and add a comment explaining why you think they should block.

Moving from Future, as this is a Q3 goal.
Comment 55 Ted Mielczarek [:ted.mielczarek] 2009-07-31 05:06:19 PDT
Yeah, I don't think bug 460282 or bug 463605 are required here. dbaron can answer whether bug 472557 is required.
Comment 56 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-07-31 08:26:01 PDT
Bug 472557 had the dependency the wrong way around -- it depends on this.  Jesse already fixed that.
Comment 57 Chris AtLee [:catlee] 2009-08-05 15:42:32 PDT
One data point for timing:

mochitest-plain took 1.5 hours on linux
everythingelse took over 4 hours before it got killed off (in crashtest)
Comment 58 Olli Pettay [:smaug] (TPAC) 2009-08-06 02:12:28 PDT
Any idea why it crashed? I've seen some JS assertions lately.
(The most common one is fixed already.)
There is also at least one fatal assertion coming from ogg which prevents
me from running all the tests. (That assertion is about a memory leak, I think.)
Comment 59 Chris AtLee [:catlee] 2009-08-10 13:51:44 PDT
Created attachment 393591 [details] [diff] [review]
[WIP] buildbot-configs for unittests on debug builds
Comment 60 Chris AtLee [:catlee] 2009-08-10 13:52:23 PDT
Created attachment 393592 [details] [diff] [review]
[WIP] buildbotcustom changes for running unittests on debug builds
Comment 61 Chris AtLee [:catlee] 2009-08-10 15:11:26 PDT
Created attachment 393630 [details]
Logs from mochitest-plain

mochitest-plain is currently failing on mozilla-1.9.1 with the above patches.  log is attached.
Comment 62 Chris AtLee [:catlee] 2009-08-11 08:18:21 PDT
Latest run on mozilla-1.9.1 on windows is stuck in mochitest-browser-chrome spewing this:

###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!: 'Error', file e:/builds/moz2_slave/mozilla-1.9.1-win32-debug/build/js/src/xpconnect/src/xpcwrappednativescope.cpp, line 764
###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!: 'Error', file e:/builds/moz2_slave/mozilla-1.9.1-win32-debug/build/js/src/xpconnect/src/xpcwrappednativescope.cpp, line 764
JavaScript error: chrome://mochikit/content/browser-test.js, line 89: testMessage is not defined
Comment 63 Chris AtLee [:catlee] 2009-08-11 08:37:34 PDT
So it was running for > 7 hours before I killed it.  The log is around 3.8 GB in size, most of which are the assertions listed above.

We need to fix this somehow before going live.  Make assertions fatal?  Make it less spammy?
Comment 64 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-08-11 08:39:56 PDT
(In reply to comment #63)
> So it was running for > 7 hours before I killed it.  The log is around 3.8 GB
> in size, most of which are the assertions listed above.
> 
> Make assertions fatal?

No.

> Make it less spammy?

Well, by fixing whatever's causing the problem, perhaps?

Was it really in an infinite loop, or was it just really slow because of all the assertions?
Comment 65 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-08-11 08:43:11 PDT
Also, was it working on mozilla-central, or are you testing on 1.9.1 only?
Comment 66 Chris AtLee [:catlee] 2009-08-11 08:46:46 PDT
(In reply to comment #64)
> (In reply to comment #63)
> > So it was running for > 7 hours before I killed it.  The log is around 3.8 GB
> > in size, most of which are the assertions listed above.
> > 
> > Make assertions fatal?
> 
> No.
> 
> > Make it less spammy?
> 
> Well, by fixing whatever's causing the problem, perhaps?

Sure, but in the meanwhile we need to make sure this doesn't tie up our slaves for too long.

> Was it really in an infinite loop, or was it just really slow because of all
> the assertions?

Hard to tell, I can get you the log if you want :)
Comment 67 Chris AtLee [:catlee] 2009-08-11 08:47:20 PDT
(In reply to comment #65)
> Also, was it working on mozilla-central, or are you testing on 1.9.1 only?

It never got around to running on mozilla-central, it was stuck on 1.9.1 all night.
Comment 68 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-08-11 08:52:13 PDT
(In reply to comment #66)
> Hard to tell, I can get you the log if you want :)

Yes, please.
Comment 69 Chris AtLee [:catlee] 2009-08-11 09:22:14 PDT
(In reply to comment #68)
> (In reply to comment #66)
> > Hard to tell, I can get you the log if you want :)
> 
> Yes, please.

http://people.mozilla.org/~catlee/1-log-mochitest-browser-chrome-stdio.gz

This is in buildbot's chunked encoding.  It's mostly readable, but you'll notice the occasional string like "5358:2" in the log.
Comment 70 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-08-11 12:07:32 PDT
For what it's worth, I'm pretty confident that log you posted is a hang; I'm still looking into it, though.
Comment 71 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-08-11 13:11:59 PDT
So mochitest-browser-chrome worked for me on 1.9.1 on a Linux debug build.

That said, from the log, it looks like you hit a known random orange:  bug 498339.  But since you were in a debug build, the infinite loop in question was producing output.  Is there a way we could make buildbot detect the process as hung if |timeout| seconds of output don't contain the string "TEST-", rather than checking for no output at all?
Comment 72 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-08-11 13:12:55 PDT
Perhaps a simpler possibility is just having a log size limit, and killing the process if it exceeds that?
Comment 73 Chris AtLee [:catlee] 2009-08-11 13:45:12 PDT
(In reply to comment #71)
> So mochitest-browser-chrome worked for me on 1.9.1 on a Linux debug build.
> 
> That said, from the log, it looks like you hit a known random orange:  bug
> 498339.  But since you were in a debug build, the infinite loop in question was
> producing output.  Is there a way we could make buildbot detect the process as
> hung if |timeout| seconds of output don't contain the string "TEST-", rather
> than checking for no output at all?

Yeah, I've got an upstream patch to buildbot checked in that sets a maximum run time for these shell commands, regardless of whether output is being generated.  We'll get this either when we upgrade to buildbot 0.7.13, or earlier if we decide to cherry-pick those changes.

Having a maximum on the log size sounds like a great idea as well.
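The distinction catlee is drawing could look roughly like this in a buildbot master config (a sketch assuming the upstream maxTime patch slated for 0.7.13; the step name, command, and values are illustrative, not the real config):

```python
from buildbot.steps.shell import ShellCommand

# timeout= fires only on *silence*; maxTime= caps total wall-clock time
# even if the command keeps producing output, which is what the
# assertion-spamming hangs above need.
mochitest = ShellCommand(
    name="mochitest-plain",
    command=["python", "runtests.py", "--autorun", "--close-when-done"],
    timeout=1200,        # kill after 20 minutes with no output at all
    maxTime=3 * 3600,    # kill after 3 hours regardless of output
)
```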
Comment 74 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-08-14 09:56:41 PDT
To clarify where I think we are on this (since there seems to have been some confusion):

I think we need *some* fix that makes hangs which generate output in an infinite loop behave reasonably, since I think some of our known random oranges hang in that manner on debug builds.  (I think that's what catlee's log showed.)

So once that's done, presumably these builds can actually be started running, except they'll be orange.  Once they're running somewhere and orange, we'll know exactly what we need to fix to get them green, and it should be pretty trivial to do so.  (I have a guess at the set of patches necessary, but I can't know for sure until I see what failures the builders actually hit.)

So I think the way to make progress here is:
  1. fix the hang detection issue in an appropriate way
  2. start the builders and have them report somewhere (e.g., MozillaTest)
  3. I or somebody else will land the necessary orange fixes at that time (and we'll know what's necessary by seeing what's actually failing)
  4. builders get moved from MozillaTest (or wherever) to their final location
Comment 75 Serge Gautherie (:sgautherie) 2009-08-14 10:25:08 PDT
(In reply to comment #74)

> I think some of our known random orange hangs in that manner on debug builds.

Or, if we know which tests cause trouble, we could mark them "todo" (on debug builds) for the time being...
Comment 76 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2009-08-14 11:54:29 PDT
switching some dep.bugs into blocked.bugs, after discussion with dbaron on irc. 

Basically, once we start running unittests on debug builds, and posting to MozillaTest, he will start landing the patches in these dep^H^H^H blocked bugs. This way, he can easily see improvements and what followon fixes are needed.
Comment 77 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-21 09:34:08 PDT
Staging-master2 is running the packaged-debug-unittest builds now, reporting to MozillaTest at the moment.
Comment 78 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-21 09:35:09 PDT
Created attachment 395854 [details] [diff] [review]
[WIP] buildbot-configs for unittests on debug builds (updated)

updated for new separate staging-masters
Comment 79 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-21 09:36:01 PDT
Created attachment 395855 [details] [diff] [review]
[WIP] buildbotcustom changes for running unittests on debug builds (updated)
Comment 80 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-21 12:07:04 PDT
Created attachment 395900 [details] [diff] [review]
Production buildbot-configs for unittests on debug builds (updated)
Comment 81 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-21 12:09:11 PDT
So with the buildbotcustom changes and the buildbot-config patch we should be able to turn this on in production - we will also need to update to 0.7.13 to pick up the changes catlee made to maximum run time for shell commands.

The packaged unittest runs as well as the packaged debug unittest runs would both report to tinderbox: Firefox-Unittest
Comment 82 Ben Hearsum (:bhearsum) 2009-08-21 13:03:52 PDT
Comment on attachment 395900 [details] [diff] [review]
Production buildbot-configs for unittests on debug builds (updated)

Flip this for all branches, please. I know we're not enabling the builds for all of them, but we need to make sure that the leak test portion is still comparable.
Comment 83 Ben Hearsum (:bhearsum) 2009-08-21 13:05:45 PDT
Comment on attachment 395855 [details] [diff] [review]
[WIP] buildbotcustom changes for running unittests on debug builds (updated)

The factory.py part of this is fine, but we need to be able to flip this on or off per branch. You'll need to add an enable_debug_unittests parameter in config.py, and test for it in generateBranchObjects.
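The per-branch switch bhearsum asks for could be shaped like this (a hypothetical sketch of the buildbot-configs side; the key name follows the later patches in this bug, and the branch entries and helper are illustrative):

```python
# Per-branch opt-in flag in config.py, consulted by generateBranchObjects.
BRANCHES = {
    'mozilla-central': {
        'enable_packaged_debug_unittests': True,
    },
    'mozilla-1.9.1': {
        'enable_packaged_debug_unittests': False,
    },
}

def debug_unittests_enabled(branch_config):
    # .get() keeps older branch configs that lack the key working unchanged.
    return bool(branch_config.get('enable_packaged_debug_unittests', False))
```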
Comment 84 Nick Thomas [:nthomas] 2009-08-24 00:08:52 PDT
(In reply to comment #77)
> Staging-master2 is running the packaged-debug-unittest builds now, reporting to
> MozillaTest at the moment.

The mochitest-browser-chrome log from a "WINNT 5.2 tracemonkey test debug everythingelse" build on staging-master2 grew to 21 GB, and filled up the /builds partition on the machine. I've bzipped up the first 1e7 lines and put them at
 http://people.mozilla.org/~nthomas/misc/0-log-mochitest-browser-chrome-stdio-PART.bz2
if you want to take a look. There's a lot of this
###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!: 'Error', file e:/builds/moz2_slave/tracemonkey-win32-debug/build/js/src/xpconnect/src/xpcwrappednativescope.cpp, line 786
xul!DumpJSValue+0x000000000002BD5B
xul!DumpJSValue+0x000000000003D772
xul!DumpJSValue+0x000000000003C1FA
xul!DumpJSValue+0x0000000000035D16
xul!DumpJSValue+0x00000000000A71E4
xul!DumpJSValue+0x00000000000AC3B1
...

I think we need a guard against these large logs happening in production, and we need to resolve the assertion issue above.
Comment 85 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2009-08-24 01:53:11 PDT
(In reply to comment #84)
> (In reply to comment #77)

> There's a lot of this
> ###!!! ASSERTION: XPConnect is being called on a scope without a 'Components'
> property!: 'Error', file
> e:/builds/moz2_slave/tracemonkey-win32-debug/build/js/src/xpconnect/src/xpcwrappednativescope.cpp,
> line 786
> xul!DumpJSValue+0x000000000002BD5B
> xul!DumpJSValue+0x000000000003D772
> xul!DumpJSValue+0x000000000003C1FA
> xul!DumpJSValue+0x0000000000035D16
> xul!DumpJSValue+0x00000000000A71E4
> xul!DumpJSValue+0x00000000000AC3B1
> ...
> 
> I think we need a guard against the large logs happening in production, and
> resolve the issue above.

dbaron: iirc, you had fixes in hand for some ASSERTIONS. We've now got these showing on MozillaTest, fyi. 

If this is the scenario you were talking about, any chance you could land your patches, so we can reduce the ASSERTION noise and figure out what's still to do?

tc
John.
Comment 86 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-24 14:42:01 PDT
Created attachment 396301 [details] [diff] [review]
Production buildbot-configs for unittests on debug builds - for all debug mozconfigs
Comment 87 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-24 14:43:21 PDT
Created attachment 396302 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds - now as a config parameter

Ok, so this version lets you set in config.py if enable_packaged_debug_unittests is set for a particular branch.
Comment 88 Ben Hearsum (:bhearsum) 2009-08-28 06:30:47 PDT
Comment on attachment 396302 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds - now as a config parameter

>diff --git a/misc.py b/misc.py
>@@ -278,12 +285,17 @@
>         codesighs = True
>         uploadPackages = True
>         uploadSymbols = False
>+        packageTests = False
>+        unittestBranch = "%s-%s-unittest" % (name, platform)
>         talosMasters = config['talos_masters']
>         if platform.find('-debug') > -1:
>             leakTest = True
>             codesighs = False
>-            uploadPackages = False
>+            uploadPackages = True
>             talosMasters = None
>+            packageTests = True
>+            # Platform already has the -debug suffix
>+            unittestBranch = "%s-%s-unittest" % (name, platform)

None of this behaviour should change unless we're going to be testing packaged debug builds. Eg, uploadPackages and packageTests should be False for debug unless we're testing the builds.
Comment 89 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-28 07:01:47 PDT
Created attachment 397266 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds - take 3

now testing for enable_packaged_debug_unittest to upload debug builds and run packaged tests
Comment 90 Ben Hearsum (:bhearsum) 2009-08-28 13:58:46 PDT
Comment on attachment 397266 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds - take 3

>@@ -278,12 +285,19 @@
>         codesighs = True
>         uploadPackages = True
>         uploadSymbols = False
>+        packageTests = True
>+        unittestBranch = "%s-%s-unittest" % (name, platform)
>         talosMasters = config['talos_masters']
>         if platform.find('-debug') > -1:
>             leakTest = True
>             codesighs = False
>-            uploadPackages = False
>+            if not config['enable_packaged_debug_unittests']:
>+                uploadPackages = False
>+                packagedTests = False
>             talosMasters = None
>+            packageTests = True

This looks wrong, packageTests shouldn't be getting overridden here.

r=me with that fixed.
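The bug bhearsum flags in both reviews is the same one: the debug branch of the patch re-set `packageTests = True` unconditionally (and once assigned to a misspelled `packagedTests`). A minimal sketch of the intended corrected logic from misc.py (the function wrapper is illustrative; the flag names come from the patches above):

```python
def build_flags(platform, branch_config):
    """Return (uploadPackages, packageTests) for a build platform."""
    uploadPackages = True
    packageTests = True
    if '-debug' in platform:
        # Debug builds only upload and package tests when the branch
        # explicitly opts in to packaged debug unittests.
        if not branch_config.get('enable_packaged_debug_unittests'):
            uploadPackages = False
            packageTests = False
    return uploadPackages, packageTests
```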
Comment 91 Lukas Blakk [:lsblakk] use ?needinfo 2009-08-28 15:15:06 PDT
Created attachment 397347 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds - take 4

fixed as per bhearsum's last comments
Comment 92 Lukas Blakk [:lsblakk] use ?needinfo 2009-09-03 12:02:13 PDT
Got this running on staging-master2 with libxul enabled in the mozconfig using
catlee's repo.  All three platforms had similar errors like:

error while loading shared libraries: libmozz.so: cannot open shared object file: No such file or directory
Timed out while waiting for server startup.

This was due to a hard-coded 45-second timeout in runtests.py.
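The startup wait being discussed (and later fixed by attachment 401528, which raises the budget to 3 minutes on debug builds) amounts to a poll loop like the following. This is a hedged sketch, not the actual runtests.py code; the function and parameter names are made up:

```python
import time

def wait_for_server(is_up, debug_build, poll=0.1):
    """Poll is_up() until it returns True or the startup budget expires.

    Debug builds get 180s in place of the hard-coded 45s that caused
    the "Timed out while waiting for server startup" failures above.
    """
    deadline = time.time() + (180 if debug_build else 45)
    while time.time() < deadline:
        if is_up():
            return True
        time.sleep(poll)
    return False
```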
Comment 93 Ted Mielczarek [:ted.mielczarek] 2009-09-03 13:36:54 PDT
(In reply to comment #92)
> Got this running on staging-master2 with libxul enabled in the mozconfig using
> catlee's repo.  All three platforms had similar errors like:
> 
> error while loading shared libraries: libmozz.so:                           
> cannot open shared object file: No such file or                           
> directory

This indicates to me that you switched some builds to --enable-libxul without a clobber. Try clobbering them?
Comment 94 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-09-04 07:19:49 PDT
(In reply to comment #85)
> If this is the scenario you were talking about, any chance you could land your
> patches, so we can reduce the ASSERTION noise, and figure out whats
> still-to-do?

This isn't the issue I was talking about; these are not fatal assertions (NS_ABORT_IF_FALSE).

Is the issue in comment 84 specific to tracemonkey, or is it happening on mozilla-central?  Is it happening on all platforms?

It might be more useful to answer those questions in bug 510489 than here, though.

(In reply to comment #77)
> I think we need a guard against the large logs happening in production, and
> resolve the issue above.

Absolutely; see comments 72-74.
Comment 95 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-09-04 07:23:21 PDT
(In reply to comment #85)
> dbaron: iirc, you had fixes in hand for some ASSERTIONS. We've now got these
> showing on MozillaTest, fyi. 

Also, I don't see any builds with "debug" in the name on MozillaTest.  What's the naming pattern for these builds?
Comment 96 Lukas Blakk [:lsblakk] use ?needinfo 2009-09-04 08:00:56 PDT
That's because they were running in a staging environment which was re-configured for other work.  Will have to look into getting this running again.
Comment 97 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-09-09 05:32:40 PDT
Did the work in comments 77 through 93 resolve issue (1) in comment 74?  That is, did it fix the large-log detection issue?

Or, to ask it another way, is the next step to getting this up and running:
 (1) fixing the large log issue
 (2) getting the builds actually showing up somewhere so developers can look at the orange, or
 (3) getting developers actually looking at the orange, since they're running somewhere and I just haven't heard?
Comment 98 Chris AtLee [:catlee] 2009-09-15 08:27:41 PDT
No, we're still having large log issues.  Currently we're limiting the runtime of the tests to an hour, but that still generates a 600MB log, which then makes all sorts of things blow up.

We need to figure out how to decrease the amount of output being generated, cap the log at some maximum size, or somehow strip redundant output from the log after the fact.

I'll look at specifying a maximum log size.
Comment 99 Chris AtLee [:catlee] 2009-09-16 06:53:14 PDT
Created attachment 401010 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds
Comment 100 Chris AtLee [:catlee] 2009-09-16 06:54:43 PDT
Comment on attachment 401010 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds

Deploying this is dependent on the maxTime buildbot patch, and also something that limits the log size.
Comment 101 Chris AtLee [:catlee] 2009-09-16 06:59:01 PDT
Created attachment 401011 [details] [diff] [review]
buildbot-configs for unittests on debug builds
Comment 102 Ben Hearsum (:bhearsum) 2009-09-16 07:05:29 PDT
Comment on attachment 401010 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds


>@@ -282,12 +293,19 @@
>         codesighs = True
>         uploadPackages = True
>         uploadSymbols = False
>+        packageTests = True
>+        unittestBranch = "%s-%s-unittest" % (name, platform)
>         talosMasters = config['talos_masters']
>         if platform.find('-debug') > -1:
>             leakTest = True
>             codesighs = False
>-            uploadPackages = False
>+            if not config.get('enable_packaged_debug_unittests'):
>+                uploadPackages = False
>+                packagedTests = False

Typo here, I think. s/packaged/package/

r=me with that fixed
Comment 103 Ben Hearsum (:bhearsum) 2009-09-16 07:15:42 PDT
Comment on attachment 401011 [details] [diff] [review]
buildbot-configs for unittests on debug builds

catlee, ted, and I just chatted in #build about this. We decided it would be best to only turn these on for mozilla-central to start with for a couple of reasons:
* limits the impact on load
* lets us iron out the kinks before deploying to project branches and/or release branches
Comment 104 Chris AtLee [:catlee] 2009-09-16 07:33:41 PDT
Created attachment 401014 [details] [diff] [review]
buildbot-configs for unittests on debug builds, take 2
Comment 105 Ben Hearsum (:bhearsum) 2009-09-16 07:35:01 PDT
Comment on attachment 401014 [details] [diff] [review]
buildbot-configs for unittests on debug builds, take 2

If we're turning on --enable-libxul for all the linux/mac project branches we should do the same for the Windows ones. r=me with that change.
Comment 106 Ben Hearsum (:bhearsum) 2009-09-16 07:41:54 PDT
(In reply to comment #105)
> (From update of attachment 401014 [details] [diff] [review])
> If we're turning on --enable-libxul for all the linux/mac project branches we
> should do the same for the Windows ones. r=me with that change.

Catlee reminded me that --enable-libxul greatly increases build time on Windows, so turning it on when not needed would be pretty costly. r=me on this patch as-is.
Comment 107 Chris AtLee [:catlee] 2009-09-18 14:00:04 PDT
Created attachment 401528 [details] [diff] [review]
Increase server startup timeout to 3 minutes on debug builds
Comment 108 Ted Mielczarek [:ted.mielczarek] 2009-09-18 14:03:03 PDT
Comment on attachment 401528 [details] [diff] [review]
Increase server startup timeout to 3 minutes on debug builds

We should look into reducing the timeout in the non-debug case, but it's not a big deal.
Comment 109 Ted Mielczarek [:ted.mielczarek] 2009-09-21 05:04:41 PDT
Comment on attachment 401528 [details] [diff] [review]
Increase server startup timeout to 3 minutes on debug builds

Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/018e4e830527
Comment 110 Chris AtLee [:catlee] 2009-09-23 09:20:31 PDT
Created attachment 402374 [details] [diff] [review]
buildbot-configs for unittests on debug builds, take 3

Same as before, with the addition of logMaxSize to buildbot master config.  I've set it to 50 MB right now, which is 4x larger than our typical largest log.
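The master-side cap described here is just a global setting in the buildbot master config (the `logMaxSize` key is the one named in this comment; the surrounding dict shape follows standard master.cfg conventions):

```python
# master.cfg fragment: truncate any step log past 50 MB, roughly 4x our
# typical largest log, instead of storing multi-GB assertion spam.
c = BuildmasterConfig = {}
c['logMaxSize'] = 50 * 1024 * 1024  # bytes
```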
Comment 111 Chris AtLee [:catlee] 2009-09-24 05:54:07 PDT
Comment on attachment 401010 [details] [diff] [review]
buildbotcustom changes for running unittests on debug builds

changeset:   417:d64407e7fdaf
Comment 112 Chris AtLee [:catlee] 2009-09-24 05:56:14 PDT
Comment on attachment 402374 [details] [diff] [review]
buildbot-configs for unittests on debug builds, take 3

changeset:   1543:5def63176a0b
Comment 113 Chris AtLee [:catlee] 2009-09-24 13:24:39 PDT
Created attachment 402646 [details] [diff] [review]
Fixes for tests on debug builds

This fixes the upload location for debug builds, as well as passing in --utility-path=bin to reftest/crashtest tests.  runreftest.py doesn't currently support this option, but I have every hope it will soon! :)
Comment 114 Ben Hearsum (:bhearsum) 2009-09-24 13:40:42 PDT
Comment on attachment 402646 [details] [diff] [review]
Fixes for tests on debug builds

This is basically fine, but I think it makes more sense to call the parameter tinderboxBuildsDir, to match the post_upload.py one. Calling it 'upload_dir' implies that there's only one place a build is being uploaded to, which isn't true for nightly builds.
Comment 115 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-09-24 23:01:04 PDT
catlee told me on IRC a few hours ago that these unit test machines are now up and running.  They're currently hidden, but can be seen on:
 http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox-Unittest&noignore=1

He said, furthermore, that the reftest/crashtest orange is a known issue with the automation (the timeout being too short, I think).

The worst remaining problem is that mochitest-browser-chrome and mochitest-plain are always orange.  Some of the other tests are intermittently orange.

I investigated mochitest-browser-chrome, and could reproduce the problem locally (and what I saw matches the minidump in the Linux log (which had library names but no symbols); there was no minidump in the Windows log).  The problem is bug 517349, which is currently fixed-on-tracemonkey.

I still haven't investigated the mochitest-plain perma-orange or the intermittent mochitest-chrome orange.  (This would be easier if the minidump processing showed symbols...)
Comment 116 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-09-24 23:59:22 PDT
The windows mochitest-plain failure seems to be a crash on content/media/test/test_mozLoadFrom.html (preceded by a bunch of assertions).  This could be related to bug 518659.

I looked at three Linux mochitest-plain failures.  One was that it went past the timeout, so mochitest-plain also needs a longer timeout (like reftest and crashtest).

The other two were crashes during dom/tests/mochitest/geolocation/test_manyWindows.html


I think running unit tests on debug builds has regressed a bit in the past few months because everybody has been relying on the try server to run unit tests.
Comment 117 Ted Mielczarek [:ted.mielczarek] 2009-09-25 06:04:54 PDT
The problem with symbols on Linux debug builds is filed as bug 518634.
Comment 118 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-09-25 11:17:16 PDT
When I run mochitest-plain locally, I hit bug 498380; we could easily comment out that assertion if it happens on the tinderbox, but I haven't seen it happen there (though maybe we should anyway).
Comment 119 Ben Hearsum (:bhearsum) 2009-09-25 11:42:48 PDT
pushed a bustage fix for 1.9.1 mac builds:
bitters-2:mozilla-1.9.1 bhearsum$ hg diff
diff --git a/mozilla2-staging/macosx/mozilla-1.9.1/nightly b/mozilla2-staging/macosx/mozilla-1.9.1/nightly
deleted file mode 120000
--- a/mozilla2-staging/macosx/mozilla-1.9.1/nightly
+++ /dev/null
@@ -1,1 +0,0 @@
-../mozilla-central/nightly
\ No newline at end of file
diff --git a/mozilla2-staging/macosx/mozilla-central/nightly/mozconfig b/mozilla2-staging/macosx/mozilla-1.9.1/nightly/mozconfig
copy from mozilla2-staging/macosx/mozilla-central/nightly/mozconfig
copy to mozilla2-staging/macosx/mozilla-1.9.1/nightly/mozconfig
--- a/mozilla2-staging/macosx/mozilla-central/nightly/mozconfig
+++ b/mozilla2-staging/macosx/mozilla-1.9.1/nightly/mozconfig
@@ -1,14 +1,14 @@
 . $topsrcdir/build/macosx/universal/mozconfig
 
 ac_add_options --enable-application=browser
 ac_add_options --enable-update-channel=nightly
 ac_add_options --enable-update-packaging
-ac_add_options --enable-tests
+ac_add_options --disable-tests
 ac_add_options --enable-codesighs
 ac_add_options --disable-install-strip
 
 export CFLAGS="-gdwarf-2"
 export CXXFLAGS="-gdwarf-2"
 
 # For NSS symbols
 export MOZ_DEBUG_SYMBOLS=1
diff --git a/mozilla2/macosx/mozilla-1.9.1/nightly b/mozilla2/macosx/mozilla-1.9.1/nightly
deleted file mode 120000
--- a/mozilla2/macosx/mozilla-1.9.1/nightly
+++ /dev/null
@@ -1,1 +0,0 @@
-../mozilla-central/nightly
\ No newline at end of file
diff --git a/mozilla2/macosx/mozilla-central/nightly/mozconfig b/mozilla2/macosx/mozilla-1.9.1/nightly/mozconfig
copy from mozilla2/macosx/mozilla-central/nightly/mozconfig
copy to mozilla2/macosx/mozilla-1.9.1/nightly/mozconfig
--- a/mozilla2/macosx/mozilla-central/nightly/mozconfig
+++ b/mozilla2/macosx/mozilla-1.9.1/nightly/mozconfig
@@ -1,14 +1,14 @@
 . $topsrcdir/build/macosx/universal/mozconfig
 
 ac_add_options --enable-application=browser
 ac_add_options --enable-update-channel=nightly
 ac_add_options --enable-update-packaging
-ac_add_options --enable-tests
+ac_add_options --disable-tests
 ac_add_options --enable-codesighs
 ac_add_options --disable-install-strip
 
 export CFLAGS="-gdwarf-2"
 export CXXFLAGS="-gdwarf-2"
 
 # For NSS symbols
 export MOZ_DEBUG_SYMBOLS=1
bitters-2:mozilla-1.9.1 bhearsum$ hg commit -m "bug 486783: run unit tests on pre-existing nightly bulids - restore --disable-tests on the 1.9.1 nightlies. r=ted"
bitters-2:mozilla-1.9.1 bhearsum$ hg out
comparing with http://hg.mozilla.org/build/buildbot-configs/
 searching for changes
changeset:   1549:fff9ea67a4a5
tag:         tip
user:        Ben Hearsum <bhearsum@mozilla.com>
date:        Fri Sep 25 14:43:07 2009 -0400
summary:     bug 486783: run unit tests on pre-existing nightly bulids - restore --disable-tests on the 1.9.1 nightlies. r=ted

bitters-2:mozilla-1.9.1 bhearsum$ hg push ssh://hg.mozilla.org/build/buildbot-configs
pushing to ssh://hg.mozilla.org/build/buildbot-configs
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 2 changes to 2 files
Comment 120 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-09-26 20:44:07 PDT
One of the Linux debug everythingelse runs just went green:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1254015360.1254018228.2631.gz

Not sure why; maybe it picked up a faster slave or a low-load time.


Bug 517349 is fixed-on-mozilla-central since the tracemonkey merge (was it this morning or last night?), so that helps.  But I think a bunch of timeouts still need to be higher, and then we should get a better sense of what's left.
Comment 121 Ted Mielczarek [:ted.mielczarek] 2009-09-28 05:28:40 PDT
Created attachment 403224 [details] [diff] [review]
Add --utility-path to runreftest.py

Simple little patch, should let us fix the reftest on debug build bustage.
Comment 122 Chris AtLee [:catlee] 2009-09-29 07:50:22 PDT
Created attachment 403483 [details] [diff] [review]
Fixes for tests on debug builds

reftests and crashtests will (still) be broken until the patch to runreftest.py lands.  but I'm okay with that.
Comment 123 Chris AtLee [:catlee] 2009-09-30 04:01:40 PDT
Created attachment 403749 [details] [diff] [review]
Fixes for tests on debug builds, v3

Wrong patch attached last time I think?  Only change is the rename from upload_dir to tinderboxBuildsDir.
Comment 124 Ted Mielczarek [:ted.mielczarek] 2009-09-30 04:33:44 PDT
Comment on attachment 403224 [details] [diff] [review]
Add --utility-path to runreftest.py

Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/f2fd71991134
Comment 125 Ted Mielczarek [:ted.mielczarek] 2009-10-05 07:03:51 PDT
Comment on attachment 403224 [details] [diff] [review]
Add --utility-path to runreftest.py

Pushed to 1.9.2 and 1.9.1:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/07ad004b987e
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/0dbbf529b0c3
Comment 126 Ted Mielczarek [:ted.mielczarek] 2009-10-16 10:35:58 PDT
Comment on attachment 403224 [details] [diff] [review]
Add --utility-path to runreftest.py

And a follow-up on this because I forgot to make it handle relative paths:
http://hg.mozilla.org/mozilla-central/rev/f7ccf195ccba
Comment 127 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-10-16 14:33:26 PDT
What's the status on fixing that reftest and crashtest are timing out most of the time?  Is that still waiting for http://groups.google.com/group/mozilla.dev.tree-management/msg/00576a485c655e3a to happen, or has that happened, and there's something else wrong?  (I also find it curious that they don't time out all the time, though.)

Also, do we know why mochitest-plain is timing out?
Comment 128 Chris AtLee [:catlee] 2009-10-16 14:35:46 PDT
Still waiting for a good time to land the changes mentioned in that post.
Comment 129 David Baron :dbaron: ⌚️UTC+1 (busy September 14-25) 2009-10-17 08:54:36 PDT
Do any of the scheduled fixes deal with mochitest-plain timing out?  (Is it worth increasing the timeout?)
Comment 130 Ted Mielczarek [:ted.mielczarek] 2009-10-19 10:01:46 PDT
Comment on attachment 403224 [details] [diff] [review]
Add --utility-path to runreftest.py

Landed the followup on 1.9.2 and 1.9.1:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/570220ecd415
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/62e1d227ec3a
Comment 131 Chris AtLee [:catlee] 2009-10-20 06:28:21 PDT
Comment on attachment 403749 [details] [diff] [review]
Fixes for tests on debug builds, v3

changeset:   444:7541087e017c
Comment 132 Ted Mielczarek [:ted.mielczarek] 2009-10-22 06:41:12 PDT
These are up and running, this bug has served its purpose. I've filed bug 523385 to track the remaining issues (including tests that are failing).
