Closed Bug 422296 Opened 12 years ago Closed 12 years ago

Need debug+leak testing builds for mozilla-central

Categories

(Release Engineering :: General, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: benjamin, Assigned: bhearsum)

References

Details

Attachments

(8 files, 6 obsolete files)

4.80 KB, patch
bhearsum
: review+
Details | Diff | Splinter Review
4.58 KB, patch
bhearsum
: review+
Details | Diff | Splinter Review
1.17 KB, patch
nthomas
: review+
Details | Diff | Splinter Review
9.93 KB, patch
Details | Diff | Splinter Review
16.06 KB, patch
ted
: review+
Details | Diff | Splinter Review
1.30 KB, patch
ted
: review+
Details | Diff | Splinter Review
2.13 KB, patch
rcampbell
: review+
Details | Diff | Splinter Review
7.70 KB, patch
rcampbell
: review+
Details | Diff | Splinter Review
This is a companion to bug 417332 - we need debug+leaktest boxes for mozilla-central, like we have for CVS trunk.
Assuming you mean debug build & leaktest build machines, this should be in the Build&Release component, not here in QA component. Reassigning so it shows up in triage.
Component: Testing → Build & Release
Product: Core → mozilla.org
QA Contact: testing → build
Version: unspecified → other
Summary: Need debug+leak testing for mozilla2 → Need debug+leak testing builds for mozilla2
Blocks: 422754
Component: Build & Release → Release Engineering: Projects
Priority: -- → P3
QA Contact: build → release
John O - can we get an owner...
Raising priority, as its needed for moz2-central to be fully opened up. 
Component: Release Engineering: Future → Release Engineering
Priority: P3 → P1
Summary: Need debug+leak testing builds for mozilla2 → Need debug+leak testing builds for mozilla2-central
FWIW, you get some bloat logging for free when running unittests now *IF* you're using runtests.py instead of runtests.pl, because bloat logging is enabled by default in the python version.

Note: this doesn't remove the need for these boxes.
we could use a little clarity on this. Are these leak + debug builds comparable to what we're getting on trunk with fxdbug-linux-tbox and fx-linux-tbox or are you looking for full unittest coverage on debug builds with leaks enabled.

We have a set of those machines for fx3 which haven't really gotten a lot of attention and they were (and still are) quite painful to setup and maintain.

Some guidance please!
Right now we want an exact parallel of what's on Firefox trunk right now:

fxdbug-linux-tbox
bm-xserve11
fxdbug-win32-tbox
The boxes currently on the Firefox tinderbox that we need equivalents for are:

Linux fxdbug-linux-tbox Dep
MacOSX Darwin 8.8.4 bm-xserve11 Dep Debug + Leak Test
WINNT 5.2 fxdbug-win32-tb Dep Debug + Leak Test
bm-xserve18 is available for use for this.
There was a couple of caveats when setting up Moz2 slaves:
1) Linux & Mac need autoconf-2.13 available in /usr/local/bin
2) Win32 needs Mercurial installed
2a) Install http://mercurial.berkwood.com/binaries/Mercurial-0.9.5.exe to d:\mercurial
2b) Append 'd:\mercurial' to system path
3) add MOZ_NO_RESET_PATH=1 to windows env vars

That's all of the caveats I know of.
Assignee: nobody → joduinn
Depends on: 429142
On bm-xserve18, I did the following:
- installed autoconf-2.13 following steps described by bhearsum in irc. 
- setup new moz2-mac-slave2 using the buildbot0.7.5 provided by ref platform.

On staging-master.b.m.o, I updated staging-master so it can see and ping the two mac slaves (the preexisting moz2-mac-slave1, and the new moz2-mac-slave2).


At least now, the machine and slave setup work is done. In the morning, I'll start the next step, of finding out how to configure for debug/leaktest builds.
(In reply to comment #10)
> On bm-xserve18, I did the following:
> - installed autoconf-2.13 following steps described by bhearsum in irc. 
> - setup new moz2-mac-slave2 using the buildbot0.7.5 provided by ref platform.
> 
> On staging-master.b.m.o, I updated staging-master so it can see and ping the
> two mac slaves (the preexisting moz2-mac-slave1, and the new moz2-mac-slave2).
> 
> 
> At least now, the machine and slave setup work is done. In the morning, I'll
> start the next step, of finding out how to configure for debug/leaktest builds.
> 

Two quick things,

Did you have to install Buildbot yourself? That's how we generally do it. I'm not sure if it's on the Mac image though.

Please put patches to the master.cfg up for review before making them. I know Moz2 is on staging-master right now, but it *is* our production grade master. All of the restarts that happened last night burnt a bunch of builders.
(In reply to comment #11)
> Did you have to install Buildbot yourself? That's how we generally do it. I'm
> not sure if it's on the Mac image though.

I wasn't fully aware of this policy when I got the image made for Leopard, so there is a buildbot install there already, as doc'd at 
  http://wiki.mozilla.org/ReferencePlatforms/Mac-10.5#Buildbot

IIRC, BUILDBOT_MOZILLA_PRODUCTION is still 0.7.5
(In reply to comment #12)
> (In reply to comment #11)
> > Did you have to install Buildbot yourself? That's how we generally do it. I'm
> > not sure if it's on the Mac image though.
> 
> I wasn't fully aware of this policy when I got the image made for Leopard, so
> there is a buildbot install there already, as doc'd at 
>   http://wiki.mozilla.org/ReferencePlatforms/Mac-10.5#Buildbot
> 

I don't think it's actually a written down policy anywhere, sadly :(. I kindof figured it was on the Mac anyways, since we often seem to clone from in-use systems.

> IIRC, BUILDBOT_MOZILLA_PRODUCTION is still 0.7.5
> 

Yeah, BUILDBOT_MOZILLA_PRODUCTION is 0.7.5, Mozilla2 uses 0.7.6 currently though -- so we'll have to upgrade the new slave.
I disabled moz2-mac-slave2 from building mozilla-central because it is not setup correctly to do so. (It needs BUILDBOT_0_7_6_BRANCH)
Summary: Need debug+leak testing builds for mozilla2-central → Need debug+leak testing builds for mozilla-central
Assignee: joduinn → ccooper
Status: NEW → ASSIGNED
(In reply to comment #11)
> Please put patches to the master.cfg up for review before making them. I know
> Moz2 is on staging-master right now, but it *is* our production grade master.
> All of the restarts that happened last night burnt a bunch of builders.

Woah!? I would have never touched "staging-master" if I thought it was the *production* master. We need to fix this. We have a production master VM already - whats running there?
(In reply to comment #15)
> (In reply to comment #11)
> > Please put patches to the master.cfg up for review before making them. I know
> > Moz2 is on staging-master right now, but it *is* our production grade master.
> > All of the restarts that happened last night burnt a bunch of builders.
> 
> Woah!? I would have never touched "staging-master" if I thought it was the
> *production* master. We need to fix this. We have a production master VM
> already - whats running there?
> 

(Sorry if I was snippy in my last comment.) We don't have a production-master yet, no room to clone it from what I understand. Bug 426519 is filed about it.
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > Did you have to install Buildbot yourself? That's how we generally do it. I'm
> > > not sure if it's on the Mac image though.
> > 
> > I wasn't fully aware of this policy when I got the image made for Leopard, so
> > there is a buildbot install there already, as doc'd at 
> >   http://wiki.mozilla.org/ReferencePlatforms/Mac-10.5#Buildbot
> 
> I don't think it's actually a written down policy anywhere, sadly :(. I kindof
> figured it was on the Mac anyways, since we often seem to clone from in-use
> systems.

Both bm-xserve17 and bm-xserve18 were imaged from another machine by IT (cant remember which one). Each of them has Buildbot 0.7.5 installed. The doc http://wiki.mozilla.org/ReferencePlatforms/Mac-10.5 also says to install buildbot 0.7.5. According to all the signs, I thought I'd done it right.

We only use 10.5 for Moz2 machines, so can you please correct the doc? 


> > IIRC, BUILDBOT_MOZILLA_PRODUCTION is still 0.7.5
> Yeah, BUILDBOT_MOZILLA_PRODUCTION is 0.7.5, Mozilla2 uses 0.7.6 currently
> though -- so we'll have to upgrade the new slave.

:-<
(In reply to comment #17)
> (In reply to comment #13)
> > (In reply to comment #12)
> > > (In reply to comment #11)
> > > > Did you have to install Buildbot yourself? That's how we generally do it. I'm
> > > > not sure if it's on the Mac image though.
> > > 
> > > I wasn't fully aware of this policy when I got the image made for Leopard, so
> > > there is a buildbot install there already, as doc'd at 
> > >   http://wiki.mozilla.org/ReferencePlatforms/Mac-10.5#Buildbot
> > 
> > I don't think it's actually a written down policy anywhere, sadly :(. I kindof
> > figured it was on the Mac anyways, since we often seem to clone from in-use
> > systems.
> 
> Both bm-xserve17 and bm-xserve18 were imaged from another machine by IT (cant
> remember which one). Each of them has Buildbot 0.7.5 installed. The doc
> http://wiki.mozilla.org/ReferencePlatforms/Mac-10.5 also says to install
> buildbot 0.7.5. According to all the signs, I thought I'd done it right.
> 
> We only use 10.5 for Moz2 machines, so can you please correct the doc? 
> 

Sure...but we'll have to be wary if we use 10.5 for a non-0.7.6 install in the future.

> 
> > > IIRC, BUILDBOT_MOZILLA_PRODUCTION is still 0.7.5
> > Yeah, BUILDBOT_MOZILLA_PRODUCTION is 0.7.5, Mozilla2 uses 0.7.6 currently
> > though -- so we'll have to upgrade the new slave.
> 
> :-<
> 

Not a big deal, actually. It only takes a few minutes :)
Right now on trunk, we have to add the following to our tinderbox config for debug+leak:

mk_add_options MOZ_CO_MODULE="mozilla/tools/trace-malloc"
ac_add_options --disable-optimize
ac_add_options --enable-debug

Is it the same on moz2?
You don't need MOZ_CO_MODULE (meaningless in hg). You do need the configure options.
(In reply to comment #20)
> You don't need MOZ_CO_MODULE (meaningless in hg).

I was more getting at: is there an equiv command for hg, or is trace-malloc not supported yet?
trace-malloc is part of the hg import. With hg it's an all-or-nothing process, you can't check out only parts of the tree.
Attached patch Add a MacOSX debug builder (obsolete) β€” β€” Splinter Review
My first hg patch, be merciful! ;)

The one part I'm not sure about is this:

+BRANCHES['mozilla-central']['platforms']['macosx-debug']['platform_objdir'] = '%s/ppc' % OBJDIR

This is copied directly from the regular macosx platform entry, but I'm not sure why 'ppc' is in there. Is the regular macosx slave a ppc machine, or does 'ppc' need to be there regardless of platform?
Attachment #316291 - Flags: review?(bhearsum)
The regular build is a universal build, so you get objdir/{ppc,i386}, and have to make package etc. in the ppc dir.
Comment on attachment 316291 [details] [diff] [review]
Add a MacOSX debug builder

>diff -r 5945f5ff43e1 mozilla2/macosx-debug/mozconfig
>--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
>+++ b/mozilla2/macosx-debug/mozconfig	Thu Apr 17 17:46:20 2008 -0400
>@@ -0,0 +1,16 @@
>+. $topsrcdir/build/macosx/universal/mozconfig

If we can inherit from the non-debug mac config, let's do that. It'll help us keep them in sync. ($topsrcdir/configs/macosx/mozconfig, I believe.) If there's a reason not to, then this is fine.


>diff -r 5945f5ff43e1 mozilla2/master.cfg
> BRANCHES['mozilla-central']['platforms']['macosx']['slaves'] = ['moz2-mac-slave1']
>+BRANCHES['mozilla-central']['platforms']['macosx-debug']['slaves'] = ['bm-xserve18']

How different are these slaves? If they're going to be the same I think we should be sharing the slaves between regular builds and debug.


>@@ -64,9 +66,17 @@
>     'SYMBOL_SERVER_PATH': '/mnt/netapp/breakpad/symbols_ffx/',
>     'SYMBOL_SERVER_SSH_KEY': "/Users/cltbld/.ssh/ffxbld_dsa",
> }
>+BRANCHES['mozilla-central']['platforms']['macosx-debug']['env'] = {
>+    'MOZ_OBJDIR': OBJDIR,
>+    'SYMBOL_SERVER_HOST': 'dm-symbolpush01.mozilla.org',
>+    'SYMBOL_SERVER_USER': 'ffxbld',
>+    'SYMBOL_SERVER_PATH': '/mnt/netapp/breakpad/symbols_ffx/',
>+    'SYMBOL_SERVER_SSH_KEY': "/Users/cltbld/.ssh/ffxbld_dsa",
>+}

Won't need these.


>@@ -111,7 +121,8 @@
>-         'macosx': SlaveLock(name='macosx', maxCount=1)}
>+         'macosx': SlaveLock(name='macosx', maxCount=1),
>+         'macosx-debug': SlaveLock(name='macosx-debug', maxCount=1)}

This reminds me, SlaveLock doesn't work like I thought it does :(. The right way to do this is to specify max_builds= in BuildSlaves.py -- I'll do this on the master. Sorry for the confusion here. I'm going to remove these.


>@@ -238,7 +249,7 @@
>         # OS X builds eat up a ton of space with -save-temps enabled
>         # until we have dwarf support we need to clean this up so we don't
>         # fill up the disk
>-        if platform == "macosx":
>+        if platform.startswith("macosx"):
>             mozilla2_dep_factory.addStep(ShellCommand(
>                 command=['find', '-E', '.', '-iregex',
>                          '.*\.(i|s|mii|ii)$', '-exec', 'rm', '{}', ';'],

Can you do something similar to this to disable installers, snippets, symbols, etc. It's going to get a little ugly - I've filed bug 429670 about fixing this.
(In reply to comment #25)
> (From update of attachment 316291 [details] [diff] [review])
> >diff -r 5945f5ff43e1 mozilla2/macosx-debug/mozconfig
> >--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> >+++ b/mozilla2/macosx-debug/mozconfig	Thu Apr 17 17:46:20 2008 -0400
> >@@ -0,0 +1,16 @@
> >+. $topsrcdir/build/macosx/universal/mozconfig
> 
> If we can inherit from the non-debug mac config, let's do that. It'll help us
> keep them in sync. ($topsrcdir/configs/macosx/mozconfig, I believe.) If there's
> a reason not to, then this is fine.

Is the trunk (1.9) OS X debug machine building universal? Doesn't seem to make a lot of sense to do so.
(In reply to comment #26)
> (In reply to comment #25)
> > (From update of attachment 316291 [details] [diff] [review] [details])
> > >diff -r 5945f5ff43e1 mozilla2/macosx-debug/mozconfig
> > >--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> > >+++ b/mozilla2/macosx-debug/mozconfig	Thu Apr 17 17:46:20 2008 -0400
> > >@@ -0,0 +1,16 @@
> > >+. $topsrcdir/build/macosx/universal/mozconfig
> > 
> > If we can inherit from the non-debug mac config, let's do that. It'll help us
> > keep them in sync. ($topsrcdir/configs/macosx/mozconfig, I believe.) If there's
> > a reason not to, then this is fine.
> 
> Is the trunk (1.9) OS X debug machine building universal? Doesn't seem to make
> a lot of sense to do so.
> 

Yeah it is - that's a good point. That will only slow us down, methinks. In that case let's not inherit from it.
(In reply to comment #24)
> The regular build is a universal build, so you get objdir/{ppc,i386}, and have
> to make package etc. in the ppc dir.

Ted: Does that mean that for a non-universal build, it defaults to simply objdir/ ?

Dunno, this is specific to Ben's buildbot config.
The objdir should be 'obj-firefox' for non-universal builds. (MOZ_OBJDIR is set in the env.)
Attachment #316291 - Attachment is obsolete: true
Attachment #316291 - Flags: review?(bhearsum)
Comment on attachment 316857 [details] [diff] [review]
[checked in] Add a MacOSX debug builder, take 2

>diff -r 5945f5ff43e1 mozilla2/macosx-debug/mozconfig
>+export CFLAGS="-gstabs -gfull -save-temps"
>+export CXXFLAGS="-gstabs -gfull -save-temps"

You can get rid of these, since we're not uploading symbols for these builds.


>diff -r 5945f5ff43e1 mozilla2/master.cfg

Everything here is fine, but we should skip packaging and uploading for debug builds.

r=bhearsum with those changes.
Mac debug slave is reporting to staging-master now. 

Windows next, then Linux.
Attachment #316857 - Attachment description: Add a MacOSX debug builder, take 2 → [checked in] Add a MacOSX debug builder, take 2
(In reply to comment #33)
> Mac debug slave is reporting to staging-master now. 
> 
> Windows next, then Linux.

Nice, thanks coop!
Mac debug moved to production.
Attachment #316857 - Flags: review?(bhearsum) → review+
Windows debug slave is now reporting to the staging master. I'll have patches for review tomorrow.
Linux debug slave is reporting to the staging master now too.
These were passing in the staging env last week, just moving them over to production.
Attachment #318218 - Flags: review?(bhearsum)
Comment on attachment 318218 [details] [diff] [review]
[checked in] Add Win2k3 and Linux debug slaves

>diff -r 4831b46192f4 mozilla2/linux-debug/mozconfig
>--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
>+++ b/mozilla2/linux-debug/mozconfig	Mon Apr 28 16:06:14 2008 -0400
>@@ -0,0 +1,15 @@
>+ac_add_options --enable-application=browser
>+
>+ac_add_options --disable-optimize
>+ac_add_options --enable-debug
>+
>+ac_add_options --enable-trace-malloc
>+
>+CC=/tools/gcc/bin/gcc
>+CXX=/tools/gcc/bin/g++
>+
>+#export CFLAGS="-gstabs+"
>+#export CXXFLAGS="-gstabs+"
>+
>+# Needed to enable breakpad in application.ini
>+export MOZILLA_OFFICIAL=1

Looks like the trunk has --disable-libxul - do we need that here? It also enables canvas, svg, and pango but I believe those are the defaults now.

Everything else looks fine, seems like staging has been working fine.

You know more about the mozconfig requirements than I do. So r=bhearsum with whatever mozconfig is best.
Attachment #318218 - Flags: review?(bhearsum) → review+
--enable-debug implies --disable-libxul, I told coop to remove the other cruft.
Attachment #318218 - Attachment description: Add Win2k3 and Linux debug slaves → [checked in] Add Win2k3 and Linux debug slaves
All three machines are reporting to the production-master now.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
The machines appear to be building, but are not actually performing the leak tests or reporting the results to the tinderbox.

The output should be similar to that on the Firefox tinderbox:
RLk:0B
Lk:894KB
MH:18.0MB
A:475K
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Talked with coop:

On cvs-trunk, we currently have two different debug build setups.
1) debug builds and leak testing using tinderbox . This is production-ready. 
2) debug builds and leak testing while running unit tests using buildbot. This is *not* production-ready yet.

For mozilla-central, bsmedberg is asking for (1), but we dont have tinderbox on mozilla-central, so need a third different debug build setup.
3) debug builds and leak testing using buildbot. This requires new coding, implementing in buildbot some setup work that is currently done in tinderbox for cvs-trunk.

Coop is currently working on this for linux first. 
(In reply to comment #43)
> 2) debug builds and leak testing while running unit tests using buildbot. This
> is *not* production-ready yet.
Filed bug#431900 to track making these production-ready.
Assignee: ccooper → bhearsum
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
I'm making good progress on this today. I'm working through a couple more things -- can somebody (dbaron?) have a look at a log a bit later to make sure it's okay?
I looked at the current logs of "mozilla-central-linux-debug" and it looks like you're (so far) running BloatTest but not BloatTest2; we need both.  BloatTest2 requires --enable-trace-malloc in the mozconfig (which I don't see in the log, although I expect to see it there).  (mozilla-central-macosx-debug and mozilla-central-win32-debug don't seem to be running any tests yet.)
OK, I've made great progress here. I've had leak tests running on all through platforms without major issue. I've got a couple of minor issues to sort through on my own. I've patched automation.py.in a bit to make this whole thing easier -- I'll attach a patch as soon as I clean it up and work through one more issue. Who should I be getting to review that?

We also need to land bloatdiff.pl (mozilla/tools/tinderbox/bloatdiff.pl) in mozilla-central -- not sure where the best place to put it is.
I can probably review it (or at worst punt to someone more appropriate).  I'm curious to see what problems you hit, given that I've been running Mochitests locally in debug builds for quite a while without problems.
This is a mostly complete patch. If you have time to have a look and give me feedback I can address any issues with it along with final tweaks. I diff'd this against the mozilla-central version which doesn't appear to have camino support yet. I imagine that we'll be landing here first and possibly in CVS trunk post-Fx3 (I don't imagine this getting blocking1.9+).

I still have to figure out why it's not detecting Windows properly -- I suspect it has to do with MSYS. I might need to make changes in build/pgo/Makefile.in to fix that - we'll see.
Yeah, this stuff's not my purview so much, and I don't know what specific problems you're trying to address with the changes.  Ted would probably work for this, I think he's done a number of reviews on this stuff before.
I think Ted's out until at least Monday -- is there anyone else that can take a look at this. Sayrer, maybe?
(In reply to comment #50)
> 
> I still have to figure out why it's not detecting Windows properly -- I suspect
> it has to do with MSYS.

What is not detecting Windows properly?
(In reply to comment #53)
> (In reply to comment #50)
> > 
> > I still have to figure out why it's not detecting Windows properly -- I suspect
> > it has to do with MSYS.
> 
> What is not detecting Windows properly?
> 

When automation.py is generated I end up with "IS_WIN32 = len("") != 0" -- which I believe means that __WIN32__ wasn't defined.
OK, scratch that last comment. I re-generated it this morning and it worked fine.
Attachment #320165 - Flags: review?(nrthomas) → review+
Comment on attachment 320165 [details] [diff] [review]
[checked in] --enable-tests && --enable-trace-malloc on all debug platforms

changeset# c030b26bbaec
Attachment #320165 - Attachment description: --enable-tests && --enable-trace-malloc on all debug platforms → [checked in] --enable-tests && --enable-trace-malloc on all debug platforms
Update:
* I've got this working off of a clone of mozilla-central, I'll be posting an updated automation.py.in patch shortly.
* I checked in bloatdiff.pl to my mozilla-central clone (into tools/trace-malloc, for lack of a better place). How do people feel about this being a permanent home for it in HG?
* I'm using resource:///res/bloatcycle.html for all of the tests since it closes the window when it's done. (This is being done to avoid the need to implement kill() in automation.py.in on Win32.) . For basic AliveTest tinderbox uses a (iirc) http://www.mozilla.org -- I don't think using bloatcycle.html will make a difference here.
* Mozilla2 builders are using a pool of slaves, which means the leak tests will not always be run on the same machine. Therefore bloat.log.old, malloc.old.log, and sdleak.old.log could be out-of-date when a test is run. I'm not sure how big of a deal this is. If it's very important for the absolute previous log to be used for comparisons I suggest that we upload these logs to stage.mozilla.org and pull them right before we run the comparison scripts.
* I'm having a problem with --shutdown-leaks on windows, possibly the same one as bug 432338. Unless people feel strongly otherwise I'm going to focus on getting all of the above done first and come back to this.
As far as I can know, this patch does not modify any existing behaviour. It does add a few features though:
* Ability to run stand-alone (simply parses command line options and calls runApp())
* Process.wait() can timeout and kill its process.
* Process.kill() can be made to use something other than SIGKILL
* Process output can be logged to a file in addition to the console
* profileDir is now an optional argument. When specified it behaves exactly as before. If not specified a profile will be created in a temp directory
* New preferences have been added:
dom.allow_scripts_to_close_windows
capability.principal.codebase.p0.* for 'resource://'.

These are necessary in order to run resource:///res/bloatcycle.html with a clean shutdown.

Not sure who wants to review this.
Attachment #320193 - Flags: review?
(In reply to comment #58)
> * I checked in bloatdiff.pl to my mozilla-central clone (into
> tools/trace-malloc, for lack of a better place). How do people feel about this
> being a permanent home for it in HG?

This is already in mozilla-central (although there's one change that hasn't been merged from cvs-trunk-mirror to mozilla-central; is that what you were talking about?).

> * Mozilla2 builders are using a pool of slaves, which means the leak tests will
> not always be run on the same machine. Therefore bloat.log.old, malloc.old.log,
> and sdleak.old.log could be out-of-date when a test is run. I'm not sure how
> big of a deal this is. If it's very important for the absolute previous log to
> be used for comparisons I suggest that we upload these logs to
> stage.mozilla.org and pull them right before we run the comparison scripts.

This seems like a problem.  (Those diffs aren't incredibly useful right now because of the stack trace issue, but with that fixed those actually would make it useful, and we've had a lot of problems because people couldn't figure out what was leaking when we had a leak regression.)
Just a quick summary from IRC:

bloatdiff.pl should go in tools/rb -- I'll post a patch
The modified behaviour in bug 432385 should be used when running diffbloatdump.pl
Yeah -- if you're going to do the upload-log thing, you definitely want bug 432385 too, because otherwise you'll be uploading logs that are hundreds of megabytes.
(In reply to comment #62)
> Yeah -- if you're going to do the upload-log thing, you definitely want bug
> 432385 too, because otherwise you'll be uploading logs that are hundreds of
> megabytes.

dbaron: errr... I'm nervous of holding mozilla-central closed while we do bug#432385; we dont even have this on cvs-trunk, afaict. Are you sure this should block opening mozilla-central, or is bug#432385 something that can be done as enhancement afterwards?
Well, the diffs depend on having the old log around.  I'd say that holding mozilla-central closed in order to redesign the test architecture to use buildbot rather than tinderbox and to use multiple build slaves seems like the idea that makes me nervous.  Given that you're already doing all that, yes, I think you need to add the calls to a few more lines of code (unless copying hundreds-of-megabyte files around really won't slow things down).
Never mind -- I suppose for getting mozilla-central open we can just live without diffs, but it's not a situation that we should stay in for more than a week or two.
(In reply to comment #63)
> (In reply to comment #62)
> > Yeah -- if you're going to do the upload-log thing, you definitely want bug
> > 432385 too, because otherwise you'll be uploading logs that are hundreds of
> > megabytes.
> 
> dbaron: errr... I'm nervous of holding mozilla-central closed while we do
> bug#432385; we dont even have this on cvs-trunk, afaict. Are you sure this
> should block opening mozilla-central, or is bug#432385 something that can be
> done as enhancement afterwards?
> 

John, this is what I described to you yesterday. It's already been implemented.
(In reply to comment #66)
> (In reply to comment #63)
> > (In reply to comment #62)
> > > Yeah -- if you're going to do the upload-log thing, you definitely want bug
> > > 432385 too, because otherwise you'll be uploading logs that are hundreds of
> > > megabytes.
> > 
> > dbaron: errr... I'm nervous of holding mozilla-central closed while we do
> > bug#432385; we dont even have this on cvs-trunk, afaict. Are you sure this
> > should block opening mozilla-central, or is bug#432385 something that can be
> > done as enhancement afterwards?
> > 
> John, this is what I described to you yesterday. It's already been implemented.

Ben: oh?!? I remember you were working on the logfile-upload, logfile-download stuff, with reviews pending. However, I wasnt aware that you'd also solved the fix-*-stack.pl problem described in bug#432385. Cool. 
Attached patch AliveTest + log comparison steps (obsolete) β€” β€” Splinter Review
the compare* steps are very much based on the unittest+leaktest configs in CVS. AliveTest is mostly new and calls Firefox through automation.py.

I'm going to be filing a follow-up bug about cleaning this up -- they are not pretty.
Attachment #320525 - Flags: review?(rcampbell)
Attached patch master.cfg steps for debug+leaktest (obsolete) β€” β€” Splinter Review
master.cfg changes to run debug+leaktest. These are pretty straightforward...they run AliveTest in the same way Tinderbox does, download old logs, and do some useful comparisons. Win32 has some redness right now, probably the same problem as bug 432338 - I'll be looking into that later. Let's get these going for now.

These will get cleaned up a bit just like test.py
Attachment #320043 - Attachment is obsolete: true
Attachment #320527 - Flags: review?(rcampbell)
Comment on attachment 320525 [details] [diff] [review]
AliveTest + log comparison steps

the uncapitalized class names compareBloatLogs and compareLeakLogs are inconsistent with our naming. Otherwise, this is a good-looking patch.
Attachment #320525 - Flags: review?(rcampbell) → review+
Attachment #320193 - Flags: review? → review?(ted.mielczarek)
RCS file: /cvsroot/mozilla/tools/buildbotcustom/steps/test.py,v
done
Checking in test.py;
/cvsroot/mozilla/tools/buildbotcustom/steps/test.py,v  <--  test.py
initial revision: 1.1
done
Attachment #320525 - Attachment is obsolete: true
Comment on attachment 320527 [details] [diff] [review]
master.cfg steps for debug+leaktest

solid. need to correct case on import CompareBloatLogs and CompareLeakLogs to reflect new reality. r+ with that in mind.
Attachment #320527 - Flags: review?(rcampbell) → review+
Comment on attachment 320193 [details] [diff] [review]
automation.py.in - working on all platforms

I'm not really a fan of this approach. I think you should either co-opt http://lxr.mozilla.org/mozilla/source/build/pgo/profileserver.py.in or just copy it and mangle to suit your needs. automation.py is meant to be a library for consumers to use, so I think making it into a standalone script confuses things. profileserver.py does exactly what you need, in fact, you just need to serve bloatcycle.html there. I think you ought to just copy bloatcycle.html into some directory in the objdir, and then serve it via local httpd. That will also remove the need for the hack to make resource:// work properly. As for some of those other changes, if they're still necessary we can keep those. I think sayrer has a patch somewhere that might conflict some, we should get that checked in first. (Need to find the bug number first!)
Attachment #320193 - Flags: review?(ted.mielczarek) → review-
Ah, bug 421281. It's reviewed, so just land it in mozilla-central if need be.
OK, I'm working on getting this going with a script similar to profileserver.py. I'll update again later with my progress.
Thanks for all of your suggestions Ted -- they saved me a ton of time here. Running bloatcycle.html through http://localhost negated the need for any additional prefs. This WFM on all 3 platforms. I'm not sure where the best place to put this is, build/pgo doesn't really make much sense, suggestions are welcome.
Attachment #320689 - Flags: review?(ted.mielczarek)
getopt is nothing if not spare in lines of code and readability of its uses; I suggest you consider optparse (example use in testing/mochitest/runtests.py.in) with descriptive argument names (avoid the temptation to abbreviate and make invocations unreadable without reference to --help) and appropriate help messages (optparse handles this if you provide help text for the options you present).

However, do feel free to consider this bikeshedding if you're moderately pressed for time.  :-)
(In reply to comment #77)
> getopt is nothing if not spare in lines of code and readability of its uses; I
> suggest you consider optparse (example use in testing/mochitest/runtests.py.in)
> with descriptive argument names (avoid the temptation to abbreviate and make
> invocations unreadable without reference to --help) and appropriate help
> messages (optparse handles this if you provide help text for the options you
> present).
> 
> However, do feel free to consider this bikeshedding if you're moderately
> pressed for time.  :-)
> 
Time is certainly of the essence here, but that doesnt mean we want to land just anything. If you think this is serious enough to fix now, we should do it. Otherwise, could we file a separate enhancement/cleanup bug to track this suggestion?
Comment on attachment 320689 [details]
leaktest.py.in - runs the browser w/ bloatcycle.html

>shutil.copy(os.path.join(DIST_BIN, 'res', 'bloatcycle.html'), '.')

Can you do this in the Makefile instead? Should be pretty easy. In fact, that file just lives in $(topsrcdir)/build.

>if __name__ == '__main__':
>    opts, extraArgs = getopt(sys.argv[1:], 'l:')

I'd prefer OptionParser, but I won't r- you for it. It's pretty simple to use, and I agree with Waldo's comments above. See here for an example:
http://lxr.mozilla.org/mozilla/source/toolkit/crashreporter/tools/symbolstore.py#603

Did you lose the timeout bits from your previous patch? I'm not sure that automation.py takes care of that for you. If you want it, we can patch automation.py to get that behavior.

You can stick this anywhere in build/ as far as I'm concerned. You can use the same build target as here:
http://lxr.mozilla.org/mozilla/source/testing/mochitest/Makefile.in#127
to get yourself a copy of automation.py where you need it.
Attachment #320689 - Flags: review?(ted.mielczarek) → review+
(In reply to comment #79)
> (From update of attachment 320689 [details])
> >shutil.copy(os.path.join(DIST_BIN, 'res', 'bloatcycle.html'), '.')
> 
> Can you do this in the Makefile instead? Should be pretty easy. In fact, that
> file just lives in $(topsrcdir)/build.

> Did you lose the timeout bits from your previous patch? I'm not sure that
> automation.py takes care of that for you. If you want it, we can patch
> automation.py to get that behavior.
> 

Turns out I can live without that behaviour by just using bloatcycle.html for all of the tests (instead of using http://www.mozilla.org && killing the process for the most basic alive test).

> You can stick this anywhere in build/ as far as I'm concerned. You can use the
> same build target as here:
> http://lxr.mozilla.org/mozilla/source/testing/mochitest/Makefile.in#127
> to get yourself a copy of automation.py where you need it.
> 

Great.

Patch incoming.
This should be the last one. I've stolen some logic from build/pgo/Makefile.in to make automation.py.in + leaktest.py.in go in build/. Like you said Ted, I moved the bloatcycle.html copy over to the Makefile.

leaktest.py.in is exactly the same other than removing the shutil.copy().

bloatdiff.pl is a straight copy from mozilla/tools/tinderbox/bloatdiff.pl (in CVS) as per comment #61.
Attachment #320689 - Attachment is obsolete: true
Attachment #320791 - Flags: review?
Comment on attachment 320791 [details] [diff] [review]
[checked in] makefile.in + leaktest.py.in + bloatdiff.pl import

Looks good. Can you file a followup on moving automation.py.in to live in build/?
Attachment #320791 - Flags: review? → review+
(In reply to comment #82)
> (From update of attachment 320791 [details] [diff] [review])
> Looks good. Can you file a followup on moving automation.py.in to live in
> build/?
> 

bug 433584
Comment on attachment 320791 [details] [diff] [review]
[checked in] makefile.in + leaktest.py.in + bloatdiff.pl import

Pushed in changeset:   15117:c70da85c1383
Attachment #320791 - Attachment description: makefile.in + leaktest.py.in + bloatdiff.pl import → [checked in] makefile.in + leaktest.py.in + bloatdiff.pl import
I've got to modify the buildbot configs a bit to adjust for the new paths. I'll get everything going on staging tonight and tomorrow (EDT) morning I'll push them to production (when I can be around for many hours afterwards to deal with potential fallout).
I stupidly forgot a requisite variable for the Makefile. I also noticed that I'd been testing with allow_scripts_to_close_windows=true.
Attachment #320193 - Attachment is obsolete: true
Attachment #320813 - Flags: review?(ted.mielczarek)
This simplifies the AliveTest a whole lot. Because we're using bloatcycle.html for all of the tests we don't need to worry about timing out, killing, etc., because the browser will close itself.
Attachment #320814 - Flags: review?(rcampbell)
Mostly this patch is s/_profile\/pgo/_leaktest/ from the first previous one. It also removes unnecessary options from AliveTest calls and uses the properly capitalized Compare*Log steps.
Attachment #320527 - Attachment is obsolete: true
Attachment #320816 - Flags: review?(rcampbell)
Attachment #320813 - Flags: review?(ted.mielczarek) → review+
Comment on attachment 320813 [details] [diff] [review]
[checked in] silly follow-up

Landed in changeset:   15120:30fb4015e748
Attachment #320813 - Attachment description: silly follow-up → [checked in] silly follow-up
Comment on attachment 320814 [details] [diff] [review]
[checked in] AliveTest changes to cope with leaktest.py

mm, clean.
Attachment #320814 - Flags: review?(rcampbell) → review+
Comment on attachment 320814 [details] [diff] [review]
[checked in] AliveTest changes to cope with leaktest.py

Checking in test.py;
/cvsroot/mozilla/tools/buildbotcustom/steps/test.py,v  <--  test.py
new revision: 1.2; previous revision: 1.1
done
Attachment #320814 - Attachment description: AliveTest changes to copy with leaktest.py → [checked in] AliveTest changes to cope with leaktest.py
Comment on attachment 320816 [details] [diff] [review]
[checked in] master.cfg changes to cope with leaktest.py + new path

looks good, hopefully the indentation's correct.
Attachment #320816 - Flags: review?(rcampbell) → review+
Comment on attachment 320816 [details] [diff] [review]
[checked in] master.cfg changes to cope with leaktest.py + new path

Landed in changeset:   57:5933c31bb80b
Attachment #320816 - Attachment description: master.cfg changes to cope with leaktest.py + new path → [checked in] master.cfg changes to cope with leaktest.py + new path
Alright, I've applied changes to the master and kicked off debug builds. Windows will go red because sdleak.log isn't being dumped but Mac and Linux should be a-ok.
Still waiting on an OS X build before I close this, I've filed a couple of follow-ups though:
bug 433708: windows not properly outputting sdleak.log
bug 433710: mozilla-central leaktests need to report to a graph server
Alright, Mac has cycled now, we're done with this bug.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.