Closed Bug 360206 Opened 18 years ago Closed 17 years ago

Set up leak tests testing something we ship as Firefox

Categories

(Release Engineering :: General, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bzbarsky, Assigned: rhelmer)

At the moment, the only trunk leak tests we have running are:

1)  balsa -- Firefox GTK1
2)  nye -- Seamonkey cairo

Given that we don't ship GTK1 Firefox nightlies and that nye isn't testing Firefox-specific code, it'd be nice if we could set up a leak test on a supported tinderbox config.

While we do that, we should separate out the assertion stuff on balsa into a tinderbox of its own, in my opinion.
Blocks: 326152
OS: Linux → All
Hardware: PC → All
Assignee: build → nobody
Component: Build & Release → Testing
Product: mozilla.org → Core
QA Contact: preed → testing
Version: other → Trunk
Assignee: nobody → build
Component: Testing → Tinderbox Platforms
Product: Core → mozilla.org
QA Contact: testing → dbaron
Version: Trunk → other
OK, this just became a blocker, because balsa can no longer build trunk as of
about two hours ago and nye has been orange ever since the cycle collector
landing, hence we have no leak tests anymore.

The plan is to keep the tree closed until we have these tests running on a
tinderbox that's actually green.

And I do think this is a tinderbox issue, not a core issue... but whichever
way, we basically need another machine here.

I hate to point this out, but it's been 10 months since we said we needed this machine... (This bug was spun off from the parts of bug 333553 that didn't actually get fixed.)
Severity: normal → blocker
Severity: blocker → normal
Component: Tinderbox Platforms → Bugzilla: Keywords & Components
h8 bugzilla
Severity: normal → blocker
Component: Bugzilla: Keywords & Components → Tinderbox Platforms
Priority: -- → P1
Assignee: build → nobody
Component: Tinderbox Platforms → Testing
Product: mozilla.org → Core
QA Contact: dbaron → testing
Version: other → Trunk
Does the build need to be a debug build in order for leak testing to work? balsa-trunk is currently building a debug build. Just wondering whether we need to setup an entirely new build/VM, or whether we can do this on, e.g., argo-vm.

Also, would turning on the trace-malloc module for argo-vm have any adverse effects on the Linux nightlies: size, performance, etc.?
Just a note on the component - 
Assignee: nobody → rhelmer
We can do this on tinderbox if necessary, but I want to make sure robcee sees this in the queue for the new test farm, since we may be able to put it there.
Also, after talking in #build: how much do these builds need to be exercised?
Is simply running make check, reftest and/or mochitest enough, or is there an
actual "leak test" that needs to be run on it?

In any case, I think I could set this up on my linux reference VM under
buildbot pretty quickly.
Another question: who will be the consumers of these builds, if anybody? Is this something we're shipping, or do I just need to run some tests on the machine it sits on and then throw it away?
(In reply to comment #6)
> also, after talking in #build, how much do these builds need to be exercised?
> Is simply running make check, reftest and/or mochitest enough or is there an
> actual "leak test" that needs to be run on it.

The leak test that is currently run on balsa is what we need to have run on a newer machine, as I understand it.

(In reply to comment #7)
> another question: who will be the consumers of these builds, if anybody? Is
> this something we're shipping or do I just need to run some tests on the
> machine it sits on then throw it away?

The leak tests are for Tinderbox display only; the builds themselves can be thrown away.
> Does the build need to be a debug build in order for leak testing to work?

No.  It needs to either be a debug build or have --enable-logrefcnt in the configure options.  That said, we should have at least one tinderbox with --enable-debug and a tinderbox with fatal assertions.  Balsa was doing all three, but I wouldn't mind doing them separately.  Or rather, I wouldn't mind having a non-debug leak tbox in addition to a debug one; we would really rather have both sets of data.  If you have to pick for now, do what we had already?
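
As an illustration of the rule stated above (a debug build, or --enable-logrefcnt in the configure options), here is a minimal Python sketch that checks whether a mozconfig would produce a build usable for the leak test. The function names and the parsing rules are assumptions made for this example; this is not part of tinderbox or the Mozilla build system.

```python
import shlex

# Flags named in the comment above. Treating only these two as sufficient
# is an assumption of this sketch.
LEAK_LOGGING_FLAGS = {"--enable-debug", "--enable-logrefcnt"}


def leak_logging_flags(mozconfig_text):
    """Return the leak-logging flags found on ac_add_options lines."""
    found = set()
    for raw_line in mozconfig_text.splitlines():
        # comments=True drops everything after an unquoted '#'.
        tokens = shlex.split(raw_line, comments=True)
        if not tokens or tokens[0] != "ac_add_options":
            continue
        found.update(t for t in tokens[1:] if t in LEAK_LOGGING_FLAGS)
    return found


def supports_leak_test(mozconfig_text):
    """True if at least one leak-logging flag is enabled."""
    return bool(leak_logging_flags(mozconfig_text))
```

A config containing only options such as `ac_add_options --enable-svg` would be reported as unsuitable, while adding either flag makes it pass.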

I can't speak to trace-malloc; last time I tried it on trunk the build crashed when actually trying to trace...  I wouldn't enable it on our "production" builds without some serious testing first.

> We can do this on tinderbox if necessary, but I want to make sure robcee sees
> this in the queue for the new test farm, since we may be able to put it there.

As long as it shows up on the main Firefox tinderbox page or something else everyone looks at after checking in... ;)

> Is simply running make check, reftest and/or mochitest enough or is there an
> actual "leak test" that needs to be run on it.

There's a current leak test.  From balsa's log the last time it was green:

cmd = /builds/tinderbox/Firefox-gcc3.4/Linux_2.4.7-10_Depend/mozilla/obj/dist/bin/firefox-bin -P default resource:///res/bloatcycle.html

It's not a great test, but changing it is a separate issue; I think we want a baseline with the existing test on the new box first, if nothing else.

> who will be the consumers of these builds, if anybody?

No one.  We just want the output; the build can be tossed.
I'm working on getting a new tinderbox up right now, should have it ready this evening.

The new test farm will eventually replace this, but this will get us some immediate relief. 

It'll be the same mozconfig/tinder-config.pl as balsa, but based on the reference platform: http://wiki.mozilla.org/ReferencePlatforms/Linux
Status: NEW → ASSIGNED
I've set up a new tinderbox, fxdbug-linux-tinderbox (you can thank preed for the name :P).

The mozconfig is fairly close to balsa's, although it builds gtk2-cairo instead of gtk and adds a few other options, such as:

ac_add_options --enable-canvas
ac_add_options --enable-svg
ac_add_options --enable-pango
ac_add_options --enable-default-toolkit=cairo-gtk2

It's publishing to the MozillaTest tree - http://tinderbox.mozilla.org/MozillaTest/

The leak test seems to be running, but it's crashing (SEGV).

Next cycle I'm going to point it at the Firefox tree and hide balsa-trunk, since balsa's not that useful. AFAICT from talking to bz, this is probably a legitimate crash that we'd want to fix, and it might be intermittent (we'll see next cycle!).

The configs are checked into mozilla/tools/tinderbox-configs/firefox/linux on the "test_mem" branch, if anyone would like to take a look and make sure it's not a config issue.
Severity: blocker → normal
The crash sounds to me like bug 366241 (both fxdbug-linux-tinderbox and nye logs end with "bloat test timed out after x seconds").
ajschult tracked this to the cycle collector and has a stack trace attached to the other bug.
(In reply to comment #12)
> The crash sounds to me like bug 366241 (both fxdbug-linux-tinderbox and nye
> logs end with "bloat test timed out after x seconds").
> ajschult tracked this to the cycle collector and has a stack trace attached to
> the other bug.

That could be what I was seeing while running the bloattest on my local machine. Firefox just froze up and spewed to the malloc.log without actually loading the pages from the cycler.

Good work on the unfortunately-named fxdebug-linux-tinderbox, fellas!
> The crash sounds to me like bug 366241

Except fxdebug-linux-tinderbox is not obviously crashing.  No stack in the log, unlike nye.  There's also the minor matter of the nye output showing shutdown and the fxdebug-linux-tinderbox output cutting off partway through startup, making it unlikely that it's crashing at shutdown.

So I stand by the possible bugs I put in the tinderbox message, I guess.  ;)

> end with "bloat test timed out after x seconds"

Pretty much any orange will end that way.  The question is what's before that.
Could it simply be timing out?  I bumped up the timeout for nye's bloat test to 300s because it would sometimes get cut off (in part due to pulling stuff over the network).
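
To make the timeout question concrete, here is a hedged Python sketch of a wrapper that separates "ran out of time" from "exited abnormally". It is not tinderbox's actual implementation; `run_bloat_test` and its return values are hypothetical names for this example, and only the 300s default and the log wording are taken from this bug.

```python
import subprocess
import sys


def run_bloat_test(cmd, timeout_seconds=300):
    """Run cmd and return (status, message).

    status is 'ok', 'failed', or 'timeout'.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if it has not exited within timeout_seconds.
        result = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return "timeout", f"bloat test timed out after {timeout_seconds} seconds"
    if result.returncode != 0:
        # On POSIX a negative return code means the child died from a
        # signal, e.g. -11 for SIGSEGV.
        return "failed", f"bloat test exited with code {result.returncode}"
    return "ok", "bloat test completed"


if __name__ == "__main__":
    # Stand-in command; the real one would be the firefox-bin invocation
    # quoted from balsa's log earlier in this bug.
    print(run_bloat_test([sys.executable, "-c", "pass"]))
```

Reporting the exit status separately from the timeout would help tell a slow run apart from a hang or a crash.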

I was going to say that fxdbug-linux-tbox appears considerably slower than nye, but it looks like it's pulling fresh source every time:

> Starting nightly release build
> rm -rf mozilla/obj
> rm -rf mozilla
(In reply to comment #15)
> Could it simply be timing out?  I bumped up the timeout for nye's bloat test to
> 300s because it would sometimes get cut off (in part due to pulling stuff over
> the network).
> 
> I was going to say that fxdbug-linux-tbox appears considerably slower than nye,
> but it looks like it's pulling fresh source every time:
> 
> > Starting nightly release build
> > rm -rf mozilla/obj
> > rm -rf mozilla


Right, it's using the last-built file; it should start doing depend builds once it goes green.
Yes, but why is $Settings::ReleaseBuild on?  This box is not making builds that are useful to other people (see comment 8)
(In reply to comment #17)
> Yes, but why is $Settings::ReleaseBuild on?  This box is not making builds that
> are useful to other people (see comment 8)

The config was copied from balsa; I didn't change this.
Box is set up; configs are in CVS if we need to make adjustments.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
We should really figure out why it's intermittently orange too...
Component: Testing → Release Engineering
Product: Core → mozilla.org
QA Contact: testing → release
Version: Trunk → other
Product: mozilla.org → Release Engineering