Closed Bug 333553 Opened 18 years ago Closed 18 years ago

set up replacement linux perf testing tinderboxes

Categories

(Release Engineering :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: vlad, Assigned: rhelmer)

gtk1 and probably gcc 2.95 support will become obsolete pretty soon, so we need some replacements for tinderboxes. The affected tinderboxes that I know of are:

balsa (Firefox) - running leak tests
btek (SeaMonkey) - running Tp
luna (SeaMonkey) - running basically every perf test

The difficulty is that the replacements need to be gtk2/gcc3, and need to give stable numbers.  My naive attempt can be seen on http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox-Cairo with the linux-perf box; I made no attempt to disable daemons and other non-essential services there.
Is it desirable (and possible) to set up parallel cairo and non-cairo boxes so that we can distinguish the perf impact of cairo work from the perf impact of layout changes?  I think this was brought up before in the context of cairo being expected to be slower than gfx for now (essentially giving a perf regression that might hide other things when cairo gets switched on by default).
Yes, that was the goal with the boxes that I set up -- one of them has "Ref" after it, which is the reference non-cairo build.  In retrospect that should've said NoCairo...
Depends on: 334456
Vlad: is this taken care of by rhelmer's test-only Tinderboxen work?
Over to rhelmer.
Assignee: build → rhelmer
(In reply to comment #3)
> Vlad: is this taken care of by rhelmer's test-only Tinderboxen work?

I don't really know; I don't think btek is producing any useful numbers for us right now (because of its config).  However, even given that, we need stable numbers and the test-only tinderboxes aren't producing stable numbers.  If you compare:

http://build-graphs.mozilla.org/graph/query.cgi?testname=pageload&tbox=btek&autoscale=1&days=7&avg=1
http://build-graphs.mozilla.org/graph/query.cgi?testname=pageload&units=ms&tbox=Fx-Trunk-linux-test1&autoscale=1&days=7&avg=1

The btek graph has a variance of <1%, whereas the Fx-Trunk-linux-test1 graph seems to be in the 3% range, if you ignore the random spikes.  Tp2 is in the same boat.  1-2% changes are a big deal; we need to have sub-1% precision on whatever performance numbers that we have.
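To make the comparison concrete, here is a minimal sketch of one way to put a number on "variance" for a series of page-load timings. It assumes the figure of interest is the sample standard deviation as a percentage of the mean, and that "random spikes" can be dropped with a simple distance-from-median filter. Both choices, the function names, and the sample values are illustrative assumptions, not what the graph server computes.

```python
import statistics


def relative_spread_percent(samples):
    """Return the sample standard deviation as a percentage of the mean."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    return statistics.stdev(samples) / mean * 100.0


def drop_spikes(samples, tolerance=0.10):
    """Discard samples further than `tolerance` (a fraction) from the median.

    The 10% default is an arbitrary illustrative threshold.
    """
    median = statistics.median(samples)
    return [s for s in samples if abs(s - median) <= tolerance * median]


if __name__ == "__main__":
    # Made-up page-load times in ms; these are NOT real tinderbox results.
    stable = [1000, 1004, 998, 1002, 996, 1001]
    noisy = [1000, 1030, 975, 1025, 1400, 970, 1010]

    print(f"stable box: {relative_spread_percent(stable):.2f}%")
    filtered = drop_spikes(noisy)
    print(f"noisy box, spikes dropped: {relative_spread_percent(filtered):.2f}%")
```

With a measure like this, a 1-2% change can only be trusted if the box's own run-to-run spread is well below 1%.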
(In reply to comment #5)
> (In reply to comment #3)
> > Vlad: is this taken care of by rhelmer's test-only Tinderboxen work?
> 
> The btek graph has a variance of <1%, whereas the Fx-Trunk-linux-test1 graph
> seems to be in the 3% range, if you ignore the random spikes.  Tp2 is in the
> same boat.  1-2% changes are a big deal; we need to have sub-1% precision on
> whatever performance numbers that we have.

One theory for this discrepancy is that btek is running the 2.2 Linux kernel, while bl-bldlnx01 is 2.6 (FC4). Since we've tried to minimize userspace process activity on both, this may have to do with the scheduling characteristics.

I am going to try some different scheduling options; for example, changing the I/O elevator from CFQ (the current default) to deadline.
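As a rough sketch of what switching the elevator involves: on 2.6 kernels that support runtime switching, each block device exposes its schedulers in /sys/block/<device>/queue/scheduler, with the active one in square brackets, and writing a scheduler name to that file selects it. The helper names below and the device name `sda` are assumptions for illustration, not details of the actual machine. On kernels without the sysfs file, the `elevator=deadline` boot parameter is the usual alternative.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a line such as 'noop anticipatory deadline [cfq]' into
    (available, active); the active scheduler is the bracketed entry."""
    available, active = [], None
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return available, active


def scheduler_path(device):
    """Path of the sysfs scheduler file for a block device name like 'sda'."""
    return Path("/sys/block") / device / "queue" / "scheduler"


def set_scheduler(device, name):
    """Select scheduler `name` for `device`.

    Needs root and a kernel that exposes the sysfs file.
    """
    path = scheduler_path(device)
    available, _ = parse_scheduler_line(path.read_text())
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    path.write_text(name)


if __name__ == "__main__":
    # Only parses a sample string; nothing on the system is modified.
    print(parse_scheduler_line("noop anticipatory deadline [cfq]"))
    # On a real box (hypothetical device name): set_scheduler("sda", "deadline")
```

A change made this way does not persist across reboots, so it would need to be reapplied at boot for the numbers to stay comparable.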
Status: NEW → ASSIGNED
This machine is set up and reporting to the Mozilla1.8 and Firefox trees; if there is more we can do to get more reliable numbers, please open a separate bug.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
The original bug mentioned leak tests.  We're still running those on balsa, and we need to stop doing that to achieve the goal this bug was filed on.

Do you want a separate bug on that too?
(In reply to comment #8)
> The original bug mentioned leak tests.  We're still running those on balsa, and
> we need to stop doing that to achieve the goal this bug was filed on.
> 
> Do you want a separate bug on that too?

Yes please. We should be able to allocate this on VMs and not need to use a standalone machine like we do for wall-clock-sensitive tests.
Product: mozilla.org → Release Engineering