Closed Bug 401146 Opened 17 years ago Closed 17 years ago

set up production mini machine set.

Categories

(Release Engineering :: General, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: anodelman, Assigned: anodelman)

References

Details

Attachments

(8 files, 4 obsolete files)

- 4.05 KB, text/plain (rcampbell: review+)
- 3.86 KB, text/plain (rcampbell: review+)
- 45.27 KB, patch (rcampbell: review+)
- 10.36 KB, patch (rcampbell: review+)
- 2.08 KB, patch (pavlov: review+)
- 19.13 KB, patch (rcampbell: review+)
- 7.10 KB, patch (bhearsum: review+)
- 4.94 KB, patch (rcampbell: review+)

The new mini machines have been (mostly) stable for a while and we are at the point of deciding what the production set of these machines will look like.  This is high priority because the current set of production blades is starting to fail.

As posted in the newsgroup, the set will look as follows.

- all machines throttled to run slower
- all machines using the new pageloader (effectively retire tp2)
- increase cycles through web page test set to 20 to reduce jitter
- platforms
     high priority: winxp, ubuntu linux, mac 10.4
     lower priority: vista, mac 10.5
- machine allocation
   3 machines per platform (all trunk)
   1 machine per platform (branch)
   1 machine per platform (trunk, run tests with no-chrome option)
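For a sense of scale, the allocation above can be tallied with a short sketch. The dictionary layout and the function name are illustrative assumptions and do not come from master.cfg; only the counts and platform names are taken from the plan.

```python
# Hypothetical tally of the proposed allocation; structure and names are
# illustrative only, the numbers come from the plan in this bug.
PLATFORMS = {
    "high": ["winxp", "ubuntu linux", "mac 10.4"],
    "lower": ["vista", "mac 10.5"],
}

# Machines per platform, by role, as listed under "machine allocation".
PER_PLATFORM = {"trunk": 3, "branch": 1, "trunk no-chrome": 1}


def machines_needed(platforms):
    """Return the total number of machines for the given list of platforms."""
    return len(platforms) * sum(PER_PLATFORM.values())


high_priority_total = machines_needed(PLATFORMS["high"])
overall_total = machines_needed(PLATFORMS["high"] + PLATFORMS["lower"])
print(high_priority_total, overall_total)
```

Under these assumptions, each platform needs five machines, so the high priority platforms alone account for fifteen.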

If this is the agreed-upon setup, we can move ahead with updating the buildbot configuration and the machine settings.
Stuart/Vlad are you r+ on this plan?
yep, r++++ would review again.
Attached patch: trimmed down master.cfg (obsolete)
This should trim down the boxes that we currently have up to be just the production machines - we can add back the other boxes as we allocate tasks for them.  This is also My First Buildbot Config File, so I expect there to be problems.

I'm also going to reduce the set of config files for the mini boxes to two - one with gfx for trunk and one without for branch, with everything running the pageloader.
Assignee: nobody → anodelman
Status: NEW → ASSIGNED
Attachment #286747 - Flags: review?(bhearsum)
Depends on: 401786
Alice, since ben's been seconded to buildworld, you should probably r?=me. Also, should use cvs diff -u8pN as mentioned in irc. Thanks!
Moving review request to rcampbell, since bhearsum is build busy.
Attachment #286747 - Attachment is obsolete: true
Attachment #286753 - Flags: review?(rcampbell)
Attachment #286747 - Flags: review?(bhearsum)
Attachment #286754 - Flags: review?(rcampbell)
Attachment #286755 - Flags: review?(rcampbell)
Depends on: 401613
Comment on attachment 286754 [details]
new config file for all mini trunk boxes

looks good.
Attachment #286754 - Flags: review?(rcampbell) → review+
Comment on attachment 286753 [details] [diff] [review]
better version of the previous patch

a little vertical whitespacing couldn't hurt between the different schedulers and around the comment lines.

indents could be better lined-up around line 551.

Minor quibbles aside, this is good to go.
Attachment #286753 - Flags: review?(rcampbell) → review+
Attachment #286755 - Flags: review?(rcampbell) → review+
Newer version of the patch after running through sanity checks.
Attachment #286753 - Attachment is obsolete: true
Attachment #287124 - Flags: review?(rcampbell)
Attachment #287124 - Flags: review?(rcampbell) → review+
Checking in master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/master.cfg,v  <--  master.cfg
new revision: 1.16; previous revision: 1.15
done
RCS file: /cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config,v
done
Checking in configs/sample.config;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config,v  <--  sample.config
initial revision: 1.1
done
Removing configs/sample.config.js;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config.js,v  <--  sample.config.js
new revision: delete; previous revision: 1.3
done
Removing configs/sample.config.js.nogfx;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config.js.nogfx,v  <--  sample.config.js.nogfx
new revision: delete; previous revision: 1.2
done
RCS file: /cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config.nogfx,v
done
Checking in configs/sample.config.nogfx;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config.nogfx,v  <--  sample.config.nogfx
initial revision: 1.1
done
Removing configs/sample.config.pageloader;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config.pageloader,v  <--  sample.config.pageloader
new revision: delete; previous revision: 1.3
done
Removing configs/sample.config.pageloader.nogfx;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config.pageloader.nogfx,v  <--  sample.config.pageloader.nogfx
new revision: delete; previous revision: 1.2
done
We've got 12 boxes up comprising the basic production set:

- 3 win trunk machines
- 1 win branch machine
- 3 ubuntu trunk machines
- 1 ubuntu branch machine
- 3 mac trunk machines
- 1 mac branch machine

All of these machines are throttled and they are running with 20 cycles through each active test.  They are currently reporting to graphs-stage.m.o until we can get a few runs out of each box to see if we are getting reasonable results.

We still need
- 1 win trunk no-chrome
- 1 ubuntu trunk no-chrome
- 1 mac trunk no-chrome

Along with sets of vista & mac 10.5 boxes.  
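The outstanding boxes can be derived by comparing what is running against the planned per-platform allocation. This is only a sketch: the data structures and the `missing_machines` helper are hypothetical, and the counts are copied from the status list above and the original plan.

```python
# Hypothetical comparison of planned vs. running machines.
# Planned roles per platform come from the original proposal in this bug.
PLANNED_ROLES = {"trunk": 3, "branch": 1, "trunk no-chrome": 1}

# Machines currently up, per the status list above.
RUNNING = {
    "win": {"trunk": 3, "branch": 1},
    "ubuntu": {"trunk": 3, "branch": 1},
    "mac": {"trunk": 3, "branch": 1},
}


def missing_machines(running, planned=PLANNED_ROLES):
    """Return {platform: {role: shortfall}} for roles below the planned count."""
    gaps = {}
    for platform, roles in running.items():
        for role, wanted in planned.items():
            short = wanted - roles.get(role, 0)
            if short > 0:
                gaps.setdefault(platform, {})[role] = short
    return gaps


for platform, roles in missing_machines(RUNNING).items():
    for role, count in roles.items():
        print(f"still needed: {count} {platform} {role}")
```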
We are seeing a lot of timeouts on the minis, so this patch increases the timeouts to 4 hours across the board.  It also reduces the number of cycles run in tgfx & tdhtml, since it is unclear whether doing 20 provides any benefit and the tests are running very slowly.

I'm also removing the tsvg test for now.  Until it can be run to completion more consistently we'll just have to skip it.
Attachment #287561 - Flags: review?(rcampbell)
Attachment #287561 - Flags: review?(rcampbell) → review+
Same as before, but also replacing qm-mini-ubuntu03 (which has suffered a disk failure) with qm-mini-ubuntu05.
Attachment #287561 - Attachment is obsolete: true
Attachment #287592 - Flags: review?(rcampbell)
Attachment #287592 - Flags: review?(rcampbell) → review+
Checking in master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/master.cfg,v  <--  master.cfg
new revision: 1.17; previous revision: 1.16
done
Checking in configs/sample.config;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config,v  <--  sample.config
new revision: 1.2; previous revision: 1.1
done
Checking in configs/sample.config.nogfx;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config.nogfx,v  <--  sample.config.nogfx
new revision: 1.2; previous revision: 1.1
done
We are still getting timeouts all over the place.  This patch reduces the cycles through the web page test down to 10.
Attachment #287777 - Flags: review?(pavlov)
Attachment #287777 - Attachment is patch: true
Attachment #287777 - Attachment mime type: application/octet-stream → text/plain
Attachment #287777 - Flags: review?(pavlov) → review+
Checking in sample.config;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config,v  <--  sample.config
new revision: 1.3; previous revision: 1.2
done
Checking in sample.config.nogfx;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/configs/sample.config.nogfx,v  <--  sample.config.nogfx
new revision: 1.3; previous revision: 1.2
done
Depends on: 403229
Depends on: 404593
Depends on: 406874
Depends on: 407056
Depends on: 407971
Adding configuration to master.cfg for vista machines.
Attachment #292688 - Flags: review?(rcampbell)
Comment on attachment 292688 [details] [diff] [review]
initial set up for vista machines

you should always use spaces instead of hard tabs.

lines 59, 67, 362, 370:
Replace Vista with WINNT 6.0 to be consistent with the other machines.

Otherwise good! Great patch, would review again!
Comment on attachment 292688 [details] [diff] [review]
initial set up for vista machines

and the requisite minus...
Attachment #292688 - Flags: review?(rcampbell) → review-
Just changing up the vista builder names as suggested.
Attachment #292688 - Attachment is obsolete: true
Attachment #292977 - Flags: review?(rcampbell)
Comment on attachment 292977 [details] [diff] [review]
initial set up for vista machines, take 2

this still has hard tabs in it which is a pain to deal with in any editor that isn't configured the same as yours. Please convert to spaces as per the original review. r+ assuming this is changed. Thanks!
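A quick pre-submit check for hard tabs might look like the sketch below. The helper names and the tab width of 4 are assumptions for illustration; the review does not state which indent width master.cfg uses.

```python
def find_hard_tabs(text):
    """Return 1-based line numbers of lines that contain a tab character."""
    return [n for n, line in enumerate(text.splitlines(), start=1) if "\t" in line]


def tabs_to_spaces(text, tabsize=4):
    """Expand every tab to spaces; tabsize=4 is an assumed indent width."""
    # str.expandtabs resets its column count at each newline, so it can be
    # applied to the whole file contents at once.
    return text.expandtabs(tabsize)


# Illustrative snippet only, not taken from the real master.cfg.
sample = "c = {}\n\tc['slaves'] = []\n    ok = True\n"

print(find_hard_tabs(sample))
cleaned = tabs_to_spaces(sample)
print(find_hard_tabs(cleaned))
```

Running the check on the patch before attaching it would flag the offending lines without depending on any one editor's settings.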
Attachment #292977 - Flags: review?(rcampbell) → review+
Oops - sorry about the tabs.  Pulled those out before I checked in.

Checking in master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/master.cfg,v  <--  master.cfg
new revision: 1.22; previous revision: 1.21
done
Depends on: 408383
Attachment #293889 - Flags: review?(bhearsum)
Attachment #293889 - Flags: review?(bhearsum) → review+
push vista/linux to production

Checking in master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/master.cfg,v  <--  master.cfg
new revision: 1.25; previous revision: 1.24
done
Status update time.  We have in production:

- 3 winxp trunk/1 winxp branch (all throttled)
- 3 vista trunk/1 vista branch (all throttled)
- 3 ubuntu trunk/1 ubuntu branch (all throttled)

In stage:
- 3 mac (tiger) trunk/1 mac (tiger) branch (all throttled)

The mac machines are waiting on some sort of reasonable plan to appropriately throttle them, or at least disable speedswitch in some fashion.  Once this is figured out we can also move on to creating leopard machines.

See updated graphs and results here: http://wiki.mozilla.org/Buildbot/Talos/Machines
We aren't going to block on our confusion about throttling, so we should push the non-throttled mac machines.
Attachment #297460 - Flags: review?(rcampbell)
Attachment #297460 - Flags: review?(rcampbell) → review+
Check in for mac push:

Checking in master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/testing/talos/perfmaster/master.cfg,v  <--  master.cfg
new revision: 1.29; previous revision: 1.28
done
I think that we can safely call this done.  We have:

- 3 winxp trunk, 1 winxp trunk nochrome, 1 winxp branch
- 3 mac trunk, 1 mac trunk nochrome, 1 mac branch
- 3 vista trunk, 1 vista trunk nochrome, 1 vista branch
- 3 linux trunk, 3 linux trunk nochrome, 1 linux branch

We do have three leopard machines (trunk) currently reporting to stage.  They can be moved to production once the high load on the graph server has been dealt with.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Mass move of Core:Testing bugs to mozilla.org:ReleaseEngineering. Filter on RelEngMassMove to ignore.
Component: Testing → Release Engineering
Product: Core → mozilla.org
QA Contact: testing → release
Version: unspecified → other
Product: mozilla.org → Release Engineering