Bug 697374 (Closed) · Opened 13 years ago · Closed 13 years ago

Linux 32bit tests not running

Categories

(Release Engineering :: General, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Unassigned)

References

Details

Attachments

(4 files, 1 obsolete file)

We haven't started a job on a talos-r3-fed-NNN slave since 2011-10-25 12:52:18 PDT. Ben had a bunch of trouble doing reconfigs on the three masters around then.

I've done a graceful shutdown on buildbot-master06 and that hasn't helped. It still thinks the fedora slaves are offline. Something must be screwed up in the master state, or the four new twigs have pushed us over some threshold.
Severity: normal → blocker
Changing the severity to BLOCKER, since the tree is now closed for this issue.
All we know so far is that the talos-r3-fed-NNN slaves seem to be connected to the master, but aren't getting any jobs. Restarting the masters hasn't helped. The masters also show strange disconnect messages in their logs.
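For reference, here is a minimal sketch (not from this bug) of how one might cross-check which fedora slaves the master thinks are connected, assuming the buildbot 0.8 JSON status view is enabled on the master's web port; the URL below is a hypothetical placeholder:

# Sketch: list which talos-r3-fed-NNN slaves the master reports as
# connected and whether they have running builds. Assumes the buildbot
# 0.8 /json/slaves status endpoint; the master URL is hypothetical.
import json
from urllib.request import urlopen

MASTER_STATUS = "http://buildbot-master06.example.com:8012/json/slaves"  # hypothetical

with urlopen(MASTER_STATUS) as resp:
    slaves = json.load(resp)

for name, info in sorted(slaves.items()):
    if not name.startswith("talos-r3-fed-"):
        continue
    state = "connected" if info.get("connected") else "DISCONNECTED"
    running = len(info.get("runningBuilds", []))
    print(f"{name}: {state}, {running} running build(s)")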
Attached patch masters json for bm17/18 (obsolete) — Splinter Review
Attachment #569660 - Flags: review?(rail)
In case we want to set up new masters for linux, I've created these two patches.

To set up the new masters:
1) land diff to production-masters.json (see the sanity-check sketch after this list)
2) land puppet manifests
3) update master-puppet1.build.m.o
4) sync bm17 with puppet
5) set up puppet on bm18 (it's a fresh VM)
6) add new pool and masters to slavealloc
7) start moving linux32 slaves to new pool, reboot machines as required
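Before step 1 lands, a sanity check along these lines could catch copy-paste mistakes in production-masters.json. This is a hedged sketch, not part of the attached patches; it assumes the file is a JSON array of master entries that may carry the "basedir" and "tools_dir" keys shown in the review below:

# Sketch: flag entries whose tools_dir does not live under their basedir,
# the kind of s/windows/linux/ copy-paste mismatch the review below catches.
import json

with open("production-masters.json") as f:
    masters = json.load(f)

for entry in masters:
    basedir = entry.get("basedir")
    tools_dir = entry.get("tools_dir")
    if basedir and tools_dir and not tools_dir.startswith(basedir.rstrip("/") + "/"):
        print("possible mismatch: basedir=%s tools_dir=%s" % (basedir, tools_dir))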
Attachment #569659 - Flags: review?(rail) → review+
Comment on attachment 569660 [details] [diff] [review]
masters json for bm17/18

Review of attachment 569660 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with s/windows/linux/

::: buildfarm/maintenance/production-masters.json
@@ +335,4 @@
>      "tools_dir": "/builds/buildbot/tests1-windows/tools"
>    },
>    {
> +    "basedir": "/builds/buildbot/tests1-windows",

s/windows/linux/
Attachment #569660 - Flags: review?(rail) → review+
now with less bustage!
Attachment #569660 - Attachment is obsolete: true
Attachment #569668 - Flags: review?(rail)
Attachment #569668 - Flags: review?(rail) → review+
Attachment #569659 - Flags: checked-in+
Attachment #569668 - Flags: checked-in+
I set up slavealloc for these by:
* Adding a new pool ("tests-scl1-linux")
* Adding two new masters ("bm17-tests1-linux" and "bm18-tests1-linux"), both disabled for now.
Depends on: 697437
Catlee discovered that we're hitting an upper limit on the number of builders we can have. This patch reduces the number per OS by ~96 by turning off release builders where we don't need them.
Attachment #569686 - Flags: review?(catlee)
Attachment #569686 - Flags: review?(catlee) → review+
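For illustration only, the shape of that change is roughly the following (a hedged sketch, not the attached patch; the builder dicts, the "platform" key, and the NO_RELEASE_PLATFORMS set are hypothetical stand-ins for the real buildbot config):

# Sketch: drop release builders on platforms that don't need them so the
# per-master builder count stays under the limit being hit.
NO_RELEASE_PLATFORMS = {"linux-debug", "linux64-debug"}  # illustrative only

def prune_release_builders(builders):
    """Return the builder list minus release builders on platforms that don't need them."""
    kept = []
    for b in builders:
        is_release = "release" in b["name"]
        if is_release and b.get("platform") in NO_RELEASE_PLATFORMS:
            continue
        kept.append(b)
    return kept

# e.g., in a master config: c['builders'] = prune_release_builders(c['builders'])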
Attachment #569689 - Flags: review?(bhearsum) → review+
Attachment #569689 - Flags: checked-in+
My understanding is that this would leave the other non-tegra test masters for doing mac testers?  Would this change resolve bug 696959, in that we have two more masters taking the load?
(In reply to John Ford [:jhford] from comment #12)
> My understanding is that this would leave the other non-tegra test masters
> for doing mac testers?  Would this change resolve bug 696959, in that we
> have two more masters taking the load?

Let's take this offline / to another bug.

This problem is fixed now.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering