Closed Bug 584527 Opened 14 years ago Closed 14 years ago

Move some production build slaves to try pool

Categories

(Release Engineering :: General, defect, P2)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: armenzg)

Details

Attachments

(2 files, 1 obsolete file)

We currently have very bad wait times on the try server while we have excellent wait times on the production builders. If we move some machines from the production pool to the try pool we can improve these wait times. We have to be careful not to affect the wait times on production.

I believe we don't need to change the hostnames, just clobber the builds and replace the production keys with try keys.

TODO: research the lower-than-usual number of IX slaves. Once I figure out the lower number of IX machines I will update the patch.

I did a quick scan and there is room for movement:

CURRENT DISTRIBUTION --- production/mobile/try:
* linux-VMs 44/13/30
* linux-IXs 9/8/0
* win32-VMs 55/4/36
* win32-IXs 18/0/0

NEW DISTRIBUTION --- production/mobile/try/change:
* linux-VMs 40/13/34 ( -4, 0, +4)
* linux-IXs 7/7/2 ( -2, -1, +3)
* win32-VMs 45/4/46 (-10, 0, +10)
* win32-IXs 14/0/4 ( -4, 0, +4)

NOTE: There are currently 4 Linux IX machines and 4 Windows IX machines on pm waiting for beta3.
NOTE2: The mobile pool is currently separated from the production pools.
NOTE3: There might be fewer slaves listed than there should be, since some slaves could have been rebooting while I was counting and others might have been loaned or might be missing.

DATA:
#####
pm01: linux-slaves-[01-13,29-34,48-49] - 21 VMs
pm01: linux-ix-[03,12,14,19] - 4 IXs
pm03: linux-slave{14,27} - 13 VMs
pm03: linux-ix-[4,6-8,21] - 5 IXs
mobile: linux-slaves-[28,35-40,43-47,50] - 13 VMs
mobile: linux-ix-slaves-[2,9-10,13,15-18] - 8 IXs
pm01: win32-slave-[12-43,54,56-59] - 37 VMs
pm01: win32-IX-slave-[10,12,14,16,20-21,24-25] - 8 IXs
pm03: win32-slave-[01-11,44,47-51,55] - 18 VMs
pm03: win32-IX-slave-[3-4,6-9,15,17-18,22] - 10 IXs
mobile: win32-slave-[45-46,52-53] - 4 VMs
mobile: win32-IX-slave - 0 IXs
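The per-master counts in the DATA list above were tallied by hand from bracketed range notation such as [12-43,54,56-59]. As a rough cross-check, a minimal Python sketch along the following lines could recount them. The function names expand_ranges and count_hosts are hypothetical helpers invented for illustration, not part of any RelEng tooling, and the sketch assumes every entry uses the square-bracket, comma-and-dash form.

```python
import re


def expand_ranges(spec):
    """Expand a spec such as "01-13,29-34,48-49" into a list of ints.

    Hypothetical helper: assumes comma-separated items, each either a
    single number or an inclusive "start-end" range.
    """
    numbers = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-", 1)
            numbers.extend(range(int(start), int(end) + 1))
        else:
            numbers.append(int(part))
    return numbers


def count_hosts(entry):
    """Count the hosts named by an entry such as "win32-slave-[12-43,54,56-59]"."""
    match = re.search(r"\[([^\]]*)\]", entry)
    if match is None:
        raise ValueError("no bracketed range found in %r" % entry)
    return len(expand_ranges(match.group(1)))


if __name__ == "__main__":
    # Entries copied from the DATA list in this comment.
    for entry in (
        "linux-slaves-[01-13,29-34,48-49]",
        "linux-ix-[4,6-8,21]",
        "win32-slave-[12-43,54,56-59]",
    ):
        print(entry, count_hosts(entry))
```

Entries that do not use square brackets (for example the pm03 linux-slave line) are deliberately not handled and raise ValueError, so they still have to be checked by hand.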
Attachment #462959 - Flags: feedback?(ccooper)
Comment on attachment 462959 [details] [diff] [review] Move slaves from production pool to try pool These changes seem to match up with your proposal, which I support.
Attachment #462959 - Flags: feedback?(ccooper) → feedback+
(as landed)

After a backout and an awful deployment I got this landed:
http://hg.mozilla.org/build/buildbot-configs/rev/ad571f121b2e

This is the list of slaves that have been moved around:
* moz2-linux-slave47
* moz2-linux-slave48
* moz2-linux-slave49
* moz2-linux-slave50
* mv-moz2-linux-ix-slave22
* mv-moz2-linux-ix-slave23

NOTE: To replace the keys on Linux I had to log in as root to unmount the .ssh keys under scratchbox (which was mounted twice).
Attachment #462959 - Attachment is obsolete: true
Attachment #464153 - Flags: checked-in+
These ones have been moved as well:
* win32-slave50
* win32-slave51
* win32-slave52
* win32-slave53
* win32-slave54
* win32-slave55
* win32-slave56
* win32-slave57
* win32-slave58
* win32-slave59
* mw32-ix-slave22
* mw32-ix-slave23
* mw32-ix-slave24
* mw32-ix-slave25

This completes the transition of slaves.

The summary is that we have moved the following to the try pool:
* 4 linux VMs
* 2 linux IXs
* 10 win32 VMs
* 4 win32 IXs
This patch removes mv-moz2-linux-ix-slave24 from all calculations since it had been repurposed as the win64 ref machine.

This patch moves the following to the try pool:
* 5 IX linux slaves (guess who is the new worst offender on the try server)
* 1 IX win32 slave

I have checked the wait times from yesterday and linux nailed 100%. 3 of the IX machines will come from the mobile master.

Win32 did not nail it yesterday. I moved 3 IX machines that were abandoned on staging. Out of these 3 machines I want one on the try pool.

NOTE: This plan will move forward like this if the wait times for tomorrow are similar.

After this 2nd move:
=============
pm{01,03}/mobile/try
* linux IXs 8 / 7 / 6 (-2 , -3, +5)
* win32 IXs 19 / 0 / 5 (+2*, 0, +1)
(NOTE: the asterisk indicates the 2 IXs that were on staging)
Attachment #464527 - Flags: review?(ccooper)
Attachment #464527 - Flags: review?(ccooper)
Comment on attachment 464527 [details] [diff] [review] move few more slaves Unlike my comment, my patch says that I am moving 2 Win32 IX machines. If wait times are good, that is what I will do. nthomas: coop is not around; could you please review this?
Attachment #464527 - Flags: review?(nrthomas)
FTR, when I copied .ssh/ for the Win32 slaves I copied it from a Linux try slave, since I could not scp from a Windows try slave.

This caused uploadsymbols to fail since build.m.o and the slave had never talked before.

I connected to all Win32 slaves that had been moved and typed this:
ssh -i .ssh/trybld_dsa trybld@build.mozilla.org
and accepted the prompt.

We should pay attention to this if we move more slaves.
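One way to spot this before uploadsymbols fails might be to check whether a slave's known_hosts file already lists build.mozilla.org. The Python sketch below is only an illustration: has_plain_host_entry is a hypothetical helper, it handles only plain (unhashed) entries in the standard OpenSSH known_hosts layout, and it ignores marker lines, wildcard patterns and [host]:port forms.

```python
def has_plain_host_entry(known_hosts_text, hostname):
    """Return True if an unhashed known_hosts entry lists `hostname`.

    Hypothetical helper for illustration. Assumes the usual layout
    "host1,host2 keytype base64key [comment]". Hashed entries (starting
    with "|1|") cannot be compared as text and are skipped, as are
    marker lines such as "@revoked".
    """
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("@"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a complete "hosts keytype key" entry
        hosts_field = fields[0]
        if hosts_field.startswith("|1|"):
            continue  # hashed hostname, cannot be checked here
        if hostname in hosts_field.split(","):
            return True
    return False


if __name__ == "__main__":
    # Made-up sample content; the key material is a placeholder.
    sample = "build.mozilla.org ssh-rsa AAAAplaceholder\n"
    print(has_plain_host_entry(sample, "build.mozilla.org"))
```

A False result only means no plain entry was found; with hashed entries the interactive ssh connection described above is still the reliable check.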
(In reply to comment #6)
> FTR when I copied .ssh/ for the Win32 slaves I copied them from a linux try
> slaves since I could not scp from a windows try slave.
>
> This caused uploadsymbols to fail since build.m.o and they slave had never
> talked before.
>
> I connected to all Win32 slaves that had been moved and typed this:
> ssh -i .ssh/trybld_dsa trybld@build.mozilla.org
> and accepted the prompt.
>
> We should pay attention to this if we move more slaves.

Is there a ~/.ssh/config file? If so, can you make sure any paths in it are correct? Generally they aren't if you copy the config from Linux to Windows.

In the future, a safer way to do this is:
- scp Windows keys from an existing slave somewhere
- scp those keys onto new slaves
Comment on attachment 464527 [details] [diff] [review] move few more slaves I don't think we should deprive the moz2 pool of any more ix machines. Yesterday there were clobbers set and we kept getting builds on VMs, hence very slow results for win32.
Attachment #464527 - Flags: review?(nrthomas) → review-
It works for me.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering