Closed
Bug 676055
Opened 14 years ago
Closed 14 years ago
Test buildbot master migration during next downtime
Categories
(Release Engineering :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: coop, Unassigned)
Details
(Whiteboard: [buildmasters][buildduty])
One of the things we discussed at the releng work week was building enough extra master capacity that we could migrate slaves from a busy master to an empty one. We could then stop the original master, update the code, and start it back up. This would let us stop doing reconfigs, thereby avoiding some of the problems and slowdowns associated with them.
We talked about doubling the number of masters we currently have. After discussing with Dustin, I think the easiest way to do this would be to create a duplicate master on every master host. However, we're unsure whether migrating slaves between two loaded masters running on the same host is feasible; it's untested, and load may spike. The first step should be to create a duplicate master on one already-loaded test master host, and then attempt a trial migration during the next downtime window to see if and how it works. A rough sketch of what the per-slave migration step could look like follows below.
If that's successful, we could do the same for all our masters, and even think about writing tools to automate switching masters on the same host.
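For illustration only, here is a minimal sketch of what the per-slave migration step could look like. Everything in it is an assumption rather than something taken from this bug: the slave basedir (/builds/slave), the duplicate master's slave port (9011), the example hostnames, the stock buildbot.tac variable names, and the use of the buildslave restart command. A real tool would also want to wait for any running job to finish before restarting the slave.

#!/usr/bin/env python
"""Hypothetical helper for the trial migration: repoint a slave's
buildbot.tac at the duplicate master (same host, different slave port)
and restart the slave so it reconnects. Paths, port, and hostnames
below are placeholders, not values from this bug."""

import subprocess

SLAVE_BASEDIR = "/builds/slave"   # assumed slave install location
NEW_SLAVE_PORT = 9011             # assumed slave port of the duplicate master


def migrate_slave(slave_host):
    """Point one slave at the duplicate master and restart it."""
    remote_cmd = (
        # Rewrite the slave port in buildbot.tac; the master host stays the
        # same because the duplicate master runs on the same machine.
        "sed -i 's/^port = .*/port = %d/' %s/buildbot.tac && "
        # Restart so the slave reconnects to the new master. A real tool
        # would wait for any running job to finish before doing this.
        "buildslave restart %s"
        % (NEW_SLAVE_PORT, SLAVE_BASEDIR, SLAVE_BASEDIR)
    )
    subprocess.check_call(["ssh", slave_host, remote_cmd])


if __name__ == "__main__":
    # Trial run against a couple of slaves during the downtime window.
    for host in ["slave-001.example.com", "slave-002.example.com"]:
        migrate_slave(host)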
Comment 1•14 years ago
(In reply to comment #0)
> We talked about doubling the number of masters we currently have. After
> discussing with Dustin, I think the easiest way to do this would be to
> create a duplicate master on every master host.
I recall we were worried about having enough memory on the test masters in this scenario. Was that part of your discussion?
Reporter
Comment 2•14 years ago
(In reply to comment #1)
> I recall we were worried about having enough memory on the test masters in
> this scenario. Was that part of your discussion?
Yes, that's the crux of it, actually.
The only way to figure out whether this will work is to try it on a single heavily-loaded master, which is also why we want to try it during a downtime.
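As a rough, purely illustrative way to gauge the memory question ahead of the downtime test, one could compare the running master's resident size against the free memory on its host. The basedir path below and the RSS-versus-free heuristic are assumptions for the sketch, not values or procedures from this bug.

import subprocess

# Assumed master basedir; the real path on the test master may differ.
MASTER_BASEDIR = "/builds/buildbot/master"


def master_rss_mb(basedir=MASTER_BASEDIR):
    """Resident memory of the running master process, in MB (Linux)."""
    pid = open(basedir + "/twistd.pid").read().strip()
    rss_kb = int(subprocess.check_output(["ps", "-o", "rss=", "-p", pid]))
    return rss_kb // 1024


def free_mem_mb():
    """Free plus cached memory on the host, in MB, from /proc/meminfo."""
    fields = {}
    for line in open("/proc/meminfo"):
        key, value = line.split(":", 1)
        fields[key] = int(value.split()[0])  # values are reported in kB
    return (fields["MemFree"] + fields.get("Cached", 0)) // 1024


if __name__ == "__main__":
    rss, free = master_rss_mb(), free_mem_mb()
    print("master RSS: %d MB, free(+cached): %d MB" % (rss, free))
    if free < rss:
        print("probably not enough headroom for a duplicate master here")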
Comment 4•14 years ago
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #3)
> Marking it so we see it for next downtime.
This work does not fit into today's downtime. I am going to leave the bug in its current state so that it gets picked up during the next downtime.
Comment 5•14 years ago
I re-read this bug and it does not seem that we are ready to do anything yet.
I probably confused this with the KVM auto-balancing that we wanted with ServerOps.
Flags: needs-treeclosure?
Updated•14 years ago
Whiteboard: [buildmasters][buildduty] → [buildmasters][buildduty][triagefollowup]
Reporter
Comment 6•14 years ago
Reconfigs are less painful now, and we have plans to move away from buildbot anyway.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
Whiteboard: [buildmasters][buildduty][triagefollowup] → [buildmasters][buildduty]
Assignee
Updated•12 years ago
Product: mozilla.org → Release Engineering