Closed Bug 1202781 Opened 9 years ago Closed 9 years ago

Need treeherder failover

Categories

(Infrastructure & Operations :: Change Requests, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: scabral, Unassigned)


Need to fail over the Treeherder production database to defragment it, as per bug 1190361. We will need 20 minutes of prep time and 5 minutes of actual database downtime for this.
Flags: cab-review?
Adding Pythian for informational purposes only, as they cannot yet access the load balancer to do the failover.
Hal - we would prefer to do this before the next TCW - can you socialize and see if that would be possible?
Flags: needinfo?(hwine)
Opening bug since there is nothing confidential in here.

On the Treeherder side I'm pretty flexible about this, most of our tasks will resume with no problem. Ideally the same applies to TaskCluster submissions to us - they _should_ retry, but someone will have to confirm.

It would be best to do this at a quite time of course, and to give the sheriffs a heads up.
Group: infra
s/quite/quiet/ even, and by that I mean lower push activity on the trees, so before the US is awake on a weekday etc.
From what I understand of mozilla-taskcluster, it will attempt to retry submitting to Treeherder. I haven't dug much into the code that handles the retry, but I believe I have seen it attempt to submit multiple times.
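For illustration only, here is a minimal sketch of what "retry until Treeherder is back" could look like. It is not mozilla-taskcluster's actual code, which nobody in this bug has inspected in detail. The function name `submit_with_retry`, the default delays, and the choice of `ConnectionError` as the retryable failure are all assumptions made for the example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def submit_with_retry(
    submit: Callable[[], T],
    max_attempts: int = 8,
    base_delay: float = 5.0,
    max_delay: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call submit() until it succeeds, backing off exponentially.

    Hypothetical helper: only ConnectionError is treated as retryable,
    and the wait doubles after each failure up to max_delay seconds.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last failure
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
    raise ValueError("max_attempts must be at least 1")


# Usage: a fake submission that fails three times, then succeeds.
# Waits are recorded instead of slept so the demo finishes instantly.
failures_left = 3
waits: List[float] = []


def fake_submit() -> str:
    global failures_left
    if failures_left > 0:
        failures_left -= 1
        raise ConnectionError("database is failing over")
    return "accepted"


result = submit_with_retry(fake_submit, sleep=waits.append)
print(result, waits)
```

With these assumed defaults, the seven waits between eight attempts add up to 395 seconds, which is longer than the 5 minutes of database downtime requested in this bug. Whether the real retry window is that long is exactly what would need confirming.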
There's no action releng needs to take if treeherder is down, so whatever time Ed & Sheeri set is fine.

Notification list should include releng, sheriffs, taskcluster, and treeherder teams.
Flags: needinfo?(hwine)
OK, other than whistlepig, I have:

email: release-engineering@lists.mozilla.org, sheriffs@mozilla.com, taskcluster-internal@mozilla.com

IRC: #releng #treeherder #moc

Hal - Any other notifications I've missed?

Ed - "before US wakes up" includes east coast? What about at night? Does 7 or 8 pm Pacific work? (that'd be 10 or 11 pm Eastern, 2 or 3 am UTC, if I'm doing my time zone math properly). I'm thinking tomorrow evening (Thu, 7 or 8 pm Pacific) as a good time, that way we have Friday in case things go awry.
Flags: needinfo?(hwine)
Flags: needinfo?(emorley)
(In reply to Sheeri Cabral [:sheeri] from comment #7)
> OK, other than whistlepig,I have:
> 
> email: release-engineering@lists.mozilla.org, sheriffs@mozilla.com,
> taskcluster-internal@mozilla.com

That's sheriffs@m.o <= not c

Add dev-tree-management@lists.m.o (picks up the Treeherder team)
Flags: needinfo?(hwine)
Waiting on final details before I approve from CAB - please needinfo me when set
OK, thanks Hal, and waiting on Ed for confirmation of when it's OK.
(In reply to Sheeri Cabral [:sheeri] from comment #7)
> Ed - "before US wakes up" includes east coast? What about at night? Does 7
> or 8 pm Pacific work? (that'd be 10 or 11 pm Eastern, 2 or 3 am UTC, if I'm
> doing my time zone math properly). I'm thinking tomorrow evening (Thu, 7 or
> 8 pm Pacific) as a good time, that way we have Friday in case things go awry.

Yeah that probably works too, though I'll be asleep then, so if you wanted someone from Treeherder to be around, you'll need to ask :camd :-)
Flags: needinfo?(emorley)
OK, will ping :camd. How about 7 or 8 pm Pacific on Monday, Sep 14th?
Flags: needinfo?(cdawson)
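The time zone math from comment 7 can be double-checked against the proposed Monday window with a short script. This is only a sketch: the year 2015 is an assumption (the bug does not state it, but Sep 14 falls on a Monday that year), and the helper name `convert_window` is made up for the example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

PACIFIC = ZoneInfo("America/Los_Angeles")
EASTERN = ZoneInfo("America/New_York")


def convert_window(year, month, day, pacific_hours):
    """Return (Pacific, Eastern, UTC) datetimes for each Pacific start hour."""
    rows = []
    for hour in pacific_hours:
        start = datetime(year, month, day, hour, tzinfo=PACIFIC)
        rows.append(
            (start, start.astimezone(EASTERN), start.astimezone(timezone.utc))
        )
    return rows


# Assumed date: Monday, Sep 14, 2015, starting at 7 pm or 8 pm Pacific.
for pacific, eastern, utc in convert_window(2015, 9, 14, (19, 20)):
    fmt = "%a %b %d %H:%M %Z"
    print(pacific.strftime(fmt), "|", eastern.strftime(fmt), "|", utc.strftime(fmt))
```

Under that assumption the result agrees with comment 7: 7 or 8 pm Pacific is 10 or 11 pm Eastern, and 2 or 3 am UTC on the following day (Tuesday).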
Sure, I can be around. Please ping me that afternoon so I don't forget. :) I'm usually in the midst of bedtime for the little ones around then. But I can be available. I've got IRC on my phone and I'll be sure to check that.
Flags: needinfo?(cdawson)
Flags: cab-review? → cab-review+
Treeherder successfully failed over in the past 10 minutes. philor confirmed things were working.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Change Request: --- → approved
Flags: cab-review+