Status

Infrastructure & Operations
Change Requests
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: sheeri, Unassigned)

Tracking

Details

(Reporter)

Description

3 years ago
Need to fail over the Treeherder production database to defragment it, as per bug 1190361. We will need 20 minutes of prep time and 5 minutes of actual database downtime for this.
Flags: cab-review?
(Reporter)

Comment 1

3 years ago
Adding pythian for informational purposes only, as they cannot yet access the load balancer to do the failover.
Hal - we would prefer to do this before the next TCW - can you socialize this and see if that would be possible?
Flags: needinfo?(hwine)

Comment 3

3 years ago
Opening bug since there is nothing confidential in here.

On the Treeherder side I'm pretty flexible about this; most of our tasks will resume with no problem. Ideally the same applies to TaskCluster submissions to us - they _should_ retry, but someone will have to confirm.

It would be best to do this at a quite time of course, and to give the sheriffs a heads up.
Group: infra

Comment 4

3 years ago
s/quite/quiet/ even, and by that I mean lower push activity on the trees, so before the US is awake on a weekday etc.

Comment 5

3 years ago
From what I understand of mozilla-taskcluster, it will attempt to retry submitting to Treeherder. I haven't dug much into the code that handles the retry, but I believe I have seen it attempt to submit multiple times.
There's no action releng needs to take if treeherder is down, so whatever time Ed & Sheeri set is fine.
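
For illustration only, a client that retries a submission through a short outage could look roughly like the sketch below. This is not the mozilla-taskcluster code, which I have not inspected; the function name `submit_with_retry`, its parameters, and the choice of exponential backoff are all assumptions made for the example.

```python
import time


def submit_with_retry(submit, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call submit() until it succeeds or the attempts run out.

    Hypothetical helper: `submit` is any zero-argument callable that raises
    ConnectionError while the service is unreachable. The wait between
    attempts doubles each time (base_delay, 2 * base_delay, ...).
    """
    for attempt in range(attempts):
        try:
            return submit()
        except ConnectionError:
            # Out of attempts: let the caller see the failure.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_submit():
        # Simulate a service that is down for the first two attempts.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("service unavailable")
        return "accepted"

    print(submit_with_retry(flaky_submit, base_delay=0.1))
```

With the default five attempts and a one-second base delay, such a client would keep trying for about 15 seconds, so whether it rides out a 5 minute downtime depends entirely on the real retry settings, which someone would still need to confirm.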

Notification list should include releng, sheriffs, taskcluster, and treeherder teams.
Flags: needinfo?(hwine)
(Reporter)

Comment 7

3 years ago
OK, other than whistlepig, I have:

email: release-engineering@lists.mozilla.org, sheriffs@mozilla.com, taskcluster-internal@mozilla.com

IRC: #releng #treeherder #moc

Hal - Any other notifications I've missed? 

Ed - "before US wakes up" includes east coast? What about at night? Does 7 or 8 pm Pacific work? (that'd be 10 or 11 pm Eastern, 2 or 3 am UTC, if I'm doing my time zone math properly). I'm thinking tomorrow evening (Thu, 7 or 8 pm Pacific) as a good time, that way we have Friday in case things go awry.
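
The time zone math above can be double-checked with a short script. This is a minimal sketch, assuming Python 3.9+ and its standard `zoneinfo` module; the helper name `window_in_zones` is made up for the example, and the calendar date (including the year) is an assumption, since any September date with US daylight saving time in effect gives the same offsets.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def window_in_zones(naive_local, source_zone, target_zones):
    """Interpret a naive datetime in source_zone and convert it to each target zone."""
    aware = naive_local.replace(tzinfo=ZoneInfo(source_zone))
    return {zone: aware.astimezone(ZoneInfo(zone)) for zone in target_zones}


if __name__ == "__main__":
    # Assumed date: a Thursday in September; the year is not given in this bug.
    for hour in (19, 20):  # 7 pm and 8 pm Pacific
        start = datetime(2015, 9, 10, hour, 0)
        converted = window_in_zones(
            start, "America/Los_Angeles", ["America/New_York", "UTC"]
        )
        for zone, when in converted.items():
            print(f"{hour}:00 Pacific -> {when:%a %H:%M} {zone}")
```

Note that the UTC result falls on the following calendar day, which matters when announcing the window to people outside the US.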
Flags: needinfo?(hwine)
Flags: needinfo?(emorley)
(In reply to Sheeri Cabral [:sheeri] from comment #7)
> OK, other than whistlepig, I have:
> 
> email: release-engineering@lists.mozilla.org, sheriffs@mozilla.com,
> taskcluster-internal@mozilla.com

That's sheriffs@m.o <= not c

Add dev-tree-management@lists.m.o (picks up the Treeherder team).
Flags: needinfo?(hwine)
Waiting on final details before I approve from CAB - please needinfo me when set
(Reporter)

Comment 10

3 years ago
OK, thanks Hal, and waiting on Ed for confirmation of when it's OK.
(In reply to Sheeri Cabral [:sheeri] from comment #7)
> Ed - "before US wakes up" includes east coast? What about at night? Does 7
> or 8 pm Pacific work? (that'd be 10 or 11 pm Eastern, 2 or 3 am UTC, if I'm
> doing my time zone math properly). I'm thinking tomorrow evening (Thu, 7 or
> 8 pm Pacific) as a good time, that way we have Friday in case things go awry.

Yeah that probably works too, though I'll be asleep then, so if you wanted someone from Treeherder to be around, you'll need to ask :camd :-)
Flags: needinfo?(emorley)
(Reporter)

Comment 12

3 years ago
OK, will ping :camd. How about 7 or 8 pm Pacific on Monday, Sep 14th?
Flags: needinfo?(cdawson)
Sure, I can be around. Please ping me that afternoon so I don't forget. :) I'm usually in the midst of bedtime for the little ones around then, but I can be available. I've got IRC on my phone and I'll be sure to check that.
Flags: needinfo?(cdawson)
Flags: cab-review? → cab-review+
(Reporter)

Comment 14

3 years ago
Treeherder successfully failed over in the past 10 minutes. philor confirmed things were working.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED

Updated

3 years ago
Cab Review: --- → approved
Flags: cab-review+