Status

--
major
RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: mburns, Assigned: mburns)

Tracking

Details

(Whiteboard: [buildduty][outage][treeclosure])

(Assignee)

Description

6 years ago
~15:34 Erin Little emailed alerting of a power outage in Mountain View Office
[15:26:04] <arr> mburns: we're seeing a lot of alerts that indicate connection loss between scl1 and mtv1
(Assignee)

Updated

6 years ago
Severity: normal → major
(Assignee)

Comment 1

6 years ago
Latest word is that the outage will last for 2 more hours.

The AC in the IDF on 2nd floor appears to have died in the process. rbryce went to MTV1 to help triage and resolve.

Comment 2

6 years ago
All of df301 is dark which includes all core networking and some tegras.
Group: mozilla-corporation-confidential
(Assignee)

Comment 3

6 years ago
[16:59:12] <@rbryce> latest update from WPR-- power is still on the 2nd floor but A/C is off. Doors open and the building maint just brought a mobile unit
[16:59:27] <@rbryce> Floor 3 is still powerless,

Comment 4

6 years ago
Why does this need to be Moco conf?

We really need to start not closing bugs at the drop of a hat; whilst only seemingly minor, it's things like this that cause unnecessary rifts between employees and non-employee contributors...

Comment 5

6 years ago
(In reply to Ed Morley [:edmorley UTC+1] from comment #4)
> Why does this need to be Moco conf?
> 
> We really need to start not closing bugs at the drop of a hat; whilst only
> seemingly minor, it's things like this that cause unnecessary rifts between
> employees and non-employee contributors...

(Given that this bug is being linked on IRC and non-employees are understandably frustrated at not being able to access it, to get an ETA on the tree reopening etc)

Comment 6

6 years ago
(In reply to Ed Morley [:edmorley UTC+1] from comment #4)
> We really need to start not closing bugs at the drop of a hat; whilst only
> seemingly minor, it's things like this that cause unnecessary rifts between
> employees and non-employee contributors...

We can talk offline about this if you'd like, but a lot of IT related bugs are necessarily closed by default.  Call it habit.  I will open this one up.
Group: mozilla-corporation-confidential

Comment 7

6 years ago
(In reply to Corey Shields [:cshields] from comment #6)
> We can talk offline about this if you'd like, but a lot of IT related bugs
> are necessarily closed by default.  Call it habit.  I will open this one up.

Thank you :-)

[It was more that (a) this was closed a few comments in, rather than by default; and (b) this affected people outside of IT, due to the tree closure.]

Comment 8

6 years ago
trees closed ~3:30pm PDT
Whiteboard: [buildduty][outage][treeclosure]
(Assignee)

Comment 9

6 years ago
Erin Little reports power has been restored.
Status: NEW → ASSIGNED
(Assignee)

Comment 10

6 years ago
The outage was caused by construction on a different floor of the building. Servers are coming back up now.
(Assignee)

Comment 11

6 years ago
[18:28:37] <@ravi> so far all the netops stuff is looking 5x5
[18:32:53] <@justdave> and ringring is up

...
[18:38:22] <nagios-scl1> kvm2.build.mtv1 is UP: PING OK
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Assignee)

Comment 12

6 years ago
Oops, should not have closed this out.

Note: the Tree is still closed. IT is working on bringing everything back online.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
nagios is showing green for releng services, releng is verifying.

We still don't have building AC but should have temp cooling hooked up in the machine room in 30-35 minutes (we already have spot coolers in the office areas).  Maintenance should be on the roof to fix the building AC sometime tonight or tomorrow morning.
Trees reopened at 8:07pm PT
We now have sufficient (yet temporary) AC cooling in the mtv1 server room.  Everything is back to operational now and I'm closing this out.
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
We also can confirm that Haxxor and 2nd floor data room chillers and equipment are powered on and operating as expected.

Updated

6 years ago
Blocks: 796012
Product: mozilla.org → mozilla.org Graveyard