l10n.mozilla-community.org was put offline, breaking many tools and workflows

RESOLVED FIXED

Status

Product: mozilla.org Graveyard
Component: Server Operations
Priority: --
Severity: critical
Status: RESOLVED FIXED
Reported: 4 years ago
Modified: 3 years ago

People

(Reporter: pascalc, Assigned: Aj)

(Reporter)

Description

4 years ago
Bug 829298 took several servers offline. One of them hosted my tools (http://l10n.mozilla-community.org/webdashboard/), which track the work on many critical web parts needed to ship key Firefox pages and Firefox OS.

It was well known that the server hosted http://l10n.mozilla-community.org/ and that this was an important tool (see https://bugzilla.mozilla.org/show_bug.cgi?id=829298#c7). The community had organized a migration path, but it seems the server was shut down without any notice. I don't even know whether Reed had finished rsyncing the data before the server was switched off...

Please make sure that all the data is backed up, and restore the service ASAP.

Marking as critical given the number of web localizers impacted (probably most of them). Other tools that pull data from the web dashboard via JSON/RSS (such as the main l10n dashboard at l10n.mozilla.org) are also experiencing bugs and difficulties.

Thanks
Pascal,

I'll make sure this is turned back on, but note that the bug has been open for ages and the due date is 3 days away. I apologise that this happened without any notice (I didn't expect that either), but we need to move out of there quickly.
Assignee: server-ops → server-ops-dcops
Component: Server Operations → Server Operations: DCOps
QA Contact: shyam → dmoore
We're getting power restored to the facility immediately, but we'll need to move quickly as we have to vacate within a matter of days. I'll update again when power is restored.
Power is restored. The SRE team will work on bringing the VM guests back online.
(Assignee)

Comment 4

4 years ago
As per IRC, the l10n VM was back online at ~11:05 am PDT. The other VM, konigseberg, was online much earlier.

The l10n VM failed to boot once the ESX host was back online. It seems the initramfs was not rebuilt for the new kernel, so the kernel could not find /dev/sda1.

We booted into a 2.6.x kernel (the 3.x kernels were bad) and an automatic fsck ran. After the fsck, we rebooted into 2.6.x recovery mode and the VM booted.
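
For anyone who wants to catch this kind of boot failure before a host is power-cycled, here is a minimal sketch of a check that lists installed kernels with no matching initramfs image. It is not what was run on the l10n VM. The function name `kernels_missing_initramfs` is hypothetical, and the file naming (`vmlinuz-<version>` paired with `initrd.img-<version>` or `initramfs-<version>.img`) is an assumption based on common Debian-style and RHEL-style layouts, since the distribution is not stated here.

```python
import os
import re

# Assumed naming conventions (not stated in this bug):
#   Debian/Ubuntu style: vmlinuz-<version> pairs with initrd.img-<version>
#   RHEL/CentOS style:   vmlinuz-<version> pairs with initramfs-<version>.img
KERNEL_RE = re.compile(r"^vmlinuz-(?P<version>.+)$")


def kernels_missing_initramfs(boot_files):
    """Return the kernel versions in boot_files that have no initramfs image."""
    names = set(boot_files)
    missing = []
    for name in sorted(names):
        match = KERNEL_RE.match(name)
        if not match:
            continue  # not a kernel image
        version = match.group("version")
        candidates = {f"initrd.img-{version}", f"initramfs-{version}.img"}
        if not candidates & names:
            missing.append(version)
    return missing


def main(boot_dir="/boot"):
    # Only inspect the directory if it exists on this machine.
    if not os.path.isdir(boot_dir):
        print(f"{boot_dir} not found, nothing to check")
        return
    for version in kernels_missing_initramfs(os.listdir(boot_dir)):
        print(f"kernel {version} has no initramfs image in {boot_dir}")


if __name__ == "__main__":
    main()
```

A kernel reported by this check would likely hit the same "cannot find root device" failure on the next reboot. The initramfs would need to be regenerated with the distribution's own tool before relying on that kernel.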
Assignee: server-ops-dcops → afernandez
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Component: Server Operations: DCOps → Server Operations
QA Contact: dmoore → shyam
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard