Status

RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: fubar, Assigned: fubar)

Tracking

Trunk

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Description

2 years ago
The Seamicro chassis that we've been working to migrate off of, because it was a ticking time bomb, went off tonight. Although we've added a number of new nodes to the Elasticsearch cluster, most of the data was still on the Seamicro, or at least enough of the shards for important indices to matter. That seems to be partly due to the ES templating not behaving as expected, so shards for numerous indices weren't able to route to the new hardware. /sadtrombone
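
For anyone wanting to check where shards actually live, here is a minimal sketch (not what was run during this incident) that summarizes the plain-text output of Elasticsearch's `GET _cat/shards` endpoint. It assumes the default column order (index, shard, prirep, state, docs, store, ip, node); the function names, index names, and node names are all hypothetical.

```python
from collections import Counter


def _started_rows(cat_shards_text):
    """Yield (index, node) for every STARTED shard in `_cat/shards` output."""
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Assumed default columns: index shard prirep state docs store ip node.
        # UNASSIGNED rows have fewer columns; RELOCATING rows are skipped too.
        if len(fields) < 8 or fields[3] != "STARTED":
            continue
        # Node names may contain spaces, so join everything after the ip column.
        yield fields[0], " ".join(fields[7:])


def shards_per_node(cat_shards_text):
    """Count STARTED shards hosted on each node."""
    return Counter(node for _, node in _started_rows(cat_shards_text))


def indices_confined_to(cat_shards_text, old_nodes):
    """Return indices whose STARTED shards all sit on the given (old) nodes."""
    old = set(old_nodes)
    nodes_by_index = {}
    for index, node in _started_rows(cat_shards_text):
        nodes_by_index.setdefault(index, set()).add(node)
    return sorted(i for i, nodes in nodes_by_index.items() if nodes <= old)


# Hypothetical sample; real output would come from the cluster itself.
SAMPLE = """\
dxr_a 0 p STARTED 100 1.2mb 10.0.0.1 seamicro-1
dxr_a 0 r STARTED 100 1.2mb 10.0.0.2 seamicro-2
dxr_b 0 p STARTED 50 600kb 10.0.0.1 seamicro-1
dxr_b 0 r STARTED 50 600kb 10.0.0.9 new-1
dxr_c 0 p UNASSIGNED
"""

if __name__ == "__main__":
    print(shards_per_node(SAMPLE))
    print(indices_confined_to(SAMPLE, {"seamicro-1", "seamicro-2"}))
```

An index reported by `indices_confined_to` has no started copy on the new hardware, which is the situation described above: losing the old chassis takes that index with it.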

While :jlaz and :van were attempting to get the hardware back up, I beat ES into submission (eventually) and got the templates straightened out. Additionally, the Jenkins instance was off the air pending a security fix; the upgrade also required updating all of the plugins.

The fastest builds have been queued up and are mostly done, and all of the important but slow builds are coming up next. Unfortunately, until mozilla-central is done in a few hours, the website will throw ISEs (internal server errors), so it's currently under hardhat via Zeus.

Anyone can follow along with the re-indexing status at https://jenkins-dxr.mozilla.org

Once m-c is done, the hardhat can be removed in Zeus by myself, :dhouse, :arr, :dividehex, or anyone in the MOC or WebOps if asked.

Later today, I'll queue up the remaining builds.
(Assignee)

Comment 1

2 years ago
Hardhat has been removed, as m-c finished. I queued up a bunch more.
Nagios checks now green for dxr.mozilla.org
Sorry you had to wake up for this, Kendall. Thanks for taking care of it so quickly. Looking forward to our new, less-stuff-shared ES farm. I'm going to start work on qualifying DXR on a more recent version of ES today, so we'll have the option of picking those reliability fruits.
(Assignee)

Updated

2 years ago
Duplicate of this bug: 1317350
(Assignee)

Updated

2 years ago
Duplicate of this bug: 1317288
(Assignee)

Comment 6

2 years ago
rust and servo are throwing new and unusual java errors that I suspect are related to the jenkins+plugins update. will investigate.
(Assignee)

Comment 7

2 years ago
gaia, rust, and servo were all failing in the git plugin trying to compute a merge base. additionally, updating the git plugin changed its behavior and caused jenkins to re-clone the repo every time.

updated the jenkins config to disable wipe-workspace, and only build on the master branch, at least until we get an initial index created.
(Assignee)

Comment 8

2 years ago
Everything's done re-building, though in fixing addons we regressed on bug 1310767.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED