Closed Bug 1577946 Opened 6 years ago Closed 6 years ago

tb-signing-mac-v1 not responding again

Categories

(Release Engineering :: Release Automation, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rjl, Unassigned)

Details

It looks like the Thunderbird signing mac in tb-signing-mac-v1 has crashed again, as it's not accepting jobs. This is blocking the 68.1 release and Daily builds.

As we only have the one signing mac in service right now, these frequent failures are really problematic. I understand the second mac that was assigned to Thunderbird is being used to get Puppet working. If there's anything I can do to help get that going, let me know. I realize there may not be, as my Puppet skills are quite dated and there are access challenges.

Last question: is there, or will there be, monitoring on these so we can catch these problems early? I'm guessing that's dependent on Puppet?

Thanks

I stood this back up a few hours ago, after it unexpectedly got caught up in the power maintenance in bug 1575615.

The Puppet patch is up for review in PR #73, so it may be possible to release tb-mac-v3-signing2.srv.releng.mdc2. What do you think, Simon?

We do have bugs open to add NRPE (bug 1571949), and then we can add Nagios checks. In the meantime I'll make a note to check on the machine once a day.
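
As a rough idea of what such a Nagios check might look like once NRPE is in place, here is a minimal sketch of a plugin that alerts when the signing worker has not claimed a task recently. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention. Everything else is an assumption for illustration: the function name `check_last_claim`, the thresholds, and the idea that the worker writes an epoch timestamp of its last claimed task to a file whose path is passed as the first argument.

```python
import sys
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_last_claim(last_claim_ts, now, warn_secs=3600, crit_secs=4 * 3600):
    """Return (exit_code, message) based on how long ago a task was claimed.

    The thresholds are hypothetical defaults, not values from this bug.
    """
    if last_claim_ts is None:
        return UNKNOWN, "UNKNOWN - no claim timestamp available"
    age = now - last_claim_ts
    if age >= crit_secs:
        return CRITICAL, f"CRITICAL - no task claimed for {int(age)}s"
    if age >= warn_secs:
        return WARNING, f"WARNING - no task claimed for {int(age)}s"
    return OK, f"OK - last task claimed {int(age)}s ago"


def main(argv):
    # Assumption: argv[1] is a file holding the epoch time of the last claim.
    try:
        with open(argv[1]) as fh:
            last_claim_ts = float(fh.read().strip())
    except (IndexError, OSError, ValueError):
        last_claim_ts = None
    code, message = check_last_claim(last_claim_ts, time.time())
    print(message)
    return code


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

A check like this would only catch a worker that has stopped claiming work. A host that is powered off entirely would be caught sooner by a plain host ping check.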

Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(sfraser)
Resolution: --- → FIXED

It should be fine to use tb-mac-v3-signing2 - the daemon is running, so we can let it accept tasks. I've removed its quarantine now.

Flags: needinfo?(sfraser)

I removed its quarantine briefly, and a task failed on it due to a secrets error, so the quarantine is back now. Am investigating.

This node is working now.

Component: Release Automation: Signing → Release Automation