Closed
Bug 917877
Opened 12 years ago
Closed 12 years ago
buildapi01 is swapping
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: armenzg, Unassigned)
I'm trying to load this report but I'm getting a lot of "Service Unavailable":
https://secure.pub.build.mozilla.org/buildapi/reports/builders/try
I think I got it into this state by using those reports:
http://ganglia1.build.scl1.mozilla.com/ganglia/?c=RelEngSCL1&h=buildapi01.build.scl1.mozilla.com&m=load_one&r=hour&s=by%20name&hc=4&mc=2
https://mana.mozilla.org/wiki/display/IT/BuildAPI
https://wiki.mozilla.org/ReleaseEngineering/BuildAPI#Kicking
https://wiki.mozilla.org/ReleaseEngineering/How_To/Restart_BuildAPI
I will wait for catlee to come back to determine what to do.
Comment 1•12 years ago
It seems like we're not swapping anymore.
12:02 mconley: I'm having an issue triggering re-runs of the tsvg talos test on https://tbpl.mozilla.org/?tree=Try&rev=c537e0cea4c3 ... I click on the retrigger icon, and "Requesting retrigger of Windows XP Talos svg opt" just spins and spins until eventually, I get "Retrigger request for Windows XP Talos svg opt failed. (network error)"
12:02 mconley: known issue?
12:02 bhearsum: i think armenzg_lunch mentioned that the system that handles those requests was under high load at the moment
12:03 bhearsum: mconley: one sec, i'm going to try to calm it down
12:03 mconley: bhearsum: k
12:03 bhearsum: it's good that we have documentation like this: https://wiki.mozilla.org/ReleaseEngineering/BuildAPI#Kicking
12:04 Standard8 has joined (Standard8@moz-BE33DA21.fw1.sfo1.mozilla.net)
12:05 mconley: heh
12:05 AutomatedTester is now known as AutomatedTester|away
12:05 bhearsum: oh, it looks like it's one of the regular reports eating the system...
12:05 bhearsum: sorry, you'll just have to try again in a bit
12:06 mconley: bhearsum: well, apparently, my retrigger requests just went through
12:06 mconley: hopefully I snuck in
12:06 mconley: :)
12:06 bhearsum: woot
...
12:51 armenzg: bhearsum: what happened with buildapi01? did it get rebooted?
12:52 bhearsum: i restarted the daemon, but it didn't help
12:52 bhearsum: it was the builds-4h job that was eating it
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 2•12 years ago
Can you add some links to the maintenance and failure-modes sections of https://mana.mozilla.org/wiki/display/IT/BuildAPI?
Comment 3•12 years ago
(In reply to Dustin J. Mitchell [:dustin] from comment #2)
> Can you add some links to the maintenance and failure-modes sections of
> https://mana.mozilla.org/wiki/display/IT/BuildAPI
I don't seem to have permissions to edit that page - was going to change:
"Although a failure is not a tree-closing event, it does make work difficult for developers, and especially sheriffs."
to:
"Failure of buildapi (particularly the output under http://builddata.pub.build.mozilla.org/buildjson/ not being updated regularly) is a tree closing event."
Comment 4•12 years ago
Updated. Funny how things become tree-closing without any formal decision!
Comment 5•12 years ago
(In reply to Dustin J. Mitchell [:dustin] from comment #4)
> Updated. Funny how things become tree-closing without any formal decision!
If builds-4hr isn't updated, TBPL can't display any jobs completed after that point, which means the trees have to close. TBPL has been using builds-4hr since 2011-06-01. But yeah, know what you mean! :-)
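For illustration only, a freshness check on the buildjson output could look roughly like the sketch below. It is not how buildapi or TBPL actually monitors anything; the 15-minute threshold, the function names, and the idea of relying on the HTTP Last-Modified header are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def fetch_last_modified(url):
    """Return the Last-Modified header for url, or None if the server omits it.

    Hypothetical helper: issues a HEAD request so the body is not downloaded.
    """
    with urlopen(Request(url, method="HEAD"), timeout=30) as response:
        return response.headers.get("Last-Modified")


def is_stale(last_modified, now, max_age=timedelta(minutes=15)):
    """Return True if last_modified (an HTTP date string) is older than max_age.

    The 15-minute default is an assumed threshold, not a documented one.
    """
    modified = parsedate_to_datetime(last_modified)
    if modified.tzinfo is None:
        # A "-0000" offset parses as a naive datetime; treat it as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    return now - modified > max_age
```

Something like `is_stale(fetch_last_modified(url), datetime.now(timezone.utc))` against a file under http://builddata.pub.build.mozilla.org/buildjson/ would then flag the situation described above, assuming the server sends the header at all.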
Comment 6•12 years ago
(And thank you for updating :-))
Updated•7 years ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard