Closed
Bug 584365
Opened 13 years ago
Closed 13 years ago
Tinderbox server doesn't work sometimes
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
VERIFIED
DUPLICATE
of bug 584920
People
(Reporter: ehsan.akhgari, Assigned: fox2mike)
Details
I tried to view some logs on the tinderbox this morning, and it failed after a *long* time with error 500 (Internal Server Error). I just tested the same logs now and they worked. This has happened in the past few days for me as well.
Assignee
Comment 2•13 years ago
I don't see any issues with tinderbox (no nagios alerts etc) for today, so this could be a specific case or an issue at your end.
Comment 3•13 years ago
I've been hitting this on and off for the past few days as well.
Reporter
Comment 4•13 years ago
(In reply to comment #1)
> What exactly were you trying to view?

Some logs from mozilla-central, for example: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1280916610.1280917886.8234.gz
Assignee
Comment 5•13 years ago
Ehsan, can you try to reproduce this now? (I have Apache logging in far more detail than it was before.) While I can confirm there were 500 errors in the access_log, there was nothing in the error_log, so I can't really tell you what caused the issue :| Hopefully, if you can reproduce this, we'll get an error we can then work with. Please post the URL on the bug as well.
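Since the access_log is the only place the 500s show up, one way to narrow things down is to tally them per CGI script and per hour. The following is a minimal sketch, not what was actually run on the server: it assumes the access_log uses Apache's common/combined log layout (the real LogFormat is not shown in this bug), and `count_500s` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed layout: Apache common/combined log format. The actual LogFormat
# used on the tinderbox server is not given in this bug.
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) '
)

def count_500s(lines):
    """Return a Counter mapping (script path, hour) to the number of 500s."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("status") != "500":
            continue
        # Drop the query string so all showlog.cgi requests group together.
        script = m.group("url").split("?", 1)[0]
        # Keep "DD/Mon/YYYY:HH" from a timestamp like "04/Aug/2010:10:23:45 -0700".
        hour = m.group("time")[:14]
        counts[(script, hour)] += 1
    return counts
```

Calling `count_500s(open(path))` on the log file and printing `most_common(10)` would show whether the failures cluster on one script or one time window, which is useful to have even when the error_log stays empty.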
Reporter
Comment 6•13 years ago
Will do. FWIW, I've starred 5 million oranges in the past hour, and the Tinderbox server didn't give me a single error back. :(
Assignee
Comment 7•13 years ago
Yeah. I've seen a few more, but nothing in the error logs, so this isn't going to be any use till I figure out how to get more meaningful errors.
Assignee
Comment 8•13 years ago
Passing this to Jeremy, who'll be the next person on call; he can take a look if it happens again.
Assignee: shyam → jeremy.orem+bugs
Updated•13 years ago
Component: Server Operations: Tinderbox Maintenance → Server Operations
OS: Mac OS X → All
Hardware: x86 → All
Comment 9•13 years ago
Probably easier to open a new bug if this happens again.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Reporter
Comment 10•13 years ago
I encountered this again recently (during the past hour) on the following URL:
http://tinderbox.mozilla.org/addnote.cgi

This happened when I was trying to star a build from:
http://tinderbox.mozilla.org/addnote.cgi?tree=Firefox&buildname=Rev3%20Fedora%2012x64%20mozilla-central%20debug%20test%20mochitests-1%2f5&buildtime=1280955356&logfile=1280955356.1280957743.26831.gz&errorparser=unittest

The original log URL is:
http://tinderbox.mozilla.org/showlog.cgi?tree=Firefox&errorparser=unittest&logfile=1280955356.1280957743.26831.gz&buildtime=1280955356&buildname=Rev3%20Fedora%2012x64%20mozilla-central%20debug%20test%20mochitests-1%2f5&fulltext=1#err1
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter
Comment 11•13 years ago
I just tried starring the same build again and this time it was successful.
Comment 12•13 years ago
Failed to load http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1280964629.1280964880.26289.gz for me a couple of minutes ago. Possibly timing out due to load?
Reporter
Comment 13•13 years ago
I got this again:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1280972473.1280973917.1429.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1280973150.1280974056.1819.gz

This is *really* hurting the developers, so I'm bumping the severity to blocker.
Severity: critical → blocker
Status: REOPENED → NEW
Comment 14•13 years ago
Though the symptoms are pretty general, they look to me exactly like when in the past the oncall has said "yeah, somebody was spidering tinderbox, I just blocked them."
Comment 15•13 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1280951063.1280954751.13730.gz at around 23:43, after a fairly long period of things working reasonably well.
Reporter
Comment 16•13 years ago
Another one when I was trying to star a build from: http://tinderbox.mozilla.org/addnote.cgi?tree=Firefox&buildname=WINNT%205.2%20mozilla-central%20opt%20test%20xpcshell&buildtime=1281009572&logfile=1281009572.1281011415.14335.gz&errorparser=unittest
Comment 17•13 years ago
Resetting assignee to draw attention
Assignee: jeremy.orem+bugs → server-ops
Assignee
Updated•13 years ago
Assignee: server-ops → shyam
Assignee
Comment 18•13 years ago
Who are the devs who work on tinderbox? Without useful error logs, we're stuck. While I see 500s in the access logs, I don't see anything in the error logs. Without knowing what's causing the 500s, I can't help fix this.
Reporter
Comment 19•13 years ago
I'm going to close mozilla-central until this issue is resolved.
Comment 20•13 years ago
Turns out we can't close mozilla-central because http://tinderbox.mozilla.org/admintree.cgi?tree=Firefox won't load :( (It "loads" for 4 or 5 minutes, and then stops loading on a blank page.)
Reporter
Comment 21•13 years ago
FWIW, the current status is that no tinderbox page is accessible from MV, Toronto, and Europe (Italy?).
Comment 22•13 years ago
There are few devs who work on Tinderbox. bear might be able to help, but I suspect you might need to dig into this yourself.
Assignee
Comment 23•13 years ago
The box was so loaded it was useless, so I've rebooted it. I can't dig into something that doesn't make much sense :) I need to see why the application is throwing a 500. When the Apache logs running in debug mode don't tell me anything, I'm as useless as the next person.
Comment 24•13 years ago
Seeing as showlog.cgi is one of the worst offenders, can you set up an HTTP cache (mod_cache, varnish, squid, whatever) to cache hits to that CGI script?
Comment 25•13 years ago
If it's showlog.cgi that is stalling, then it's running out of resources while trying to generate the HTML page from the output of the error parsing and the log expansion. It could also be triggering memory swapping if enough very large logs are requested at once (but I say that not knowing what the memory constraints are on that box). The simplest short-term solution would be to put it behind a cache for showlog.cgi URLs only (as we are discussing in #ops now).
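To make the "cache showlog.cgi URLs only" idea concrete, here is a rough Python sketch of the decision such a cache would make. It is an illustration under assumptions, not the mod_cache or proxy setup discussed in #ops: `ShowlogCache` and `render` are hypothetical names, and it assumes the rendered page for a given log URL does not change between requests.

```python
from urllib.parse import urlsplit

class ShowlogCache:
    """Cache rendered pages for showlog.cgi URLs; pass everything else through."""

    def __init__(self, render):
        self._render = render   # hypothetical callable: url -> page body (bytes)
        self._pages = {}        # "path?query" -> cached page body
        self.hits = 0
        self.misses = 0

    def fetch(self, url):
        parts = urlsplit(url)
        if not parts.path.endswith("/showlog.cgi"):
            # addnote.cgi, admintree.cgi, etc. change state, so never cache them.
            return self._render(url)
        # The fragment (e.g. "#err1") is not part of the key: it never
        # reaches the server, so it cannot change the generated page.
        key = parts.path + "?" + parts.query
        if key in self._pages:
            self.hits += 1
            return self._pages[key]
        self.misses += 1
        body = self._render(url)
        self._pages[key] = body
        return body
```

The expensive error parsing and log expansion then run once per distinct log URL instead of once per request. A real cache would also need a size limit and expiry, which this sketch leaves out.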
Assignee
Comment 26•13 years ago
Okay, tinderbox should be back up now. I've disabled the debug logs as well, as they were fairly useless and adding more load to apache.
Comment 27•13 years ago
Current status:
* mod_cache is running but we're not sure that it's helping much
* Aravind is working on putting a different proxy in front of Apache, which we expect will work better than mod_cache (details: mod_cache claims to be working, but apache may be gzip'ing files after they hit the cache, due to Accept-Encoding: gzip being sent. When in place, the cache will be caching the already gzip'ed version).
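The gzip ordering point can be illustrated with a small Python sketch. It is only a model of the idea, not the proxy configuration Aravind is working on: `GzipOnceCache` and `render` are hypothetical names. If compression happens after the cache, every request pays for gzip again; if the cache stores the already gzip'ed body, compression happens once per log.

```python
import gzip

class GzipOnceCache:
    """Store the gzip'ed body so each page is compressed once, not per request."""

    def __init__(self, render):
        self._render = render    # hypothetical callable: key -> uncompressed bytes
        self._gzipped = {}       # key -> gzip-compressed body
        self.compressions = 0    # how many times gzip actually ran

    def get(self, key, accept_gzip=True):
        if key not in self._gzipped:
            self._gzipped[key] = gzip.compress(self._render(key))
            self.compressions += 1
        body = self._gzipped[key]
        # Clients that did not send "Accept-Encoding: gzip" get the body
        # decompressed on the way out; decompression is the cheaper direction.
        return body if accept_gzip else gzip.decompress(body)
```

With this layout, repeated requests for the same log are served straight from the stored compressed bytes.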
Comment 28•13 years ago
bug 584920 is more up to date, duping forward.
Status: NEW → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → DUPLICATE
Updated•8 years ago
Product: mozilla.org → mozilla.org Graveyard