Closed
Bug 584920
Opened 14 years ago
Closed 14 years ago
Tinderbox is down again
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: khuey, Assigned: justdave)
tinderbox.mozilla.org is down again.
Comment 1•14 years ago
Is it still down? Looks like someone was spidering showbuilds.cgi
Updated•14 years ago
Severity: blocker → major
Reporter
Comment 2•14 years ago
It appears to have come back up.
Reporter
Comment 3•14 years ago
It comes and goes. I got a log, but now everything is taking forever again.
Comment 4•14 years ago
Can't get any logs. Tried: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1281052652.1281053554.15301.gz
Comment 5•14 years ago
Might be an issue with the iscsi storage? The munin monitoring shows a lot of time spent in iowait over the last two or three hours.
Comment 6•14 years ago
I wonder if we need a related bug to renice/cache in tbpl, or use json, or ?
Comment 7•14 years ago
Looks like the file it is trying to read is 120M. I'm not sure how fast iscsi usually is:
time cat /var/www/iscsi/webtools/tinderbox/Firefox/build.dat > /dev/null
real 0m8.505s
user 0m0.020s
Reporter
Comment 9•14 years ago
I'm getting logs now.
Comment 10•14 years ago
from cvs:webtools/tinderbox/README:
build.dat is a database file
each row is a build and has pipe separated columns:
1) the time stamp of the tinderbox server
2) time stamp of the build machine
3) the official build name (should include build machine name)
(note: that 2 & 3 together uniquely identify the build and all relevant build data)
4) the architecture dependent error parser to use on the log files
5) status of the build (success|busted|building|testfailed|exception)
6) The log file for this build (if completed)
7) the name of the binary (if any) that came from the build
Can I get a copy of build.dat for the Firefox tree? I'd like to check it's expiring builds properly.
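To make the README description concrete, here is a rough Python sketch of reading rows in that layout and picking out ones that look overdue for expiry. It is not Tinderbox's own code (Tinderbox is written in Perl). The names `BuildRow`, `parse_build_row` and `stale_rows` are made up for illustration, and it assumes the two time stamps are integer epoch seconds and that a missing trailing column means an empty value; the README excerpt above does not confirm either.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class BuildRow:
    server_time: int    # 1) time stamp of the tinderbox server (assumed epoch seconds)
    build_time: int     # 2) time stamp of the build machine (assumed epoch seconds)
    build_name: str     # 3) official build name
    error_parser: str   # 4) error parser to use on the log files
    status: str         # 5) success|busted|building|testfailed|exception
    log_file: str       # 6) log file for this build, if completed
    binary_name: str    # 7) name of the binary, if any


def parse_build_row(line: str) -> BuildRow:
    """Split one pipe-separated row into the seven documented columns."""
    fields = line.rstrip("\n").split("|")
    # Assumption: rows with fewer than seven columns are padded with empty strings.
    fields += [""] * (7 - len(fields))
    return BuildRow(int(fields[0]), int(fields[1]), *fields[2:7])


def stale_rows(rows: Iterable[BuildRow], now: int, max_age_seconds: int) -> List[BuildRow]:
    """Return rows whose server time stamp is older than the cutoff."""
    return [r for r in rows if now - r.server_time > max_age_seconds]
```

Counting the result of `stale_rows` over a copy of the file would give a quick sense of whether old builds are still being kept around, which is one possible reason for a 120M build.dat.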
Comment 11•14 years ago
(In reply to comment #8)
> The url in comment 4 eventually worked for me. Just took a while.
At the time I posted that comment I was getting 500's from that url. Now I don't anymore.
Reporter
Comment 12•14 years ago
I reopened try, m-c is closed for other reasons.
Comment 13•14 years ago
The static pages, http://tinderbox.mozilla.org/Firefox/ where you can easily see the (out of) date, and http://tinderbox.mozilla.org/Firefox/json.js that tinderboxpushlog uses, are not updating, so we can't really open m-c unless someone's going to sit around manually hitting the URL that regenerates the static pages every few minutes.
Comment 14•14 years ago
CPU usage is spiking up again, about 140% of iowait and about 160% of "user"
Severity: major → critical
Comment 15•14 years ago
Killed off more spidering of showlogs/builds.
Assignee: server-ops → jeremy.orem+bugs
Comment 16•14 years ago
We hit a spike of 180% iowait and 205% user, now back down to 120% iowait and just under 200% user
Severity: critical → blocker
Comment 18•14 years ago
iowait is up around 160% again
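For anyone puzzled by figures above 100%: the numbers quoted in this bug look like per-CPU percentages summed across CPUs, which is how munin's CPU graph is usually scaled. Below is a rough Python sketch of deriving such a figure from two samples of the aggregate `cpu` line in Linux's /proc/stat. The function names and the `ncpus` value are illustrative assumptions; nothing in this bug states how many CPUs the server has.

```python
def parse_cpu_line(line):
    """Return the counters from an aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before, after, ncpus):
    """iowait share between two samples, scaled so the maximum is 100 * ncpus."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    # Use only user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest time is already counted inside user, so later fields are skipped.
    deltas = [y - x for x, y in zip(b, a)][:8]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    # iowait is the fifth counter on the line (index 4).
    return 100.0 * ncpus * deltas[4] / total
```

On that scale, 160% iowait on a hypothetical four-CPU box would mean 40% of all CPU time was spent waiting on I/O.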
Comment 19•14 years ago
(In reply to comment #7)
> Looks like the file that is trying to read is 120M. I'm not sure how fast iscsi
> usually is:
>
> time cat /var/www/iscsi/webtools/tinderbox/Firefox/build.dat > /dev/null
>
> real 0m8.505s
> user 0m0.020s
This doesn't seem all that fast, based on a couple of tests on other machines:
-rw-r--r-- 1 bhearsum bhearsum 117M 6 Aug 09:08 blahblah
foo-ix-blah:tmp bhearsum$ time cat blahblah > /dev/null
real 0m0.056s
user 0m0.001s
sys 0m0.054s
-rw-r--r-- 1 bhearsum users 118M Aug 6 06:12 blahblah
[bhearsum@cm-vpn01 ~]$ time cat blahblah > /dev/null
real 0m0.144s
user 0m0.018s
sys 0m0.122s
Of course, dm-webtools02 was probably loaded at the time your test was run. Is there any way we can do a health check on this disk?
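To put the timings above on a common scale, here is a small Python sketch that reads a file sequentially and converts the result to MiB/s. It only approximates what `time cat file > /dev/null` measures, and the function names are illustrative. One caveat that may apply to the comparison: a file that was read or written recently can be served from the operating system's page cache, so a very fast result does not necessarily reflect the speed of the underlying disk.

```python
import time


def time_read(path, chunk_size=1024 * 1024):
    """Read the whole file in chunks; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def mib_per_second(nbytes, seconds):
    """Convert a byte count and a duration into MiB per second."""
    return nbytes / (1024 * 1024) / seconds
```

Assuming "120M" means 120 MiB, the 8.505s read from comment 7 works out to roughly 14 MiB/s, while 117M in 0.056s is on the order of 2000 MiB/s. The second figure is fast enough to suggest the test file was already in memory, though nothing in this bug confirms that.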
Comment 22•14 years ago
Load seems to have gone down quite a bit in the past 20 minutes. iowait is at 40% and user is at 150% or so, which is normal for this time of day over the past month. Jeremy, you blocked 195.166.157.111 at some point this morning. It turns out that this is a developer's IP address. Can we get it unblocked?
Severity: blocker → major
Updated•14 years ago
Assignee: jeremy.orem+bugs → server-ops
Comment 23•14 years ago
Looks like someone already took care of that.
Comment 24•14 years ago
OOC, was the developer spidering via tbpl or other tool? Or just digging through logs?
Comment 25•14 years ago
Nope, he wasn't using any special tools, just browsing tinderbox.mozilla.org directly.
Assignee
Updated•14 years ago
Assignee: server-ops → justdave
Comment 26•14 years ago
Did one of the rounds of blocking block kuix.de? http://kuix.de/mozilla/tinderboxstat/ isn't evil pointless spidering, it's really useful (and potentially load reducing, since I often close tbpl and count on his notifier to tell me when to reopen it) spidering.
Comment 27•14 years ago
And did we block firebot (a Road Runner account somewhere in the Carolinas, last I knew)? I think he only fetches the static quickparse.txt files, so if blocking him did us any good, we've got really awful problems.
Reporter
Comment 28•14 years ago
I think this has outlived its usefulness.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•9 years ago
Product: mozilla.org → mozilla.org Graveyard