Closed
Bug 584920
Opened 15 years ago
Closed 15 years ago
Tinderbox is down again
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: khuey, Assigned: justdave)
tinderbox.mozilla.org is down again.
Comment 1•15 years ago
Is it still down? Looks like someone was spidering showbuilds.cgi
Updated•15 years ago
Severity: blocker → major
Comment 2•15 years ago (Reporter)
It appears to have come back up.
Comment 3•15 years ago (Reporter)
It comes and goes. I got a log, but now everything is taking forever again.
Comment 4•15 years ago
Can't get any logs.
Tried: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1281052652.1281053554.15301.gz
Comment 5•15 years ago
Might be an issue with the iSCSI storage? The munin monitoring shows a lot of time spent in iowait over the last two or three hours.
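For anyone wanting to cross-check the munin iowait graph by hand, here is a minimal sketch that computes the iowait share between two samples of the aggregate "cpu" line in Linux's /proc/stat. The function names and the sample values are illustrative and not taken from the server. The result is a share of total CPU time (0 to 100%), whereas the munin figures quoted in this bug exceed 100%, which suggests they are summed across cores.

```python
def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two /proc/stat samples.

    Only the first eight counters are summed (user, nice, system, idle,
    iowait, irq, softirq, steal); the guest counters that follow are
    already included in user/nice.
    """
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))][:8]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


# Illustrative samples only, not real readings from the tinderbox host.
sample_before = "cpu  100 0 50 800 50 0 0 0 0 0"
sample_after = "cpu  200 0 100 1450 250 0 0 0 0 0"
print(f"iowait: {iowait_percent(sample_before, sample_after):.1f}%")
```

On a live Linux host the two strings would come from reading the first line of /proc/stat a few seconds apart.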
Comment 6•15 years ago
I wonder if we need a related bug to renice/cache in tbpl, or use json, or ?
Comment 7•15 years ago
Looks like the file it is trying to read is 120M. I'm not sure how fast iSCSI usually is:
time cat /var/www/iscsi/webtools/tinderbox/Firefox/build.dat > /dev/null
real 0m8.505s
user 0m0.020s
Comment 9•15 years ago (Reporter)
I'm getting logs now.
Comment 10•15 years ago
from cvs:webtools/tinderbox/README:
build.dat is a database file each row is a build and has pipe
separated columns:
1) the time stamp of the tinderbox server
2) time stamp of the build machine
3) the official build name (should include build machine name)
( note: that 2 & 3 together uniquely identify the build
and all relevant build data)
4) the architecture dependent error parser to use on the log files
5) status of the build (success|busted|building|testfailed|exception)
6) The log file for this build (if completed)
7) the name of the binary (if any) that came from the build
Can I get a copy of build.dat for the Firefox tree? I'd like to check it's expiring builds properly.
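A minimal sketch of how the seven pipe-separated columns described in the README excerpt could be read, in case it helps with the expiry check. It assumes both timestamps are integer Unix epoch seconds, which the README does not state, and the names BuildRow, parse_build_row and rows_older_than are hypothetical. The sample rows are made up.

```python
from typing import Iterable, Iterator, NamedTuple

# Status values as listed in the README excerpt.
VALID_STATUSES = {"success", "busted", "building", "testfailed", "exception"}


class BuildRow(NamedTuple):
    server_time: int   # 1) time stamp of the tinderbox server
    build_time: int    # 2) time stamp of the build machine
    build_name: str    # 3) official build name
    error_parser: str  # 4) error parser to use on the log files
    status: str        # 5) status of the build
    log_file: str      # 6) log file for this build (if completed)
    binary_name: str   # 7) name of the binary (if any)


def parse_build_row(line: str) -> BuildRow:
    """Split one build.dat row; pad missing trailing columns, ignore extras."""
    parts = line.rstrip("\n").split("|")
    parts = (parts + [""] * 7)[:7]
    # Assumption: timestamps are integer epoch seconds.
    return BuildRow(int(parts[0]), int(parts[1]), *parts[2:])


def rows_older_than(lines: Iterable[str], cutoff: int) -> Iterator[BuildRow]:
    """Yield rows whose server time stamp is before the cutoff."""
    for line in lines:
        if line.strip():
            row = parse_build_row(line)
            if row.server_time < cutoff:
                yield row


# Made-up rows for illustration only.
sample = [
    "1000|900|example-build-a|unix|success|example-a.gz|\n",
    "2000|1900|example-build-b|unix|building||\n",
]
stale = list(rows_older_than(sample, cutoff=1500))
print(len(stale), stale[0].build_name, stale[0].status in VALID_STATUSES)
```

Counting the rows returned for a cutoff at the intended retention age would show whether old builds are still accumulating in the file.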
Comment 11•15 years ago
(In reply to comment #8)
> The url in comment 4 eventually worked for me. Just took a while.
At the time I posted that comment I was getting 500s from that URL. I'm not anymore.
Comment 12•15 years ago (Reporter)
I reopened try; m-c is closed for other reasons.
Comment 13•15 years ago
The static pages, http://tinderbox.mozilla.org/Firefox/ where you can easily see the (out of) date, and http://tinderbox.mozilla.org/Firefox/json.js that tinderboxpushlog uses, are not updating, so we can't really open m-c unless someone's going to sit around manually hitting the URL that regenerates the static pages every few minutes.
Comment 14•15 years ago
CPU usage is spiking again: about 140% iowait and about 160% "user".
Severity: major → critical
Comment 15•15 years ago
Killed off more spidering of showlogs/builds.
Assignee: server-ops → jeremy.orem+bugs
Comment 16•15 years ago
We hit a spike of 180% iowait and 205% user; it is now back down to 120% iowait and just under 200% user.
Severity: critical → blocker
Comment 18•15 years ago
iowait is up around 160% again
Comment 19•15 years ago
(In reply to comment #7)
> Looks like the file that is trying to read is 120M. I'm not sure how fast iscsi
> usually is:
>
>
> time cat /var/www/iscsi/webtools/tinderbox/Firefox/build.dat > /dev/null
>
> real 0m8.505s
> user 0m0.020s
This doesn't seem all that fast, based on a couple of tests on other machines:
-rw-r--r-- 1 bhearsum bhearsum 117M 6 Aug 09:08 blahblah
foo-ix-blah:tmp bhearsum$ time cat blahblah > /dev/null
real 0m0.056s
user 0m0.001s
sys 0m0.054s
-rw-r--r-- 1 bhearsum users 118M Aug 6 06:12 blahblah
[bhearsum@cm-vpn01 ~]$ time cat blahblah > /dev/null
real 0m0.144s
user 0m0.018s
sys 0m0.122s
Of course, dm-webtools02 was probably loaded at the time your test was run. Is there any way we can do a health check on this disk?
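To put the three timings side by side, here is a rough throughput calculation using the file sizes and "real" times quoted in this bug. It treats the "M" sizes as MiB and the 120M figure as exact, both of which are assumptions, and the helper name is illustrative. Sub-second reads of a ~117M file may also reflect the page cache rather than the disk, so the comparison is only indicative.

```python
def throughput_mib_per_s(size_mib, seconds):
    """Average read throughput in MiB/s for one timed `cat` run."""
    return size_mib / seconds


# (label, approximate size in MiB, wall-clock "real" seconds) from the comments.
samples = [
    ("dm-webtools02 build.dat (iSCSI)", 120, 8.505),
    ("foo-ix-blah", 117, 0.056),
    ("cm-vpn01", 118, 0.144),
]

for label, size_mib, seconds in samples:
    rate = throughput_mib_per_s(size_mib, seconds)
    print(f"{label}: {rate:.0f} MiB/s")
```

By this estimate the iSCSI read ran at roughly 14 MiB/s, against several hundred MiB/s or more on the other two machines.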
Comment 22•15 years ago
Load seems to have gone down quite a bit in the past 20 minutes. iowait is at 40%, user is at 150% or so, which is normal for this time of day for the past month.
Jeremy, you blocked 195.166.157.111 at some point this morning. It turns out that this is a developer's IP address. Can we get it unblocked?
Severity: blocker → major
Updated•15 years ago
Assignee: jeremy.orem+bugs → server-ops
Comment 23•15 years ago
Looks like someone already took care of that.
Comment 24•15 years ago
Out of curiosity, was the developer spidering via tbpl or some other tool? Or just digging through logs?
Comment 25•15 years ago
Nope, he wasn't using any special tools, just browsing tinderbox.mozilla.org directly.
Updated•15 years ago (Assignee)
Assignee: server-ops → justdave
Comment 26•15 years ago
Did one of the rounds of blocking block kuix.de? http://kuix.de/mozilla/tinderboxstat/ isn't evil pointless spidering, it's really useful (and potentially load reducing, since I often close tbpl and count on his notifier to tell me when to reopen it) spidering.
Comment 27•15 years ago
And did we block firebot (a Road Runner account somewhere in the Carolinas, last I knew)? I think he only fetches the static quickparse.txt files, so if blocking him did us any good, we've got really awful problems.
Comment 28•15 years ago (Reporter)
I think this has outlived its usefulness.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•10 years ago
Product: mozilla.org → mozilla.org Graveyard