Status Graveyard
Server Operations
8 years ago
3 years ago


(Reporter: khuey, Assigned: justdave)


Details is down again.

Comment 1

8 years ago
Is it still down? Looks like someone was spidering showbuilds.cgi


8 years ago
Severity: blocker → major
It appears to have come back up.
It comes and goes. I got a log, but now everything is taking forever again.
Might be an issue with the iscsi storage? The munin monitoring shows a lot of time spent in iowait over the last two or three hours.
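
For anyone wanting to cross-check the munin graphs by hand, here is a minimal sketch of how user and iowait percentages can be derived from two samples of the aggregate "cpu" line in /proc/stat on Linux. The function names and the ncpus parameter are hypothetical, and the assumption that the graph sums percentages across cores (which is how figures above 100% can appear) is mine, not something confirmed in this bug.

```python
def parse_cpu_line(line):
    """Return the jiffy counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Only the first eight counters (user..steal) are used; guest time is
    # already accounted for inside the user counter.
    return [int(v) for v in fields[1:9]]


def cpu_shares(before, after, ncpus=1):
    """Percent of time spent in user and iowait between two samples.

    Multiplying by ncpus (an assumption about how the graph is scaled)
    gives figures summed across cores, so they can exceed 100%.
    """
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    scale = 100.0 * ncpus / total
    # Field order: user, nice, system, idle, iowait, irq, softirq, steal.
    return {"user": deltas[0] * scale, "iowait": deltas[4] * scale}
```

In practice the two lines would be read from /proc/stat a few seconds apart. A sustained high iowait share alongside a mostly idle disk-free workload would point at storage rather than CPU.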

Comment 6

8 years ago
I wonder if we need a related bug to renice/cache in tbpl, or use json, or ?

Comment 7

8 years ago
Looks like the file that it is trying to read is 120M. I'm not sure how fast iscsi usually is:

time cat  /var/www/iscsi/webtools/tinderbox/Firefox/build.dat > /dev/null

real	0m8.505s
user	0m0.020s

Comment 8

8 years ago
The url in comment 4 eventually worked for me. Just took a while.
from cvs:webtools/tinderbox/README:

build.dat is a database file each row is a build and has pipe
separated columns: 

1) the time stamp of the tinderbox server 
2) time stamp of the build machine 
3) the official build name (should include build machine name)
   ( note: that 2 & 3 together uniquely identify the build 
        and all relevant build data)
4) the architecture dependent error parser to use on the log files
5) status of the build (success|busted|building|testfailed|exception)
6) The log file for this build (if completed)
7) the name of the binary (if any) that came from the build

Can I get a copy of build.dat for the Firefox tree? I'd like to check that it's expiring builds properly.
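
A rough sketch of how that expiry check could be done against a copy of build.dat, following the seven pipe-separated columns in the README excerpt. It assumes the timestamps are Unix epoch seconds and that a retention window is known; neither is stated in this bug, and the names Build, parse_build_line and stale_builds are hypothetical.

```python
from collections import namedtuple

# Column order follows the README excerpt quoted above.
Build = namedtuple(
    "Build",
    "server_time build_time name errorparser status logfile binary",
)


def parse_build_line(line):
    """Split one pipe-separated build.dat row into a Build record."""
    parts = line.rstrip("\n").split("|")
    # Rows for unfinished builds may lack the trailing columns; pad them.
    parts += [""] * (7 - len(parts))
    server_time, build_time, name, errorparser, status, logfile, binary = parts[:7]
    # Assumption: both timestamps are integer Unix epoch seconds.
    return Build(int(server_time), int(build_time), name, errorparser,
                 status, logfile, binary)


def stale_builds(lines, now, max_age_seconds):
    """Return rows whose build timestamp is older than the retention window."""
    cutoff = now - max_age_seconds
    builds = (parse_build_line(l) for l in lines if l.strip())
    return [b for b in builds if b.build_time < cutoff]
```

If stale_builds returns a large share of the rows for a retention window the tree is supposed to honor, that would suggest old builds are not being expired and would help explain a 120M file.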
(In reply to comment #8)
> The url in comment 4 eventually worked for me. Just took a while.

At the time I posted that comment I was getting 500s from that URL. Now I'm not anymore.
I reopened try, m-c is closed for other reasons.
The static pages, where you can easily see the (out of) date, and that tinderboxpushlog uses, are not updating, so we can't really open m-c unless someone's going to sit around manually hitting the URL that regenerates the static pages every few minutes.
CPU usage is spiking up again, about 140% of iowait and about 160% of "user"
Severity: major → critical
Killed off more spidering of showlogs/builds.
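
For reference, a minimal sketch of one way to spot clients hammering the CGI pages from an access log. It assumes the Apache common/combined log format, and the showlog.cgi script name, the function name heavy_clients and the threshold are assumptions for illustration; only showbuilds.cgi is named in this bug.

```python
import re
from collections import Counter

# Client address, then the request path from the quoted request line.
# Assumes Apache common/combined log format.
REQUEST_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[A-Z]+ (\S+)')


def heavy_clients(log_lines, targets=("showbuilds.cgi", "showlog.cgi"),
                  threshold=100):
    """Return (client, hits) pairs for clients at or above the threshold.

    Only requests whose path contains one of the target script names
    are counted. Results are sorted by hit count, highest first.
    """
    hits = Counter()
    for line in log_lines:
        m = REQUEST_RE.match(line)
        if m and any(t in m.group(2) for t in targets):
            hits[m.group(1)] += 1
    return [(ip, n) for ip, n in hits.most_common() if n >= threshold]
```

A high count is only a starting point for review. As later comments here show, a busy address can turn out to be a developer browsing directly or a useful notifier, so blocking straight from this list is not advisable.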
Assignee: server-ops → jeremy.orem+bugs
We hit a spike of 180% iowait and 205% user, now back down to 120% iowait and just under 200% user
Severity: critical → blocker
Whoops, didn't mean to raise severity
Severity: blocker → major
iowait is up around 160% again
(In reply to comment #7)
> Looks like the file that is trying to read is 120M. I'm not sure how fast iscsi
> usually is:
> time cat  /var/www/iscsi/webtools/tinderbox/Firefox/build.dat > /dev/null
> real    0m8.505s
> user    0m0.020s

This doesn't seem all that fast, based on a couple of tests on other machines:
-rw-r--r--  1 bhearsum  bhearsum   117M  6 Aug 09:08 blahblah
foo-ix-blah:tmp bhearsum$ time cat blahblah > /dev/null

real	0m0.056s
user	0m0.001s
sys	0m0.054s

-rw-r--r-- 1 bhearsum users 118M Aug  6 06:12 blahblah
[bhearsum@cm-vpn01 ~]$ time cat blahblah > /dev/null

real	0m0.144s
user	0m0.018s
sys	0m0.122s

Of course, dm-webtools02 was probably loaded at the time your test was run. Is there any way we can do a health check on this disk?
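
To put the three timings on one scale, here is a small worked calculation of throughput from the sizes and wall-clock times quoted above. The sizes are treated as plain megabytes, which is an approximation, and the dictionary labels are only for illustration.

```python
# (size in MB, wall-clock seconds) as reported in the comments above.
measurements = {
    "dm-webtools02 iscsi": (120, 8.505),
    "foo-ix-blah": (117, 0.056),
    "cm-vpn01": (118, 0.144),
}


def throughput_mb_per_s(size_mb, seconds):
    """Average read throughput over the whole file."""
    return size_mb / seconds


rates = {host: throughput_mb_per_s(*m) for host, m in measurements.items()}
for host, rate in rates.items():
    print(f"{host}: {rate:.1f} MB/s")
```

The iscsi read works out to roughly 14 MB/s, against several hundred MB/s or more on the other two machines. Rates that high may well reflect a file already sitting in the page cache rather than raw disk speed, so a fairer comparison would read a file that has not been touched recently.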
We're in a death spiral again. 0% idle CPU
Severity: major → blocker
Duplicate of this bug: 584365
Load seems to have gone down quite a bit in the past 20 minutes. iowait is at 40%, user is at 150% or so, which is normal for this time of day for the past month.

Jeremy, you blocked an IP address at some point this morning. It turns out that this is a developer's IP address; can we get it unblocked?
Severity: blocker → major


8 years ago
Assignee: jeremy.orem+bugs → server-ops
Looks like someone already took care of that.

Comment 24

8 years ago
OOC, was the developer spidering via tbpl or other tool? Or just digging through logs?
Nope, he wasn't using any special tools, just browsing directly.
Assignee: server-ops → justdave
Did one of the rounds of blocking block his notifier? That isn't evil, pointless spidering; it's really useful spidering (and potentially load-reducing, since I often close tbpl and count on his notifier to tell me when to reopen it).
And did we block firebot (a Road Runner account somewhere in the Carolinas, last I knew)? I think he only fetches the static quickparse.txt files, so if blocking him did us any good, we've got really awful problems.
I think this has outlived its usefulness.
Last Resolved: 8 years ago
Resolution: --- → FIXED
Product: → Graveyard