Closed Bug 962089 Opened 10 years ago Closed 10 years ago

Change builds-4hr.js.gz nagios check to use HEAD request instead of GET, and increase timeout

Categories

(mozilla.org Graveyard :: Server Operations, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: rbryce)

References

Details

We're getting some socket timeout failures for the nagios check for builds-4hr.js.gz.

We should be able to do the check with a HEAD request instead of a GET.

While we're at it, let's increase the timeout to 20 seconds.
Assignee: relops → server-ops
Component: RelOps → Server Operations
Product: Infrastructure & Operations → mozilla.org
QA Contact: arich → shyam
Bump? This is causing sheriffs to close the trees in bug 960054.
(In reply to Chris AtLee [:catlee] from comment #0)
> We're getting some socket timeout failures for the nagios check for
> builds-4hr.js.gz.
> 
> We should be able to do the check with a HEAD request instead of a GET.
> 
> While we're at it, let's increase the timeout to 20 seconds.

The HEAD request doesn't supply the document modification time. Instead, I suppressed the body download, which significantly reduces the size of the response while still supplying the mod time. I also added a 20-second timeout.

I believe this will alleviate the timeout issues you are seeing.
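
For readers unfamiliar with the approach described above, here is a minimal Python sketch of a freshness check that issues a GET, reads only the response headers, and closes the connection without downloading the body. It is not the actual Nagios check used here; the thread does not show that configuration. The function names, the `max_age` threshold, and the mapping to Nagios exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN) are assumptions for illustration. The stock `check_http` plugin offers a `--no-body` option with similar behaviour, though the thread does not say whether that is what was used.

```python
import http.client
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import urlsplit

OK, CRITICAL, UNKNOWN = 0, 2, 3  # standard Nagios plugin exit codes


def age_seconds(last_modified, now=None):
    """Return how many seconds ago the Last-Modified timestamp was."""
    modified = parsedate_to_datetime(last_modified)
    if modified.tzinfo is None:
        # Treat a timestamp without zone information as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (now - modified).total_seconds()


def fetch_last_modified(url, timeout=20):
    """Send a GET, read only the status line and headers, then hang up."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        response = conn.getresponse()  # parses headers; the body is not read
        return response.status, response.getheader("Last-Modified")
    finally:
        # Closing without calling response.read() abandons the body,
        # although the server may already have sent part of it.
        conn.close()


def evaluate(status, last_modified, max_age, now=None):
    """Map the header data to a (nagios_exit_code, message) pair."""
    if status != 200:
        return CRITICAL, "HTTP status %s" % status
    if last_modified is None:
        return UNKNOWN, "no Last-Modified header in response"
    age = age_seconds(last_modified, now)
    if age > max_age:
        return CRITICAL, "file is %d seconds old" % age
    return OK, "file is %d seconds old" % age
```

A caller would combine the two steps, for example `evaluate(*fetch_last_modified(url), max_age=900)`, where `url` and the 900-second threshold are placeholders rather than values taken from this bug.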
Assignee: server-ops → rbryce
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
(In reply to Rick Bryce [:rbryce] from comment #2)
> The HEAD request doesn't supply the document modification time.  Instead, I
> suppressed the body from being downloaded, which significantly lowers the
> size of the response and supplies the mod time. I also added a 20 sec
> timeout.  
> 
> I believe this will alleviate the timeout issues you are seeing.

The Nagios alerts in bug 960054 comment 11 suggest that the timeout is still 10 seconds, and that the payload is still including the body (see bytes returned in the recovery email).

I don't suppose you could take another look? :-)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
By the way, this check was disabled a few days ago while debugging the underlying issue, and re-enabled yesterday (Monday).
(In reply to Ed Morley [:edmorley UTC+0] from comment #3)
> (In reply to Rick Bryce [:rbryce] from comment #2)
> > The HEAD request doesn't supply the document modification time.  Instead, I
> > suppressed the body from being downloaded, which significantly lowers the
> > size of the response and supplies the mod time. I also added a 20 sec
> > timeout.  
> > 
> > I believe this will alleviate the timeout issues you are seeing.
> 
> The Nagios alerts in bug 960054 comment 11 suggest that the timeout is still
> 10 seconds, and that the payload is still including the body (see bytes
> returned in the recovery email).
> 
> I don't suppose you could take another look? :-)

This is odd. I double-checked my work to make sure I didn't misconfigure the timeout. It is correct and should work as is. I have removed the timeout setting from the check. I'm hoping this will force it to use the timeout of 60 seconds.
I'm hoping this will force the use of the default NRPE timeout, 60 seconds.
Ed, is this working out for you?
(In reply to Rick Bryce [:rbryce] from comment #7)
> Ed, is this working out for you?

I haven't seen any timeouts since, happy to call this fixed for now :-)
Status: REOPENED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard