Closed Bug 697627 Opened 13 years ago Closed 11 years ago

nagios checks for pending builds

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: zeller)

References

Details

(Whiteboard: [nagios])

Attachments

(3 files, 13 obsolete files)

nagios check script 13 years ago John Hopkins (:jhopkins) 1.86 KB, text/plain		Details
bug697627-buildapi.patch 11 years ago John Zeller [:zeller] 3.36 KB, patch		Details \| Diff \| Splinter Review
bug697627.patch 11 years ago John Zeller [:zeller] 1.25 KB, patch		Details \| Diff \| Splinter Review
bug697627.patch 11 years ago John Zeller [:zeller] 1.48 KB, patch		Details \| Diff \| Splinter Review
bug697627-unittests.patch 11 years ago John Zeller [:zeller] 3.71 KB, patch		Details \| Diff \| Splinter Review
check_pending_builds.py 11 years ago John Zeller [:zeller] 1.47 KB, text/plain	hwine : feedback-	Details
check_pending_builds.py 11 years ago John Zeller [:zeller] 1.71 KB, text/x-python		Details
check_pending_builds.py 11 years ago John Zeller [:zeller] 2.02 KB, text/x-python		Details
check_pending_builds.py 11 years ago John Zeller [:zeller] 1.69 KB, text/x-python		Details
check_pending_builds.py 11 years ago John Zeller [:zeller] 1.70 KB, text/x-python		Details
test_check_pending_builds.py 11 years ago John Zeller [:zeller] 1.45 KB, text/plain	hwine : review+	Details
check_pending_builds.py 11 years ago John Zeller [:zeller] 1.70 KB, text/plain	hwine : review+	Details
check_pending_builds.py 11 years ago John Zeller [:zeller] 1.75 KB, text/plain		Details
check_pending_builds.py 11 years ago John Zeller [:zeller] 1.66 KB, text/plain	hwine : review+	Details
test_check_pending_builds.py 11 years ago John Zeller [:zeller] 1.62 KB, text/plain	hwine : review+ hwine : checked-in+	Details
check_pending_builds.py 11 years ago John Zeller [:zeller] 1.65 KB, text/plain	hwine : review+ hwine : checked-in+	Details

bhearsum@mozilla.com (:bhearsum)

Reporter

Description

•

13 years ago

bug 697374 tracked a problem where we had thousands of pending jobs (because 32-bit Linux test slaves weren't accepting new ones), and didn't notice for awhile. This could've been caught sooner if we had some Nagios checks in place. A couple of ideas that were thrown out:
* A check that fails when we have over X pending jobs (to account for some unavoidable load). This might legitimately fail sometimes, but we could downtime or ack it in those situations. X should probably be around 500.
* A check that fails when there are more than X pending and less than Y running. I'm not sure what X and Y should be here.

Regardless of what we do, we probably need something on buildapi that returns the # of pending and running in a simple form, so that it's easily understandable by whatever script is doing the checks.

John Hopkins (:jhopkins)

Comment 1

•

13 years ago

Attached file nagios check script — Details

We use the attached perl script to check the Thunderbird buildbot build queue.  It scrapes the /one_box_per_builder page.

Chris Cooper [:coop] (he/him)

Updated

•

13 years ago

Priority: -- → P3

Whiteboard: [nagios]

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Blocks: re-nagios

John O'Duinn [:joduinn] (please use "needinfo?" flag)

Comment 2

•

11 years ago

Found in triage. 

Sounds like this is actually two items:
1) buildapi work to make the length of pending builds queue be measurable from commandline
2) nagios work to call buildapi and then alert us when length > some-threshold-to-be-determined.

Moving to Tools for now... needinfo to hwine, zeller to see if this seems accurate, and if so, what they think are next steps here.

Component: Release Engineering → Release Engineering: Developer Tools

Flags: needinfo?(jozeller)

Flags: needinfo?(hwine)

QA Contact: hwine

John Zeller [:zeller]

Assignee

Comment 3

•

11 years ago

This looks accurate to me. I can work on having something available for 1 today.

I need these few questions answered before I punch out this expansion:
1) When you say measurable from commandline, is json okay? Or is something else preferable?
2) Are we interested in finding out the length of pending builds by branch? Or by revision?
3) Do we also want length of running? Length of completed? (Not difficult)
4) Can/should this be a new REST URL? Perhaps:
    - If we want length by branch, then: /self-serve/{branch}/status
    - If we want length by revision, then: /self-serve/{branch}/rev/{revision}/status

Flags: needinfo?(jozeller)

Flags: needinfo?(joduinn)

Flags: needinfo?(bhearsum)

John Zeller [:zeller]

Assignee

Comment 4

•

11 years ago

Attached patch bug697627-buildapi.patch (obsolete) — Details — Splinter Review

I went ahead and punched out a draft of this expansion. This is not heavily tested, but was correct in the few instances I tried it in. I assumed that we were interested in searching by branch, having the result returned in json and adding a new endpoint at /self-serve/{branch}/status.

If all that matches the answers to my needinfo in comment 3, then I will go ahead and start the feedback, testing and review process on this patch to get it out asap.

Hal Wine [:hwine] use NI!

Comment 5

•

11 years ago

(In reply to John Zeller [:zeller] from comment #3)
> This looks accurate to me. I can work on having something available for 1
> today.
You're way ahead of your support lines -- let's get the requirements clear first. ATM, there is no definition of "done".

> 
> I need these few questions answered before I punch out this expansion:
> 1) When you say measurable from commandline, is json okay? Or is something
> else preferable?
It will need to interface with some script that will package it for nagios consumption -- see the prior attachment for an example of what would call this API point.

> 2) Are we interested in finding out the length of pending builds by branch?
> Or by revision?
Neither -- see comment 0 for description of problem. We need by machine pool type.

> 3) Do we also want length of running? Length of completed? (Not difficult)
Pending first

> 4) Can/should this be a new REST URL? Perhaps:
Yes, new URL, as this is a completely new query type. That is assuming it can be done at all.

>     - If we want length by branch, then: /self-serve/{branch}/status
>     - If we want length by revision, then:
> /self-serve/{branch}/rev/{revision}/status
None of the above

Flags: needinfo?(hwine)

John Zeller [:zeller]

Assignee

Comment 6

•

11 years ago

(In reply to Hal Wine [:hwine] from comment #5)
> (In reply to John Zeller [:zeller] from comment #3)
> > This looks accurate to me. I can work on having something available for 1
> > today.
> You're way ahead of your support lines -- let's get the requirements clear
> first. ATM, there is no definition of "done".
> 
> > 
> > I need these few questions answered before I punch out this expansion:
> > 1) When you say measurable from commandline, is json okay? Or is something
> > else preferable?
> It will need to interface with some script that will package it for nagios
> consumption -- see the prior attachment for an example of what would call
> this API point.
> 
> > 2) Are we interested in finding out the length of pending builds by branch?
> > Or by revision?
> Neither -- see comment 0 for description of problem. We need by machine pool
> type.
> 
> > 3) Do we also want length of running? Length of completed? (Not difficult)
> Pending first
> 
> > 4) Can/should this be a new REST URL? Perhaps:
> Yes, new URL, as this is a completely new query type. That is assuming it
> can be done at all.
> 
> >     - If we want length by branch, then: /self-serve/{branch}/status
> >     - If we want length by revision, then:
> > /self-serve/{branch}/rev/{revision}/status
> None of the above

You're right, good call. What needs to be done here? And who can we ask for some solid answers to the questions above?

bhearsum@mozilla.com (:bhearsum)

Reporter

Comment 7

•

11 years ago

(In reply to John Zeller [:zeller] from comment #3)
> This looks accurate to me. I can work on having something available for 1
> today.
> 
> I need these few questions answered before I punch out this expansion:
> 1) When you say measurable from commandline, is json okay? Or is something
> else preferable?

I don't know what you mean by this...

> 2) Are we interested in finding out the length of pending builds by branch?
> Or by revision?

These seem like nice to haves, but not requirements.

> 3) Do we also want length of running? Length of completed? (Not difficult)

Again, nice to have.

> 4) Can/should this be a new REST URL? Perhaps:
>     - If we want length by branch, then: /self-serve/{branch}/status
>     - If we want length by revision, then:
> /self-serve/{branch}/rev/{revision}/status

If we're quibbling on names, I think that /self-serve/pending and similar would be more appropriate. "status" is very generic, and would quickly turn into a mess, I think.

Flags: needinfo?(bhearsum)

Nobody; OK to take it and work on it

Updated

•

11 years ago

Product: mozilla.org → Release Engineering

John Zeller [:zeller]

Assignee

Comment 8

•

11 years ago

Attached patch bug697627.patch (obsolete) — Details — Splinter Review

This is a stupid simple patch that adds functionality to the already existing route /pending with the parameter total=all. Currently all this does is return json in the form {'all': 49}.

There is a LOT of room to improve of this, but this patch is looking to start with providing the very most basic information, which is how many pending jobs are currently existing on all machines regardless of pool, branch, or anything else.

Note: I did not add this to a new REST on top of /pending (ie /pending/all) because there is already a route for /pending/{branch}/{platform} and all would be read as a branch. I decided that hacking through that would be a waste given that it undermines the already determined functionality of that URL. However, if there are better suggestions, or if someone thinks that this should be somewhere completely different, then by all means, throw down some ideas! :)

Attachment #787393 - Attachment is obsolete: true

Attachment #791112 - Flags: feedback?(nthomas)

Attachment #791112 - Flags: feedback?(bhearsum)

John Zeller [:zeller]

Assignee

Updated

•

11 years ago

Assignee: nobody → jozeller

John Zeller [:zeller]

Assignee

Updated

•

11 years ago

Flags: needinfo?(joduinn)

John Zeller [:zeller]

Assignee

Comment 9

•

11 years ago

Attached patch bug697627.patch (obsolete) — Details — Splinter Review

While writing up unittests for this, I found a bug that caused the base URL /pending not to be able to load. I fixed it in this updated patch.

Attachment #791112 - Attachment is obsolete: true

Attachment #791112 - Flags: feedback?(nthomas)

Attachment #791112 - Flags: feedback?(bhearsum)

Attachment #791442 - Flags: review?(nthomas)

Attachment #791442 - Flags: checked-in?

John Zeller [:zeller]

Assignee

Comment 10

•

11 years ago

Attached patch bug697627-unittests.patch (obsolete) — Details — Splinter Review

Adding unittests to document the patch bug697627.patch

Attachment #791443 - Flags: review?(nthomas)

Attachment #791443 - Flags: checked-in?

John Zeller [:zeller]

Assignee

Comment 11

•

11 years ago

Attached file check_pending_builds.py (obsolete) — Details

This should take care of a nagios alert plugin, at least foundation ally. This still has a few questions that need answers.

1) Once the buildapi extension in buig697627.patch is live, it'll be found on buildapi/pending?total=all (This could change). However, I'm currently testing this on my local machine instance of buildapi, and so the url I am calling must change in this nagios file. What URL do I use for making a call to this in buildapi?
1.1) Does this call require authentication? If so, how do I go about setting that up properly?
2) Currently the critical and warning thresholds have set defaults at 5000 and 2000 respectively. Should these even have defaults? If so, what are appropriate ones? PS This already takes in thresholds as arguments.

Attachment #791553 - Flags: feedback?(catlee)

Flags: needinfo?(catlee)

John Zeller [:zeller]

Assignee

Comment 12

•

11 years ago

Comment on attachment 791553 [details]
check_pending_builds.py

I'd like some more feedback on this nagios plugin please! :)

Attachment #791553 - Flags: feedback?(rail)

Attachment #791553 - Flags: feedback?(jhopkins)

Attachment #791553 - Flags: feedback?(hwine)

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Attachment #791553 - Attachment mime type: text/x-python → text/plain

Hal Wine [:hwine] use NI!

Comment 13

•

11 years ago

Comment on attachment 791553 [details]
check_pending_builds.py

>#!/usr/bin/python

this should be #!/usr/bin/env python
>
>import sys
>from optparse import OptionParser

we're on python 2.7 now, so should use argparse, as it's compatible with python 3.x

>import urllib2 as url
>import simplejson as json

The module simplejson isn't part of the standard library -- there's an idiom for using it when available, but falling back to the version shipped with the standard library.

>
>def get_pending_builds():
>    response = url.urlopen('http://127.0.0.1:5000/pending?total=all')

While we don't know yet what the final URL will be, it's better to have it defined once as a global, rather than a hard coded magic value. Here, it makes sense to add the unique query parameters. Hint -- if it's a parameter to this function, unit tests will be much easier to write.

>    result = json.loads(response.read())
>    return result.get('all')
>
>def pending_builds_status(num_pending, critical, warning):

"Intentional naming" is a good concept - "num_pending" is good, the other 2 argument names are not as good.

>    if num_pending > critical:
>        return 'CRITICAL - Pending Builds: %i' % num_pending
>    elif num_pending > warning:
>        return 'WARNING - Pending Builds: %i' % num_pending
>    else:
>        return 'OK - Pending Builds: %i' % num_pending

I'm guessing you don't have unit tests running yet. The above will fail.

>
>if __name__ == '__main__':
>    parser = OptionParser(usage="Usage: %prog [options]",  version="%prog 1.0")

rather than hard code a magic value, you could set the __version__ globally above, and use that value here
__doc__ & __version__ are idiomatic global variables that are used by introspection code when available.

>    parser.add_option('-c', '--critical', action='store', type='int', dest='critical', default=5000,
>                      help='Set CRITICAL level (as integer eg. 5000)')
>    parser.add_option('-w', '--WARNING', action='store', type='int', dest='warning', default=2000,
>                      help='Set WARNING level (as integer eg. 2000)')
>    (options, args) = parser.parse_args()
>
>    output = ""
why?
>    output = pending_builds_status(get_pending_builds(), options.critical, options.warning)
>
>    if len(output) < 2:
>        print 'Please use -h for options'
>        print options("-h")
>    if output[:2] == 'OK':
>        print output
>        sys.exit(0)
>    elif output[:2] == 'WA':
>        print output
>        sys.exit(1)
>    elif output[:2] == 'CR':
>        print output
>        sys.exit(2)
Okay, the above needs rework -- consider how you could avoid all that testing (hint: you've already used the technique in this script). And "magic hard coded values".

Also, as written, there is no provision for error handling. That's one major reason to use python for this, and Nagios even has support for it -- but you're not using that yet.

I'd like to see a version that passes some unit tests.

Attachment #791553 - Flags: feedback?(hwine) → feedback-

Nick Thomas [:nthomas] (UTC+12)

Comment 14

•

11 years ago

Comment on attachment 791442 [details] [diff] [review]
bug697627.patch

Review of attachment 791442 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not convinced we need this new point on the API when 
  https://secure.pub.build.mozilla.org/builddata/buildjson/builds-pending.js
is a snapshot updated every 60 seconds, and is already available without any http auth issues. Can't the status plugin suck that data down (it never gets that big) and the the loop over the nested dict to count the number of pending jobs ?

Attachment #791442 - Flags: review?(nthomas)

Attachment #791442 - Flags: review-

Attachment #791442 - Flags: checked-in?

Nick Thomas [:nthomas] (UTC+12)

Comment 15

•

11 years ago

Comment on attachment 791443 [details] [diff] [review]
bug697627-unittests.patch

No longer needed if my previous comment stands up to scrutiny.

Attachment #791443 - Flags: review?(nthomas)

Attachment #791443 - Flags: checked-in?

John Hopkins (:jhopkins)

Comment 16

•

11 years ago

Comment on attachment 791553 [details]
check_pending_builds.py

removing f? as hwine has already given detailed feedback

Attachment #791553 - Flags: feedback?(jhopkins)

John Zeller [:zeller]

Assignee

Comment 17

•

11 years ago

(In reply to Nick Thomas [:nthomas] from comment #14)
> Comment on attachment 791442 [details] [diff] [review]
> bug697627.patch
> 
> Review of attachment 791442 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm not convinced we need this new point on the API when 
>   https://secure.pub.build.mozilla.org/builddata/buildjson/builds-pending.js
> is a snapshot updated every 60 seconds, and is already available without any
> http auth issues. Can't the status plugin suck that data down (it never gets
> that big) and the the loop over the nested dict to count the number of
> pending jobs ?

Wow! Yeah I think I agree, this makes much more sense. Heine if you agree I will go ahead and rewrite that plugin for this service.

Flags: needinfo?(hwine)

Hal Wine [:hwine] use NI!

Comment 18

•

11 years ago

Sounds like a plan -- don't forget to check "v1" into a repo prior to starting on v2, just in case....

Flags: needinfo?(hwine)

John Zeller [:zeller]

Assignee

Updated

•

11 years ago

Attachment #791442 - Flags: review-

John Zeller [:zeller]

Assignee

Comment 19

•

11 years ago

Attached file check_pending_builds.py (obsolete) — Details

This new nagios plugin uses the already existing service found here: https://secure.pub.build.mozilla.org/builddata/buildjson/builds-pending.js

hwine: I made all of the adjustments that you mentioned in comment 13, with the exception of a couple that I could use more info on.

>    if num_pending > critical:
>        return 'CRITICAL - Pending Builds: %i' % num_pending
>    elif num_pending > warning:
>        return 'WARNING - Pending Builds: %i' % num_pending
>    else:
>        return 'OK - Pending Builds: %i' % num_pending

You mentioned that this would fail... I can't seem to figure out what you may be seeing that I have missed here. I wrote up a few tests to check things like a range of num_pending, but alas I cannot seem to figure out what you were pointing to here. Could you clarify?

> Also, as written, there is no provision for error handling. That's one major 
> reason to use python for this, and Nagios even has support for it -- but you're > not using that yet.

So there is Nagios supported error handling? All I seem to be able to find so far is this http://nagios.sourceforge.net/docs/3_0/eventhandlers.html , am I on the right track? Either way, I'm gonna keep reading, but I wanted to see if there was something simple I am missing.

Attachment #791553 - Attachment is obsolete: true

Attachment #791553 - Flags: feedback?(rail)

Attachment #791553 - Flags: feedback?(catlee)

Attachment #793225 - Flags: review?(hwine)

Flags: needinfo?(hwine)

John Zeller [:zeller]

Assignee

Comment 20

•

11 years ago

Attached file check_pending_builds.py (obsolete) — Details

Added error handling for urllib2.URLError if urlopen is ever to fail. I then also added the use of returning UNKNOWN value 3 when a URLError happens or when the num_pending value is NoneType.

Attachment #793225 - Attachment is obsolete: true

Attachment #793225 - Flags: review?(hwine)

Attachment #793601 - Flags: review?(hwine)

John Zeller [:zeller]

Assignee

Comment 21

•

11 years ago

Attached file check_pending_builds.py (obsolete) — Details

Cleaned up this plugin quite a bit. Functionally produces the same result, but in a much cleaner fashion.

As a recap, this plugin grabs the list of pending builds globally, determines how many there are, then checks this number against a preset or passed in warning and critical threshold. The plugin then returns either a 0, 1, 2, or 3 if the number of pending jobs are either OK, WARNING, CRITICAL or UNKNOWN.

This plugin also uses argparse which is part of the python standard library and works with python 3.x.

Attachment #793601 - Attachment is obsolete: true

Attachment #793601 - Flags: review?(hwine)

Attachment #793740 - Flags: review?(hwine)

Chris AtLee [:catlee]

Comment 22

•

11 years ago

Latest plugin looks good! A few random comments:

- `import urllib2 as url` is a bit strange - why did you feel like you needed to rename urllib2?

- run through a pep8 checker to fix up some code formatting issues

- how will nagios handle the output from your top-level exception handler?

- should we add a retry around the urllib2.urlopen call, in case the network or service is flapping? nagios will re-check anyway, so we may be fine without retries.

Flags: needinfo?(catlee)

John Zeller [:zeller]

Assignee

Comment 23

•

11 years ago

Attached file check_pending_builds.py (obsolete) — Details

Reformatted code to match pep8 and removed urllib2 as url... reverted back to urllib2.

Attachment #793740 - Attachment is obsolete: true

Attachment #793740 - Flags: review?(hwine)

Attachment #795688 - Flags: review?(hwine)

John Zeller [:zeller]

Assignee

Comment 24

•

11 years ago

(In reply to Chris AtLee [:catlee] from comment #22)
> Latest plugin looks good! A few random comments:
> 
> - `import urllib2 as url` is a bit strange - why did you feel like you
> needed to rename urllib2?
> 
Yeah, this was unnecessary, so I removed it.

> - run through a pep8 checker to fix up some code formatting issues
Done

> 
> - how will nagios handle the output from your top-level exception handler?
Nagios handles the top-level exception handler as 'UNKNOWN'

> 
> - should we add a retry around the urllib2.urlopen call, in case the network
> or service is flapping? nagios will re-check anyway, so we may be fine
> without retries.
Retry should be unnecessary as nagios handles the rechecks :)

John Zeller [:zeller]

Assignee

Comment 25

•

11 years ago

Attached file test_check_pending_builds.py (obsolete) — Details

These are the unittests to use for testing the nagios plugin attached above, attachment 795688 [details]

Attachment #791443 - Attachment is obsolete: true

Attachment #797528 - Flags: review?(hwine)

John Zeller [:zeller]

Assignee

Comment 26

•

11 years ago

Attached file check_pending_builds.py (obsolete) — Details

I am simply resubmitting this nagios plugin again to then mark old attachments obsolete. Hoping that this cleans up the attachments are a bit :)

The old comments still apply:
Reformatted code to match pep8 and removed urllib2 as url... reverted back to urllib2.

Attachment #791442 - Attachment is obsolete: true

Attachment #795688 - Attachment is obsolete: true

Attachment #795688 - Flags: review?(hwine)

Attachment #797530 - Flags: review?(hwine)

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Attachment #797528 - Attachment mime type: text/x-python → text/plain

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Attachment #797530 - Attachment mime type: text/x-python → text/plain

Hal Wine [:hwine] use NI!

Comment 27

•

11 years ago

Comment on attachment 797530 [details]
check_pending_builds.py

Good as is -- and you should understand why the nits :)

>if __name__ == '__main__':
>    parser = argparse.ArgumentParser(
>        prog="check_pending_builds.py", usage="Usage: %(prog)s [options]",  version="%(prog)s " + __version__)

nit: keyword argument 'prog' has a reasonable default, so should be overridden only when the default doesn't work.

>    parser.add_argument(
>        '-c', '--critical', action='store', type=int, dest='critical_threshold', default=5000,
>        help='Set CRITICAL level (as integer eg. 5000)')
>    parser.add_argument(
>        '-w', '--warning', action='store', type=int, dest='warning_threshold', default=2000,
>        help='Set WARNING level (as integer eg. 2000)')

nit: the help is using two different names for each variable, which is confusing. Consider using 'metavar'

>    try:
>        num_pending = get_num_pending_builds()
>        status = pending_builds_status(
>            num_pending, args.critical_threshold, args.warning_threshold)
>        print '%s Pending Builds: %i' % (status, num_pending)
>        sys.exit(status_code.get(status, 3))

nit: '3' is a magic number -- what would happen if you used: sys.exit(status_code[status])

>    except Exception as e:
>        print e
>        sys.exit(status_code.get('UNKNOWN'))

Attachment #797530 - Flags: review?(hwine) → review+

Hal Wine [:hwine] use NI!

Comment 28

•

11 years ago

Comment on attachment 797528 [details]
test_check_pending_builds.py

Mostly good - lets get the test thresholds not equal to defaults before committing. With that change, r+

>import unittest
>from check_pending_builds import pending_builds_status
>
># NOTE: This test is to be run with unittest and NOT nosetests

nit: there's nothing with this test that precludes running via nosetests. The nosetest issue is with upstream suite of tests. So, let's remove the statement from here. Add to a top level README, if anything.

>## THINGS NOT BEING TESTED FOR ###
># 1) UNKNOWN since that is in main and only executed when an exception occurs
># 2) Ranges of warning and critical thresholds since testing boundaries for 1 set, should hold true for any set

Another thing that is not being tested is that pending_build_status honors the values being passed are tested. I.e. you're testing with the default values.

>class TestSequenceFunctions(unittest.TestCase):
>
>    warning_threshold = 2000
>    critical_threshold = 5000

So why these values? Is one set of values sufficient? I'd change these to be not the default.

>
>    def test_ok(self):
>        """Test OK from 0 to warning_threshold - 1"""
>        for x in range(self.warning_threshold):
>            assert pending_builds_status(x, self.critical_threshold, self.warning_threshold) == 'OK'

There's no real value in the extended range -- the range function allows specification of a start point as well. Same with other functions.

Attachment #797528 - Flags: review?(hwine) → review+

John Zeller [:zeller]

Assignee

Comment 29

•

11 years ago

Attached file check_pending_builds.py (obsolete) — Details

Fixed nits :) And made the output less redundant.

> nit: '3' is a magic number -- what would happen if you used: sys.exit(status_code[status])

I left the default return status as 3 here because with just sys.exit(status_code[status]), if status is not as expected by something in the status_code dict, then it just gives a None, which then sys.exit(None)'s. I can change this if you think it's unnecessary to test for something other than what is in the status_code dict.

Attachment #797530 - Attachment is obsolete: true

Attachment #797949 - Flags: review?(hwine)

John Zeller [:zeller]

Assignee

Comment 30

•

11 years ago

> >## THINGS NOT BEING TESTED FOR ###
> ># 1) UNKNOWN since that is in main and only executed when an exception occurs
> ># 2) Ranges of warning and critical thresholds since testing boundaries for 1 set, should hold true for any set
> 
> Another thing that is not being tested is that pending_build_status honors
> the values being passed are tested. I.e. you're testing with the default
> values.

What should I do instead here?

> >class TestSequenceFunctions(unittest.TestCase):
> >
> >    warning_threshold = 2000
> >    critical_threshold = 5000
> 
> So why these values? Is one set of values sufficient? I'd change these to be
> not the default.

I picked these values arbitrarily. I also removed the defaults in the nagios plugin itself. Does this still apply then?

> >    def test_ok(self):
> >        """Test OK from 0 to warning_threshold - 1"""
> >        for x in range(self.warning_threshold):
> >            assert pending_builds_status(x, self.critical_threshold, self.warning_threshold) == 'OK'
> 
> There's no real value in the extended range -- the range function allows
> specification of a start point as well. Same with other functions.

The only reason I gave a range so big here is because it's cleaner than specifying to test 0, threshold-1, threshold and threshold+1. Better to be explicit about the specific values I am really caring about?

John Zeller [:zeller]

Assignee

Comment 31

•

11 years ago

Attached file check_pending_builds.py (obsolete) — Details

Fixed nits :) Should be good to go!

Attachment #797949 - Attachment is obsolete: true

Attachment #797949 - Flags: review?(hwine)

Attachment #798093 - Flags: review?(hwine)

John Zeller [:zeller]

Assignee

Comment 32

•

11 years ago

Attached file test_check_pending_builds.py — Details

Changes defaults and made the test cases only test boundary cases rather than a crazy large range.

Attachment #797528 - Attachment is obsolete: true

Attachment #798094 - Flags: review?(hwine)

John Zeller [:zeller]

Assignee

Updated

•

11 years ago

Attachment #798093 - Attachment mime type: text/x-python → text/plain

John Zeller [:zeller]

Assignee

Comment 33

•

11 years ago

Attached file check_pending_builds.py — Details

Added sys.exit(status_code[status])

Attachment #798093 - Attachment is obsolete: true

Attachment #798093 - Flags: review?(hwine)

Attachment #798101 - Flags: review?(hwine)

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Attachment #798094 - Flags: review?(hwine) → review+

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Attachment #798093 - Flags: review+

John Zeller [:zeller]

Assignee

Updated

•

11 years ago

Attachment #798101 - Attachment mime type: text/x-python → text/plain

John Zeller [:zeller]

Assignee

Comment 34

•

11 years ago

Comment on attachment 798094 [details]
test_check_pending_builds.py

Ready to go! :)

Attachment #798094 - Flags: checked-in?

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Attachment #798101 - Flags: review?(hwine) → review+

John Zeller [:zeller]

Assignee

Updated

•

11 years ago

Attachment #798101 - Flags: checked-in?

Hal Wine [:hwine] use NI!

Comment 35

•

11 years ago

Comment on attachment 798094 [details]
test_check_pending_builds.py

https://hg.mozilla.org/build/braindump/rev/fb1dfe7d15d7

Attachment #798094 - Flags: checked-in? → checked-in+

Hal Wine [:hwine] use NI!

Comment 36

•

11 years ago

Comment on attachment 798101 [details]
check_pending_builds.py

https://hg.mozilla.org/build/braindump/rev/fb1dfe7d15d7

Attachment #798101 - Flags: checked-in? → checked-in+

John Zeller [:zeller]

Assignee

Updated

•

11 years ago

Status: NEW → ASSIGNED

John Zeller [:zeller]

Assignee

Updated

•

11 years ago

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → WORKSFORME

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Flags: needinfo?(hwine)

Hal Wine [:hwine] use NI!

Updated

•

11 years ago

Resolution: WORKSFORME → FIXED

Nobody; OK to take it and work on it

Updated

•

7 years ago

Component: Tools → General

You need to log in before you can comment on or make changes to this bug.