mozpool - A failing device request should retrigger the job



Release Engineering
6 years ago
2 months ago


(Reporter: armenzg, Assigned: armenzg)





(1 attachment)



6 years ago
Created attachment 694515 [details] [diff] [review]
retry on failure to request a device

08:43:59     INFO - #####
08:43:59     INFO - ##### Running request-device step.
08:43:59     INFO - #####
08:43:59     INFO - Getting output from command: ['/builds/panda-0119/test/build/venv/bin/python', '-c', 'from distutils.sysconfig import get_python_lib; print(get_python_lib())']
08:43:59     INFO - Copy/paste: /builds/panda-0119/test/build/venv/bin/python -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())"
08:43:59     INFO - Reading from file tmpfile_stdout
08:43:59     INFO - Output received:
08:43:59     INFO -  /builds/panda-0119/test/build/venv/lib/python2.7/site-packages
08:43:59     INFO - Request POST
08:43:59    ERROR - Bad return status from 500!
Traceback (most recent call last):
  File "scripts/scripts/", line 140, in <module>
  File "/builds/panda-0119/test/scripts/mozharness/base/", line 730, in run
    self._possibly_run_method(method_name, error_if_missing=True)
  File "/builds/panda-0119/test/scripts/mozharness/base/", line 687, in _possibly_run_method
    return getattr(self, method_name)()
  File "scripts/scripts/", line 95, in request_device
    b2gbase=b2gbase, pxe_config=None)
  File "/builds/panda-0119/test/scripts/mozharness/mozilla/testing/", line 294, in request_device
  File "/builds/panda-0119/test/scripts/mozharness/mozilla/testing/", line 70, in check_mozpool_status
    raise MozpoolException('mozpool status not ok, code %s' % pprint.pformat(status))
mozharness.mozilla.testing.mozpool.MozpoolException: mozpool status not ok, code 500
program finished with exit code 1
Attachment #694515 - Flags: review?(aki)


6 years ago
Assignee: nobody → armenzg
Blocks: 819492

Comment 1

6 years ago
Comment on attachment 694515 [details] [diff] [review]
retry on failure to request a device


                self.fatal("We could not request the device: %s" % str(e))

Attachment #694515 - Flags: review?(aki) → review+
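
The patch is not quoted in this bug beyond the `self.fatal(...)` line above, so what follows is only a minimal sketch of the idea in the summary: treat a failed device request as a retryable condition rather than a plain failure. `RETRY_EXIT_CODE`, `request_device_or_retry`, and the stand-in `MozpoolException` class are hypothetical names for illustration; the real mozharness code may look different.

```python
class MozpoolException(Exception):
    """Stand-in for the MozpoolException seen in the traceback."""


# Hypothetical exit code that the job scheduler is assumed to treat as
# "retrigger this job". The actual value used in production is not shown here.
RETRY_EXIT_CODE = 4


def check_mozpool_status(status):
    """Raise if the HTTP status is not 200 (message mirrors the traceback)."""
    if status != 200:
        raise MozpoolException('mozpool status not ok, code %s' % status)


def request_device_or_retry(request_fn):
    """Call request_fn(), which is assumed to return an HTTP status code.

    Returns (exit_code, message). A failed request maps to the retry exit
    code instead of a generic failure.
    """
    try:
        check_mozpool_status(request_fn())
    except MozpoolException as e:
        return RETRY_EXIT_CODE, "We could not request the device: %s" % str(e)
    return 0, "device requested"
```

With this shape, a 500 from the request step yields the retry exit code and the same message as the reviewed `self.fatal` call, while a 200 yields a normal exit.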

Comment 2

6 years ago
dustin, even though we got a 500 status code, the requests did go through.
In fact, they are still showing in mozpool:

* We request a device [1]
* We then check the status returned by mozpool [2]

Somehow that check got a 500 status code.


Comment 3

6 years ago
Comment on attachment 694515 [details] [diff] [review]
retry on failure to request a device

I also landed an import of MozpoolException from
Attachment #694515 - Flags: checked-in+


6 years ago
Last Resolved: 6 years ago
Resolution: --- → FIXED

Comment 4

6 years ago
15:26 armenzg: dustin: is dbcron something new that got added?
15:26 dustin: armenzg: there's a bug to add one
15:26 dustin: yeah, it was added late last week
15:26 armenzg: is this what caused the 500 issue?
15:27 dustin: yes
15:28 dustin: basically the tables to insert the log entries into weren't there
15:28 dustin: oh, no bug yet, but in my TODO list - "newbug - nagios check to monitor mozpool partitions"
15:28 dustin: bug 819186 introduced dbcron
15:28 bugbot: Bug normal, --, ---, dustin, RESOLVED FIXED, use a crontask on the admin host, rather than a MySQL Scheduled Task, to create new log partitions

Bug 823661 is the fix for this particular issue (dbcron wasn't running because the 'mysql' command wasn't installed), and bug 823666 is the bug to monitor the partitions.  Sheeri recommended this a few weeks ago, and I had it in my TODO but hadn't implemented it yet.  Shame on me!
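
For illustration only, a partition-monitoring check of the kind mentioned above could look roughly like the sketch below. It uses the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). Everything else is an assumption: the function name `check_partitions`, the `warn_days` threshold, and the idea that each log partition can be described by the date at which it stops accepting rows. How mozpool actually names or bounds its partitions is not stated in this bug.

```python
import datetime

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_partitions(partition_ends, today, warn_days=2):
    """Report whether log partitions exist far enough into the future.

    partition_ends: iterable of datetime.date values, each assumed to be the
    exclusive upper bound of an existing partition (hypothetical model).
    """
    latest = max(partition_ends, default=None)
    if latest is None or latest <= today:
        # Inserts for today's log entries would have nowhere to go.
        return CRITICAL, "no partition covers today"
    days_left = (latest - today).days
    if days_left <= warn_days:
        return WARNING, "partitions run out in %d day(s)" % days_left
    return OK, "partitions cover the next %d day(s)" % days_left
```

A wrapper script would print the message and exit with the returned code; the CRITICAL branch corresponds to the situation described here, where the tables for new log entries were missing.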
Product: → Release Engineering
Component: General Automation → General
Product: Release Engineering → Release Engineering