Closed Bug 983725 (opened 11 years ago, closed 9 years ago)

Panda tests don't release the mozpool request when bm-remote is down

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: Callek, Unassigned)


So today we had a major scl3 power failure that caused pandas to fail while trying to get binaries off bm-remote. Then, once bm-remote was up, we failed due to the devices not being "ready" because they were "busy" (e.g. busy from prior jobs), and thus retrying a _lot_.

Looking at the log for the failed ones, we never told mozpool to release the device.

https://tbpl.mozilla.org/php/getParsedLog.php?id=36148449&tree=Mozilla-Inbound&full=1#error0

09:10:50 INFO - #### Running reftest suites
09:10:50 INFO - mkdir: /builds/panda-0761/test/build/hostutils
09:10:50 INFO - Downloading http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip to /builds/panda-0761/test/build/tegra-host-utils.Linux.742597.zip
09:10:50 INFO - retry: Calling <bound method PandaTest._download_file of <__main__.PandaTest object at 0x203ac90>> with args: ('http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip', '/builds/panda-0761/test/build/tegra-host-utils.Linux.742597.zip'), kwargs: {}, attempt #1
09:10:50 WARNING - Server returned status 500 HTTP Error 500: Server Error for http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip
09:10:50 INFO - retry: Failed, sleeping 60 seconds before retrying
09:11:50 INFO - retry: Calling <bound method PandaTest._download_file of <__main__.PandaTest object at 0x203ac90>> with args: ('http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip', '/builds/panda-0761/test/build/tegra-host-utils.Linux.742597.zip'), kwargs: {}, attempt #2
09:11:50 WARNING - Server returned status 500 HTTP Error 500: Server Error for http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip
09:11:50 INFO - retry: Failed, sleeping 120 seconds before retrying
09:13:50 INFO - retry: Calling <bound method PandaTest._download_file of <__main__.PandaTest object at 0x203ac90>> with args: ('http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip', '/builds/panda-0761/test/build/tegra-host-utils.Linux.742597.zip'), kwargs: {}, attempt #3
09:13:50 WARNING - Server returned status 500 HTTP Error 500: Server Error for http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip
09:13:50 INFO - retry: Failed, sleeping 240 seconds before retrying
09:17:50 INFO - retry: Calling <bound method PandaTest._download_file of <__main__.PandaTest object at 0x203ac90>> with args: ('http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip', '/builds/panda-0761/test/build/tegra-host-utils.Linux.742597.zip'), kwargs: {}, attempt #4
09:17:50 WARNING - Server returned status 500 HTTP Error 500: Server Error for http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip
09:17:50 INFO - retry: Failed, sleeping 300 seconds before retrying
09:22:50 INFO - retry: Calling <bound method PandaTest._download_file of <__main__.PandaTest object at 0x203ac90>> with args: ('http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip', '/builds/panda-0761/test/build/tegra-host-utils.Linux.742597.zip'), kwargs: {}, attempt #5
09:22:50 WARNING - Server returned status 500 HTTP Error 500: Server Error for http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip
09:22:50 FATAL - Can't download from http://bm-remote.build.mozilla.org/tegra/tegra-host-utils.Linux.742597.zip to /builds/panda-0761/test/build/tegra-host-utils.Linux.742597.zip!
09:22:50 FATAL - Caught exception: HTTP Error 500: Server Error
09:22:50 FATAL - Caught exception: HTTP Error 500: Server Error
09:22:50 FATAL - Caught exception: HTTP Error 500: Server Error
09:22:50 FATAL - Caught exception: HTTP Error 500: Server Error
09:22:50 FATAL - Caught exception: HTTP Error 500: Server Error
09:22:50 FATAL - Running post_fatal callback...
09:22:50 FATAL - Exiting -1
09:22:50 INFO - Running post-action listener: _resource_record_post_action
09:22:50 INFO - Running post-run listener: _resource_record_post_run
09:22:50 INFO - Running post-run listener: _upload_blobber_files
09:22:50 INFO - Blob upload gear active.
09:22:50 INFO - Preparing to upload files from /builds/panda-0761/test/build/blobber_upload_dir.
09:22:50 INFO - Files from /builds/panda-0761/test/build/blobber_upload_dir are to be uploaded with <mozilla-inbound> branch at the following location(s): https://blobupload.elasticbeanstalk.com
09:22:50 INFO - Running command: ['/builds/panda-0761/test/build/venv/bin/python', '/builds/panda-0761/test/build/venv/bin/blobberc.py', '-u', 'https://blobupload.elasticbeanstalk.com', '-a', '/builds/panda-0761/test/oauth.txt', '-b', 'mozilla-inbound', '-d', '/builds/panda-0761/test/build/blobber_upload_dir']
09:22:50 INFO - Copy/paste: /builds/panda-0761/test/build/venv/bin/python /builds/panda-0761/test/build/venv/bin/blobberc.py -u https://blobupload.elasticbeanstalk.com -a /builds/panda-0761/test/oauth.txt -b mozilla-inbound -d /builds/panda-0761/test/build/blobber_upload_dir
09:22:50 INFO - (blobuploader) - INFO - Open directory for files ...
09:22:50 INFO - (blobuploader) - INFO - Uploading /builds/panda-0761/test/build/blobber_upload_dir/logcat.log ...
09:22:50 INFO - (blobuploader) - INFO - Using https://blobupload.elasticbeanstalk.com
09:22:50 INFO - (blobuploader) - INFO - Uploading, attempt #1.
09:22:52 INFO - (blobuploader) - INFO - TinderboxPrint: Uploaded logcat.log to http://mozilla-releng-blobs.s3.amazonaws.com/blobs/mozilla-inbound/sha512/e492493ef3c735b23ca7675bc693513e1fcb833c6236e9992a69be17b14a74a5ac77d45fefa175a4657287d8e095a14cbb330f98eb0524a26bf5c5a54b95cdc3
09:22:52 INFO - (blobuploader) - INFO - Blobserver returned 202. File uploaded!
09:22:52 INFO - (blobuploader) - INFO - Done attempting.
09:22:52 INFO - (blobuploader) - INFO - Iteration through files over.
09:22:52 INFO - Return code: 0
program finished with exit code 255
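
For illustration only, here is a minimal sketch of the behaviour this bug asks for: releasing the device request on every exit path, including a failed download. It is not the actual PandaTest or mozpool client code. FakeMozpool, request_device, release_request, device_request, download_host_utils and run_job are all hypothetical names invented for this example. backoff_schedule just reproduces the sleep times seen in the log above (60, 120, 240, 300 seconds); its parameter names and defaults are assumptions.

```python
import contextlib


class FakeMozpool:
    """Hypothetical stand-in for a mozpool client; NOT the real API."""

    def __init__(self):
        self.active = set()

    def request_device(self, device):
        # Record the device as held by this job.
        self.active.add(device)
        return device

    def release_request(self, request):
        # Give the device back so later jobs do not see it as busy.
        self.active.discard(request)


@contextlib.contextmanager
def device_request(pool, device):
    """Hold a device request and release it even if the body fails."""
    request = pool.request_device(device)
    try:
        yield request
    finally:
        pool.release_request(request)


def backoff_schedule(attempts=5, sleeptime=60, max_sleeptime=300):
    """Sleeps between attempts: doubling with a cap (assumed defaults)."""
    sleeps = []
    for _ in range(attempts - 1):
        sleeps.append(sleeptime)
        sleeptime = min(sleeptime * 2, max_sleeptime)
    return sleeps


def download_host_utils(url):
    """Simulates bm-remote being down: every download fails."""
    raise IOError("HTTP Error 500: Server Error for %s" % url)


def run_job(pool, device, url):
    """Returns an exit code; the request is released on every path."""
    try:
        with device_request(pool, device):
            download_host_utils(url)
    except IOError:
        return 255
    return 0
```

Because the release sits in a finally clause, it would also run if the failure surfaced as SystemExit rather than an ordinary exception.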
Depends on: 1186615
Closing, since we are decommissioning pandas in bug 1186615.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard