Closed
Bug 1049525
Opened 11 years ago
Closed 8 years ago
Tooltool timeouts do not produce a TBPL-compatible error message
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: emorley, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: sheriffing-P2, Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2574] )
Several times over the last few days, whilst we've been having infra issues, I've seen unhandled exceptions in tooltool which do not result in TBPL-friendly error output.
e.g.:
https://tbpl.mozilla.org/php/getParsedLog.php?id=45307376&full=1&branch=mozilla-inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=45307279&full=1&branch=mozilla-inbound
{
...
retry: Calling <function run_with_timeout at 0x7f944e579938> with args: (['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android/releng.manifest', 'fetch', '-c', '/builds/tooltool_cache'], 300, None, None, False, True), kwargs: {}, attempt #9
Executing: ['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android/releng.manifest', 'fetch', '-c', '/builds/tooltool_cache']
WARNING: Timeout (300) exceeded, killing process 2177
Traceback (most recent call last):
  File "/tools/tooltool.py", line 898, in <module>
    main()
  File "/tools/tooltool.py", line 895, in main
    exit(0 if process_command(options, args) else 1)
  File "/tools/tooltool.py", line 790, in process_command
    cache_folder=options['cache_folder'])
  File "/tools/tooltool.py", line 562, in fetch_files
    temp_file_name = fetch_file(base_urls, f)
  File "/tools/tooltool.py", line 442, in fetch_file
    indata = f.read(grabchunk)
  File "/tools/python27/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/tools/python27/lib/python2.7/httplib.py", line 561, in read
    s = self.fp.read(amt)
  File "/tools/python27/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
KeyboardInterrupt
retry: Failed, sleeping 300 seconds before retrying
retry: Calling <function run_with_timeout at 0x7f944e579938> with args: (['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android/releng.manifest', 'fetch', '-c', '/builds/tooltool_cache'], 300, None, None, False, True), kwargs: {}, attempt #10
Executing: ['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android/releng.manifest', 'fetch', '-c', '/builds/tooltool_cache']
WARNING: Timeout (300) exceeded, killing process 2192
Traceback (most recent call last):
  File "/tools/tooltool.py", line 898, in <module>
    main()
  File "/tools/tooltool.py", line 895, in main
    exit(0 if process_command(options, args) else 1)
  File "/tools/tooltool.py", line 790, in process_command
    cache_folder=options['cache_folder'])
  File "/tools/tooltool.py", line 562, in fetch_files
    temp_file_name = fetch_file(base_urls, f)
  File "/tools/tooltool.py", line 442, in fetch_file
    indata = f.read(grabchunk)
  File "/tools/python27/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/tools/python27/lib/python2.7/httplib.py", line 561, in read
    s = self.fp.read(amt)
  File "/tools/python27/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
KeyboardInterrupt
retry: Giving up on <function run_with_timeout at 0x7f944e579938>
Unable to successfully run ['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android/releng.manifest', 'fetch', '-c', '/builds/tooltool_cache'] after 10 attempts
program finished with exit code 1
elapsedTime=5002.372033
========= Finished 'sh /builds/slave/m-in-and-d-0000000000000000000/tools/scripts/tooltool/tooltool_wrapper.sh ...' failed (results: 2, elapsed: 1 hrs, 23 mins, 40 secs) (at 2014-08-05 23:31:06.272956) =========
}
This is coming from:
https://github.com/mozilla/build-tooltool/blob/master/tooltool.py#L442
It would be great if we could get TBPL-compatible output here. I'm not sure where best to handle this: in tooltool.py, retry.py or tooltool_wrapper.sh.
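For the tooltool.py option, a minimal sketch of the idea: wrap the entry point so that the interrupt sent on timeout produces one summary line instead of an unhandled traceback. The wrapper name `run_main`, the message text and the `Automation Error:` prefix are assumptions for illustration only, not tooltool's actual code, and the string TBPL should match is not decided in this bug. The sketch is written for Python 3, whereas the log above comes from Python 2.7.

```python
import sys

# Assumed prefix for illustration; this bug does not specify the final string.
ERROR_PREFIX = "Automation Error:"


def run_main(main, stream=sys.stderr):
    """Call main() and return a process exit code.

    A KeyboardInterrupt (what the traceback above ends in) is reported
    as a single summary line rather than an unhandled traceback.
    """
    try:
        return 0 if main() else 1
    except KeyboardInterrupt:
        stream.write(
            "%s tooltool interrupted before the fetch completed\n" % ERROR_PREFIX
        )
        stream.flush()
        return 1
```

A script would end with something like `sys.exit(run_main(main))`, so the exit code stays non-zero on interruption and the retry logic still sees a failure.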
Reporter
Updated•11 years ago
Summary: Tooltool unhandled KeyboardInterrupt exception during f.read(grabchunk) with no TBPL-friendly error message → Tooltool timeouts do not produce a TBPL-compatible error message
Reporter
Comment 1•11 years ago
Comment 2•11 years ago
The simplest solution is probably to change the message logged by retry.py:
https://hg.mozilla.org/build/tools/file/tip/buildfarm/utils/retry.py#l80
This would make every timeout managed by retry.py display a TBPL-compatible message (not only the tooltool ones, if there are any others).
Ed, which message would you like to display here?
Flags: needinfo?(emorley)
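A rough sketch of what the retry.py approach could look like, assuming the timeout is detected in one place and the log parser flags lines by prefix. This is not the real retry.py: the two-argument `run_with_timeout` below, the helper `format_timeout_message` and the `Automation Error:` prefix are hypothetical, and the exact message is the open question in this comment. It targets Python 3, unlike the Python 2.7 tooling in the log.

```python
import subprocess

# Hypothetical prefix; the message to display is still undecided in this bug.
TBPL_PREFIX = "Automation Error:"


def format_timeout_message(cmd, timeout):
    """Build a one-line message that a log parser can match on its prefix."""
    return "%s command timed out after %d seconds: %s" % (
        TBPL_PREFIX, timeout, " ".join(cmd))


def run_with_timeout(cmd, timeout):
    """Run cmd; return its exit code, or None if it ran longer than timeout."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        print(format_timeout_message(cmd, timeout), flush=True)
        return None
```

Because the message is emitted where the timeout is detected, any command run this way gets the same line, whether or not it is tooltool.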
Reporter
Comment 3•11 years ago
(In reply to Simone Bruno [:simone] from comment #2)
> This would make all timeouts managed by retry.py display a tbpl compatible
> message (not only tooltool ones, if any).
>
> Ed, which message would you like to display here?
Yeah, that sounds sensible. I'll just need to check that this won't cause redundant error summary output for other failure modes, and think of the best string to use. Added to my list for next week.
Flags: needinfo?(emorley)
Reporter
Comment 4•11 years ago
Updated•11 years ago
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2574]
Assignee
Updated•9 years ago
Component: Tools → General
Updated•8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE