We need a mechanism to retry failed calls and, possibly, to fail gracefully. This is the error from the deep log:

spidermonkey/js1_5/Date/regress-346027.abc : unexpected exit code expected:0 actual:1 Signal Name: SIGHUP FAILED!
spidermonkey/js1_5/Date/regress-346027.abc : captured output: cannot open file: regress-346027.abc||ssh: asteammips2: Name or service not known|lost connection|rm: cannot remove 'regress-346027.abc': No such file or directory|
Created attachment 498435: add retry functionality to ssh-shell.sh

The try_command function could be moved out of this file and into all/environment.sh if we want to use the function in other scripts.
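For readers without access to the attachment, a retry wrapper of this kind might look roughly like the sketch below. This is not the code from attachment 498435: only the name try_command and the fixed failure code 99 come from this bug, while MAX_RETRIES, RETRY_DELAY, and their defaults are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical sketch of a retry wrapper; not the patch from this bug.
# MAX_RETRIES and RETRY_DELAY are assumed names with assumed defaults.
MAX_RETRIES=${MAX_RETRIES:-3}
RETRY_DELAY=${RETRY_DELAY:-0}   # seconds to wait between attempts

try_command() {
    local attempt=1
    local ec=0
    while [ "$attempt" -le "$MAX_RETRIES" ]; do
        # Run the command exactly as passed, preserving argument quoting.
        "$@"
        ec=$?
        if [ "$ec" -eq 0 ]; then
            return 0
        fi
        echo "attempt $attempt of $MAX_RETRIES failed (exit code $ec): $*" >&2
        attempt=$((attempt + 1))
        # Only wait if another attempt will follow.
        if [ "$attempt" -le "$MAX_RETRIES" ]; then
            sleep "$RETRY_DELAY"
        fi
    done
    # All attempts failed: report a fixed exit code.
    return 99
}
```

A caller would wrap each remote call, for example `try_command ssh "$host" "$cmd"`, where `$host` and `$cmd` are placeholders for whatever the script already passes to ssh.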
Assignee: nobody → cpeyer
Status: NEW → ASSIGNED
Attachment #498435 - Flags: review?(brbaker)
Comment on attachment 498435: add retry functionality to ssh-shell.sh

Looks good, but I have a couple of comments:

1) When the command call fails, there should be some sort of notification back to the calling script and the build system. The "adb_proxy" script handles this by appending its failures to a temp file [A]; the main script then looks for that file at the end of an acceptance run and, if it exists, cats it to stdout and also generates a message for buildbot to display [B].

[A] http://hg.mozilla.org/tamarin-redux/annotate/tip/platform/android/adb_proxy.py#l91
[B] http://hg.mozilla.org/tamarin-redux/annotate/tip/build/buildbot/slaves/all/run-acceptance-generic-adb.sh#l195

2) On max retries, should the exit code be 99 or the last exit code from the command call? (Currently it is hardcoded to fail with exit code 99.)
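The notification pattern described in point 1 could be sketched in shell along the following lines. The function names record_failure and report_failures and the FAILURE_LOG path are hypothetical, chosen only for illustration; the buildbot message is left out because its format is not given in this bug.

```shell
#!/bin/bash
# Hypothetical sketch of the "append failures to a temp file, report at the
# end" pattern. FAILURE_LOG and both function names are assumptions.
FAILURE_LOG=${FAILURE_LOG:-/tmp/ssh-shell-failures.log}

# Called by the proxy script each time a command call fails.
record_failure() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$FAILURE_LOG"
}

# Called by the main script at the end of an acceptance run.
# Prints the recorded failures and returns 1 if there were any.
report_failures() {
    if [ -s "$FAILURE_LOG" ]; then
        echo "ssh-shell failures detected:"
        cat "$FAILURE_LOG"
        # Remove the log so the next run starts clean.
        rm -f "$FAILURE_LOG"
        return 1
    fi
    return 0
}
```

Keeping the record in a file rather than in a shell variable lets a proxy script that runs as a separate process pass its failures back to the main script.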
Attachment #498435 - Flags: review?(brbaker) → review+
changeset: 5700:86a3a15289bd
user: Chris Peyer <firstname.lastname@example.org>
summary: Bug 619956: have ssh-shell script retry calls multiple times before failing (r=brbaker)
http://hg.mozilla.org/tamarin-redux/rev/86a3a15289bd
Created attachment 499031: Handle stderr

We need to capture stderr when making the command call so that, if a call fails once and then passes on a retry, the stderr from the failed attempt is not returned to the caller. Without this, the acceptance run still reported failures: it received unexpected stderr even though all of the tests passed after a connection retry.
Attachment #499031 - Flags: review?(cpeyer)
Comment on attachment 499031: Handle stderr

r+ with slight modifications discussed on the phone. Test for ./stderr files with -s instead of -f (-s is true only for a file that exists and is non-empty, whereas -f is also true for an empty file).
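The stderr handling discussed here could be sketched as follows. This is an illustration, not the reviewed patch: the function name run_quietly_with_retry, the use of mktemp, and the choice to return the last exit code are all assumptions. Each attempt writes stderr to a scratch file that is truncated on the next attempt, so only the final attempt's stderr can reach the caller, and the -s test skips the file when it is empty.

```shell
#!/bin/bash
# Hypothetical sketch: retry a command while keeping the stderr of failed
# attempts away from the caller. Function name and layout are assumptions.
run_quietly_with_retry() {
    local max=$1
    shift
    local errfile
    errfile=$(mktemp) || return 1
    local attempt=1
    local ec=0
    while [ "$attempt" -le "$max" ]; do
        # "2>" truncates the file, so output from earlier attempts is dropped.
        "$@" 2> "$errfile"
        ec=$?
        if [ "$ec" -eq 0 ]; then
            break
        fi
        attempt=$((attempt + 1))
    done
    # -s: forward stderr only if the file exists AND is non-empty.
    # -f would also be true for an empty file.
    if [ -s "$errfile" ]; then
        cat "$errfile" >&2
    fi
    rm -f "$errfile"
    return "$ec"
}
```

With this layout, a call that fails once with "lost connection" and then succeeds cleanly produces no stderr at all, which is the behavior the acceptance run needs.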
Attachment #499031 - Flags: review?(cpeyer) → review+
changeset: 5702:e9f6e1fffc4c
user: Brent Baker <email@example.com>
summary: Bug 619956: need to trap the stderr during retries so that it does not leak out to the runtests.py script (r+cpeyer)
http://hg.mozilla.org/tamarin-redux/rev/e9f6e1fffc4c
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED