Closed Bug 802114 Opened 12 years ago Closed 7 years ago

[mock] Wrap hgtool.py inside a call to retry.py in valgrind.sh

Categories: Release Engineering :: General, defect, P4

Tracking: Not tracked

RESOLVED INCOMPLETE

People: Reporter: emorley, Unassigned

References: Blocks 1 open bug

Details: Keywords: regression, sheriffing-untriaged, Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3052] [hgtool][valgrind][simple]

The following failure should have been a RETRY (and would have been before mock):

Linux x86-64 mozilla-central valgrind on 2012-10-15 03:06:15 PDT for push 942ed5747b63

slave: bld-linux64-ec2-002

https://tbpl.mozilla.org/php/getParsedLog.php?id=16113510&tree=Firefox#error0

{
command: START
command: hg update -C -r 942ed5747b63807e625f5b692564fe31496c3117
command: cwd: /builds/slave/m-cen-lnx64-valgrind/src
command: output:
abort: data/toolkit/Makefile.in.i@1b08914858da: no match found!
command: ERROR
Traceback (most recent call last):
  File "/builds/slave/m-cen-lnx64-valgrind/scripts/buildfarm/utils/../../lib/python/util/commands.py", line 42, in run_cmd
    return subprocess.check_call(cmd, **kwargs)
  File "/usr/lib64/python2.6/subprocess.py", line 502, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['hg', 'update', '-C', '-r', '942ed5747b63807e625f5b692564fe31496c3117']' returned non-zero exit status 255
command: END (97.69s elapsed)

Error updating /builds/slave/m-cen-lnx64-valgrind/src from sharedRepo (/builds/hg-shared/mozilla-central): 
command: START
command: hg clone http://hg.mozilla.org/mozilla-central /builds/slave/m-cen-lnx64-valgrind/src
command: cwd: /builds/slave/m-cen-lnx64-valgrind
command: output:

command timed out: 1200 seconds without output, attempting to kill
program finished with exit code 247
elapsedTime=1317.375264
========= Finished 'mock_mozilla -v ...' failed (results: 2, elapsed: 21 mins, 57 secs) (at 2012-10-15 03:31:25.573484) =========
}
Android mozilla-central nightly on 2012-10-18 03:06:19 PDT for push cb573b9307e5

slave: bld-linux64-ec2-045

https://tbpl.mozilla.org/php/getParsedLog.php?id=16227569&tree=Firefox

{
06:37:12    ERROR -  abort: connection ended unexpectedly
06:37:12    ERROR -  Automation Error: hg not responding
06:37:12    ERROR - Return code: 255
}
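The log excerpts above show the kind of hg failure the reporter expects to end in RETRY. As a rough illustration only, the sketch below checks log text for those messages. The pattern list and the name `looks_transient` are hypothetical, taken from the lines quoted in this bug, and are not the actual buildbot or mock configuration; treating "no match found!" as transient is an assumption based on this report.

```python
import re

# Hypothetical patterns, copied from the log lines quoted in this bug.
TRANSIENT_HG_ERRORS = [
    re.compile(r"abort: connection ended unexpectedly"),
    re.compile(r"abort: .*: no match found!"),
    re.compile(r"Automation Error: hg not responding"),
]


def looks_transient(log_text):
    """Return True if any known transient hg error appears in log_text."""
    return any(pattern.search(log_text) for pattern in TRANSIENT_HG_ERRORS)
```

A job wrapper could call `looks_transient()` on the captured output and request a retry instead of reporting a hard failure.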
jhopkins, what do I need to change to fix this? :-)
Maybe we need to wrap hgtool.py in valgrind.sh with retry.py?
This happened in https://tbpl.mozilla.org/?noignore=1&jobname=valgrind&rev=93cc1ee94291 as well:

( https://tbpl.mozilla.org/php/getParsedLog.php?id=16410542&tree=Firefox&full=1 )

========= Started 'hg clone ...' failed (results: 2, elapsed: 17 secs) (at 2012-10-24 03:22:13.474262) =========
hg clone http://hg.mozilla.org/build/tools scripts
 in dir /builds/slave/m-cen-lnx-valgrind/. (timeout 1200 secs)
 watching logfiles {}
 argv: ['hg', 'clone', 'http://hg.mozilla.org/build/tools', 'scripts']
 environment:
  CCACHE_COMPRESS=1
  CCACHE_DIR=/builds/ccache
  CCACHE_HASHDIR=
  CCACHE_UMASK=002
  DISPLAY=:2
  G_BROKEN_FILENAMES=1
  HG_SHARE_BASE_DIR=/builds/hg-shared
  HISTCONTROL=ignoredups
  HISTSIZE=1000
  HOME=/home/cltbld
  HOSTNAME=bld-linux64-ec2-044.build.aws-us-west-1.mozilla.com
  LC_ALL=C
  LD_LIBRARY_PATH=/tools/gcc-4.3.3/installed/lib
  LESSOPEN=|/usr/bin/lesspipe.sh %s
  LOGNAME=cltbld
  MAIL=/var/spool/mail/cltbld
  MOZ_CRASHREPORTER_NO_REPORT=1
  MOZ_OBJDIR=obj-firefox
  MOZ_UPDATE_CHANNEL=nightly
  PATH=/tools/buildbot/bin:/usr/local/bin:/usr/lib/ccache:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/tools/git/bin:/tools/python27/bin:/tools/python27-mercurial/bin:/home/cltbld/bin
  POST_SYMBOL_UPLOAD_CMD=/usr/local/bin/post-symbol-upload.py
  PWD=/builds/slave/m-cen-lnx-valgrind
  REVISION=93cc1ee9429165ad859ac031ade8fde49eceeeaa
  SHELL=/bin/bash
  SHLVL=1
  SYMBOL_SERVER_HOST=symbols1.dmz.phx1.mozilla.com
  SYMBOL_SERVER_PATH=/mnt/netapp/breakpad/symbols_ffx/
  SYMBOL_SERVER_SSH_KEY=/home/mock_mozilla/.ssh/ffxbld_dsa
  SYMBOL_SERVER_USER=ffxbld
  TERM=linux
  TINDERBOX_OUTPUT=1
  USER=cltbld
  _=/tools/buildbot/bin/python
 using PTY: False
transaction abort!
rollback completed
abort: connection ended unexpectedly
requesting all changes
adding changesets
program finished with exit code 255
elapsedTime=17.732972
========= Finished 'hg clone ...' failed (results: 2, elapsed: 17 secs) (at 2012-10-24 03:22:31.255505) =========

========= Skipped  (results: not started, elapsed: not started) =========
========= Skipped  (results: not started, elapsed: not started) =========


See also https://tbpl.mozilla.org/php/getParsedLog.php?id=16410548&tree=Firefox&full=1
The best thing is to wrap hgtool.py inside a call to retry.py.
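To make the suggestion concrete, here is a minimal sketch of what wrapping a command in a retry loop looks like. It is not the real retry.py from the build tools repository: the function name `run_with_retries`, its parameters, and the default values are all assumptions chosen for illustration.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=5, sleep_seconds=30, runner=subprocess.call):
    """Run cmd until it exits 0 or attempts run out; return the last exit code.

    `runner` is any callable that takes the command list and returns an exit
    code. It defaults to subprocess.call and is a parameter only so the loop
    can be exercised without running a real command.
    """
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = runner(cmd)
        if returncode == 0:
            return 0
        # Wait before the next attempt, but not after the final one.
        if attempt < attempts:
            time.sleep(sleep_seconds)
    return returncode
```

For example, `run_with_retries(['hg', 'clone', 'http://hg.mozilla.org/build/tools', 'scripts'])` would rerun the clone from the log above up to five times. The sketch does not clean up a partially created checkout between attempts, which a real wrapper would probably need to handle.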
(In reply to Chris AtLee [:catlee] from comment #3)
> Maybe we need to wrap hgtool.py in valgrind.sh with retry.py?

(In reply to Chris AtLee [:catlee] from comment #5)
> The best thing is to wrap hgtool.py inside a call to retry.py.

Changing summary as appropriate.
Summary: mock doesn't RETRY on hg failures such as: "abort: data/toolkit/Makefile.in.i@1b08914858da: no match found!" → [mock] Wrap hgtool.py inside a call to retry.py in valgrind.sh
Whiteboard: [sheriff-want]
Priority: -- → P4
Whiteboard: [hgtool][valgrind][simple]
Product: mozilla.org → Release Engineering
Whiteboard: [hgtool][valgrind][simple] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3043] [hgtool][valgrind][simple]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3043] [hgtool][valgrind][simple] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3047] [hgtool][valgrind][simple]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3047] [hgtool][valgrind][simple] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/3052] [hgtool][valgrind][simple]
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Component: General Automation → General