[Meta] Tracker for failures where the error summary and/or bug suggestions are suboptimal

Status: NEW
Assignee: Unassigned
Product: Testing
Component: General
Opened: 5 years ago
Updated: 6 months ago
Reporter: emorley
Depends on: 19 bugs
Version: Trunk
Keywords: meta, sheriffing-P1

As we all know, there are a variety of failure modes that TBPL currently cannot cope with, so it either shows the failure in the annotated summary box but cannot match it to a bug, or worse, shows a completely unhelpful summary that requires opening the full log to find the cause. To make things worse, manual starring is a PITA: TBPL (understandably, in the current system) doesn't auto-mark bugs if you enter the bug number manually, so you have to paste the URL into the bug yourself.

Off the top of my head, cases we should improve:
* Compilation failure (include the actual error line, not just "make[N]: *** [foo.o] Error 1")
* Crashes (we should show the top frame or two in the summary, so you can match against bug summary without loading the log)
* Assertions (eg of style bug 774732)
* Leaks (of style bug 689247, not the new shutdown leak logger)
* Infra issues (particularly Android; eg things like bug 772531)
* Android OOM (ie for cases covered by bug 775227; plus we should start dumping more entries from the Android ADB log into TBPL's summary)
* xpcshell timeouts of the form of bug 762032 and friends
* Anything else that requires manual starring on a regular basis
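
For the compilation-failure case, a minimal sketch of preferring the real compiler error over make's wrapper line might look like the following. It is written in Python for illustration only; the pattern names and `pick_summary_lines` are hypothetical and are not taken from TBPL's PHP parser, and the regexes are assumptions based on the MSVC, rc.exe and make lines quoted in this bug.

```python
import re

# Hypothetical patterns for illustration; TBPL's real filter may differ.
# make's wrapper line, e.g. "make[6]: *** [jsinterp.o] Error 1"
MAKE_WRAPPER = re.compile(r"make(\[\d+\])?: \*\*\* \[.*\] Error \d+")
# The actual error line emitted by the compiler or resource compiler.
COMPILER_ERROR = re.compile(
    r"\(\d+\) : (fatal )?error [A-Z]+\d+\s*:"   # MSVC / rc.exe style
    r"|:\d+(:\d+)?: (fatal )?error:"            # gcc / clang style
)


def pick_summary_lines(log_lines):
    """Return compiler error lines if present, else make's wrapper lines."""
    errors = [line for line in log_lines if COMPILER_ERROR.search(line)]
    if errors:
        return errors
    # Fall back to the less helpful wrapper line so the failure is not lost.
    return [line for line in log_lines if MAKE_WRAPPER.search(line)]
```

The fallback keeps today's behaviour when no recognisable compiler error is found, so the summary never becomes emptier than it already is.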

Now some of the above will require TBPL parser changes (ie at http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/file/tip/php/inc/GeneralErrorFilter.php), which given the lack of tests, will be 'interesting'. 

Others will require (or can be worked around by) harness changes, e.g. bug 757838 (and as bug 772388 did).

CCing sheriffs: Please can you file dependants for any 'requires regular manual starring' or 'ways to make manual starring less likely to need the full log opening' type bugs that I may have overlooked.

Thank you :-)
No longer depends on: 747440
Depends on: 778690
Most if not all of these depend on TBPL parsing the full logs, which will be inefficient. Someone had plans to create a useful log parser component that would handle a lot of these cases, IIRC, but that was quite a while ago and I don't remember who it was. You might want to ping people in RelEng to see if they remember.
Another case we need to cover:
Bug 752243

(In reply to Ehsan Akhgari [:ehsan] from comment #1)
> Most if not all of these depend on TBPL parsing the full logs

You mean as opposed to line by line?
If so, I agree some might; but others we can work around in the test harnesses/releng scripts themselves.

Depends on: 780579
We should also TinderboxPrint or TEST-UNEXPECTED-FAIL
"Output exceeded 52428800 bytes, remaining output has been truncated"
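
A rough sketch of how a log-handling script could turn that truncation notice into a line the summary would pick up. This is Python for illustration only; `flag_truncation`, the `job_name` parameter and the wording of the emitted message are all hypothetical, and only the pipe-separated `TEST-...` shape is borrowed from the log excerpts in this bug.

```python
import re

# Matches the truncation notice quoted above; the byte count may vary.
TRUNCATION = re.compile(
    r"Output exceeded (\d+) bytes, remaining output has been truncated"
)


def flag_truncation(line, job_name="unknown-job"):
    """Return a TEST-UNEXPECTED-FAIL style line for a truncation notice.

    Returns None when the line is not a truncation notice.
    job_name and the message text are placeholders, not an agreed format.
    """
    match = TRUNCATION.search(line)
    if match is None:
        return None
    return "TEST-UNEXPECTED-FAIL | %s | log truncated after %s bytes" % (
        job_name,
        match.group(1),
    )
```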
(In reply to comment #2)
> Another case we need to cover:
> Bug 752243
> 
> (In reply to Ehsan Akhgari [:ehsan] from comment #1)
> > Most if not all of these depend on TBPL parsing the full logs
> 
> You mean as opposed to line by line?

No, as opposed to parsing the short log.
The... you mean the short log that Tinderbox used to create?

We no longer know about any log that isn't uploaded by buildbot directly to ftp.m.o, and there's just one log there, and nobody could describe any of them for any job as being short.
(In reply to comment #5)
> The... you mean the short log that Tinderbox used to create?
> 
> We no longer know about any log that isn't uploaded by buildbot directly to
> ftp.m.o, and there's just one log there, and nobody could describe any of them
> for any job as being short.

Ah, clearly I have not paid much attention to this stuff for quite a while.  Thanks for correcting me.  :-)
Slinging this in here so I don't forget:

[Child 2201] ###!!! ABORT: X_CopyArea: BadDrawable (invalid Pixmap or Window parameter); 59 requests ago: file ../../../toolkit/xre/nsX11ErrorHandler.cpp, line 157
https://tbpl.mozilla.org/php/getParsedLog.php?id=14770500&tree=Mozilla-Inbound
Bug 782505
Losing track of what I've slung in here already (or is filed as deps), but would rather dupes than the alternative.

https://tbpl.mozilla.org/php/getParsedLog.php?id=15044826&tree=Mozilla-Inbound
{
NOISE: RSS: Main: 55443456
NOISE: 
Traceback (most recent call last):
  File "C:\talos-slave\talos-data\talos\bcontroller.py", line 221, in ?
    sys.exit(main())
  File "C:\talos-slave\talos-data\talos\bcontroller.py", line 218, in main
    bcontroller.run()
  File "C:\talos-slave\talos-data\talos\bcontroller.py", line 161, in run
    results_file = open(self.browser_log, "a")
IOError: [Errno 13] Permission denied: 'browser_output.txt'
Failed tdhtmlr: 
		Stopped Fri, 07 Sep 2012 03:45:22
FAIL: Busted: tdhtmlr
FAIL: timeout exceeded
Traceback (most recent call last):
  File "run_tests.py", line 250, in run_tests
    talos_results.add(mytest.runTest(browser_config, test))
  File "C:\talos-slave\talos-data\talos\ttest.py", line 366, in runTest
    raise talosError("timeout exceeded")
talosError: 'timeout exceeded'
Traceback (most recent call last):
  File "run_tests.py", line 298, in ?
    main()
  File "run_tests.py", line 295, in main
    run_tests(parser)
  File "run_tests.py", line 259, in run_tests
    raise e
utils.talosError: 'timeout exceeded'
program finished with exit code 1
elapsedTime=3626.515000
TinderboxPrint:<a href = "http://hg.mozilla.org/integration/mozilla-inbound/rev/ebdb7c83b789">rev:ebdb7c83b789</a>
}
Depends on: 788518
Depends on: 790595
Depends on: 790602
Depends on: 790613
Depends on: 790618
Depends on: 790639
For various talos cases, filed bug 790602.

(In reply to Ed Morley [:edmorley UTC+1] from comment #0)
> * Infra issues (particularly Android; eg things like bug 772531)

Filed bug 790595, bug 790613, bug 790618.

(In reply to Ed Morley [:edmorley UTC+1] from comment #3)
> We should also TinderboxPrint or TEST-UNEXPECTED-FAIL
> "Output exceeded 52428800 bytes, remaining output has been truncated"

Filed bug 790639.
https://tbpl-dev.allizom.org/php/getParsedLog.php?id=15177826&tree=Firefox
{
========= Started Running verify.py failed (results: 2, elapsed: 34 secs) (at 2012-09-12 19:02:21.209689) =========
python /builds/sut_tools/verify.py
 in dir /builds/tegra-333/test/build (timeout 1200 secs)
 watching logfiles {}
 argv: ['python', '/builds/sut_tools/verify.py']
 environment:
  HOME=/home/cltbld
  PATH=/tools/buildbot-0.8.4-pre-moz2/bin:/usr/local/bin:/usr/local/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin
  PWD=/builds/tegra-333/test/build
  SUT_IP=10.250.51.173
  SUT_NAME=tegra-333
 using PTY: False
DEBUG: updateSUT: Using tegra 'tegra-333' found in env variable
INFO: Using tegra 'tegra-333' found in env variable
INFO: attempting to ping tegra
reconnecting socket
INFO: updateSUT.py: We're running SUTAgentAndroid Version 1.13
INFO: Got expected SUTAgent version '1.13'
INFO: attempting to create file /mnt/sdcard/writetest
Push File Failed to Validate!
program finished with exit code 1
elapsedTime=34.562456
========= Finished Running verify.py failed (results: 2, elapsed: 34 secs) (at 2012-09-12 19:02:55.787641) =========
}
Depends on: 790960
{
TEST-PASS | /builds/slave/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_wipeServer.js | test passed (time: 862.297ms)
Traceback (most recent call last):
  File "xpcshell/runxpcshelltests.py", line 992, in <module>
    main()
  File "xpcshell/runxpcshelltests.py", line 988, in main
    if not xpcsh.runTests(args[0], testdirs=args[1:], **options.__dict__):
  File "xpcshell/runxpcshelltests.py", line 866, in runTests
    self.removeDir(self.profileDir)
  File "xpcshell/runxpcshelltests.py", line 325, in removeDir
    shutil.rmtree(dirname)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 244, in rmtree
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 244, in rmtree
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 236, in rmtree
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 234, in rmtree
OSError: [Errno 13] Permission denied: '/var/folders/z5/wgqk5lfs3rggp278v9gyl1xc00000w/T/tmpdJ1l5W/Cache/3'
program finished with exit code 1
}

a la bug 752243
Bug 793091 will make many of the current manual starring Android cases RETRY, which will make our lives quite a bit easier :-)
Depends on: 793091
Depends on: 793627
https://tbpl.mozilla.org/php/getParsedLog.php?id=15462696&tree=Try

{
========= Started 'python mochitest/runtestsremote.py ...' warnings (results: 1, elapsed: 21 secs) (at 2012-09-23 14:34:25.388032) =========
python mochitest/runtestsremote.py --deviceIP 10.250.51.85 --xre-path ../hostutils/xre --utility-path ../hostutils/bin --certificate-path certs --app org.mozilla.fennec --console-level INFO --http-port 30245 --ssl-port 31245 --pidfile /builds/tegra-245/test/../runtestsremote.pid --run-only-tests android.json --symbols-path=http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/georg.fritzsche@googlemail.com-2a4865f16c65/try-android-armv6/fennec-17.0a2.en-US.android-arm-armv6.crashreporter-symbols.zip --total-chunks 8 --this-chunk 3
 in dir /builds/tegra-245/test/build/tests (timeout 2400 secs)
 watching logfiles {}
 argv: ['python', 'mochitest/runtestsremote.py', '--deviceIP', '10.250.51.85', '--xre-path', '../hostutils/xre', '--utility-path', '../hostutils/bin', '--certificate-path', 'certs', '--app', 'org.mozilla.fennec', '--console-level', 'INFO', '--http-port', '30245', '--ssl-port', '31245', '--pidfile', '/builds/tegra-245/test/../runtestsremote.pid', '--run-only-tests', 'android.json', u'--symbols-path=http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/georg.fritzsche@googlemail.com-2a4865f16c65/try-android-armv6/fennec-17.0a2.en-US.android-arm-armv6.crashreporter-symbols.zip', '--total-chunks', '8', '--this-chunk', '3']
 environment:
  HOME=/Users/cltbld
  MINIDUMP_SAVE_PATH=/builds/tegra-245/test/minidumps
  MINIDUMP_STACKWALK=/builds/tegra-245/test/tools/breakpad/linux/minidump_stackwalk
  PATH=/opt/local/bin:/opt/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
  PWD=/builds/tegra-245/test/build/tests
  SUT_IP=10.250.51.85
  SUT_NAME=tegra-245
  __CF_USER_TEXT_ENCODING=0x1F5:0:0
 closing stdin
 using PTY: False
reconnecting socket
{'uptime': ['0 days 0 hours 4 minutes 11 seconds 642 ms'], 'power': ['Power status:', ' AC power ONLINE', ' Battery charge NO BATTERY', ' Remaining charge: 0%', ' Battery Temperature: 0.0 (c)'], 'process': [['10026', '1589', 'com.svox.pico'], ['10028', '1577', 'com.android.defcontainer'], ['10007', '1202', 'com.android.inputmethod.latin'], ['1001', '1211', 'com.android.phone'], ['1000', '1020', 'system'], ['10029', '1335', 'com.android.deskclock'], ['10031', '1496', 'com.mozilla.SUTAgentAndroid'], ['10034', '1525', 'org.mozilla.ffxcp'], ['10018', '1219', 'com.android.launcher'], ['10013', '1443', 'com.cooliris.media'], ['10004', '1363', 'android.process.media'], ['10009', '1422', 'com.android.quicksearchbox'], ['10002', '1435', 'com.android.music'], ['10032', '1413', 'com.mozilla.watcher'], ['10006', '1393', 'com.android.mms'], ['10010', '1374', 'com.android.providers.calendar'], ['10014', '1352', 'com.android.email'], ['10017', '1344', 'com.android.bluetooth'], ['10015', '1253', 'android.process.acore'], ['1000', '1230', 'com.android.settings']], 'screen': ['X:1024 Y:768'], 'memory': ['PA:819421184, FREE: 705040384'], 'systime': ['2012/09/23 02:34:25:637'], 'rotation': ['ROTATION:0'], 'disk': [], 'os': ['harmony-eng 2.2 FRF91 20110202.102810 test-keys'], 'id': ['00:21:e8:70:95:76'], 'uptimemillis': ['251664']}
INFO | runtests.py | Installing extension at /builds/tegra-245/test/build/tests/mochitest/extensions/roboextender@mozilla.org to /tmp/tmpbhKd30.
INFO | runtests.py | Installing extension at /builds/tegra-245/test/build/tests/mochitest/extensions/specialpowers to /tmp/tmpbhKd30.
INFO | runtests.py | Installing extension at /builds/tegra-245/test/build/tests/mochitest/extensions/worker to /tmp/tmpbhKd30.
INFO | runtests.py | Installing extension at /builds/tegra-245/test/build/tests/mochitest/extensions/workerbootstrap to /tmp/tmpbhKd30.
pushing directory: /tmp/tmpbhKd30 to /mnt/sdcard/tests/profile
args: ['/builds/tegra-245/test/build/hostutils/bin/xpcshell', '-g', '/builds/tegra-245/test/build/hostutils/xre', '-v', '170', '-f', './httpd.js', '-e', "const _PROFILE_PATH = '/tmp/tmpMuhcm_';const _SERVER_PORT = '30245'; const _SERVER_ADDR = '10.250.48.224'; const _TEST_PREFIX = undefined;", '-f', './server.js']
INFO | runtests.py | Server pid: 25122
pushing directory: /tmp/tmpbhKd30 to /mnt/sdcard/tests/profile
INFO | runtests.py | Running tests: start.

FIRE PROC: '"MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmpHdd3Vypidlog,XPCOM_MEM_BLOAT_LOG=/tmp/tmpbhKd30/runtests_leaks.log" org.mozilla.fennec -no-remote -profile /mnt/sdcard/tests/profile/ http://mochi.test:8888/tests/?autorun=1&closeWhenDone=1&logFile=%2Fmnt%2Fsdcard%2Ftests%2Flogs%2Fmochitest.log&fileLevel=INFO&consoleLevel=INFO&totalChunks=8&thisChunk=3&testManifest=android.json&runOnly=true'
INFO | runtests.py | Received unexpected exception while running application
Traceback (most recent call last):
  File "/builds/tegra-245/test/build/tests/mochitest/runtests.py", line 729, in runTests
    timeout = timeout)
  File "/builds/tegra-245/test/build/tests/mochitest/automation.py", line 981, in runApp
    stderr = subprocess.STDOUT)
  File "/builds/tegra-245/test/build/tests/mochitest/remoteautomation.py", line 113, in Process
    return self.RProcess(self._devicemanager, cmd, stdout, stderr, env, cwd)
  File "/builds/tegra-245/test/build/tests/mochitest/remoteautomation.py", line 127, in __init__
    raise Exception("unable to launch process")
Exception: unable to launch process
WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!
}
Depends on: 793630
{
"command timed out: 3600 seconds without output, attempting to kill"
}
etc

{
remoteFailed: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion.
}

https://tbpl.mozilla.org/php/getParsedLog.php?id=15474592&tree=Mozilla-Inbound
{
========= Started 'hg clone ...' failed (results: 2, elapsed: 0 secs) (at 2012-09-24 03:06:40.465524) =========
hg clone http://hg.mozilla.org/build/tools scripts
 in dir /builds/slave/m-in-lnx64-spidermonkey-warnaserr/. (timeout 1200 secs)
 watching logfiles {}
 argv: ['hg', 'clone', 'http://hg.mozilla.org/build/tools', 'scripts']
 environment:
  CCACHE_COMPRESS=1
  CCACHE_DIR=/builds/ccache
  CCACHE_HASHDIR=
  CCACHE_UMASK=002
  CVS_RSH=ssh
  DISPLAY=:2
  G_BROKEN_FILENAMES=1
  HG_REPO=http://hg.mozilla.org/integration/mozilla-inbound
  HG_SHARE_BASE_DIR=/builds/hg-shared
  HISTSIZE=1000
  HOME=/home/cltbld
  HOSTNAME=bld-centos5-64-vmw-006.build.releng.scl3.mozilla.com
  INPUTRC=/etc/inputrc
  LANG=en_US.UTF-8
  LC_ALL=C
  LESSOPEN=|/usr/bin/lesspipe.sh %s
  LOGNAME=cltbld
  LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
  MAIL=/var/spool/mail/cltbld
  MOZ_CRASHREPORTER_NO_REPORT=1
  MOZ_OBJDIR=obj-firefox
  MOZ_SYMBOLS_EXTRA_BUILDID=linux64
  PATH=/tools/buildbot/bin:/usr/local/bin:/usr/lib64/ccache:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/tools/git/bin:/tools/python27/bin:/tools/python27-mercurial/bin:/home/cltbld/bin
  POST_SYMBOL_UPLOAD_CMD=/usr/local/bin/post-symbol-upload.py
  PWD=/builds/slave/m-in-lnx64-spidermonkey-warnaserr
  SHELL=/bin/bash
  SHLVL=1
  SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
  SYMBOL_SERVER_HOST=symbols1.dmz.phx1.mozilla.com
  SYMBOL_SERVER_PATH=/mnt/netapp/breakpad/symbols_ffx/
  SYMBOL_SERVER_SSH_KEY=/home/mock_mozilla/.ssh/ffxbld_dsa
  SYMBOL_SERVER_USER=ffxbld
  TERM=linux
  TINDERBOX_OUTPUT=1
  USER=cltbld
  _=/tools/python/bin/python
 using PTY: False
Upon execvpe hg ['hg', 'clone', 'http://hg.mozilla.org/build/tools', 'scripts'] in environment id 35657168
:Traceback (most recent call last):
  File "/tools/buildbot-0.8.4-pre-moz2/lib/python2.6/site-packages/twisted/internet/process.py", line 414, in _fork
    executable, args, environment)
  File "/tools/buildbot-0.8.4-pre-moz2/lib/python2.6/site-packages/twisted/internet/process.py", line 460, in _execChild
    os.execvpe(executable, args, environment)
  File "/tools/buildbot-0.8.4-pre-moz2/lib/python2.6/os.py", line 353, in execvpe
    _execvpe(file, args, env)
  File "/tools/buildbot-0.8.4-pre-moz2/lib/python2.6/os.py", line 380, in _execvpe
    func(fullname, *argrest)
OSError: [Errno 2] No such file or directory
}
Depends on: 793641
Depends on: 793646
No longer depends on: 686425
No longer depends on: 752113
Assignee: nobody → bmo
Depends on: 793678
Depends on: 793782
> "command timed out: 3600 seconds without output, attempting to kill"

Bug 793782

> remoteFailed: [Failure instance: Traceback (failure with no frames): <class 
> 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion.

Bug 793646
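
As an illustration of the pairing above, a lookup from known infra failure lines to the bug tracking each one could be sketched like this. It is Python for illustration only; the table structure and `tracking_bug` are hypothetical, the patterns are taken from the two lines quoted in this comment, and letting the timeout value vary is an assumption.

```python
import re

# (pattern, bug number) pairs taken from the lines quoted in this comment.
# The structure itself is hypothetical, not an existing TBPL feature.
KNOWN_INFRA_FAILURES = [
    (re.compile(r"command timed out: \d+ seconds without output"), 793782),
    (re.compile(r"twisted\.internet\.error\.ConnectionLost"), 793646),
]


def tracking_bug(line):
    """Return the bug number tracking this kind of failure line, or None."""
    for pattern, bug in KNOWN_INFRA_FAILURES:
        if pattern.search(line):
            return bug
    return None
```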
Depends on: 793800
Status: NEW → ASSIGNED
Depends on: 793855
Depends on: 794017
No longer depends on: 772531
Depends on: 794768
Depends on: 794895
Summary: [Meta] Make TBPL handle more types of failures (to reduce need for manual starring & the need to open full logs) → [Meta] Reduce the number of failures that need manual starring / require opening the full log
Depends on: 797324
Depends on: 783815
Note to self:

https://tbpl.mozilla.org/php/getParsedLog.php?id=15821727&tree=Thunderbird-Try
{
e:/builds/moz2_slave/tb-try-c-cen-w32/build/mailnews/base/src/nsMessenger.cpp(1726) : error C2660: 'nsITransfer::Init' : function does not take 8 arguments
}
https://tbpl.mozilla.org/php/getParsedLog.php?id=15819817&full=1&branch=mozilla-inbound#error0
{
[Child 2222] ###!!! ABORT: invalid segment: '!strncmp(header->mMagic, sMagic, sizeof(sMagic))', file ../../../ipc/glue/Shmem.cpp, line 303
}

Depends on: 799532
No longer depends on: 799532
Depends on: 799891
Depends on: 800288
Similar to comment 16:

https://tbpl.mozilla.org/php/getParsedLog.php?id=15930895&tree=Thunderbird-Trunk
{
e:/builds/moz2_slave/tb-c-cen-w32/build/objdir-tb/mail/app/module.rc(40) : error RC2135 : file not found: address-book.ico
}

(building a resource file on Windows).
Depends on: 802114
Depends on: 803466
https://tbpl.mozilla.org/php/getParsedLog.php?id=16488760&tree=Mozilla-Inbound

{
05:27:45     INFO -  ccache: FATAL: Could not create /builds/ccache/5/7/ad377879bb67eec02b2e0bcb3252eb-1796071.o.tmp.stdout.bld-linux64-ec2-056.build.aws-us-west-1.mozilla.com.1014 (permission denied?)
05:27:45     INFO -  In the directory  /builds/slave/b2g-m-in-panda-dep/build/objdir-gecko/js/src
05:27:45     INFO -  The following command failed to execute properly:
05:27:45     INFO -  /usr/bin/ccache /builds/slave/b2g-m-in-panda-dep/build/prebuilt/linux-x86/toolchain/arm-linux-androideabi-4.4.x/bin/arm-linux-androideabi-g++ -o jsinterp.o -c -fvisibility=hidden -DENABLE_YARR_JIT=1 -DNO_NSPR_10_SUPPORT -DIMPL_MFBT -DEXPORT_JS_API -DJS_HAS_CTYPES -DDLL_PREFIX="lib" -DDLL_SUFFIX=".so" -DUSE_ZLIB -Ictypes/libffi/include -I. -I/builds/slave/b2g-m-in-panda-dep/build/gecko/js/src/../../mfbt/double-conversion -I/builds/slave/b2g-m-in-panda-dep/build/gecko/js/src -I. -I./../../dist/include -I/builds/slave/b2g-m-in-panda-dep/build/objdir-gecko/dist/include/nspr -I/builds/slave/b2g-m-in-panda-dep/build/gecko/js/src -I/builds/slave/b2g-m-in-panda-dep/build/gecko/js/src/assembler -I/builds/slave/b2g-m-in-panda-dep/build/gecko/js/src/yarr -fPIC -DANDROID -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libc/arch-arm/include -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libc/include/ -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libc/kernel/common -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libc/kernel/arch-arm -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libm/include -I/builds/slave/b2g-m-in-panda-dep/build/frameworks/base/native/include -I/builds/slave/b2g-m-in-panda-dep/build/system/core/include -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic -pedantic -Wall -Wpointer-arith -Woverloaded-virtual -Werror=return-type -Wtype-limits -Wempty-body -Werror=conversion-null -Wno-ctor-dtor-privacy -Wno-overlength-strings -Wno-invalid-offsetof -Wno-variadic-macros -Wno-long-long -mandroid -fno-short-enums -fno-exceptions -Wno-psabi -DMOZ_ENABLE_JS_DUMP -include /builds/slave/b2g-m-in-panda-dep/build/gonk-misc/Unicode.h -I/builds/slave/b2g-m-in-panda-dep/build/external/stlport/stlport -march=armv7-a -mthumb -mfpu=vfp -mfloat-abi=softfp -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -pipe -DNDEBUG -DTRIMMED -g -O3 -freorder-blocks -fno-reorder-functions -fomit-frame-pointer 
-DUSE_SYSTEM_MALLOC=1 -DENABLE_ASSEMBLER=1 -DENABLE_JIT=1 -DANDROID -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libc/arch-arm/include -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libc/include/ -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libc/kernel/common -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libc/kernel/arch-arm -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic/libm/include -I/builds/slave/b2g-m-in-panda-dep/build/frameworks/base/native/include -I/builds/slave/b2g-m-in-panda-dep/build/system/core/include -isystem /builds/slave/b2g-m-in-panda-dep/build/bionic -DMOZILLA_CLIENT -include ./js-confdefs.h -MD -MF .deps/jsinterp.o.pp /builds/slave/b2g-m-in-panda-dep/build/gecko/js/src/jsinterp.cpp
05:27:45    ERROR -  make[6]: *** [jsinterp.o] Error 1
}
Depends on: 807707
Depends on: 808419
Depends on: 807094
Depends on: 808536
A lot of the work for this has now been done; main remaining candidates are:
* Outputting top frame of crashes in a starrable format (for shutdown crashes and also those that happen in any tests, so test filename not useful)
* Leaks
* Failures where the "let's search for the whole failure" approach doesn't work due to varying process IDs (eg bug 782633, bug 603147)
* Android test timeouts (suspect we should just make the harness timeout lower than the buildbot one, then we could print a more easily parsed failure line) and other nonsense. eg bug 722166, bug 663657, bug 724738.
* Handling the datetime and INFO/ERROR prefix added by mozharness, so it doesn't regress the work done so far.
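
For the last two kinds of item (varying process IDs and the mozharness prefix), one possible approach is to normalise each line before searching on it. The sketch below is Python for illustration only; `normalize` and both patterns are hypothetical. The prefix shape is assumed from lines such as "05:27:45     INFO -  ..." quoted in this bug, and the list of level names beyond INFO and ERROR is a guess.

```python
import re

# Assumed mozharness prefix: "HH:MM:SS", padding, a level name, " - ".
MOZHARNESS_PREFIX = re.compile(
    r"^\d{2}:\d{2}:\d{2}\s+(?:DEBUG|INFO|WARNING|ERROR|CRITICAL|FATAL) -\s+"
)
# Process IDs as in "[Child 2201] ###!!! ABORT: ...".
PROCESS_ID = re.compile(r"\[(Child|Parent) \d+\]")


def normalize(line):
    """Strip the mozharness prefix and mask process IDs in one log line."""
    line = MOZHARNESS_PREFIX.sub("", line)
    # Keep "Child"/"Parent" but replace the varying number with a fixed token.
    return PROCESS_ID.sub(r"[\1 PID]", line)
```

With this, two runs of the same abort that differ only in process ID produce the same search string, and a line logged through mozharness matches the patterns written for the unprefixed form.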

I'll still be working on this, but un-assigning since a meta bug.
Assignee: bmo → nobody
Status: ASSIGNED → NEW
Depends on: 808547
Depends on: 809436
Depends on: 809438
Depends on: 809442
Depends on: 809447
Depends on: 688338
Depends on: 809529
Depends on: 793642
Depends on: 811279
Depends on: 812103
I've just split out comment 16 & comment 18 into bug 812103.
https://tbpl.mozilla.org/php/getParsedLog.php?id=17067262&tree=Mozilla-Inbound
{
rm: cannot remove `build/firefox/firefox.exe.update_in_progress.lock': Permission denied

}

Various "rm: cannot .*" failures come up; we should loosen the existing check:
    || preg_match("/^rm: cannot lstat /", $line) // . . . . . . . . . . . . failures of type bug 692715
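
To show the effect of loosening that check, here is a small Python transliteration (not the PHP that TBPL actually runs). `NARROW` mirrors the existing pattern, while `LOOSENED` and the helper names are hypothetical.

```python
import re

# Mirrors the existing check, which only catches "rm: cannot lstat ...".
NARROW = re.compile(r"^rm: cannot lstat ")
# Hypothetical loosened form that catches any "rm: cannot ..." failure.
LOOSENED = re.compile(r"^rm: cannot ")


def matches_narrow(line):
    """True if the line is caught by the existing, narrow pattern."""
    return NARROW.search(line) is not None


def matches_loosened(line):
    """True if the line is caught by the loosened pattern."""
    return LOOSENED.search(line) is not None
```

The "cannot remove" line quoted above is missed by the narrow pattern and caught by the loosened one, while "cannot lstat" lines are still caught by both.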
https://tbpl.mozilla.org/php/getParsedLog.php?id=17070762&tree=Firefox
{
08:59:42 ERROR 503: Server Too Busy.
}
Depends on: 812205
Depends on: 812207
Depends on: 812214
Also maybe this (found on Try):

https://tbpl.mozilla.org/php/getParsedLog.php?id=17066622&tree=Try
{
No rule to make target 'nsIXFormsUtilityService.h' needed by ['<command-line>', 'nsIXFormsUtilityService.h']
}
https://tbpl.mozilla.org/php/getParsedLog.php?id=17080721&tree=Mozilla-Inbound

{
[Parent 2612] ###!!! ASSERTION: Profile change cancellation.: 'Error', file e:/builds/moz2_slave/m-in-w32-dbg/build/toolkit/xre/nsXREDirProvider.cpp, line 860
NOTE: child process received `Goodbye', closinxul!JSD_AttemptUCScriptInStackFrame+0x0000000000E1776F
xul!JSD_AttemptUCScriptInStackFrame+0x0000000001338D52
xul!JSD_AttemptUCScriptInStackFrame+0x0000000001339250
xul!JSD_AttemptUCScriptInStackFrame+0x0000000000024085
xul!JSD_AttemptUCScriptInStackFrame+0x00000000000186B0
xul!JSD_AttemptUCScriptInStackFrame+0x000000000001F277
xul!JSD_AttemptUCScriptInStackFrame+0x000000000001F46A
firefox!mozilla::detail::GuardObjectNotificationReceiver::GuardObjectNotificationReceiver+0x0000000000000A8E
firefox!mozilla::detail::GuardObjectNotificationReceiver::GuardObjectNotificationReceiver+0x0000000000000CEF
firefox!mozilla::detail::GuardObjectNotificationReceiver::GuardObjectNotificationReceiver+0x0000000000000E6E
firefox!mozilla::detail::GuardObjectNotificationReceiver::GuardObjectNotificationReceiver+0x00000000000025E0
firefox!mozilla::detail::GuardObjectNotificationReceiver::GuardObjectNotificationReceiver+0x0000000000002410
kernel32!BaseThreadInitThunk+0x0000000000000012
ntdll!RtlInitializeExceptionChain+0x0000000000000063
ntdll!RtlInitializeExceptionChain+0x0000000000000036
}
Depends on: 813022
Depends on: 813039
No longer depends on: 793782
https://tbpl.mozilla.org/php/getParsedLog.php?id=17172758&tree=Mozilla-Inbound

{
========= Started Install App on Device failed (results: 2, elapsed: 5 mins, 1 secs) (at 2012-11-19 08:44:45.141602) =========
python /builds/sut_tools/installApp.py 10.250.51.104 build/fennec-19.0a1.en-US.android-arm.apk org.mozilla.fennec
...
11/19/2012 08:44:45: INFO: copying build/fennec/application.ini to build/talos/remoteapp.ini
11/19/2012 08:44:45: DEBUG: calling [cp build/fennec/application.ini build/talos/remoteapp.ini]
11/19/2012 08:44:45: DEBUG: cp: build/talos/remoteapp.ini: No such file or directory
11/19/2012 08:44:45: INFO: connecting to: 10.250.51.104
reconnecting socket
Traceback (most recent call last):
  File "/builds/sut_tools/installApp.py", line 187, in <module>
    sys.exit(main(sys.argv))
  File "/builds/sut_tools/installApp.py", line 166, in main
    dm, devRoot = one_time_setup(ip_addr, path_to_main_apk)
  File "/builds/sut_tools/installApp.py", line 115, in one_time_setup
    dm = devicemanager.DeviceManagerSUT(ip_addr)
  File "/builds/tools/sut_tools/mozdevice/devicemanagerSUT.py", line 53, in __init__
    raise BaseException("Failed to connect to SUT Agent and retrieve the device root.")
BaseException: Failed to connect to SUT Agent and retrieve the device root.
program finished with exit code 1
elapsedTime=301.565341
========= Finished Install App on Device failed (results: 2, elapsed: 5 mins, 1 secs) (at 2012-11-19 08:49:46.728484) =========
}
Depends on: 813650
Depends on: 808410
https://tbpl.mozilla.org/php/getParsedLog.php?id=17316555&tree=Mozilla-Inbound

{


========= Started download failed (results: 2, elapsed: 2 mins, 51 secs) (at 2012-11-23 20:54:47.891650) =========
wget --progress=dot:mega -N http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-macosx64/1353730803/firefox-20.0a1.en-US.mac.tests.zip
 in dir /Users/cltbld/talos-slave/test/build (timeout 1200 secs)
 watching logfiles {}
 argv: ['wget', '--progress=dot:mega', '-N', u'http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-macosx64/1353730803/firefox-20.0a1.en-US.mac.tests.zip']
 environment:
  Apple_PubSub_Socket_Render=/tmp/launch-zSiIDy/Render
  CVS_RSH=ssh
  DISPLAY=/tmp/launch-DCmeIe/org.x:0
  HOME=/Users/cltbld
  LOGNAME=cltbld
  PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
  PWD=/Users/cltbld/talos-slave/test/build
  PYTHONPATH=/Library/Python/2.5/site-packages
  SHELL=/bin/bash
  SSH_AUTH_SOCK=/tmp/launch-XSN0BP/Listeners
  TMPDIR=/var/folders/Hs/HsDn6a9SG8idoIya6p9mtE+++TI/-Tmp-/
  USER=cltbld
  VERSIONER_PYTHON_PREFER_32_BIT=no
  VERSIONER_PYTHON_VERSION=2.6
  __CF_USER_TEXT_ENCODING=0x1F5:0:0
 using PTY: False
--20:54:47--  http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-macosx64/1353730803/firefox-20.0a1.en-US.mac.tests.zip
           => `firefox-20.0a1.en-US.mac.tests.zip'
Resolving ftp.mozilla.org... failed: nodename nor servname provided, or not known.
program finished with exit code 1
elapsedTime=30.015530
========= Finished download failed (results: 2, elapsed: 2 mins, 51 secs) (at 2012-11-23 20:57:39.506187) =========
}
Depends on: 816581
https://tbpl.mozilla.org/php/getParsedLog.php?id=17458196&tree=Mozilla-Inbound

(Bug 704368)

{
TEST-INFO | /home/cltbld/talos-slave/test/build/xpcshell/tests/xpcom/tests/unit/test_nsIProcess.js | running test ...
TEST-PASS | /home/cltbld/talos-slave/test/build/xpcshell/tests/xpcom/tests/unit/test_nsIProcess.js | test passed (time: 249.395ms)
TEST-INFO | /home/cltbld/talos-slave/test/build/xpcshell/tests/xpcom/tests/unit/test_nsIProcess_stress.js | running test ...
process killed by signal 9
program finished with exit code -1
elapsedTime=1888.987470
TinderboxPrint: xpcshell<br/><em class="testfail">T-FAIL</em>
Unknown Error: command finished with exit code: -1
========= Finished 'bash -c ...' warnings (results: 1, elapsed: 31 mins, 40 secs) (at 2012-11-29 09:49:33.404663) =========
}
Depends on: 816952
Depends on: 816971
Depends on: 817545
Depends on: 819038

Depends on: 570723
Keywords: sheriffing-P1
Whiteboard: [sheriff-want]
Depends on: 824063
Depends on: 826182
Depends on: 813132
Depends on: 827323
Depends on: 828239
Depends on: 828324
Depends on: 828946
Depends on: 829092
Depends on: 829367
https://tbpl.mozilla.org/php/getParsedLog.php?id=18785479&tree=Mozilla-Inbound

{
NOISE: Outputting datazilla results to https://datazilla.mozilla.org/talos
NOISE: datazilla: https//datazilla.mozilla.org/talos; oauth=True
Traceback (most recent call last):
  File "run_tests.py", line 308, in <module>
    main()
  File "run_tests.py", line 305, in main
    run_tests(parser)
  File "run_tests.py", line 281, in run_tests
    talos_results.output(results_urls, **results_options)
  File "/Users/cltbld/talos-slave/talos-data/talos/results.py", line 78, in output
    _output.output(results, url)
  File "/Users/cltbld/talos-slave/talos-data/talos/output.py", line 400, in output
    self.post(results, results_server, results_path, results_scheme)
  File "/Users/cltbld/talos-slave/talos-data/talos/output.py", line 480, in post
    responses = req.submit()
  File "/Users/cltbld/talos-slave/talos-data/talos/dzclient.py", line 196, in submit
    responses.append(self.send(dataset))
  File "/Users/cltbld/talos-slave/talos-data/talos/dzclient.py", line 254, in send
    return conn.getresponse()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1013, in getresponse
    response.begin()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 402, in begin
    version, status, reason = self._read_status()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 360, in _read_status
    line = self.fp.readline()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 430, in readline
    data = recv(1)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 219, in recv
    return self.read(buflen)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 138, in read
    return self._sslobj.read(len)
socket.error: [Errno 54] Connection reset by peer
}
https://tbpl.mozilla.org/php/getParsedLog.php?id=18800056&tree=Mozilla-Inbound
(with a lovely useless bit we do highlight)

reconnecting socket
Could not connect; sleeping for 5 seconds.
reconnecting socket
Could not connect; sleeping for 10 seconds.
reconnecting socket
Could not connect; sleeping for 15 seconds.
reconnecting socket
Could not connect; sleeping for 20 seconds.
reconnecting socket
Traceback (most recent call last):
  File "/builds/sut_tools/installApp.py", line 187, in <module>
    sys.exit(main(sys.argv))
  File "/builds/sut_tools/installApp.py", line 166, in main
    dm, devRoot = one_time_setup(ip_addr, path_to_main_apk)
  File "/builds/sut_tools/installApp.py", line 115, in one_time_setup
    dm = devicemanager.DeviceManagerSUT(ip_addr)
  File "/builds/tools/sut_tools/mozdevice/devicemanagerSUT.py", line 53, in __init__
    raise BaseException("Failed to connect to SUT Agent and retrieve the device root.")
BaseException: Failed to connect to SUT Agent and retrieve the device root.
program finished with exit code 1
Pymake:

https://tbpl.mozilla.org/php/getParsedLog.php?id=18810345&tree=Mozilla-Inbound

evaluation from e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:1584:4:4:0$ nsinstall nsinstall -t -m 644 "_xpidlgen/nsIDOMValidityState.h" "../../../dist/include"
evaluation from e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:1584:4:4:0$ nsinstall nsinstall -t -m 644 "_xpidlgen/nsIUndoManagerTransaction.h" "../../../dist/include"
evaluation from e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:1584:4:4:0$ nsinstall nsinstall -t -m 644 "_xpidlgen/nsIDOMMozBrowserFrame.h" "../../../dist/include"
<export>: Found error
<../../../dist/include/nsIDOMHTMLOutputElement.h>: Found error
evaluation from e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:1584:4:4:0$ nsinstall nsinstall -t -m 644 "_xpidlgen/nsIDOMNSEvent.h" "../../../dist/include"
evaluation from e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:1584:4:4:0$ nsinstall nsinstall -t -m 644 "_xpidlgen/nsIDOMDataContainerEvent.h" "../../../dist/include"
evaluation from e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:1584:4:4:0$ nsinstall nsinstall -t -m 644 "_xpidlgen/nsIDOMKeyEvent.h" "../../../dist/include"
evaluation from e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:1584:4:4:0$ nsinstall nsinstall -t -m 644 "_xpidlgen/nsIDOMMutationEvent.h" "../../../dist/include"

...

e:\builds\moz2_slave\m-in-w32\build\config\makefiles\target_export.mk:18:0: command 'C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-in-w32/build/build/pymake/pymake/../make.py -C dom export' failed, return code 2
e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:608:0: command 'C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-in-w32/build/build/pymake/pymake/../make.py export_tier_platform' failed, return code 2
e:\builds\moz2_slave\m-in-w32\build\config\rules.mk:574:0: command 'C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-in-w32/build/build/pymake/pymake/../make.py  tier_platform' failed, return code 2
e:\builds\moz2_slave\m-in-w32\build\client.mk:360:0: command 'C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-in-w32/build/build/pymake/pymake/../make.py -j4 -C obj-firefox' failed, return code 2
e:\builds\moz2_slave\m-in-w32\build\client.mk:160:0: command 'C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-in-w32/build/build/pymake/pymake/../make.py -f e:/builds/moz2_slave/m-in-w32/build/client.mk realbuild' failed, return code 2
program finished with exit code 2
https://tbpl.mozilla.org/php/getParsedLog.php?id=18955042&full=1&branch=mozilla-central

reconnecting socket
Could not connect; sleeping for 5 seconds.
reconnecting socket
Could not connect; sleeping for 10 seconds.
reconnecting socket
Could not connect; sleeping for 15 seconds.
reconnecting socket
Could not connect; sleeping for 20 seconds.
reconnecting socket
program finished with exit code 1
elapsedTime=862.382652
========= Finished Install App on Device failed (results: 2, elapsed: 14 mins, 22 secs) (at 2013-01-19 04:04:07.423363) =========
Depends on: 837085
Depends on: 837134
Depends on: 840425
Depends on: 840192
Depends on: 841786
Depends on: 835588
https://tbpl.mozilla.org/php/getParsedLog.php?id=20591054&tree=Firefox

22:24:12     INFO -  ###!!! ABORT: __delete__()d actor: file PPluginScriptableObject.cpp, line 28
22:24:12     INFO -  ###!!! ABORT: __delete__()d actor: file PPluginScriptableObject.cpp, line 28
Depends on: 850670
Depends on: 851842
Depends on: 852161
{
checking for mawk... (cached) gawk
	***
	*	The CLOBBER file has been updated, indicating that an incremental build
	*	since your last build will probably not work. A full build is required.
	*	The change that caused this is:
	*	Bug 844654 changed all the Makefiles.
	*	
	*	The easiest way to fix this is to manually delete your objdir:
	*	rm -rf /e/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox
	*	
	*	Or, if you know this clobber doesn't apply to you, it can be ignored with:
	*	cp '/e/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/CLOBBER' /e/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox
	***
*** Fix above errors and then restart with               "C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/build/pymake/pymake/../make.py -f client.mk build"
e:\builds\moz2_slave\m-cen-w32-pgo-0000000000000000\build\client.mk:320:0: command 'cd obj-firefox &&  MAKE="C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/build/pymake/pymake/../make.py"  e:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/configure  \
  || ( echo "*** Fix above errors and then restart with\
               \"C:/mozilla-build/python27/python.exe e:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/build/pymake/pymake/../make.py -f client.mk build\"" && exit 1 )' failed, return code 1
}
Depends on: 853994
Depends on: 854002
Depends on: 854407
Depends on: 859065
Depends on: 872116
No longer depends on: 835588
Depends on: 823452
Depends on: 882670
https://tbpl.mozilla.org/php/getParsedLog.php?id=24273074&tree=Mozilla-Inbound
{
Connecting to ftp.mozilla.org|63.245.215.56|:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
00:23:50 ERROR 500: Internal Server Error.

program finished with exit code 1
elapsedTime=73.704270
========= Finished download failed (results: 2, elapsed: 1 mins, 16 secs) (at 2013-06-18 00:23:53.626861) =========

}
Depends on: 884721
Depends on: 892958
Depends on: 894365
We could maybe add |MOZ_CRASH()| to the strings the log parser highlights as errors:

{
00:33:49     INFO -  out of memory: 0x000000000089E8EC bytes requested
00:33:49     INFO -  Hit MOZ_CRASH() at e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/memory/mozalloc/mozalloc_abort.cpp:30
00:33:51  WARNING -  TEST-UNEXPECTED-FAIL | /tests/dom/imptests/webapps/DOMCore/tests/approved/test_Range-set.html | Exited with code -2147483645 during test run
...
00:33:52  WARNING -  PROCESS-CRASH | /tests/dom/imptests/webapps/DOMCore/tests/approved/test_Range-set.html | application crashed [@ mozalloc_abort(char const * const)]
}
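
For illustration only, a rough Python sketch of what matching lines like the ones above could look like. The pattern list and the name summary_lines are made up for this example; the real TBPL filter is the PHP in GeneralErrorFilter.php, and nothing here reflects its actual regexes.

```python
import re

# Hypothetical patterns for illustration; NOT the real TBPL filter rules.
ERROR_PATTERNS = [
    re.compile(r"Hit MOZ_CRASH\("),
    re.compile(r"###!!! ABORT: "),
    re.compile(r"TEST-UNEXPECTED-FAIL \| "),
    re.compile(r"PROCESS-CRASH \| "),
]


def summary_lines(log_text):
    """Return the log lines that match any of the illustrative patterns."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in ERROR_PATTERNS)
    ]
```

With a list like this, the "Hit MOZ_CRASH() at ..." line would show up in the summary next to the PROCESS-CRASH line, while the plain "out of memory" INFO line would still be skipped unless it gets its own pattern.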
https://tbpl.mozilla.org/php/getParsedLog.php?id=26011736&tree=Mozilla-Inbound

find: Filesystem loop detected; `./dist/test-package-stage/mochitest/browser/browser/devtools/inspector' is part of the same filesystem loop as `./dist/test-package-stage/mochitest/browser/browser/devtools/inspector'.
e:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\testing\testsuite-targets.mk:421:0: command 'find ./dist/test-package-stage -name "*.pyc" -exec rm {} \;' failed, return code 1
program finished with exit code 2

Also from IRC:

05:02 <@ted> edmorley: it's a little tricky, this is a failure in package-tests
05:02 <@ted> apparently failing here: http://mxr.mozilla.org/mozilla-central/source/testing/testsuite-targets.mk#421
Depends on: 907925
Depends on: 910320
Depends on: 914092
Depends on: 917252
Depends on: 917817
No longer depends on: 840425
Depends on: 937684
Depends on: 942616
No longer depends on: 758282
I'd like to start reducing the number of failure modes we have where buildbot ends up killing the job - a la:
"command timed out: N seconds without output, attempting to kill".

At the very least, we should handle these with a specific (TBPL-parsable) failure message, which saves us from having to open the full log. Where possible we should also retry things like hg clones more than once before giving up and, in the case of android/b2g/... hangs, actually try to kill the process and get a stack.
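
As a rough sketch of the "retry, then fail with a specific message" idea (Python, since that is what the harness scripts use): run_with_retries and its parameters are hypothetical, and the final line format is an assumption modelled on the TEST-UNEXPECTED-FAIL lines quoted earlier in this bug, not a confirmed TBPL requirement.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, step_name, attempts=3, delay=5, timeout=1200):
    """Run cmd up to `attempts` times (attempts >= 1); return True on success.

    The script enforces its own timeout, so it can print one specific
    failure line itself instead of being killed from outside.
    """
    reason = "not run"
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout)
            if result.returncode == 0:
                return True
            reason = "exit code %d" % result.returncode
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the child before raising this.
            reason = "timed out after %d seconds" % timeout
        print("%s: attempt %d/%d failed (%s)"
              % (step_name, attempt, attempts, reason))
        if attempt < attempts:
            time.sleep(delay * attempt)  # wait a little longer each time
    # Assumed summary format; adjust to whatever the log parser matches.
    print("TEST-UNEXPECTED-FAIL | %s | giving up after %d attempts (%s)"
          % (step_name, attempts, reason))
    return False


if __name__ == "__main__":
    # Stand-in command; a real step would run e.g. the clone or download.
    run_with_retries([sys.executable, "-c", "print('ok')"], "demo-step", delay=0)
```

The timeout passed to the script would need to be shorter than buildbot's own "seconds without output" limit, otherwise buildbot still gets there first.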

Currently open bugs:

# Releng:

Bug 934938 - Intermittent ftp.m.o "ERROR 503: Server Too Busy" or "command timed out: 1200 seconds without output, attempting to kill" during download-and-extract step

Bug 873928 - Intermittent command timed out: 3600 seconds without output, attempting to kill fetching b2g bits from gitmo

Bug 920153 - Cloning of hg.mozilla.org/build/tools and hg.mozilla.org/integration/gaia-central often times out with "command timed out: 1200 seconds without output, attempting to kill"

Bug 934890 - Intermittent "command timed out: 14400 seconds elapsed, attempting to kill" during B2G ICS Emulator Opt Build (Mock)

Bug 820811 - Intermittent Android "command timed out: 1200 seconds without output, attempting to kill" during "Install App on Device" step

Bug 917558 - Intermittent androidx86 AND ONLY ANDROIDX86 NOT THE ARM FAILURE WHICH IS BUG 663657 command timed out: 2400 seconds without output, attempting to kill (Timeouts installing fennec)


# Talos:

Bug 849478 - Intermittent Android talos "command timed out: 3600 seconds without output, attempting to kill", "timed out after 3600 seconds of no output"

Bug 934310 - Intermittent talos timeout on startup, "command timed out: 3600 seconds without output, attempting to kill", "timed out after 3600 seconds of no output"


# Unit-test harness:

Bug 924972 - Intermittent B2G test_reftests_with_caret.html "command timed out: 1200 seconds without output, attempting to kill" or "application timed out after 330 seconds with no output"

Bug 663657 - Intermittent Android "command timed out: 2400 seconds without output, attempting to kill"

Bug 926264 - Intermittent Jetpack command timed out: 7200 seconds elapsed, attempting to kill starting up private-browsing-supported

Bug 942111 - Intermittent Jetpack command timed out: 7200 seconds elapsed, attempting to kill | The process cannot access the file because it is being used by another process

Bug 918754 - Intermittent OSX 10.6 command timed out: 7200 seconds elapsed, attempting to kill


# Build system:

Bug 916765 - Intermittent "command timed out: 600 seconds without output, attempting to kill" running expandlibs_exec.py in libgtest
(In reply to Ed Morley [:edmorley UTC+0] from comment #38)
> # Releng:
> 
> Bug 934938 - Intermittent ftp.m.o "ERROR 503: Server Too Busy" or "command
> timed out: 1200 seconds without output, attempting to kill" during
> download-and-extract step

Bug 961030 - Download tests zip build step should attempt wget more than once & output a TBPL compatible failure message

> Bug 873928 - Intermittent command timed out: 3600 seconds without output,
> attempting to kill fetching b2g bits from gitmo

Bug 961042 - b2g_build.py checkout_sources() should attempt |repo sync| more than once & output a TBPL compatible failure message

> Bug 920153 - Cloning of hg.mozilla.org/build/tools and
> hg.mozilla.org/integration/gaia-central often times out with "command timed
> out: 1200 seconds without output, attempting to kill"

Bug 961048 - Mozharness' vcs_checkout() should attempt repo cloning more than once & output a TBPL compatible failure message
Depends on: 961030, 961042, 961048
> Bug 820811 - Intermittent Android "command timed out: 1200 seconds without
> output, attempting to kill" during "Install App on Device" step

Bug 961058 - Install App on Device should timeout gracefully, outputting a TBPL compatible failure message
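
A minimal sketch of what "timeout gracefully" could mean for the reconnect loop seen in the Install App logs above. This is not the mozdevice/sut_tools code; connect_with_deadline, its parameters and the failure line format are all assumptions for illustration.

```python
import socket
import time


def connect_with_deadline(host, port, deadline=60.0, step=5.0, sleep=time.sleep):
    """Keep trying to connect, but give up once `deadline` seconds would pass.

    Returns a connected socket, or None after printing one failure line.
    """
    start = time.monotonic()
    wait = step
    while True:
        try:
            return socket.create_connection((host, port), timeout=step)
        except OSError as exc:
            last_error = exc
        # Stop if the next sleep would take us past the overall deadline.
        if time.monotonic() - start + wait > deadline:
            break
        print("Could not connect; sleeping for %d seconds." % wait)
        sleep(wait)
        wait += step  # 5, 10, 15, ... as in the logs above
    # Assumed summary format, so the cause is visible without the full log.
    print("TEST-UNEXPECTED-FAIL | Install App on Device | "
          "could not connect to %s:%d within %d seconds (%s)"
          % (host, port, deadline, last_error))
    return None
```

The caller can then exit non-zero straight away, rather than sitting silent until buildbot's "seconds without output" kill fires.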

> Bug 663657 - Intermittent Android "command timed out: 2400 seconds without
> output, attempting to kill"

Bug 960265 - Better logs for hung Android tests
Depends on: 961058, 960265
I've filed bug 961075 to see if we can make the generic buildbot failure strings include the build step name. That would at least make the edge cases (where it's not worth handling in the script itself) easier to tell apart when starring, and reduce the number of mis-stars :-)

> Bug 924972 - Intermittent B2G test_reftests_with_caret.html "command timed
> out: 1200 seconds without output, attempting to kill" or "application timed
> out after 330 seconds with no output"

This is handled by the harness properly now (can't remember which bug made the relevant change).
Depends on: 961075
Depends on: 973519
Depends on: 967647
No longer depends on: 937684
Depends on: 991020
Depends on: 991134
Depends on: 991173
Depends on: 991178
Depends on: 992220
Depends on: 994830
Depends on: 995118
Depends on: 995167
No longer depends on: 973519
Depends on: 995195
No longer depends on: 778690
Depends on: 996623
Depends on: 1000995
Depends on: 1008308
Depends on: 1009614
Depends on: 1014028
Depends on: 1017578
Depends on: 1018895
Depends on: 1018910
Depends on: 1020458
Depends on: 1023935
Depends on: 948145
Depends on: 1024416
Depends on: 1024573
Depends on: 1026987
Depends on: 1027597
Depends on: 1027607
Depends on: 1027668
Depends on: 1029204
Depends on: 1030062
Depends on: 1035773
Depends on: 1043428
Depends on: 1043433
Depends on: 1043420
Depends on: 1043485
Depends on: 1046830
Depends on: 1043742
Depends on: 1048080
Depends on: 1048179
Depends on: 1048288
Summary: [Meta] Reduce the number of failures that need manual starring / require opening the full log → [Meta] Tracker for failures where no error summary and/or bug suggestions are generated
Depends on: 1048559
Depends on: 1039633
Summary: [Meta] Tracker for failures where no error summary and/or bug suggestions are generated → [Meta] Tracker for failures where the error summary and/or bug suggestions are suboptimal
Depends on: 1048836
Depends on: 1048855
Depends on: 1049525
Depends on: 1050170
Depends on: 1050242
Depends on: 1051887
Depends on: 1047760
Depends on: 1052523
Depends on: 1055231
Depends on: 1055224
Depends on: 873204
No longer depends on: 1055231
Depends on: 1062427
Product: Webtools → Tree Management
Depends on: 867571
Severity: major → normal
Component: TBPL → General
Product: Tree Management → Testing
Depends on: 1191838
Depends on: 1300685
No longer depends on: 1300685