Closed Bug 539334 Opened 15 years ago Closed 12 years ago

Debug builds can race bloat/malloc/sdleak log transfers

Categories

(Release Engineering :: General, defect, P3)

x86
All
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: cjones, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [automation] [simple])

I don't know what this might imply.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263326727.1263336880.30504.gz&fulltext=1
WINNT 5.2 mozilla-central leak test build on 2010/01/12 12:05:27
s: win32-slave15
======== BuildStep started ========
compare previous leak logs failed
=== Output ===
obj-firefox\dist\bin\leakstats.exe ../malloc.log.old
in dir e:\builds\moz2_slave\mozilla-central-win32-debug\build (timeout 1200 secs)
watching logfiles {}
argv: ['obj-firefox\\dist\\bin\\leakstats.exe', '../malloc.log.old']
environment:
  !::=::\
  !EXITCODE=00000001
  ALLUSERSPROFILE=C:\Documents and Settings\All Users
  APPDATA=C:\Documents and Settings\cltbld\Application Data
  APR_ICONV_PATH=d:/mozilla-build/svn-win32-1.6.3/iconv
  BOOTMODE=BKSTD
  BUILDSLAVE_PASSWORD="secret"
  BUILDSLAVE_TACFILE="e:\builds\moz2_slave\buildbot.tac"
  CLUSTERLOG=C:\WINDOWS\Cluster\cluster.log
  COLORFGBG=0;default;15
  COLORTERM=rxvt-xpm
  COMMONPROGRAMFILES=C:\Program Files\Common Files
  COMPUTERNAME=WIN32-SLAVE15
  COMSPEC=C:\WINDOWS\system32\cmd.exe
  CONTROLFILE="e:\buildbot-tac.control"
  CVS_RSH=ssh
  DEVENVDIR=d:\msvs8\Common7\IDE
  DISPLAY=:0
  EDITOR=emacs.exe --no-window-system
  FP_NO_HOST_CHECK=NO
  FRAMEWORKDIR=C:\WINDOWS\Microsoft.NET\Framework
  FRAMEWORKSDKDIR=d:\msvs8\SDK\v2.0
  FRAMEWORKVERSION=v2.0.50727
  HOME=c:/Documents and Settings/cltbld
  HOMEDRIVE=C:
  HOMEPATH=\
  HOSTNAME=win32-slave15
  HOSTTYPE=i686
  INCLUDE=d:\sdks\v7.0\\include;d:\sdks\v7.0\\include\atl;d:\msvs8\VC\ATLMFC\INCLUDE;d:\msvs8\VC\INCLUDE;d:\msvs8\VC\PlatformSDK\include;
  INPUTRC=D:/mozilla-build/msys/etc/inputrc
  LIB=d:\sdks\v7.0\\lib;d:\msvs8\VC\ATLMFC\LIB;d:\msvs8\VC\LIB;d:\msvs8\VC\PlatformSDK\lib;d:\msvs8\SDK\v2.0\lib;;D:\mozilla-build\\atlthunk_compat
  LIBPATH=C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727;d:\msvs8\VC\ATLMFC\LIB
  LOGNAME=cltbld
  LOGONSERVER=\\WIN32-SLAVE15
  MACHTYPE=i686-pc-msys
  MAKE_MODE=unix
  MOZILLABUILD=D:\mozilla-build\
  MOZILLABUILDDRIVE=D:
  MOZILLABUILDPATH=\mozilla-build\
  MOZ_CRASHREPORTER_NO_REPORT=1
  MOZ_MSVCVERSION=8
  MOZ_NO_RESET_PATH=1
  MOZ_OBJDIR=obj-firefox
  MOZ_TOOLS=D:\mozilla-build\\moztools
  MSVC6KEY=HKLM\SOFTWARE\Microsoft\VisualStudio\6.0\Setup\Microsoft Visual C++
  MSVC71KEY=HKLM\SOFTWARE\Microsoft\VisualStudio\7.1\Setup\VC
  MSVC8EXPRESSKEY=HKLM\SOFTWARE\Microsoft\VCExpress\8.0\Setup\VC
  MSVC8KEY=HKLM\SOFTWARE\Microsoft\VisualStudio\8.0\Setup\VC
  MSVC9EXPRESSKEY=HKLM\SOFTWARE\Microsoft\VCExpress\9.0\Setup\VC
  MSVC9KEY=HKLM\SOFTWARE\Microsoft\VisualStudio\9.0\Setup\VC
  MSVCEXPROOTKEY=HKLM\SOFTWARE\Microsoft\VCExpress
  MSVCROOTKEY=HKLM\SOFTWARE\Microsoft\VisualStudio
  MSYSTEM=MINGW32
  NUMBER_OF_PROCESSORS=1
  NVAPSDK=D:\sdks\tegra042\
  OLDPWD=d:/mozilla-build
  OS=Windows_NT
  OSTYPE=msys
  PATH=D:\mozilla-build\msys\local\bin;d:\mozilla-build\wget;d:\mozilla-build\7zip;d:\mozilla-build\blat261\full;d:\mozilla-build\python25;d:\mozilla-build\svn-win32-1.6.3\bin;d:\mozilla-build\upx203w;d:\mozilla-build\emacs-22.3\bin;d:\mozilla-build\info-zip;d:\mozilla-build\nsis-2.22;d:\mozilla-build\nsis-2.33u;d:\mozilla-build\hg;d:\mozilla-build\python25\Scripts;d:\mozilla-build\kdiff3;d:\mozilla-build\vim\vim72;.;D:\mozilla-build\msys\local\bin;D:\mozilla-build\msys\mingw\bin;D:\mozilla-build\msys\bin;d:\sdks\v7.0\bin;d:\msvs8\Common7\IDE;d:\msvs8\VC\BIN;d:\msvs8\Common7\Tools;d:\msvs8\Common7\Tools\bin;d:\msvs8\VC\PlatformSDK\bin;d:\msvs8\SDK\v2.0\bin;c:\WINDOWS\Microsoft.NET\Framework\v2.0.50727;d:\msvs8\VC\VCPackages;c:\WINDOWS\system32;c:\WINDOWS;c:\WINDOWS\System32\Wbem;d:\mozilla-build\python25;d:\mercurial;c:\Program Files\Microsoft SQL Server\90\Tools\binn\;d:\sdks\tegra042\tools;d:\sdks\tegra042\platformlibs\bin\winxp\x86\release;d:\sdks\tegra042\3rdparty\bin\winxp\x86\release;d:\mozilla-build\moztools\bin
  PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH
  PROCESSOR_ARCHITECTURE=x86
  PROCESSOR_IDENTIFIER=x86 Family 6 Model 23 Stepping 8, GenuineIntel
  PROCESSOR_LEVEL=6
  PROCESSOR_REVISION=1708
  PROGRAMFILES=C:\Program Files
  PROMPT=$P$G
  PS1=\[\033]0;$MSYSTEM:\w\007 \033[32m\]\u@\h \[\033[33m\w\033[0m\] $
  PWD=c:/Documents and Settings/cltbld
  SDK2003SP1KEY=HKLM\SOFTWARE\Microsoft\MicrosoftSDK\InstalledSDKs\8F9E5EF3-A9A5-491B-A889-C58EFFECE8B3
  SDK2003SP2KEY=HKLM\SOFTWARE\Microsoft\MicrosoftSDK\InstalledSDKs\D2FF9F89-8AA2-4373-8A31-C838BF4DBBE1
  SDK61KEY=HKLM\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v6.1
  SDK6AKEY=HKLM\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v6.0A
  SDK6KEY=HKLM\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v6.0
  SDK7KEY=HKLM\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0
  SDKDIR=d:\sdks\v7.0\
  SDKMINORVER=0
  SDKROOTKEY=HKLM\SOFTWARE\Microsoft\MicrosoftSDK\InstalledSDKs
  SDKVER=7
  SESSIONNAME=Console
  SHELL=D:/mozilla-build/msys/bin/sh
  SHLVL=1
  SSH_AGENT_PID=2600
  SSH_AUTH_SOCK=C:/DOCUME~1/cltbld/LOCALS~1/Temp/ssh-dWTcZC2548/agent.2548
  SYSTEMDRIVE=C:
  SYSTEMROOT=C:\WINDOWS
  TACSCRIPT="d:\tools\buildbot-helpers\buildbot-tac.py"
  TEMP=C:/DOCUME~1/cltbld/LOCALS~1/Temp
  TEMPVC9DIR=d:\msvs9\VC\
  TERM=msys
  TMP=C:/DOCUME~1/cltbld/LOCALS~1/Temp
  TRY_PASSWORD="secret"
  USERDOMAIN=WIN32-SLAVE15
  USERNAME=cltbld
  USERPROFILE=C:\Documents and Settings\cltbld
  USESDK=1
  VC8DIR=d:\msvs8\VC\
  VC9DIR=d:\msvs9\VC\
  VCINSTALLDIR=d:\msvs8\VC
  VS80COMNTOOLS=d:\msvs8\Common7\Tools\
  VS90COMNTOOLS=d:\msvs9\Common7\Tools\
  VSINSTALLDIR=d:\msvs8
  WIN64=0
  WINCURVERKEY=HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion
  WINDIR=C:\WINDOWS
  WINDOWID=167839040
  XPCOM_DEBUG_BREAK=stack-and-abort
  _=d:/mozilla-build/python25/scripts/buildbot
closing stdin
using PTY: False
obj-firefox\dist\bin\leakstats.exe starting at Tue Jan 12 14:46:18 2010
Unknown event type ;
obj-firefox\dist\bin\leakstats.exe: log file incomplete
program finished with exit code 1
elapsedTime=2.922000
Unable to parse leakstats output
I'll take a look on win32-slave15. This may be related to the same crasher that's causing bug 539295. (feel free to cut the environment stuff next time)
It looks like the build for b20eadfc68e0 (which started at 12:01) was uploading its malloc.log at 14:46. At the same time, the build for f4e56d114546 (started 12:05, the one in comment #0) was downloading that file as malloc.log.old. The downloaded copy is 200KB smaller than the malloc.log generated from the f4e56d114546 build, so malloc.log.old is almost certainly truncated, which caused leakstats to barf. I have a copy of the file if anyone wants to verify that.
This is a very rare event which we don't have capacity to fix at the moment (--> Future). We'd need a way to make the upload/download operations more atomic.
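For illustration only, here is a minimal sketch of what "more atomic" could look like on the publishing side: write the new log to a temporary file in the same directory, then rename it over the final name, so a concurrent reader sees either the old complete file or the new complete file. It assumes the logs live on a filesystem where rename is atomic; the transfers in this bug went over the network between slaves and a server, so this shows the pattern rather than a drop-in fix. The function name `publish_atomically` and the `.upload-` prefix are hypothetical.

```python
import os
import tempfile


def publish_atomically(data: bytes, dest_path: str) -> None:
    """Write data to a temp file beside dest_path, then rename it into place.

    Hypothetical helper: readers of dest_path never observe a partial write,
    because the final name only ever points at a fully written file.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".upload-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_path, dest_path)  # swap the complete file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Usage would be something like `publish_atomically(log_bytes, "malloc.log")` in place of writing to `malloc.log` directly. One caveat: on Windows the rename can fail if another process has the destination open, so a real fix would likely need retries or versioned file names.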
Component: Release Engineering → Release Engineering: Future
OS: Linux → All
Summary: Red build with "leakstats.exe: log file incomplete" "Unable to parse leakstats output" → Debug builds can race bloat/malloc/sdleak log transfers
Whiteboard: [automation]
Blocks: 438871
Whiteboard: [automation] → [automation] [orange]
Well ... technically random red.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264796582.1264808082.18435.gz WINNT 5.2 mozilla-central leak test build on 2010/01/29 12:23:02 s: win32-slave08
Mass move of bugs from Release Engineering:Future -> Release Engineering. See http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Priority: -- → P3
Rail, think you could take this one on?
Whiteboard: [automation] [orange] → [automation] [orange] [simple]
I have run into what looks like this: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1269885683.1269895353.28958.gz WINNT 5.2 mozilla-central leak test build on 2010/03/29 11:01:23 s: mw32-ix-slave10
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1282937262.1282940334.27107.gz Linux mozilla-central leak test build on 2010/08/27 12:27:42 s: mv-moz2-linux-ix-slave05 obj-firefox/dist/bin/leakstats starting at Fri Aug 27 13:08:59 2010 Unknown event type 0xffffffec obj-firefox/dist/bin/leakstats: log file incomplete program finished with exit code 1
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296233427.1296235624.27819.gz using PTY: True obj-firefox/dist/bin/leakstats starting at Fri Jan 28 09:11:38 2011 obj-firefox/dist/bin/leakstats: log file incomplete program finished with exit code 1 elapsedTime=0.714702 Unable to parse leakstats output
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1309803454.1309806275.30435.gz obj-firefox/dist/bin/leakstats starting at Mon Jul 4 04:48:40 2011 Unknown event type 0x7b obj-firefox/dist/bin/leakstats: log file incomplete program finished with exit code 1 elapsedTime=0.055877 Unable to parse leakstats output
assigned during triage; if you think this is better assigned to someone else, please swap with them for another old bug.
Assignee: nobody → nrthomas
Very much tied to bug 551954. In my eyes, the correct fix is to:
a) somehow keep track of where the last successful leak log is for a branch/platform/build (db?)
b) pull, or pass in, the location of the latest successful leak log when needed
c) since that latest successful leak log will not be a mid-upload log, and will return null for the first build on a branch, we shouldn't run into this problem.
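A rough sketch of steps a) to c), assuming a small SQLite table is an acceptable stand-in for the "db?" mentioned above. The table name, function names, and log locations are all hypothetical; nothing here reflects how buildbot or the leak test steps were actually wired up.

```python
import sqlite3
from typing import Optional


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a registry of last-known-good leak logs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS leak_logs ("
        " branch TEXT, platform TEXT, build_type TEXT, location TEXT,"
        " PRIMARY KEY (branch, platform, build_type))"
    )
    return conn


def record_success(conn: sqlite3.Connection, branch: str, platform: str,
                   build_type: str, location: str) -> None:
    """Step a): call only after a log has been fully uploaded and parsed."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO leak_logs VALUES (?, ?, ?, ?)",
            (branch, platform, build_type, location),
        )


def latest_log(conn: sqlite3.Connection, branch: str, platform: str,
               build_type: str) -> Optional[str]:
    """Step b): return the last good log location, or None on a first build."""
    row = conn.execute(
        "SELECT location FROM leak_logs"
        " WHERE branch = ? AND platform = ? AND build_type = ?",
        (branch, platform, build_type),
    ).fetchone()
    return row[0] if row else None
```

Because a location is only recorded after the upload has finished, a build that asks for the previous log can never be handed one that is still mid-upload (step c), and a `None` result tells it to skip the comparison on the first build of a branch.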
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Inbound/1312389962.1312394489.12750.gz obj-firefox/dist/bin/leakstats starting at Wed Aug 3 10:43:34 2011 Unknown event type 0xffffffc0 obj-firefox/dist/bin/leakstats: log file incomplete program finished with exit code 1
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Inbound/1312396820.1312399512.1408.gz Linux mozilla-inbound leak test build obj-firefox/dist/bin/leakstats starting at Wed Aug 3 11:46:47 2011 Unknown event type 0x1 obj-firefox/dist/bin/leakstats: log file incomplete program finished with exit code 1
Assignee: nrthomas → edransch
Blocks: 774844
Whiteboard: [automation] [orange] [simple] → [automation] [simple]
No longer blocks: debug_builds
Assignee: edransch.contact → nobody
Product: mozilla.org → Release Engineering
We're not running these any more.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX