Closed Bug 539334 Opened 15 years ago Closed 11 years ago

Debug builds can race bloat/malloc/sdleak log transfers

Categories

(Release Engineering :: General, defect, P3)

x86
All
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: cjones, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [automation] [simple])

I don't know what this might imply.

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263326727.1263336880.30504.gz&fulltext=1
WINNT 5.2 mozilla-central leak test build on 2010/01/12 12:05:27  
s: win32-slave15

======== BuildStep started ========
compare previous leak logs failed
=== Output ===
obj-firefox\dist\bin\leakstats.exe ../malloc.log.old
 in dir e:\builds\moz2_slave\mozilla-central-win32-debug\build (timeout 1200 secs)
 watching logfiles {}
 argv: ['obj-firefox\\dist\\bin\\leakstats.exe', '../malloc.log.old']
 environment:
  !::=::\
  !EXITCODE=00000001
  ALLUSERSPROFILE=C:\Documents and Settings\All Users
  APPDATA=C:\Documents and Settings\cltbld\Application Data
  APR_ICONV_PATH=d:/mozilla-build/svn-win32-1.6.3/iconv
  BOOTMODE=BKSTD
  BUILDSLAVE_PASSWORD="secret"
  BUILDSLAVE_TACFILE="e:\builds\moz2_slave\buildbot.tac"
  CLUSTERLOG=C:\WINDOWS\Cluster\cluster.log
  COLORFGBG=0;default;15
  COLORTERM=rxvt-xpm
  COMMONPROGRAMFILES=C:\Program Files\Common Files
  COMPUTERNAME=WIN32-SLAVE15
  COMSPEC=C:\WINDOWS\system32\cmd.exe
  CONTROLFILE="e:\buildbot-tac.control"
  CVS_RSH=ssh
  DEVENVDIR=d:\msvs8\Common7\IDE
  DISPLAY=:0
  EDITOR=emacs.exe --no-window-system
  FP_NO_HOST_CHECK=NO
  FRAMEWORKDIR=C:\WINDOWS\Microsoft.NET\Framework
  FRAMEWORKSDKDIR=d:\msvs8\SDK\v2.0
  FRAMEWORKVERSION=v2.0.50727
  HOME=c:/Documents and Settings/cltbld
  HOMEDRIVE=C:
  HOMEPATH=\
  HOSTNAME=win32-slave15
  HOSTTYPE=i686
  INCLUDE=d:\sdks\v7.0\\include;d:\sdks\v7.0\\include\atl;d:\msvs8\VC\ATLMFC\INCLUDE;d:\msvs8\VC\INCLUDE;d:\msvs8\VC\PlatformSDK\include;
  INPUTRC=D:/mozilla-build/msys/etc/inputrc
  LIB=d:\sdks\v7.0\\lib;d:\msvs8\VC\ATLMFC\LIB;d:\msvs8\VC\LIB;d:\msvs8\VC\PlatformSDK\lib;d:\msvs8\SDK\v2.0\lib;;D:\mozilla-build\\atlthunk_compat
  LIBPATH=C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727;d:\msvs8\VC\ATLMFC\LIB
  LOGNAME=cltbld
  LOGONSERVER=\\WIN32-SLAVE15
  MACHTYPE=i686-pc-msys
  MAKE_MODE=unix
  MOZILLABUILD=D:\mozilla-build\
  MOZILLABUILDDRIVE=D:
  MOZILLABUILDPATH=\mozilla-build\
  MOZ_CRASHREPORTER_NO_REPORT=1
  MOZ_MSVCVERSION=8
  MOZ_NO_RESET_PATH=1
  MOZ_OBJDIR=obj-firefox
  MOZ_TOOLS=D:\mozilla-build\\moztools
  MSVC6KEY=HKLM\SOFTWARE\Microsoft\VisualStudio\6.0\Setup\Microsoft Visual C++
  MSVC71KEY=HKLM\SOFTWARE\Microsoft\VisualStudio\7.1\Setup\VC
  MSVC8EXPRESSKEY=HKLM\SOFTWARE\Microsoft\VCExpress\8.0\Setup\VC
  MSVC8KEY=HKLM\SOFTWARE\Microsoft\VisualStudio\8.0\Setup\VC
  MSVC9EXPRESSKEY=HKLM\SOFTWARE\Microsoft\VCExpress\9.0\Setup\VC
  MSVC9KEY=HKLM\SOFTWARE\Microsoft\VisualStudio\9.0\Setup\VC
  MSVCEXPROOTKEY=HKLM\SOFTWARE\Microsoft\VCExpress
  MSVCROOTKEY=HKLM\SOFTWARE\Microsoft\VisualStudio
  MSYSTEM=MINGW32
  NUMBER_OF_PROCESSORS=1
  NVAPSDK=D:\sdks\tegra042\
  OLDPWD=d:/mozilla-build
  OS=Windows_NT
  OSTYPE=msys
  PATH=D:\mozilla-build\msys\local\bin;d:\mozilla-build\wget;d:\mozilla-build\7zip;d:\mozilla-build\blat261\full;d:\mozilla-build\python25;d:\mozilla-build\svn-win32-1.6.3\bin;d:\mozilla-build\upx203w;d:\mozilla-build\emacs-22.3\bin;d:\mozilla-build\info-zip;d:\mozilla-build\nsis-2.22;d:\mozilla-build\nsis-2.33u;d:\mozilla-build\hg;d:\mozilla-build\python25\Scripts;d:\mozilla-build\kdiff3;d:\mozilla-build\vim\vim72;.;D:\mozilla-build\msys\local\bin;D:\mozilla-build\msys\mingw\bin;D:\mozilla-build\msys\bin;d:\sdks\v7.0\bin;d:\msvs8\Common7\IDE;d:\msvs8\VC\BIN;d:\msvs8\Common7\Tools;d:\msvs8\Common7\Tools\bin;d:\msvs8\VC\PlatformSDK\bin;d:\msvs8\SDK\v2.0\bin;c:\WINDOWS\Microsoft.NET\Framework\v2.0.50727;d:\msvs8\VC\VCPackages;c:\WINDOWS\system32;c:\WINDOWS;c:\WINDOWS\System32\Wbem;d:\mozilla-build\python25;d:\mercurial;c:\Program Files\Microsoft SQL Server\90\Tools\binn\;d:\sdks\tegra042\tools;d:\sdks\tegra042\platformlibs\bin\winxp\x86\release;d:\sdks\tegra042\3rdparty\bin\winxp\x86\release;d:\mozilla-build\moztools\bin
  PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH
  PROCESSOR_ARCHITECTURE=x86
  PROCESSOR_IDENTIFIER=x86 Family 6 Model 23 Stepping 8, GenuineIntel
  PROCESSOR_LEVEL=6
  PROCESSOR_REVISION=1708
  PROGRAMFILES=C:\Program Files
  PROMPT=$P$G
  PS1=\[\033]0;$MSYSTEM:\w\007
\033[32m\]\u@\h \[\033[33m\w\033[0m\]
$ 
  PWD=c:/Documents and Settings/cltbld
  SDK2003SP1KEY=HKLM\SOFTWARE\Microsoft\MicrosoftSDK\InstalledSDKs\8F9E5EF3-A9A5-491B-A889-C58EFFECE8B3
  SDK2003SP2KEY=HKLM\SOFTWARE\Microsoft\MicrosoftSDK\InstalledSDKs\D2FF9F89-8AA2-4373-8A31-C838BF4DBBE1
  SDK61KEY=HKLM\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v6.1
  SDK6AKEY=HKLM\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v6.0A
  SDK6KEY=HKLM\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v6.0
  SDK7KEY=HKLM\SOFTWARE\Microsoft\Microsoft SDKs\Windows\v7.0
  SDKDIR=d:\sdks\v7.0\
  SDKMINORVER=0
  SDKROOTKEY=HKLM\SOFTWARE\Microsoft\MicrosoftSDK\InstalledSDKs
  SDKVER=7
  SESSIONNAME=Console
  SHELL=D:/mozilla-build/msys/bin/sh
  SHLVL=1
  SSH_AGENT_PID=2600
  SSH_AUTH_SOCK=C:/DOCUME~1/cltbld/LOCALS~1/Temp/ssh-dWTcZC2548/agent.2548
  SYSTEMDRIVE=C:
  SYSTEMROOT=C:\WINDOWS
  TACSCRIPT="d:\tools\buildbot-helpers\buildbot-tac.py"
  TEMP=C:/DOCUME~1/cltbld/LOCALS~1/Temp
  TEMPVC9DIR=d:\msvs9\VC\
  TERM=msys
  TMP=C:/DOCUME~1/cltbld/LOCALS~1/Temp
  TRY_PASSWORD="secret"
  USERDOMAIN=WIN32-SLAVE15
  USERNAME=cltbld
  USERPROFILE=C:\Documents and Settings\cltbld
  USESDK=1
  VC8DIR=d:\msvs8\VC\
  VC9DIR=d:\msvs9\VC\
  VCINSTALLDIR=d:\msvs8\VC
  VS80COMNTOOLS=d:\msvs8\Common7\Tools\
  VS90COMNTOOLS=d:\msvs9\Common7\Tools\
  VSINSTALLDIR=d:\msvs8
  WIN64=0
  WINCURVERKEY=HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion
  WINDIR=C:\WINDOWS
  WINDOWID=167839040
  XPCOM_DEBUG_BREAK=stack-and-abort
  _=d:/mozilla-build/python25/scripts/buildbot
 closing stdin
 using PTY: False
obj-firefox\dist\bin\leakstats.exe starting at Tue Jan 12 14:46:18 2010
Unknown event type ;
obj-firefox\dist\bin\leakstats.exe: log file incomplete
program finished with exit code 1
elapsedTime=2.922000
Unable to parse leakstats output
I'll take a look on win32-slave15. This may be related to the same crasher that's causing bug 539295.

(feel free to cut the environment stuff next time)
It looks like the build for b20eadfc68e0 (which started at 12:01) was uploading its malloc.log at 14:46. At the same time, the build for f4e56d114546 (started 12:05, the one in comment #0) was downloading that file as malloc.log.old. The downloaded copy is 200KB smaller than the malloc.log generated from the f4e56d114546 build, so malloc.log.old is almost certainly truncated, which caused leakstats to barf. I have a copy of the file if anyone wants to verify that.
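To make the suspected race concrete, here is a minimal local simulation, not the real transfer path. It assumes the upload is a plain streaming write to a shared file and the download is a plain copy; the function name and the byte counts are made up for illustration.

```python
import os
import shutil
import tempfile
from typing import Tuple


def simulate_racy_transfer(payload: bytes, cut: int) -> Tuple[int, int]:
    """Copy a file while its writer has only written `cut` bytes.

    Returns (size of the fetched copy, size of the finished original).
    """
    with tempfile.TemporaryDirectory() as tmp:
        shared = os.path.join(tmp, "malloc.log")       # file being uploaded
        fetched = os.path.join(tmp, "malloc.log.old")  # what the other build gets

        with open(shared, "wb") as writer:
            writer.write(payload[:cut])
            writer.flush()                    # partial upload is now visible
            shutil.copyfile(shared, fetched)  # second build downloads mid-upload
            writer.write(payload[cut:])       # first build finishes uploading

        return os.path.getsize(fetched), os.path.getsize(shared)
```

A copy taken mid-upload comes out shorter than the finished file, which matches the size mismatch described above.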
This is a very rare event which we don't have capacity to fix at the moment (--> Future). We'd need a way to make the upload/download operations more atomic.
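One common way to get that atomicity, sketched here under assumptions: the log store is a filesystem where a rename within one directory is atomic, and the uploader can write a temporary file next to the destination. `publish_atomically` is a hypothetical helper, not something in the existing automation.

```python
import os
import tempfile


def publish_atomically(data: bytes, dest_path: str) -> None:
    """Write `data` to a temp file, then rename it over `dest_path`.

    Readers see either the old complete file or the new complete file,
    never a partially written one.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # The temp file must live in the same directory (same filesystem)
    # so that the final rename is a single atomic step.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".upload-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_path, dest_path)
    finally:
        # Only true if something failed before the rename happened.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
```

With this shape, a download that overlaps an upload would fetch the previous complete malloc.log rather than a truncated one. Whether the actual upload mechanism allows a rename step on the server side is not something this bug establishes.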
Component: Release Engineering → Release Engineering: Future
OS: Linux → All
Summary: Red build with "leakstats.exe: log file incomplete" "Unable to parse leakstats output" → Debug builds can race bloat/malloc/sdleak log transfers
Whiteboard: [automation]
Blocks: 438871
Whiteboard: [automation] → [automation] [orange]
Well ... technically random red.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264796582.1264808082.18435.gz
WINNT 5.2 mozilla-central leak test build on 2010/01/29 12:23:02
s: win32-slave08
Mass move of bugs from Release Engineering:Future -> Release Engineering. See
http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Priority: -- → P3
Rail, think you could take this one on?
Whiteboard: [automation] [orange] → [automation] [orange] [simple]
I have run into what looks like this:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1269885683.1269895353.28958.gz
WINNT 5.2 mozilla-central leak test build on 2010/03/29 11:01:23
s: mw32-ix-slave10
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1282937262.1282940334.27107.gz
Linux mozilla-central leak test build on 2010/08/27 12:27:42
s: mv-moz2-linux-ix-slave05
obj-firefox/dist/bin/leakstats starting at Fri Aug 27 13:08:59 2010
Unknown event type 0xffffffec
obj-firefox/dist/bin/leakstats: log file incomplete
program finished with exit code 1
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296233427.1296235624.27819.gz
 using PTY: True
obj-firefox/dist/bin/leakstats starting at Fri Jan 28 09:11:38 2011
obj-firefox/dist/bin/leakstats: log file incomplete
program finished with exit code 1
elapsedTime=0.714702
Unable to parse leakstats output
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1309803454.1309806275.30435.gz
obj-firefox/dist/bin/leakstats starting at Mon Jul  4 04:48:40 2011
Unknown event type 0x7b
obj-firefox/dist/bin/leakstats: log file incomplete
program finished with exit code 1
elapsedTime=0.055877
Unable to parse leakstats output
Assigned during triage; if you think this is better assigned to someone else, please swap with them for another old bug.
Assignee: nobody → nrthomas
Very much tied to bug 551954.

In my eyes, the correct fix is to:

a) somehow keep track of where the last successful leak log is for a branch/platform/build (db?)
b) pull, or pass in, the location of the latest successful leak log when needed
c) since the latest successful leak log is never a mid-upload log, and the lookup returns null for the first build on a branch, we shouldn't run into this problem.
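Steps a) to c) could be sketched roughly as below. This is only an illustration: the `LeakLogIndex` class, the table layout, and the use of SQLite are assumptions (the comment above only says "db?"), and the location is assumed to be recorded only after an upload has fully completed.

```python
import sqlite3
from typing import Optional


class LeakLogIndex:
    """Tracks the last successfully uploaded leak log per branch/platform/build."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS last_good ("
            " branch TEXT, platform TEXT, build_type TEXT, location TEXT,"
            " PRIMARY KEY (branch, platform, build_type))"
        )

    def record_success(self, branch: str, platform: str,
                       build_type: str, location: str) -> None:
        # Call this only after the upload has finished, so the index
        # never points at a log that is still being transferred.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO last_good VALUES (?, ?, ?, ?)",
                (branch, platform, build_type, location),
            )

    def latest(self, branch: str, platform: str,
               build_type: str) -> Optional[str]:
        # Returns None for the first build on a branch (nothing recorded yet).
        row = self._db.execute(
            "SELECT location FROM last_good"
            " WHERE branch = ? AND platform = ? AND build_type = ?",
            (branch, platform, build_type),
        ).fetchone()
        return row[0] if row else None
```

A build would call `latest(...)` before the compare step and skip the comparison when it gets `None`, then call `record_success(...)` once its own log is fully uploaded.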
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Inbound/1312389962.1312394489.12750.gz

obj-firefox/dist/bin/leakstats starting at Wed Aug  3 10:43:34 2011
Unknown event type 0xffffffc0
obj-firefox/dist/bin/leakstats: log file incomplete
program finished with exit code 1
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Inbound/1312396820.1312399512.1408.gz
Linux mozilla-inbound leak test build

obj-firefox/dist/bin/leakstats starting at Wed Aug  3 11:46:47 2011
Unknown event type 0x1
obj-firefox/dist/bin/leakstats: log file incomplete
program finished with exit code 1
Assignee: nrthomas → edransch
Blocks: 774844
Whiteboard: [automation] [orange] [simple] → [automation] [simple]
No longer blocks: debug_builds
Assignee: edransch.contact → nobody
Product: mozilla.org → Release Engineering
We're not running these any more.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX