Closed Bug 1028035 Opened 10 years ago Closed 10 years ago

Intermittent Win PGO timeout running expandlibs_exec.py for xul.dll ("command timed out: 7200 seconds without output, attempting to kill")

Categories

(Firefox Build System :: General, defect)

x86
Windows XP
defect
Not set
major

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1047621

People

(Reporter: emorley, Assigned: ted)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

WINNT 5.2 mozilla-central pgo-build on 2014-06-19 16:30:16 PDT for push 79e69d064957
slave: b-2008-sm-0046
https://tbpl.mozilla.org/php/getParsedLog.php?id=42109680&tree=Mozilla-Central

{
mozmake.exe[5]: Nothing to be done for 'libs'.
mozmake.exe[5]: Leaving directory 'c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/other-licenses/snappy'
c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/mozmake.exe -C toolkit/library libs
mozmake.exe[5]: Entering directory 'c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/toolkit/library'
c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/mozmake.exe -C build libs
mozmake.exe[6]: Entering directory 'c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/toolkit/library/build'
xul.dll
c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/_virtualenv/Scripts/python.exe c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/config/expandlibs_exec.py --depend .deps/xul.dll.pp --target xul.dll --uselist -- c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/_virtualenv/Scripts/python.exe c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/build/link.py ../../../toolkit/library/linker-vsize link -NOLOGO -DLL -OUT:xul.dll -PDB:xul.pdb -SUBSYSTEM:WINDOWS -MACHINE:X86 ./module.res -LARGEADDRESSAWARE -NXCOMPAT -RELEASE -DYNAMICBASE -SAFESEH -DEBUG -DEBUGTYPE:CV -DEBUG -OPT:REF -LTCG:PGUPDATE -DELAYLOAD:comdlg32.dll -DELAYLOAD:dbghelp.dll -DELAYLOAD:psapi.dll -DELAYLOAD:rasapi32.dll -DELAYLOAD:rasdlg.dll -DELAYLOAD:secur32.dll -DELAYLOAD:wininet.dll -DELAYLOAD:winspool.drv -DELAYLOAD:oleacc.dll -DELAYLOAD:msdmo.dll ../../../toolkit/library/xul.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/mozjs.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/crmf.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/smime3.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/ssl3.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/nss3.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/nssutil3.lib ../../../dist/lib/mozsqlite3.lib ../../../intl/icu/target/lib/icuin.lib ../../../intl/icu/target/lib/icuuc.lib ../../../intl/icu/target/lib/icudt.lib ../../../dist/lib/gkmedias.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/nspr4.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/plc4.lib c:/builds/moz2_slave/m-cen-w32-pgo-0000000000000000/build/obj-firefox/dist/lib/plds4.lib ../../../dist/lib/mozalloc.lib -LIBPATH:../../../dist/lib -NODEFAULTLIB:msvcrt -NODEFAULTLIB:msvcprt -DEFAULTLIB:mozcrt kernel32.lib user32.lib gdi32.lib winmm.lib wsock32.lib advapi32.lib secur32.lib netapi32.lib secur32.lib crypt32.lib iphlpapi.lib strmiids.lib dmoguids.lib wmcodecdspuuid.lib amstrmid.lib msdmo.lib wininet.lib mfuuid.lib wmcodecdspuuid.lib strmiids.lib dmoguids.lib wmcodecdspuuid.lib strmiids.lib msdmo.lib shell32.lib ole32.lib version.lib winspool.lib comdlg32.lib imm32.lib msimg32.lib shlwapi.lib psapi.lib ws2_32.lib dbghelp.lib rasapi32.lib rasdlg.lib iphlpapi.lib uxtheme.lib setupapi.lib secur32.lib sensorsapi.lib portabledeviceguids.lib windowscodecs.lib wininet.lib wbemuuid.lib wintrust.lib wtsapi32.lib oleacc.lib usp10.lib oleaut32.lib delayimp.lib
command timed out: 7200 seconds without output, attempting to kill
program finished with exit code 1
elapsedTime=12643.725000
========= Finished compile failed (results: 2, elapsed: 3 hrs, 30 mins, 45 secs) (at 2014-06-19 20:18:30.398149) =========
}
Note that bug 1021145 has some earlier occurrences of this that were mis-starred there.
That's pgo linkage :(
We wrap the linker in link.py to measure the max vsize, and that attempts to print a message every 30 minutes: http://mxr.mozilla.org/mozilla-central/source/build/link.py#24 Is expandlibs_exec buffering the output and screwing that up? Maybe we can pass -u to Python or set PYTHONUNBUFFERED to hack around this?
Oh, that won't work, since expandlibs_exec uses Popen.communicate to read the process output in one fell swoop: http://hg.mozilla.org/mozilla-central/annotate/bdac18bd6c74/config/expandlibs_exec.py#l357
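To illustrate the point about Popen.communicate, here is a minimal sketch (not the actual expandlibs_exec.py code; CHILD, run_buffered and run_streaming are hypothetical names). The child flushes a status line early, but communicate() only hands the output back after the process exits, so nothing reaches the build log in the meantime. Reading the pipe line by line sees the status line as soon as it is written.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for the wrapped linker: prints a status line,
# flushes it, then keeps "working" for a second before finishing.
CHILD = ("import sys, time; print('still linking'); sys.stdout.flush(); "
         "time.sleep(1); print('done')")


def run_buffered(args):
    """Collect all output with communicate(); nothing is available until exit."""
    start = time.time()
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return proc.returncode, out.decode(), time.time() - start


def run_streaming(args):
    """Read the pipe line by line, recording when each line arrived."""
    start = time.time()
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    seen = []
    for raw in iter(proc.stdout.readline, b""):
        seen.append((time.time() - start, raw.decode().rstrip()))
    proc.stdout.close()
    return proc.wait(), seen
```

With run_buffered, the caller gets both lines at once after roughly a second. With run_streaming, the first line arrives well before the second, which is what a "no output for N seconds" watchdog needs.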
Regression from bug 837665. I guess it hasn't been a problem for a while since we made PGO linking faster with unified builds etc. glandium: do you think it'd be confusing to just let the exec'ed program's output go to stdout as normal, and then print the commandline afterwards if it fails?
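The proposal above could look roughly like the following sketch. This is an assumption about the shape of the change, not the attached patch; run_passthrough and the message wording are hypothetical. The child inherits stdout and stderr, so its output appears as it is produced, and the command line is only echoed afterwards if the command fails.

```python
import subprocess
import sys


def run_passthrough(args, log=sys.stderr):
    """Run args with inherited stdout/stderr so output is never held back.

    On a non-zero exit code, print the command line after the fact so the
    failing invocation can still be identified in the log.
    """
    rc = subprocess.call(args)
    if rc != 0:
        log.write("Executing: " + " ".join(args) + "\n")
        log.write("Command failed with exit code %d\n" % rc)
    return rc
```

The trade-off is that any error text from the child now precedes the echoed command line, rather than following it.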
Blocks: 837665
This builds locally on Linux, I'll push it for a Windows build on Try.
Attachment #8443413 - Flags: review?(mh+mozilla)
Assignee: nobody → ted
Status: NEW → ASSIGNED
Comment on attachment 8443413
Don't buffer command output in expandlibs_exec

Review of attachment 8443413:
-----------------------------------------------------------------

The problem with doing that is that it's going to hide actual errors behind 2000 lines of output :(
Attachment #8443413 - Flags: review?(mh+mozilla) → review-
glandium: what do you mean? They'll be left out of the TBPL summary view? We could easily print some extra error line that gets picked up by the log summarizer if that would help.
This has broken 3/4 attempts to build 30.0b3 so far. Attempts 5 and 6 are underway, but I'm not too hopeful at this point. Can we please fix this on beta ASAP?
Severity: normal → major
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #10)
> glandium: what do you mean? They'll be left out of the TBPL summary view? We
> could easily print some extra error line that gets picked up by the log
> summarizer if that would help.

TBPL is not the problem. Local builds are. Developer has linkage error, doesn't see it because it's 2000 lines higher.
I'd say you should copy the "write something every 30 minutes" logic to expandlibs_exec.py.
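One way to read that suggestion, sketched under assumptions (run_with_heartbeat, the message text and the threading approach are hypothetical; this is not the link.py logic and not necessarily what the eventual fix did): keep buffering the child's output so errors still print together at the end, but emit a short keep-alive line at a fixed interval so the "seconds without output" watchdog does not fire.

```python
import subprocess
import sys
import threading


def run_with_heartbeat(args, interval=30 * 60, out=sys.stdout):
    """Buffer the child's output, printing a keep-alive every `interval` seconds."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    finished = threading.Event()

    def heartbeat():
        # Event.wait() returns False on timeout and True once the event is set,
        # so this loop prints once per interval until the child is done.
        while not finished.wait(interval):
            out.write("still running...\n")
            out.flush()

    thread = threading.Thread(target=heartbeat)
    thread.daemon = True
    thread.start()
    try:
        output, _ = proc.communicate()
    finally:
        finished.set()
        thread.join()
    return proc.returncode, output.decode()
```

The caller can then print the buffered output after the command line, as before, which keeps a local linkage error next to the failure instead of thousands of lines above it.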
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
Product: Core → Firefox Build System