Closed Bug 1027983 Opened 10 years ago Closed 10 years ago

bm-2008-sm-00xx fail to link xul.dll when build is PGO enabled

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86
Windows Server 2003
task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: nthomas, Unassigned)

Attachments

(1 file)

We hit this in the win32 build for 31.0b1 build1:

xul.dll
c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/_virtualenv/Scripts/python.exe c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/config/expandlibs_exec.py --depend .deps/xul.dll.pp --target xul.dll --uselist -- c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/_virtualenv/Scripts/python.exe c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/build/link.py ../../../toolkit/library/linker-vsize link -NOLOGO -DLL -OUT:xul.dll -PDB:xul.pdb -SUBSYSTEM:WINDOWS -MACHINE:X86 ./module.res -LARGEADDRESSAWARE -NXCOMPAT -RELEASE -DYNAMICBASE -SAFESEH -DEBUG -DEBUGTYPE:CV -DEBUG -OPT:REF -LTCG:PGUPDATE -DELAYLOAD:comdlg32.dll -DELAYLOAD:dbghelp.dll -DELAYLOAD:psapi.dll -DELAYLOAD:rasapi32.dll -DELAYLOAD:rasdlg.dll -DELAYLOAD:secur32.dll -DELAYLOAD:wininet.dll -DELAYLOAD:winspool.drv -DELAYLOAD:oleacc.dll -DELAYLOAD:msdmo.dll ../../../toolkit/library/xul.lib c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/mozjs.lib c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/crmf.lib c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/smime3.lib c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/ssl3.lib c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/nss3.lib c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/nssutil3.lib ../../../dist/lib/mozsqlite3.lib ../../../intl/icu/target/lib/icuin.lib ../../../intl/icu/target/lib/icuuc.lib ../../../intl/icu/target/lib/icudt.lib ../../../dist/lib/gkmedias.lib -LIBPATH:'C:/Tools/sdks/dx10/lib/x86' c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/nspr4.lib c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/plc4.lib c:/builds/moz2_slave/rel-m-beta-w32_bld-00000000000/build/obj-firefox/dist/lib/plds4.lib ../../../dist/lib/mozalloc.lib -LIBPATH:../../../dist/lib -NODEFAULTLIB:msvcrt -NODEFAULTLIB:msvcrtd -NODEFAULTLIB:msvcprt -NODEFAULTLIB:msvcprtd -DEFAULTLIB:mozcrt kernel32.lib user32.lib gdi32.lib winmm.lib wsock32.lib advapi32.lib secur32.lib netapi32.lib secur32.lib crypt32.lib iphlpapi.lib strmiids.lib dmoguids.lib wmcodecdspuuid.lib amstrmid.lib msdmo.lib wininet.lib mfuuid.lib wmcodecdspuuid.lib strmiids.lib dmoguids.lib wmcodecdspuuid.lib strmiids.lib msdmo.lib shell32.lib ole32.lib version.lib winspool.lib comdlg32.lib imm32.lib msimg32.lib shlwapi.lib psapi.lib ws2_32.lib dbghelp.lib rasapi32.lib rasdlg.lib iphlpapi.lib uxtheme.lib setupapi.lib secur32.lib sensorsapi.lib portabledeviceguids.lib windowscodecs.lib wininet.lib wbemuuid.lib wintrust.lib oleacc.lib 'C:/Tools/sdks/dx10/Lib/x86'/dxguid.lib 'C:/Tools/sdks/dx10/Lib/x86'/dinput8.lib usp10.lib oleaut32.lib delayimp.lib
command timed out: 7200 seconds without output, attempting to kill
program finished with exit code 1

Full log: http://ftp.mozilla.org/pub/mozilla.org/firefox/candidates/31.0b3-candidates/build1/logs/release-mozilla-beta-win32_build-bm82-build1-build11.txt.gz
mysql> select builders.name, slaves.name, builds.starttime, builds.result from builds join builders on builds.builder_id=builders.id join slaves on builds.slave_id=slaves.id where slaves.name like 'b-2008-sm-%' and builders.name like '%pgo%';
+---------------------------+----------------+---------------------+--------+
| name                      | name           | starttime           | result |
+---------------------------+----------------+---------------------+--------+
| b2g-inbound-win32-pgo     | b-2008-sm-0001 | 2014-05-07 06:30:16 |      0 |
| fx-team-win32-pgo         | b-2008-sm-0036 | 2014-06-17 21:30:17 |      2 |
| mozilla-central-win32-pgo | b-2008-sm-0038 | 2014-06-18 17:30:11 |      2 |
| b2g-inbound-win32-pgo     | b-2008-sm-0039 | 2014-06-18 03:30:06 |      2 |
| fx-team-win32-pgo         | b-2008-sm-0046 | 2014-06-19 15:30:12 |      2 |
| mozilla-inbound-win32-pgo | b-2008-sm-0050 | 2014-06-17 17:30:22 |      2 |
| mozilla-inbound-win32-pgo | b-2008-sm-0050 | 2014-06-16 23:30:16 |      2 |
| fx-team-win32-pgo         | b-2008-sm-0052 | 2014-06-18 18:30:12 |      2 |
| b2g-inbound-win32-pgo     | b-2008-sm-0053 | 2014-06-16 22:26:17 |      2 |
| b2g-inbound-win32-pgo     | b-2008-sm-0053 | 2014-06-16 18:30:17 |      2 |
| mozilla-inbound-win32-pgo | b-2008-sm-0056 | 2014-06-17 23:39:29 |      2 |
| fx-team-win32-pgo         | b-2008-sm-0059 | 2014-06-16 21:30:17 |      2 |
| mozilla-central-win32-pgo | b-2008-sm-0059 | 2014-06-18 20:30:06 |      2 |
| mozilla-central-win32-pgo | b-2008-sm-0060 | 2014-06-17 20:30:37 |      2 |
| fx-team-win32-pgo         | b-2008-sm-0061 | 2014-06-17 12:30:19 |      2 |
+---------------------------+----------------+---------------------+--------+

And spot checking a few of those confirms they are also 2 hour timeouts, which suggests the seamicro hardware is slower than existing hardware at linking.
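To make the failure rate in that query output easier to see, here is a minimal Python sketch that tallies the rows by result code. The (slave, result) pairs are copied from the table; the `RESULT_NAMES` mapping assumes the usual buildbot convention that 0 means success and 2 means failure, and the `tally` helper is a hypothetical name used only for illustration.

```python
from collections import Counter

# (slave, result) pairs copied from the query output in this comment.
ROWS = [
    ("b-2008-sm-0001", 0), ("b-2008-sm-0036", 2), ("b-2008-sm-0038", 2),
    ("b-2008-sm-0039", 2), ("b-2008-sm-0046", 2), ("b-2008-sm-0050", 2),
    ("b-2008-sm-0050", 2), ("b-2008-sm-0052", 2), ("b-2008-sm-0053", 2),
    ("b-2008-sm-0053", 2), ("b-2008-sm-0056", 2), ("b-2008-sm-0059", 2),
    ("b-2008-sm-0059", 2), ("b-2008-sm-0060", 2), ("b-2008-sm-0061", 2),
]

# Assumption: buildbot result codes, 0 = success, 2 = failure.
RESULT_NAMES = {0: "success", 2: "failure"}


def tally(rows):
    """Count builds per result name; unknown codes are grouped as 'other'."""
    return Counter(RESULT_NAMES.get(result, "other") for _, result in rows)


if __name__ == "__main__":
    for name, count in sorted(tally(ROWS).items()):
        print(f"{name}: {count}")
```

The tally only summarises the result column; whether each failure was a link timeout still has to be confirmed from the individual logs, as the spot check in the comment did.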
b-2008-ix-0003 (a non-seamicro machine) timed out linking xul.dll too. The job was re-triggered and buildbot ran it on the same machine (building now). I have disabled b-2008-sm-0003 in slavealloc for a while, in case we need to re-run this job.
Attached patch desperate.patch
For the sake of completeness, these are the timeout settings we used to re-trigger the new job (r=aki):
timeout: 3h (was 2h)  # time without output before killing
maxTime: 5.5h (was 4.5h)  # total runtime before killing
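For comparison with the values that show up in the logs (7200 seconds in the original failure, 19800 seconds in the later ones), here is a minimal Python sketch that converts the old and new settings to seconds. It assumes the settings are ultimately expressed in seconds; the dictionary names and the `hours_to_seconds` helper are hypothetical and not taken from the patch.

```python
def hours_to_seconds(hours):
    """Convert a duration in hours to whole seconds."""
    return int(hours * 3600)


# Old and new values as stated in this comment (hypothetical variable names).
OLD_SETTINGS = {"timeout": hours_to_seconds(2), "maxTime": hours_to_seconds(4.5)}
NEW_SETTINGS = {"timeout": hours_to_seconds(3), "maxTime": hours_to_seconds(5.5)}

if __name__ == "__main__":
    for key in ("timeout", "maxTime"):
        print(f"{key}: {OLD_SETTINGS[key]}s -> {NEW_SETTINGS[key]}s")
```

Under that assumption, the 7200 second timeout in the original log corresponds to the old 2h idle timeout, and the 19800 second timeouts reported later correspond to the new 5.5h maxTime.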
Attachment #8443714 - Flags: review+
Attachment #8443714 - Flags: checked-in+
re-enabled b-2008-ix-0003 in slavealloc
There have been a couple of failures since comment #3.

b-2008-sm-0064 - https://tbpl.mozilla.org/php/getParsedLog.php?id=42232438&tree=Mozilla-Inbound
19800 second timeout, after the second link of xul.dll

b-2008-sm-0046 - https://tbpl.mozilla.org/php/getParsedLog.php?id=42317175&tree=Mozilla-Inbound
19800 second timeout, during the second link of xul.dll. bug 1028035 will help with this one.

There have also been 11 successful builds over several hosts.
Depends on: 1047621
No longer blocks: 1002634
(In reply to Nick Thomas [:nthomas] from comment #1)
> And spot checking a few of those confirms they are also 2 hour timeouts,
> which suggests the seamicro hardware is slower than existing hardware at
> linking.

MSVC2010's PGO link phase is single-threaded, so if the seamicro cores are slower than the cores on our other machines that would make sense.
I'm wontfixing this on the theory that it is/was specific to the SeaMicros, and :arr tells me we're wontfixing SeaMicro bugs for now.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard