Closed Bug 1842032 Opened 1 year ago Closed 1 year ago

Intermittent gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:526: xul.dll] Killed

Categories

(Firefox Build System :: General, defect, P5)

Tracking

(firefox-esr102 unaffected, firefox-esr115 unaffected, firefox115 unaffected, firefox116 unaffected, firefox117 wontfix, firefox119 fixed, firefox120 fixed, firefox121 fixed)

RESOLVED FIXED

People

(Reporter: intermittent-bug-filer, Assigned: jcristau)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Filed by: chorotan [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=421685828&repo=try
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/JGOA6KLFTdKPmJQAB0WZyA/runs/0/artifacts/public/logs/live_backing.log


[task 2023-07-06T11:28:44.083Z] 11:28:44     INFO -  TEST-PASS | autowinchecksec.py | ../../../dist/bin/jsapi-tests.exe succeeded
[task 2023-07-06T11:28:44.084Z] 11:28:44     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/js/src/jsapi-tests'
[task 2023-07-06T12:10:41.140Z] 12:10:41     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/toolkit/library/build'
[task 2023-07-06T12:10:41.142Z] 12:10:41     INFO -  /builds/worker/fetches/clang/bin/lld-link -NOLOGO -DLL -OUT:xul.dll -PDB:xul.pdb -SUBSYSTEM:WINDOWS,6.01 -MACHINE:X64 @/builds/worker/workspace/obj-build/toolkit/library/build/xul_dll.list xul.dll.res -mllvm:-mcpu=x86-64 -mllvm:-import-instr-limit=10 -mllvm:-import-hot-multiplier=30 -LARGEADDRESSAWARE -DEBUG -PDBALTPATH:%_PDB% -OPT:REF,ICF -guard:cf,nolongjmp -time -NATVIS:/builds/worker/checkouts/gecko/toolkit/library/gecko.natvis -DELAYLOAD:avrt.dll -DELAYLOAD:comdlg32.dll -DELAYLOAD:credui.dll -DELAYLOAD:d3d11.dll -DELAYLOAD:D3DCompiler_47.dll -DELAYLOAD:dhcpcsvc.dll -DELAYLOAD:dnsapi.dll -DELAYLOAD:dwmapi.dll -DELAYLOAD:dxgi.dll -DELAYLOAD:gdi32.dll -DELAYLOAD:hid.dll -DELAYLOAD:imm32.dll -DELAYLOAD:iphlpapi.dll -DELAYLOAD:msi.dll -DELAYLOAD:msimg32.dll -DELAYLOAD:netapi32.dll -DELAYLOAD:ole32.dll -DELAYLOAD:oleaut32.dll -DELAYLOAD:secur32.dll -DELAYLOAD:setupapi.dll -DELAYLOAD:shell32.dll -DELAYLOAD:shlwapi.dll -DELAYLOAD:urlmon.dll -DELAYLOAD:user32.dll -DELAYLOAD:userenv.dll -DELAYLOAD:usp10.dll -DELAYLOAD:uxtheme.dll -DELAYLOAD:wininet.dll -DELAYLOAD:winmm.dll -DELAYLOAD:winspool.drv -DELAYLOAD:wtsapi32.dll -DELAYLOAD:oleacc.dll -DELAYLOAD:msdmo.dll -DELAYLOAD:api-ms-win-core-winrt-l1-1-0.dll -DELAYLOAD:api-ms-win-core-winrt-string-l1-1-0.dll  -ORDER:@/builds/worker/checkouts/gecko/build/win64/orderfile.txt ../../../js/src/build/js_static.lib /builds/worker/workspace/obj-build/x86_64-pc-windows-msvc/release/gkrust.lib ../../../security/nss3.lib ../../../config/external/lgpllibs/lgpllibs.lib ../../../mozglue/build/mozglue.lib -DEF:xul.dll.def  shell32.lib dbghelp.lib wscapi.lib hid.lib ktmw32.lib rpcrt4.lib urlmon.lib avrt.lib ksuser.lib usp10.lib ole32.lib msimg32.lib winmm.lib crypt32.lib iphlpapi.lib secur32.lib oleaut32.lib strmiids.lib user32.lib d3d11.lib dxgi.lib shcore.lib ntdll.lib credui.lib taskschd.lib msi.lib bcrypt.lib propsys.lib advapi32.lib dmoguids.lib wmcodecdspuuid.lib amstrmid.lib 
msdmo.lib wininet.lib mfuuid.lib gdi32.lib version.lib winspool.lib userenv.lib uuid.lib comdlg32.lib imm32.lib netapi32.lib shlwapi.lib ws2_32.lib dnsapi.lib dwmapi.lib uxtheme.lib setupapi.lib sensorsapi.lib portabledeviceguids.lib wintrust.lib wtsapi32.lib locationapi.lib sapi.lib dxguid.lib dhcpcsvc.lib d3dcompiler.lib runtimeobject.lib oleacc.lib delayimp.lib
[task 2023-07-06T12:10:41.142Z] 12:10:41     INFO -  gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:526: xul.dll] Killed
[task 2023-07-06T12:10:41.142Z] 12:10:41     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/toolkit/library/build'
[task 2023-07-06T12:10:41.142Z] 12:10:41     INFO -  gmake[4]: Target 'target' not remade because of errors.
[task 2023-07-06T12:10:41.142Z] 12:10:41     INFO -  gmake[4]: Target 'target' not remade because of errors.
[task 2023-07-06T12:10:41.142Z] 12:10:41    ERROR -  gmake[3]: *** [/builds/worker/checkouts/gecko/config/recurse.mk:72: toolkit/library/build/target] Error 2
[task 2023-07-06T12:14:43.748Z] 12:14:43     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -  /builds/worker/fetches/clang/bin/lld-link -NOLOGO -DLL -OUT:xul.dll -PDB:xul.pdb -SUBSYSTEM:WINDOWS,6.01 -MACHINE:X64 @/builds/worker/workspace/obj-build/toolkit/library/gtest/xul_dll.list xul.dll.res -mllvm:-mcpu=x86-64 -mllvm:-import-instr-limit=10 -mllvm:-import-hot-multiplier=30 -LARGEADDRESSAWARE -DEBUG -PDBALTPATH:%_PDB% -OPT:REF,ICF -guard:cf,nolongjmp -time -NATVIS:/builds/worker/checkouts/gecko/toolkit/library/gecko.natvis -DELAYLOAD:avrt.dll -DELAYLOAD:comdlg32.dll -DELAYLOAD:credui.dll -DELAYLOAD:d3d11.dll -DELAYLOAD:D3DCompiler_47.dll -DELAYLOAD:dhcpcsvc.dll -DELAYLOAD:dnsapi.dll -DELAYLOAD:dwmapi.dll -DELAYLOAD:dxgi.dll -DELAYLOAD:gdi32.dll -DELAYLOAD:hid.dll -DELAYLOAD:imm32.dll -DELAYLOAD:iphlpapi.dll -DELAYLOAD:msi.dll -DELAYLOAD:msimg32.dll -DELAYLOAD:netapi32.dll -DELAYLOAD:ole32.dll -DELAYLOAD:oleaut32.dll -DELAYLOAD:secur32.dll -DELAYLOAD:setupapi.dll -DELAYLOAD:shell32.dll -DELAYLOAD:shlwapi.dll -DELAYLOAD:urlmon.dll -DELAYLOAD:user32.dll -DELAYLOAD:userenv.dll -DELAYLOAD:usp10.dll -DELAYLOAD:uxtheme.dll -DELAYLOAD:wininet.dll -DELAYLOAD:winmm.dll -DELAYLOAD:winspool.drv -DELAYLOAD:wtsapi32.dll -DELAYLOAD:oleacc.dll -DELAYLOAD:msdmo.dll -DELAYLOAD:api-ms-win-core-winrt-l1-1-0.dll -DELAYLOAD:api-ms-win-core-winrt-string-l1-1-0.dll  ../../../js/src/build/js_static.lib /builds/worker/workspace/obj-build/x86_64-pc-windows-msvc/release/gkrust_gtest.lib ../../../security/nss3.lib ../../../config/external/lgpllibs/lgpllibs.lib ../../../mozglue/build/mozglue.lib   avrt.lib ksuser.lib ole32.lib shell32.lib dbghelp.lib mpr.lib advapi32.lib bcrypt.lib crypt32.lib kernel32.lib rpcrt4.lib wscapi.lib hid.lib ktmw32.lib urlmon.lib usp10.lib msimg32.lib winmm.lib iphlpapi.lib secur32.lib oleaut32.lib strmiids.lib user32.lib d3d11.lib dxgi.lib shcore.lib ntdll.lib credui.lib taskschd.lib msi.lib propsys.lib dmoguids.lib wmcodecdspuuid.lib amstrmid.lib msdmo.lib wininet.lib mfuuid.lib gdi32.lib version.lib 
winspool.lib userenv.lib uuid.lib comdlg32.lib imm32.lib netapi32.lib shlwapi.lib ws2_32.lib dnsapi.lib dwmapi.lib uxtheme.lib setupapi.lib sensorsapi.lib portabledeviceguids.lib wintrust.lib wtsapi32.lib locationapi.lib sapi.lib dxguid.lib dhcpcsvc.lib d3dcompiler.lib runtimeobject.lib oleacc.lib delayimp.lib
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    Input File Reading:            6727 ms (  0.2%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    LTO:                        3091437 ms ( 99.2%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    GC:                             562 ms (  0.0%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    ICF:                           1653 ms (  0.1%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    Code Layout:                   2107 ms (  0.1%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    Commit Output File:               4 ms (  0.0%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    PDB Emission (Cumulative):    12099 ms (  0.4%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -      Add Objects:                 8567 ms (  0.3%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -        Global Type Hashing:       2241 ms (  0.1%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -        GHash Type Merging:        3034 ms (  0.1%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -        Symbol Merging:            3205 ms (  0.1%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -      Publics Stream Layout:         42 ms (  0.0%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -      TPI Stream Layout:             25 ms (  0.0%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -      Commit to Disk:              3285 ms (  0.1%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -  --------------------------------------------------
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -  Total Linking Time:           3115028 ms (100.0%)
[task 2023-07-06T12:14:43.751Z] 12:14:43     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:14:44.033Z] 12:14:44     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:14:44.033Z] 12:14:44     INFO -  /builds/worker/.mozbuild/srcdirs/gecko-8a5b87fe5d69/_virtualenvs/build/bin/python -m mozbuild.action.check_binary xul.dll
[task 2023-07-06T12:14:44.033Z] 12:14:44     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:14:44.034Z] 12:14:44     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:14:44.034Z] 12:14:44     INFO -  chmod +x xul.dll
[task 2023-07-06T12:14:44.035Z] 12:14:44     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:14:44.036Z] 12:14:44     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:14:44.037Z] 12:14:44     INFO -  ../../../config/nsinstall -R -m 644 'xul.dll' '../../../dist/bin/gtest'
[task 2023-07-06T12:14:44.037Z] 12:14:44     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:15:31.610Z] 12:15:31     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:15:31.611Z] 12:15:31     INFO -  /builds/worker/.mozbuild/srcdirs/gecko-8a5b87fe5d69/_virtualenvs/build/bin/python -m mozbuild.action.dumpsymbols /builds/worker/workspace/obj-build/toolkit/library/gtest/xul.dll /builds/worker/workspace/obj-build/toolkit/library/gtest/xul.dll_syms.track
[task 2023-07-06T12:15:31.611Z] 12:15:31     INFO -  Running: /builds/worker/.mozbuild/srcdirs/gecko-8a5b87fe5d69/_virtualenvs/build/bin/python /builds/worker/checkouts/gecko/toolkit/crashreporter/tools/symbolstore.py -c --vcs-info -i --install-manifest=/builds/worker/workspace/obj-build/_build_manifests/install/dist_include,/builds/worker/workspace/obj-build/dist/include -s /builds/worker/checkouts/gecko /builds/worker/fetches/dump_syms/dump_syms /builds/worker/workspace/obj-build/dist/crashreporter-symbols /builds/worker/workspace/obj-build/toolkit/library/gtest/xul.dll
[task 2023-07-06T12:15:31.611Z] 12:15:31     INFO -  Beginning work for file: /builds/worker/workspace/obj-build/toolkit/library/gtest/xul.dll
[task 2023-07-06T12:15:31.611Z] 12:15:31     INFO -  Processing file: /builds/worker/workspace/obj-build/toolkit/library/gtest/xul.dll
[task 2023-07-06T12:15:31.611Z] 12:15:31     INFO -  /builds/worker/fetches/dump_syms/dump_syms --inlines /builds/worker/workspace/obj-build/toolkit/library/gtest/xul.dll
[task 2023-07-06T12:15:31.611Z] 12:15:31     INFO -  0050:err:winediag:nodrv_CreateWindow Application tried to create a window, but no driver could be loaded.
[task 2023-07-06T12:15:31.611Z] 12:15:31     INFO -  0050:err:winediag:nodrv_CreateWindow L"The explorer process failed to start."
[task 2023-07-06T12:15:31.611Z] 12:15:31     INFO -  0050:err:systray:initialize_systray Could not create tray window
[task 2023-07-06T12:15:31.612Z] 12:15:31     INFO -  0094:fixme:hid:handle_IRP_MN_QUERY_ID Unhandled type 00000005
[task 2023-07-06T12:15:31.612Z] 12:15:31     INFO -  0094:fixme:hid:handle_IRP_MN_QUERY_ID Unhandled type 00000005
[task 2023-07-06T12:15:31.612Z] 12:15:31     INFO -  0094:fixme:hid:handle_IRP_MN_QUERY_ID Unhandled type 00000005
[task 2023-07-06T12:15:31.612Z] 12:15:31     INFO -  0094:fixme:hid:handle_IRP_MN_QUERY_ID Unhandled type 00000005
[task 2023-07-06T12:15:31.612Z] 12:15:31     INFO -  0024:fixme:heap:RtlSetHeapInformation handle 0000000000000000, info_class 1, info 0000000000000000, size 0 stub!
[task 2023-07-06T12:15:31.612Z] 12:15:31     INFO -  Finished processing /builds/worker/workspace/obj-build/toolkit/library/gtest/xul.dll in 45.96s
[task 2023-07-06T12:15:31.612Z] 12:15:31     INFO -  gmake[4]: Leaving directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:15:31.766Z] 12:15:31     INFO -  gmake[4]: Entering directory '/builds/worker/workspace/obj-build/toolkit/library/gtest'
[task 2023-07-06T12:15:31.766Z] 12:15:31     INFO -  /builds/worker/.mozbuild/srcdirs/gecko-8a5b87fe5d69/_virtualenvs/build/bin/python /builds/worker/checkouts/gecko/build/win32/autowinchecksec.py xul.dll
[task 2023-07-06T12:15:31.766Z] 12:15:31     INFO -  Warn: large load config, probably contains undocumented fields
[task 2023-07-06T12:15:31.766Z] 12:15:31     INFO -  TEST-PASS | autowinchecksec.py | xul.dll succeeded
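
The `-time` summary above shows the gtest link spending almost all of its time in LTO. A minimal sketch for pulling those per-phase timings out of a log in this format, assuming every phase line looks like the ones in the excerpt (the name `parse_phases` and the regex are illustrative, not part of any Mozilla tooling):

```python
import re

# Matches lld-link `-time` lines as they appear in the task log, e.g.
# "... INFO -    LTO:      3091437 ms ( 99.2%)". The pattern is an assumption
# based on the excerpt above, not a documented format.
PHASE_RE = re.compile(
    r"INFO -\s+(?P<name>[A-Za-z ()]+):\s+(?P<ms>\d+) ms \(\s*[\d.]+%\)"
)

def parse_phases(log_text):
    """Return {phase name: milliseconds} for every timing line found."""
    phases = {}
    for line in log_text.splitlines():
        match = PHASE_RE.search(line)
        if match:
            phases[match.group("name").strip()] = int(match.group("ms"))
    return phases

# Three lines copied from the log above.
SAMPLE = """\
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    Input File Reading:            6727 ms (  0.2%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -    LTO:                        3091437 ms ( 99.2%)
[task 2023-07-06T12:14:43.750Z] 12:14:43     INFO -  Total Linking Time:           3115028 ms (100.0%)
"""

phases = parse_phases(SAMPLE)
lto_share = phases["LTO"] / phases["Total Linking Time"]
print(f"LTO: {phases['LTO'] / 60000:.1f} min, {lto_share:.1%} of the link")
```

On the sample lines this reports about 51.5 minutes of LTO, 99.2% of the total link time.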

Smells like OOM killer after bug 1834815

Keywords: regression
Regressed by: 1834815

Set release status flags based on info from the regressing bug 1834815

:sergesanspaille, since you are the author of the regressor, bug 1834815, could you take a look?

For more information, please visit BugBot documentation.

:glandium do you think it's worth disabling LTO for the test version of xul?

Flags: needinfo?(sguelton) → needinfo?(mh+mozilla)

You can't, because 95% of its objects are built with LTO.

Flags: needinfo?(mh+mozilla)

@Mike, can you look at these win64-nightlyasrelease/win64-shippable build bustages?
They have become more frequent recently, mostly on central, but they happen on autoland too.
Here is a recent failure log.

Flags: needinfo?(mh+mozilla)
Flags: needinfo?(mh+mozilla) → needinfo?(sguelton)

So the memory consumption of FullLTO is high, to such an extent that it breaks the build. I've been digging a bit; the approaches I'm thinking of as of now are:

  • decrease debuginfo size. Going to -g1 compared to -g has a significant impact on bitcode size. I recall we need more than this, but that could be worth investigating.

  • improve debuginfo representation in clang: there are a lot of duplicate strings in debuginfo; I need to check whether they are shared in memory or not.

  • decrease the input size. This seems obvious but as FullLTO loads the whole libxul code in memory to optimize it, if we decrease input size, we use less memory. This could be done by splitting libxul in smaller libraries, or decrease the amount of code we link in. I need to look for code that's unused or duplicated.

Flags: needinfo?(sguelton)

> This could be done by splitting libxul in smaller libraries

Probably not going to happen.

> or decrease the amount of code we link in. I need to look for code that's unused or duplicated.

Probably not going to move the needle much.

Profiles say we peak at 92GB use. ci-configuration says the xlarge-gcp pool is running on n2-custom-64-102400. That's not much of a leeway, we can probably increase that. (And we should probably switch to c2, like bug 1860584 did for the non-xlarge type)

Flags: needinfo?(jcristau)

Heh, in fact, there's a pool with c2-standard-60, and according to the commit that added it, that bumps to 240G of RAM

It's also worth noting we're linking two different xul.dlls in parallel, which effectively doubles the amount of memory we need.

Of course, linking both xul.dlls one after the other would make the overall build slower, since both are essentially single-threaded.
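
To put the numbers from the last few comments side by side, here is a back-of-the-envelope sketch of the memory headroom on each machine type. It assumes the 92GB peak is the whole-machine figure from the profile, that the trailing 102400 in n2-custom-64-102400 is memory in MB, and that GB and GiB can be treated interchangeably at this precision. The 128G entry is a hypothetical larger n2, included only for comparison.

```python
PEAK_GB = 92.0  # peak memory use reported by the profile

# Total memory per machine type, in GB.
MACHINES = {
    "n2-custom-64-102400": 102400 / 1024,  # assumed: last field is memory in MB
    "n2 with 128G (hypothetical)": 128.0,
    "c2-standard-60": 240.0,               # per the commit that added the pool
}

def headroom(total_gb, peak_gb=PEAK_GB):
    """Fraction of total memory left over at peak usage."""
    return (total_gb - peak_gb) / total_gb

for name, total in MACHINES.items():
    print(f"{name}: {total:.0f} GB total, {headroom(total):.0%} headroom")
```

Under these assumptions the current pool is left with roughly 8% headroom at peak, against about 28% with 128G and about 62% on c2-standard-60.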

Profile on n2-custom-64-102400: https://share.firefox.dev/3SkXh64
Profile on c2-standard-60: https://share.firefox.dev/3SnD7bQ
(both the same build type on the same changeset)
Unfortunately, there isn't much of a difference in CPU frequencies between n2 and c2 (2.8GHz vs. 3.1GHz), and it looks like the reduced number of cores hurts the section that uses all the cores, making the overall build slower even though the individual xul.dll links/compiles are faster.

Strings are already made unique in debug info, false trail.

(In reply to Mike Hommey [:glandium] from comment #24)

> Profiles say we peak at 92GB use. ci-configuration says the xlarge-gcp pool is running on n2-custom-64-102400. That's not much of a leeway, we can probably increase that. (And we should probably switch to c2, like bug 1860584 did for the non-xlarge type)

I made some comparisons between n2-custom-64-102400 and c2-standard-60 with 10 builds on each, and it wasn't conclusive; if anything, the n2 seemed a bit faster. The IO stats looked odd though, and I didn't get to the bottom of it... The gecko-1/b-linux-xlarge-gcp-bug1797804-c2 pool is still around if someone else wants to play with that before I get back to it.

Flags: needinfo?(jcristau)
Duplicate of this bug: 1862916
Summary: Intermittent gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:526: xul.dll] Killed → Perma gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:526: xul.dll] Killed

(In reply to Julien Cristau [:jcristau] from comment #29)

> (In reply to Mike Hommey [:glandium] from comment #24)
>
> > Profiles say we peak at 92GB use. ci-configuration says the xlarge-gcp pool is running on n2-custom-64-102400. That's not much of a leeway, we can probably increase that. (And we should probably switch to c2, like bug 1860584 did for the non-xlarge type)
>
> I made some comparisons between n2-custom-64-102400 and c2-standard-60 with 10 builds on each, and it wasn't conclusive, if anything the n2 seemed a bit faster. The IO stats looked odd though, and I didn't get to the bottom of it... The gecko-1/b-linux-xlarge-gcp-bug1797804-c2 pool is still around if someone else wants to play with that before I get back to it.

Could we "just" bump the memory available on those n2s?

It's still intermittent, actually.

Summary: Perma gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:526: xul.dll] Killed → Intermittent gmake[4]: *** [/builds/worker/checkouts/gecko/config/rules.mk:526: xul.dll] Killed
Assignee: nobody → jcristau
Status: NEW → ASSIGNED
Pushed by jcristau@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/b7694d48d2fd bump memory on b-linux-xlarge-gcp instances to 128G. r=releng-reviewers,bhearsum
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
See Also: → 1876930