Closed Bug 1562612 Opened 1 year ago Closed 1 year ago

Linux 32 bit DevEdition nightly build times out/gets stuck for Gecko 69

Categories

(Firefox Build System :: Task Configuration, defect, P2, major)

Tracking

(firefox-esr60 unaffected, firefox-esr68 unaffected, firefox67 unaffected, firefox67.0.1 unaffected, firefox68 unaffected, firefox69 blocking verified)

VERIFIED FIXED
mozilla69
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox67 --- unaffected
firefox67.0.1 --- unaffected
firefox68 --- unaffected
firefox69 blocking verified

People

(Reporter: aryx, Assigned: mshal)

References

(Regression)

Details

(Keywords: rca-needed, regression)

After central got merged to beta, Linux DevEdition Nightly build fails for 32-bit builds:

https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&resultStatus=success%2Cusercancel%2Ctestfailed%2Cbusted%2Cexception&searchStr=5d5679753be1c9f1933720fe8d1315b207cf44d4&tochange=bcbdec13649cd4c0d3b8bdd1023e488816fce03d&fromchange=f8c40b71fcb16f6d1713563182c456b6201f7939

Log: https://treeherder.mozilla.org/logviewer.html#?job_id=254204951&repo=mozilla-beta

[task 2019-07-01T10:58:53.144Z] 10:58:53 INFO - make[5]: Entering directory '/builds/worker/workspace/build/src/obj-firefox/toolkit/library'
[task 2019-07-01T10:58:53.144Z] 10:58:53 INFO - /builds/worker/workspace/build/src/obj-firefox/_virtualenvs/init/bin/python -m mozbuild.action.dumpsymbols /builds/worker/workspace/build/src/obj-firefox/toolkit/library/libxul.so /builds/worker/workspace/build/src/obj-firefox/toolkit/library/libxul.so_syms.track --count-ctors
[task 2019-07-01T10:58:53.144Z] 10:58:53 INFO - Running: /builds/worker/workspace/build/src/obj-firefox/_virtualenvs/init/bin/python /builds/worker/workspace/build/src/toolkit/crashreporter/tools/symbolstore.py -c --vcs-info --install-manifest=/builds/worker/workspace/build/src/obj-firefox/_build_manifests/install/dist_include,/builds/worker/workspace/build/src/obj-firefox/dist/include -s /builds/worker/workspace/build/src /builds/worker/workspace/build/src/obj-firefox/dist/host/bin/dump_syms /builds/worker/workspace/build/src/obj-firefox/dist/crashreporter-symbols /builds/worker/workspace/build/src/obj-firefox/toolkit/library/libxul.so --count-ctors
[task 2019-07-01T10:58:53.145Z] 10:58:53 INFO - Beginning work for file: /builds/worker/workspace/build/src/obj-firefox/toolkit/library/libxul.so
[task 2019-07-01T10:58:53.145Z] 10:58:53 INFO - Processing file: /builds/worker/workspace/build/src/obj-firefox/toolkit/library/libxul.so
[task 2019-07-01T10:58:53.145Z] 10:58:53 INFO - /builds/worker/workspace/build/src/obj-firefox/dist/host/bin/dump_syms /builds/worker/workspace/build/src/obj-firefox/toolkit/library/libxul.so
[task 2019-07-01T10:58:53.145Z] 10:58:53 INFO - PERFHERDER_DATA: {"framework": {"name": "build_metrics"}, "suites": [{"subtests": [{"alertChangeType": "absolute", "name": "num_static_constructors", "value": 97, "alertThreshold": 3}], "name": "compiler_metrics"}]}
[task 2019-07-01T10:58:53.145Z] 10:58:53 INFO - Finished processing /builds/worker/workspace/build/src/obj-firefox/toolkit/library/libxul.so in 288.90s
[task 2019-07-01T10:58:53.145Z] 10:58:53 INFO - make[5]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox/toolkit/library'

[taskcluster:error] Task timeout after 7200 seconds. Force killing container.
[taskcluster 2019-07-01 11:12:49.701Z] === Task Finished ===
[taskcluster 2019-07-01 11:12:49.702Z] Unsuccessful task run with exit code: -1 completed in 7216.874 seconds

In Gecko 68, the same dump_syms step was followed by the gtest libxul link:
[task 2019-06-30T22:55:40.528Z] 22:55:40 INFO - make[5]: Entering directory '/builds/worker/workspace/build/src/obj-firefox/toolkit/library/gtest'
[task 2019-06-30T22:55:40.532Z] 22:55:40 INFO - /builds/worker/workspace/build/src/clang/bin/clang++ -m32 -Qunused-arguments -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wempty-body -Wignored-qualifiers -Woverloaded-virtual -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wunreachable-code -Wunreachable-code-return -Wwrite-strings -Wno-invalid-offsetof -Wclass-varargs -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wc++1z-compat -Wc++2a-compat -Wcomma -Wimplicit-fallthrough -Werror=non-literal-null-conversion -Wstring-conversion -Wtautological-overlap-compare -Wtautological-unsigned-enum-zero-compare -Wtautological-unsigned-zero-compare -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=backend-plugin -Wno-error=return-std-move -Wno-error=atomic-alignment -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Wno-unknown-warning-option -Wno-return-type-c-linkage -D_GLIBCXX_USE_CXX11_ABI=0 -fno-sized-deallocation -fcrash-diagnostics-dir=/builds/worker/artifacts -march=pentium-m -msse -msse2 -mfpmath=sse -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread -pipe -g -Xclang -load -Xclang /builds/worker/workspace/build/src/obj-firefox/build/clang-plugin/libclang-plugin.so -Xclang -add-plugin -Xclang moz-check -O3 -fomit-frame-pointer -funwind-tables -Werror -fprofile-instr-use=/builds/worker/workspace/build/src/obj-firefox/merged.profdata -Wno-error=profile-instr-out-of-date -Wno-error=profile-instr-unprofiled -fPIC -shared -Wl,-z,defs -Wl,--gc-sections -Wl,-h,libxul.so -o libxul.so @/builds/worker/workspace/build/src/obj-firefox/toolkit/library/gtest/libxul_so.list -flto=thin -lpthread -fstack-protector-strong -Wl,-z,noexecstack -Wl,-z,text -Wl,-z,relro 
-Wl,-z,nocopyreloc -Wl,-Bsymbolic-functions -Wl,--build-id=sha1 -Wl,-rpath-link,/builds/worker/workspace/build/src/obj-firefox/dist/bin -Wl,-rpath-link,/usr/local/lib ../../../security/nss/lib/crmf/crmf_crmf/libcrmf.a ../../../js/src/build/libjs_static.a /builds/worker/workspace/build/src/obj-firefox/i686-unknown-linux-gnu/release/libgkrust_gtest.a ../../../security/sandbox/linux/libmozsandbox.so ../../../config/external/nspr/pr/libnspr4.so ../../../config/external/nspr/libc/libplc4.so ../../../config/external/nspr/ds/libplds4.so ../../../config/external/lgpllibs/liblgpllibs.so ../../../security/nss/lib/nss/nss_nss3/libnss3.so ../../../security/nss/lib/util/util_nssutil3/libnssutil3.so ../../../security/nss/lib/smime/smime_smime3/libsmime3.so ../../../config/external/sqlite/libmozsqlite3.so ../../../security/nss/lib/ssl/ssl_ssl3/libssl3.so ../../../widget/gtk/mozgtk/stub/libmozgtk_stub.so ../../../widget/gtk/mozwayland/libmozwayland.so -Wl,--version-script,symverscript -ldl -lrt -lm -lX11 -lX11-xcb -lxcb -lXcomposite -lXcursor -lXdamage -lXext -lXfixes -lXi -lXrender -lpthread -ldl -lc -latomic -lfreetype -lfontconfig -ldbus-glib-1 -ldbus-1 -lgobject-2.0 -lglib-2.0 -latk-1.0 -lpangocairo-1.0 -lgdk_pixbuf-2.0 -lcairo-gobject -lpango-1.0 -lcairo -lgio-2.0 -lxcb-shm -lpangoft2-1.0 -lXt -lgthread-2.0 -lpulse -Wl,--version-script,/builds/worker/workspace/build/src/build/unix/stdc++compat/hide_std.ld
[task 2019-06-30T22:55:40.532Z] 22:55:40 INFO - make[5]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox/toolkit/library/gtest'

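The gap of more than 10 minutes between the last build log line (10:58:53) and the forced kill at the 7200-second task timeout (11:12:49) is unusual and suggests the build process got stuck.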
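To make the size of that silent period concrete, here is a minimal Python sketch that scans a Taskcluster log for the largest gap between consecutive timestamped lines. The helper names (`parse_stamp`, `largest_gap`) are hypothetical and not part of any Mozilla tooling, and the sample lines are abbreviated from the log above.

```python
import re
from datetime import datetime

# Matches both "[task 2019-07-01T10:58:53.144Z]" and
# "[taskcluster 2019-07-01 11:12:49.701Z]" prefixes. Lines such as
# "[taskcluster:error] ..." carry no timestamp and are skipped.
STAMP = re.compile(
    r"^\[(?:task|taskcluster) (\d{4}-\d{2}-\d{2})[T ](\d{2}:\d{2}:\d{2}\.\d{3})Z\]"
)


def parse_stamp(line):
    """Return the line's timestamp as a datetime, or None if it has none."""
    m = STAMP.match(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1) + " " + m.group(2), "%Y-%m-%d %H:%M:%S.%f")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the largest gap, or None."""
    best = None
    prev_time = prev_line = None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best


# Abbreviated lines from the failing job's log (messages shortened for illustration).
sample = [
    "[task 2019-07-01T10:58:53.145Z] 10:58:53 INFO - Finished processing libxul.so",
    "[task 2019-07-01T10:58:53.145Z] 10:58:53 INFO - make[5]: Leaving directory",
    "[taskcluster:error] Task timeout after 7200 seconds. Force killing container.",
    "[taskcluster 2019-07-01 11:12:49.701Z] === Task Finished ===",
]

gap_seconds, before, after = largest_gap(sample)
print(f"largest gap: {gap_seconds:.1f}s ({gap_seconds / 60:.1f} min)")
```

Run against the full log, a scan like this would point at the step after which output stopped, which here is the end of symbol dumping for libxul.so.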

This blocks the release of DevEdition builds, at least on Linux 32-bit.

Flags: needinfo?(cmanchester)

A couple of changes have been made that could have made this longer. Perhaps link times have gone up now that we're collecting profiles from multiple processes? It also looks like we're now linking the gtest libxul during MOZ_PROFILE_GENERATE, which is a regression from before and could take some extra time.

Before we get too far investigating the build time though... aren't we intending these jobs to use 3-tier by now?

Flags: needinfo?(cmanchester) → needinfo?(mshal)

I've pushed a hopefully-temporary change to the Linux32 timeout to un-block today's gtb.
https://hg.mozilla.org/releases/mozilla-beta/rev/625005a94d4d

I'm bumping the severity of this to P2/Major as well, though I'd consider it P1/Critical if the workaround isn't effective.

Severity: normal → major
Priority: -- → P2

(In reply to Chris Manchester (:chmanchester) from comment #1)

> A couple of changes have been made that could have made this longer. Perhaps link times have gone up now that we're collecting profiles from multiple processes? It also looks like we're now linking the gtest libxul during MOZ_PROFILE_GENERATE, which is a regression from before and could take some extra time.

Bug 1562258 is already on autoland and should address the gtest libxul issue for 3-tier PGO builds.

> Before we get too far investigating the build time though... aren't we intending these jobs to use 3-tier by now?

Yeah, they should. For some reason I only did linux64 devedition in bug 1547395. I filed bug 1562768 for the rest rather than doing it in this bug since RyanVM landed the timeout workaround.

Flags: needinfo?(mshal)

For what it's worth, I tracked down a significant slowdown in linking libxul to bug 1542746, which seems likely to be the cause here.

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

Confirmed that Linux32 DevEdition builds are running in the same range as Linux64 builds now that bug 1562768 has landed. Reverting the timeout increase and calling this bug fixed.

https://hg.mozilla.org/releases/mozilla-beta/rev/cb79a3a1780c565d4b927d7b7f82b98b61701ba3

Assignee: nobody → mshal
Status: NEW → RESOLVED
Closed: 1 year ago
Depends on: 1562768
Regressed by: 1542746
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
Status: RESOLVED → VERIFIED

This bug has been identified as part of a pilot on determining root causes of blocking and dot release drivers.

It needs a root-cause set for it. Please see the list at https://docs.google.com/document/d/1FFEGsmoU8T0N8R9kk-MXWptOPtXXXRRIe4vQo3_HgMw/.

Add the root cause as a whiteboard tag in the form [rca - <cause> ] and remove the rca-needed keyword.

If you have questions, please contact :tmaity.

Keywords: rca-needed