Closed Bug 1601252 Opened 1 year ago Closed 1 year ago

Perma Beta [taskcluster:error] Task timeout after 1200 seconds. Force killing container. when Gecko 73 merges to Beta on 2020-01-06

Categories

(Testing :: General, defect, P1)

Tracking

(Root Cause: Infrastructure/Build Error, firefox-esr68 unaffected, firefox71 unaffected, firefox72 unaffected, firefox73 blocking verified)

VERIFIED FIXED
mozilla73
Tracking Status
firefox-esr68 --- unaffected
firefox71 --- unaffected
firefox72 --- unaffected
firefox73 blocking verified

People

(Reporter: malexandru, Assigned: aryx)

References

(Regression)

Details

(Keywords: regression)

Attachments

(2 files)

Central as Beta simulation: https://treeherder.mozilla.org/#/jobs?repo=try&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception%2Cusercancel%2Crunnable&revision=e5ba0286cc5e5b479a831b5e9f04244d3fbd52e9&searchStr=%28run

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=279538041&repo=try&lineNumber=361

[task 2019-12-04T10:47:18.326Z] # Move our fetched firefox into objdir/dist so the jarlog entries will match
[task 2019-12-04T10:47:18.326Z] # the paths when the final PGO stage packages the build.
[task 2019-12-04T10:47:18.326Z] mkdir -p $PGO_RUNDIR
[task 2019-12-04T10:47:18.326Z] + mkdir -p obj-firefox/dist
[task 2019-12-04T10:47:18.327Z] mkdir -p $UPLOAD_PATH
[task 2019-12-04T10:47:18.327Z] + mkdir -p /builds/worker/artifacts
[task 2019-12-04T10:47:18.327Z] mv $MOZ_FETCHES_DIR/firefox $PGO_RUNDIR
[task 2019-12-04T10:47:18.328Z] + mv /builds/worker/fetches/firefox obj-firefox/dist
[task 2019-12-04T10:47:18.559Z] ./mach python build/pgo/profileserver.py --binary $PGO_RUNDIR/firefox/firefox
[task 2019-12-04T10:47:18.559Z] + ./mach python build/pgo/profileserver.py --binary obj-firefox/dist/firefox/firefox
[task 2019-12-04T10:47:19.265Z] New python executable in /builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/init/bin/python2.7
[task 2019-12-04T10:47:19.265Z] Also creating executable in /builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/init/bin/python
[task 2019-12-04T10:47:20.383Z] Installing setuptools, pip, wheel...done.
[task 2019-12-04T10:47:21.292Z] running build_ext
[task 2019-12-04T10:47:21.292Z] building 'psutil._psutil_linux' extension
[task 2019-12-04T10:47:21.292Z] creating build
[task 2019-12-04T10:47:21.292Z] creating build/temp.linux-x86_64-2.7
[task 2019-12-04T10:47:21.292Z] creating build/temp.linux-x86_64-2.7/psutil
[task 2019-12-04T10:47:21.292Z] x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=563 -DPSUTIL_LINUX=1 -I/usr/include/python2.7 -c psutil/_psutil_common.c -o build/temp.linux-x86_64-2.7/psutil/_psutil_common.o
[task 2019-12-04T10:47:21.292Z] x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=563 -DPSUTIL_LINUX=1 -I/usr/include/python2.7 -c psutil/_psutil_posix.c -o build/temp.linux-x86_64-2.7/psutil/_psutil_posix.o
[task 2019-12-04T10:47:21.292Z] x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=563 -DPSUTIL_LINUX=1 -I/usr/include/python2.7 -c psutil/_psutil_linux.c -o build/temp.linux-x86_64-2.7/psutil/_psutil_linux.o
[task 2019-12-04T10:47:21.292Z] creating build/lib.linux-x86_64-2.7
[task 2019-12-04T10:47:21.292Z] creating build/lib.linux-x86_64-2.7/psutil
[task 2019-12-04T10:47:21.292Z] x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/psutil/_psutil_common.o build/temp.linux-x86_64-2.7/psutil/_psutil_posix.o build/temp.linux-x86_64-2.7/psutil/_psutil_linux.o -o build/lib.linux-x86_64-2.7/psutil/_psutil_linux.so
[task 2019-12-04T10:47:21.292Z] building 'psutil._psutil_posix' extension
[task 2019-12-04T10:47:21.292Z] x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=563 -DPSUTIL_LINUX=1 -I/usr/include/python2.7 -c psutil/_psutil_common.c -o build/temp.linux-x86_64-2.7/psutil/_psutil_common.o
[task 2019-12-04T10:47:21.292Z] x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=563 -DPSUTIL_LINUX=1 -I/usr/include/python2.7 -c psutil/_psutil_posix.c -o build/temp.linux-x86_64-2.7/psutil/_psutil_posix.o
[task 2019-12-04T10:47:21.292Z] x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/psutil/_psutil_common.o build/temp.linux-x86_64-2.7/psutil/_psutil_posix.o -o build/lib.linux-x86_64-2.7/psutil/_psutil_posix.so
[task 2019-12-04T10:47:21.292Z] copying build/lib.linux-x86_64-2.7/psutil/_psutil_linux.so -> psutil
[task 2019-12-04T10:47:21.292Z] copying build/lib.linux-x86_64-2.7/psutil/_psutil_posix.so -> psutil
[task 2019-12-04T10:47:21.292Z]
[task 2019-12-04T10:47:21.292Z] Error processing command. Ignoring because optional. (optional:packages.txt:comm/build/virtualenv_packages.txt)
[taskcluster:error] Task timeout after 1200 seconds. Force killing container.
[taskcluster 2019-12-04 11:06:30.731Z] === Task Finished ===
[taskcluster 2019-12-04 11:06:30.731Z] Unsuccessful task run with exit code: -1 completed in 1251.014 seconds
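For reference, the gap between the last task output and the forced kill can be worked out from the timestamps in the log above. A minimal Python sketch; the helper name silent_gap_seconds is made up for illustration, and the two timestamps are copied from the log:

```python
from datetime import datetime


def silent_gap_seconds(last_output: str, task_finished: str) -> float:
    """Seconds between the last task log line and the 'Task Finished' line."""
    # Task lines use an ISO-style 'T' separator; taskcluster lines use a space.
    start = datetime.strptime(last_output, "%Y-%m-%dT%H:%M:%S.%fZ")
    end = datetime.strptime(task_finished, "%Y-%m-%d %H:%M:%S.%fZ")
    return (end - start).total_seconds()


gap = silent_gap_seconds("2019-12-04T10:47:21.292Z", "2019-12-04 11:06:30.731Z")
print(f"{gap:.3f} seconds without output")
```

That is roughly 19 minutes with no output after the virtualenv setup, which suggests profileserver.py was waiting on something until the 1200 second timeout fired rather than failing outright.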

Connor, can you please take a look at this?

Flags: needinfo?(sheehan)

Similar to the failures we had when bug 1598516 initially landed. Matt, can you take a look at this issue?

Component: Mercurial: hg.mozilla.org → Networking
Flags: needinfo?(sheehan) → needinfo?(matt.woodrow)
Product: Developer Services → Core
Severity: normal → blocker
Priority: -- → P1
Regressed by: 1598516, 1600211
Summary: Perma Beta [taskcluster:error] Task timeout after 1200 seconds. Force killing container. when Gecko 73 merges to Beta on 06-01-20 → Perma Beta [taskcluster:error] Task timeout after 1200 seconds. Force killing container. when Gecko 73 merges to Beta on 2020-01-06

Is it possible to run this locally? This test is annoying, since it doesn't show any gecko logging, except when you run it locally.

I had the same failure when landing bug 1598516, which was fixed by https://hg.mozilla.org/integration/autoland/rev/e100ba2fc666

I expect the issue is related to DocumentChannel being disabled on beta (which we want to change ASAP), but it's not obvious why these changes would break anything.

Apply the central-as-beta simulation changes as described at https://wiki.mozilla.org/Sheriffing/How_To/Beta_simulations#TRUNK_AS_EARLY_BETA with --no-push, then commit them. To get the official branding, add ac_add_options --with-branding=browser/branding/official to your mozconfig.

It looks like the quitter's content script never successfully sends the message; it fails every time with "browser.quitter is undefined".

Shane, do you have any idea why beta would be different here?

Flags: needinfo?(matt.woodrow) → needinfo?(mixedpuppy)

(In reply to Matt Woodrow (:mattwoodrow) from comment #6)

It looks like the quitter's content script never successfully sends the message; it fails every time with "browser.quitter is undefined".

Shane, do you have any idea why beta would be different here?

That doesn't quite make sense. browser.quitter is used in the background script when the message is received from the content script. That would indicate to me that the content script is fine, but the experimental API (parent.js) is not being loaded for the extension.

We allow experimental APIs to be loaded if AddonSettings.ALLOW_LEGACY_EXTENSIONS is true, which is set here:

https://searchfox.org/mozilla-central/rev/690e903ef689a4eca335b96bd903580394864a1c/toolkit/mozapps/extensions/internal/AddonSettings.jsm#51-60

So one of AppConstants.MOZ_ALLOW_LEGACY_EXTENSIONS || Cu.isInAutomation has changed in Beta.

MOZ_ALLOW_LEGACY_EXTENSIONS is a build flag. Cu.isInAutomation is determined here:

https://searchfox.org/mozilla-central/source/js/xpconnect/src/xpcpublic.h#676

That check depends on the security.turn_off_all_security_so_that_viruses_can_take_over_this_computer pref and the MOZ_DISABLE_NONLOCAL_CONNECTIONS environment variable.

So, look for one of those three things to be different where you're running the quitter extension.
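As a rough model of the gating described above, here is a Python sketch for illustration only. The real logic lives in the linked AddonSettings.jsm and xpcpublic.h; the function names are made up, and two points are assumptions that should be checked against the linked source: that the pref and the environment variable are both required, and that the variable counts as set when it is non-empty and not "0".

```python
AUTOMATION_PREF = (
    "security.turn_off_all_security_so_that_viruses_can_take_over_this_computer"
)
NONLOCAL_ENV = "MOZ_DISABLE_NONLOCAL_CONNECTIONS"


def is_in_automation(prefs: dict, env: dict) -> bool:
    # Assumption: both the pref and the environment variable must be set.
    pref_on = bool(prefs.get(AUTOMATION_PREF, False))
    # Assumption: the variable counts as set when non-empty and not "0".
    env_on = env.get(NONLOCAL_ENV, "") not in ("", "0")
    return pref_on and env_on


def experimental_apis_allowed(build_flag: bool, prefs: dict, env: dict) -> bool:
    # Models MOZ_ALLOW_LEGACY_EXTENSIONS || Cu.isInAutomation from the comment.
    return build_flag or is_in_automation(prefs, env)


# Build flag off, pref set, environment variable missing.
print(experimental_apis_allowed(False, {AUTOMATION_PREF: True}, {}))
```

Under these assumptions, with the build flag off, losing either the pref or the environment variable is enough to stop the experimental API from loading.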

Another issue we've run into with perma failures on merge (though I would look at the items above first) is that our xpcshell tests need to call AddonTestUtils.overrideCertDB(). That is implemented here:

https://searchfox.org/mozilla-central/rev/690e903ef689a4eca335b96bd903580394864a1c/toolkit/mozapps/extensions/internal/AddonTestUtils.jsm#739

Related code in Extension.jsm where we check if experimental APIs are allowed:

https://searchfox.org/mozilla-central/rev/690e903ef689a4eca335b96bd903580394864a1c/toolkit/components/extensions/Extension.jsm#1872-1884

Flags: needinfo?(mixedpuppy)

Oh, is it possible that this is because it isn't signed? Same as bug 1523750?

It also looks like the quitter xpi that was checked into the tree (before I updated it) didn't match the in-tree source. Bug 1548515 was definitely missing from it.
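One way to check for this kind of drift is to hash the files inside the XPI and compare them with the source files. A minimal Python sketch; the function name and the file names in the example are hypothetical, and skipping META-INF/ assumes that directory holds only signing data:

```python
import hashlib
import io
import zipfile


def xpi_mismatches(xpi_bytes: bytes, source_files: dict) -> list:
    """Return sorted names whose content differs between an XPI and its source.

    source_files maps archive-relative paths to file contents (bytes).
    """
    with zipfile.ZipFile(io.BytesIO(xpi_bytes)) as xpi:
        packed = {
            name: hashlib.sha256(xpi.read(name)).hexdigest()
            for name in xpi.namelist()
            # Assumption: META-INF/ only carries signing data, so skip it.
            if not name.startswith("META-INF/") and not name.endswith("/")
        }
    source = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in source_files.items()
    }
    # A name is a mismatch if it is missing on either side or the hashes differ.
    return sorted(
        name for name in packed.keys() | source.keys()
        if packed.get(name) != source.get(name)
    )


# Build a small in-memory XPI with hypothetical contents for the example.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("manifest.json", b'{"version": "old"}')
    z.writestr("background.js", b"// same")
    z.writestr("META-INF/manifest.mf", b"signature data")

stale = xpi_mismatches(buf.getvalue(), {
    "manifest.json": b'{"version": "new"}',
    "background.js": b"// same",
    "api.js": b"// only in source",
})
print(stale)
```

An empty list would mean the packaged files match the source byte for byte.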

Kmag, do you think signing will fix this, and do you have access to do that?

Flags: needinfo?(kmaglione+bmo)

FWIW - this blocks releng from being able to do staging releases.

(In reply to Matt Woodrow (:mattwoodrow) from comment #9)

It also looks like the quitter xpi that was checked into the tree (before I updated it) didn't match the in-tree source. Bug 1548515 was definitely missing from it.

Kmag, do you think signing will fix this, and do you have access to do that?

Yes, signing will fix it, and I have access.

Assignee: nobody → kmaglione+bmo
Flags: needinfo?(kmaglione+bmo)
Component: Networking → General
Product: Core → Testing

Sorry, it looks like my AWS credentials for signing have stopped working.

Assignee: kmaglione+bmo → nobody

Andrew, do you still have the ability to sign quitter.xpi?

Flags: needinfo?(andrew.swan)

Or maybe ahal?

Flags: needinfo?(ahal)

Is quitter.xpi a privileged webextension or a system addon? If so, where is the source?

(We have new automation for privileged WebExtensions and system addons, both for CI and releases, on GitHub. Docs here. Contacts are me, :rail, :rdalal)

It's a privileged WebExtension. The source is here, though the version in manifest.json needs to be bumped to 2019.12.03 before signing.
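The version bump itself is a one-field change in manifest.json. A minimal Python sketch, assuming a standard top-level "version" key; the helper name and the example manifest contents are placeholders, not the real quitter manifest:

```python
import json


def bump_manifest_version(manifest_text: str, new_version: str) -> str:
    """Return manifest.json text with its top-level "version" replaced."""
    manifest = json.loads(manifest_text)
    manifest["version"] = new_version
    # Re-serializing normalizes whitespace, so expect a formatting-only diff too.
    return json.dumps(manifest, indent=2) + "\n"


# Placeholder manifest for illustration only.
example = '{"manifest_version": 2, "name": "Quitter", "version": "2019.01.01"}'
print(bump_manifest_version(example, "2019.12.03"))
```

Editing the field by hand works just as well; the sketch only shows which field changes.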

Sorry, I have zero recollection of the addon signing process. Looking at the hg log, I did apparently sign it in the past (3+ years ago). I can try it out if someone links me to the docs, but if :kmag's credentials aren't working I don't know why mine would be special.

Flags: needinfo?(ahal)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #14)

Andrew, do you still have the ability to sign quitter.xpi?

I don't think I ever had credentials for privileged signing...

Flags: needinfo?(andrew.swan)

Philipp can sign.

Flags: needinfo?(philipp)
Flags: needinfo?(philipp)

I tried out the signed xpi in https://treeherder.mozilla.org/#/jobs?repo=try&author=bhearsum%40mozilla.com&selectedJob=281957081, and it appeared to work.

Unfortunately, I hit https://bugzilla.mozilla.org/show_bug.cgi?id=1599369 afterwards, so that probably needs to be fixed before the next uplift too.

Assignee: nobody → aryx.bugmail
Status: NEW → ASSIGNED
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/a5206bee9f52
update quitter extension with signed version. r=mattwoodrow
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73

Please specify a root cause for this bug. See :tmaity for more information.

Root Cause: --- → ?
Root Cause: ? → Infrastructure/Build Error