Closed Bug 1322705 Opened 3 years ago Closed 3 years ago

failure to run xpcshell update tests due to unsigned updater.exe when running in taskcluster

Categories

(Infrastructure & Operations :: CIDuty, task)

task
Not set

Tracking

(firefox53 fixed)

RESOLVED FIXED
Tracking Status
firefox53 --- fixed

People

(Reporter: jmaher, Assigned: jmaher)

References

Details

Attachments

(1 file, 1 obsolete file)

a try link for reference:
https://treeherder.mozilla.org/#/jobs?repo=try&author=jmaher@mozilla.com&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=32452054&filter-searchStr=xpcshell%208

a raw log from win7 debug:
https://queue.taskcluster.net/v1/task/c1II_gLKRAq41hrPdsDj8Q/runs/0/artifacts/public/logs/live_backing.log

a snippet from the log:
5:19:10  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/mozapps/update/tests/unit_service_updater/marSuccessCompleteSvc.js | shouldRunServiceTest - [shouldRunServiceTest : 2052] the updater.exe binary should be signed (if not, build system configuration bug?) - false == true
15:19:10     INFO -  ../data/xpcshellUtilsAUS.js:shouldRunServiceTest:2052
15:19:10     INFO -  ../data/xpcshellUtilsAUS.js:setupTestCommon:796
15:19:10     INFO -  Z:/task/build/tests/xpcshell/tests/toolkit/mozapps/update/tests/unit_service_updater/marSuccessCompleteSvc.js:run_test:9
15:19:10     INFO -  Z:\task\build\tests\xpcshell\head.js:_execute_test:535
15:19:10     INFO -  -e:null:1
15:19:10     INFO -  exiting test

it seems that all the xpcshell tests in the toolkit/mozapps/update/tests/unit_service_updater directory are failing.

Given the above error, this looks to be that we are expecting a signed update.exe binary and it is not signed- possibly this is a simple fix.
Summary: failure to run xpcshell update tests due to unsigned update.exe when running in taskcluster → failure to run xpcshell update tests due to unsigned updater.exe when running in taskcluster
Per "the updater.exe binary should be signed (if not, build system configuration bug?" moving bug since this has to do with the build system.

See bug 1272925 for an example of the same bug.
Component: Application Update → Buildduty
Product: Toolkit → Release Engineering
QA Contact: bugspam.Callek
thanks for updating this :rstrong.  This could easily be something related to us not building update.exe properly in taskcluster?  Maybe the machine is configured differently or permissions are different?

Possibly a next step is to get update.exe (not sure where that is) and compare it between buildbot and taskcluster builds.  Ideally run a tool or examine whatever bits are needed to determine if it is signed.  If that doesn't show a difference, we would need to look at the machine/worker.
There have been several cases where the machine wasn't properly configured to sign the binaries. Bug 1302927 is another example. updater.exe is built properly but the signing step wasn't performed. Perhaps Mark can help out
Flags: needinfo?(mcornmesser)
thanks :rstrong.  We are running the builds (and all the related tools/packaging) on taskcluster windows 2012 instances.  Most likely this step needs to be included in the builds.

Is there a way to verify or figure out if we are signing based on downloading firefox.zip or some of the other artifacts generated by a build?

here is a link to a try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bf9bb7cfe7f11f6694113f54797684aabfacf6f6
I can attest that taskcluster builds won't sign inside the taskcluster Build task.

This sounds to me like we may need to rope in :aki and Chain-of-trust support into the conversation soon. So that we can think about enabling signing on the actual CI builds (will be the first case of CI signing in taskcluster, and first case of windows signing -- so far its been only for linux/android and for taskcluster-backed nightlies).

Aki's not back until Dec 19 though... 

Maybe for the time being (unsigned tier 2 support) we don't need this updater.exe test(s)?
(In reply to Joel Maher ( :jmaher) from comment #4)
> thanks :rstrong.  We are running the builds (and all the related
> tools/packaging) on taskcluster windows 2012 instances.  Most likely this
> step needs to be included in the builds.
> 
> Is there a way to verify or figure out if we are signing based on
> downloading firefox.zip or some of the other artifacts generated by a build?
I haven't bothered with manually verifying since the test itself checks if the file that needs to be signed is signed.

I don't know whether it would be ok to disable this for those systems but if you do please do so by not building the updater (e.g. MOZ_UPDATER).
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #3)
> There have been several cases where the machine wasn't properly configured
> to sign the binaries. Bug 1302927 is another example. updater.exe is built
> properly but the signing step wasn't performed. Perhaps Mark can help out
I think that bug and those issues are unrelated. That dealt with the mozilla maintenance service on buildbot testers.
Flags: needinfo?(mcornmesser)
Possibly the same one as bug 1272925 since it is the same error message.
I tried setting MOZ_UPDATER=0 and there was no difference on try:
https://hg.mozilla.org/try/diff/383ed1402c06/browser/confvars.sh

I suspect there is a better way to turn this off- possibly someone on this bug would know how to do that, otherwise I can ask around in irc tomorrow.
Set --disable-updater
or you can possibly just remove that line instead of setting the value to 0

old configure example
MOZ_ARG_DISABLE_BOOL(updater,
[  --disable-updater       Disable building of updater],
    MOZ_UPDATER=,
    MOZ_UPDATER=1 )

if test -n "$MOZ_UPDATER"; then
    AC_DEFINE(MOZ_UPDATER)
fi

iirc setting it to 0 means test -n "$MOZ_UPDATER" will evaluate to true and hence the AC_DEFINE
You can also do the same thing --disable-updater does,

MOZ_UPDATER=
Thanks Phil and exactly.

I prefer disabling building the updater instead of just disabling the tests.
ok, this worked, now to figure out a LOCALE=null issue- I will hack on this after my morning errands.
hmm, now a few browser chrome tests are failing as we depend on update-manager to exist:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2d3e7371d9f76a51a9f24b4213a5f52cf7f7abc7&filter-searchStr=tc&filter-tier=1&filter-tier=2&filter-tier=3&group_state=expanded&selectedJob=32577128

ideally we would have all of this annotated in the manifest files with something like:
skip-if = !MOZ_UPDATE

let me figure out the locale bits before making any decisions.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #6)
> I don't know whether it would be ok to disable this for those systems but if
> you do please do so by not building the updater (e.g. MOZ_UPDATER).

Robert, to be explicit, when we are ready to turn on signing for these builds, we'll need the build task itself to produce an updater, but it won't be signed at this point, a seperate signing task will run that signs updater.exe for us.

It is a known thing that updater.exe must be signed on windows (as with many of the other binaries in there) but is not something our taskcluster infra supports yet.

I personally think we want the updater to build and be continuously proven buildable in taskcluster (tier2!) even before we enable signing here, which is why I was thinking disabling tests that need the updater to be signed may be a better choice. (Ideally only disabled for taskcluster, somehow)

I'm happy to vidyo about the signing infra and how it works with taskcluster builds, and to rope in people who are working on signing in taskcluster and in particular signing magic for windows in taskcluster, if you feel we need it.
(In reply to Justin Wood (:Callek) from comment #15)
> I'm happy to vidyo about the signing infra and how it works with taskcluster
> builds, and to rope in people who are working on signing in taskcluster and
> in particular signing magic for windows in taskcluster, if you feel we need
> it.

That's very kind, that would be great. I'll be back tomorrow, and I'll ping you. Thanks!
here is the list of tests which fail when I set |MOZ_UPDATE=|,
* xpcshell-7: https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/tests/xpcshell/test_UpdateUtils_url.js?q=path%3Atest_UpdateUtils_url.js&redirect_type=single#191
** toolkit/modules/tests/xpcshell/test_UpdateUtils_url.js | test_locale - [test_locale : 200] the url param for %LOCALE% should equal the expected value - "null" == "en-US"
** "CONSOLE_MESSAGE: (error) [JavaScript Error: "update.locale file doesn't exist in either the application or GRE directories" {file: "resource://gre/modules/UpdateUtils.jsm" line: 148}]
** this update.locale file is not generated when we do not set MOZ_UPDATE

* bc4 (opt|debug):
** TEST-UNEXPECTED-FAIL | browser/components/tests/browser/browser_bug538331.js | Exception thrown - at chrome://mochitests/content/browser/browser/components/tests/browser/browser_bug538331.js:401 - TypeError: Cc['@mozilla.org/updates/update-manager;1'] is undefined
** https://dxr.mozilla.org/mozilla-central/source/browser/components/tests/browser/browser_bug538331.js?q=path%3Abrowser_bug538331.js&redirect_type=single#401
** we should not run this test when |MOZ_UPDATE=|

* bc5 (opt|debug):
** TEST-UNEXPECTED-FAIL | browser/components/preferences/in-content/tests/browser_advanced_update.js | Uncaught exception - [Exception... "Component returned failure code: 0x80040154 (NS_ERROR_FACTORY_NOT_REGISTERED) [nsIComponentRegistrar.contractIDToCID]" nsresult: "0x80040154 
** https://dxr.mozilla.org/mozilla-central/source/browser/components/preferences/in-content/tests/browser_advanced_update.js?q=path%3Abrowser_advanced_update.js&redirect_type=single#13
** we should not run this test when |MOZ_UPDATE=|

* bc7 (opt|debug)
** TEST-UNEXPECTED-FAIL | browser/base/content/test/general/browser_remoteTroubleshoot.js | should have correct update channel. - "undefined" == "default" - JS frame :: chrome://mochitests/content/browser/browser/base/content/test/general/browser_remoteTroubleshoot.js :: <TOP_LEVEL> :: line 70
** Console message: [JavaScript Error: "remote-troubleshooting error message. No Such Channel" {file: "resource://gre/modules/WebChannel.jsm" line: 148}]
** I assume this is related to |MOZ_UPDATE=|, not clear on if this is related or not.

all of the above browser-chrome failures happen on non-e10s and e10s.  There are other xpcshell failures, but I am not certain they are related to this.


This leaves us with a few options:
1) find a way to disable the 3 browser-chrome and 1 xpcshell test via some means when |MOZ_UPDATE=| (note, there are a few other xpcshell tests that are not failing after this |MOZ_UPDATE=| (original bug), possibly they should be hacked in the manifest also)
2) make the tests look for update-manager and fail gracefully
3) disable tests on windows so we can get them running on taskcluster (not ideal, we lose coverage on buildbot)
4) hurry up and implement signing for test binaries
5) ... something else ?

Lets chime in and figure out the right solution.  I like #1, but it will require some figuring out with adding information to mozinfo and possibly there are other ramifications I am not aware of.
(In reply to Justin Wood (:Callek) from comment #15)
> (In reply to Robert Strong [:rstrong] (use needinfo to contact me) from
> comment #6)
> > I don't know whether it would be ok to disable this for those systems but if
> > you do please do so by not building the updater (e.g. MOZ_UPDATER).
> 
> Robert, to be explicit, when we are ready to turn on signing for these
> builds, we'll need the build task itself to produce an updater, but it won't
> be signed at this point, a seperate signing task will run that signs
> updater.exe for us.
Understood.

> 
> It is a known thing that updater.exe must be signed on windows (as with many
> of the other binaries in there) but is not something our taskcluster infra
> supports yet.
> 
> I personally think we want the updater to build and be continuously proven
> buildable in taskcluster (tier2!) even before we enable signing here, which
> is why I was thinking disabling tests that need the updater to be signed may
> be a better choice. (Ideally only disabled for taskcluster, somehow)
I don't believe we've had a case where updater.exe hasn't been buildable when it is able to build other binaries and I think there is just as much of a chance of the tests uncovering an issue on the tc build of the updater.exe as building the updater.exe on tc failing. If there is a reasonable timeline to have the binaries signed then I would be more agreeable with disabling the tests. Say by no later than Jan 31st and if they aren't signed by then building the updater is disabled on tc.

> I'm happy to vidyo about the signing infra and how it works with taskcluster
> builds, and to rope in people who are working on signing in taskcluster and
> in particular signing magic for windows in taskcluster, if you feel we need
> it.
Getting the binaries signed is most definitely preferable.
(In reply to Joel Maher ( :jmaher) from comment #17)
<snip>
> This leaves us with a few options:
> 1) find a way to disable the 3 browser-chrome and 1 xpcshell test via some
> means when |MOZ_UPDATE=| (note, there are a few other xpcshell tests that
> are not failing after this |MOZ_UPDATE=| (original bug), possibly they
> should be hacked in the manifest also)
> 2) make the tests look for update-manager and fail gracefully
> 3) disable tests on windows so we can get them running on taskcluster (not
> ideal, we lose coverage on buildbot)
> 4) hurry up and implement signing for test binaries
> 5) ... something else ?
> 
> Lets chime in and figure out the right solution.  I like #1, but it will
> require some figuring out with adding information to mozinfo and possibly
> there are other ramifications I am not aware of.
I like #1 since those tests shouldn't run when not building the updater in the first place.

Adding is UPDATER to AppConstants.jsm would be another option
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #19)
> (In reply to Joel Maher ( :jmaher) from comment #17)
> <snip>
> > This leaves us with a few options:
> > 1) find a way to disable the 3 browser-chrome and 1 xpcshell test via some
> > means when |MOZ_UPDATE=| (note, there are a few other xpcshell tests that
> > are not failing after this |MOZ_UPDATE=| (original bug), possibly they
> > should be hacked in the manifest also)
> > 2) make the tests look for update-manager and fail gracefully
> > 3) disable tests on windows so we can get them running on taskcluster (not
> > ideal, we lose coverage on buildbot)
> > 4) hurry up and implement signing for test binaries
> > 5) ... something else ?
> > 
> > Lets chime in and figure out the right solution.  I like #1, but it will
> > require some figuring out with adding information to mozinfo and possibly
> > there are other ramifications I am not aware of.
> I like #1 since those tests shouldn't run when not building the updater in
> the first place.
> 
> Adding is UPDATER to AppConstants.jsm would be another option
Adding it to AppConstants.jsm would be another option
this patch assumes a few things:
1) we set MOZ_UPDATER= in a configure file somewhere (need to figure that out for taskcluster specific
2) there are no platform specific oddities, if so this should be adjusted to |skip-if = os==win && !updater|
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #8818603 - Flags: review?(robert.strong.bugs)
Comment on attachment 8818603 [details] [diff] [review]
add updater flag to mozinfo and skip tests that depend on it if updater=false

># HG changeset patch
># User Joel Maher <jmaher@mozilla.com>
># Parent  a9374fac86beda4665d4ec08f402e21e6ed4b581
>Bug 1322705 - add support for MOZ_UPDATER in manifests. r=rstrong.
>
>MozReview-Commit-ID: HTnh9OOe01F
>
>diff --git a/browser/base/content/test/general/browser.ini b/browser/base/content/test/general/browser.ini
>--- a/browser/base/content/test/general/browser.ini
>+++ b/browser/base/content/test/general/browser.ini
>@@ -360,16 +360,17 @@ skip-if = (os == 'linux') || (e10s && de
> [browser_purgehistory_clears_sh.js]
> [browser_PageMetaData_pushstate.js]
> [browser_refreshBlocker.js]
> support-files =
>   refresh_header.sjs
>   refresh_meta.sjs
> [browser_relatedTabs.js]
> [browser_remoteTroubleshoot.js]
>+skip-if = !updater # depends on UpdateUtils .Locale
For xpcshell and mochitest-chrome I know you can just add the following on the line following the skip-if. Check if this is also ok for browser.ini and if so do the same
reason = depends on UpdateUtils .Locale

> support-files =
>   test_remoteTroubleshoot.html
> [browser_remoteWebNavigation_postdata.js]
> [browser_removeTabsToTheEnd.js]
> [browser_restore_isAppTab.js]
> [browser_sanitize-passwordDisabledHosts.js]
> [browser_sanitize-sitepermissions.js]
> [browser_sanitize-timespans.js]
>diff --git a/browser/components/preferences/in-content/tests/browser.ini b/browser/components/preferences/in-content/tests/browser.ini
>--- a/browser/components/preferences/in-content/tests/browser.ini
>+++ b/browser/components/preferences/in-content/tests/browser.ini
>@@ -1,14 +1,15 @@
> [DEFAULT]
> support-files =
>   head.js
>   privacypane_tests_perwindow.js
> 
> [browser_advanced_update.js]
>+skip-if = !updater
Add a reason = etc. as above if possible

> [browser_basic_rebuild_fonts_test.js]
> [browser_bug410900.js]
> [browser_bug705422.js]
> [browser_bug731866.js]
> [browser_bug795764_cachedisabled.js]
> [browser_bug1018066_resetScrollPosition.js]
> [browser_bug1020245_openPreferences_to_paneContent.js]
> [browser_bug1184989_prevent_scrolling_when_preferences_flipped.js]
>diff --git a/browser/components/tests/browser/browser.ini b/browser/components/tests/browser/browser.ini
>--- a/browser/components/tests/browser/browser.ini
>+++ b/browser/components/tests/browser/browser.ini
>@@ -1,4 +1,5 @@
> [DEFAULT]
> 
> [browser_bug538331.js]
>+skip-if = !updater # test depends on update channel
Add if possible
reason = test depends on update channel

> [browser_contentpermissionprompt.js]
>diff --git a/python/mozbuild/mozbuild/mozinfo.py b/python/mozbuild/mozbuild/mozinfo.py
>--- a/python/mozbuild/mozbuild/mozinfo.py
>+++ b/python/mozbuild/mozbuild/mozinfo.py
>@@ -91,16 +91,17 @@ def build_dict(config, env=os.environ):
>     d['tsan'] = substs.get('MOZ_TSAN') == '1'
>     d['telemetry'] = substs.get('MOZ_TELEMETRY_REPORTING') == '1'
>     d['tests_enabled'] = substs.get('ENABLE_TESTS') == "1"
>     d['bin_suffix'] = substs.get('BIN_SUFFIX', '')
>     d['addon_signing'] = substs.get('MOZ_ADDON_SIGNING') == '1'
>     d['require_signing'] = substs.get('MOZ_REQUIRE_SIGNING') == '1'
>     d['official'] = bool(substs.get('MOZILLA_OFFICIAL'))
>     d['sm_promise'] = bool(substs.get('SPIDERMONKEY_PROMISE'))
>+    d['updater'] = substs.get('MOZ_UPDATER') == '1'
Run this change by one of the typical reviewers of this file (gps, ted, etc).

> 
>     def guess_platform():
>         if d['buildapp'] in ('browser', 'mulet'):
>             p = d['os']
>             if p == 'mac':
>                 p = 'macosx64'
>             elif d['bits'] == 64:
>                 p = '{}64'.format(p)
>diff --git a/toolkit/modules/tests/xpcshell/xpcshell.ini b/toolkit/modules/tests/xpcshell/xpcshell.ini
>--- a/toolkit/modules/tests/xpcshell/xpcshell.ini
>+++ b/toolkit/modules/tests/xpcshell/xpcshell.ini
>@@ -61,14 +61,15 @@ skip-if = toolkit == 'android'
> skip-if = toolkit == 'android'
> [test_sqlite_shutdown.js]
> skip-if = toolkit == 'android'
> [test_task.js]
> skip-if = toolkit == 'android'
> [test_timer.js]
> skip-if = toolkit == 'android'
> [test_UpdateUtils_url.js]
>+skip-if = !updater # LOCALE is not defined without MOZ_UPDATER
Add
reason = LOCALE is not defined without MOZ_UPDATER

> [test_UpdateUtils_updatechannel.js]
> [test_web_channel.js]
> [test_web_channel_broker.js]
> [test_ZipUtils.js]
> skip-if = toolkit == 'android'
> [test_Log_stackTrace.js]

r=me with the above
Attachment #8818603 - Flags: review?(robert.strong.bugs) → review+
addressed :rstrong's comments (reason does work in all .ini files and I verified locally it doesn't cause any problems) and asking Ted for review on the mozinfo.py patch.
Attachment #8818603 - Attachment is obsolete: true
Attachment #8818644 - Flags: review?(ted)
(In reply to Justin Wood (:Callek) from comment #5)
> I can attest that taskcluster builds won't sign inside the taskcluster Build
> task.
> 
> This sounds to me like we may need to rope in :aki and Chain-of-trust
> support into the conversation soon. So that we can think about enabling
> signing on the actual CI builds (will be the first case of CI signing in
> taskcluster, and first case of windows signing -- so far its been only for
> linux/android and for taskcluster-backed nightlies).
> 
> Aki's not back until Dec 19 though... 
> 
> Maybe for the time being (unsigned tier 2 support) we don't need this
> updater.exe test(s)?

Note, chain of trust is already supported on windows by the generic worker - see generic worker payload definition:
https://docs.taskcluster.net/manual/tasks/workers/generic-worker

We'll still need to get binaries signed etc (not sure how that works) - just wanted to call out that the CoT part in the worker is done.
Attachment #8818644 - Flags: review?(ted) → review+
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/33db940a7342
add support for MOZ_UPDATER in manifests. r=rstrong,ted.
https://hg.mozilla.org/mozilla-central/rev/33db940a7342
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
You need to log in before you can comment on or make changes to this bug.