Partner builds don't download complete update if partial fails, or show workaround information if complete fails (needed for stub release channel metrics)

VERIFIED FIXED in Firefox 16

Status

()

Toolkit
Application Update
VERIFIED FIXED
6 years ago
4 years ago

People

(Reporter: nthomas, Assigned: rstrong)

Tracking

(Depends on: 1 bug, {regression, verifyme})

Trunk
mozilla16
x86
All
regression, verifyme
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox16- verified, firefox17 verified, firefox18 verified, firefox-esr10- wontfix, firefox-esr17 fixed)

Details

Attachments

(5 attachments, 20 obsolete attachments)

3.06 KB, text/plain
Details
22.33 KB, patch
bbondy
: review+
Details | Diff | Splinter Review
3.39 KB, patch
bbondy
: review+
Details | Diff | Splinter Review
3.46 KB, patch
Details | Diff | Splinter Review
432 bytes, text/plain
Details
(Reporter)

Description

6 years ago
I was using a build from
 https://ftp.mozilla.org/pub/mozilla.org/firefox/releases/13.0.1-funnelcake11/mac/en-US/Firefox%2013.0.1.dmg
This is a partner build which appends -cck-mozilla11 to the channel, and I've
switched it to the releasetest channel to do testing ahead of 14.0.1. 

Due to bug 770996 it's expected that the partial will fail, because generating
a partner build means we redo the mac signing, which modifies
Contents/MacOS/firefox compared to the original build. Trying to apply the partial mar from the 'vanilla' build then fails.

I suspect none of this is particularly relevant to the bug - the key thing is that the partial fails to apply.

$ ./firefox -p fc
<initiate a check by going Firefox > About>
*** AUS:SVC UpdateManager:_loadXMLFileIntoArray: XML file does not exist
*** AUS:SVC getLocale - getting locale from file: resource://app/update.locale, locale: en-US
*** AUS:SVC Checker:getUpdateURL - update URL: https://aus3.mozilla.org/update/3/Firefox/13.0.1/20120614114901/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/releasetest-cck-mozilla11/Darwin%2010.8.0/mozilla11/1.0/update.xml?force=1
*** AUS:SVC gCanCheckForUpdates - able to check for updates
*** AUS:SVC Checker:checkForUpdates - sending request to: https://aus3.mozilla.org/update/3/Firefox/13.0.1/20120614114901/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/releasetest-cck-mozilla11/Darwin%2010.8.0/mozilla11/1.0/update.xml?force=1
--------------------
This looks like:
<?xml version="1.0"?>
<updates>
    <update type="minor" displayVersion="14.0.1" appVersion="14.0.1" platformVersion="14.0.1" buildID="20120713134347" detailsURL="https://www.mozilla.com/en-US/firefox/14.0.1/releasenotes/">
        <patch type="complete" URL="http://download.mozilla.org/?product=firefox-14.0.1-complete-funnelcake11&amp;os=osx&amp;lang=en-US&amp;force=1" hashFunction="SHA512" hashValue="10a314f6a8dc44cf425cdcaa50d31f0f9a9ec5edf2badc668e61a356b7a8a2492d52a3012d05f7f2c29cb29e26c8dd27265c125d6ef3d036caafb5731f226907" size="31631314"/>
        <patch type="partial" URL="http://download.mozilla.org/?product=firefox-14.0.1-partial-13.0.1-funnelcake11&amp;os=osx&amp;lang=en-US&amp;force=1" hashFunction="SHA512" hashValue="443d35f5528b1790c66c40b6e8361bd25b49325564c7988bbd4f74cc3b3c1b7289723c542d08c5132f4cdcdd7a365d861f36e07747d8c9baf004fdccc9a85769" size="11765216"/>
    </update>
</updates>
--------------------
*** AUS:SVC Checker:onProgress - 913/913
*** AUS:SVC Checker:onLoad - request completed downloading document
*** AUS:SVC Checker:getUpdateURL - update URL: https://aus3.mozilla.org/update/3/Firefox/13.0.1/20120614114901/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/releasetest-cck-mozilla11/Darwin%2010.8.0/mozilla11/1.0/update.xml?force=1
*** AUS:SVC Checker:onLoad - number of updates available: 1
*** AUS:SVC gCanApplyUpdates - testing write access /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./update.test
*** AUS:SVC gCanApplyUpdates - able to apply updates
*** AUS:SVC readStringFromFile - file doesn't exist: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status
*** AUS:SVC readStatusFile - status: null, path: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status
*** AUS:SVC UpdateManager:_loadXMLFileIntoArray: XML file does not exist
*** AUS:SVC UpdateManager:_loadXMLFileIntoArray: XML file does not exist
*** AUS:SVC Downloader:downloadUpdate - downloading from http://download.mozilla.org/?product=firefox-14.0.1-partial-13.0.1-funnelcake11&os=osx&lang=en-US&force=1 to /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.mar
*** AUS:SVC Downloader:onStartRequest - original URI spec: http://download.mozilla.org/?product=firefox-14.0.1-partial-13.0.1-funnelcake11&os=osx&lang=en-US&force=1, final URI spec: http://mozilla.mirror.uber.com.au/firefox/releases/13.0.1-funnelcake11/update/mac/en-US/firefox-13.0.1-14.0.1.partial.mar
*** AUS:SVC Downloader:onProgress - progress: 5514/11765216
*** AUS:SVC Downloader:onProgress - progress: 374754/11765216
*** AUS:SVC Downloader:onProgress - progress: 434577/11765216
... snip ...
*** AUS:SVC Downloader:onProgress - progress: 11730057/11765216
*** AUS:SVC Downloader:onProgress - progress: 11765216/11765216
*** AUS:SVC Downloader:onStopRequest - original URI spec: http://download.mozilla.org/?product=firefox-14.0.1-partial-13.0.1-funnelcake11&os=osx&lang=en-US&force=1, final URI spec: http://mozilla.mirror.uber.com.au/firefox/releases/13.0.1-funnelcake11/update/mac/en-US/firefox-13.0.1-14.0.1.partial.mar, status: 0
*** AUS:SVC Downloader:onStopRequest - setting state to: pending

<click the'Apply Update' button> - the partial fails to apply here, this is in last-update.log

EXECUTE PATCH Contents/_CodeSignature/CodeResources
unable to open destination file: Contents/_CodeSignature/CodeResources, err: 2
### execution failed

then afterwards

*** AUS:SVC gCanCheckForUpdates - able to check for updates
*** AUS:SVC gCanApplyUpdates - testing write access /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./update.test
*** AUS:SVC gCanApplyUpdates - able to apply updates
*** AUS:SVC readStatusFile - status: failed: 6, path: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status
*** AUS:SVC readStatusFile - status: failed: 6, path: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status
*** AUS:SVC UpdateService:_postUpdateProcessing - install of complete or only one patch offered failed... showing error.

A dialog box is shown (screenshot coming), behind the main window. Clicking the Continue button yields 

*** AUS:SVC readStringFromFile - file doesn't exist: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status
*** AUS:SVC readStatusFile - status: null, path: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status
*** AUS:SVC Downloader:downloadUpdate - no patch to download
*** AUS:SVC readStringFromFile - file doesn't exist: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status
*** AUS:SVC readStatusFile - status: null, path: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status

and a 'Connecting to the update server' (2nd screenshot coming for that).

I was expecting the updater to download the complete and apply that.
(Reporter)

Comment 1

6 years ago
Created attachment 642909 [details]
Dialog on startup after failure with partial
(Reporter)

Comment 2

6 years ago
Created attachment 642911 [details]
After using Continue button
Not sure if this is fallout from staging the updates in the background (bug 307181).
Nick, can you try updating with the app.update.stage.enabled pref set to false?

Comment 5

6 years ago
Am I missing something?  bgupdates is only in Firefox 15...
I don't know what's going on but here are some observations that stand out to me:

0) The update.xml only has a full listed, but the original post had both.  Maybe RelEng fixed this problem by no longer offering the partial?

1)
> EXECUTE PATCH Contents/_CodeSignature/CodeResources
> unable to open destination file: Contents/_CodeSignature/CodeResources, err: 2
> ### execution failed

I don't really understand this error, but I guess it's from OS X code signing?


2) The partial log contains this:
*** AUS:SVC readStatusFile - status: null, path: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status
*** AUS:SVC UpdateManager:_loadXMLFileIntoArray: XML file does not exist
*** AUS:SVC UpdateManager:_loadXMLFileIntoArray: XML file does not exist

3) Afterwards when the full should be applied there is a read errror:
*** AUS:SVC readStatusFile - status: failed: 6, path: /Users/nthomas/Desktop/Firefox fc11 en-US.app/Contents/MacOS/./updates/0/update.status

4) I don't think this is related to bgupdates or the maintenance service.

Comment 7

6 years ago
Yeah, this looks like a regression from the code signing stuff.
Agreed that it isn't due to staging the updates in the background (bug 307181).

Might be due to changes with the service but I doubt it.

The problem as reported is actually about trying to download the complete after a partial update has failed and not specifically about code signing which if it is a problem would be due to how the mars are generated. Basically, a complete should be downloaded and applied successfully after a partial fails.

Comment 9

6 years ago
If this is only happening on Mac, I'd be very surprised if the service is at fault here.

(In reply to Robert Strong [:rstrong] (do not email) from comment #8)
> The problem as reported is actually about trying to download the complete
> after a partial update has failed and not specifically about code signing
> which if it is a problem would be due to how the mars are generated.
> Basically, a complete should be downloaded and applied successfully after a
> partial fails.

In that case maybe we're looking at two bugs?  (The first one being about the partial failing...)
(In reply to Ehsan Akhgari [:ehsan] from comment #9)
> If this is only happening on Mac, I'd be very surprised if the service is at
> fault here.
Like I said, I doubt it as well but there were changes to the client code as well which could have caused it.

> 
> (In reply to Robert Strong [:rstrong] (do not email) from comment #8)
> > The problem as reported is actually about trying to download the complete
> > after a partial update has failed and not specifically about code signing
> > which if it is a problem would be due to how the mars are generated.
> > Basically, a complete should be downloaded and applied successfully after a
> > partial fails.
> 
> In that case maybe we're looking at two bugs?  (The first one being about
> the partial failing...)
No, Nick stated as much in one of the comments where he talked about changing the channel.
(Reporter)

Comment 11

6 years ago
What Rob said in comment #8, only longer ...

(In reply to Brian R. Bondy [:bbondy] from comment #6)
> 0) The update.xml only has a full listed, but the original post had both. 
> Maybe RelEng fixed this problem by no longer offering the partial?

Yes, we fixed it on the releasetest and release channels. You can still see it on betatest-cck-mozilla11 by setting betatest in channel-prefs.js

> 1)
> > EXECUTE PATCH Contents/_CodeSignature/CodeResources
> > unable to open destination file: Contents/_CodeSignature/CodeResources, err: 2
> > ### execution failed
> 
> I don't really understand this error, but I guess it's from OS X code
> signing?

Retesting with fresh bits and a little less bleary eyedness I get
EXECUTE PATCH Contents/_CodeSignature/CodeResources
...
EXECUTE PATCH Contents/MacOS/firefox
LoadSourceFile: destination file size 57776 does not match expected size 57568
LoadSourceFile failed
### execution failed

Which is expected from resigning a repack, because something gets embedded in the binary (bug 770996).

(In reply to Ehsan Akhgari [:ehsan] from comment #9)
> If this is only happening on Mac, I'd be very surprised if the service is at
> fault here.

I haven't tested other platforms, we should!
(In reply to Nick Thomas [:nthomas] from comment #11)
> What Rob said in comment #8, only longer ...
> 
> (In reply to Brian R. Bondy [:bbondy] from comment #6)
> > 0) The update.xml only has a full listed, but the original post had both. 
> > Maybe RelEng fixed this problem by no longer offering the partial?
> 
> Yes, we fixed it on the releasetest and release channels. You can still see
> it on betatest-cck-mozilla11 by setting betatest in channel-prefs.js
> 
> > 1)
> > > EXECUTE PATCH Contents/_CodeSignature/CodeResources
> > > unable to open destination file: Contents/_CodeSignature/CodeResources, err: 2
> > > ### execution failed
> > 
> > I don't really understand this error, but I guess it's from OS X code
> > signing?
> 
> Retesting with fresh bits and a little less bleary eyedness I get
> EXECUTE PATCH Contents/_CodeSignature/CodeResources
>

So in your re-test it didn't complain about CodeResources?

> EXECUTE PATCH Contents/MacOS/firefox
> LoadSourceFile: destination file size 57776 does not match expected size
> 57568
> LoadSourceFile failed
> ### execution failed

Yeah, this is definitely due to bug 770996. It's fixed in 14.0.1, nothing we can do about it for 13.0/13.0.1 users now =(.
Some more notes:
- I manually put failed: 6 in my udpate.status file before restarting the browser and COULD reproduce with the funnelcake build
- I could not reproduce in Windows (partial failure leads to successful full)
- I could not reproduce in Mac build not funnelcake mac build. (partial failure leads to successful full)

So it seems to be funnelcake related specifically.
Thanks Brian. This leads me to believe this is a bug in the distribution code breaking app update. I'll try to find a way to protect against it from breaking app update and what is broken in the distribution code.
Summary: Firefox 13.0.1 doesn't download complete update after partial fails → Firefox 13.0.1 doesn't download complete update after partial fails when using distribution customizations
(Reporter)

Comment 15

6 years ago
If I rename Contents/MacOS/distribution/ then the partial failure is handled properly. This particular repack only adds the distribution/ directory so that confirms it's the partner code that's interfering. Presumably this is an issue on other platforms too but I haven't confirmed that.
(Reporter)

Comment 16

6 years ago
At gavin's prompting I tried removing various lines from distribution.ini. It's this one that causes the issue:
 app.partner.mozilla11="mozilla11"

So I'm speculating that something in the post update code may not be handling -cck-<foo> suffixes on the channel properly, possibly a regression from recent changes.
Were there errors in the error console with app.update.log set to true when this happens?
(Reporter)

Comment 18

6 years ago
Created attachment 645960 [details]
log for partner build not failing over to complete
(Reporter)

Comment 19

6 years ago
Created attachment 645961 [details]
log for non-partner build failing over to complete successfully

So it looks like we go down the two branches of this if block:
http://hg.mozilla.org/releases/mozilla-release/file/FIREFOX_13_0_1_RELEASE/toolkit/mozapps/update/nsUpdateService.js#l1453
(Reporter)

Comment 20

6 years ago
Created attachment 645962 [details]
log for non-partner build successfully failing over to complete

Oops, same log twice.
Attachment #645961 - Attachment is obsolete: true
I'm able to get into this situation on Windows if I manually corrupt the update.mar file.
Nick, how important do you think this bug is? I ask because we are working on other high priority bugs atm.
(Reporter)

Comment 23

6 years ago
I'll work up some version distributions for vanilla vs partner releases. We'd need some sort of hotfix to reach them anyway, but that should help inform how urgent it is versus other bugs.
(Reporter)

Comment 24

6 years ago
Created attachment 646019 [details]
Version distributions

This shows the skew in the two populations. There are significantly more partner users on all old versions:
 * 29% on 4.0 through 11.0, compared 16% for the non-partner
 * 55% on 12.0 through to 13.0.1, compared to 36% 
 * 17% on 14.0.1, compared to 48%
In absolute numbers it's 5.2 million ADIs for partner FF 12/13.0.1/13.0. 

Obviously there could be other factors that result in partner users having old builds but it seems to have gotten worse recently and this bug is a candidate. I don't have a good reason for why partials should be failing often, but will hopefully have time to find a regression range.

The data is based on blocklist ADI from 2012-07-24, comparing release to release-cck-* for Firefox 4.0 to 14.0.1. I excluded anything which totalled less than 100 counts over all versions, and release-cck-mozilla13 and 14 which artificially push up the partner 14.0.1 count. Happy to share the raw data or more methodology as needed.
Can just a complete be offered to these builds?
I am fairly certain that I just reproduced this with Firefox 5 (same screenshots, etc.) though it would be great if someone else can confirm this. I still suspect this is due to distributions due to this.
(Reporter)

Comment 27

6 years ago
(In reply to Robert Strong [:rstrong] (do not email) from comment #26)
> I am fairly certain that I just reproduced this with Firefox 5 (same
> screenshots, etc.) though it would be great if someone else can confirm
> this. I still suspect this is due to distributions due to this.

I can reproduce with 5.0 win32, as well as 4.0, 10.0, 11.0, and 12.0. Therefore not a recent a regression at all, but a problem in the distribution code forever.

Test method was to copy in the distribution/ directory from funnecake11 build into a clean install, and hardwire the update url to that for 13.0.1 (to make partial failures happen automatically). 

(In reply to Robert Strong [:rstrong] (do not email) from comment #25)
> Can just a complete be offered to these builds?

I have half a patch for the update server that would do this; to handle it with snippets would be extremely painful. 

BTW, is there a nice way to debug/make changes to nsUpdateService.js without unpacking/repacking omni.ja ?
OS: Mac OS X → All
Summary: Firefox 13.0.1 doesn't download complete update after partial fails when using distribution customizations → Partner builds don't download complete update if partial fails
(Reporter)

Comment 28

6 years ago
From the weekly meetings, there is a plan to use partner builds (funnelcake) to test the stub installer. Asa, I want to flag that this bug would mean those users have a sub-optimal update experience.

Comment 29

6 years ago
Thanks, Nick. There is an idea, not yet a plan, to use funnelcake to test the Stub. But we may end up not doing that.

Does this bug mean that if partial updates fail, our partner build users never get upgraded?
(Reporter)

Comment 30

6 years ago
(In reply to Asa Dotzler [:asa] from comment #29)
> Does this bug mean that if partial updates fail, our partner build users
> never get upgraded?

Sort of. If the partial fails they won't get updated. After enough new versions are released they'll only be offered a complete, and update may work then. It really depends on the cause of the partial failing, eg locked files may still be a problem.
(In reply to Nick Thomas [:nthomas] from comment #27)
> (In reply to Robert Strong [:rstrong] (do not email) from comment #26)
> > I am fairly certain that I just reproduced this with Firefox 5 (same
> > screenshots, etc.) though it would be great if someone else can confirm
> > this. I still suspect this is due to distributions due to this.
> 
> I can reproduce with 5.0 win32, as well as 4.0, 10.0, 11.0, and 12.0.
> Therefore not a recent a regression at all, but a problem in the
> distribution code forever.
> 
> Test method was to copy in the distribution/ directory from funnecake11
> build into a clean install, and hardwire the update url to that for 13.0.1
> (to make partial failures happen automatically). 
Thanks for verifying

> (In reply to Robert Strong [:rstrong] (do not email) from comment #25)
> > Can just a complete be offered to these builds?
> 
> I have half a patch for the update server that would do this; to handle it
> with snippets would be extremely painful. 
> 
> BTW, is there a nice way to debug/make changes to nsUpdateService.js without
> unpacking/repacking omni.ja ?
No, there isn't for any files in omni.ja.
(Reporter)

Comment 32

6 years ago
I tested the complete-fails case too (using FF11.0 to get a complete-only update offer, funnelcake11's distribution/ folder copied in, setting 'failed: 6' in update.status after download). The same dialogs are shown as in the partial case (attachment 642909 [details], attachment 642911 [details]).

Are there failure modes which affect partials but not completes ? I'm wondering if it helps to only offer completes.
The difference is that when a partial fails to apply we immediately on startup try to download the complete whereas when a complete fails it notifies the user that they should download. As long as the partial can apply successfully to the build it is advertised to that uses the custom distributions code then I think it is ok.
(Reporter)

Comment 34

6 years ago
Sorry, I didn't follow what you mean. The distribution code disrupts both the failover to the complete, and notifying the user if the complete fails. Do we have any information on what proportion of failed partials lead to failed completes ?
(In reply to Nick Thomas [:nthomas] from comment #34)
> Sorry, I didn't follow what you mean. The distribution code disrupts both
> the failover to the complete, and notifying the user if the complete fails.
> Do we have any information on what proportion of failed partials lead to
> failed completes ?
Not really though there have been bug reports (there haven't been that many) where a user has modified a file, tried to apply a partial, and the complete succeeds.
I did some more digging and found bug 770996, which tracked an issue where partner build signing broke partial MARs for partner builds. This is definitely a dupe of that, but I'm not sure that bug is actually fixed anymore =(
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 770996
Augh, wrong bug. Sorry!
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---

Updated

6 years ago
tracking-firefox16: --- → ?
tracking-firefox16: ? → +
cmore managed to get his Funnelcake14, 14.0.1 build updated to 15.0.1 despite having a failed partial. I'll attach his backup-update.log and last-update.log showing the failure and subsequent success with the complete. Unless I'm misunderstanding the failure conditions, it seems like he should be hitting this bug.
Created attachment 660179 [details]
cmore's backup-update.log
Created attachment 660180 [details]
cmore's last-update.log

Updated

6 years ago
tracking-firefox16: + → -

Comment 41

6 years ago
I am not clear on the status of this bug. Is it covered (and hence fixed) by bug 770996?
(Reporter)

Comment 42

6 years ago
No. bug 770996 is a specific way of breaking partial updates on mac, related to signing to support Gatekeeper on 10.8. It will be resolved for updates to 16.0.

This bug is about the update code not handling any update failures, of which the worst case is not trying the complete if the partial should fail.
Specifically, this is about the distribution code breaking updates. I know the distribution code does some things with preferences on startup which might be the problem.
(Reporter)

Updated

6 years ago
Summary: Partner builds don't download complete update if partial fails → Partner builds don't download complete update if partial fails, or show workaround information if complete fails
Assignee: nobody → tabraldes
Created attachment 667693 [details]
log_xre.txt from failure to rollover partial->complete

I tested and reproduced this using the funnelcake build in the link above and using the latest Nightly.

Using the XRE_CONSOLE_LOG env var and the 'app.update.log' pref, I got the attached log (with latest Nightly).

This error showed up in both logs and seems worth investigating
---
[JavaScript Error: "NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIStringBundle.GetStringFromName]" {file: "XStringBundle" line: 21}]
---

I'm cooking up a local build to investigate further.
  If an "update.status" exists with contents "failed: 6", and if the "channel" property in the associated active-update.xml doesn't match the channel in channel-prefs.js, Firefox will show the "an update is available" dialog, but with info missing from the dialog.  The correct behavior is to show no dialog at all in this case, since no update is actually available.

  If the pref "app.partner.mozilla11" exists, then when a partial update is downloaded, the "channel" property in the active-update.xml file will be "${channel}-cck-${app.partner.mozilla11}".  "nsUpdateService.js" expects that the "channel" property in active-update.xml will match the channel that Firefox is currently on [1], but if the pref "app.partner.mozilla11" exists, then that will never be the case.

  We definitely need to fix "nsUpdateService.js" to handle the mismatched channels case better.

  I'm not sure yet whether the appending of "-cck-${app.partner.mozilla11}" to the "channel" property of "active-update.xml" happens on the update server or if it's done by Firefox, or why we're doing it at all.  If we want to continue doing that appending, then "nsUpdateService.js" needs to be updated to handle the situations where we append to the "channel" property.

[1] https://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateService.js?rev=f7505a5f3770#2442
Created attachment 669798 [details] [diff] [review]
Part 1: Do not prompt user if a partial update failed previously but whose channel is now wrong

This patch fixes our handling of this case:
  1) Partial update is downloaded
  2) Partial update fails to apply
  3) User changes update channels
  4) User restarts Firefox

Previously, an incorrect dialog would appear telling the user that an update is available.  In reality, no update may be available.
Attachment #669798 - Flags: review?(robert.bugzilla)
Created attachment 669801 [details] [diff] [review]
Part 2: Ensure that distribution prefs have been applied before using UpdateChannel

This patch makes it so that UpdateChannel.js applies distribution prefs before accessing and returning the channel name.

This patch causes the distribution prefs to be applied more than once, so it's likely to need some changes.
Attachment #669801 - Flags: feedback?(robert.bugzilla)
Created attachment 669806 [details] [diff] [review]
Part 2: Ensure that distribution prefs have been applied before using UpdateChannel

The underlying issue that patch 2 is trying to solve is that `DistributionCustomizer.applyPrefDefaults()` seems to be called [1] *after* the update service uses UpdateChannel.js to check against the active update's channel.

This updated patch ensures that the logic of `DistributionCustomizer.applyPrefDefaults()` only executes once.

[1] https://mxr.mozilla.org/mozilla-central/source/browser/components/nsBrowserGlue.js?rev=479174108809#332
Attachment #669801 - Attachment is obsolete: true
Attachment #669801 - Flags: feedback?(robert.bugzilla)
Attachment #669806 - Flags: review?(robert.bugzilla)
(Reporter)

Comment 49

5 years ago
(In reply to Tim Abraldes (:tabraldes) from comment #45)
>   I'm not sure yet whether the appending of "-cck-${app.partner.mozilla11}"
> to the "channel" property of "active-update.xml" happens on the update
> server or if it's done by Firefox, or why we're doing it at all.  If we want
> to continue doing that appending, then "nsUpdateService.js" needs to be
> updated to handle the situations where we append to the "channel" property.

In case there wasn't an IRL conversation around this point, partner builds are copies of the builds we offer on mozilla.org ("vanilla"), except for some customizations (bookmarks, bundled extensions etc). A partner build will query for updates on the ${channel}-cck-${app.partner.*} channel so that we can offer specific updates to that. In practise the update server will almost always fall back to ${channel} because no specific update has been needed or configured, so the vanilla update files are used. Sometimes we see multiple different -cck-<partner> appended to ${channel}, corresponding to users with more than partner build combined, eg release-cck-yahoo-cck-google. :-S
(In reply to Nick Thomas [:nthomas] from comment #49)
> (In reply to Tim Abraldes (:tabraldes) from comment #45)
> >   I'm not sure yet whether the appending of "-cck-${app.partner.mozilla11}"
> > to the "channel" property of "active-update.xml" happens on the update
> > server or if it's done by Firefox, or why we're doing it at all.  If we want
> > to continue doing that appending, then "nsUpdateService.js" needs to be
> > updated to handle the situations where we append to the "channel" property.
> 
> In case there wasn't an IRL conversation around this point, partner builds
> are copies of the builds we offer on mozilla.org ("vanilla"), except for
> some customizations (bookmarks, bundled extensions etc). A partner build
> will query for updates on the ${channel}-cck-${app.partner.*} channel so
> that we can offer specific updates to that. In practise the update server
> will almost always fall back to ${channel} because no specific update has
> been needed or configured, so the vanilla update files are used. Sometimes
> we see multiple different -cck-<partner> appended to ${channel},
> corresponding to users with more than partner build combined, eg
> release-cck-yahoo-cck-google. :-S

Thanks for the explanation!  After more investigation, I discovered that "nsUpdateService.js" actually _does_ try to use the correct channel name by asking "UpdateChannel.js" what the channel name should be.  We're getting the wrong channel name because we're asking "UpdateChannel.js" for the channel name _before_ the `DistributionCustomizer` has been able to append the "-cck-*" stuff.  The second patch alleviates this by making sure `DistributionCustomizer` does its work before we get a channel name.
Blocks: 322206
Whiteboard: [stub+]
Comment on attachment 669806 [details] [diff] [review]
Part 2: Ensure that distribution prefs have been applied before using UpdateChannel

I don't think it should be UpdateChannel.jsm's responsibility to trigger applying distribution prefs - that should happen on its own already via nsBrowserGlue. If it's not happening before the update service asks for the update channel, that's the problem we need to fix (either by having the update service request it later, or by applying the distribution customizations earlier).
Attachment #669806 - Flags: feedback-
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #51)
> Comment on attachment 669806 [details] [diff] [review]
> Part 2: Ensure that distribution prefs have been applied before using
> UpdateChannel
> 
> I don't think it should be UpdateChannel.jsm's responsibility to trigger
> applying distribution prefs - that should happen on its own already via
> nsBrowserGlue. If it's not happening before the update service asks for the
> update channel, that's the problem we need to fix (either by having the
> update service request it later, or by applying the distribution
> customizations earlier).
Agreed and I plan on looking at having the distribution code apply it earlier since this could bite other code too such as the add-ons manager.
Tim, I'm going to steal this bug and look into making the distribution code run earlier during startup
Assignee: tabraldes → robert.bugzilla
Status: REOPENED → ASSIGNED
Summary: Partner builds don't download complete update if partial fails, or show workaround information if complete fails → Partner builds don't download complete update if partial fails, or show workaround information if complete fails (needed for stub release channel metrics)
Removing the blocking flag since the funnelcake study is finished. If we end up needing to revisit this funnelcake study, then we can re-evaluate this for stub triage.
Whiteboard: [stub+]
Blocks: 392967
Keywords: regression
Notes: I didn't bother including the entire code path since what is missing is simple to figure out.

Preferences are used during startup prior to everything needed being available for the distributions code. It goes through Init in Preferences.cpp
http://mxr.mozilla.org/mozilla-central/source/modules/libpref/src/Preferences.cpp#275

applyCustomizations gets called after final-ui-startup
http://mxr.mozilla.org/mozilla-central/source/browser/components/nsBrowserGlue.js#349

which in turn calls observe with reload-default-prefs for the topic on the preference service
http://mxr.mozilla.org/mozilla-central/source/browser/components/distribution.js#195

Which takes us here
http://mxr.mozilla.org/mozilla-central/source/modules/libpref/src/Preferences.cpp#352

and finally takes us to here
http://mxr.mozilla.org/mozilla-central/source/modules/libpref/src/Preferences.cpp#1083

at which point the preferences from the distribution.ini finally get read in since we now have an observer for this topic in nsBrowserGlue.js
http://mxr.mozilla.org/mozilla-central/source/browser/components/nsBrowserGlue.js#155

Basically, this code relies on Init in Preferences.cpp happening after nsBrowserGlue.js has registered for the prefservice:after-app-defaults topic
Created attachment 671683 [details] [diff] [review]
patch rev1

No idea what else can be done with the current state of things.
Attachment #671683 - Flags: review?(gavin.sharp)
Created attachment 671685 [details] [diff] [review]
patch rev2

Forgot the changes to distribution.js
Attachment #669806 - Attachment is obsolete: true
Attachment #671683 - Attachment is obsolete: true
Attachment #669806 - Flags: review?(robert.bugzilla)
Attachment #671683 - Flags: review?(gavin.sharp)
Attachment #671685 - Flags: review?(gavin.sharp)
Comment on attachment 669798 [details] [diff] [review]
Part 1: Do not prompt user if a partial update failed previously but whose channel is now wrong

Let's take this patch to a new bug
Attachment #669798 - Attachment is obsolete: true
Attachment #669798 - Flags: review?(robert.bugzilla)
Comment on attachment 671685 [details] [diff] [review]
patch rev2

Note that this patch doesn't increase the number of times the preferences are read and that it just moves it from final-ui-startup to profile-do-change.

I think the right fix her would be to move the reading of the distribution.ini preferences into Init in Preferences.cpp but I think that is more appropriate in a followup bug. I'll even take that followup bug though I might ask someone on the platform integration team to take it depending on how busy things are.
Comment on attachment 671685 [details] [diff] [review]
patch rev2

This still keeps the reload of prefs, and moves it earlier in startup - that seems kind of dangerous, perf-wise. It also makes the pref reload happen regardless of whether customizations are set, which I don't think is acceptable.

What we really need is some mechanism to ensure that customizations can be applied during the pref service's initialization. This might be tricky, given that the pref service's initialization happens very early. Perhaps earlier than the observer service's? I don't know if there are any guarantees here.
Attachment #671685 - Flags: review?(gavin.sharp) → review-
I'm going to need some sort of safe workaround for app update so it can hopefully get onto aurora and possibly beta so I spun off the distribution bug to bug 802022.
Attachment #671685 - Attachment is obsolete: true
What I'll likely due is call observe with the reload-default-prefs topic when the update service observe is called with the post-update-processing topic. This only happens after an update operation on the next startup so it won't be a perf hit except after an update where we already have other perf hits such as rebuilding caches, etc.
Comment on attachment 671740 [details] [diff] [review]
patch untested

Looks like this is causing xpcshell-test failures
https://tbpl.mozilla.org/?tree=Try&rev=cf880f464694
Attachment #671740 - Attachment is obsolete: true
I use default prefs in many of the tests since there are requirements for some of app update prefs to be default prefs. So, adding a reload of default prefs in the client code breaks many of the tests. Meh!
In case we do have an opportunity for another chemspill of FF16, we should land this fix on all branches as soon as possible (and comfortable).
I'd love to but working around the distributions bug isn't as easy as I had hoped. :(
Created attachment 672581 [details] [diff] [review]
patch running through try

The tests changed the default update channel to prevent trying to apply an empty update but that isn't really necessary and trying to continue with that makes fixing the tests extremely difficult if not impossible so I removed that as appropriate. Running this through try
https://tbpl.mozilla.org/?tree=Try&rev=c0076572305f
Attachment #672150 - Attachment is obsolete: true
Created attachment 672588 [details] [diff] [review]
patch rev4

Removed a couple of more cases where the update channel doesn't need to be set.
Attachment #672584 - Attachment is obsolete: true
Created attachment 672595 [details] [diff] [review]
patch rev5

Separated out the test changes
Created attachment 672597 [details] [diff] [review]
patch - test changes
Attachment #672588 - Attachment is obsolete: true
(In reply to Robert Strong [:rstrong] (do not email) from comment #69)
> Created attachment 672581 [details] [diff] [review]
> patch running through try
> 
> The tests changed the default update channel to prevent trying to apply an
> empty update but that isn't really necessary and trying to continue with
> that makes fixing the tests extremely difficult if not impossible so I
> removed that as appropriate. Running this through try
> https://tbpl.mozilla.org/?tree=Try&rev=c0076572305f
Wrong patch though I needed to throw that at try again. The right one is here
https://tbpl.mozilla.org/?tree=Try&rev=f4c524a0c1df
Created attachment 672639 [details] [diff] [review]
test patch rev2

Removed an unneeded logging statement I forgot to remove
Attachment #672597 - Attachment is obsolete: true
Comment on attachment 672595 [details] [diff] [review]
patch rev5

I've manually verified that this fixes this bug using a build from
https://ftp.mozilla.org/pub/mozilla.org/firefox/releases/13.0.1-funnelcake11

I modified nsUpdateService.js in omni.ja and removed the precompiled version of nsUpdateService.js from jsloader\resource\gre\components\ in omni.ja. After that the partial failed, the ui was shown correctly, the complete downloaded, and it was successfully applied.
Though the try builds haven't all completed so far so good!
https://tbpl.mozilla.org/?tree=Try&rev=f4c524a0c1df
Attachment #672581 - Attachment is obsolete: true
Attachment #660180 - Attachment is obsolete: true
Attachment #660179 - Attachment is obsolete: true
Attachment #646019 - Attachment is obsolete: true
Attachment #645962 - Attachment is obsolete: true
Attachment #645960 - Attachment is obsolete: true
Attachment #642911 - Attachment is obsolete: true
Attachment #642909 - Attachment is obsolete: true
Attachment #672595 - Flags: review?(gavin.sharp)
Comment on attachment 672595 [details] [diff] [review]
patch rev5

Tim has pointed out a likely code problem that I likely didn't experience due to not clearing the local cache files so holding off on review.
Attachment #672595 - Flags: review?(gavin.sharp)
Created attachment 672920 [details] [diff] [review]
patch rev6
Attachment #672595 - Attachment is obsolete: true
Comment on attachment 672920 [details] [diff] [review]
patch rev6

Review of attachment 672920 [details] [diff] [review]:
-----------------------------------------------------------------

I tested this patch locally and it seems to work correctly.
I did as well and it worked for me as well.

Also, an additional try server build that is all green
https://tbpl.mozilla.org/?tree=Try&rev=10ac2854f6f3
Attachment #672920 - Flags: review?(gavin.sharp)
Comment on attachment 672920 [details] [diff] [review]
patch rev6

I'd like a second set of eyes on this.
Attachment #672920 - Flags: review?(netzen)
Attachment #672639 - Flags: review?(netzen)
Comment on attachment 672920 [details] [diff] [review]
patch rev6

Review of attachment 672920 [details] [diff] [review]:
-----------------------------------------------------------------

So it looks like the problem won't actually be fixed until after a failed update. Correct?
Are we relying on the download happening in the background directly (or about dialog) after this failed update?  

It seems very possible that the user would restart their browser before getting that update, and then the DistributionCustomizer settings wouldn't be set again until another failed update happened.   And they'd have to then get another failed update in the next update.

That could lead to people not getting updates for a long time.
What about instead showing the update wizard dialog right after this failure?

Apologies if I'm misreading the code.

::: toolkit/mozapps/update/nsUpdateService.js
@@ +1630,5 @@
>  
> +    if (!update) {
> +      if (status != STATE_SUCCEEDED) {
> +        LOG("UpdateService:_postUpdateProcessing - previous patch failed " +
> +            "and no patch available");

How do we know the failure wasn't from a partial and there is a full available here?  Just because we only enter this case when !update?
Attachment #672920 - Flags: review?(netzen) → review-
Comment on attachment 672639 [details] [diff] [review]
test patch rev2

Review of attachment 672639 [details] [diff] [review]:
-----------------------------------------------------------------

I'm holding off on reviewing this for now until I r+ the other implementation patch
(In reply to Brian R. Bondy [:bbondy] from comment #84)
> Comment on attachment 672920 [details] [diff] [review]
> patch rev6
> 
> Review of attachment 672920 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> So it looks like the problem won't actually be fixed until after a failed
> update. Correct?
No. See comment #55. The problem is that after a failed updated we try to download the complete or display a page for the user to manually download the installer which happens during profile-after-change. The DistributionCustomizer happens during final-ui-startup so it hasn't applied the changes to the custom channel early enough. It really should be during the preferences init and only applies them at all by what looks to me an accident or workaround in the distribution code. Also see bug 802022

> Are we relying on the download happening in the background directly (or
> about dialog) after this failed update?  
The dialog is displayed informing the user and it is already downloading. When the user clicks OK the dialog advances to the download page. Note that this works for all non distribution builds.

> 
> It seems very possible that the user would restart their browser before
> getting that update, and then the DistributionCustomizer settings wouldn't
> be set again until another failed update happened.   And they'd have to then
> get another failed update in the next update.
It gets set during final-ui startup.

> 
> That could lead to people not getting updates for a long time.
Nope

> What about instead showing the update wizard dialog right after this failure?
We do

> Apologies if I'm misreading the code.
No problem

> ::: toolkit/mozapps/update/nsUpdateService.js
> @@ +1630,5 @@
> >  
> > +    if (!update) {
> > +      if (status != STATE_SUCCEEDED) {
> > +        LOG("UpdateService:_postUpdateProcessing - previous patch failed " +
> > +            "and no patch available");
> 
> How do we know the failure wasn't from a partial and there is a full
> available here?  Just because we only enter this case when !update?
update won't be null
Attachment #672920 - Flags: review- → review?(netzen)
Comment on attachment 672920 [details] [diff] [review]
patch rev6

Review of attachment 672920 [details] [diff] [review]:
-----------------------------------------------------------------

Given that there is another bug filed to sort out DistributionCustomizer/prefs, and given that it works the way I was asking for it to work :)
Attachment #672920 - Flags: review?(netzen) → review+
Attachment #672639 - Flags: review?(netzen) → review+
Comment on attachment 672920 [details] [diff] [review]
patch rev6

Need this to bake sooner than later so cancelling additional review.
Attachment #672920 - Flags: review?(gavin.sharp)
Pushed to mozilla-central
https://hg.mozilla.org/mozilla-central/rev/6eb49c920cd4
https://hg.mozilla.org/mozilla-central/rev/ca6044e25cd2
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago5 years ago
status-firefox-esr10: --- → affected
status-firefox16: --- → affected
status-firefox17: --- → affected
status-firefox18: --- → affected
status-firefox19: --- → affected
status-firefox-esr17: --- → affected
Resolution: --- → FIXED
Target Milestone: --- → mozilla19
status-firefox19: affected → ---
Comment on attachment 672920 [details] [diff] [review]
patch rev6

All looks good on m-c. I am going to request approval now but I will let this bake until Sunday before landing.

[Approval Request Comment]
Regression caused by (bug #): Distributions implementation in Bug 392967
User impact if declined: Future users that get a Funnelcake build will experience this bug.
Testing completed (on m-c, etc.): on m-c, TimA and I both tested locally.
Risk to taking this patch (and alternatives if risky): This patch only reloads the default preferences when the app update channel is different than expected. This only happens in Funnelcake builds and if a user manually changes the channel which should be near if not zero in reality.
String or UUID changes made by this patch: None
Attachment #672920 - Flags: approval-mozilla-release?
Attachment #672920 - Flags: approval-mozilla-beta?
Attachment #672920 - Flags: approval-mozilla-aurora?
No reports of problems and I verified locally with Sunday's update.
Attachment #672920 - Flags: approval-mozilla-release?
Attachment #672920 - Flags: approval-mozilla-release+
Attachment #672920 - Flags: approval-mozilla-beta?
Attachment #672920 - Flags: approval-mozilla-beta+
Attachment #672920 - Flags: approval-mozilla-aurora?
Attachment #672920 - Flags: approval-mozilla-aurora+
Created attachment 673916 [details] [diff] [review]
Main patch pushed to beta and release with minor change due to differences in code
Robert, can you please advise me what I need to do to verify this on Aurora, Beta, and Release?

My assumptions are as follows:
 * Partner Aurora 2012-10-22 or earlier -> Aurora 2012-10-23 fallback update
 * Partner Firefox 17.0b2 or earlier -> 17.0b3 fallback update
 * Partner Firefox 16.0.1 or earlier -> 16.0.2 fallback update

...or will this only be verifiable once we have updates available for fixed builds? (ie. Firefox 17.0b3 -> b4 update).
Keywords: verifyme
QA Contact: anthony.s.hughes
Created attachment 674043 [details]
distribution.ini for testing

This is only verifiable for fixed builds. With a fixed build you should be able to add a distribution directory in the install directory, take this attached distribution.ini, place it in the distribution directory, and rename the crashreporter.exe in the installation directory. You will need a partial update to test this.
Forgot to mention that you will need to disable the maintenance service in options.
Given comment 95, we won't be able to verify this for Firefox 17 until 17.0b3->17.0b4 partial updates are available, for Firefox 16 until 16.0.2->next partial updates are available.

Using Nightly and Aurora 2012-10-22 builds I don't find any updates. Is there a different build I should be using?
RelEng can provide info as to when it will be available on those channels and the nightly 10/22 build can be used to verify it.
(Reporter)

Comment 99

5 years ago
Unfortunately the update server doesn't have the channel fallback for nightly or aurora like it does for beta/release. That's what looks for an update on channel when nothing is found for channel-cck-foo. I'll set up a work around on the staging server and advise details.
(Reporter)

Comment 100

5 years ago
Ok, I have a temporary config in place on aus3-staging.m.o to test on nigthly/aurora. You need to be on a VPN because aus3-staging isn't accessible on a public IP.

Testing steps:
1, Install the previous nightly/aurora build (from at least the 23rd for Aurora)
2, In the install dir create a distribution folder, and save attachment 674043 [details] in it
3, rename crashreporter.exe to make the partial fail
4, start up Firefox and go to about:config
5, add a new string pref app.update.url.override, with value
 http://aus3-staging.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/%DISTRIBUTION_VERSION%/update.xml
6, check for updates ....

The only change at 5) is https://aus3. -> http://aus3-staging. The information in the update is the same.
Tried Aurora 2012-10-23 and 2012-10-22, neither of which find updates.
(Reporter)

Comment 102

5 years ago
Verified using the Mac Aurora with buildID 20121023042010 (http://hg.mozilla.org/releases/mozilla-aurora/rev/636b289899a4). The dialog saying the partial failed was shown and the complete downloaded. Perhaps the VPN requirement was missed.
(Reporter)

Comment 103

5 years ago
Nightly verified too for win32/en-US, per comment #100.

Also verified win32/en-US for 17.0b3 and 16.0.2 (set up an update path on a test channel that served them the content for the previous version, so partial guaranteed to fail; channel-prefs.js change to test channel; otherwise steps 2 and 6 of comment #100. The test update paths are gone now).
Flipping the tracking flags to verified per comment #102 and comment #103.
status-firefox16: fixed → verified
status-firefox17: fixed → verified
status-firefox18: fixed → verified
If this has been fixed on 17, it's fixed in ESR17.
status-firefox-esr17: affected → fixed
Given this is fixed in ESR 17 and we're about to EOL ESR10, wontfixing for ESR10.
status-firefox-esr10: affected → wontfix
tracking-firefox-esr10: --- → -
Duplicate of this bug: 504911
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.