Closed Bug 1014194 Opened 6 years ago Closed 6 years ago

Base implementation: updater addon hotfix

Categories

(Firefox :: General, enhancement)

x86
Windows XP
enhancement
Not set
normal
Points:
13

Tracking

()

VERIFIED FIXED
Iteration:
33.3

People

(Reporter: benjamin, Assigned: gps)

References

Details

Attachments

(7 files, 7 obsolete files)

215.84 KB, image/png
jboriss
: feedback+
Details
357.78 KB, image/png
jboriss
: feedback+
Details
10.93 KB, text/plain
Details
4.39 KB, text/plain
Details
217.87 KB, patch
rnewman
: review+
Details | Diff | Splinter Review
211.92 KB, application/x-xpinstall
Details
211.55 KB, application/x-xpinstall
Details
This bug covers the basic UI and behaviors for the updater addon hotfix. The UI is specified at 994882.

The low-level code for launching the installer and getting results is covered in bug 1014187.

This code needs to work properly in Firefox versions 10-present but should only activate itself for Firefox 26 and earlier, and needs to exclude up-to-date Firefox ESR releases. I'm filing a separate bug about a server module which will point us at the correct bits and provide an update hash.

When finished, this code needs to land at https://hg.mozilla.org/releases/firefox-hotfixes/ but we should probably do development on a local branch until we're ready to ship.

Apparently addon hotfixes are only installed by users who have automatic updating enable, but we should double-check in the code that the user has automatic updates enabled. No UI or download should happen for users who have updates disabled.

After the addon is finished, it should uninstall itself.

I am filing a separate bug about post-update data collection through FHR, but it's likely that the addon will need to copy/save relevant update logs so that the post-update data collection works.
I am actively working on this bug. It is my highest priority.
Assignee: nobody → gps
Status: NEW → ASSIGNED
Open question: For when we open a new tab (instead of displaying the notification on about:home), should the content in that new tab be local (from the hotfix) or hosted on www.mozilla.org?
Flags: needinfo?(benjamin)
Hrm. That's additional translation and building. Can we load http://www.mozilla.org/en-US/firefox/desktop/ as the background in that case?
Flags: needinfo?(benjamin) → needinfo?(jboriss)
I talked with Boriss IRL after I sent this needinfo and she said opening about:home was acceptable. I'm not complaining about doing less work :)
Flags: needinfo?(jboriss)
https://bitbucket.org/indygreg/firefox-hotfixes/commits/branch/update-hotfix contains what I have so far. As of just now, I've implemented a .properties file (albeit without proper l10n overlays yet).

At this point, the UI is nearly feature complete.

Next major task is to make it compatible with Firefox and to prettify the UI.
We decided in bug 1014200 to skip update manifests and hard code installer URLs into the hotfix.

Ben or Nick: Would it be possible to get a listing of all Firefox installer URLs and their hashes? Specifically, I need the Windows full and stub installers for the release channel for Firefox 29 for the locales listed in bug 1014535.
Flags: needinfo?(nthomas)
Flags: needinfo?(bhearsum)
Gregory, Won't we be using Firefox 30 for this?
Err, I guess it will be Firefox 30 by the time it rolls out. I'd like to plug a full set of URLs in to test with though. We can always update to 30 later.

Also, I'd like the file size as well.

For my future reference, you can manually query URLs like https://aus3.mozilla.org/update/3/Firefox/10.0/20120129021758/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/release/Darwin%2013.2.0/default/default/update.xml?force=1 and https://aus3.mozilla.org/update/3/Firefox/7.0/20110922153450/WINNT_x86-msvc/en-US/release/Windows_NT%206.1/default/default/update.xml?force=1

I suppose I could manually spider the update server. But I'd kinda like to have a list straight from the source.
I see from https://ftp.mozilla.org/pub/mozilla.org/firefox/releases/29.0.1/ that we appear to have a "win32-EUballot" OS. I'm guessing we shouldn't update existing EU Ballot users to non-EU Ballot builds. So that doubles the number of installers I need.

Also, my limitation for specific locales in comment #6 was wrong: we need all locales.
And I noticed we don't have stub installers for the EU Ballot builds.
And the EU Ballot builds aren't listed in the checksums files. e.g. https://ftp.mozilla.org/pub/mozilla.org/firefox/releases/29.0.1/SHA512SUMS

The EU Ballot builds appear to be under-loved. I'm starting to question whether I need to deal with them at all. If an EU Ballot build can upgrade to a non-EU Ballot build, then the answer is no.
rstrong might be able to shed light onto matters.
Flags: needinfo?(robert.strong.bugs)
I don't know the details of the EU Ballot builds since there was nothing done to the app update code for them.

bsmedberg wanted to go with the full installers so the download could happen in the background and the install process would happen quickly without installer UI so you should be using the full installers.

As for URLs the latest release full installers can be obtained from bouncer using the following url with the locale (e.g. AB_CD) replaced with the installs locale.
http://download.mozilla.org/?os=win&lang=AB_CD&product=firefox-latest
Flags: needinfo?(robert.strong.bugs)
Thanks for the link, Rob.

The goal is to attempt full installer download and fall back to stub installer if we can never fetch the full installer. Do you know how to get the stub installer out of download.mozilla.org?
That seems like a strange fallback since if you have difficulty downloading the full installer then the stub would most likely also have difficulty downloading the full installer since they both download from the same place. If you want a fallback I suggest falling back to opening the web page to download Firefox.
(In reply to Gregory Szorc [:gps] from comment #6)
> We decided in bug 1014200 to skip update manifests and hard code installer
> URLs into the hotfix.
> 
> Ben or Nick: Would it be possible to get a listing of all Firefox installer
> URLs and their hashes? Specifically, I need the Windows full and stub
> installers for the release channel for Firefox 29 for the locales listed in
> bug 1014535.

The best place for checksums is actually directly on FTP. You can find them in the *SUMS files in directories like https://ftp.mozilla.org/pub/mozilla.org/firefox/releases/18.0.2/SHA512SUMS. There's also detached sigs to verify against.

Looks like Rob already answered the Bouncer side of things.

It doesn't look like we have anything with a complete list of file sizes. You can get those yourself pretty easily with a HEAD request to the bouncer links, but if you want something more authoritative, I can grab them directly from FTP on Monday. Just needinfo me again if you want that.
Flags: needinfo?(nthomas)
Flags: needinfo?(bhearsum)
The stub installer verifies the certificate of the full installer instead of the checksum so you might want to do that instead if it is simpler.
Stub installers can be downloaded from
   https://download.mozilla.org/?os=win&lang=ab-CD&product=firefox-stub
That's like firefox-latest, always points to the latest stub. I suggest https, so we start with a secure 302 response, but the stub is served over SSL regardless.

re EUBallot, and actually other partner builds like Bing/Yahoo/Yandex/etc, it's going to depend what a paveover install does. If it removes the customizations in <appdir>/distribution/ then we should exclude all partners in the hotfix (eg testing for existence of app.partner preferences, see UpdateChannel.jsm).
(In reply to Nick Thomas [:nthomas] from comment #18)
> Stub installers can be downloaded from
>    https://download.mozilla.org/?os=win&lang=ab-CD&product=firefox-stub
> That's like firefox-latest, always points to the latest stub. I suggest
> https, so we start with a secure 302 response, but the stub is served over
> SSL regardless.
I'm ok with this though chances are whatever prevented the hotfix from downloading the full installer will also likely prevent the stub from downloading the full installer. It will also need to be interactive since the stub doesn't have the option to install silently, etc.

> re EUBallot, and actually other partner builds like Bing/Yahoo/Yandex/etc,
> it's going to depend what a paveover install does. If it removes the
> customizations in <appdir>/distribution/ then we should exclude all partners
> in the hotfix (eg testing for existence of app.partner preferences, see
> UpdateChannel.jsm).
It does remove the customizations directory.
I think there was confusion about the stub: when we did the original UX mockup we weren't sure whether we were going to use the full installer or the stub installer, so there were designs for both. We've since decided to use the full installer in silent mode. We will not be using the stub installer at any point.

For now let's exclude partner builds, but I'll file a bug for a command-line installer flag that doesn't pave over the distribution directory which we can use in a v2 of this project.
I'll nix support for stub installers from the hotfix. Yay simpler implementation! (This does mean our translators have been doing extra work. I also removed "already downloaded" from one string to make a single string apply to both full and stub cases.)

I've written a Python script that scrapes the HTTP server for hashes and file sizes.

I'll exclude builds with an app.partner pref from automatically updating.

I believe I'm unblocked from all concerns previously cited in this bug.
Note: the http://download.mozilla.org/?os=win&lang=AB_CD&product=firefox-latest url will always give you the latest so the hashes / file sizes won't match after a release while the certificate will match for several more years and is the same for all locales.
I've got downloading implemented. I still need to apply polish.

At this point, I need to do logging upload and add a lot of robustness. If anyone wants to test things: https://people.mozilla.org/~gszorc/hotfix-v20140527.01.xpi. The compatibility checks are disabled, so it will work everywhere. It doesn't attempt to run the downloaded installer, but you should see the notification to run it. It may leave state behind in your profile. I recommend running on a fresh profile. Latest code is being pushed to https://bitbucket.org/indygreg/firefox-hotfixes.
The notification on Windows XP doesn't have the styling issues previously seen on OS X. Attaching a screenshot so Boriss can put her stamp of approval on the UI.
Attachment #8434448 - Flags: feedback?(jboriss)
And here is the notification on Windows 7.
Attachment #8434578 - Flags: feedback?(jboriss)
I've incorporated the install launcher into the XPI. The extension will now launch the installer when it has finished downloading. The other big highlight in the last 24 hours is that logs are uploaded to EC2. I haven't yet incorporated the installer log into this. But it should be pretty straightforward.

The integration with the launcher still needs some love (mainly around inspecting return codes). That's the biggest issue remaining. Other issues should be in the death-by-a-thousand-cuts bucket.

https://people.mozilla.org/~gszorc/hotfix-v20140527.01.xpi
When installing Firefox 29+ on top of Firefox 10, I wasn't getting the "welcome to Australis tour" tab on first run. We may need to trigger that manually.

dolske, et al: what do we need to do to get that tab to display?
Flags: needinfo?(dolske)
Flags: needinfo?(dolske) → needinfo?(MattN+bmo)
One of the remaining issues I'm tracking down is an apparent race condition between the hotfix issuing a notification (via PopupNotifications) on startup and PopupNotifications actually being initialized. I've documented things at https://bitbucket.org/indygreg/firefox-hotfixes/src/98b9e385a3706c7f2b31a4756f6387add48c353e/v20140527.01/resource/update.jsm?at=default#cl-842

I'm not an expert at gBrowser/PopupNotifications initialization. It's quite possible I'm not doing something obvious to wait on gBrowser/PopupNotifications initialization.

gavin: do you have any ideas?
Flags: needinfo?(gavin.sharp)
What problem are you seeing exactly? sessionstore-windows-restored should be a fine time to show notifications, nsBrowserGlue.js's _onWindowsRestored does that in a couple of places already (though those are notification bars, not PopupNotifications).
Flags: needinfo?(gavin.sharp)
(In reply to :Gavin Sharp (email gavin@gavinsharp.com) from comment #29)
> What problem are you seeing exactly? sessionstore-windows-restored should be
> a fine time to show notifications, nsBrowserGlue.js's _onWindowsRestored
> does that in a couple of places already (though those are notification bars,
> not PopupNotifications).

I'm going to look into this in more detail today since it's one of the few remaining blockers. But it certainly looks like PopupNotifications.show() is no-op'ing (none of the callbacks get fired and no notification is displayed).
(In reply to Gregory Szorc [:gps] from comment #30)
> (In reply to :Gavin Sharp (email gavin@gavinsharp.com) from comment #29)
> > What problem are you seeing exactly? sessionstore-windows-restored should be
> > a fine time to show notifications, nsBrowserGlue.js's _onWindowsRestored
> > does that in a couple of places already (though those are notification bars,
> > not PopupNotifications).
> 
> I'm going to look into this in more detail today since it's one of the few
> remaining blockers. But it certainly looks like PopupNotifications.show() is
> no-op'ing (none of the callbacks get fired and no notification is displayed).

Upon deeper investigation, I can verify the problem is that PopupNotifications.show() is being called without error but nothing happens. Furthermore, the bug is present in Firefox 10 but not Firefox 29. I guess I will now attempt to bisect to identify what the fix is so I can work around it. (I've already tried trivial things such as sleeping at start-up to no success.)
The PopupNotifications bug was fixed in Firefox 24. Now, to find which changeset...
Attached patch Auto update hotfix (obsolete) — Splinter Review
Uploading what I have because some parts are ready for review.
Comment on attachment 8437193 [details] [diff] [review]
Auto update hotfix

Honza: Please review the Necko code in update.jsm. I mostly care about lines 950-1340. You may want to read the doc block on line 185 for more context on what is expected from this code.
Attachment #8437193 - Flags: review?(honzab.moz)
Comment on attachment 8437193 [details] [diff] [review]
Auto update hotfix

Richard: Please start familiarizing yourself with the overall structure of the code. The main high-level issue left to sort out is retry logic after failed installation. That will come in a subsequent patch.
Attachment #8437193 - Flags: feedback?(rnewman)
Comment on attachment 8437193 [details] [diff] [review]
Auto update hotfix

Review of attachment 8437193 [details] [diff] [review]:
-----------------------------------------------------------------

I skimmed the entire 20140527 dir of the repo.

I presume that the entire InstallerLauncher directory is generated by Visual Studio. Let me know if there's anything in here that you want me to check.

As usual, I tackled nits as I went.

Main comment is that we should leave trail markers as we go. We'll need them later.

::: v20140527.01/resource/update.jsm
@@ +8,5 @@
> +  "log",
> +  "manager",
> +];
> +
> +const {classes: Cc, interfaces: Ci, results: Cr, utils: Cu} = Components;

Cc isn't used, fwiw.

@@ +69,5 @@
> +
> +/**
> + * Old file appender from legacy log4moz.js.
> + *
> + * The new one doesn't suite our needs because it doesn't do appends

s/suite/suit

@@ +77,5 @@
> + * This appender, while doing main thread I/O, makes our life easier.
> + * Most clients shouldn't be dropping lots of log events, so the main
> + * thread I/O should be in check.
> + */
> +function FileAppender(file, formatter) {

Perhaps name this something like MainThreadFileAppender for clarity on less-old versions?

@@ +124,5 @@
> +      message += "\n";
> +    }
> +
> +    try {
> +      this.outputStream.writeString(message);

This will throw if `get outputStream` returns null. We'll catch it and try again, so we're doing a lot of churn on every log message if we fail to create the output stream.

Perhaps check here for `null`, so we short-circuit the log appender altogether in the worst case?

I'm assuming here that we would rather fail to capture logs than churn the shit trying to open file streams on XP. If logs are more important, then let's make that explicit.

@@ +126,5 @@
> +
> +    try {
> +      this.outputStream.writeString(message);
> +    } catch (ex) {
> +      if (ex.result == Cr.NS_BASE_STREAM_CLOSED) {

You use === elsewhere, so I presume it's available in the earliest release that you're targeting. Use it here?

@@ +217,5 @@
> + * Workflow
> + * ========
> + *
> + * Upon installation, start() is called and we check to see if this add-on
> + * is applicable. If we are not applicable, the add-on is uninstalled.

I think we might want to do a little more than this. I'm thinking ahead to the day when we deploy a successor to this add-on, or we discover that our applicability choices were wrong.

For example:

* Set a pref that we were installed, with version etc.
* Set a pref that we uninstalled ourselves, noting why.

@@ +230,5 @@
> + *
> + * When the installer is fully downloaded and ready to execute, we call
> + * _onInstallerReady(). On the first call, we attempt to execute the
> + * installer. If that goes well, the add-on is uninstalled and our work
> + * is done.

Record success in prefs. I can imagine us tracking this kind of stuff in FHR -- what if these users get jumped to 29/30, and then STILL don't follow a normal upgrade path? Identify the population.

@@ +232,5 @@
> + * _onInstallerReady(). On the first call, we attempt to execute the
> + * installer. If that goes well, the add-on is uninstalled and our work
> + * is done.
> + *
> + * We differentiate failed installations by their liklihood of recurring.

s/liklihood/likelihood

@@ +236,5 @@
> + * We differentiate failed installations by their liklihood of recurring.
> + * We assume some failures such as permission declined or permission not
> + * allowed are transient (the next time the user tries they may click a
> + * different button or may have an administrator available to type in a
> + * password, etc). Non-transient errors result in immediate add-on uninstall.

Recording why.

@@ +242,5 @@
> + * If transient install errors are detected, the add-on will display a
> + * pop-up notification of a pending Firefox upgrade on Firefox start-up.
> + * The pop-up will always be displayed on about:home (opening it if
> + * necessary). The notification will be displayed at most once per calendar
> + * day. These pop-ups will appear indefinitely until Firefox is upgraded.

Note that users with updates disabled won't be nagged by this.

@@ +250,5 @@
> + *
> + * consider HTTPS here for downloads.
> + * TLS will not add any additional verifiability since file hashes are
> + * hard-coded in this file. You can make the argument that TLS will
> + * introduce another point of failure: bad certs. TLS will, however,

There's a huge potential flaw here (and potentially wherever we verify certs during installation): bad clocks.

A TLS connection will fail if the user's clock is outside the validity period of the cert. And we know that lots of devices, presumably including lots of old XP machines with dead BIOS batteries, will have clocks that are wrong.

You should assume that clocks are set to 1980, and test accordingly.

(You can hard-code safe Date.now() ranges in this file, if you wish, and decide whether to do clock-dependent work. But that's complexity...)

@@ +417,5 @@
> +  /**
> +   * The number of times we've attempted a download.
> +   */
> +  get downloadAttempts() {
> +    return this._prefs.getIntPref("downloadAttempts");

|| 0? Later you just ++ this, and it's nice to be specific.

@@ +503,5 @@
> +      let appender = new FileAppender(this._logFile, formatter);
> +      log.addAppender(appender);
> +      fileAppender = appender;
> +
> +      // Now flush buffered messages produced before we loaded into the

Let's say "append"; flush is a loaded word in this context.

@@ +664,5 @@
> +
> +    this.shutdown();
> +
> +    // Operates on user and default despite using a user branch object.
> +    this._prefs.deleteBranch("");

I suggest having two branches: one for "whiteboard" stuff (times, counts), and one for "papers, please" -- who are we, when did we arrive, and why did we leave?

@@ +670,5 @@
> +    // The file log handle may prevent directory removal. Stop logging to
> +    // a file.
> +    if (fileAppender) {
> +      this._log.warn("Closing file appender.");
> +      fileAppender.close();

IIRC this can throw. Wrap it in a try.

@@ +700,5 @@
> +    this._log.warn("Performing a self-uninstall.");
> +
> +    this._uploadForensics(function () {
> +      this._cleanup(function () {
> +        uninstallHotfix();

I think this is the correct scope, but doesn't hurt to comment.

@@ +727,5 @@
> +    }
> +
> +    // We don't upgrade Windows XP SP1 and older because they are only
> +    // compatible up to Firefox 12.
> +    let version = this._getWindowsVersion();

What happens when Firefox is being run in compatibility mode?

@@ +1038,5 @@
> +                         .QueryInterface(Ci.nsIRequest)
> +                         .QueryInterface(Ci.nsIHttpChannel);
> +
> +    // Don't cache download because we do file resuming automatically.
> +    channel.loadFlags |= Ci.nsIRequest.LOAD_BYPASS_LOCAL_CACHE;

Probably want some more flags on here, too -- anonymous, yadda yadda.

@@ +1090,5 @@
> +
> +    let copierRequest;
> +    let channelFailed = false;
> +
> +    // TODO do we care about nsIChannelEventSink:asyncOnChannelRedirect?

Probably :(

Honza can speak to this.

@@ +1105,5 @@
> +        }
> +
> +        // Blocked by Windows parental controls.
> +        if (request.responseStatus == 450) {
> +          this._log.warn("Blocked by parental controls.");

Is this also going to prevent us from uploading the log? And can we leak that knowledge by, e.g., making a crafted DNS request?

@@ +1583,5 @@
> +
> +    let message = bundle.GetStringFromName("ready_full");
> +    let primaryMessage = bundle.GetStringFromName("install_full");
> +
> +    // We inintially display a somewhat generic message. After a week passes,

initially
Attachment #8437193 - Flags: feedback?(rnewman) → feedback+
Comment on attachment 8437193 [details] [diff] [review]
Auto update hotfix

Driveby comment, take it as you will.

>+$ import-installers.py http://download-installer.cdn.mozilla.net/pub/firefox/releases/29.0.1

We could use https, and verify the cert, to ensure we retrieve the hash information securely. The requests library supports cert verification.

FWIW, the hash and size are recorded at build time in files like
  https://ftp.mozilla.org/pub/mozilla.org/firefox/candidates/30.0-candidates/build2/win32/de/firefox-30.0.checksums
Just have to pick out the right line, the format is
  <hash> <hash function> <size> <path/filename>
The firefox/candidates/ directories don't live forever, but probably long enough for your purposes here.
I've bisected the popup notifications on startup failure to bug 809085 / https://hg.mozilla.org/mozilla-central/rev/b96decc34910. I'm not sure what that patch did that causes notifications to work on startup afterwards.

If I had to venture a guess, it would be the changes in PopupNotifications around changing how the active browser / window is detected. A browser guru should know more than me - I know little about how all that stuff works.
Matt provided necessary details over IRC.
Flags: needinfo?(MattN+bmo)
I fixed the notification issue. I was passing gBrowser to PopupNotifications.show(). This worked all the time in Firefox 10-33 but only on startup in Firefox 24+. I started passing gBrowser.selectedBrowser and Firefox 10-23 started working on startup. I don't pretend to understand why.
(In reply to Richard Newman [:rnewman] from comment #36)
> I presume that the entire InstallerLauncher directory is generated by Visual
> Studio. Let me know if there's anything in here that you want me to check.

Correct. You can largely ignore that directory.

> @@ +217,5 @@
> > + * Workflow
> > + * ========
> > + *
> > + * Upon installation, start() is called and we check to see if this add-on
> > + * is applicable. If we are not applicable, the add-on is uninstalled.
> 
> I think we might want to do a little more than this. I'm thinking ahead to
> the day when we deploy a successor to this add-on, or we discover that our
> applicability choices were wrong.
> 
> For example:
> 
> * Set a pref that we were installed, with version etc.
> * Set a pref that we uninstalled ourselves, noting why.
> 
> @@ +230,5 @@
> > + *
> > + * When the installer is fully downloaded and ready to execute, we call
> > + * _onInstallerReady(). On the first call, we attempt to execute the
> > + * installer. If that goes well, the add-on is uninstalled and our work
> > + * is done.
> 
> Record success in prefs. I can imagine us tracking this kind of stuff in FHR
> -- what if these users get jumped to 29/30, and then STILL don't follow a
> normal upgrade path? Identify the population.

I agree with the utility. Will put some persistent prefs in the next revision.
 
> @@ +250,5 @@
> > + *
> > + * consider HTTPS here for downloads.
> > + * TLS will not add any additional verifiability since file hashes are
> > + * hard-coded in this file. You can make the argument that TLS will
> > + * introduce another point of failure: bad certs. TLS will, however,
> 
> There's a huge potential flaw here (and potentially wherever we verify certs
> during installation): bad clocks.
> 
> A TLS connection will fail if the user's clock is outside the validity
> period of the cert. And we know that lots of devices, presumably including
> lots of old XP machines with dead BIOS batteries, will have clocks that are
> wrong.
> 
> You should assume that clocks are set to 1980, and test accordingly.
> 
> (You can hard-code safe Date.now() ranges in this file, if you wish, and
> decide whether to do clock-dependent work. But that's complexity...)

E_TOO_MUCH_COMPLEXITY.

Good point about clocks and TLS though. Although, how much of the internet would break if your clock was reporting 1980 or 2030? Still, another point of failure that IMO is best avoided.

> @@ +417,5 @@
> > +  /**
> > +   * The number of times we've attempted a download.
> > +   */
> > +  get downloadAttempts() {
> > +    return this._prefs.getIntPref("downloadAttempts");
> 
> || 0? Later you just ++ this, and it's nice to be specific.

These low-level nsIPrefBranch getters throw if you attempt to access an undefined pref. That is why we set default pref values early in _startup() and surround the gets in forensic payload creation with try {} || 0. It should be safe to assume they are defined as 0 initially.

> @@ +727,5 @@
> > +    }
> > +
> > +    // We don't upgrade Windows XP SP1 and older because they are only
> > +    // compatible up to Firefox 12.
> > +    let version = this._getWindowsVersion();
> 
> What happens when Firefox is being run in compatibility mode?

Good question. I'll add it to the QA test plan I'm writing up.

> @@ +1105,5 @@
> > +        }
> > +
> > +        // Blocked by Windows parental controls.
> > +        if (request.responseStatus == 450) {
> > +          this._log.warn("Blocked by parental controls.");
> 
> Is this also going to prevent us from uploading the log? And can we leak
> that knowledge by, e.g., making a crafted DNS request?

I... don't know.
Attached patch Auto update hotfix (obsolete) — Splinter Review
I've fixed the notifications issue. I've incorporated rnewman's
feedback. I've rounded off some sharp corners. I've documented all
the remaining TODOs (search for "TODO").

The import-installers.py script now performs GPG verification
of the downloaded hash file. Yay crypto.

I've added a README.rst file with notes on expected behavior
and the testing plan.

The biggest remaining issue is around forensic reporting. This is
mostly a background thing and it shouldn't hold up review of some
of the subcomponents.

Honza: Please Please review the Necko and I/O code for XPCOM sanity.

Blair and Matt: Please review the code for add-on / hotfix sanity. I
tried to use earlier hotfixes as a template. But this hotfix is way more
complicated than anything that has come before.

Gavin: Please review the notifications and browser interation code.
Anything doing XUL, CSS, PopupNotifications, and browser windows is
all you.

Richard: Please do another round of feedback if you want. Feel free to
save your strength.

Georg: Please look at the interaction between your launcher and anything
else you want to look at.

Benjamin: Please look at overall structure and verify the code meets the
requirements for the hotfix. Also, I have some TODOs about forensic
reporting requirements. We may need to IRC or Vidyo to talk about those.
Attachment #8437997 - Flags: review?(honzab.moz)
Attachment #8437997 - Flags: review?(georg.fritzsche)
Attachment #8437997 - Flags: review?(gavin.sharp)
Attachment #8437997 - Flags: review?(bmcbride)
Attachment #8437997 - Flags: review?(MattN+bmo)
Attachment #8437997 - Flags: feedback?(rnewman)
Attachment #8437997 - Flags: feedback?(benjamin)
Attachment #8437193 - Attachment is obsolete: true
Attachment #8437193 - Flags: review?(honzab.moz)
And naturally I encounter a likely race condition as soon as I send the patch for review.

If the temporary downloaded installer has a file size == expected size, we attempt to resume. nsIChannel complains about the download not being resumeable (HTTP server likely complains about a Range outside of file size) and we get into an infinite loop attempting that on every start. Fix should be simple. It should only impact Honza's review.
I also found an issue on XP SP3. The UAC-like prompt that comes up has a checkbox that asks if you would like to prevent the program (the Firefox installer) from making unauthorized changes. It is checked by default. When it is checked, the installer fails very early with a cryptic warning about failing to read an ini file.

In my years as an XP user, I never recall seeing that prompt. Maybe it was because I always ran as an admin user. Maybe it's due to calling the UAC "elevation" APIs on XP? I dunno. But the default makes Firefox fail to install and that is a problem.
There are some additional notes in bug 994882 comment #20

Elevation is initiated using the runas verb on Vista and above as long as UAC is enabled. Using the runas verb without UAC enabled initiates runas which will allow the user to enter other credentials.

In all instances if we already have write access to the install dir and to HKLM in the registry the installer should not be launched with the runas verb.

If the user has write access to the install dir and not to HKLM in the registry the installer can still install the files and as I see it installing the files without updating the registry would be better than leaving them on the old version. If a future update or install runs with write access it will clean up the registry at that time.
Comment on attachment 8437997 [details] [diff] [review]
Auto update hotfix

Review of attachment 8437997 [details] [diff] [review]:
-----------------------------------------------------------------

Don't think I can add a lot here. Am impressed by how robust you've built this - nicely done.

::: v20140527.01/bootstrap.js
@@ +21,5 @@
> +  let resourceURI = ioService.newURI(data.resourceURI.spec + "/resource/", null, null);
> +  resourceHandler.setSubstitution("firefox-hotfix", resourceURI);
> +
> +  // And import the JSM as a side-effect.
> +  Cu.import("resource://firefox-hotfix/update.jsm");

Since this is always loaded, you may want to forcibly unload this JSM via Cu.unload() when the hotfix is uninstalled.
Attachment #8437997 - Flags: review?(bmcbride) → review+
Attached patch Auto update hotfix (obsolete) — Splinter Review
This patch fixes a number of issues with downloading. Temp file size >=
expected size no longer results in an errant HTTP request. I also fixed
some errors that occur if the network is not enabled. Testing these
conditions revealed a bug where we weren't calling the _doDownload cb if
the channel failed before the copier was instantiated. Necko APIs are
hard. I also added nsIEventSink and will abort the channel if the
redirect isn't internal and to the same URI. This should prevent us from
downloading captive portals, etc.

I also moved the installer .ini file generation to JavaScript. After
doing this, reboots after install weren't required anymore. I reckon the
installer wasn't reading the .ini file at all, possibly due to an
encoding issue. I have no clue how NSIS works nor how to debug it. It
works now, I don't think I care :)

Adding :Yoric for performance review. Please be gentle. I know we're
doing some main thread I/O. But, OS.File wasn't available to us in
Firefox 10. When you see a performance concern, please try to weigh its
drawbacks against the benefit of upgrading old clients to Firefox 30.
IMO the upgrade outweighs many concerns about temporary performance
issues on old clients.

Adding Robert for installer interaction review. There was conversation
on IRC last night about potentially not using the correct path for the
install location. I didn't quite follow the conversation. I hope he
weighs in here.
Attachment #8438580 - Flags: review?(robert.strong.bugs)
Attachment #8438580 - Flags: review?(honzab.moz)
Attachment #8438580 - Flags: review?(georgehdd)
Attachment #8438580 - Flags: review?(gavin.sharp)
Attachment #8438580 - Flags: review?(dteller)
Attachment #8438580 - Flags: review?(MattN+bmo)
Attachment #8438580 - Flags: feedback?(rnewman)
Attachment #8438580 - Flags: feedback?(benjamin
Attachment #8437997 - Attachment is obsolete: true
Attachment #8437997 - Flags: review?(honzab.moz)
Attachment #8437997 - Flags: review?(georg.fritzsche)
Attachment #8437997 - Flags: review?(gavin.sharp)
Attachment #8437997 - Flags: review?(MattN+bmo)
Attachment #8437997 - Flags: feedback?(rnewman)
Attachment #8437997 - Flags: feedback?(benjamin)
Comment on attachment 8438580 [details] [diff] [review]
Auto update hotfix

Wrong Georg.
Attachment #8438580 - Flags: review?(georgehdd) → review?(georg.fritzsche)
(In reply to Blair McBride [:Unfocused] from comment #46)
> ::: v20140527.01/bootstrap.js
> @@ +21,5 @@
> > +  let resourceURI = ioService.newURI(data.resourceURI.spec + "/resource/", null, null);
> > +  resourceHandler.setSubstitution("firefox-hotfix", resourceURI);
> > +
> > +  // And import the JSM as a side-effect.
> > +  Cu.import("resource://firefox-hotfix/update.jsm");
> 
> Since this is always loaded, you may want to forcibly unload this JSM via
> Cu.unload() when the hotfix is uninstalled.

I considered this. As much as I would like to do this, I'm worried about the persisted state in the update.jsm compartment getting lost. I really don't want the "manager" singleton to get destroyed.

I don't think there's a huge drawback to keeping the compartment around. I don't want to take a risk for little benefit (a few KB memory savings for the remainder of a browser session, which we know from data collection is typically short).
Comment on attachment 8438580 [details] [diff] [review]
Auto update hotfix

Review of attachment 8438580 [details] [diff] [review]:
-----------------------------------------------------------------

::: v20140527.01/resource/update.jsm
@@ +657,5 @@
> +    this._launcherLogFile.append("installer.log");
> +    this._launcherFile = this._stateDir.clone();
> +    // Give a friendly name so people don't get scared.
> +    this._launcherFile.append("FirefoxInstallLauncher.exe");
> +    this._targetDir = Services.dirsvc.get("CurProcD", Ci.nsIFile);

This is what last nights discussion was about. Per bug 755724 on Fx21+, this will be <firefoxdir>/browser.
Per rstrong in IRC you'll probably want to use the parent dir of "XREExeF", if "XREExeF" is available.
(In reply to Georg Fritzsche [:gfritzsche] from comment #50)
> Per rstrong in IRC you'll probably want to use the parent dir of "XREExeF",
> if "XREExeF" is available.

... and "XREExeF" is already around for a long time [1], so that's not helpful.

[1] http://hg.mozilla.org/mozilla-central/annotate/18c21532848a/xpcom/build/nsXULAppAPI.h#l66
I didn't check when it was added so stated if available and if it has been around a long time then it should just be used and then get its parent dir.
Comment on attachment 8438580 [details] [diff] [review]
Auto update hotfix

Review of attachment 8438580 [details] [diff] [review]:
-----------------------------------------------------------------

The install.rdf targetPlatform proposal is probably the most important comment.

One overall concern is the amount of bandwidth this hotfix may use for users where it costs money or it is limited. In the normal case it's not bad but if we get stuck re-downloading multiple times it can be significant. I'm not sure much can be done other than lowering MAX_DOWNLOAD_FAILURES.

r=me on stuff unrelated to the installer execution and diagnostic/peristent state stuff.

::: README
@@ +33,5 @@
>  v20140319.01 - Bug 985627 - A hotfix to temporarily turn off malware blocklist and
>                 allowlist, as they were causing higher loads on the servers than expected.
>                 Rolled out to Firefox 27.0 - 28.*.
> +v20140527.01 - Bug 1014194 - Automatically update old clients stuck on old
> +               versions.

Nit: there is redundancy with two mentions of "old" IIUC

::: v20140527.01/README.rst
@@ +154,5 @@
> +      available.
> +
> +   d. Clicking *Start using it now* in the notification should result
> +      in an install attempt. See #5. The notification should be removed
> +      on click.

Doesn't Firefox need to be shutdown for the install so that its files aren't locked? There is no mention of that in the readme AFAICT. What is the UX for that?

::: v20140527.01/bootstrap.js
@@ +42,5 @@
> +  // startup.
> +}
> +
> +function startup(data, reason) {
> +  configureResourceHandler(data);

Just an FYI that configureResourceHandler will get called twice (once by install and once by startup) in some cases IIRC. It seems like this is fine for the current code in configureResourceHandler.

::: v20140527.01/content/notification.css
@@ +1,1 @@
> +#upgrade-notification {

Nit: missing license

::: v20140527.01/content/notification.xml
@@ +1,3 @@
> +<?xml version="1.0"?>
> +
> +<!DOCTYPE bindings [

Nit: missing license

::: v20140527.01/install.rdf
@@ +6,5 @@
> +  <Description about="urn:mozilla:install-manifest">
> +    <em:id>firefox-hotfix@mozilla.org</em:id>
> +    <em:version>20140527.01</em:version>
> +    <em:bootstrap>true</em:bootstrap>
> +    <em:strictCompatibility>true</em:strictCompatibility>

You should include <em:targetPlatform>WINNT</em:targetPlatform> so this only gets downloaded/installed on Windows. It looks like that was supported in Fx4 (bug 558834). Example: http://hg.mozilla.org/releases/firefox-hotfixes/file/844b1d518c93/v20120430.01/install.rdf#l10

@@ +17,5 @@
> +      <Description>
> +        <!-- Firefox -->
> +        <em:id>{ec8030f7-c20a-464f-9b0e-13a3a9e97384}</em:id>
> +        <em:minVersion>10.0</em:minVersion>
> +        <em:maxVersion>32.*</em:maxVersion>

I'm assuming this is 32.* so that the extension will still run after an upgrade to 30+. IIRC, the bootstrap methods don't run when an extension isn't compatible with the application version but I could be wrong. If that's the case, we need to make sure this maxVersion stays high enough.

::: v20140527.01/resource/update.jsm
@@ +299,5 @@
> +  this._retryTimer = null;
> +}
> +UpgradeManager.prototype = {
> +  // Where to store our preferences.
> +  PREFS_BRANCH: "hotfix.v20140527.01.",

Nit: We've been storing other hotfix prefs under extensions.[firefox-]hotfix.

@@ +331,5 @@
> +  // TODO need production URL with SSL.
> +  UPLOAD_URL: "http://ec2-54-185-43-79.us-west-2.compute.amazonaws.com:8080/submit/hotfix/hotfix",
> +
> +  // Value of startup.homepage_override_url to set on successful upgrade.
> +  UPGRADE_HOMEPAGE_OVERRIDE: "https://www.mozilla.org/projects/firefox/%VERSION%/whatsnew/?oldversion=%OLD_VERSION%",

As I mentioned on IRC the other day, this URL differs from the one we recently used for 29 (see bug 987480) although it seems to redirect to the same place (which I didn't realize) so it's probably fine.

@@ +609,5 @@
> +
> +    // Back up the upgrade URL in case it is set.
> +    if (Services.prefs.prefHasUserValue("startup.homepage_override_url")) {
> +      // But only if we haven't backed it up yet (so we don't overrite the backup.)
> +      if (!this._prefs.prefHasUserValue("override_url_backup_performed")) {

Looks like you don't set this pref anywhere. I think you meant to remove the "_performed" suffix.

@@ +1591,5 @@
> +    const lines = [
> +      "[Install]",
> +      "PreventRebootRequired=true",
> +      "InstallDirectoryPath=%InstallPath%",
> +      "RemoveDistributionDir=true",

I thought we didn't want to remove the distribution directory in the event we wanted to deploy the hotfix to partner builds? Shouldn't it be false?

@@ +1802,5 @@
> +      // JavaScript's Date type. 2^36 in milliseconds corresponds to ~795 days.
> +      timeout: Math.pow(2, 36),
> +
> +      // Keep showing during location changes. Upgrades are important!
> +      persistWhileVisible: true,

I'm not sure how about:home would cause location changes but okay.
Attachment #8438580 - Flags: review?(MattN+bmo) → review+
Comment on attachment 8438580 [details] [diff] [review]
Auto update hotfix

(In reply to Matthew N. [:MattN] from comment #53)
>...
> ::: v20140527.01/README.rst
> @@ +154,5 @@
> > +      available.
> > +
> > +   d. Clicking *Start using it now* in the notification should result
> > +      in an install attempt. See #5. The notification should be removed
> > +      on click.
> 
> Doesn't Firefox need to be shutdown for the install so that its files aren't
> locked? There is no mention of that in the readme AFAICT. What is the UX for
> that?
The PreventRebootRequired=true in the ini file moves the files out of the way prior to the install.

>...
> @@ +1591,5 @@
> > +    const lines = [
> > +      "[Install]",
> > +      "PreventRebootRequired=true",
> > +      "InstallDirectoryPath=%InstallPath%",
> > +      "RemoveDistributionDir=true",
> 
> I thought we didn't want to remove the distribution directory in the event
> we wanted to deploy the hotfix to partner builds? Shouldn't it be false?
Yes, that should be false.

I have been told that for some cases that browser is being appended to the install dir and that will need to be fixed before r+. I suspect that using the parent dir of XREExeF should just work.

Overall looks good. I'll take another look after those items have been resolved.
Attachment #8438580 - Flags: review?(robert.strong.bugs) → review-
gps, do I understand correctly that you want me to look at bootstrap.js and update.jsm?
Flags: needinfo?(gps)
Comment on attachment 8438580 [details] [diff] [review]
Auto update hotfix

Gave relevant feedback in person. Mainly about saving off state from the "old" Firefox and only submitting it in the "new" Firefox once FHR notifications are shown and opt-out UI is available.
Attachment #8438580 - Flags: feedback?(benjamin)
Heads up: I completely refactored forensics upload in the yet-to-be-pushed code. Along the way, I also moved all state capture into a local file (as opposed to prefs). It was necessary to eliminate some cracks that would form if prefs didn't get saved (which we know from FHR to be something that happens in the wild).
(In reply to David Rajchenbach Teller [:Yoric] from comment #55)
> gps, do I understand correctly that you want me to look at bootstrap.js and
> update.jsm?

Yes.
Flags: needinfo?(gps)
Attached patch Auto update hotfix (obsolete) — Splinter Review
- Now using XREExeF.parent to determine install location
- Now only performing forensic upload if it is allowed via FHR or
  Telemetry
  - This required integrating with some FHR APIs post upgrade.
- Removed distribution dir removal from installer ini
- Now storing most state in a JSON file instead of prefs
- Addressed all of MattN's feedback

I'm now confident things are very close to final and am asking for full
review. I expect (hope really) there will only be small changes from
this point out.

rnewman: please pay special attention to the newly-added code that
interacts with FHR APIs post upgrade.
Attachment #8439402 - Flags: review?(robert.strong.bugs)
Attachment #8439402 - Flags: review?(rnewman)
Attachment #8439402 - Flags: review?(honzab.moz)
Attachment #8439402 - Flags: review?(georg.fritzsche)
Attachment #8439402 - Flags: review?(gavin.sharp)
Attachment #8439402 - Flags: review?(dteller)
Attachment #8439402 - Flags: review?(benjamin)
Attachment #8438580 - Attachment is obsolete: true
Attachment #8438580 - Flags: review?(honzab.moz)
Attachment #8438580 - Flags: review?(georg.fritzsche)
Attachment #8438580 - Flags: review?(gavin.sharp)
Attachment #8438580 - Flags: review?(dteller)
Attachment #8438580 - Flags: feedback?(rnewman)
Duplicate of this bug: 1014215
The new data upload functionality works iff Telemetry or FHR is enabled and allowing upload.

On upgrade, we now wait for FHR to prompt about the notification. This will occur after 24 hours and may require multiple restarts. Once FHR has prompted, we uninstall immediately, performing an upload if allowed by FHR policy.

This is the most conservative approach possible because it avoids getting too deep into FHR APIs. The downside is we'll likely have 1+ day lag between upgrade and receiving the upload from clients upgrading from Firefox <21. It also means the hotfix will persist during this time. This is somewhat annoying. Unless we want to introduce slightly more complexity, I'm tempted to leave it this way.

Benjamin: This decision is your call. Please address in your review.
Hard to do reviews on a patch that constantly changes.  I have a second draft now on another obsolete patch..  I will give review for the one I've started the last time on and then check on most recent what has changed.
(In reply to Honza Bambas (:mayhemer) from comment #62)
> Hard to do reviews on a patch that constantly changes.  I have a second
> draft now on another obsolete patch..  I will give review for the one I've
> started the last time on and then check on most recent what has changed.

The Necko code has hardly changed in the last few days. The last major change was adding nsIChannelEventSink and that was mostly additive.

(In reply to David Rajchenbach Teller [:Yoric] from comment #63)
> Same here.

I shouldn't need to upload any major changes any more. Please start your reviews.
Comment on attachment 8438580 [details] [diff] [review]
Auto update hotfix

Review of attachment 8438580 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!  Good implementation eventually.

Is there a way to test this?  I can rip the code, put to some extension and try to play with it, but that seems a lot of work.  This is complex and should be checked and I usually have a rule to not give r+ until I test myself.  Or am ensured on good testing ;)

::: v20140527.01/resource/update.jsm
@@ +100,5 @@
> +
> +  _populateOutputStream: function () {
> +    // First create a raw stream. We can bail out early if that fails.
> +    try {
> +      this._outputStream = FileUtils.openFileOutputStream(this._file,

this is _fileOutputStream, but (see bellow) don't refer this stream at all, it's kept alive by the converter and you should not touch the underlying file stream at all.

@@ +137,5 @@
> +  close: function () {
> +    if (this._outputStream) {
> +      this._outputStream.close();
> +      this._outputStream = null;
> +      this._converterStream = null;

you should close just the converter and null out _outputStream.  Closing the _outputStream may prevent any buffered data pending in the converter to be written which close() on the converter should take care to write.  And generally it's not a good idea to close an underlying stream ;)

@@ +143,5 @@
> +  },
> +
> +  flush: function () {
> +    if (this._outputStream) {
> +      this._outputStream.flush();

Do this on the converter, not on the file stream.

@@ +171,5 @@
> +
> +let memoryAppender = new InMemoryAppender();
> +this.log.addAppender(memoryAppender);
> +
> +let fileAppender;;

;; and maybe give this a prefix like gFileAppender ?

@@ +1101,5 @@
> +    this.downloadAttempts++;
> +    this._log.warn("Starting download. Attempt " + this.downloadAttempts);
> +
> +    // No existing installer file. Download it.
> +    this._doDownload(function onDownload(error, file) {

what is the 'file' arg here for?

@@ +1156,5 @@
> +                         .QueryInterface(Ci.nsIRequest)
> +                         .QueryInterface(Ci.nsIHttpChannel);
> +
> +    // Don't cache download because we do file resuming automatically.
> +    channel.loadFlags |= Ci.nsIRequest.LOAD_BYPASS_LOCAL_CACHE;

Comment says you want to prevent the file to be also stored to the cache.  But the correct flag is nsIRequest.INHIBIT_CACHING

@@ +1228,5 @@
> +          newChannel.QueryInterface(Ci.nsIHttpChannel);
> +        } catch (e) {
> +          this._log.warn("New channel is not an nsIHttpChannel.");
> +          callback.onRedirectVerifyCallback(Cr.NS_ERROR_NO_INTERFACE);
> +          return;

it's enough to throw from the method.  you are obligated to call the callback only when you return OK.  but this is alright too, if tested, then leave it.

@@ +1262,5 @@
> +    let copierRequest;
> +    let channelFailed = false;
> +
> +    channelListener.init(pipe.outputStream, {
> +      onStartRequest: function (request, context) {

Note: versions < 27 don't check if the response entity size is what we expect.  but it's OK since you check the file size and also do a checksum, so you are OK.  Just that when the partial response is somehow broken (bad transparent proxy, whatever) you may get weird errors in the log.

@@ +1308,5 @@
> +        //
> +        // For now, we go with blowing away the partial download.
> +        if (request.status == Cr.NS_ERROR_ENTITY_CHANGED) {
> +          this._log.warn("Entity ID changed!");
> +          resumeFromBytes = 0;

maybe drop your saved entity as well now?

@@ +1312,5 @@
> +          resumeFromBytes = 0;
> +        }
> +
> +        // Ensure a requested resume will actually work.
> +        if (request instanceof Ci.nsIResumableChannel) {

can this happen not to be true?  nsHttpChannel is always implementing this.  But there can be some JS impl override from some buggy extension...

@@ +1316,5 @@
> +        if (request instanceof Ci.nsIResumableChannel) {
> +          let entityID = null;
> +          try {
> +            // This may throw.
> +            entityID = request.entityID;

don't you need to QI first?

@@ +1342,5 @@
> +
> +        // We delay initializing the copier until this point because we don't
> +        // want the copier mucking about with the file output stream before
> +        // that stream has been seeked.
> +        copier.init(pipe.inputStream, fos, null, true, true, 1048576, true, true);

should be 

copier.init(pipe.inputStream, fos, null, true, *false*, 1048576, true, true);

since the file stream is not buffered, hence doesn't support WriteSegments.

@@ +1347,5 @@
> +
> +        copier.asyncCopy({
> +          onStartRequest: function (request, context) {
> +            this._log.warn("copier:onStartRequest()");
> +            copierRequest = request;

nit: request == copier, just QI'ed to nsIRequest

@@ +1361,5 @@
> +            }
> +
> +            if (!Components.isSuccessCode(status)) {
> +              this._log.warn("Copier didn't complete successfully.");
> +              cb(new Error("Download failed."));

I think here you should rather log that file saving failed, since it can IMO fail only when the sink stream (=file out stream) fails to do its stuff.

@@ +1388,5 @@
> +          if (copierRequest) {
> +            this._log.warn("Cancelling copier due to failed channel.");
> +            copierRequest.cancel(Cr.NS_BINDING_ABORTED);
> +            // This should trigger the copier's onStopRequest().
> +            pipe.outputStream.close();

I think it's enough to close just the pipe.  This way you can lose some of the buffered not yet written data.  Tho you won't get the error status in copier observer onstop.  But you check the channelFailed flag anyway.  Also let you know there is a closeWithStatus on nsIAsyncOutputStream that pipe do impl.  You can pass NS_BINDING_ABORTED there.  Anyway, that is no needed for what you want to achieve.  Up to you.

@@ +1431,5 @@
> +   */
> +  _verifyDownload: function (file, cb) {
> +    // nsIFile.fileSize may lie on Windows. We copy the nsIFile as a workaround.
> +    // See bug 1022704.
> +    file = file.clone();

smart!

@@ +1492,5 @@
> +
> +    let os = FileUtils.openSafeFileOutputStream(this._entityIDFile);
> +    let converter = Cc["@mozilla.org/intl/scriptableunicodeconverter"]
> +                      .createInstance(Ci.nsIScriptableUnicodeConverter);
> +    converter.charset = "UTF-8";

"us-ascii" rather?  The entry should be byte-to-byte identical.
Attachment #8438580 - Attachment is obsolete: false
Attachment #8438580 - Flags: feedback+
Attached file sanitized install log
This is a sanitized installer log that will be uploaded to Mozilla after the hotfix has upgraded Firefox. As you can see, we're replacing potentially PII data such as filesystem paths with markers like <ProgramFilesPath>.

I'm not familiar with everything the installer may write to this log. If anyone can think of anything that we'll need to sanitize that _removeSensitiveData() isn't already doing, please speak now.
Comment on attachment 8439402 [details] [diff] [review]
Auto update hotfix

Review of attachment 8439402 [details] [diff] [review]:
-----------------------------------------------------------------

First pass. I want to do another.

::: README
@@ +33,5 @@
>  v20140319.01 - Bug 985627 - A hotfix to temporarily turn off malware blocklist and
>                 allowlist, as they were causing higher loads on the servers than expected.
>                 Rolled out to Firefox 27.0 - 28.*.
> +v20140527.01 - Bug 1014194 - Automatically update clients stuck on old versions
> +               by manually downloading and running an installer.

Kinda the opposite of "manually", no? I'd say "downloading and running an installer outside of normal update channels".

::: v20140527.01/install.rdf
@@ +18,5 @@
> +      <Description>
> +        <!-- Firefox -->
> +        <em:id>{ec8030f7-c20a-464f-9b0e-13a3a9e97384}</em:id>
> +        <em:minVersion>10.0</em:minVersion>
> +        <!-- We need to run after upgrade. Ensure this is higher than upgrade

Greater than, or greater than or equal to?

::: v20140527.01/resource/update.jsm
@@ +746,5 @@
> +      // was ongoing.
> +      _installInProgress: false,
> +
> +      // Whether this client was at one time a candidate for the hotfix.
> +      _wasCompatible: false,

Perhaps a better name is "_everCompatible"?

@@ +833,5 @@
> +      return false;
> +    }
> +
> +    // We don't upgrade Windows XP SP1 and older because they are only
> +    // compatible up to Firefox 12.

Did you check to see how this behaved in compat mode?

@@ +880,5 @@
> +        return false;
> +      }
> +    } catch (e) {}
> +
> +    // Don't attempt upgrade past a specific modern version.

Run this check first?

@@ +1081,5 @@
> +
> +    this._attemptInstall();
> +  },
> +
> +  _onNotifiyDismissed: function () {

onNotifyDismissed. Which I guess means this flow hasn't been tested?

@@ +1115,5 @@
> +   * not acceptable.
> +   */
> +  _onPostUpgrade: function () {
> +    // If we know an upload is allowed, go ahead and do it without involving
> +    // all the FHR logic. This will result in some redundant logs. Meh.

Not just "meh": we have a forensic ID, so we can detect dupes trivially.

@@ +2094,5 @@
> +   * Mozilla. If upload is not applicable or not allowed (via the user not
> +   * agreeing to data collection), then this no-ops.
> +   *
> +   * All uploaded data should be anonymous and contain no identifiable
> +   * information.

Well, no *personally* identifiable information.

But something to consider here: should we send the FHR stable ID along with this upgrade forensic bundle? That would answer a lot of questions.

@@ +2153,5 @@
> +        // version because the path may be a subset of the URI.
> +
> +        // try here because .spec may throw.
> +        try {
> +          s = s.replace(uri.spec, '<' + name + 'URI>', 'g');

Should these be .asciiSpec?
Comment on attachment 8439402 [details] [diff] [review]
Auto update hotfix

Review of attachment 8439402 [details] [diff] [review]:
-----------------------------------------------------------------

Comment 65 applies to this patch as well.

::: v20140527.01/resource/update.jsm
@@ +2284,5 @@
> +        this._log.warn("Error writing file " + file.leafName + ": " +
> +                       this._err2str(status));
> +        cb(new Error("Error writing to file."));
> +        return;
> +      }

this is using the stream service pool, there is some chance that when this gets called in rapid bursts, not the last state may be actually stored.  worth some manual testing.
Attachment #8439402 - Flags: review?(honzab.moz) → feedback+
(In reply to Honza Bambas (:mayhemer) from comment #68)
> ::: v20140527.01/resource/update.jsm
> @@ +2284,5 @@
> > +        this._log.warn("Error writing file " + file.leafName + ": " +
> > +                       this._err2str(status));
> > +        cb(new Error("Error writing to file."));
> > +        return;
> > +      }
> 
> this is using the stream service pool, there is some chance that when this
> gets called in rapid bursts, not the last state may be actually stored. 
> worth some manual testing.

I was worried about that. Damn. I guess I have to reinvent DeferredSave as well. Oy.

Thank you very much for the reviews.
(In reply to Gregory Szorc [:gps] from comment #69)
> (In reply to Honza Bambas (:mayhemer) from comment #68)
> > ::: v20140527.01/resource/update.jsm
> > @@ +2284,5 @@
> > > +        this._log.warn("Error writing file " + file.leafName + ": " +
> > > +                       this._err2str(status));
> > > +        cb(new Error("Error writing to file."));
> > > +        return;
> > > +      }
> > 
> > this is using the stream service pool, there is some chance that when this
> > gets called in rapid bursts, not the last state may be actually stored. 
> > worth some manual testing.
> 
> I was worried about that. Damn. I guess I have to reinvent DeferredSave as
> well. Oy.
> 
> Thank you very much for the reviews.

Or start a thread of your own and pass it as an arg to the copier, now you are passing null, so it picks the one from the pool.
Also note that when shutting down an xpcom thread, it loops the current threads event queue, so you may get reentered when you try to shutdown your own thread in some callback.  Posting the shutdown as a single event on the main thread does the job.
Comment on attachment 8439402 [details] [diff] [review]
Auto update hotfix

Review of attachment 8439402 [details] [diff] [review]:
-----------------------------------------------------------------

The installer integration looks sound to me, some comments/questions though below.

::: v20140527.01/resource/update.jsm
@@ +1762,5 @@
> +
> +    let onProcessFinished = function (process, topic) {
> +      if (topic == "process-failed") {
> +        this._log.warn("Launcher failed.");
> +        cb(new Error("Launcher failed."));

This and the following failure don't count into this._s.installFailures intentionally?

@@ +1814,5 @@
> +          if (details.details) {
> +            this._log.warn("Failure details: " + details.details);
> +          }
> +
> +          if (exitCode == 5 || exitCode == 6) {

Named exit codes would be much nicer.

@@ +1830,5 @@
> +      }.bind(this));
> +    }.bind(this);
> +
> +    this._log.warn("Running launcher process.");
> +    let us = this;

Unused?

@@ +1833,5 @@
> +    this._log.warn("Running launcher process.");
> +    let us = this;
> +    process.runAsync(args, args.length, {
> +      QueryInterface: XPCOMUtils.generateQI([
> +        Ci.nsIWebProgressListener,

Copy'n'paste left-over?

@@ +2140,5 @@
> +      CurWorkDir: "InstallDir",
> +      Home: "Home",
> +      TmpD: "Temp",
> +      ProgF: "ProgramFiles",
> +    };

What about paths potentially outside of those dirs? Do we really need to submit them fully?
Attachment #8439402 - Flags: review?(georg.fritzsche) → review+
(In reply to Honza Bambas (:mayhemer) from comment #65)
> Is there a way to test this?  I can rip the code, put to some extension and
> try to play with it, but that seems a lot of work.  This is complex and
> should be checked and I usually have a rule to not give r+ until I test
> myself.  Or am ensured on good testing ;)

I'm periodically uploading builds to https://people.mozilla.org/~gszorc/hotfix-v20140527.01.xpi. Although, that's also where I send my experimental builds to try out from remote machines. So you are never guaranteed a stable build.

This hotfix will undergo extensive manual testing. But I encourage you to test it yourself. If you have any test scenarios not covered in the README.rst file, please let me know so I can add them to the test plan!
 
> ::: v20140527.01/resource/update.jsm
> @@ +100,5 @@
> > +
> > +  _populateOutputStream: function () {
> > +    // First create a raw stream. We can bail out early if that fails.
> > +    try {
> > +      this._outputStream = FileUtils.openFileOutputStream(this._file,
> 
> this is _fileOutputStream, but (see bellow) don't refer this stream at all,
> it's kept alive by the converter and you should not touch the underlying
> file stream at all.

Fixed.


> @@ +1101,5 @@
> > +    this.downloadAttempts++;
> > +    this._log.warn("Starting download. Attempt " + this.downloadAttempts);
> > +
> > +    // No existing installer file. Download it.
> > +    this._doDownload(function onDownload(error, file) {
> 
> what is the 'file' arg here for?

It was let over from an earlier version. Dropped it.

> @@ +1156,5 @@
> > +                         .QueryInterface(Ci.nsIRequest)
> > +                         .QueryInterface(Ci.nsIHttpChannel);
> > +
> > +    // Don't cache download because we do file resuming automatically.
> > +    channel.loadFlags |= Ci.nsIRequest.LOAD_BYPASS_LOCAL_CACHE;
> 
> Comment says you want to prevent the file to be also stored to the cache. 
> But the correct flag is nsIRequest.INHIBIT_CACHING

Great catch!

> @@ +1228,5 @@
> > +          newChannel.QueryInterface(Ci.nsIHttpChannel);
> > +        } catch (e) {
> > +          this._log.warn("New channel is not an nsIHttpChannel.");
> > +          callback.onRedirectVerifyCallback(Cr.NS_ERROR_NO_INTERFACE);
> > +          return;
> 
> it's enough to throw from the method.  you are obligated to call the
> callback only when you return OK.  but this is alright too, if tested, then
> leave it.

I'm not sure how to exercise this code path. I've seen this pattern in other code, which is where I copied it from.

> @@ +1262,5 @@
> > +    let copierRequest;
> > +    let channelFailed = false;
> > +
> > +    channelListener.init(pipe.outputStream, {
> > +      onStartRequest: function (request, context) {
> 
> Note: versions < 27 don't check if the response entity size is what we
> expect.  but it's OK since you check the file size and also do a checksum,
> so you are OK.  Just that when the partial response is somehow broken (bad
> transparent proxy, whatever) you may get weird errors in the log.

I've added this note as a comment.

> @@ +1308,5 @@
> > +        //
> > +        // For now, we go with blowing away the partial download.
> > +        if (request.status == Cr.NS_ERROR_ENTITY_CHANGED) {
> > +          this._log.warn("Entity ID changed!");
> > +          resumeFromBytes = 0;
> 
> maybe drop your saved entity as well now?

Good idea. Done.

> @@ +1312,5 @@
> > +          resumeFromBytes = 0;
> > +        }
> > +
> > +        // Ensure a requested resume will actually work.
> > +        if (request instanceof Ci.nsIResumableChannel) {
> 
> can this happen not to be true?  nsHttpChannel is always implementing this. 
> But there can be some JS impl override from some buggy extension...

You tell me ;) Again, I copied this pattern from elsewhere (services/common/rest.js). I suspect we may have seen this during internal redirects?

> @@ +1316,5 @@
> > +        if (request instanceof Ci.nsIResumableChannel) {
> > +          let entityID = null;
> > +          try {
> > +            // This may throw.
> > +            entityID = request.entityID;
> 
> don't you need to QI first?

You said a few comments back that a nsIHttpChannel is always a Ci.nsIResumeableChannel. I QI request to nsIHttpChannel earlier in the function. Does this not encapsulate nsIResumableChannel as well?

In the end, the "Download is resumable" message is logged. So this is working in practice.

> @@ +1342,5 @@
> > +
> > +        // We delay initializing the copier until this point because we don't
> > +        // want the copier mucking about with the file output stream before
> > +        // that stream has been seeked.
> > +        copier.init(pipe.inputStream, fos, null, true, true, 1048576, true, true);
> 
> should be 
> 
> copier.init(pipe.inputStream, fos, null, true, *false*, 1048576, true, true);
> 
> since the file stream is not buffered, hence doesn't support WriteSegments.

Changed.

> @@ +1347,5 @@
> > +
> > +        copier.asyncCopy({
> > +          onStartRequest: function (request, context) {
> > +            this._log.warn("copier:onStartRequest()");
> > +            copierRequest = request;
> 
> nit: request == copier, just QI'ed to nsIRequest

I changed this to a "copierStarted" variable. It seems to work. Although I'd appreciate you looking at this again after review.

> @@ +1388,5 @@
> > +          if (copierRequest) {
> > +            this._log.warn("Cancelling copier due to failed channel.");
> > +            copierRequest.cancel(Cr.NS_BINDING_ABORTED);
> > +            // This should trigger the copier's onStopRequest().
> > +            pipe.outputStream.close();
> 
> I think it's enough to close just the pipe.  This way you can lose some of
> the buffered not yet written data.  Tho you won't get the error status in
> copier observer onstop.  But you check the channelFailed flag anyway.  Also
> let you know there is a closeWithStatus on nsIAsyncOutputStream that pipe do
> impl.  You can pass NS_BINDING_ABORTED there.  Anyway, that is no needed for
> what you want to achieve.  Up to you.

I'll make an attempt at changing this. Will appreciate special attention on next review.

> @@ +1492,5 @@
> > +
> > +    let os = FileUtils.openSafeFileOutputStream(this._entityIDFile);
> > +    let converter = Cc["@mozilla.org/intl/scriptableunicodeconverter"]
> > +                      .createInstance(Ci.nsIScriptableUnicodeConverter);
> > +    converter.charset = "UTF-8";
> 
> "us-ascii" rather?  The entry should be byte-to-byte identical.

I use this function for writing JSON. I don't think I'll be introducing any non-ASCII code points. But, I think it's best to leave this as UTF-8, just to be safe.

Thanks again for the great review.
(In reply to Richard Newman [:rnewman] from comment #67)

> ::: v20140527.01/resource/update.jsm
> @@ +746,5 @@
> > +      // was ongoing.
> > +      _installInProgress: false,
> > +
> > +      // Whether this client was at one time a candidate for the hotfix.
> > +      _wasCompatible: false,
> 
> Perhaps a better name is "_everCompatible"?

Changed.

> @@ +833,5 @@
> > +      return false;
> > +    }
> > +
> > +    // We don't upgrade Windows XP SP1 and older because they are only
> > +    // compatible up to Firefox 12.
> 
> Did you check to see how this behaved in compat mode?

What do you mean by "compat mode?" Windows safe mode? Firefox compatibility mode? If the latter, all add-ons should be disabled.

> @@ +880,5 @@
> > +        return false;
> > +      }
> > +    } catch (e) {}
> > +
> > +    // Don't attempt upgrade past a specific modern version.
> 
> Run this check first?

Moved.

> @@ +1081,5 @@
> > +
> > +    this._attemptInstall();
> > +  },
> > +
> > +  _onNotifiyDismissed: function () {
> 
> onNotifyDismissed. Which I guess means this flow hasn't been tested?

Yikes. Good thing it wasn't important beyond data logging.

> @@ +2094,5 @@
> > +   * Mozilla. If upload is not applicable or not allowed (via the user not
> > +   * agreeing to data collection), then this no-ops.
> > +   *
> > +   * All uploaded data should be anonymous and contain no identifiable
> > +   * information.
> 
> Well, no *personally* identifiable information.
> 
> But something to consider here: should we send the FHR stable ID along with
> this upgrade forensic bundle? That would answer a lot of questions.

That's a question for bsmedberg. It's a bit more complicated.

You'll note that with the state refactor to a JSON file, we now keep state around in a file in the root of the profile, even after the hotfix is uninstalled. If we want this in FHR, my vote is to add a probe to mainline (say in Firefox 31 or 32) that reports on the contents of this file.

> @@ +2153,5 @@
> > +        // version because the path may be a subset of the URI.
> > +
> > +        // try here because .spec may throw.
> > +        try {
> > +          s = s.replace(uri.spec, '<' + name + 'URI>', 'g');
> 
> Should these be .asciiSpec?

No. Profile paths may have Unicode (Unicode usernames for example). I also think "Program Files" might be localized, though I'm less certain of that. I'll add a comment.
(In reply to Georg Fritzsche [:gfritzsche] from comment #72)
> ::: v20140527.01/resource/update.jsm
> @@ +1762,5 @@
> > +
> > +    let onProcessFinished = function (process, topic) {
> > +      if (topic == "process-failed") {
> > +        this._log.warn("Launcher failed.");
> > +        cb(new Error("Launcher failed."));
> 
> This and the following failure don't count into this._s.installFailures
> intentionally?

Good catch!

> @@ +1814,5 @@
> > +          if (details.details) {
> > +            this._log.warn("Failure details: " + details.details);
> > +          }
> > +
> > +          if (exitCode == 5 || exitCode == 6) {
> 
> Named exit codes would be much nicer.

Agreed.

> @@ +1830,5 @@
> > +      }.bind(this));
> > +    }.bind(this);
> > +
> > +    this._log.warn("Running launcher process.");
> > +    let us = this;
> 
> Unused?

Yup. Removed.

> @@ +1833,5 @@
> > +    this._log.warn("Running launcher process.");
> > +    let us = this;
> > +    process.runAsync(args, args.length, {
> > +      QueryInterface: XPCOMUtils.generateQI([
> > +        Ci.nsIWebProgressListener,
> 
> Copy'n'paste left-over?

Yup. Nuked.

> @@ +2140,5 @@
> > +      CurWorkDir: "InstallDir",
> > +      Home: "Home",
> > +      TmpD: "Temp",
> > +      ProgF: "ProgramFiles",
> > +    };
> 
> What about paths potentially outside of those dirs? Do we really need to
> submit them fully?

I'm not sure what else we can do here. This is a very imprecise science. Looking at installer logs, I'm not sure what other paths could creep in. This is really a question for rstrong.

FWIW, when we launch the hotfix, I'd strongly prefer to enable it for a few minutes so we can see returns from the wild. That way, if things slip through, we can fix and re-deploy.
I do not think we should send the stable ID with these pings. We don't intend to do correlations at this point.

I don't think we should plan on soft-rolling the hotfix. Doing that within AMO would likely be very complicated, and returns from the wild are going to be delayed by a couple days anyway. We should have reasonable confidence in our testing and go with it.

The only thing I'm worried about in terms of anonymizing the paths is making sure that we deal with both long-form and short-form paths. rstrong said in the past that he is fairly confident that as long as we deal with the home directory, we won't be leaking usernames.
To deal with paths I'd go with the drive letter, an ID, and the install dir name. The ID can be along the lines of 0=ProgramFiles, 1=HomeDir, and 2=Other.
Attached patch Auto update hotfix (obsolete) — Splinter Review
I believe I've incorporated all review feedback. I also moved data
upload to the final production/TLS URL.

You can test with
https://people.mozilla.org/~gszorc/hotfix-v20140527.01.xpi
Attachment #8440111 - Flags: review?(robert.strong.bugs)
Attachment #8440111 - Flags: review?(rnewman)
Attachment #8440111 - Flags: review?(honzab.moz)
Attachment #8440111 - Flags: review?(gavin.sharp)
Attachment #8440111 - Flags: review?(dteller)
Attachment #8440111 - Flags: review?(benjamin)
Attachment #8438580 - Attachment is obsolete: true
Attachment #8439402 - Attachment is obsolete: true
Attachment #8439402 - Flags: review?(robert.strong.bugs)
Attachment #8439402 - Flags: review?(rnewman)
Attachment #8439402 - Flags: review?(gavin.sharp)
Attachment #8439402 - Flags: review?(dteller)
Attachment #8439402 - Flags: review?(benjamin)
(In reply to Gregory Szorc [:gps] from comment #74)
> (In reply to Richard Newman [:rnewman] from comment #67)
> > @@ +833,5 @@
> > > +      return false;
> > > +    }
> > > +
> > > +    // We don't upgrade Windows XP SP1 and older because they are only
> > > +    // compatible up to Firefox 12.
> > 
> > Did you check to see how this behaved in compat mode?
> 
> What do you mean by "compat mode?" Windows safe mode? Firefox compatibility
> mode? If the latter, all add-ons should be disabled.

I believe he means Windows compatibility mode where the OS will lie to your program to try an make it work like older versions of Windows and I believe it will even lie to the program through API calls and registry accesses about which Windows version is being used. I agree that we should be testing with compatibility mode e.g. set to Windows 2000 or Windows XP SP1. It's possible that users are stuck on old versions because Firefox got identified as incompatible (Windows tries to auto-detect in some cases and offer compatibility mode) and is reporting it's running on an older version of Windows than it really is. If we see from the data that many of the users are supposedly on pre-SP2, it may be worth another hotfix to figure out how many of those users are in compat. mode and perhaps try to disable it so the user can upgrade.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #76)
> I do not think we should send the stable ID with these pings. We don't
> intend to do correlations at this point.

While that's reason enough, another reason is that if we send the stable/FHR ID and accidentally send PII, we would be able to theoretically correlate FHR back to users. (Not that Mozilla would do such a thing, of course.) It's best to minimize our exposure.

> The only thing I'm worried about in terms of anonymizing the paths is making
> sure that we deal with both long-form and short-form paths. rstrong said in
> the past that he is fairly confident that as long as we deal with the home
> directory, we won't be leaking usernames.

How would I test 8.3 paths? I /think/ I would need to install Windows 95 or 98 with FAT32 and then upgrade to XP. Is there an easier way? Do we even have APIs for converting long-form paths to 8.3 versions?
No upgrade is necessary. Just when you install Firefox, try it from e.g. "C:\Progra~1\Mozil~1\" and/or when you launch Firefox, put that in the shortcut.
Review comments.
Attachment #8440111 - Flags: review?(benjamin) → superreview+
> > > > +    // We don't upgrade Windows XP SP1 and older because they are only
> > > > +    // compatible up to Firefox 12.
> > > 
> > > Did you check to see how this behaved in compat mode?
> > 
> > What do you mean by "compat mode?" Windows safe mode? Firefox compatibility
> > mode? If the latter, all add-ons should be disabled.
> 
> I believe he means Windows compatibility mode where the OS will lie to your
> program to try an make it work like older versions of Windows

Yes, exactly that. 

If you find your Firefox, Properties > Compatibility > Compatibility mode, you can go all the way back to Windows 95. Find out what version we think Windows is if you launch it in this way.
Windows compatibility mode causes the hotfix to not detect the Windows version properly (which it currently does by calling GetVersionExW()). Time to hit up Stack Overflow.
I am able to call into GetModuleHandleW() and GetProcAddress() via ctypes to look up the presence of exported functions from kernel32.dll. I have verified that I can report the actual Windows and Service Pack version via this method. Look for this code in the next patch.

FWIW, I also plan to record whether the actual version is different from reported so we can report on it. It will be very interesting to see how many clients are running in compatibility mode. I wonder if this detection would be useful as a general Firefox feature. "You are running Firefox in compatibility mode but your version of Windows fully supports Firefox. Learn how to disable compatibility mode."
(In reply to Gregory Szorc [:gps] from comment #41)
> I'll add it to the QA test plan I'm writing up.
Is this ready? Where can I find it?
Flags: needinfo?(gps)
Attachment #8440111 - Flags: review?(gavin.sharp) → review?(MattN+bmo)
Comment on attachment 8440111 [details] [diff] [review]
Auto update hotfix

Oops, Matt already reviewed this.

Matt, did you also review the "notifications and browser interation code"?
Attachment #8440111 - Flags: review?(MattN+bmo) → review?(gavin.sharp)
Flags: needinfo?(MattN+bmo)
Comment on attachment 8440111 [details] [diff] [review]
Auto update hotfix

As I read bugmail, I keep answering my own questions - sorry for the spam. Matt says he reviewed pretty much everything in comment 53.
Attachment #8440111 - Flags: review?(gavin.sharp)
Flags: needinfo?(MattN+bmo)
(In reply to Paul Silaghi, QA [:pauly] from comment #87)
> (In reply to Gregory Szorc [:gps] from comment #41)
> > I'll add it to the QA test plan I'm writing up.
> Is this ready? Where can I find it?

See the README.rst file in the patch.
Flags: needinfo?(gps)
Attached patch Auto update hotfix (obsolete) — Splinter Review
Now with detection for Windows compatibility mode. Please pay special
attention to the new function _getActualWindowsVersion() and the
refactored version checking in _isHotfixApplicable().

I also filed bug 1026030 to report compatibility mode in FHR.

It's worth noting that if you have compatibility mode enabled, running
the installer to upgrade Firefox will wipe out that flag. My guess is
compatibility mode is a setting on the binary/file and when firefox.exe
is replaced, the old value is overwritten.
Attachment #8440920 - Flags: review?(robert.strong.bugs)
Attachment #8440920 - Flags: review?(rnewman)
Attachment #8440920 - Flags: review?(honzab.moz)
Attachment #8440920 - Flags: review?(dteller)
Attachment #8440111 - Attachment is obsolete: true
Attachment #8440111 - Flags: review?(robert.strong.bugs)
Attachment #8440111 - Flags: review?(rnewman)
Attachment #8440111 - Flags: review?(honzab.moz)
Attachment #8440111 - Flags: review?(dteller)
(In reply to Paul Silaghi, QA [:pauly] from comment #87)
> (In reply to Gregory Szorc [:gps] from comment #41)
> > I'll add it to the QA test plan I'm writing up.
> Is this ready? Where can I find it?

You can get the add-on from https://people.mozilla.org/~gszorc/hotfix-v20140527.01.xpi.

IMO this is now ready for testing.
I finally got around to testing 8.3 paths and we don't sanitize them properly :(

Should hopefully be a simple patch (although I'm not sure what 8.3 path functions we have in the tree to make my life easier).
Going with what I suggested in comment #77 should suffice. I wouldn't let perfect be the enemy of good here especially since most users should have the long path since launching using our shortcuts use full paths.
Attached patch Auto update hotfix (obsolete) — Splinter Review
Fixed sanitization. It wasn't in sync with the recent switch to
XREExeF.parent. 8.3 paths are now sanitized properly.
Attachment #8440961 - Flags: review?(robert.strong.bugs)
Attachment #8440961 - Flags: review?(rnewman)
Attachment #8440961 - Flags: review?(honzab.moz)
Attachment #8440961 - Flags: review?(dteller)
Attachment #8440920 - Attachment is obsolete: true
Attachment #8440920 - Flags: review?(robert.strong.bugs)
Attachment #8440920 - Flags: review?(rnewman)
Attachment #8440920 - Flags: review?(honzab.moz)
Attachment #8440920 - Flags: review?(dteller)
Comment on attachment 8440961 [details] [diff] [review]
Auto update hotfix

For the case where the user is unable to elevate (e.g. UAC turned off or WinXP) you can end up with a path the user can't write to without an attempt to use runas or a path that the user can write to. There is no perfect solution but this case should handled and it can possibly be improved further. See bug 1014187 comment #22 for additional details. Clearing review for now and I can make time to discuss this with you today.
Attachment #8440961 - Flags: review?(robert.strong.bugs)
Comment on attachment 8440961 [details] [diff] [review]
Auto update hotfix

Review of attachment 8440961 [details] [diff] [review]:
-----------------------------------------------------------------

Caveat: As requested, this review is solely oriented towards performance and I looked only at bootstrap.js and update.jsm.

Side-note: Do you include InstallerLauncher.exe in this patch (twice) on purpose?

I realize you asked me to be gentle, but your code seems to be doing *lots* of main thread I/O, both during startup and while logging, as well as some disk flushing. I am not going to insist you remove everything, but I have the impression that most of this can be avoided, so I don't see a very good reason to perform that MTIO.

::: v20140527.01/bootstrap.js
@@ +48,5 @@
> +
> +  log.warn("startup() - " + reason);
> +
> +  // 3 == ADDON_ENABLE; 5 == ADDON_INSTALL; 7 == ADDON_UPGRADE; 8 == ADDON_DOWNGRADE
> +  let isInstall = reason == 3 || reason == 5 || reason == 7 || reason == 8;

Nit: using strings instead of numbers would probably be more readable (at least for the logs).
Also, could you name the constants?

::: v20140527.01/import-translations.py
@@ +44,5 @@
> +
> +            if not line.startswith(';'):
> +                continue
> +
> +            if line == ';Free upgrade: a new, faster Firefox is already downloaded and ready for you to use!':

That looks like a very strange way to perform translation. Was there no better way?

::: v20140527.01/resource/update.jsm
@@ +170,5 @@
> +let memoryAppender = new InMemoryAppender();
> +this.log.addAppender(memoryAppender);
> +
> +let gFileAppender;
> +

It looks like you're logging a lot, so anything you can do to avoid doing MTIO would be useful. Could you perhaps open your output stream for async wriing and feed it progressively via a pump?

@@ +337,5 @@
> +
> +  /**
> +   * The installer locale we should fetch.
> +   */
> +  get locale() {

Right now, this getter has a number of sources of main thread I/O (I count 12 calls, marked below) that could mostly be moved to OMT I/O by using NetUtil.asyncFetch and avoiding FileUtils.getFile(). If this is executed at each startup, possibly forever, I'd prefer if you could fix this.

Also, I'm trying to understand exactly what this getter returns. Is it the file produced in import-translations.py?

@@ +343,5 @@
> +      return this._locale;
> +    }
> +
> +    // Modern versions of Firefox have the locale in omni.jar.
> +    for each (let res in ["app", "gre"]) {

Just to be sure, have you checked that `for each` is supported by Firefox 10?
(it is my understanding that we're targeting FF10+)

@@ +345,5 @@
> +
> +    // Modern versions of Firefox have the locale in omni.jar.
> +    for each (let res in ["app", "gre"]) {
> +      let url = "resource://" + res + "/update.locale";
> +      let channel = Services.io.newChannel(url, null, null);

(MTIO)

@@ +347,5 @@
> +    for each (let res in ["app", "gre"]) {
> +      let url = "resource://" + res + "/update.locale";
> +      let channel = Services.io.newChannel(url, null, null);
> +      try {
> +        let is = channel.open();

(MTIO)

@@ +348,5 @@
> +      let url = "resource://" + res + "/update.locale";
> +      let channel = Services.io.newChannel(url, null, null);
> +      try {
> +        let is = channel.open();
> +        this._locale = this._readInputStream(is);

(MTIO)

@@ +353,5 @@
> +        return this._locale;
> +      } catch (e) { }
> +    }
> +
> +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);

(MTIO)

@@ +354,5 @@
> +      } catch (e) { }
> +    }
> +
> +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> +    if (!file.exists()) {

(MTIO)

@@ +356,5 @@
> +
> +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> +    if (!file.exists()) {
> +      this._log.warn("update.locale not found in XCurProcD");
> +      file = FileUtils.getFile("GreD", ["update.locale"]);

(MTIO)

@@ +357,5 @@
> +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> +    if (!file.exists()) {
> +      this._log.warn("update.locale not found in XCurProcD");
> +      file = FileUtils.getFile("GreD", ["update.locale"]);
> +      if (!file.exists()) {

(MTIO)

@@ +370,5 @@
> +    }
> +
> +    let fis = Cc["@mozilla.org/network/file-input-stream;1"]
> +                .createInstance(Ci.nsIFileInputStream);
> +    fis.init(file, 0x01 /* read */, 420 /* 0644 */, 0);

(MTIO)

@@ +371,5 @@
> +
> +    let fis = Cc["@mozilla.org/network/file-input-stream;1"]
> +                .createInstance(Ci.nsIFileInputStream);
> +    fis.init(file, 0x01 /* read */, 420 /* 0644 */, 0);
> +    this._locale = this._readInputStream(fis);

(MTIO)

@@ +435,5 @@
> +      Services.obs.addObserver(function obs() {
> +        Services.obs.removeObserver(obs, "sessionstore-windows-restored");
> +
> +        us._startup(isInstall);
> +      }, "sessionstore-windows-restored", false);

Are you certain that your code is called before sessionstore-windows-restored?

@@ +447,5 @@
> +   *
> +   * The instance is not usable until this is called. This should only
> +   * be called once the browser has been fully initialized.
> +   */
> +  _startup: function (isInstall) {

Nit: `_startup` is a bit misleading.

@@ +475,5 @@
> +        this._log.warn("Upgrade appears to not have worked. Will attempt again.");
> +        this._s._installInProgress = false;
> +      }
> +
> +      this._saveState();

Do we actually need to save state if Firefox closes while an install is running? Is that piece of statistics really important?

@@ +536,5 @@
> +      this._log.warn("Resetting lastNotifyDay because of apparent clock skew.");
> +      this._s.lastNotifyDay = 0;
> +    }
> +
> +    // Back up the upgrade URL in case it is set.

I'm trying to understand why we take the trouble to do that.

@@ +603,5 @@
> +      this.tryToDownloadAndUpdate();
> +    }.bind(this));
> +  },
> +
> +  _getInstallersJSON: function () {

Again main thread I/O that could be avoided by using NetUtil.asyncFetch or possibly XHR.

@@ +681,5 @@
> +      this._log.removeAppender(gFileAppender);
> +      gFileAppender = null;
> +    }
> +
> +    if (this._stateDir && this._stateDir.exists()) {

You could avoid the call to exists() and just call remove() immediately. That's one less piece of MTIO.

@@ +935,5 @@
> +    // We are upgrading to release builds, so only target the release channel.
> +    try {
> +      let channel = Services.prefs.getCharPref("app.update.channel");
> +      if (channel != "release") {
> +        this._log.warn("Not applicable - not release channel: " + channel);

Incidentally, a misconfigured channel might be the reason for which Firefox has never been upgraded. Should we do something here?

@@ +1193,5 @@
> +    // That means we can use its API.
> +    try {
> +      let service = Cc["@mozilla.org/datareporting/service;1"]
> +                      .getService(Ci.nsISupports)
> +                      .wrappedJSObject;

Out of curiosity, can't you use the jsm?

@@ +1260,5 @@
> +    if (this._installerFile.exists()) {
> +      this._log.warn("Existing installer present. Verifying.");
> +
> +      // TODO this could happen on startup, triggering up to 30MB in read I/O.
> +      // There will be performance impact. Do we care?

If I understand correctly your code, this is called only after the session has been restored. In this case, we are most likely past the high disk contention peak during startup, so that should be ok. If you wish to add a further delay by a few seconds, just to be sure that we are not too busy hitting the cache, that works for me, too.

@@ +1532,5 @@
> +        } else {
> +          openFlags |= 0x08 | 0x20; // create | truncate
> +        }
> +
> +        // NOTE: safe-file-output-stream is buggy in append mode (bug 834042).

Good thing we don't need it :)

@@ +1540,5 @@
> +
> +        // We delay initializing the copier until this point because we don't
> +        // want the copier mucking about with the file output stream before
> +        // that stream has been seeked.
> +        copier.init(pipe.inputStream, fos, null, true, false, 1048576, true, true);

Why don't you use NetUtil.asyncFetch?

@@ +1853,5 @@
> +      // it sends "process-finished" and sets a non-zero process.exitValue.
> +      // If the process exits with a non-0 code, process.exitValue is
> +      // defined. However, if the process exits successfully, .exitValue
> +      // is undefined. Furthermore, .exitValue appears to sometimes be
> +      // undefined on non-0 exit code. Oy.

Yeah... I wonder if someone is working on a replacement API.

@@ +2279,5 @@
> +
> +  /**
> +   * Convert a binary string to its hex representation.
> +   */
> +  _bin2hex: function (b) {

That looks costly. Do I understand correctly that we are only using this on short strings?

@@ +2297,5 @@
> +  /**
> +   * Attempt to resolve an integer error code to a string name.
> +   */
> +  _err2str: function (e) {
> +    for (let k in Cr) {

Have you checked that `for ...in` is available in FF10?

@@ +2317,5 @@
> +   *        binary.
> +   */
> +  _readFileToString: function (file, cb) {
> +    let fis = Cc["@mozilla.org/network/file-input-stream;1"]
> +                .createInstance(Ci.nsIFileInputStream);

I suspect that XHR would be much faster and more convenient.

@@ +2338,5 @@
> +    pump.asyncRead({
> +      onStartRequest: function () {},
> +      onDataAvailable: function (request, context, is, offset, count) {
> +        bis.setInputStream(is);
> +        buffer += bis.readBytes(is.available());

I have the impression that the log can be pretty long, so we may end up with a large cost here.
If you decide to not use XHR, place everything in an array and call `join()` once you are done.

@@ +2365,5 @@
> +   * @param cb
> +   *        (function) Called on completion. Receives (error).
> +   */
> +  _writeStringToFile: function (s, file, cb) {
> +    let os = FileUtils.openSafeFileOutputStream(file);

safeFileOutputStream is very slow. Unless the data is critical, you should use openFileOutputStream.

@@ +2387,5 @@
> +
> +  /**
> +   * Read an input stream synchronously and strip trailing newline.
> +   */
> +  _readInputStream: function (is) {

Reading synchronously? That doesn't sound good.
Attachment #8440961 - Flags: review?(dteller) → feedback+
You can use http://mxr.mozilla.org/mozilla-central/source/xpcom/io/nsILocalFileWin.idl#32 nsILocalFileWin.canonicalPath to get the short pathname for a nsIFile (if 8.3 is disabled, it will do the right thing and just return the long path).

I don't know of an existing way to go in the opposite direction.
Other than ctypes to GetLongPathName, that is.
(In reply to David Rajchenbach Teller [:Yoric] from comment #97)
> Comment on attachment 8440961 [details] [diff] [review]
> Auto update hotfix
> 
> Review of attachment 8440961 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Caveat: As requested, this review is solely oriented towards performance and
> I looked only at bootstrap.js and update.jsm.
> 
> Side-note: Do you include InstallerLauncher.exe in this patch (twice) on
> purpose?
> 
> I realize you asked me to be gentle, but your code seems to be doing *lots*
> of main thread I/O, both during startup and while logging, as well as some
> disk flushing. I am not going to insist you remove everything, but I have
> the impression that most of this can be avoided, so I don't see a very good
> reason to perform that MTIO.

A number of the MTIO areas you've flagged are occurring accessing files inside omni.jar. It's my understanding that omni.jar is preloaded and memory mapped at startup. The overhead to access a file is effectively unzip. Since the files I'm reading are small, that hit is negligible. I'm therefore inclined to ignore performance concerns about MTIO when reading from omni.jar.

Will post more replies later.
(In reply to David Rajchenbach Teller [:Yoric] from comment #97)
> Side-note: Do you include InstallerLauncher.exe in this patch (twice) on
> purpose?

We'll clean it up before final.
 
> I realize you asked me to be gentle, but your code seems to be doing *lots*
> of main thread I/O, both during startup and while logging, as well as some
> disk flushing. I am not going to insist you remove everything, but I have
> the impression that most of this can be avoided, so I don't see a very good
> reason to perform that MTIO.

We can avoid a lot of it, sure. But we're still doing open() and stat() on the main thread and there's no way we can avoid that in Firefox 10. Furthermore, the cost to avoiding as much MTIO as possible is not negligible and increases the complexity and surface area for bugs. I also argue that the sins we are committing are on par with what was customary in the ancient Firefox versions people are stuck on.

> ::: v20140527.01/bootstrap.js
> @@ +48,5 @@
> > +
> > +  log.warn("startup() - " + reason);
> > +
> > +  // 3 == ADDON_ENABLE; 5 == ADDON_INSTALL; 7 == ADDON_UPGRADE; 8 == ADDON_DOWNGRADE
> > +  let isInstall = reason == 3 || reason == 5 || reason == 7 || reason == 8;
> 
> Nit: using strings instead of numbers would probably be more readable (at
> least for the logs).
> Also, could you name the constants?

Tell that to the AddonManager people. These constants aren't expected for whatever reason, sadly. I don't feel like going through the effort to define constants for things that are used once.

> ::: v20140527.01/import-translations.py
> @@ +44,5 @@
> > +
> > +            if not line.startswith(';'):
> > +                continue
> > +
> > +            if line == ';Free upgrade: a new, faster Firefox is already downloaded and ready for you to use!':
> 
> That looks like a very strange way to perform translation. Was there no
> better way?

Yeah, it made me sad having to write that too. But that's how the translations were done for this project. Maybe there was a better way. Have a look at https://svn.mozilla.org/projects/l10n-misc/trunk/firefoxupdater/fr/updater.lang and let me know if you can figure out one.

> ::: v20140527.01/resource/update.jsm
> @@ +170,5 @@
> > +let memoryAppender = new InMemoryAppender();
> > +this.log.addAppender(memoryAppender);
> > +
> > +let gFileAppender;
> > +
> 
> It looks like you're logging a lot, so anything you can do to avoid doing
> MTIO would be useful. Could you perhaps open your output stream for async
> wriing and feed it progressively via a pump?

Indeed. Logging is IMO the most important MTIO to remove, if possible.

Again, doing the I/O synchronously was easier. I may try to move this to a pump.

> 
> @@ +343,5 @@
> > +      return this._locale;
> > +    }
> > +
> > +    // Modern versions of Firefox have the locale in omni.jar.
> > +    for each (let res in ["app", "gre"]) {
> 
> Just to be sure, have you checked that `for each` is supported by Firefox 10?
> (it is my understanding that we're targeting FF10+)

The plurality of my testing has been on Firefox 10.

> @@ +345,5 @@
> > +
> > +    // Modern versions of Firefox have the locale in omni.jar.
> > +    for each (let res in ["app", "gre"]) {
> > +      let url = "resource://" + res + "/update.locale";
> > +      let channel = Services.io.newChannel(url, null, null);

Ignoring because omni.jar.

> 
> @@ +347,5 @@
> > +    for each (let res in ["app", "gre"]) {
> > +      let url = "resource://" + res + "/update.locale";
> > +      let channel = Services.io.newChannel(url, null, null);
> > +      try {
> > +        let is = channel.open();

Ignoring because omni.jar.

> 
> @@ +348,5 @@
> > +      let url = "resource://" + res + "/update.locale";
> > +      let channel = Services.io.newChannel(url, null, null);
> > +      try {
> > +        let is = channel.open();
> > +        this._locale = this._readInputStream(is);

Ignoring because omni.jar.

> @@ +353,5 @@
> > +        return this._locale;
> > +      } catch (e) { }
> > +    }
> > +
> > +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> 
> (MTIO)

FileUtils.getFile() is a small wrapper to get an nsIFile. Even if it did perform MTIO (I'm not certain it does), how would I avoid this in Firefox 10? I don't think I can.

> @@ +354,5 @@
> > +      } catch (e) { }
> > +    }
> > +
> > +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> > +    if (!file.exists()) {
> 
> (MTIO)

I'm pretty sure I can't avoid this in Firefox 10. I can try to read from the file and catch the error code for not exists, I suppose.

> @@ +356,5 @@
> > +
> > +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> > +    if (!file.exists()) {
> > +      this._log.warn("update.locale not found in XCurProcD");
> > +      file = FileUtils.getFile("GreD", ["update.locale"]);
> 
> (MTIO)

Again, I don't think this is avoidable MTIO.

> @@ +370,5 @@
> > +    }
> > +
> > +    let fis = Cc["@mozilla.org/network/file-input-stream;1"]
> > +                .createInstance(Ci.nsIFileInputStream);
> > +    fis.init(file, 0x01 /* read */, 420 /* 0644 */, 0);
> 
> (MTIO)

You caught me. I can probably make this filesystem I/O async.

> @@ +371,5 @@
> > +
> > +    let fis = Cc["@mozilla.org/network/file-input-stream;1"]
> > +                .createInstance(Ci.nsIFileInputStream);
> > +    fis.init(file, 0x01 /* read */, 420 /* 0644 */, 0);
> > +    this._locale = this._readInputStream(fis);
> 
> (MTIO)

Ditto.

> @@ +435,5 @@
> > +      Services.obs.addObserver(function obs() {
> > +        Services.obs.removeObserver(obs, "sessionstore-windows-restored");
> > +
> > +        us._startup(isInstall);
> > +      }, "sessionstore-windows-restored", false);
> 
> Are you certain that your code is called before
> sessionstore-windows-restored?

Pretty certain. Restartless add-ons are started very early. I'm pretty sure the startup sequence is put on hold while AddonManager initializes, so there are no race conditions.

> @@ +475,5 @@
> > +        this._log.warn("Upgrade appears to not have worked. Will attempt again.");
> > +        this._s._installInProgress = false;
> > +      }
> > +
> > +      this._saveState();
> 
> Do we actually need to save state if Firefox closes while an install is
> running? Is that piece of statistics really important?

It isn't for statistical purposes: it is for state recovery. See the first part of _onStateLoaded().

> @@ +536,5 @@
> > +      this._log.warn("Resetting lastNotifyDay because of apparent clock skew.");
> > +      this._s.lastNotifyDay = 0;
> > +    }
> > +
> > +    // Back up the upgrade URL in case it is set.
> 
> I'm trying to understand why we take the trouble to do that.

So we don't drop a user-defined value.

> @@ +603,5 @@
> > +      this.tryToDownloadAndUpdate();
> > +    }.bind(this));
> > +  },
> > +
> > +  _getInstallersJSON: function () {
> 
> Again main thread I/O that could be avoided by using NetUtil.asyncFetch or
> possibly XHR.

This is coming from a file inside the add-on's .xpi. I /think/ the add-on files are paged, like omni.jar. If I am wrong about this, I would consider changing it. That being said, this code only gets called if the hotfix is applicable. I'm inclined to apply less weight to MTIO concerns that only impact applicable clients because the cost of downloading and running the installer will vastly outweigh small things like reading installers.json. (Yes, I feel dirty saying that.)

> @@ +935,5 @@
> > +    // We are upgrading to release builds, so only target the release channel.
> > +    try {
> > +      let channel = Services.prefs.getCharPref("app.update.channel");
> > +      if (channel != "release") {
> > +        this._log.warn("Not applicable - not release channel: " + channel);
> 
> Incidentally, a misconfigured channel might be the reason for which Firefox
> has never been upgraded. Should we do something here?

We'll evaluate that possibility if a significant population is remaining after this hotfix is deployed.

> @@ +1193,5 @@
> > +    // That means we can use its API.
> > +    try {
> > +      let service = Cc["@mozilla.org/datareporting/service;1"]
> > +                      .getService(Ci.nsISupports)
> > +                      .wrappedJSObject;
> 
> Out of curiosity, can't you use the jsm?

Probably. This is how everyone else does it.

> @@ +1260,5 @@
> > +    if (this._installerFile.exists()) {
> > +      this._log.warn("Existing installer present. Verifying.");
> > +
> > +      // TODO this could happen on startup, triggering up to 30MB in read I/O.
> > +      // There will be performance impact. Do we care?
> 
> If I understand correctly your code, this is called only after the session
> has been restored. In this case, we are most likely past the high disk
> contention peak during startup, so that should be ok. If you wish to add a
> further delay by a few seconds, just to be sure that we are not too busy
> hitting the cache, that works for me, too.

Great to hear!

> @@ +1540,5 @@
> > +
> > +        // We delay initializing the copier until this point because we don't
> > +        // want the copier mucking about with the file output stream before
> > +        // that stream has been seeked.
> > +        copier.init(pipe.inputStream, fos, null, true, false, 1048576, true, true);
> 
> Why don't you use NetUtil.asyncFetch?

I tried this and couldn't get streaming to work reliably. I needed some low-level control over all the pieces to get this part just right.

It's entirely possible that the solution (one of dozens that I tried) that is currently implemented is compatible with asyncFetch. At this point, I don't want to touch this working code.

> @@ +1853,5 @@
> > +      // it sends "process-finished" and sets a non-zero process.exitValue.
> > +      // If the process exits with a non-0 code, process.exitValue is
> > +      // defined. However, if the process exits successfully, .exitValue
> > +      // is undefined. Furthermore, .exitValue appears to sometimes be
> > +      // undefined on non-0 exit code. Oy.
> 
> Yeah... I wonder if someone is working on a replacement API.

FWIW, subprocess.js from the add-on SDK has a pretty good process API AFAICT. But it relies on workers so is not usable here. I'm pleasantly surprised how often I looked towards the add-on SDK for examples of how to write some of this code. I'm not sure why more Firefox components aren't utilizing their excellent libraries.

> @@ +2279,5 @@
> > +
> > +  /**
> > +   * Convert a binary string to its hex representation.
> > +   */
> > +  _bin2hex: function (b) {
> 
> That looks costly. Do I understand correctly that we are only using this on
> short strings?

Correct. Only on SHA-512 hashes.

> @@ +2297,5 @@
> > +  /**
> > +   * Attempt to resolve an integer error code to a string name.
> > +   */
> > +  _err2str: function (e) {
> > +    for (let k in Cr) {
> 
> Have you checked that `for ...in` is available in FF10?

Yup.

> @@ +2317,5 @@
> > +   *        binary.
> > +   */
> > +  _readFileToString: function (file, cb) {
> > +    let fis = Cc["@mozilla.org/network/file-input-stream;1"]
> > +                .createInstance(Ci.nsIFileInputStream);
> 
> I suspect that XHR would be much faster and more convenient.

I'll consider it.

> @@ +2338,5 @@
> > +    pump.asyncRead({
> > +      onStartRequest: function () {},
> > +      onDataAvailable: function (request, context, is, offset, count) {
> > +        bis.setInputStream(is);
> > +        buffer += bis.readBytes(is.available());
> 
> I have the impression that the log can be pretty long, so we may end up with
> a large cost here.
> If you decide to not use XHR, place everything in an array and call `join()`
> once you are done.

Will do.

> @@ +2365,5 @@
> > +   * @param cb
> > +   *        (function) Called on completion. Receives (error).
> > +   */
> > +  _writeStringToFile: function (s, file, cb) {
> > +    let os = FileUtils.openSafeFileOutputStream(file);
> 
> safeFileOutputStream is very slow. Unless the data is critical, you should
> use openFileOutputStream.

I'm using this for the JSON state. It is somewhat critical.

What is the surface area for failures if I switch back to non-safe FOS? Unexpected power loss? Or more?

> @@ +2387,5 @@
> > +
> > +  /**
> > +   * Read an input stream synchronously and strip trailing newline.
> > +   */
> > +  _readInputStream: function (is) {
> 
> Reading synchronously? That doesn't sound good.

This is only used to fetch locale. The synchronous uses we care about may disappear.
Comment on attachment 8440961 [details] [diff] [review]
Auto update hotfix

Review of attachment 8440961 [details] [diff] [review]:
-----------------------------------------------------------------

Further to earlier reviews:

I'm probably at the point of error blindness here, but nits and questions inline.

N.B., your r= lines need some changes :)

This seems to me like it's ready for thorough QA. We might have hit the point of diminishing returns for review.

::: v20140527.01/resource/update.jsm
@@ +241,5 @@
> + * TODOS
> + * =====
> + *
> + * Stop showing notifications after N days? After N failures? If not, we
> + * potentially never uninstall.

Make a decision.

In theory we could push out a hotfix to uninstall this one, right?

@@ +317,5 @@
> +  UPGRADED_VERSION: "30",
> +
> +  // How often to retry failed downloads.
> +  // Value was chosen arbitrarily.
> +  DOWNLOAD_RETRY_MILLISECONDS: 10 * 60 * 1000, // 10 minutes.

Too frequent. An hour?

@@ +387,5 @@
> +   *        merely starting up like normal.
> +   */
> +  start: function (isInstall) {
> +    if (this._destroyed) {
> +      this._log.warn("Cannot start an instance after it's been destoryed.");

destroyed

@@ +451,5 @@
> +  _startup: function (isInstall) {
> +    this._log.warn("Performing start-up tasks.");
> +
> +    this._stateFile = Services.dirsvc.get("ProfD", Ci.nsIFile);
> +    this._stateFile.append("hotfix.v20140527.01.json");

What are the odds that this will interact with Reset Firefox? Dumping stuff in the profile directory feels a tiny bit risky in that regard.

@@ +708,5 @@
> +    this._log.warn("Performing a self-uninstall.");
> +
> +    this._uploadForensics(function () {
> +      this._cleanup(reason || "UNKNOWN", function () {
> +        uninstallHotfix();

function () { foo() } => foo

@@ +892,5 @@
> +      this._s.actualWindowsVersion = actualVersion;
> +      this._log.warn("Reported Windows version: " + version[0] +
> +                     "." + version[1] + " SP" + version[2]);
> +      this._log.warn("Actual Windows version: " + actualVersion[0] +
> +                     "." + actualVersion[1] + " SP" + actualVersion[2]);

Worth logging some unique string if they differ, like "COMPAT MODE". Make grepping easier.

@@ +2485,5 @@
> +   * The actual versions could be greater than what is returned. However,
> +   * we will never give back versions higher than what's possible. Therefore
> +   * this function is useful for enforcing a lower bound, not an upper bound.
> +   *
> +   * This function doesn't work for ancient versions of Versions. Also,

versions of Windows?

@@ +2495,5 @@
> +    const LPCSTR = ctypes.char.ptr;
> +    const LPCTSTR = ctypes.jschar.ptr;
> +    const FARPROC = ctypes.size_t;
> +
> +    let kernel32 = this._getKernel32();

You fetch and close this twice. Perhaps some reuse is in order?

@@ +2504,5 @@
> +    try {
> +      let GetModuleHandleW = kernel32.declare("GetModuleHandleW",
> +                                             ctypes.default_abi,
> +                                             HMODULE,
> +                                             LPCTSTR);

Nit: alignment is one char off.

@@ +2513,5 @@
> +        return null;
> +      }
> +
> +      let GetProcAddress = kernel32.declare("GetProcAddress",
> +                                            ctypes.default_abi,

Is there a reason why this isn't ctypes.winapi_abi?

@@ +2540,5 @@
> +      let is62OrHigher = isFunctionExported(module, "CheckTokenCompatibility");
> +
> +      // GetNumaNodeNumberFromHandle() was added in Windows 7 and Server 2008 R2.
> +      // http://msdn.microsoft.com/en-us/library/dd405492%28v=vs.85%29.aspx
> +      let is61OrHigher = isFunctionExported(module, "GetNumaNodeNumberFromHandle");

As a general approach point, I think you could either:

  if (isFunctionExported(...)) {
    majorVersion = 6;
    minorVersion = 2;
  } else if (...) {
    ...


or, at the very least, short circuit the later ones:

  let is60OrHigher = is62OrHigher || is61OrHigher || isFunctionExported(...);

@@ +2618,5 @@
> +      }
> +
> +      return [majorVersion, minorVersion, spMajorVersion];
> +    } catch (e) {
> +      this._log.warn("Exception obtaining true Windows version:" + e);

Space after :.

@@ +2692,5 @@
> +    let uuid = Cc["@mozilla.org/uuid-generator;1"]
> +                 .getService(Ci.nsIUUIDGenerator)
> +                 .generateUUID()
> +                 .toString();
> +    return uuid.substring(1, uuid.length - 1);

return uuid.slice(1, -1);

Might also note that you're trimming off the {}.

@@ +2696,5 @@
> +    return uuid.substring(1, uuid.length - 1);
> +  },
> +};
> +
> +// Out singleton instance.

Our

@@ +2735,5 @@
> +  onWindowTitleChange: function () {},
> +};
> +
> +/**
> + * Inject CSS to register our custom notfication type.

notification

@@ +2738,5 @@
> +/**
> + * Inject CSS to register our custom notfication type.
> + *
> + * Old Firefox versions don't have icons in notifications. We add
> + * that. We also give our notficiation a dedicated binding to make

notification
Attachment #8440961 - Flags: review?(rnewman) → review+
I've created an etherpad for the update hotfix testing: https://etherpad.mozilla.org/update-hotfix-testing
Here's the set of changes since the last uploaded full patch. Hoping to get very quick sign-off on this. You can ignore things related to the launcher binary.
Attachment #8442924 - Flags: review?(rnewman)
(In reply to Richard Newman [:rnewman] from comment #102)
> This seems to me like it's ready for thorough QA. We might have hit the
> point of diminishing returns for review.

Agreed. Georg and I both feel we are ready for QA. I feel only very small changes will come through at this point. I'd like to get this off to QA within an hour or two.
 
> ::: v20140527.01/resource/update.jsm
> @@ +241,5 @@
> > + * TODOS
> > + * =====
> > + *
> > + * Stop showing notifications after N days? After N failures? If not, we
> > + * potentially never uninstall.
> 
> Make a decision.
> 
> In theory we could push out a hotfix to uninstall this one, right?

Benjamin and I discussed this and decided that letting it go forever is OK. We should never see this. Worst case we push out a hotfix to force uninstall.

> @@ +317,5 @@
> > +  UPGRADED_VERSION: "30",
> > +
> > +  // How often to retry failed downloads.
> > +  // Value was chosen arbitrarily.
> > +  DOWNLOAD_RETRY_MILLISECONDS: 10 * 60 * 1000, // 10 minutes.
> 
> Too frequent. An hour?

Sure.

> @@ +451,5 @@
> > +  _startup: function (isInstall) {
> > +    this._log.warn("Performing start-up tasks.");
> > +
> > +    this._stateFile = Services.dirsvc.get("ProfD", Ci.nsIFile);
> > +    this._stateFile.append("hotfix.v20140527.01.json");
> 
> What are the odds that this will interact with Reset Firefox? Dumping stuff
> in the profile directory feels a tiny bit risky in that regard.

Oy. I'm inclined to not care about this right now. Once we have FHR report on clients upgraded via this hotfix, we may want to have Reset Firefox preserve this file.

> @@ +2495,5 @@
> > +    const LPCSTR = ctypes.char.ptr;
> > +    const LPCTSTR = ctypes.jschar.ptr;
> > +    const FARPROC = ctypes.size_t;
> > +
> > +    let kernel32 = this._getKernel32();
> 
> You fetch and close this twice. Perhaps some reuse is in order?

Meh.

> @@ +2513,5 @@
> > +        return null;
> > +      }
> > +
> > +      let GetProcAddress = kernel32.declare("GetProcAddress",
> > +                                            ctypes.default_abi,
> 
> Is there a reason why this isn't ctypes.winapi_abi?

default_abi on Windows == winapi_abi (I think). I changed this anyway because winapi_abi is more correct.

> @@ +2540,5 @@
> > +      let is62OrHigher = isFunctionExported(module, "CheckTokenCompatibility");
> > +
> > +      // GetNumaNodeNumberFromHandle() was added in Windows 7 and Server 2008 R2.
> > +      // http://msdn.microsoft.com/en-us/library/dd405492%28v=vs.85%29.aspx
> > +      let is61OrHigher = isFunctionExported(module, "GetNumaNodeNumberFromHandle");
> 
> As a general approach point, I think you could either:
> 
>   if (isFunctionExported(...)) {
>     majorVersion = 6;
>     minorVersion = 2;
>   } else if (...) {
>     ...

Agreed. Changed.
Paul: Please start QA against https://people.mozilla.org/~gszorc/hotfix-v20140527.01.xpi

I will use a separate xpi for my personal testing if needed. So, you shouldn't need to worry about it changing. In case it does, I'll upload a copy to this bug.
Flags: needinfo?(paul.silaghi)
PI to be used for QA verification.
Comment on attachment 8442924 [details] [diff] [review]
changes since last full upload

Review of attachment 8442924 [details] [diff] [review]:
-----------------------------------------------------------------

I feel particularly unqualified to review InstallerLauncher.cpp. Any takers?
Attachment #8442924 - Flags: review?(rnewman) → review+
That is being reviewed in bug 1014187.
Comment on attachment 8440961 [details] [diff] [review]
Auto update hotfix

Review of attachment 8440961 [details] [diff] [review]:
-----------------------------------------------------------------

::: v20140527.01/resource/update.jsm
@@ +1500,5 @@
> +        //
> +        // For now, we go with blowing away the partial download.
> +        //
> +        // NOTE: Firefox <27 don't verify entity sizes. So we may get more
> +        // failures than expected on those clients.

Correctly should be:

Firefox <27 doesn't verify the 206 response size is what we actually expect.

@@ +1546,5 @@
> +        copier.asyncCopy({
> +          onStartRequest: function (request, context) {
> +            this._log.warn("copier:onStartRequest()");
> +            copierStarted = true;
> +            copier = copier.QueryInterface(Ci.nsIRequest);

any need to this here?

@@ +1586,5 @@
> +
> +          if (copierStarted) {
> +            this._log.warn("Closing pipe due to failed channel.");
> +            // This should trigger the copier's onStopRequest().
> +            pipe.outputStream.closeWithStatus(Cr.NS_BINDING_ABORTED);

I didn't see this the last time.. but why are you closing the output only after the copier "has started" (=onStartRequest from it has been called)?  I think you could potentially get here sooner then OnStart from the copier is posted.  And then the copier may "get started" after this point.

Just close the output all the time.  I don't think you will do any bad with that in any case.  BTW, the pipe's stream is already open here, so you should close it when you don't need it anymore.

According the cb() invoke, when you pass copier.asyncCopy() you are ensured you get onStopRequest from the copier when you close the output.

So, set the copierStarted flag after successful call to copier.asyncCopy().  Note: onStartRequest from copier.asyncCopy() is always called and always asynchronously.
Attachment #8440961 - Flags: review?(honzab.moz) → review+
gps, when you need the hotfix signed please file a bug in Release Engineering::Releases. You could point to an existing bugzilla attachment, or add a new one on that bug.
(In reply to Gregory Szorc [:gps] from comment #101)
> > It looks like you're logging a lot, so anything you can do to avoid doing
> > MTIO would be useful. Could you perhaps open your output stream for async
> > wriing and feed it progressively via a pump?
> 
> Indeed. Logging is IMO the most important MTIO to remove, if possible.

Agreed.

> Again, doing the I/O synchronously was easier. I may try to move this to a
> pump.

> > @@ +345,5 @@
> > > +
> > > +    // Modern versions of Firefox have the locale in omni.jar.
> > > +    for each (let res in ["app", "gre"]) {
> > > +      let url = "resource://" + res + "/update.locale";
> > > +      let channel = Services.io.newChannel(url, null, null);
> 
> Ignoring because omni.jar.

In my experience, during startup, during startup, reading from disk is very slow. I'll try nd double-check if this applies to omni.jar

> > > +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> > 
> > (MTIO)
> 
> FileUtils.getFile() is a small wrapper to get an nsIFile. Even if it did
> perform MTIO (I'm not certain it does), how would I avoid this in Firefox
> 10? I don't think I can.

Just create the nsIFile without using `FileUtils.getFile`. This will avoid one piece of main thread IO, as FileUtils checks for the existence of the directory.

> > @@ +354,5 @@
> > > +      } catch (e) { }
> > > +    }
> > > +
> > > +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> > > +    if (!file.exists()) {
> > 
> > (MTIO)
> 
> I'm pretty sure I can't avoid this in Firefox 10. I can try to read from the
> file and catch the error code for not exists, I suppose.

That was the idea.

> 
> > @@ +356,5 @@
> > > +
> > > +    let file = FileUtils.getFile("XCurProcD", ["update.locale"]);
> > > +    if (!file.exists()) {
> > > +      this._log.warn("update.locale not found in XCurProcD");
> > > +      file = FileUtils.getFile("GreD", ["update.locale"]);
> > 
> > (MTIO)
> 
> Again, I don't think this is avoidable MTIO.

See above.


> > @@ +370,5 @@
> > > +    }
> > > +
> > > +    let fis = Cc["@mozilla.org/network/file-input-stream;1"]
> > > +                .createInstance(Ci.nsIFileInputStream);
> > > +    fis.init(file, 0x01 /* read */, 420 /* 0644 */, 0);
> > 
> > (MTIO)
> 
> You caught me. I can probably make this filesystem I/O async.

\o/


> > Are you certain that your code is called before
> > sessionstore-windows-restored?
> 
> Pretty certain. Restartless add-ons are started very early. I'm pretty sure
> the startup sequence is put on hold while AddonManager initializes, so there
> are no race conditions.

Ok, good to know.

> > Again main thread I/O that could be avoided by using NetUtil.asyncFetch or
> > possibly XHR.
> 
> This is coming from a file inside the add-on's .xpi. I /think/ the add-on
> files are paged, like omni.jar. If I am wrong about this, I would consider
> changing it. That being said, this code only gets called if the hotfix is
> applicable. I'm inclined to apply less weight to MTIO concerns that only
> impact applicable clients because the cost of downloading and running the
> installer will vastly outweigh small things like reading installers.json.
> (Yes, I feel dirty saying that.)

I agree that it's much lower priority.


> > Why don't you use NetUtil.asyncFetch?
> 
> I tried this and couldn't get streaming to work reliably. I needed some
> low-level control over all the pieces to get this part just right.
> 
> It's entirely possible that the solution (one of dozens that I tried) that
> is currently implemented is compatible with asyncFetch. At this point, I
> don't want to touch this working code.

ok


> > @@ +2365,5 @@
> > > +   * @param cb
> > > +   *        (function) Called on completion. Receives (error).
> > > +   */
> > > +  _writeStringToFile: function (s, file, cb) {
> > > +    let os = FileUtils.openSafeFileOutputStream(file);
> > 
> > safeFileOutputStream is very slow. Unless the data is critical, you should
> > use openFileOutputStream.
> 
> I'm using this for the JSON state. It is somewhat critical.
> 
> What is the surface area for failures if I switch back to non-safe FOS?
> Unexpected power loss? Or more?

With the regular FOS, there are two risks:
1/ you overwrite any file with the same name;
2/ in case of power loss/disk failure/OS failure, the data may be incomplete/corrupted.
(the risk still exists with the safeFOS, but during a much shorter window).

> This is only used to fetch locale. The synchronous uses we care about may
> disappear.

ok
Hotfix with review comments addressed and updated binary from bug 1014187.

Paul: Please use this for testing.
Attachment #8440961 - Attachment is obsolete: true
(In reply to Gregory Szorc [:gps] from comment #114)
> Paul: Please use this for testing.
ok
Flags: needinfo?(paul.silaghi)
Since the code has been stable for a few days and since we're only anticipating minor changes at this point, I committed everything to the firefox-hotfixes repo:

https://hg.mozilla.org/releases/firefox-hotfixes/rev/b78b894e1441
You know what, I'm going to mark this as resolved. Please file follow-ups against this or bug 928173.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Blocks: 1029780
Blocks: 1030851
Blocks: 1030856
Blocks: 1030860
Blocks: 1030862
Blocks: 1030911
Blocks: 1031021
Attachment #8434448 - Flags: feedback?(jboriss) → feedback+
Attachment #8434578 - Flags: feedback?(jboriss) → feedback+
Points: --- → 13
QA Contact: florin.mezei
Added to Iteration 33.3 as carry over.
Iteration: --- → 33.3
QA Whiteboard: [qa+]
Blocks: 1037029
Blocks: 1037649
Flags: firefox-backlog+
QA Contact: florin.mezei → paul.silaghi
Hi Paul, following up to see if this bug can be verified by Monday which is the end of the Desktop iteration.
Flags: needinfo?(paul.silaghi)
The testing is complete, the important dependent bugs have been fixed and verified. The hotfix works fine on staging at the moment. Feel free to go live anytime you want.
https://etherpad.mozilla.org/update-hotfix-testing
https://etherpad.mozilla.org/hotfix-testing
Status: RESOLVED → VERIFIED
QA Whiteboard: [qa+] → [qa!]
Flags: needinfo?(paul.silaghi)
You need to log in before you can comment on or make changes to this bug.