use conservative, robust communication for update check to prevent broken updater in builds with issues
Categories
(Toolkit :: Application Update, task, P3)
Tracking
()
People
(Reporter: aryx, Unassigned)
References
Details
Secure connections in Nightly started to fail last Friday (bug 1750188). This prevents the affected Nightly from updating to a version with the fix (unknown if the background updater on Windows fixes this for that platform).
The use of a conservative mode for update requests to ensure networking for critical services doesn't break has been discussed.
Dana said "Updates are also signed, so maybe it's safe enough to not do revocation checking (or limit it to a subset of our checks)"
Updated•3 years ago
|
Comment 1•3 years ago
|
||
We do something like this for update checking. I'm guessing that you would like this to be done for update downloads as well? AFAIK, we don't have any way of doing that for BITS downloads (which, on Windows, are typically the default). But failed BITS downloads fall back to nsIIncrementalDownload
. And it's presumably possible to do this for nsIIncrementalDownload
.
@aryx Would that accomplish what you had in mind here?
Another possible option is to have update downloads use HTTP instead of HTTPS. I believe that we started using HTTPS for this somewhat recently. Because the updates are signed, I've never been particularly clear on why this is necessary in the first place. If it is not necessary, perhaps we should consider returning to HTTP transport for update downloads.
Reporter | ||
Comment 2•3 years ago
|
||
Benjamin, could you evaluate Kirk's suggestions for more robust application updates in case of connections impacted by HTTPS issues?
Comment 3•3 years ago
|
||
(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #2)
Benjamin, could you evaluate Kirk's suggestions for more robust application updates in case of connections impacted by HTTPS issues?
302 to Dan who might have an opinion here as well.
Updated•3 years ago
|
Comment 4•3 years ago
|
||
Changing the priority to p1 as the bug is tracked by a release manager for the current release.
See What Do You Triage for more information
Updated•3 years ago
|
Updated•3 years ago
|
Comment 5•3 years ago
|
||
We do something like this for update checking. I'm guessing that you would like this to be done for update downloads as well?
If we're not using beConservative
for the downloads themselves we certainly should. The conservative flag turns off some newer HTTP and DNS features, and if the connection is proxied we'll fall-back and try a direct connection. It would not have helped in this case because that was not the problem! Switching back to insecure downloads (or rather, only secured by package signature) would not have helped either.
Once users were in this state all HTTPS connections would fail according to jschanck, and that includes the update check itself. That the update check was the problem is supported by the error message people got, as in bug 1750188 comment 14, where it says no new version was available and not that the download failed. [Tangent rant: I've always hated that! People can handle the difference between "You're all good" and "Check failed, try later", and there IS a difference! When we pretend everything is fine people will be happy but insecure. If we tell them there is a problem they have a fighting chance of fixing it if it persists, and possibly detecting a local attack.]
beConservative
does not turn off certificate or revocation checking, or bypass any number of imagined HTTPS bugs we could have introduced -- it's not passed along to PSM as far as I can tell. There's a reasonable argument to be made that the TLS code should honor the flag and turn off features under active development (like previously TLS 1.3, or maybe currently ECH, maybe some new ciphers), but for most of those we don't ship the feature until it's got robust interop, and usually it's also avoided by not using the feature on the server which we also control. I think we did have a problem one time when our cloud provider turned on something without warning that ended up breaking things temporarily? But if it's broken on the server it can be fixed on the server.
We certainly don't want to turn off certificate checking (== no security at all), and I'm reluctant to turn off revocation checking, too. If we revoke our own cert it's probably a pretty serious problem. If we'd had this flag passed in should we have skipped new CRLite code and fallen back to standard OCSP revocation? That's a harder question for me. If a user had their updates attacked by an AITM it would be trivial to block the OCSP check and avoid the revocation. On the other hand, CRLite helping presumes that we've detected the bad cert, revoked it, and the user has gotten a CRLite update before the AITM starts. You could argue CRLite isn't "conservative", but I'd argue our update checks are amongst the most important places to get revocation right. (I don't care as much about revocation on our download server--TLS there is more important for privacy than security since updates are signed--but as noted above that wouldn't have helped in the 1750188 case)
Since we're mostly dealing only with Firefox and our servers, any problem should be found in testing -- and in this case it was! It was reported within just a few hours of landing on mozilla-central. Not nice that it broke everyone, but builds that are unusable for some chunk of users aren't unknown.
Comment 6•3 years ago
|
||
TL;DR
- We should use
beConservative
in nsIncrementalDownload.cpp - We should brainstorm if "conservative" TLS features make sense. If so, PSM needs to know which requests are conservative and do the whatever we decide
#1 would not have avoided bug 1750188, but it does mean a bunch of the "Be Conservative" work would not actually keep updates working if the download can be broken using the non-conservative features. For example, the recent work to hook proxy failover to the conservative flag would not have actually rescued the folks who were captured by a set of malicious addons we battled last summer/fall.
I'm not sure what goes in #2.
Updated•3 years ago
|
Description
•