Closed Bug 1003159 Opened 6 years ago Closed 6 years ago

Default the updater to downloading the entire update at once for Nightly/Aurora

Categories

(Firefox :: General, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 33

People

(Reporter: mconnor, Assigned: mconnor)

References

Details

Attachments

(1 file)

This traffic is low enough on these branches that this shouldn't hurt anything.  Mostly want to take a first step to see if this actually causes noticeable perf hits on the client side.  cc-ing Jake to make sure I'm not making things up about the bandwidth implications.
For nightly and aurora, this is fine... bandwidth on them is trivial. Even Beta would be okay.
Okay, we'll look at Beta next if this pans out.  Slightly more complicated, but doable.
If this is implemented on Nightly, Aurora, and Beta it would make it so chunked interval downloads are only used on release which would mean that if anything broke in the code that uses intervals
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsIncrementalDownload.cpp
then we wouldn't see it until it hit release.

Perhaps just using a shorter interval would be better if this does get implemented on Nightly, Aurora, and Beta but not Release.

Another option we used to use to deal with bandwidth issues is update advertisement throttling and we could throttle the percentage of systems that are advertised the update for background updates to mitigate the bandwidth issue.
OS: Mac OS X → All
Hardware: x86 → All
To be clear, the end goal is to see if switching to one-shot downloads is feasible.  If it works for Nightly/Aurora, we'd try it on Beta.  If that's promising, we'd roll it to Release (assuming it wasn't going to cost us a ton of cash).  If we decide to bail, rolling back to chunked downloads across the branches would be in order.
Thanks for the clarification.

The "going to cost us a ton of cash" possible issue could likely be mitigated using throttling on the background update advertisement.
Yep, there's a bunch of ways we could mitigate.  However, the problem is actually somewhat inverted because of how our CDN billing works.  A taller peak in less time is actually better, oddly enough.
Right and iirc that is for a highest usage period (3 days?) and it is a good thing as I see it that we have a mitigation method in place that has been used previously.
Attached patch fullUpdatesSplinter Review
Simple.  Should work fine, we'll keep an eye out for issues.
Attachment #8414749 - Flags: review?(robert.strong.bugs)
If we're going to stick with this behavior longer term, we should remove the use of nsIIncrementalDownload in favor of a straight nsIRequest, rather than just flipping the interval pref to zero, since nsIIncrementalDownload with an interval of 0 just introduces a bunch of unnecessary overhead.
Probably true, though I'm not clear on how much extra overhead there is with interval == 0.  If we get that far, I'll file followups to look into it.

Of course, if we're going to change the downloader at all, it might be worth exploring OS APIs for handling background downloads (i.e. BITS on Windows, NSURLSession on OS X 10.9+).
This landed on m-c and m-a but didn't get marked as fixed.

https://hg.mozilla.org/mozilla-central/rev/7a4f15c7cec5
Target Milestone: --- → Firefox 32
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Meh, this got backed out due to test failures so re-opening.

mconnor, are you going to land that test fix we discussed that I gave you an r=me for?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Crap, I thought I re-landed it.  https://hg.mozilla.org/integration/mozilla-inbound/rev/ccc885c8f4ac

Jake, assuming this doesn't bounce again (it shouldn't) this will start being active on June 6 on the nightly channel, and will ride the trains to Aurora next week.  And then it'll stay for a while and we can watch things.
sorry Mike had to back this out for test failures like https://tbpl.mozilla.org/php/getParsedLog.php?id=41093531&tree=Mozilla-Inbound (note: the unexpected failures in this case here, the type errors were something different)
IMO, most of the discussion is about the "cost" to Mozilla and nothing was mentioned about the "cost" to "nightly" testers who might have metered internet service.
downloading  40+ mb everyday is alot compared to 4-5 mb.
"cost" to "nightly" testers who might have metered internet service.
should be taken into consideration
We're not changing how big updates are, we're changing how they are downloaded (from many 300k increments to one continuous download).  I'll note that updates are significantly bigger than 4-5 MB last I looked.
(In reply to Mike Connor [:mconnor] from comment #17)
>from many 300k increments to one continuous download

isn't that better for load balancing ?
can you explain ?

>I'll note that updates are significantly bigger than 4-5 MB last I looked

yes and getting bigger 

also ff surpassed chrome size 32.7 vs 34.4 mb(was 29 mb recently) !
The jump in size is huge in the short span[24vs29]
And worried about the increasing size
I just landed a test fix in bug 1032559. Try results with that patch and the patch from this bug are green.
https://tbpl.mozilla.org/?tree=Try&rev=98bae828826e

If that sticks this bug should be unblocked from landing.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #19)
> I just landed a test fix in bug 1032559. Try results with that patch and the
> patch from this bug are green.
> https://tbpl.mozilla.org/?tree=Try&rev=98bae828826e
> 
> If that sticks this bug should be unblocked from landing.
It looks like it stuck. After bug 1032559 is merged it should be safe to land this.
Try looks good so I landed this
https://tbpl.mozilla.org/?tree=Try&rev=951cd8b72f08

Pushed to fx-team
https://hg.mozilla.org/integration/fx-team/rev/37ba46f2c7c2
Status: REOPENED → ASSIGNED
Flags: in-testsuite+
Target Milestone: Firefox 32 → Firefox 33
I pushed the test changes in bug 1032559 to mozilla-aurora in case you would like to land this on aurora as well.
https://hg.mozilla.org/mozilla-central/rev/37ba46f2c7c2
Status: ASSIGNED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Woo, thanks Robert!

(In reply to ElevenReds from comment #18)
> (In reply to Mike Connor [:mconnor] from comment #17)
> >from many 300k increments to one continuous download
> 
> isn't that better for load balancing ?
> can you explain ?

In general, not really, because most of the overhead is in handling/routing the request, not from the bandwidth consumed.  It's not more bandwidth.  Handing 100x as many requests is a lot more routing, logging, connection negotiation, etc.  We're mostly worried about how this will impact our bandwidth pattern than load balancing.

> >I'll note that updates are significantly bigger than 4-5 MB last I looked
> 
> yes and getting bigger 
> 
> also ff surpassed chrome size 32.7 vs 34.4 mb(was 29 mb recently) !
> The jump in size is huge in the short span[24vs29]
> And worried about the increasing size

You're not alone. http://atlee.ca/blog/posts/firefox-update-sizes.html outlines the current thinking for reducing download sizes.  It'll take time, of course, and it's well outside the scope of this bug.
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #10)
> If we're going to stick with this behavior longer term, we should remove the
> use of nsIIncrementalDownload in favor of a straight nsIRequest, rather than
> just flipping the interval pref to zero, since nsIIncrementalDownload with
> an interval of 0 just introduces a bunch of unnecessary overhead.

With an interval of 0, does it still generate a bunch of 300KB Range requests, just with no delay between them? If so, then I'm definitely in favor of this idea. :)

The range request mechanic (rarely) causes odd problems due to caching. It's very possible that one Range request doesn't come from the same server that the last one did. Usually that's fine, but it can mean that if they have different cached data (usually because we had to re-build) that the update will fail. IIRC it simply retries later, but it's still wasted bandwidth and extra uptake delay.

It also mucks up the CDN vendor stats on download completion, for the same reason. More or less that stat is simply unreliable because of this.



There are financial considerations to doing this, but I believe they can be largely mitigated by proper use of our AUS server-side throttling if we need to.

In general, eliminating the delay between update chunks has the effect of squishing the bandwidth used into a smaller time frame, with a higher peak. If the peak is short enough (<5% of a month), then it's essentially free and this is a good thing financially. If not, then we could wind up paying more. Fortunately, we can coarsely control the length (and even the magnitude) of the peak with our AUS server throttling. If we get 30-ish hours into a release and it's looking real bad, we can throttle off (or at least down) to avoid a huge overage.

Because we have some control over when updates are offered (via AUS throttling), I'm in favor of removing client-side throttling altogether. Or at least removing the current scheme and doing something else... I think I've said before that if we really want to keep a client-side throttle, I'd rather see a smooth KB/s limit (1 connection) rather than 300KB Range requests at full blast every other minute.
Would be best to ask someone from the necko team what the exact behavior of this code is
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsIncrementalDownload.cpp

Jason, can you answer Jake's question in comment #25?
Flags: needinfo?(jduell.mcbugs)
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsIncrementalDownload.cpp#279 implies we don't do a range request if the interval is 0.  But I'll let jduell confirm that.

Jake, we can probably infer the likely pattern by looking at when users start downloads.  It should be taller and shorter, but that's relatively guesswork-ish.  That said, the range requests have been a problem for some since we landed this a long time ago, I'd love to see it go away.
The incrementalDownload code is old and the last time Patrick went in to touch it, we weren't even clear on why it really exists (smells like a dial-up era thing?).  IIRC Patrick is more excited about doing a single large download, but having the ability to throttle that download's bandwidth when other network activity is going on.  We don't have that yet, but it sounds like we're on the same page as far as our wishlist goes.  Patrick may have more info on the timeline for us getting to that (Patrick: this is in reference to comment 25).  Meanwhile I'd guess you're fine doing a regular nsIRequest.
Flags: needinfo?(jduell.mcbugs) → needinfo?(mcmanus)
I agree - going with the straight nsIRequest is mostly a good thing

1] pro - 1 atomic GET means less problems with resuming ranges when changes have occurred (as mentioned above)

2] pro - ranges sometimes have problems with proxies, which means fallback code is running

3] pro - trickle downloads are a worst case scenario for battery - we keep spinning the radio up and down on wifi devices to do this. Its much better to do it in a single stream where possible

4] pro - apparently its easier on the server side too. yea!

5] con - with no rate limiting in the client, clients with really poor bandwidths (android - b2g) will overwhelm other, perhaps higher priority, tasks in gecko. making our responsiveness poor.

about 5 - the existing rate limiting is pretty crude - I've heard it creates the same problem already.

also about 5 - we can build real rate limiting that is based on flow control (instead of being based on multiple bursty range requests) which will give the best of both worlds. That idea is on the idea backlog right now - so I'm not sure how to say when to expect it. But its good to know you're going this route.
Flags: needinfo?(mcmanus)
Checking back in on this. Landed in Nightly and Aurora on July 2, so it's simmered for a bit over 2 months now... thoughts on moving to Beta?
you should ask the android team for input.. I know the updater blast has caused them grief in the past.
Beta for android doesn't generally use our updater, it's distributed via the play store.
You need to log in before you can comment on or make changes to this bug.