Closed Bug 306170 Opened 18 years ago Closed 17 years ago
background download fails if server returns 200 response [was: hide button on software-update stops download]
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b4) Gecko/20050826 Firefox/1.0+ Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b4) Gecko/20050826 Firefox/1.0+ Clicking the hide button in software update closes the software update window (that's OK!), but it also stops the download. I think the hide button should just hide the window, not stop the download process :) Reproducible: Always
I can confirm this. Under the help menu the text is download update and the throbber is spinning, but after leaving for half an hour and reopening, the update starts downloading at the same point it was at when I clicked hide. Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20050827 Firefox/1.6a1 ID:2005082706
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #1) > I can confirm this. Under the help menu the text is download update and the > throbber is spinning, but after leaving for half an hour and reopening, the > update starts downloading at the same point it was at when I clicked hide. > > Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20050827 > Firefox/1.6a1 ID:2005082706 Just leave it for a few seconds... it stops in all cases. I forgot: the same happens with software update in Thunderbird.
Are you sure it's actually stopping? Hiding the window makes software update go into a sort-of background/silent download mode, where it retrieves the update at a much more leisurely pace. Every sixty seconds, it will attempt to retrieve a 64K chunk. See bug 304381.
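For illustration, each of those periodic background fetches is an ordinary HTTP byte-range request; a 64K chunk exchange looks roughly like this (the URL and offsets here are made up):

```http
GET /firefox-update.partial.mar HTTP/1.1
Host: mozilla.osuosl.org
Range: bytes=131072-196607

HTTP/1.1 206 Partial Content
Content-Range: bytes 131072-196607/5242880
Content-Length: 65536
```

The bug below is what happens when the server answers such a request with a plain 200 and the full file instead of a 206.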
I've retested this with an HTTP monitor running on my system. It does in general get more data every 60 seconds. However, a couple of times there seemed to be a glitch. For some reason the server it was downloading the mar from (mozilla.osuosl.org) didn't return a partial content response in some cases, but tried to return the full file. When that happened, the throbber on the help menu went away and Firefox never sent another request. There was also this in the JS console. Error: [Exception... "'Failure' when calling method: [nsIWritablePropertyBag::getProperty]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: chrome://mozapps/content/update/updates.js :: anonymous :: line 612" data: no] Source File: chrome://mozapps/content/update/updates.js Line: 612
Attached the request and response headers from the final request.
Pretty sure this comes from this code: http://lxr.mozilla.org/mozilla/source/netwerk/base/src/nsIncrementalDownload.cpp#499 It seems that if the response is not a 206 and the file is not completely downloaded, the response is treated as an unexpected error. The timer that drives the next download is only started if there was no error, so after one stray 200 the download never resumes.
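A minimal sketch of that decision (a hypothetical simplification of the logic around nsIncrementalDownload.cpp#499, not the actual code):

```python
def classify_response(status, bytes_received, total_size):
    """Sketch of the pre-patch behavior: how a chunk response is classified.

    A non-206 status while the file is still incomplete is treated as an
    unexpected error, and the retry timer is never restarted afterwards.
    """
    if status == 206:
        return "continue"   # partial content: schedule the next chunk
    if bytes_received >= total_size:
        return "done"       # whole file already fetched
    return "error"          # 200 (or anything else) mid-download: abort
```

With this classification, a single 200 from a misbehaving mirror permanently kills the background download, which matches the symptom reported above.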
(In reply to comment #3) > Are you sure it's actually stopping? > > Hiding the window makes software update go into a sort-of background/silent > download mode, where it retrieves the update at a much more leisurely pace. > Every sixty seconds, it will attempt to retrieve a 64K chunk. > > See bug 304381. Important: the user should be informed that the download will be much slower after clicking the hide button. As it is, the user wonders why the download is so slow...
(In reply to comment #4) > However a couple of times there seemed to be a > glitch. For some reason the server it was downloading the mar from > (mozilla.osuosl.org) didn't return a partial content in some places, but tried to > return the full file. That sounds like a bug with the server side of the update system. According to the comments in bug 291910 and elsewhere, the client code of the update system only works with HTTP 1.1 servers that support byte-range requests.
The strange bit is that the server returned a number of partial contents before the one that wasn't partial. Maybe mozilla.osuosl.org is actually in a load-balancing scenario and some of their servers are misconfigured. In any case, should the background downloading give up just because of one download failure, especially when it has been successfully downloading from that server previously? Perhaps the server was overloaded at the time or something. Maybe on an error it should back off and try again in 5 minutes or something?
We should handle the non-206 response gracefully at the very least. And, we should try to fix the server as well. -> me
Assignee: nobody → darin
Severity: normal → major
Status: NEW → ASSIGNED
OS: Windows XP → All
Hardware: PC → All
Target Milestone: --- → Firefox1.5
Hmm.. it is actually valid for an HTTP server to respond with a 200 to a range request. We don't want to download the entire document though, so we probably should just try again until we get the desired 206 response. Hmm..
Summary: hide button on software-update stops download → background download fails if server returns 200 response [was: hide button on software-update stops download]
Darin - this looks like something we should get in for 1.5b2/1.8b5. Any updates on status for this?
After talking with Darin, we're going to push this out to after the beta2 (first RC).
Flags: blocking1.8b5+ → blocking1.8b5-
Here's a simple patch that just causes us to keep trying until we get the 206 response that we're looking for. The assumption here is that we won't end up in an infinite loop because we expect that our mirror network will only give out 200 responses intermittently.
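A sketch of the v1 patch's idea (hypothetical names and simplification, not the actual C++):

```python
def handle_response(status, file_complete, schedule_next):
    """Sketch of the retry approach: a non-206 response no longer aborts.

    Instead of treating a stray 200 as fatal, we reschedule the timer and
    ask for the same range again until the mirror hands back a 206.
    """
    if file_complete:
        return "done"
    schedule_next()  # re-arm the timer either way
    return "continue" if status == 206 else "retry"
```

The "infinite loop" risk mentioned above exists precisely because this version retries without bound; the follow-up patch adds a retry cap.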
Attachment #199481 - Flags: review?(cbiesinger)
Notice that this patch also fixes a bug where we do not update mChannel when we redirect. That was preventing us from being able to cancel the download in some cases. Redirects are such a pain :-/
A couple more changes I want to make to this: 1) Make the retry logic become gradually less frequent, and eventually give up. 2) If requesting the entire file, then support a 200 response.
Comment on attachment 199481 [details] [diff] [review] v1 patch ok, r=me with that 200 handling, and the minimum timeout in OnStopRequest removed (because it's not needed there), as discussed on irc I also want to look at the changes for your point (1) but can only do that tomorrow.
Attachment #199481 - Flags: review?(cbiesinger) → review+
Comment on attachment 199481 [details] [diff] [review] v1 patch Do we need additional reviews here?
We should also get this landed and verified on the trunk before we take it on the branch.
We also need to set up a server to test this against and to make sure we don't put ourselves in a worse state where the client is hanging or the server is being DOSsed.
Comment on attachment 199481 [details] [diff] [review] v1 patch >Index: nsIncrementalDownload.cpp ... >+// We don't want to ping the server any more frequently than this interval. >+#define MIN_INTERVAL 2 // seconds We don't want to ping the server any more frequently than a minute, likely. :) Darin, any reason the client couldn't handle having this interval set to 1-5 minutes?
Chase: the issue is that the code needs to distinguish between an explicit download from the user and a true background download. In the background download case, waiting minutes is fine. In the explicit case, retrying more quickly is better. However, that is all moot if I support a 200 response when the consumer asks to have all of the file downloaded. That case corresponds to the explicit download from the user. New patch coming up.
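The rule described above can be sketched like this (a hypothetical simplification, not the patch itself):

```python
def response_acceptable(status, requested_whole_file):
    """Sketch: when is a response status acceptable for this download?

    A 206 is always fine for a range request. A 200 is only fine when
    the consumer asked for the entire file, which corresponds to the
    explicit, user-initiated download case.
    """
    if status == 206:
        return True
    return status == 200 and requested_whole_file
```

In the background (chunked) case, anything other than a 206 falls through to the retry logic instead.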
(In reply to comment #5) > Created an attachment (id=194190)  > Headers from the failing request > > Attached the request and response headers from the final request. Mossop, we're trying to reproduce this locally against mozilla.osuosl.org and getting a 206 response code, not a 200. Can you reproduce the 200 response code against mozilla.osuosl.org?
Darin, we need to pin down a server installation that we can use to test that the fixes we take on the trunk and branch put us in a better position than where we started. Michael (from OSUOSL) states that the headers captured in attachment 194190 [details] are from around the time that the servers at OSUOSL were being moved. Perhaps that old server no longer exists. I presume that mozilla.osuosl.org has been fixed. Aravind is testing ftp.mozilla.org directly now under that presumption attempting to find a site that reproduces this. Can you make recommendations for how we can reproduce locally? (Affected server versions we could install, or a hostname/IP you know of you use to reliably reproduce this bug.)
I don't know of a server where this problem exists, but you could surely reproduce this manually using an Apache server. Just disable byte range support in the Apache configuration. For my own testing, I'm doing something like that. I've got a patch now that I'll be posting shortly, per comment #16. I've tested it pretty carefully, but I think that having someone else verify the patch is a good idea ;-)
Darin - any pointers on how to disable byte-range requests? Nothing obvious in the docs.
Here's the more full-featured patch. I decided to set a limit of no more than 20 retries, spaced out in an ever-increasing multiple of the fetch interval. Given a fetch interval of 1 minute (which is what we currently have), it would take about 3.5 hours of retrying before FF would give up. This feels like a pretty decent configuration.
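The arithmetic behind the 3.5-hour figure, assuming the delay before retry i is i times the fetch interval (my reading of "ever increasing multiple"):

```python
FETCH_INTERVAL_MINUTES = 1
MAX_RETRIES = 20

# Delays grow as successive multiples of the fetch interval:
# 1, 2, 3, ..., 20 minutes.
delays = [FETCH_INTERVAL_MINUTES * i for i in range(1, MAX_RETRIES + 1)]
total = sum(delays)  # 210 minutes, i.e. 3.5 hours
```

So with a 1-minute fetch interval, the 20 retries span 210 minutes before the client gives up, matching the figure quoted above.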
> Just disable byte range support in the Apache configuration. It sounded pretty simple to me, but now that I look into it, I also don't see an easy way to disable range support in Apache without hacking the code. I don't suggest you go down that route. You might want to play around with the Apache "as-is" module. It lets you send an arbitrary response including headers and all. Similarly, you could use netcat (i.e., the "nc" command under Unix). $ nc -l -p 80 < RESPONSE Create a file called RESPONSE that contains the headers and data that you want, and point FF's update service to http://host:80/ to have it fetch RESPONSE.
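A RESPONSE file along these lines should do it (a made-up example; note that HTTP headers should use CRLF line endings and Content-Length must match the body size):

```http
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 5

hello
```

Because netcat serves this same 200 response regardless of any Range header in the request, it reproduces the failure path directly. (As an aside, much later Apache versions grew a `MaxRanges none` directive that makes the server ignore Range headers entirely, but that wasn't available at the time.)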
Attachment #200162 - Flags: superreview?(bzbarsky) → superreview+
Comment on attachment 200162 [details] [diff] [review] v2 patch - if (chunkSize != -1) + if (chunkSize > 0) mChunkSize = chunkSize; Should it be documented that values <= 0 will mean "reasonable default value"? hm... maybe those vars should even be unsigned, with 0 meaning default size? + PRInt32 len; + rv = http->GetContentLength(&len); the other part of the code can handle 64-bit lengths :-/ + if (status == NS_ERROR_DOWNLOAD_NOT_PARTIAL) maybe NS_ERROR_NOT_RESUMABLE, to avoid having to make up a new error code? r=biesi
Attachment #200162 - Flags: review?(cbiesinger) → review+
> - if (chunkSize != -1) > + if (chunkSize > 0) > mChunkSize = chunkSize; > > Should it be documented that values <= 0 will mean "reasonable default value"? Yeah, that's a good idea. > hm... maybe those vars should even be unsigned, with 0 meaning default size? 0 for interval means no-wait, but -1 means use default interval. so, i went with -1 to mean the default value in both cases. > + PRInt32 len; > + rv = http->GetContentLength(&len); > > the other part of the code can handle 64-bit lengths :-/ true, ok.. guess that means i should use nsIPropertyBag2::getPropertyAsInt64. > + if (status == NS_ERROR_DOWNLOAD_NOT_PARTIAL) > > maybe NS_ERROR_NOT_RESUMABLE, to avoid having to make up a new error code? The error code is an internal error code used to communicate with the listener, so that it knows that this is not reason to notify the downloader's observer. I didn't want to reuse an error code that could be generated by other modules in Necko.
patch with revisions suggested by biesi: * switched to using nsIPropertyBag2 to get a 64-bit valued content length * updated comments in nsIIncrementalDownload.idl
Attachment #200278 - Flags: approval1.8rc1?
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Attachment #200278 - Flags: approval1.8rc1? → approval1.8rc1+
Whiteboard: [needs review cbiesinger, SR bzbarsky]