download manager sometimes thinks that incomplete downloads are complete; cannot resume/retry

RESOLVED FIXED in Firefox 39

Status


defect
P3
critical
RESOLVED FIXED
16 years ago
2 years ago

People

(Reporter: navneetkarnani, Assigned: bagder)

Tracking

(Depends on 2 bugs, Blocks 1 bug, {dataloss})

Trunk
mozilla39
Points:
---

Firefox Tracking Flags

(firefox39 fixed, relnote-firefox 39+)

Details

(Whiteboard: [please read and respect Comment 107] [lame-network], )

Attachments

(5 attachments, 31 obsolete attachments)

141.00 KB, application/msword
Details
40.57 KB, image/png
Details
129.90 KB, application/xml
Details
37.35 KB, application/octet-stream
Details
20.25 KB, patch
bagder
: review+
Details | Diff | Splinter Review
Reporter

Description

16 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040206 Firefox/0.8
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040206 Firefox/0.8

I have a situation where the download is 35MB+ and after 15-20MB the
download manager assumes that the download is complete. This could be due to the
source server terminating the connection or any connection-related problem.


Reproducible: Always
Steps to Reproduce:


Actual Results:  
If I try to download the file again, the download manager overwrites my download
and does it all over again. 

Expected Results:  
I would want a retry button in this case so that I can resume the download from
where it stands today.

Comment 1

16 years ago
Reporter,

WORKSFORME. Resuming is an FTP server feature and, as far as I know, is not
available when downloading over HTTP. The retry button in the download manager
hangs, which I semi-expect. 

As for the overwrite: because we are not downloading from an FTP server, I
expect that.

Comment 2

16 years ago
This is a dupe of either bug 230870 or bug 236514. Reporter can you please dupe
against either one of these? If not, please explain in some more detail how this
is a separate bug.

Comment 3

16 years ago
The issue here is that Firefox can't tell the difference between a prematurely
terminated download (i.e. when your connection drops half way through) and a
finished download. It needs to distinguish between the two and offer a retry
option on failed downloads, though perhaps this problem will just go away with
bug 230870.

Comment 4

16 years ago
Are you sure about this? Using a recent nightly build, I started a large
download and unplugged my ethernet cord halfway through, and Firefox said the
download had failed, not completed.

Can you reproduce this with a newer build?

Comment 5

16 years ago
(In reply to comment #4)
> Can you reproduce this with a newer build?

I've stopped using the download manager over the last few months, so perhaps this has
been fixed since 0.8. I'll start using it again and update this bug if I see the
problem (unplugging my ethernet cable gave the same results you got).

Comment 6

15 years ago
I've been using the download manager since my last comment and I've seen the
problem that I described in comment 3 several times. Last time I was downloading
a 4MB file, but at about 400KB the file stopped downloading and the DM showed it
in the finished state (i.e. with the "show" link instead of the "retry" link).
When I clicked on the original link on the webpage again it started downloading
but immediately leapt up to 400KB before continuing at a normal rate.

I can't believe the reporter and I are the only people who've experienced this
bug, but I can't find any similar reports.
Reporter

Comment 7

15 years ago
I have faced situations similar to yours while downloading NetBeans off their
server http://www.netbeans.org

But the only difference has been that if I try to re-download the file, it
starts from zero instead of the 20MB that has already been downloaded.

Comment 8

15 years ago
Is this still a problem in recent nightly builds? Please do test with clean
profiles and no extensions. Thanks.

Comment 9

15 years ago
Confirming on: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a)
Gecko/20040504 Firefox/0.8.0+

Updating summary.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: download manager does not allow retry after download competles → download manager sometimes thinks that incomplete downloads are complete; cannot resume/retry

Comment 10

15 years ago
i got the same while downloading 
http://web.archive.org/web/20021203213324/www.ctl.sri.com/escot/dist/escot_1_0_2/InstData/Windows/VM/Install_Escot.exe
it just thinks that file download is complete, but it isn't.

Comment 11

15 years ago
I would like to confirm that I have the same problem.
My setup is Windows98SE, Firefox 0.9.1... P200 56MB ram...
I have the download manager enabled (to appear).
I frequently download a number of files at once.

Often one or two of my downloads are incomplete, but Firefox says they completed
successfully. Frustrating later when I go to use them. For anything but very
small files that I will check right away, I don't even use Firefox to download.
It's a hassle to copy/paste to a prompt and use wget or FileZilla though...

Comment 12

15 years ago
I have also experienced this problem downloading files from 
http://www.mp3.qo.pl/ Some (not all) files will start and then show as completed
in the DM. In the most recent example, the file should have been 17 MB but was
only 3.4 MB when Firefox showed it as complete. I am using a recent
(Sep 1 2004) nightly build on a new XP installation.

Comment 13

15 years ago
This happens to me even when downloading a local file.  If I am viewing a local
directory and click on a large file it starts to download (copy) the file to my
download directory.  The download will abort partway through but the download
manager says it was complete.  Since this happens with a local copy, the problem
must involve more than just dropped network connections.  I am using version
0.9.3 on x86_64 Linux.

Comment 14

14 years ago
Possible duplicate of bug 287932?

Comment 15

14 years ago
The longer the file, the bigger the chance of seeing the problem; see bug
230451 for a better description of the problem.
Firefox 1.0.3 still has this annoying bug, which finally sent me back to Mozilla.
*** Bug 291665 has been marked as a duplicate of this bug. ***

Comment 17

14 years ago
I also have this problem; it seems to happen when there is a connection problem
between me and the download host.

It seems some hosts will drop the HTTP transfer but end the TCP session cleanly.
Firefox won't recognize that the total bytes and bytes received are not the same
and doesn't reconnect to finish the transfer; the result is that files of any size
(I've had files as small as 200k not finish) are marked completed when they aren't.

If the TCP session doesn't finish cleanly, Firefox will behave correctly and won't
mark the download as complete (although Firefox doesn't seem to want to resume
unfinished downloads...).

This can be replicated by setting up, say, a web server, starting a download from
it, and pulling out the network cable: Firefox will just sit and wait.

But if in the same scenario you kill apache instead of pulling the network cable,
the TCP connection will finish cleanly (well, more cleanly :P ) and Firefox will
assume the download has finished.

I assume this happens in real life when the web server you are downloading
from kills your HTTP session for some reason (i.e. TCP still finishes cleanly).

This problem has been in every version of Firefox I have used, from 0.9 to 1.0.4.

Comment 18

14 years ago
Just tested this bug using:

Test Environment
----------------

Client: latest Firefox (nightly build, 8th Jun 2005) in a clean install of
Windows XP (with all updates installed).

Server: Debian Sarge running apache and squid (with all updates installed).

Results
-------

FF thinks the file has successfully downloaded if you are proxying through
squid and kill squid. (HTTP doesn't finish cleanly; TCP does.)

FF thinks the file has successfully downloaded if you are downloading from
apache and kill apache. (HTTP doesn't finish cleanly; TCP does.)

FF thinks the file has NOT successfully downloaded if you are downloading
from apache and pull the network cable out of the server. (Neither HTTP nor
TCP finishes cleanly.)
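The clean-FIN failure mode described above can be reproduced without killing real daemons. The following is a minimal sketch (Python; the server, port choice, and header parsing are illustrative, not the actual download-manager code): a one-shot server advertises a Content-Length larger than the body it sends, then closes the connection cleanly, so a naive read-until-EOF client sees no error at all.

```python
import socket
import threading

def serve_truncated(srv, advertised_len, actual_len):
    """Accept one connection, send an HTTP response whose body is shorter
    than its Content-Length, then close cleanly (TCP FIN, no RST) --
    mimicking the kill-apache / kill-squid cases above."""
    conn, _ = srv.accept()
    conn.recv(4096)  # read and ignore the request
    head = f"HTTP/1.0 200 OK\r\nContent-Length: {advertised_len}\r\n\r\n"
    conn.sendall(head.encode() + b"x" * actual_len)  # short body
    conn.close()  # clean shutdown: the client sees EOF, not an error
    srv.close()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))       # any free port
port = srv.getsockname()[1]
srv.listen(1)
t = threading.Thread(target=serve_truncated, args=(srv, 1000, 400))
t.start()

# Naive client, like the old download manager: read until EOF and call
# whatever arrived "done".
c = socket.create_connection(("127.0.0.1", port))
c.sendall(b"GET /file HTTP/1.0\r\nHost: localhost\r\n\r\n")
data = b""
while chunk := c.recv(4096):
    data += chunk
c.close()
t.join()

head, _, body = data.partition(b"\r\n\r\n")
advertised = int(head.split(b"Content-Length: ")[1].split(b"\r\n")[0])
received = len(body)
# EOF arrived with no socket error, yet only 400 of 1000 bytes came
# through; comparing received against Content-Length exposes the truncation.
print(advertised, received)
```

The point of the sketch is that the transport reports nothing wrong in this scenario; only the Content-Length comparison can tell a truncated transfer from a finished one.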

Comment 19

14 years ago
(In reply to comment #18)
> Just tested this bug using ...

Do you know if the same thing happens with the Mozilla suite, or is this Firefox
specific?
Firefox: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2)
Gecko/20050609 Firefox/1.0+ ID:2005060915

Apache: 2.0.47-1.9.91.mdk (under Mandrake Linux 9.2)

Attempting downloading of a large file, server returns headers of:

HTTP/1.1 206 Partial Content
Date: Thu, 09 Jun 2005 23:55:48 GMT
Server: Apache-AdvancedExtranetServer/2.0
Last-Modified: Tue, 18 May 2004 02:08:42 GMT
ETag: "3ff4-637a8f0-6bbcde80"
Accept-Ranges: bytes
Content-Length: 102105285
Content-Range: bytes 2205739-104311023/104311024
Content-Type: application/x-executable

After a few hundred KB downloaded I start killing off httpd processes using
"kill -KILL <pid>". After a number are gone, the download stops. Once, I got a
successful download, despite the download obviously not being complete. Another
time I did get an error saving the file, but it was the error I get all the time
with successful downloads anyway, I think there is something up with my disk on
that front.
Suite also believes it downloaded fine. In fact in the transferred column it
reckons it got all 101MB.
Same here. I too experience the defect, even with Firefox 1.0.4. Nothing
has changed. When the connection fails (slow connection, route not available,
etc.), Firefox simply marks it as completed!

It doesn't even allow you to retry the failed download, nor does it show the
actual URL from where it downloaded. 

At the least, irrespective of its opinion on the state of the download, it should
allow a re-download!

Updated

14 years ago
Assignee: bugs → darin
Component: Download Manager → Networking: HTTP
Product: Firefox → Core
QA Contact: aebrahim-bmo → networking.http
Version: unspecified → Trunk

Comment 23

14 years ago
(In reply to comment #19)

> Do you know if the same thing happens with the Mozilla suite, or is this
> Firefox specific?

I used Mozilla for quite some time before FF and had only occasionally damaged
downloads. With FF in the beginning I didn't think the problem would be FF
specific but with repeated occurrences I got more and more annoyed. Since I went
back to Mozilla I haven't had a single broken download yet (I am very attentive
to this problem now :-)

Comment 24

14 years ago
I have experienced this problem in 1.04, 1.03, and 1.0Pre, but I can't remember
farther back.

When downloading a large file (usually ~600MB .iso files on my LAN), the
download will start around 45000K, level out around 11000-12000KB/sec, and then
fail. It can fail at 30MB, 40MB, 50MB... 130MB... When it fails, I do not
receive any errors; the Downloads dialog just says done. I have not had the
problem on slower networks (internet), where download speeds are much slower.

On a Windows box, it downloads 1900KB--5000KB/sec. and it hangs (Downloads
dialog does not go away and does not say done).

The server I am downloading from is an IBM x336, 4GB RAM running Apache on SuSe
Enterprise 9. This is connected through a gig Cisco switch. My PCs are on 100MB
FULL Cisco connection. I do not see other issues with the connection, and the
Galeon downloads work through all the same hardware and software.

I have no other downloads in the list, and the Windows box is a fresh install of
Firefox.

I am able to download the complete file in Galeon, consistently. Does anyone
know if this is a bug, or is there a setting I can alter to troubleshoot this? I
have created a new profile, not adding extensions, bookmarks, themes, etc. Same
problem. This is a local LAN with a reliable connection. Both sides are set to
100/Full.

Linux 2.6.9 - 2.6.11.x kernel
Firefox 1.x
P4 3.4Ghz
2GB RAM
root filesystem is standard IDE ext3
2 SATA Raptors in RAID 0 with JFS filesystem for a storage area
Failure happens on both filesystems/drives

Winblows box:
P4 IBM Thinkcentre
512MB RAM
Standard SATA drive
Firefox 1.0Pre

Comment 25

14 years ago
I'm seeing this bug with disturbing frequency with the latest nightly trunk
(Deer Park) build, when trying to ftp the latest Firefox nightlies from the
mozilla.org mirrors. I'm not sure if it's the mirrors, the browser, my network
or what that is causing the termination, but the download manager is fooled into
thinking it was a complete download.

Also on Mac OS X -> All/All.

I'm ignorant of the ftp process, but is it possible that a server can send some
sort of termination signal that the browser incorrectly assumes means that the
download has finished instead of just aborted? Alternatively, is there any
time-out or lost-connection code in the browser that could be incorrectly
signalling a completed download instead of an aborted one?
OS: Windows XP → All
Hardware: PC → All
IIRC the download manager code treats any end of data receiving as a normal end
of file, irrespective of what other code tells it.

Comment 27

14 years ago
*** Bug 296785 has been marked as a duplicate of this bug. ***

Comment 28

14 years ago
*** Bug 214328 has been marked as a duplicate of this bug. ***

Comment 29

14 years ago
*** Bug 295408 has been marked as a duplicate of this bug. ***

Comment 30

14 years ago
The following example fails reliably for me.  I'm accessing it via a 300kbit
cable connection, in case that matters:

http://rochebruneadsl.free.fr/gcc/pdf/gccint.pdf

This is a 2,096,519 byte file.  Mozilla (FF 1.0) downloads about 1.8M and then
the progress bar in the download manager jumps to 100% and reports "done".  The
remaining 200k is not downloaded.

Here is the result of downloading using wget, which is interesting:

$ wget http://rochebruneadsl.free.fr/gcc/pdf/gccint.pdf
--12:09:04--  http://rochebruneadsl.free.fr/gcc/pdf/gccint.pdf
           => `gccint.pdf.1'
Resolving rochebruneadsl.free.fr... 212.27.40.178
Connecting to rochebruneadsl.free.fr[212.27.40.178]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,096,519 [application/pdf]

86% [==============================>      ] 1,813,007     35.32K/s    ETA 00:08

12:09:56 (34.74 KB/s) - Read error at byte 1813007/2096519 (Connection reset by
peer). Retrying.

--12:09:57--  http://rochebruneadsl.free.fr/gcc/pdf/gccint.pdf
  (try: 2) => `gccint.pdf.1'
Connecting to rochebruneadsl.free.fr[212.27.40.178]:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 2,096,519 (283,512 to go) [application/pdf]

100%[+++++++++++++++++++++++++++++++=====>] 2,096,519     31.18K/s    ETA 00:00

12:10:05 (32.37 KB/s) - `gccint.pdf.1' saved [2096519/2096519]


So it looks like the server drops the connection after 1813007 bytes.  wget
detects this, and retries with a "partial content" request [re comment #1, HTTP
certainly does support partial downloads], eventually getting all content, while
Mozilla erroneously reports it "done".

For comparison, IE6 (under Wine) downloads 76% of the file and then reports
"Internet Explorer cannot download gccint.pdf from rochebruneadsl.free.fr.  The
connection with the server was reset."  Opera 8 gets to the same point and shows
"Error"; it is then possible to select "Resume", and it will complete.
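The wget transcript above shows the retry loop this bug is asking for: on a reset, re-request from the current offset with a Range header and accept a 206 Partial Content reply. A minimal in-process sketch of that loop (Python; `FlakyServer` is a made-up stand-in for a server like the one above that drops connections part-way, not real networking code):

```python
class FlakyServer:
    """Simulated server that cuts every connection off after at most
    `per_conn` bytes, but honors resume offsets the way a Range-capable
    HTTP server answers 206 Partial Content."""
    def __init__(self, payload, per_conn):
        self.payload = payload
        self.per_conn = per_conn

    def get(self, offset=0):
        # Return (bytes served, connection_was_reset) for a request
        # starting at `offset` -- i.e. "Range: bytes=<offset>-".
        chunk = self.payload[offset:offset + self.per_conn]
        reset = offset + len(chunk) < len(self.payload)
        return chunk, reset

def wget_style_fetch(server, expected_size, max_retries=10):
    """Keep re-requesting from the current offset, as wget's log above
    shows, until the byte count matches the expected size."""
    data = bytearray()
    for _ in range(max_retries):
        chunk, reset = server.get(offset=len(data))
        data.extend(chunk)
        if not reset and len(data) == expected_size:
            return bytes(data)
    raise IOError(f"gave up at {len(data)}/{expected_size} bytes")

payload = bytes(range(256)) * 100           # a 25,600-byte "file"
srv = FlakyServer(payload, per_conn=7000)   # resets after ~7 KB
out = wget_style_fetch(srv, len(payload))   # takes 4 connections
```

The design point is the same one comment 38 makes later: detecting the short read is only half the fix; resuming from the recorded offset is what turns a reset connection into a completed download.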

Comment 31

14 years ago
*** Bug 304985 has been marked as a duplicate of this bug. ***

Comment 32

14 years ago
*** Bug 301595 has been marked as a duplicate of this bug. ***

Comment 33

14 years ago
*** Bug 311057 has been marked as a duplicate of this bug. ***

Comment 34

14 years ago
What's even more confusing, the download window (be it the progress window or the download manager) incorrectly reports that the expected number of bytes has been downloaded.

In fact, this problem persists since Mozilla 1.0 and apparently nobody ever bothered to report that.

Comment 35

14 years ago
Posted patch patch v0 (obsolete) — Splinter Review
Check the final progress notification against the expected file size to determine whether the download was successful. I think this should solve the problem.
Attachment #210862 - Flags: review?(neil)
Comment on attachment 210862 [details] [diff] [review]
patch v0

Sorry, I can't tell if this approach is reasonable.
Attachment #210862 - Flags: review?(neil)
I'm not sure that this should be done in frontend code... necko should be able to detect this. or, if not necko, then it should be exthandler/webbrowserpersist.

Comment 38

14 years ago
(In reply to comment #35)
> Created an attachment (id=210862) [edit]
> patch v0
> 
> Check the final progress notification versus the expected file size to
> determine if download was successful. I think this should solve the problem.

This patch implements the minimum level of support that Moz should have; it just gets us what IE does, i.e. reporting an error.  All that the user can do is retry, and the download will probably then fail again in the same way.

It would be better to implement a "resume" button as Opera does, or to automatically resume as wget does.  See comment #30.  This needs to use the HTTP range mechanism to continue from the point where the previous fetch stopped.

--Phil.

Comment 39

14 years ago
Although this is a little off subject, are there any known plans to update Firefox's Download Manager completely?  It sure could use it; I am having this problem too, as are many others.

Comment 40

14 years ago
I filed a bug report that appears to be a duplicate of this, bug #329895. I'm sure I've seen it in Mozilla suite, and today in a current Seamonkey tinderbox on Win98SE and 56K dialup. The possibility of timing out has often occurred to me. Does Mozilla or server software calculate estimated time of a download based on connection speed? If so, all Mozilla versions I've seen may be sending over-estimated speeds to servers.

An excessive speed estimate (or too-short time estimate) could happen because the speed isn't calculated until the user enters info in the save-as dialog box, but the download is progressing during that time into the system temp folder. After the dialog box data is entered, THEN the rate is calculated based on the data already received but not on the corresponding time interval. It's based on maybe one second of downloading after the dialog data is entered. So the rate always starts substantially higher than the actual instantaneous rate and declines as the download progresses. The estimated time is initially too small and gradually approaches a realistic estimate as the rate falls.

If servers use the initial rate or time estimate from the recipient, it could under-estimate the time, perhaps seriously, and treat the realistic download time as a stall and kill the connection.

Correctly calculating download speed and estimated time based on an instantaneous rate instead of the faulty present method might help to fix dropped downloads.

Mozilla/Seamonkey/Firefox should also compare the filesize on the server to what's been received to indicate whether a download is complete or failed.
> If servers use the initial rate or time estimate from the recipient, it could
> under-estimate the time, perhaps seriously, and treat the realistic download
> time as a stall and kill the connection.

servers don't know the time estimates from the client...

> Mozilla/Seamonkey/Firefox should also compare the filesize on the server to
> what's been received to indicate whether a download is complete or failed.

That's what this bug is about, as I understand it.

Comment 42

14 years ago
*** Bug 329895 has been marked as a duplicate of this bug. ***

Comment 43

14 years ago
(In reply to comment #39)
> Although this is a little off of subject, are there any known plans to update
> Firefox's Download Manager completely?  It sure could use it, I am having this
> problem, too, and many others.

According to the Firefox 2 wiki pages ( http://wiki.mozilla.org/Firefox2/Features ), there was a "pluggable download manager" scheduled for that release. But it now has a line through it, so I assume it's been postponed.

Could we get at least minimal checking in for the Firefox 2 branch (i.e. something based on the current patch), and worry about a better fix later on the trunk? I assume 1.8.1 is the branch leading to Fx 2?
Flags: blocking1.8.1?

Comment 44

14 years ago
*** Bug 330794 has been marked as a duplicate of this bug. ***

Comment 45

14 years ago
A new twist: using Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.9a1) Gecko/20060309 SeaMonkey/1.5a, I had an apparently unsuccessful download that nevertheless unzipped OK. This SeaMonkey's download manager reported the file size to be 11054KB, called the download done at 11052KB, and it unzipped with no errors anyway. Why would there be a 2K discrepancy between the original file size and the bytes downloaded?

Comment 46

14 years ago
The 2K discrepancy is probably due to the download manager missing the final notification. This patch should address that issue as well.

Updated

14 years ago
Attachment #210862 - Flags: review?(darin)
note comment 37. I'm not sure this is the best way to fix this.

Comment 48

14 years ago
Comment on attachment 210862 [details] [diff] [review]
patch v0

Delegating first review to cbiesinger.
Attachment #210862 - Flags: review?(darin) → review?(cbiesinger)
Comment on attachment 210862 [details] [diff] [review]
patch v0

ok then, review-. imo this should be done in necko or exthandler, which should report an error to the frontend.
Attachment #210862 - Flags: review?(cbiesinger) → review-

Comment 50

13 years ago
Not going to block 1.8.1 for this, but "we'll" happily consider a trunk patch once it has baked.
Flags: blocking1.8.1? → blocking1.8.1-
Priority: -- → P3
Target Milestone: --- → mozilla1.9alpha

Comment 51

13 years ago
I've had this occur frequently when pausing, then disconnecting my dial-up. Upon reconnecting, pressing the resume button causes firefox to mark the download as complete, with no option other than to delete the truncated file and start again. FF 1.5.0.7 on WinXP SP2.

Updated

13 years ago
Duplicate of this bug: 322200

Comment 53

13 years ago
I still have this problem using:
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9

I almost gave up on the FF downloader for big/long downloads. I find wget more reliable when dealing with difficult downloads that keep failing:
wget detects broken/failed/incomplete downloads and has no problem resuming them.

...something I would expect from the FF download manager since v1.0, but it's still not there. It's "stupid" to have this super browser with such weak download-managing code. :(

Some of the Firefox plug-in download managers I've tested also have this problem (I guess because all of them simply use the same code built into Firefox, the one with this problem).

I found another open-source app built on the Mozilla code-base with this same problem of incomplete downloads marked as complete: Participatory Culture's Democracy RSS video player.
I stopped using it for this same reason. Lots of incomplete video files downloaded as if they were complete (and then unplayable, of course).

I'm thinking of upgrading to FF 2.0 to see if this got solved, although I'm afraid this bug is still unfixed :/

Comment 54

13 years ago
(In reply to comment #53)
> I'm thinking of upgrading to FF 2.0 to see if this got solved, although I'm
> afraid this bug is still unfixed :/

The bug still exists in Fx 2.0, or it'd be marked as closed.

We seem to have a better download manager... in the Software Update code. Could that be harnessed for this?
Duplicate of this bug: 369231

Comment 56

13 years ago
I'm another surprise victim of DM confusion.

I was downloading three large files simultaneously, and decided to suspend two of them so that I'd get one completed sooner. When it finished and I went to resume the other two, they both downloaded a bit more successfully, but then stopped and were marked complete. The DM knew how many bytes to expect, and the result was well short, yet it didn't note the error.

So from my perspective:

1. If the final size doesn't match the expected size, this should not be reported by default as a successful transfer.
2. If DM knows or suspects that it cannot successfully suspend and resume the transfer, then either the Pause option should be removed, be greyed out, or require the user to confirm intent despite the risk.
3. The DM should learn how to resume a download more reliably.

Comment 57

13 years ago
I've encountered this problem numerous times using Firefox 2 on a WinXP machine. The download seems to complete, but when you try to install the application you just downloaded, it fails because the files are incomplete.

Comment 58

13 years ago
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070222 SeaMonkey/1.1.1

I just updated from Seamonkey 1.0.7 (I think).  I do NOT use the Download Manager.

With 1.1.1, if I download a file, the normal dialog box appears with "Close", "Launch File", "Show File Location" and the percent complete.

However, if I try to download it again, it is "instantly" "downloaded", which is impossible for such a huge file (10MB).  The file is not actually being downloaded but is instead coming from some hidden cache.  This is the case regardless of whether the "Save As" file name is kept the same and the file is overwritten, the file was removed just prior, or the name is changed.

The previous version did NOT react this way.  If one repeated a download, it would actually download the file.

This is a problem particularly if the source file one is trying to download has the same name but a different location from the previous one.

How does one get a fresh copy downloaded with 1.1.1 now?  Obviously something has changed.

thanks,
rich

Comment 59

13 years ago
Hi Rich,

Guessing here, but did you ensure that there wasn't a hidden/temp file in your download directory, and also that the browser cache was emptied?

Comment 60

13 years ago
Several things- 

1. I have been downloading a zillion files previously with several years' worth of Mozilla suite versions and then a previous version of Seamonkey.  I have never had to clear the cache and I have never used the Download Manager.  I have never had this problem before.  I have at times cleared the cache for other reasons, though.

2. I uninstalled the previous version of Seamonkey prior to installing Seamonkey 1.1.1.  Then I almost immediately went off to do these other downloads that exposed this problem.

3. I'm not sure what you mean by a hidden or temp file in this circumstance.  I was saving an exe for zone alarm free.  Also something I have done a lot without this kind of problem.  This filename was ordinary and the directory I was saving it to was nothing special.  This dir was something like c:\d\software\updates\mozilla.  Nowhere near places that may have temp files or even program-related files.

4. I have my Windows xp pro as I have had all of my **** windows systems since the beginning of time set to display all files and display all extensions.  I don't see any temp files in these dirs.

5. Are you saying that every time after I download a file I have to clear the cache?  If so, there is no question that this is a bug.

This happened just after installing 1.1.1 following an uninstall of the previous version (maybe 1.0.7, I can't remember).  Stands to reason that the new version is the culprit; it is what changed.  I have used this procedure dozens of times with Mozilla and now a few times with Seamonkey, and this is the first time I have had this problem.  I have not exhaustively tested the combinations that are possible with the un- and reinstall followed by repeated downloading, so I can't tell the cause nor the entire scope.

thanks,
rich

Comment 61

13 years ago
My suggestions were intended to try to find a work-around for you until this bug was fixed, and did not imply that this isn't a real bug, or that (even if they did work) my suggestions were a satisfactory long-term solution.

I have had the problem in the past where the browser continued to grab a truncated cached download instead of starting afresh, but it's not something that happened very often... usually only after certain bad/truncated downloads. From memory, emptying the browser cache got around the problem in those rare cases, but it's been a while since that happened and it's not something I can reproduce at will to test again. As for the "temp file" possibility, if the browser hasn't placed a temporary download file in the same location as the intended final download, then  that's clearly not an issue.

Comment 62

13 years ago
Those instant downloads might be coming not from a temp file but from a buffer in RAM. I've seen this too, I think, though not recently. Haven't tested 1.1x much.

Comment 63

12 years ago
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3

This error still occurs. It just happened to me when a "network unplugged" error occurred on either my NIC or my router. As soon as Windows displayed this error in the system tray, my 70MB download (which was only at 20MB so far) immediately "completed" with no option to retry.

It is really sad that this has not been fixed since 2004. Please upgrade this to critical. Not sure why you developers think it is okay for a download manager to falsely claim that a download has completed successfully.

There are later bug reports which seem to be duplicates of this one, such as #253328 and #262179.

Comment 64

12 years ago
It seems like an old-fashioned flow chart ought to be able to describe the steps to perform in a download, and then one could make sure that the ifs, thens, elses and case structures to perform the steps are correct, so there's no dropthrough to the end of the function with a false report of a successful download.

I saw this bug once in the past week, though I forget whether it was a 1.5a, 1.1x or 1.0x version.

Comment 65

12 years ago
-> reassign to default owner
Assignee: darin.moz → nobody
This is still an issue on Firefox trunk; the download manager often reports files as DONE instead of FAILED in cases like:
- corrupt download
- interrupted download
- invalid HTTP auth credentials
- nonexistent remote file

IMHO this should get on the radar before the 3.0 release.

Comment 67

12 years ago
(In reply to comment #66)
> - interrupted download

The suggested patch should at least take care of this particular issue (the most common one that I get on the few occasions where I still try to use the Firefox downloader).

(patch v0   (5.39 KB, patch) 2006-02-06 04:45 PDT, Son Le)
[...]
-      mPercentComplete = 100;
+      // If file sizes don't match, mark the download as failed.
+      if (LL_NE(mCurrBytes, mMaxBytes))
+        mDownloadState = FAILED;
[...]


Also, when dealing with an "unknown size" file download (mainly FTP?), couldn't it try to LIST the file and parse the size? (Erm... if that's how it's done currently, forget that I ever wrote this paragraph, please ;)
(Note to self: take a closer look at nsDownloadManager.cpp)
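The decision the quoted patch fragment makes can be sketched as a tiny state function (Python; the state names and the -1 sentinel for "unknown size" are illustrative, not the actual nsDownloadManager constants): on the final progress notification, mismatched sizes mean FAILED rather than DONE.

```python
# Illustrative stand-ins for the download-manager states in the patch.
FINISHED, FAILED = "FINISHED", "FAILED"
UNKNOWN_SIZE = -1  # e.g. some FTP listings report no size

def final_state(curr_bytes, max_bytes):
    """Decide the final state from the last progress notification, as the
    patch's size comparison (LL_NE(mCurrBytes, mMaxBytes)) does: equal
    sizes mean FINISHED, a shortfall means FAILED."""
    if max_bytes == UNKNOWN_SIZE:
        # With no expected size there is nothing to compare against; the
        # best available signal is a clean EOF.
        return FINISHED
    return FINISHED if curr_bytes == max_bytes else FAILED

print(final_state(400, 1000))           # a truncated transfer -> FAILED
print(final_state(1000, 1000))          # sizes match -> FINISHED
print(final_state(1234, UNKNOWN_SIZE))  # unknown size -> trust EOF
```

As the comment above notes, the unknown-size branch is the weak spot: without a size from the server (or a LIST), truncation is undetectable by this check alone.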

Comment 68

12 years ago
Does anyone know if this patch is safe to merge into the 1.8 branch?  Going to review my local tree; the initial dry-run partially failed, but I'm experiencing similar issues when doing multiple (>2) downloads from the same server with potentially throttled CPUs, etc.  Thanks.

Comment 69

12 years ago
I can confirm that manually merging this patch into my 1.8 branch builds gives a much better experience and tags downloads much better than the previous "Completed" status that was stamped on failed files.

Comment 70

12 years ago
Posted patch updated 1.8 Branch patch (obsolete) — Splinter Review
Attachment #284191 - Flags: review?(cbiesinger)
Comment on attachment 284191 [details] [diff] [review]
updated 1.8 Branch patch

I don't own download manager and already said that I don't like this approach (comment 37/comment 49)
Attachment #284191 - Flags: review?(cbiesinger)

Comment 72

12 years ago
I'm getting this on Camino, reported as Bug 420318. At the very least the download manager needs to provide both the expected size and the final size, regardless of the cause of the discrepancy.

If the file size is unknown it obviously can't do anything about the problem, but if it is known, it should report it.

Comment 73

11 years ago
Firefox 3b5 still has this issue.

I'm using it behind a (very) slow proxy, and often a file is displayed as finished with only a few bytes downloaded, even though the download manager displayed the correct size while estimating the download duration.

I don't see any difficulty in adding a warning after comparing the two sizes.

Comment 74

11 years ago
Firefox 3 has this bug.

Is it so hard to fix? Just ensure:

expected_file_size == actual_file_size

If they are not equal, then mark the download as broken.

Comment 75

11 years ago
You had almost convinced me that it was my proxy that was at fault, but I've had it happen when there was no proxy involved.

I'd also like to note that it claims the download was the *expected* size, not the *actual* size.

If the expected size doesn't match the actual size, you should get a warning, *and* you should get an option to continue the download.

I'm pondering some kind of extension to use curl or wget instead of the browser's own downloader.

Comment 76

11 years ago
I used Firefox 3.0 (final) to download files from Gmail.

When I started the download, the expected file size was ~15 MB. After a while I found the file in the Download Manager with a size of ~7 MB. No warnings, no messages, no difference whatsoever from a correct download (and no options to retry or continue).

Isn't this exactly this bug?
Duplicate of this bug: 444447
Duplicate of this bug: 448364
Duplicate of this bug: 453359
Duplicate of this bug: 464391

Comment 82

11 years ago
Experienced this bug today running SM 1.1.13 for Linux/i586.

Comment 83

11 years ago
Experienced this bug today running SM 1.1.13 for Linux/i586. The download was an SM 2.0a2pre nightly, 13 MB; it was reported complete at the indicated file size, but the actual file was a partial 9 MB.
Last I knew, seamonkey was not using the toolkit download manager UI.  As a result, I don't think you want this bug.

Comment 85

11 years ago
I just noticed: This bug is still there with Firefox 3.0.5 (WinXP).
It's true: the bug persists in the Portuguese Firefox 3.0.5 on Portuguese Windows Vista.
The Download Manager doesn't detect the interruption/break of the download and shows it as COMPLETED... :-(

BUT... if you DELETE the saved file (target), keep the file's history in the Download Manager, and open the same download address (URL) in the same session, the file is resumed(?) and the download CONTINUES from the last part saved...

Why can't the FF Download Manager detect the break and do these steps itself? :-(

Thanks.
Morais

Comment 87

11 years ago
Needs fixing, and seems like necko is the right place for the logic.  Putting high on my list of bugs.
Assignee: nobody → jduell.mcbugs
Status: NEW → ASSIGNED
Target Milestone: mozilla1.9alpha1 → ---

Comment 88

10 years ago
Any news on fixing this? Bugs tend to become forgotten.

Comment 89

10 years ago
It's still on my list.

Comment 90

10 years ago
Not sure if this should come under the same bug, but I've found that some Netgear routers have a 'Trend security' feature which, when enabled, produces the same problem in Firefox. Other browsers don't seem to have a problem when this is enabled though.

Comment 91

10 years ago
I recommend Sun Download Manager (free at java.sun.com) as a workaround.

Updated

10 years ago
Duplicate of this bug: 458580

Comment 93

9 years ago
Well, I signed up to put in an RFE to allow restarting. This is a bit off-topic, but I feel it is part of this issue.

Looking through issues and other stuff over the last 24 hours has shown me that the Download Manager should be able to detect a bad/incomplete download and resume it.  The file size is stored in the sqlite database according to the schema.

To add to this issue: with IP, copyright and licensing concerns, it can be a problem to download files from various sources without going through a proxy. The issue is that some programs or sources are larger than the proxy will allow to download, or download so slowly that the proxy drops the connection. The result is starting over.

I did a quick search and found that the Download Manager can pause and restart through a business's proxy server as long as it gets the correct information for:

entityID  
TotalEntitySize
mResuming
ContentLength()


in

nsIDownloadManager, nsIDownload, nsIDownloadProgressListener nsIDownloadManagerUI, nsIResumableChannel

As the ability to pause between sessions is there, the ability to resume a failed download should be possible using the same idea: store the size, file name, blocks completed and other information in the download schema.

The only issue I see is the referrer reference.  If using a proxy, the referrer link could be different.

The added ability to save the *.part file and then restart would be nice.

The Download manager would have to not close out the temporary files when they are not complete and flag them in the UI as incomplete.

I have just started to look at the code on Linux, but I am not a programmer (not with much experience anyway) and I have never worked with C++, though I am trying to learn.

In general, incomplete downloads should be marked as such if the expected size is not met. I have seen downloads listed at 90 MB where, after 5 MB was downloaded, the proxy shut the link and the download manager marked the file as complete. This is on Linux with 3.5.8 from last night.

Fixing the resume for corrupted or stopped downloads will, at least as I see it, fix this bug as well.  I am going to examine the source code further.
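The session-resume idea described in this comment reduces to an HTTP Range request built from the stored download metadata. A rough sketch under those assumptions (Python for illustration; the function name and metadata layout are invented, this is not the actual nsIResumableChannel code):

```python
def resume_request_headers(part_file_size, entity_id=None):
    """Build HTTP request headers to resume a download from a *.part file.

    part_file_size: bytes already on disk (stored in the download schema).
    entity_id: an ETag or Last-Modified value saved when the download
    started; sent as If-Range so the server falls back to a full response
    if the remote file changed. Both names are illustrative.
    """
    headers = {"Range": "bytes=%d-" % part_file_size}
    if entity_id is not None:
        headers["If-Range"] = entity_id
    return headers
```

The If-Range guard matters for the proxy scenario discussed above: if the entity behind the URL changed, appending new bytes to the old partial file would corrupt it.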

Comment 94

9 years ago
Hi Robin, 
this is great, thank you for your interest. 
I did not catch the only thing - referrer problem:
"The only issue I see is the referrer reference. If using a proxy, the referrer link could be different."

Regards,
Milan

Comment 95

9 years ago
I am not 100% sure, but if my knowledge doesn't fail me, the referrer would be the link for the file. When using some web proxies, the link for the file changes on each access due to the way some public proxies function.

I have not had time to check the schema and test this so I am hoping my aging brain is correct.

referrer TEXT 	The referrer for the download. 

Just looking at the schema and I see this also.

source 	TEXT 	The source URI for the download. 

Now I am talking public proxies similar to what you would find from here.

  http://www.publicproxyservers.com/

It may have to be an option to resume a proxied download but I will have to think about it some more.

Comment 96

9 years ago
Okay.  This is the reference that I have started with.

https://developer.mozilla.org/en/The_Download_Manager_schema

This refers to downloads.sqlite in Firefox's home directory.  The data stored in this file is as per the schema.  I have compared it with actual downloads.

I am going to test with proxies on the same file later.

The database does do what I expected. The "referrer" is the URL for the page you click to download. The "source" is where the file came from.

The "source" is going to be the problem when working through proxies, as the addresses for the "source" can, and in many cases will, change. We need some way to select the *.part file that goes with the source we want, using the "target" to help confirm the correct file.

I will post more after I have done some proxy tests.
Why would proxies affect what the source or the referrer is?

Comment 98

9 years ago
First, I am talking about public proxies.

Some proxies will change the URL for the source site on each access.  It is part of the encoding that the proxy uses.  I was doing some testing over the last couple of nights and I would enter a link and the source was different each time it accessed the download site.

I didn't check the referrer that much in detail yet.  Maybe tonight.

Some proxies don't even keep the source file name, judging from my tests over the last couple of nights. One I used last night changed the file name to browse.php; another used init.php.

I was thinking that if the download manager checked the reported file size when the download terminated and then saved the file as a *.cor file, indicating corruption, it could allow a way to access the downloaded part. I know that the file size is transmitted on many downloads, but if the file is not complete, the download manager doesn't indicate it as incomplete. This would fix the original complaint about not flagging partial downloads as incomplete.

I tried to look through the code last night, but in the short time I had, I didn't find the part that saves the *.part file during the download.

Updated

9 years ago
Duplicate of this bug: 203331

Comment 100

9 years ago
I just found out this bug I reported back in 2003 is still alive and well. I suppose the fact that it becomes rarer as users' internet connections improve (over these 7 years I've gone from 56k up to 20 Mbps fiber) will make it very unlikely to get fixed.

Comment 101

9 years ago
(In reply to comment #100)
> I just found out this bug I reported back in 2003 is still alive and well. I
> suppose the fact that it becomes rarer as users' internet connections improve
> (over these 7 years I've gone from 56k up to 20 Mbps fiber) will make it very
> unlikely to get fixed.

There will always be those with unreliable connections, even if they are fast. Plus, as connections get faster, people download bigger files.

I'd think that such an annoying bug being nearly seven years old would make fixing it more of a priority. Can't be good for the image.

Comment 102

9 years ago
(In reply to comment #101)
> There will always be those with unreliable connections, even if they are fast.
> Plus, as connections get faster, people download bigger files.

The last time I saw this bug happen was a couple of days ago, when a friend of mine was downloading some drivers using the Firefox 4.0 nightly over a 3G link.

The installer would just fail with a "not valid archive" error, but of course we both knew what was going on, so we just used a download manager to finish the download properly. Most users would just re-download the 100 MB+ file (of which half was already on the HDD) without even realizing what exactly happened.

Don't get me wrong, I would love to see this bug fixed. When I had very poor internet service, this bug would happen on 90% of my downloads, but after 7 years I'm not holding my breath anymore. Every Firefox user I know, myself included, has just learned to live with it by keeping a download manager around and not trusting the Firefox download manager.

Comment 103

9 years ago
I use a mobile connection. This topic is still relevant after seven years.

Comment 104

9 years ago
I cannot understand how this behavior can be considered acceptable. If an operation as important as downloading fails, that HAS to be reported to the user.

And it's not hard to fix. Firefox knows the size of the download. It knows it got less than the anticipated size. Even if a proxy hides network errors, it has all the information it needs to flag the download as incomplete.

Comment 105

9 years ago
(In reply to comment #104)
> And it's not hard to fix. Firefox knows the size of the download. It knows it
> got less than the anticipated size. Even if a proxy hides network errors, it
> has all the information it needs to flag the download as incomplete.

Exactly, like suggested in Comment 67
https://bugzilla.mozilla.org/show_bug.cgi?id=237623#c67

(patch v0   (5.39 KB, patch) 2006-02-06 04:45 PDT, Son Le)
[...]
-      mPercentComplete = 100;
+      // If file sizes don't match, mark the download as failed.
+      if (LL_NE(mCurrBytes, mMaxBytes))
+        mDownloadState = FAILED;
[...]

and even for FTP downloads of unknown size, it could pre-parse a LIST to get the size.

Comment 106

9 years ago
BTW, this bug and JavaScript speed were 2 of the main reasons I changed to Chrome, a long time ago.

...and I'm still not sure now, but I suspect that even Chrome has an issue like this. It's harder to check with today's larger broadband connections.

Comment 107

9 years ago
Not sure when I get to this, so cc-ing some other likely suspects.  It would be great to fix this in FF4 timeframe if possible.

Folks: please don't add any more comments unless they've got technical details that will help with fixing this. Otherwise this bug report just gets harder to read.
Assignee: jduell.mcbugs → nobody

Comment 108

8 years ago
This annoying behavior (major bug) has existed from Firebird 0.7 through Firefox 1, 2, 3, 4, 5, 6, 7 and 8 (the version I'm using now), and today I tested Firefox Aurora 11.0a2. Guess what! Firefox still cannot download files properly; it reports interrupted downloads as "finished" and offers no notification or resume option. Today I tried three times to download Thunderbird from the Mozilla website with Aurora 11. It downloaded 33 MB, then 19 MB, then 1.8 MB, always ended as "finished", and gave me only corrupted files. It's a shame that I have to use a dedicated download program instead of Firefox. BTW: today I was using a slow mobile modem (25 kB/s). Maybe this bug is not noticeable on broadband. Please fix this for good!
Whiteboard: [lame-network]

Updated

7 years ago
Duplicate of this bug: 469837

Comment 110

7 years ago
When a file downloads incompletely, there is no warning from Firefox.
Duplicate of this bug: 607020
Duplicate of this bug: 732645

Comment 113

7 years ago
Firefox should show a warning whenever the actual size of the downloaded file is smaller than the expected file size (regardless of the reason for the download failure).
Comment hidden (advocacy)
Comment hidden (advocacy)

Comment 116

7 years ago
(In reply to Jason Duell (:jduell) from comment #107)
> Folks: please don't add any more comments unless they've got technical
> details that will help with fixing this. Otherwise this bug report just gets
> harder to read.

Apologies for adding to the bugspam, but I wanted to point out that the Downloads Panel extension displays a details panel similar to Opera's, letting you see at a glance whether a download is complete. So it can serve as a workaround to alleviate this problem for the time being.
https://addons.mozilla.org/firefox/addon/download-panel/

Updated

7 years ago
Status: ASSIGNED → NEW
Comment hidden (me-too)

Updated

7 years ago
Duplicate of this bug: 748982
Duplicate of this bug: 773241
Duplicate of this bug: 811443
Duplicate of this bug: 537452

Comment 122

7 years ago
FYI, I came across this article giving workarounds for the problem that sometimes work.
http://maggiesfarm.anotherdotcom.com/archives/21021-Docs-Computin-Tips-Resuming-broken-uploads-downloads.html#extended

Comment 123

6 years ago
This issue has just broken my download of quite a large file from one of those slow file-sharing services with time restrictions, two times in a row, not to mention many other times before. I thought my issue was an exception, but now (having found this bug report) I'm really surprised this bug is well known and, moreover, more than 8 years old; it has been present since the first version. People, this bug makes Firefox almost useless for downloading large files. IMO it's a major issue that needs to be fixed. I can't even just re-download the file to avoid the time penalty; Firefox simply leaves me without any options, marking a 0-byte file as downloaded.
Duplicate of this bug: 840452

Comment 125

6 years ago
still present in 18.0.2.

Comment 126

6 years ago
I believe it's present in 22.0a1 too.

Updated

6 years ago
Duplicate of this bug: 838495

Comment 128

6 years ago
I've had this happen with s3.amazonaws.com. 22.0a1 (2013-03-25)

Comment 129

6 years ago
It happened for 2 out of 5 files today (with s3.amazonaws.com).

Firefox showed me the declared file size before starting to save the file. This means FF has everything it needs to detect an incomplete download: it should just compare two numbers.

Please raise the importance of this bug!

Comment 130

6 years ago
Simply copy the download URL to a separate download program. That is what I have done for ages. Firefox is unable to reliably download files, and for nine years no one at Mozilla has cared.

Comment 131

6 years ago
The improvement that I suggested is very easy to implement. I don't see a reason why it can't be done in the very near future.

I repeat it here:

Firefox should show a warning whenever the actual size of the downloaded file is smaller than the expected file size (regardless of the reason for the download failure).

It will not solve all the problems, but at least the user will be aware of partial downloads.

Comment 132

6 years ago
Somebody with good writing skills, please write an article in a well-known computer publication about Firefox's inability to detect incomplete, failed downloads. This would probably attract the attention of those responsible for FF's development.

And for God's sake, raise the importance of this bug report!
Whiteboard: [lame-network] → [please read and respect Comment 107] [lame-network]
:mak, what do you think about fixing this in the download manager?

That has been rejected in the past in favor of a necko approach, but I'm not convinced that's the right thing to do. Generic HTTP consumers have to be a little more flexible in order to be backwards compatible, but I wouldn't be surprised if the download manager would be successful in enforcing the content-length (as long as there was one; making sure chunked scenarios work OK is important, and I wouldn't enforce the trailing 0-chunk requirement, since that really does come up missing all the time).
Flags: needinfo?(mak77)
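The policy sketched in the comment above can be written as a small decision function (illustrative Python, not Necko code; all names are made up): enforce Content-Length when one was sent and the transfer was not chunked, and otherwise don't fail the download.

```python
def classify_finished_download(received_bytes, content_length, chunked):
    """Decide whether a closed download stream was actually complete.

    Mirrors the policy suggested above: enforce Content-Length when one
    was sent; for chunked transfers, don't insist on the terminating
    zero-length chunk, since many servers omit it. content_length is
    None when the server sent no total size.
    """
    if chunked or content_length is None:
        # No reliable total to enforce; a stricter policy could fail
        # chunked transfers that lack the final 0 chunk instead.
        return "complete"
    return "complete" if received_bytes == content_length else "failed"
```

This is the DM-level approach: the comparison happens where the download terminates, rather than deep in the generic HTTP stack where backwards compatibility forbids strictness.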

Comment 134

6 years ago
Now, writing an article is a great idea.

The idea of copying the link into a different program is not a great suggestion for many people. Heck, there are people who have a hard time just understanding that the file is actually downloaded to their computer. I have seen cases of the same file being downloaded multiple times on the same computer.

If FF is supposed to be a great browser, then it has to be complete.  A download manager that doesn't manage is just useless.
(In reply to Patrick McManus [:mcmanus] from comment #133)
> :mak, what do you think about fixing this in the download manager?

I'm all in for reopening the discussion and evaluating fixes at the DM level, and based on discussions I had with Paolo (currently the owner of the DM module) I think this is generally acknowledged.
It's pretty clear the "wait for magic Necko fixes" strategy didn't pay off, considering the age of this bug.

Now, it would be good to start by identifying all of the possible situations where this may happen, and for each evaluate what the strategy could be to attack it at the necko and DM levels, taking the best path for each.

At this time Paolo is working on a new implementation of the Downloads API and may already have a plan to attack this in the new API, or we may start planning that.
Flags: needinfo?(mak77) → needinfo?(paolo.mozmail)

Comment 136

6 years ago
(In reply to Marco Bonardo [:mak] from comment #135)
> Now, would be good to start by identifying all of the possible situations
> where this may happen, and for each evaluate what could be the strategy to
> attack it at a necko and a DM level, and taking the best path for each.

Yes, I agree this is how we should proceed. I had already filed bug 847184 with
the goal of finding at least one such possible situation, leveraging the test
suite of the new Downloads API.

While in these days I'm focused on other tasks, it's my goal for the imminent
Networking Work Week to find as many as possible of those situations on various
platforms, and understand the best way to resolve them.
Flags: needinfo?(paolo.mozmail)

Comment 137

6 years ago
Please make sure to cover the case described in bug 856282.
@XtC4UaLL: Normally I'd agree with "reading and respecting" such a comment as comment #107; however, the comment was made in October 2010. Granted, it said they didn't know when they'd get around to it, but... 2½ years, closing in fast on 3, without any word from jduell? I know he said he put it high on his list of bugs - in January 2009, 4½ years ago - but AFAIK, no progress has been made since then? (I'd love to be proven wrong.) I can definitely see why people find it difficult to respect comment 107.

Comment 139

6 years ago
I think I was recently hit by this bug as well. I have a WebGL application that loads relatively large static models using JavaScript and XMLHttpRequests. If the loading process is interrupted by reloading the page quickly enough, subsequent page reloads might get only partial data for the interrupted file from the cache.

I have seen this happen to included JavaScript files as well, so it's clearly a generic interrupted-download issue.

It would be really nice to see this bug fixed before its 10th anniversary.

Comment 140

6 years ago
The download manager's timeout mechanism causes many incomplete downloads and should be improved.

I was trying to download a 23 MB TIFF image from the Library of Congress. The download always stopped in the middle, probably due to lag on the server's side.

I used the following manual method/algorithm to force Firefox to download the image for me:

1. Download of a 23MB image starts.

2. I watch the download speed to see if it's changing - an indication that the download is still in progress. There is no other way in Firefox to know.

3. Whenever I see that the download hangs, I wait ten seconds and then I click "pause". (if I fail to click "pause" fast enough, Firefox will terminate the entire download).

4. I wait a while. Let's say ten seconds.

5. I click "resume".

6. I wait max of ten seconds to see if download continues. 

   If it does not - I go back to step 3 (click pause and wait).
   If it does - I go back to step 2.

For the given images I had to repeat this "algorithm" for 5-6 times before download was completed.

Why can't Firefox automatically do what I am forced to do manually?

Also, maybe the timeout before Firefox decides to terminate a download is too short.

This bug is very easy to fix - why does nobody do it?
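The manual procedure in this comment is essentially a stall detector with automatic pause/resume. A rough sketch of that loop over per-second progress samples (all names and thresholds are invented for illustration; this is not Firefox code):

```python
def plan_actions(byte_samples, stall_ticks=10):
    """Given bytes-downloaded sampled once per tick, emit the
    pause/resume actions that the manual procedure above performs:
    after `stall_ticks` consecutive samples with no progress, pause
    the download and immediately resume it, then keep watching.
    """
    actions = []
    stalled_for = 0
    last = None
    for sample in byte_samples:
        if last is not None and sample == last:
            stalled_for += 1
            if stalled_for >= stall_ticks:
                actions.append("pause")
                actions.append("resume")
                stalled_for = 0  # give the resumed transfer a fresh window
        else:
            stalled_for = 0  # progress was made
        last = sample
    return actions
```

A real implementation would attach this to the transfer's progress callbacks; the point is only that the heuristic the commenter applies by hand is mechanical enough to automate.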

Comment 141

6 years ago
Somebody needs to contact Boris Zbarsky (https://air.mozilla.org/mozilla-distinguished-engineer-boris-zbarsky/) to get this fixed. When he gets involved, things get done.

Comment 142

6 years ago
If his involvement wouldn't help fix this nearly 10 year old issue, probably nothing will. Then it should just be closed and accepted as mozilla "feature".

Comment 143

6 years ago
Wow. When I saw that this hasn't been fixed for almost 10 years,

I lost hope for the other bug I mentioned here:
https://bugzilla.mozilla.org/show_bug.cgi?id=834870

Great job, Mozilla.

Comment 144

6 years ago
I am curious about streamed http downloads...  FTP you automatically I think know the file size I would hope. I haven't examined the protocol lately. but HTTP you get several different kinds of things that can happen I should guess, two of which are:
- streamed HTTP with filesize
- streamed HTTP without filesize
it's the latter which can throw things off. a dropped connection (router, switch on a hop, whatever) will kill the download. and the internet isn't known for being reliable for HTTP downloads, and some companies which handle large file hosting know that, which is why they offer download managers for downloads and software products.

google chrome I think has this down.
so does IE.
safari? o?

while I don't know what all causes a tcp connection to die in the middle (too many dropped packets?) and I can only guess, exactly HOW to handle the streamed stuff

am I stating the problem correctly? I don't have an awful lot of knowledge in this area because I have not a lot of knowledge of the underpinnings of the browser or the HTTP protocol, I am going by what I see sometimes.
 some PHP files seem to do byte streaming somehow. perhaps that should be looked up in the php manual, it has helpful comments, on how to do this and that, and perhaps such an implementation can be shown which causes this.

Comment 145

6 years ago
It's not an issue with the server not sending file sizes. (I would consider that a buggy server; even though the spec doesn't require it, there's no way to ensure the download succeeded without it.) You can watch the download manager and it will show the correct size, but when the connection drops, the displayed size changes to the amount it managed to download so far, and the download is considered successful.

Perhaps this is an intentional workaround for servers that send the wrong size, but there are much better ways to deal with that. (If the download is interrupted at the same place every time, after a couple of attempts, *then* you assume the server is giving the wrong size. Or you just keep trying endlessly, in hopes that the server admin notices and fixes their broken script.)

Comment 146

6 years ago
Jim, you make an interesting point about download managers. In my experience those are rarely trustworthy and are often a vector for malware to get into your system, but unwary people use them because they're more reliable. You might say that by making such an important feature unreliable in one of the most popular browsers for *over nine years*, Mozilla is just helping malware authors...

Comment 147

6 years ago
(In reply to Jim Michaels from comment #144)
> I am curious about streamed http downloads...  FTP you automatically I think
> know the file size I would hope. I haven't examined the protocol lately. but
> HTTP you get several different kinds of things that can happen I should
> guess, two of which are:
> - streamed HTTP with filesize
> - streamed HTTP without filesize
> it's the latter which can throw things off. a dropped connection (router,
> switch on a hop, whatever) will kill the download. and the internet isn't
> known for being reliable for HTTP downloads, and some companies which handle
> large file hosting know that, which is why they offer download managers for
> downloads and software products.
> 
> google chrome I think has this down.
> so does IE.
> safari? o?
> 
> while I don't know what all causes a tcp connection to die in the middle
> (too many dropped packets?) and I can only guess, exactly HOW to handle the
> streamed stuff
> 
> am I stating the problem correctly? I don't have an awful lot of knowledge
> in this area because I have not a lot of knowledge of the underpinnings of
> the browser or the HTTP protocol, I am going by what I see sometimes.
>  some PHP files seem to do byte streaming somehow. perhaps that should be
> looked up in the php manual, it has helpful comments, on how to do this and
> that, and perhaps such an implementation can be shown which causes this.

Well, the problem is not failing to download the file; that is of course understandable.
The problem is showing "Completed" for a failed download, which is really annoying and causes many problems.

I never download large files with Firefox; this happens even with 2-3 MB files where the file size is known. I mean, Firefox shows the download file size is 5 MB but fails at 3 MB and shows it as Completed!
There is no way to be sure unless we check the size ourselves.
For example, in the middle of downloading an mp4 clip, the download fails but is shown as completed; I only realize it when I'm watching the clip and it stops in the middle.

Comment 148

6 years ago
Re. download managers (comments 144 and 146)
There are some good ones out there, and others to watch out for.
Sun's Java site had a good one which went away when they became part of Oracle.  I was using that as a workaround when this bug was still young. :)
One to avoid is FileZilla, which is spyware (unless you have a firewall with fine enough controls that you can block FZ from phoning home).

All download managers do seem to share one drawback, which I hope a fix to this bug will be able to avoid -- namely, they get lost if the download site is in the habit of changing filenames every few minutes in order to force you to go through a front-end web page to retrieve files from them.  The Files sections of groups on both Google and Yahoo do this, so they are not a good place to upload your material if an alternative is available.

Comment 149

6 years ago
The solution to this bug is simply to work around the broken Firefox download manager. I use DownThemAll, which has its own built-in download manager, and for videos FlashGot, which sends the download command directly to DownThemAll or any other specified download program on your PC. Since I started doing that, I no longer get broken files.

Comment 150

6 years ago
(In reply to Rotis from comment #149)
> The solution to this bug is simply to work around the broken Firefox
> download manager. I use DownThemAll, which has its own built-in download
> manager, and for videos FlashGot, which sends the download command directly
> to DownThemAll or any other specified download program on your PC. Since I
> started doing that, I no longer get broken files.

I tried this add-on tool, but when I download a video from YouTube, I get it without any file extension (the .flv or .mp4 has disappeared). Also renaming file names in the tool is difficult or not possible. So for now I will probably stick to the old and buggy Firefox download manager.

Comment 151

6 years ago
Very often when I download videos from a website such as YouTube, the download speed is very low (I am not sure if the problem is on the website's side or in Firefox).

I found out that in many cases you can greatly increase the download speed simply by clicking "pause" and then "resume".

Comment 152

6 years ago
A third-party download manager isn't a "solution". It's an "unacceptable workaround".

Also, this should be a higher priority. It's a data-loss bug; it should be a release blocker.

Comment 153

6 years ago
Since the download manager is a communication tool, it should be implemented as one.

In communication between two points A and B, you must verify that the data is valid. This is basic.

1. The download manager should monitor the download speed at all times. Whenever the download hangs or is too slow, follow the steps I suggested in comments 140 and 151.

2. Compare file size and CRC before and after.

3. The tool should report the status of the download to the user in terms of completeness.

I am sure there are many other parameters that can be checked and monitored. I am not a communication expert, but I guess all these factors are already known to Firefox developers, so what are we waiting for?
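Point 2 in the list above (size and checksum comparison) could look like the sketch below. Note that plain HTTP does not generally carry a content checksum, so the known-good digest would have to be published out of band; the function name and return values are illustrative only:

```python
import hashlib

def verify_download(path, expected_size, expected_sha256=None):
    """Check a finished download against its advertised size and an
    optional known-good SHA-256 digest (hypothetical helper).

    Returns "incomplete" on a size mismatch, "corrupt" on a digest
    mismatch, and "ok" when every available check passes.
    """
    with open(path, "rb") as f:
        data = f.read()
    if len(data) != expected_size:
        return "incomplete"
    if expected_sha256 is not None:
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            return "corrupt"
    return "ok"
```

The size check alone would already catch the failures reported throughout this bug; the digest is a bonus for sites that publish one.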

Comment 154

6 years ago
Question: does HTTP (or maybe FTP) implement a CRC hash/checksum on the data as part of the protocol?
Does HTTP or FTP over IPv6 do this; which, if any?
I am almost sure you at least get a file size with FTP, but I am only guessing since I don't have the protocol in front of me; there are probably RFCs for those...
Well, here is proftpd's list of FTP-related RFCs: http://www.proftpd.org/docs/rfc.html

Comment 155

6 years ago
It's clearly possible to detect broken downloads since other browsers don't have the same problem.

Comment 156

6 years ago
It's enough to rely on the file length as reported by the HTTP or FTP server (and seen by the user). It isn't always present, but the case where it's absent can simply be ignored (+1 for comment #145).
As for checksums, they are absent from HTTP and FTP headers in the common case. Not sure about possible extensions, but checksum verification isn't the point here, I think; the TCP protocol does some of that work.

Comment 157

6 years ago
Recently I get a lot of these errors:

"file.mp4 could not be saved, because the source file could not be read. Try again later, or contact the server administrator.".

Few remarks:

1. It is not true that the file could not be saved. The file was saved to disk, but it was not complete (0.98GB instead of 1.2GB, in this case).

2. "The source file could not be read" - maybe it is a temporary communication problem. Did Firefox try to 'pause' and 'resume', as I suggested in previous comments? How long did Firefox try to communicate with the server before giving up on the download of a 1.2GB file?

3. "contact the server administrator." - is that a joke?

4. Error messages should NOT appear in a new window. I hate having many windows open on my computer. Usually I don't even notice that window, because it hides behind other windows. This message belongs in the main window or, even better, in the download manager window.

Comment 158

6 years ago
(In reply to travneff from comment #156)
> It's enough to rely on the file length reported by the HTTP or FTP server
> (and visible to the user). It isn't always present, but the absent case can
> simply be ignored (+1 for comment #145).
> As for checksums, they are absent from HTTP and FTP headers in the common
> case. Not sure about possible extensions, but checksum verification isn't
> the point here, I think. The TCP protocol does some of this work.

MD5 and CRC are great, but so far not even the file size is checked when Firefox decides that a download "has completed". Firefox should at least compare file sizes before and after the download, as said in many previous comments.

In other words: 'at least do something'.

Comment 159

6 years ago
Today I tried to download a 500MB file.

After half of the file was downloaded, the download manager stopped progressing.

Usually, in such cases, I right-click "pause" and "resume", and that solves the problem.

However, sometimes, including today, the "pause" option is inactive (light gray color). I guess it was a moody day for the DM.

I would like to use this opportunity to thank Firefox download-manager developers for not fixing this 2004 bug yet.

Comment 160

6 years ago
I am still trying to download that 500MB file. Long night ahead.

In Trial #2, the download manager stopped after 94MB, but the indication in the DM showed that the entire 500MB was there. In other words: the DM thinks that the download has completed when it actually hasn't.

I don't understand how this bug can occur - can't the DM simply check the file size on the local disk?

In Trial #3, the DM stopped after 70MB, saying that the "source file cannot be read". Wouldn't it at least be polite of the DM to let me decide whether I want to try to continue the download, instead of terminating it early?

Comment 161

6 years ago
When I try to download a large file, Firefox assumes that it is complete when it is only halfway through.
This also happens with Google Chrome, and on other laptops.
But it always happens.

You know what works, though? wget, the command-line download manager from GNU. When something happens to the connection, it just attempts to connect again and continues.

I am pretty sure that if wget can do it, you guys can do it too.
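wget's retry-and-resume behaviour can be sketched as a loop that re-requests from the current offset; over HTTP this corresponds to sending a `Range: bytes=<offset>-` request. The sketch below is illustrative only (the `fetch_range` callback and all names are assumptions, not wget's actual code):

```python
def resume_download(fetch_range, total_length, max_retries=10):
    """wget-style retry loop (illustrative sketch, not wget's actual code).

    fetch_range(offset) returns the next bytes starting at `offset` and may
    return a short (or empty) read when the connection drops; a real HTTP
    client would send "Range: bytes=<offset>-" here instead.
    """
    data = b""
    for _ in range(max_retries):
        chunk = fetch_range(len(data))
        if not chunk and len(data) < total_length:
            continue  # no progress on this attempt; retry
        data += chunk
        if len(data) >= total_length:
            return data  # every promised byte arrived
    raise IOError("gave up after %d retries" % max_retries)

# A flaky "server" that cuts every connection off after 4 bytes:
payload = b"0123456789"
flaky = lambda offset: payload[offset:offset + 4]
assert resume_download(flaky, len(payload)) == payload
```

Note that the loop only knows when to stop because it trusts the advertised total length, which is exactly the information this bug is about ignoring.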

Comment 162

6 years ago
Hmmm. I like cURL better; maybe cURL can do a better job? It handles more protocols.

Comment 163

6 years ago
How can I change the timeout of the download manager? 

As I previously wrote, many "lost" downloads can be saved manually with a simple "pause" and "resume", but the user must respond very fast when a download stops, before the download manager decides to terminate it.

Increasing the timeout will give the user more time to respond.

(This change should be temporary, until the DM automatically performs this pause-and-resume procedure itself.)
Depends on: 947846
Patrick McManus is the current Necko owner, so we can conclude that the former decision has been reversed. (See comment #133 and bug 947846.)
Changing the component to clarify the new decision.
Component: Networking: HTTP → Downloads Panel
No longer depends on: 947846
Flags: blocking1.8.1-
Product: Core → Firefox
The new Downloads Panel landed long ago. Paolo, any ETA?
Flags: needinfo?(paolo.mozmail)

Comment 166

6 years ago
We've identified some possible actions here based on some research (<https://wiki.mozilla.org/User:P.A./Download_networking_improvements>), but we've not started working on this yet.
Flags: needinfo?(paolo.mozmail)
Whiteboard: [please read and respect Comment 107] [lame-network] → [please read and respect Comment 107] [lame-network][triage]
Assignee

Updated

6 years ago
Assignee: nobody → daniel
Assignee

Comment 167

6 years ago
Reading through the history of this bug, it isn't easy to fully grasp the single problem the original poster was talking about, but I've started by focusing on the part "Firefox doesn't notice/alert when my HTTP transfer gets cut off".

Firefox doesn't care about cut-off HTTP transfers where Content-Length says N bytes but the connection gets cut off when only Y bytes have been delivered (N > Y). If that's a random binary file being saved, of course it'll be broken. If it is a text/plain or text/html page, the content is of course incomplete.

I have a stupid test server of mine send Content-Length: 17, then send 6 bytes of response body, and then close the connection - basically simulating a TCP connection that breaks early.

Chrome silently doesn't show any content for this URL. curl shows the content but then exits with an error that says the transfer was partial.

Firefox mozilla-central silently renders a page with the 6 bytes. If I make it a more binary Content-Type, Firefox will instead silently store the 6 bytes as a complete file without a hint about it being cut off.

Compressed Content-encoding will ruin this scheme but I assume that if we break a compressed stream before the end of the response body, the gzip/deflate decompressor will instead signal an error so it shouldn't be as important.

The history of this bug, and I believe also the source code, seem to imply that Content-Length is unreliable and is therefore ignored like this on purpose. That's not my experience, however, and I would object to it. For HTTP/1.1 responses without chunked or content encoding, the Content-Length generally seems to be fairly accurate today - as long as it is larger than zero (a decade or so ago, before the wide introduction of large-file support everywhere, servers sometimes wrapped the file-size counters and went negative over 4GB).

I'm currently experimenting with CloseWithStatus(NS_ERROR_NET_INTERRUPT) from within nsHttpTransaction::Close(), which prevents both the rendering of a broken transfer and tricking the user into believing that a binary transfer succeeded. It is not good enough yet, for several reasons: 1 - the wording of the big error message is slightly off and will lead a user to look for problems in the wrong place, and I couldn't find an error code that would trigger a correct error message. 2 - generating an error message for this is probably wrong in the first place, so perhaps the error should not be user-visible like this; but it should at least not allow users to believe a file download was fine.

My minimal experiment patch looks like this:

diff --git a/netwerk/protocol/http/nsHttpTransaction.cpp b/netwerk/protocol/http/nsHttpTransaction.cpp
--- a/netwerk/protocol/http/nsHttpTransaction.cpp
+++ b/netwerk/protocol/http/nsHttpTransaction.cpp
@@ -825,21 +825,41 @@ nsHttpTransaction::SaveNetworkStats(bool
 
 void
 nsHttpTransaction::Close(nsresult reason)
 {
     LOG(("nsHttpTransaction::Close [this=%p reason=%x]\n", this, reason));
 
     MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
 
     if (mClosed) {
         LOG(("  already closed\n"));
         return;
     }
 
+    if(mContentLength >= 0 &&
+       (mContentRead != mContentLength)) {
+        /* If Content-Length was given (this response was not chunked
+           encoded), the transaction is done without the full promised content
+           having been delivered */
+        reason = NS_ERROR_NET_INTERRUPT;
+    }
+
     if (mTokenBucketCancel) {
         mTokenBucketCancel->Cancel(reason);
         mTokenBucketCancel = nullptr;
     }
 
     if (mActivityDistributor) {
         // report the reponse is complete if not already reported
         if (!mResponseIsComplete)
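The core of the check in the patch is small enough to restate language-neutrally. A sketch under the patch's assumptions (a non-negative content length means the response was framed by Content-Length rather than chunked encoding; the function and constant names are illustrative, not the Gecko API):

```python
def close_reason(content_length, content_read, reason="NS_OK"):
    """Reclassify a connection close (sketch mirroring the patch above).

    If the server promised content_length bytes (>= 0, i.e. the response
    was not chunked-encoded) but only content_read bytes arrived before
    the close, report the transfer as interrupted instead of successful.
    """
    if content_length >= 0 and content_read != content_length:
        return "NS_ERROR_NET_INTERRUPT"
    return reason

# The test server from this comment: Content-Length 17, connection cut after 6 bytes.
print(close_reason(17, 6))   # NS_ERROR_NET_INTERRUPT
print(close_reason(17, 17))  # NS_OK
print(close_reason(-1, 6))   # NS_OK (no Content-Length: nothing to compare against)
```

The third case is why a missing Content-Length cannot be flagged: without advertised framing there is nothing to mismatch.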
(In reply to Daniel Stenberg [:bagder] from comment #167)

> Compressed Content-encoding will ruin this scheme but I assume that if we
> break a compressed stream before the end of the response body, the
> gzip/deflate decompressor will instead signal an error so it shouldn't be as
> important.

But the decompression occurs only later. When I download a file and Firefox does not tell me about an error, I suppose that the file is complete. I may decompress it one month later. Then, it may be too late to re-download the file. The sooner we know about the error, the better.

Comment 169

6 years ago
Can FF just rely on the Content-Type response header? For example, if it's "text/*" (gzipped content goes here), things could be left as-is. If it's "application/*", be more strict.

However, it would be better to take the display/save method into account instead. For example, I don't need any additional info for a partially loaded HTML page while browsing, but I'd like to have a (new) common "not completed" mark if I save it using the download manager.

Comment 170

6 years ago
I'd much rather it retry at least a few times, and only give up if it disconnects at the same place every time, since the most likely reason for it disconnecting is **** wifi, and I've told it to download the file, not download part of it and stop. I hate having to tell things to try again until they manage to succeed.
Assignee

Comment 171

6 years ago
(In reply to Nicolas Barbulesco from comment #168)

> But the decompression occurs only later. When I download a file and Firefox
> does not tell me about an error, I suppose that the file is complete. I may
> decompress it one month later. Then, it may be too late to re-download the
> file. The sooner we know about the error, the better.

I was talking about the automatic decompression that's done by Firefox itself on compressed content... But I just gave this a quick test, and my crude patch will also detect cut-off Content-Encoding content when it is sent with a Content-Length, since the "mContentRead" there is the raw data size - before being unpacked.
Daniel - for the moment I'd like to leave the part of the code that deals with unreliable content-lengths in place, but I think we can take a step forward and throw such an error specifically for >= HTTP/1.1.

1] Move your check inside the existing if (!mResponseComplete) block. All you need to do is check (mChunkedDecoder || (mContentLength >= int64_t(0))) && (mHttpVersion >= NS_HTTP_VERSION_1_1) before setting the reason, because the block is only executed if the framing did not find an end-of-file marker.

2] LOG() it.

3] I would also put your content length check in the download manager - so it applies to all protocol versions... for files we can afford to be more strict.


sound good?
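The gate described in point 1] reads as: only when the message had explicit framing (chunked encoding or a non-negative Content-Length) and the peer speaks HTTP/1.1 or later do we trust a mismatch enough to raise an error. A sketch with illustrative names (not the actual Gecko members):

```python
def flag_as_interrupted(response_complete, chunked, content_length, http_version):
    """Point 1] above as a predicate (sketch; names are illustrative).

    response_complete: the parser found a proper end-of-message marker.
    chunked: the response used chunked transfer encoding.
    content_length: advertised Content-Length, or -1 if absent.
    http_version: (major, minor) tuple of the response.
    """
    if response_complete:
        return False  # framing ended cleanly; nothing to flag
    explicit_framing = chunked or content_length >= 0
    return explicit_framing and http_version >= (1, 1)

assert flag_as_interrupted(False, False, 100, (1, 1))     # CL mismatch on HTTP/1.1
assert not flag_as_interrupted(False, False, -1, (1, 1))  # no framing info: tolerate
assert not flag_as_interrupted(False, True, -1, (1, 0))   # pre-1.1: tolerate
assert not flag_as_interrupted(True, True, 100, (1, 1))   # completed fine
```

Restricting the check to HTTP/1.1+ sidesteps old HTTP/1.0 servers, where close-delimited bodies and sloppy framing were still common.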
(In reply to Patrick McManus [:mcmanus] from comment #172)
> 3] I would also put your content length check in the download manager - so
> it applies to all protocol versions... for files we can afford to be more
> strict.

I'd put it only there and also base it for auto retry and resume.  I think it's somewhat dangerous to play with the code in the channel if not strictly necessary.

Comment 174

6 years ago
At the VERY LEAST, it must be possible for the user to know if the content-length doesn't match the length of the downloaded data. That seems to be the kind of minimal functionality that would pass muster, just from a data loss standpoint. It's incomprehensible to me that this bug passed the first review without someone going "holy cow, we need to fix this, people are LOSING DATA" and fixing it. That it's not been fixed after this many years just boggles my mind.

Comment 175

6 years ago
(In reply to Honza Bambas (:mayhemer) from comment #173)
> (In reply to Patrick McManus [:mcmanus] from comment #172)
> > 3] I would also put your content length check in the download manager - so
> > it applies to all protocol versions... for files we can afford to be more
> > strict.
> 
> I'd put it only there and also base it for auto retry and resume.  I think
> it's somewhat dangerous to play with the code in the channel if not strictly
> necessary.

Seconded, any potentially blocking change should be done at the upper level.

We'll need some Telemetry first to gather data about the situation, and that can be added in the channel code as well, provided it's non-blocking.
We don't need telemetry here - this has been paralyzed by inaction way too long. Anyhow, telemetry can't do the interesting thing of separating real errors from false positives.

as far as I am concerned both changes can proceed.
I think that what I have just encountered is one of the cases of this bug.

Firefox 26, Mac OS X Mavericks.

Firefox was in the background, downloading a file. 
I was using Safari, and Safari went spinning around. 
My Mac's processor was swamped for a few minutes. 
Then, things calmed down, and everything was responding again. 
Firefox informed me that my download was complete. 
But this is not true. 
My download is ≈ 200 MiB, and by luck I felt that "it went too fast". Indeed, the full file is ≈ 400 MiB. But I could very well not have noticed any problem. 
Typically, the user just thinks the file is complete because Firefox says so. 

This behaviour of Firefox is incorrect. It means data loss.

What has happened is that Firefox went “not responding” during maybe two minutes, and so the download was interrupted.

Here is what I expect instead :

1/ * Ideally * — Don't interrupt the download. If necessary, just stall the download for a while and then continue where it left off. Firefox may not be the only problem here; the server too may need to allow that. 
2/ * If necessary * — Interrupt the download, inform the user of the problem, and offer to let the user resume the download - from where it was if possible, otherwise from the beginning. 
3/ * At least * — Interrupt the download and inform the user of the problem.

Thank you for improving this.

Nicolas

Comment 178

6 years ago
> as far as I am concerned both changes can proceed.

FWIW I also support making the change in both the channel and the download manager--we can always back out the channel change if we suspect it's causing trouble.  Of course it's Patrick's call, and he's made it, so let's proceed and see how it goes.
No longer blocks: fxdesktopbacklog
Whiteboard: [please read and respect Comment 107] [lame-network][triage] → [please read and respect Comment 107] [lame-network]
Assignee

Comment 179

5 years ago
Here's an updated take. This code detects broken HTTP/1.1 transfers when Content-Length is used, or when a chunked-encoded transfer didn't end properly (i.e. it was cut off). Updated two test cases due to the changes, and added a new test case for basic testing of this new treatment.
 
This patch also addresses Bug 654577.
Attachment #8389842 - Flags: review?(mcmanus)
Component: Downloads Panel → Networking: HTTP
Product: Firefox → Core
Comment on attachment 8389842 [details] [diff] [review]
0001-Bug-237623-detect-broken-HTTP1.1-transfers.patch

> --- a/netwerk/test/unit/test_gzipped_206.js
> +++ b/netwerk/test/unit/test_gzipped_206.js

I don't think changes in this file are correct. This test intentionally terminates the connection in the middle and verify that the resume works correctly.
Attachment #210862 - Attachment is obsolete: true
Assignee

Comment 181

5 years ago
(In reply to Masatoshi Kimura [:emk] from comment #180)
> Comment on attachment 8389842 [details] [diff] [review]
> 0001-Bug-237623-detect-broken-HTTP1.1-transfers.patch
> 
> > --- a/netwerk/test/unit/test_gzipped_206.js
> > +++ b/netwerk/test/unit/test_gzipped_206.js
> 
> I don't think changes in this file are correct. This test intentionally
> terminates the connection in the middle and verify that the resume works
> correctly.

Thanks, but the point of my code changes in this bug is that premature termination is now an error, so something needs to be modified in the test case; otherwise it'll abort on that error (NS_ERROR_NET_INTERRUPT)!

Do you have an alternative modification to suggest to make the test case pass?

Comment 182

5 years ago
I'm not sure that this is the right level to fix it. The HTTP level needs to retain the information necessary for higher layers to determine if the transfer is complete or not, but flagging an incomplete transfer as an _error_ rather than _incomplete_ is likely to trigger a cascade of other problems... this being just one of them.

If the transfer tracks expected size and actual size, that provides enough information for the download manager to mark incomplete downloads, give the _user_ better information about the state of the transfer, and provide a UI for continuing or retrying it... without triggering other error code unnecessarily.
Comment on attachment 8389842 [details] [diff] [review]
0001-Bug-237623-detect-broken-HTTP1.1-transfers.patch

Review of attachment 8389842 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/xpc.msg
@@ +81,4 @@
>  XPC_MSG_DEF(NS_ERROR_UNEXPECTED                    , "Unexpected error")
>  XPC_MSG_DEF(NS_ERROR_OUT_OF_MEMORY                 , "Out of Memory")
>  XPC_MSG_DEF(NS_ERROR_INVALID_ARG                   , "Invalid argument")
> +XPC_MSG_DEF(NS_ERROR_NET_INTERRUPT                 , "Interrupted network transfer")

get r? from bholley (or his delegate) on this file

::: netwerk/test/unit/head_channels.js
@@ +29,4 @@
>  const CL_EXPECT_LATE_FAILURE = 0x20;
>  const CL_FROM_CACHE = 0x40; // Response must be from the cache
>  const CL_NOT_FROM_CACHE = 0x80; // Response must NOT be from the cache
> +const CL_IGNORE_CL = 0x100; // don't both to check the content-length

s/both/bother

::: netwerk/test/unit/test_chunked_responses.js
@@ +101,4 @@
>  ////////////////////////////////////////////////////////////////////////////////
>  // Test 3: OK in spite of non-hex digits after size in the length field
>  
> +test_flags[3] = CL_ALLOW_UNKNOWN_CL|CL_EXPECT_LATE_FAILURE;

update this test so it is a positive test

::: netwerk/test/unit/test_gzipped_206.js
@@ +21,4 @@
>    response.setHeader("Content-Encoding", "gzip", false);
>    response.setHeader("ETag", "Just testing");
>    response.setHeader("Cache-Control", "max-age=3600000"); // avoid validation
> +  response.setHeader("Content-Length", "" + 17);

as discussed in person, the better fix to this test is to conditionally set the content-length in each iteration
Attachment #8389842 - Flags: review?(mcmanus)
also, as discussed, update nsHttpChannel::CloseCacheEntry() to consider a failed channel with this condition to be eligible for partial (resumable with range request) caching. ask :mayhemer for review on that

Comment 185

5 years ago
(In reply to Peter da Silva from comment #182)
> I'm not sure that this is the right level to fix it. The HTTP level needs to
> retain the information necessary for higher layers to determine if the
> transfer is complete or not, but flagging an incomplete transfer as an
> _error_ rather than _incomplete_ is likely to trigger a cascade of other
> problems... this being just one of them.

We just discussed this at the recent Networking Work Week, and concluded that our best bet is to go ahead with the change at the network level, and promptly handle any follow-up bugs that might arise. We know that it is likely that some behavior changes will appear at the upper integration levels, like Aurora or Beta, but we can handle regressions with specific follow-ups, and in the worst case just revert the change, since this is a simple patch.
Assignee

Comment 186

5 years ago
bholley: please take a look at the xpc.msg change I want (single-line, adding a new error code)

This patch is otherwise updated according to feedback, with test cases updated and now an additional check in nsHttpChannel::CloseCacheEntry so that interrupted transfers still can be cached.
Attachment #8389842 - Attachment is obsolete: true
Attachment #8390525 - Flags: review?(bobbyholley)

Updated

5 years ago
Blocks: 983197
Assignee

Comment 187

5 years ago
v3, removed the CloseCacheEntry change again as it isn't necessary
Attachment #8390525 - Attachment is obsolete: true
Attachment #8390525 - Flags: review?(bobbyholley)
Attachment #8390587 - Flags: review?(bobbyholley)
Assignee

Comment 188

5 years ago
Comment on attachment 8390587 [details] [diff] [review]
v3-0001-Bug-237623-detect-broken-HTTP1.1-transfers.-r-mcm.patch

The xpc.msg change wasn't necessary; withdrawn!
Attachment #8390587 - Flags: review?(bobbyholley)
Assignee

Comment 189

5 years ago
This patch sent to the try server shows mochitest 5 test_partially_cached_content.html failing:

https://tbpl.mozilla.org/?tree=Try&rev=7c897be5271f

Working on an update.

Comment 190

5 years ago
One thing I just thought about: what if the server gets a replacement file uploaded that has a different file hash or size? Could this be part of the problem being seen? If this happens during a download, how would you detect it? The content-type has already been sent. Not sure how this works... maybe this particular case can't be remedied. If there were a way, good - but I was just thinking about it as a possible situation that could occur.

Comment 191

5 years ago
Well I hope this can get resolved soon, because this bug has now been open for TEN YEARS.
Assignee

Comment 192

5 years ago
(In reply to Jim Michaels from comment #190)
> One thing I just thought about: what if the server gets a replacement file
> uploaded that has a different file hash or size? Could this be part of the
> problem being seen? If this happens during a download, how would you detect
> it? The content-type has already been sent. Not sure how this works...
> maybe this particular case can't be remedied. If there were a way, good -
> but I was just thinking about it as a possible situation that could occur.

No, that would be a totally separate case (if at all). This bug is "simply" about the transfer being prematurely cut off and nothing signals an error about that fact.

Comment 193

5 years ago
Maybe my approach to the problem is naive, but since FF cannot download files without data loss, I have been using the add-on DownThemAll for years. DTA tells me when a download is interrupted, and I can resume it without a problem. Zero bytes of data loss with DTA. Why not contact the author of DTA and use his code in FF? Obviously the DTA code is far superior to the code FF uses. Looking at the response time to this major problem (10 years), mankind would never have reached the moon if NASA engineers had been as slow as the Mozilla coders. ;)
Attachment #8394104 - Flags: feedback?(bzbarsky)
Comment on attachment 8394104 [details] [diff] [review]
let docshell ignore NET_INTERRUPT when something has already been rendered

What about other necko consumers?  What should happen with images in the interrupt case?  Stylesheets?  External scripts?  Why are pages special?
Attachment #8394104 - Flags: feedback?(bzbarsky) → feedback-
Note that that came up in comment 133...

What do other UAs do in those situations?
(In reply to Boris Zbarsky [:bz] from comment #196)
> Note that that came up in comment 133...
> 
> What do other UAs do in those situations?

I have softened my tune on making the DL manager do this itself because I don't actually think it has the right tools to do it correctly.

Sure, if it has a Content-Length and the DL doesn't match the CL, then it can take action on its own. But what about other indications of transfer problems, such as aborted http/1 chunks or reset http/2 streams? The channel knows that those transactions have problems just as much as in the mismatched content-length case, but the channel consumer (e.g. the DL manager) has no insight into it. That doesn't seem right. The transaction failed by the definition of the protocol, so OnStopRequest ought to reflect that, so that the consumer can decide how it wants to handle the error. The DL manager really doesn't have the tools to make that decision.

I can think of two reasons to argue against making the channel report the error, and I'd like to think neither of them carries the day:

1] If the Internet were filled with servers with broken framing, then the channel should probably be working around bad servers. Broken framing actually was common 10+ years ago, but persistent connections have really forced people to get this right on the server side, because the framing is so critical to making persistent connections work. Note that Daniel's patch only creates the OnStopRequest error when the version is >= 1.1.. so while this is out there as something to keep on the radar, I don't find the argument compelling, and the time that has passed here is largely what has changed my mind (I pretty much never see this problem manifest itself innocuously anymore).

2] There are just too many consumers that want to ignore NET_INTERRUPT but aren't doing so - so it's just an unsolvable compatibility problem in practice. I am hearing a little of that. If that's the feedback, then we can add something silly like nsIHttpChannel::SetRequireStrictFraming that makes this opt-in.. but I worry that there are consumers who are silently accepting NET_INTERRUPT that should actually not be.. I think the script consumer is an excellent example.. is using a truncated https:// script a security vulnerability? (Given that an attacker can inject a TCP FIN without cracking the crypto.) It certainly sounds like one to me. Other consumers that happily stream their output (like images) can probably safely ignore it. (And I believed they already were.. mistakenly?)
(In reply to Boris Zbarsky [:bz] from comment #196)
> Note that that came up in comment 133...
> 
> What do other UAs do in those situations?

e.g. url that reports too large of a C-L http://www.ducksong.com/cgi-bin/nph-bad-cl

firefox renders it.. chrome fails to render it but I sometimes get a error page and sometimes get a blank page.
(In reply to Patrick McManus [:mcmanus] from comment #198)
> e.g. url that reports too large of a C-L
> http://www.ducksong.com/cgi-bin/nph-bad-cl
> 
> firefox renders it.. chrome fails to render it but I sometimes get a error
> page and sometimes get a blank page.

IE11 also renders.
> is using a truncated https:// script a security vulnerability?

It's possible, yes.

This is why we should decide what the desired behavior is for the various scenarios.  Then we can figure out how to implement that desired behavior.  Narrowly hacking the docloader to fix the test may be the right thing, but it's hard to tell without knowing what behavior we really want in all the other cases.
(In reply to Boris Zbarsky [:bz] from comment #200)
> > is using a truncated https:// script a security vulnerability?
> 
> It's possible, yes.
> 
> This is why we should decide what the desired behavior is for the various
> scenarios.  Then we can figure out how to implement that desired behavior. 
> Narrowly hacking the docloader to fix the test may be the right thing, but
> it's hard to tell without knowing what behavior we really want in all the
> other cases.

So, since we would like to fix this download bug primarily, I'd personally expose the incompleteness information as a read-only attribute on nsIHttpChannel and let the download manager handle it in its OnStopRequest and do any 'proper' solution in a followup.

Up to you to decide.  This is my vote.

(In reply to Patrick McManus [:mcmanus] from comment #198)
> chrome fails to render it but I sometimes get a error
> page and sometimes get a blank page.

Chrome doesn't execute a partially downloaded script, IE11 and Fx do.

I have a small local PHP script that sets the Content-Length to some 10000 but only delivers a few bytes of otherwise valid (but incomplete) HTML. Chrome times out on it... After a minute or so I get a pop-up dialog saying that the page doesn't answer. Clicking 'end the page' replaces the content with an error page similar to ours.
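For comparison, the same setup is easy to reproduce with Python's standard library, whose HTTP client does flag the truncation: http.client raises IncompleteRead when fewer bytes than the advertised Content-Length arrive before the connection closes. Everything below is a local simulation (a hypothetical handler, not Firefox or Chrome code):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class TruncatingHandler(BaseHTTPRequestHandler):
    """Advertises more bytes than it sends, then closes the connection."""
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "10000")
        self.end_headers()
        self.wfile.write(b"<html>partial")  # far fewer than 10000 bytes

    def log_message(self, *args):  # keep the output quiet
        pass

# Serve exactly one request on an ephemeral port in a background thread.
server = HTTPServer(("127.0.0.1", 0), TruncatingHandler)
threading.Thread(target=server.handle_request, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/")
resp = conn.getresponse()
try:
    resp.read()  # expects 10000 bytes, gets 13, then EOF
    status = "silently complete"
except http.client.IncompleteRead as e:
    status = "truncated after %d bytes" % len(e.partial)
conn.close()
server.server_close()

print(status)  # truncated after 13 bytes
```

This is the behaviour the comments above are asking for: an explicit signal when the body ends before the advertised length, instead of silently reporting success.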
Assignee

Comment 202

5 years ago
(In reply to Patrick McManus [:mcmanus] from comment #197)

> 2] There are just too many consumers that want to ignore NET_INTERRUPT but
> aren't doing so - so its just a unsolvable compatibility problem in
> practice. I am hearing a little of that.

I just wanted to add that at least some of those cases may actually benefit from getting this problem in their face so that they actually realize that the transfer may not be complete. Like the download manager does now.
Hey Boris,

I need to figure out what to do here, as this is actually a fairly impactful bug.

I feel like the change to throw NET_INTERRUPT is the right thing, not just in a purist sense: channels need to be aware that they are handling truncated data.

It is defined as:
/* The connection was established, but the data transfer was interrupted. */

I've looked for other things in necko that already generate it, and the most prominent user is the spdy code - which generates it when a stream gets canceled by the server before a natural finish. That's a very tight match for the behavior introduced by this patch. The difference is probably that the issue doesn't come up very much until you tie it to socket closures, as is done here.

http://www.ducksong.com/cgi-bin/nph-bad-cl generates invalid content lengths for the html, an external js, and 3 references to 2 image urls.

chrome does not render the page.

firefox silently ignores all of those errors and displays the page with images and executes the truncated javascript.

if we add daniel's patch to firefox we also do not render the page.

if we add honza's uriloader patch to the mix we will render the html, but not execute the truncated javascript or display the images. Given that they are known to be corrupted, I think that's the right thing to do.. but let's flag some people to chime in

:seth - what do you think about images or whom should we ask? If you would like to display the truncated images do you think you should be handling the INTERRUPTED error?

boris, do you think we should be pinging someone else wrt js? Would css logically be different?
Flags: needinfo?(seth)
Flags: needinfo?(bzbarsky)
Hmm.  So the current things that can fail a channel with NS_ERROR_NET_INTERRUPT in addition to the spdy code you mention are:

1)  Anything that sends PR_END_OF_FILE_ERROR through ErrorAccordingToNSPR() over in the socket transport.

2)  The range request failure on ContinueConnect in nsHttpChannel::OnStopRequest.

3)  Maybe the bits in nsHttpTransaction::ParseHead ?

Docshell showing an error page for this error code was implemented in bug 106865, looks like; that was one of the cases that went through PR_END_OF_FILE_ERROR.

I'm pretty happy to try not running truncated JS (just like we don't run JS with an unclosed <script> tag, for similar reasons).

I'm also pretty happy to try not using truncated CSS, esp. if some other browsers already do that.

I'd like to understand Chrome's behavior here a bit better.  Does it start doing progressive rendering and then throw it away when there is a sudden connection close?
Flags: needinfo?(bzbarsky) → needinfo?(mcmanus)
(In reply to Boris Zbarsky [:bz] from comment #204)
> Hmm.  So the current things that can fail a channel with
> NS_ERROR_NET_INTERRUPT in addition to the spdy code you mention are:
> 
> 1)  Anything that sends PR_END_OF_FILE_ERROR through ErrorAccordingToNSPR()
> over in the socket transport.

right - I think that is probably the interesting one. afaict this change was meant to cover the empty transfer (but successful handshake) issue.. and we shouldn't be retriggering 106865 because daniel's patch only applies when there is an explicit mismatch in the framing information - and without some transfer we can't get any explicit framing information (i.e. response headers) to have a mismatch with.

> I'd like to understand Chrome's behavior here a bit better.  Does it start
> doing progressive rendering and then throw it away when there is a sudden
> connection close?

I updated http://www.ducksong.com/cgi-bin/nph-bad-cl to be a bit bigger.. now it has 266KB of HTML (but a content length 10x that). Chrome renders "some" of the HTML and then stops. It doesn't draw it all - it appears racy as to what has been laid out when the error comes in. The final bit of text of the document says "premature eof" but chrome in my tests never gets that far even though it is always transferred to the client. I'm going to assume when it gets the error it stops drawings/parsings in progress.

one happy thing to conclude from this is that the web is in a pretty good place wrt framing conformance.

chrome also doesn't replace the partial draw with an error page.. that's basically the same behavior honza's patch gives us except we seem to deterministically draw whatever we get.
Flags: needinfo?(mcmanus)
(In reply to Patrick McManus [:mcmanus] from comment #205)
> (In reply to Boris Zbarsky [:bz] from comment #204)
> > Hmm.  So the current things that can fail a channel with
> > NS_ERROR_NET_INTERRUPT in addition to the spdy code you mention are:
> > 
> > 1)  Anything that sends PR_END_OF_FILE_ERROR through ErrorAccordingToNSPR()
> > over in the socket transport.
> 
> right - I think that is probably the interesting one. afaict this change was
> meant to cover the empty transfer (but successful handshake) issue.. and we
> shouldn't be retriggering 106865 because daniel's patch only applies when
> there is an explicit mismatch in the framing information - and without some
> transfer we can't get any explicit framing information (i.e. response
> headers) to have a mismatch with.
> 
> > I'd like to understand Chrome's behavior here a bit better.  Does it start
> > doing progressive rendering and then throw it away when there is a sudden
> > connection close?
> 
> I updated http://www.ducksong.com/cgi-bin/nph-bad-cl to be a bit bigger..
> now it has 266KB of HTML (but a content length 10x that). Chrome renders
> "some" of the HTML and then stops. It doesn't draw it all - it appears racy
> as to what has been laid out when the error comes in. The final bit of text
> of the document says "premature eof" but chrome in my tests never gets that
> far even though it is always transferred to the client. I'm going to assume
> when it gets the error it stops drawings/parsings in progress.

For me chrome (with my own testing page that has just a few bytes) shows no content, and then the error after some 60 seconds of what appears to be a timeout...

Can you try a similar 266KB test in Chrome with JavaScript?

> 
> one happy thing to conclude from this is that the web is in a pretty good
> place wrt framing conformance.
> 
> chrome also doesn't replace the partial draw with an error page.. that's
> basically the same behavior honza's patch gives us except we seem to
> deterministically draw whatever we get.
(In reply to Boris Zbarsky [:bz] from comment #204)
> I'm pretty happy to try not running truncated JS (just like we don't run JS
> with an unclosed <script> tag, for similar reasons).
> 
> I'm also pretty happy to try not using truncated CSS, esp. if some other
> browsers already do that.

Just to confirm: in both cases, CSS and JS, "to not use" means: start parsing/compilation as the data chunks come in, but don't ever apply/execute until the content is finally complete (with NS_OK in OnStop)?
> and we shouldn't be retriggering 106865

That bug just means we need to keep some of the current handling of NET_INTERRUPT in the docshell.  Which means to the extent that we want different handling for the case at hand we may want a different error code here.  Or something.  That would get us the desired behavior for free, I believe, without requiring the extra checking in Honza's patch.

> to not use means: do start parsing/compilation as the data chunks are coming but don't
> ever apply/execute until the content is finally complete

In theory, yes.  In practice, both use a streamloader for now, so don't do anything with the data until OnStopRequest anyway (though we'd like to start changing that).
OK - Daniel, let's introduce a new error code.. NS_ERROR_NET_PARTIAL_TRANSFER (unless there is a better suggestion) and rework the patch to use that. we'll see if that gets the desired behavior for free or if we still need something like honza's patch.. if we do need it, at least it won't conflict with existing uses.

please also change the spdy/http2 code that generates INTERRUPT to use the new error code (as it is signaling the same condition from a different protocol).
Assignee

Comment 210

5 years ago
As requested, I'm attaching a new version of this patch that introduces and returns a new error code called NS_ERROR_NET_PARTIAL_TRANSFER.

It is used to signal exactly what it says: that the peer said more data was going to arrive but for some reason it didn't. The transfer was cut off prematurely.

(I didn't obsolete the v3 version just yet since we're still discussing approaches.)
(In reply to Patrick McManus [:mcmanus] from comment #203)
> :seth - what do you think about images or whom should we ask? If you would
> like to display the truncated images do you think you should be handling the
> INTERRUPTED error?

Imagelib decodes images in a streaming fashion and already has to handle cases where data is truncated or otherwise corrupted, so I don't think that side of things needs to be updated to support this error. We can just run the existing code in imgRequest::OnStopRequest as usual, and things should just work.

However, I do think that if we hit NS_ERROR_NET_PARTIAL_TRANSFER, we need to consider carefully how this should interact with the image cache. We definitely don't want to create a situation where an image loads partially due to a transient network error, the user fixes the network problem and refreshes the page, and we display the same truncated version from the cache!

Probably the right thing is to destroy the cache entry if we encounter this error.
Flags: needinfo?(seth)
Assignee

Comment 212

5 years ago
I wrote up a small local test with 4 pieces of content, all of them delivered broken with NS_ERROR_NET_PARTIAL_TRANSFER being signaled - according to the current patch. I did this by making sure the server sent a too-large Content-Length and closed the connection after having sent the actual (smaller) content.

A) HTML that used...
B) a small png image
C) CSS
D) an external javascript

Without this patch all 4 entities were used to render the page as if everything worked fine.

With this patch, only part (A) renders like before.

B - The image gets discarded in imgRequest::OnStopRequest() due to the error, so it isn't cached but also isn't displayed. Instead one of those "image missing" symbols is used.

C - The CSS is ignored

D - The javascript seems to have been discarded; the button in my HTML that ran code from it doesn't work anymore

Detecting what's going on: Using "Web Developer->Networking" isn't very helpful for diagnosing what's going on. It shows the requests, it shows the responses, and it can even show the (partial) image rendered, but there's no (to me) clear hint anywhere as to why they don't render (or run, in the javascript case).

To me, the not-showing-the-image is the surprising and perhaps not entirely desirable part.
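For reference, the kind of fixture this test describes (a server whose Content-Length promises more than it delivers, then closes the connection) can be reproduced in a few lines. This is an illustrative stand-alone Python model, not the harness actually used here; all names are made up for the sketch.

```python
import socket
import threading

def run_demo(body=b"<html>truncated page</html>", promised_extra=1000):
    """Serve one request with an inflated Content-Length, then read it back.
    Returns (bytes_received, bytes_promised) as a framing-aware client sees them."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    def serve():
        conn, _ = srv.accept()
        conn.recv(4096)  # read (and ignore) the request
        header = ("HTTP/1.1 200 OK\r\n"
                  "Content-Length: %d\r\n"
                  "\r\n" % (len(body) + promised_extra)).encode()
        conn.sendall(header + body)  # fewer body bytes than promised...
        conn.close()                 # ...then hang up prematurely
        srv.close()

    t = threading.Thread(target=serve)
    t.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"GET / HTTP/1.1\r\nHost: test\r\nConnection: close\r\n\r\n")
    data = b""
    while True:
        chunk = client.recv(4096)
        if not chunk:
            break
        data += chunk
    client.close()
    t.join()

    head, _, received_body = data.partition(b"\r\n\r\n")
    promised = next(int(line.split(b":", 1)[1].decode())
                    for line in head.split(b"\r\n")
                    if line.lower().startswith(b"content-length:"))
    # A framing-aware client flags the "c-l underrun" condition when these differ.
    return len(received_body), promised
```

Running this against a naive client shows why the bug went unnoticed for so long: unless the client compares bytes received against the advertised Content-Length, the early close looks like a normal end of response.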
so let's work with :seth to decide if the image should be rendered to partial completion or not.. my experience with chrome was that this was basically undefined (it depends on how much of the image was delivered before the error was found).

let's keep the ball moving.
Assignee

Comment 214

5 years ago
I altered my test slightly. In this test #2, the image (B), was a bigger JPEG that was delivered _very_ slowly. I could then see it being partially rendered as the image is being downloaded all the way during many seconds until the connection finally gets cut and then *blink* the now almost completely rendered image instead gets replaced by the (much smaller) image-not-found symbol...
(In reply to Daniel Stenberg [:bagder] from comment #214)
> I altered my test slightly. In this test #2, the image (B), was a bigger
> JPEG that was delivered _very_ slowly. I could then see it being partially
> rendered as the image is being downloaded all the way during many seconds
> until the connection finally gets cut and then *blink* the now almost
> completely rendered image instead gets replaced by the (much smaller)
> image-not-found symbol...

So, on a very slow connection (mobile somewhere in woods or satellite somewhere in a sea) I'll see the image loads but slowly.  I'll go to make a coffee with intention to later look at the image as it has download later.  When connection drops in at 99% I'll get my coffee and also a tiny "broken image" symbol instead - I will uninstall Firefox immediately - by throwing my laptop over board!! :))
Slightly off topic: could we indicate in the address bar (Larry) area that something didn't load completely on the page (mainly scripts) and offer user an option to attempt a reload only of those resources?
Assignee

Comment 217

5 years ago
(In reply to Honza Bambas (:mayhemer) from comment #215)

> So, on a very slow connection (mobile somewhere in woods or satellite
> somewhere in a sea) I'll see the image loads but slowly.  I'll go to make a
> coffee with intention to later look at the image as it has download later. 
> When connection drops in at 99% I'll get my coffee and also a tiny "broken
> image" symbol instead - I will uninstall Firefox immediately - by throwing
> my laptop over board!! :))

I completely agree with that. We want it to behave better. Is this really the only error situation that can happen that _shouldn't_ clear the image? It surprises me. I'll have to read more code...

Comment 218

5 years ago
does this maybe have something to do with http fragmentation? safari has a fragmentation bug where it won't load resources completely.
Assignee

Comment 219

5 years ago
(In reply to Jim Michaels from comment #218)
> does this maybe have something to do with http fragmentation? safari has a
> fragmentation bug where it won't load resources completely.

It depends on what you mean by "http fragmentation". It isn't a term I recognize and I couldn't find any definitive explanation of it - nor did I find any info about any such Safari bug...
(In reply to Honza Bambas (:mayhemer) from comment #215)

> So, on a very slow connection (mobile somewhere in woods or satellite
> somewhere in a sea) I'll see the image loads but slowly.  I'll go to make a
> coffee with intention to later look at the image as it has download later. 
> When connection drops in at 99% I'll get my coffee and also a tiny "broken
> image" symbol instead - I will uninstall Firefox immediately - by throwing
> my laptop over board!! :))

Here Firefox is on the safe side: better no info than partial info.
This is a good principle.

Firefox must not give the partial image *without saying that it is partial*.

But we can handle this case better: give the partial image AND say that the image is partial. We can do this by displaying the partial image with a "Broken image" sign just next to it. Ideally, the sign here would be a more precise version of "Broken image", a "Partial image" sign meaning that the image is partial. For example, we could use a torn photo missing a part.

Comment 221

5 years ago
I was about to suggest that the partial image or file still load, marked as damaged, with reload as an option.

I would rather see a partial image and know it is broken than be dumped onto an icon saying that only part of it downloaded. I would then have the option to re-download, or to start again and have Firefox use the cache and hopefully finish what it started.

Put a border around the image? Text at the bottom, or as suggested at the side, with a question to redownload the image?

This is still progress.
Assignee

Comment 222

5 years ago
Anyone with an idea about a suitable person involved or interested enough in the image or UI parts to involve in this discussion? I would like to push this topic forward!

I could also imagine that an extra UI element to mark the image as "broken" isn't really worth it, but I'm certainly not qualified to tell.
(In reply to Daniel Stenberg [:bagder] from comment #222)
> Anyone with an idea about a suitable person involved or interested enough in
> the image or UI parts to involve in this discussion? I would like to push
> this topic forward!
> 
> I could also imagine that an extra UI element to mark the image as "broken"
> isn't really worth it, but I'm certainly not qualified to tell.

Daniel - I would recommend you try and wade into the image lib code and make a patch to simply map PARTIAL to OK that you can propose to :seth or joe drew. If you can get their attention they might be able to give you direction or commentary
You probably want to look at imgRequest::OnStopRequest for starters.

Comment 225

5 years ago
I think it's really unlikely that we'll come up with UI for "showing this image, but really it's only partially loaded" that makes sense to most users. So I think Seth's plan to display what we get, but not cache it, makes sense.  (The only other approach I could see doing would be to try to reload the full image in the background to see if the partial content failure was transient, and replace it with the full image if so.  But that's at best a followup bug.)

bz: so far we've been talking about throwing away partial CSS.  Does that still seem like the right answer to you?
Flags: needinfo?(bzbarsky)
Throwing away partial CSS sounds just fine.
Flags: needinfo?(bzbarsky)

Comment 227

5 years ago
I'd suggest not "enhancing" the UI with a "question to redownload" or other buttons and texts at all. Just display partial images (as seen in the attached picture). Note, there is already "Reload image" in the context menu.

If the page is not okay (some images, CSS and images defined in CSS, etc. are partial/not loaded) the user can see it (and an experienced user can reload). Yes, FF could display info about it (see "Page not loaded correctly" in the attachment) but I think we should rather avoid such disturbance. At most, FF could display a balloon ONLY AFTER clicking on that small icon.

I vote for throwing away partial CSS.

Comment 230

5 years ago
I'm glad this is finally being worked on, but let's not bikeshed with UI ideas for broken images. The bug is the download manager not handling failed downloads correctly, not the display of incomplete files. What the UI should do when page elements fail to load is another issue.
Assignee

Comment 231

5 years ago
Here's another update to the patch. The notable difference this time is that the imgRequest::OnStopRequest() function is now modified so that partial images are still rendered, but no longer cached.

It will still throw away partial CSS and partial javascript.
Attachment #8390587 - Attachment is obsolete: true
Attachment #8399259 - Attachment is obsolete: true
Assignee

Updated

5 years ago
Attachment #8411651 - Flags: feedback?(seth)
Assignee

Comment 232

5 years ago
Here's what the try server says about the current patch: https://tbpl.mozilla.org/?tree=Try&rev=976834fce528
Assignee

Comment 233

5 years ago
Grrr, sorry commented on wrong bug. Scratch that try server link.
Assignee

Comment 234

5 years ago
Here's the try server push of the latest patch from here: https://tbpl.mozilla.org/?tree=Try&rev=2b1cd654e041 seems fine to me.

Anyone has anything else for me to test or adjust with this before we can land it?
(In reply to Daniel Stenberg [:bagder] from comment #234)
> Here's the try server push of the latest patch from here:
> https://tbpl.mozilla.org/?tree=Try&rev=2b1cd654e041 seems fine to me.
> 
> Anyone has anything else for me to test or adjust with this before we can
> land it?

you need an image peer to review the image bit.. :seth?
Assignee

Updated

5 years ago
Attachment #8411651 - Flags: feedback?(seth) → review?(seth)

Comment 236

5 years ago
Comment on attachment 8411651 [details] [diff] [review]
v5-0001-Bug-237623-detect-broken-HTTP1.1-transfers.-r-mcm.patch

Review of attachment 8411651 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/xpc.msg
@@ +150,4 @@
>  XPC_MSG_DEF(NS_ERROR_PORT_ACCESS_NOT_ALLOWED        , "Establishing a connection to an unsafe or otherwise banned port was prohibited")
>  XPC_MSG_DEF(NS_ERROR_NET_RESET                      , "The connection was established, but no data was ever received")
>  XPC_MSG_DEF(NS_ERROR_NET_INTERRUPT                  , "The connection was established, but the data transfer was interrupted")
> +XPC_MSG_DEF(NS_ERROR_NET_PARTIAL_TRANSFER           , "A transfer was only partially done when it completed")

Now that we figured out the desired behavior in case of partial transfers for several consumers, do we have a better understanding of whether we really need the new error code?

If so, do we have a clear set of cases defining when each one of these three network error codes should be generated, or do they have different meanings based on where they are used?

For example, in bug 983187 I verified that we generate NS_ERROR_NET_RESET when an RST packet is received, even if some data was already transferred, which doesn't seem to match the short description above. This is likely to affect the result, for example, when partial image rendering is involved, but I can't tell if that is a feature or a bug :-) The error code is probably generated here:

http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsSocketTransport2.cpp#156

I also wonder whether some consumers may better detect "partial transfer" based on whether they actually received any data from the stream, instead of the specific error code with which it was closed.

::: netwerk/protocol/http/nsHttpTransaction.cpp
@@ +870,5 @@
> +        (mHttpVersion >= NS_HTTP_VERSION_1_1)) {
> +
> +        if (NS_SUCCEEDED(reason) && !mResponseIsComplete) {
> +            reason = NS_ERROR_NET_PARTIAL_TRANSFER;
> +            LOG(("Partial transfer, incomplete HTTP responese received: %s",

typo: response

Comment 237

5 years ago
Well, that was my original point... the download manager knows the expected size of the transfer and can therefore provide that information in the user interface regardless of whether the underlying TCP connection was reset or timed out or otherwise terminated abnormally.

However...

If there is going to be detection of a mismatch between the size indicated in the header and the size in an otherwise complete transfer, it should be a new code. The new error is not an error at the TCP layer, but at the HTTP layer. It's distinct from a reset or interrupt.
Assignee

Comment 238

5 years ago
The download manager knows that for one of the cases this error can occur, yes - when Content-Length is used for HTTP 1.x. For the other cases involving chunked decoding or SPDY/http2, it doesn't.

Partial transfer means that the error was detected in the application protocol layer, above TCP, and that more data was promised but never delivered. A partial transfer means the transfer was cut off before it could finish.
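The condition described here can be condensed into a tiny decision function. This is an illustrative Python model of the check the patch adds in nsHttpTransaction::Close() (success status, HTTP/1.1 or later, response framing says incomplete), not the actual Gecko code; the constants are stand-ins.

```python
# Stand-in status values, not real Gecko nsresult codes.
NS_OK = "NS_OK"
NS_ERROR_NET_PARTIAL_TRANSFER = "NS_ERROR_NET_PARTIAL_TRANSFER"

def close_reason(reason, response_is_complete, http_version):
    """Map a successful close of an incomplete HTTP/1.1+ response to the
    partial-transfer error; any other close reason passes through unchanged."""
    if http_version >= 1.1 and reason == NS_OK and not response_is_complete:
        return NS_ERROR_NET_PARTIAL_TRANSFER
    return reason
```

Note that a hard failure (reset, abort) keeps its own code; only the "looked successful but the framing says otherwise" case is rewritten.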
(In reply to Daniel Stenberg [:bagder] from comment #238)
> The download manager knows that for one of the cases this error can occur,
> yes - when Content-Length is used for HTTP 1.x. For the other cases
> involving chunked decoding or SPDY/http2, it doesn't.

that comment made me look at the code again. tell me if this makes sense:

you need to update the spdy/http-2 paths to only generate partial for this case, rather than everywhere we currently do interrupt.

that pretty much just means testing the stream for having read a data frame to decide which error to return.. for spdy that's stream->RecvdData().. for http2 you'll need to add RecvdData() and SetRecvdData() methods. (I managed to remove them, but I guess they are coming back). You can easily do SetRecvdData() in http2session::readytoprocessdataframe().

likewise session::shutdownenumerator should be changed to generate partial instead of abort in cases of stream->recvddata()
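The selection described in this comment can be sketched as follows. This is illustrative Python rather than the real session code: streams past a clean GOAWAY (or never registered) keep NET_RESET so the transaction layer restarts them, while the ambiguous ones choose between PARTIAL_TRANSFER and ABORT based on whether any response data was received.

```python
def shutdown_close_code(clean_shutdown, stream_id, goaway_id,
                        has_registered_id, recvd_data):
    """Pick a close code for a stream during session shutdown (sketch;
    names mirror the discussion, not the actual SpdySession/Http2Session code)."""
    if clean_shutdown and (stream_id > goaway_id or not has_registered_id):
        # The stream never saw response data; it is safe to restart it.
        return "NS_ERROR_NET_RESET"
    # Ambiguous streams: partial if some data already arrived, else abort.
    return "NS_ERROR_NET_PARTIAL_TRANSFER" if recvd_data else "NS_ERROR_ABORT"
```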
Assignee

Comment 240

5 years ago
Another iteration. Now with added checks in the SPDY and http2 code so the "partial transfer" error is only returned if some data has actually been received.

The image part remains like in the v5 version of the patch that is still pending review so I leave v5 "open" here for that reason.
Attachment #8417236 - Flags: review?(mcmanus)
Comment on attachment 8417236 [details] [diff] [review]
v6-0001-Bug-237623-detect-broken-HTTP1.1-transfers.-r-mcm.patch

Review of attachment 8417236 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/Http2Session.cpp
@@ -140,5 @@
>    // registered an ID haven't actually been sent yet so they can always be
>    // restarted.
>    if (self->mCleanShutdown &&
>        (stream->StreamID() > self->mGoAwayID || !stream->HasRegisteredID())) {
> -    self->CloseStream(stream, NS_ERROR_NET_RESET); // can be restarted

this path should remain unchanged. It applies to transactions greater than the id of a clean goaway. They haven't actually seen any response data (we could warn if they had - but not necessary) and will be restarted by nsHttpTransaction with this code.

@@ -142,5 @@
>    if (self->mCleanShutdown &&
>        (stream->StreamID() > self->mGoAwayID || !stream->HasRegisteredID())) {
> -    self->CloseStream(stream, NS_ERROR_NET_RESET); // can be restarted
> -  } else {
> -    self->CloseStream(stream, NS_ERROR_ABORT);

this is the path that should test recvdData and choose between PARTIAL_TRANSFER and ERROR_ABORT.. basically these are streams that are either ambiguous by the goawayID or we have never received a goaway (for example when the tcp session fails)..

::: netwerk/protocol/http/SpdySession3.cpp
@@ +1993,3 @@
>      if (mDownstreamRstReason == RST_REFUSED_STREAM)
>        rv = NS_ERROR_NET_RESET;            //we can retry this 100% safely
>      else if (mDownstreamRstReason == RST_CANCEL ||

I apologize for iterating like this through review comments Daniel!

I think we only want this recvdData check on CANCEL.. PROTOCOL_ERR, INTERNAL_ERR, and UNSUPPORTED_VERSION should continue to always return NET_INTERRUPT. (same for spdy 3.1)
Attachment #8417236 - Flags: review?(mcmanus)
Comment on attachment 8411651 [details] [diff] [review]
v5-0001-Bug-237623-detect-broken-HTTP1.1-transfers.-r-mcm.patch

Review of attachment 8411651 [details] [diff] [review]:
-----------------------------------------------------------------

You'll need to update the patch to say 'r=mcmanus,seth', it looks like.

I only reviewed the changes in imgRequest.cpp; please let me know if I was supposed to look at anything else.

::: image/src/imgRequest.cpp
@@ +670,5 @@
>  
> +  bool isPartial = false;
> +  if (mImage && (status == NS_ERROR_NET_PARTIAL_TRANSFER)) {
> +    isPartial = true;
> +    status = NS_OK; // fake happy face

It might be a good idea to add some kind of debug-mode log message here. NS_WARNING is fine with me, or you could use the PR_LOG stuff.

@@ +695,5 @@
>      // loading compressed source data, which is part of our size calculus.
>      UpdateCacheEntrySize();
>    }
> +  else if (!isPartial) {
> +    // if the error isn't "just" a partial transfer

This looks like it needs a little more refinement. The cases are:

- Things succeeded (mImage && NS_SUCCEEDED(status)) and we didn't get a partial transfer. This looks to be handled correctly to me - just call UpdateCacheEntrySize.

- Things succeeded and we *did* get a partial transfer. In this case we should still call UpdateCacheEntrySize, or else what we're doing is retaining the cache entry but treating it as if it only occupied 0 bytes, which isn't true. For the cache logic to behave correctly we need to know the actual size of the cache entries. Since this case and the preceding case end up needing to run the same code, you should just get rid of the |&& !isPartial| in the first |if| condition.

- Things failed. In this case we want to call Cancel no matter what, because presumably we don't actually have an image to show. If we make the change discussed above this should work correctly, because we only set isPartial to true if mImage is non-null, and when we do that we set status to NS_OK, meaning that if isPartial is true we'll always end up in the |if| case. That does mean, though, that the check for |if (!isPartial)| in the else branch doesn't add anything and should be removed - it should still just be |else|.

All of that was a very long way of saying that the additional checks for |!isPartial| seem to me like they should not be there.

Were you running into an issue that caused you to add them?
Attachment #8411651 - Flags: review?(seth)
Argh, sorry, I got too caught up in that 'if' logic and missed the bigger picture. You actually will need three branches of the |if| now, because in the case where things succeeded and we did get a partial transfer, we want to *throw out* the cache entry.

You can just call RemoveFromCache to do this, except - isn't there always a catch? - OnStopRequest may be running off main thread, but you need to call RemoveFromCache on the main thread. You should be able to write code very similar to the code in |imgRequest::Cancel| that calls ContinueCancel on the main thread if needed. (But don't reuse ContinueCancel since you don't want to do the other thing that ContinueCancel does.)
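The "run directly if already on the main thread, otherwise dispatch to it" trick referred to here can be modeled outside Gecko like this. The queue stands in for the main-thread event loop, and all names are illustrative; the real code would use NS_IsMainThread/NS_DispatchToMainThread.

```python
import threading
import queue

MAIN_THREAD = threading.current_thread()   # thread that imported this module
MAIN_EVENT_QUEUE = queue.Queue()           # stand-in for the main-thread event loop

def evict_from_cache(log):
    # Like RemoveFromCache: must only ever run on the main thread.
    assert threading.current_thread() is MAIN_THREAD
    log.append("evicted")

def evict_on_main_thread(log):
    if threading.current_thread() is MAIN_THREAD:
        evict_from_cache(log)                                 # run directly
    else:
        MAIN_EVENT_QUEUE.put(lambda: evict_from_cache(log))   # dispatch

def demo():
    """Call the helper from a worker thread, then drain the queue on main."""
    log = []
    worker = threading.Thread(target=evict_on_main_thread, args=(log,))
    worker.start()
    worker.join()
    while not MAIN_EVENT_QUEUE.empty():   # the event loop would do this for us
        MAIN_EVENT_QUEUE.get()()
    return log
```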
Assignee

Comment 244

5 years ago
Thanks for the reviews.

mcmanus: it's really no problem, talking about specific source code changes makes it very concrete and direct. I believe I've now fixed the nits you pointed out in my v6 version.

seth: yes, thanks a lot, that was exactly the kind of review I wished for. I've now cloned the trick Cancel() used and added a call that removes the partial image from the cache. I believe this is in the style you meant and it seems to work fine in my local tests.
Attachment #8411651 - Attachment is obsolete: true
Attachment #8417236 - Attachment is obsolete: true
Attachment #8417880 - Flags: review?(seth)
Attachment #8417880 - Flags: review?(mcmanus)
Comment on attachment 8417880 [details] [diff] [review]
v7-0001-Bug-237623-detect-broken-HTTP1.1-transfers.-r-mcm.patch

Review of attachment 8417880 [details] [diff] [review]:
-----------------------------------------------------------------

r+ on netwerk bits
Attachment #8417880 - Flags: review?(mcmanus) → review+
Assignee

Updated

5 years ago
Blocks: 1008075
Assignee

Updated

5 years ago
Blocks: 1008091

Comment 246

5 years ago
Seth, any chance you can review the last bits here?  We only need you to look at the 2 imgRequest changes.
Flags: needinfo?(seth)
Comment on attachment 8417880 [details] [diff] [review]
v7-0001-Bug-237623-detect-broken-HTTP1.1-transfers.-r-mcm.patch

Review of attachment 8417880 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

::: image/src/imgRequest.cpp
@@ +743,5 @@
>    }
> +  else if (isPartial) {
> +    // Remove the partial image from the cache.
> +    this->EvictFromCache();
> +  }

This looks good, but a minor style nit: I'd suggest that one of the last two |else if|s be replaced with an |else| so that it's immediately obvious to the reader that we *always* take one of these branches.

If it were me, I'd probably use this sequence:

if (mImage && ...) {
} else if (isPartial) {
} else {
}

I'd do it that way only out of a personal preference to avoid negative conditions where possible.
Attachment #8417880 - Flags: review?(seth) → review+
Flags: needinfo?(seth)
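Putting the pieces of this review together, the final decision made at the end of an image load can be summarized with the following illustrative model. The names mirror the patch under review, but the code is a sketch, not imagelib itself.

```python
def on_stop_outcome(have_image, status):
    """Return (display_image, cache_action) for a finished image load.
    Sketch of the reviewed imgRequest::OnStopRequest logic; names illustrative."""
    is_partial = have_image and status == "NS_ERROR_NET_PARTIAL_TRANSFER"
    if is_partial:
        status = "NS_OK"  # keep the partially decoded image on screen
    if have_image and status == "NS_OK" and not is_partial:
        return True, "UpdateCacheEntrySize"   # complete: keep it cached
    elif is_partial:
        return True, "EvictFromCache"         # never serve truncated bytes later
    else:
        return False, "Cancel"                # hard failure: no image at all
```

This is the shape seth asked for: an if / else if / else chain where the reader can see at a glance that exactly one branch always runs.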
Assignee

Comment 248

5 years ago
This is a rebased/refreshed version of the patch with the minor suggestion from seth addressed.

This patch is tested in this try run: 
https://tbpl.mozilla.org/?tree=Try&rev=ae6a776e72ee
Attachment #8417880 - Attachment is obsolete: true
Attachment #8436705 - Flags: review+
Assignee

Updated

5 years ago
Keywords: checkin-needed
Do we have an explanation for the test_partially_cached_content.html perma-fail on that Try run?
Keywords: checkin-needed
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #249)
> Do we have an explanation for the test_partially_cached_content.html
> perma-fail on that Try run?

In the test I have an iframe that is supposed to load partial content. Without the patches in this bug, I can call document.getElementById('contentFrame').addEventListener('load'...) and the load listener is called. The test does its checks and continues to completion.

With the patch, however, the load event for the iframe object does not fire and the test times out. Is this expected behavior?

I can get around it by calling window.addEventListener(...) - not sure if this is the right thing to do.
Assignee

Comment 251

5 years ago
(In reply to Steve Workman [:sworkman] from comment #250)

> With the patch, however, the load event for the iframe object does not fire
> and the test times out. Is this expected behavior?

Thanks Steve! I'll admit this is territory I haven't mastered yet, so I can't answer quickly; I need to dig deeper.

The patch basically now makes the nsHttpTransaction::Close() method set mStatus to a non-zero value as compared to 0 (OK) before. A log from a test run shows that this condition happens in this case: "Partial transfer, incomplete HTTP responese received: c-l underrun" is present.
Assignee

Comment 252

5 years ago
I've now learned that when nsHttpChannel::OnStopRequest() stores NS_ERROR_NET_PARTIAL_TRANSFER in mStatus (which is the 'status' argument passed in to it after this failed HTTP transaction), it stops the test from passing.

I can make it run again with a weird hack like this:

-    mStatus = status;
+
+    if (status == NS_ERROR_NET_PARTIAL_TRANSFER) {
+        LOG(("nsHttpChannel::OnStopRequest: partial, store fake OK\n"));
+        mStatus = NS_OK;
+    }
+    else
+        mStatus = status;

I'm not suggesting this is the correct way forward, I'm just showing what I've found.

Comment 253

5 years ago
I pre-emptively apologise if I have misunderstood what's going on here from only reading the last four comments. I've been following the bug progress on CC list for 7 years but not looking at the code at all.

(In reply to Daniel Stenberg [:bagder] from comment #252)
> I can make it run again with a weird hack like this:

Red flags rising at this point.

> +    if (status == NS_ERROR_NET_PARTIAL_TRANSFER) {
> +        LOG(("nsHttpChannel::OnStopRequest: partial, store fake OK\n"));
> +        mStatus = NS_OK;
> +    }

Red flags waving and alarm bells ringing at this point.

Pretending to the next higher layer that nothing bad happened in the lower layer is what gave rise to this bug in the first place 10 years ago. It looks to me like those 3 lines would preserve this bug for HTML client javascript code just as Mozilla is on the verge of fixing the bug at the file download and image display level.

Now it may be that the test case as currently written doesn't pass with the patch installed, but is that because this fix will be visible to client code in mStatus value, so any web page application JS that has assumed only 100% or 0% transfers in IFRAMEs will be broken by the fixed behaviour? If yes, both legacy app code and this test case would have to be updated, and the test would ensure load is NOT fired? 

In particular, what does firing of the "load" event mean in the HTML DOM spec, and therefore should it fire if any transfer was only partial?

The old DOM Level 2 spec seems quite clear:
 http://www.w3.org/TR/DOM-Level-2-Events/events.html#Events-eventgroupings-htmlevents
> " The load event occurs when the DOM implementation finishes loading all content within a document, all frames within a FRAMESET, or an OBJECT element. "

A partial transfer means not all content was loaded, so I would interpret this spec to imply that any NS_ERROR_NET_PARTIAL_TRANSFER in any part of loading the page would prevent .load() from firing, even on an IFRAME. (That's what I guessed above.)

Checking the most recent draft L3 spec from last year it adds an exception:
  http://www.w3.org/TR/DOM-Level-3-Events/#event-type-load
> "A user agent MUST dispatch this event when the DOM implementation finishes loading the resource (such as the document) and any dependent resources (such as images, style sheets, or scripts). Dependent resources that fail to load MUST NOT prevent this event from firing if the resource that loaded them is still accessible via the DOM."

Well in this case the IFRAME object is still accessible from the parent page's DOM, right? So this would mean the .load() would fire on the top Document even if a NS_ERROR_NET_PARTIAL_TRANSFER happened in loading the child IFRAME.

The two specs seem to imply different browser behaviour for the same comms error sequence for the same HTML.

So even after solving the lower level issue of passing on the true value of mStatus, you may even need some code at a higher layer to decide whether Document.load() will be fired based on whether a partial transfer (or zero transfer) occurred in an IFRAME *and* which compatibility/conformance profile is being used to render the page.

I really hope I haven't just confused things further and that the above is helpful.

Comment 254

5 years ago
I suspect that test_partial_content is failing when it tries to set up the partial cache entry. I.e. the cache code stores a partial entry when it gets NS_OK and a content length that != Content-Length, but it doesn't store one when it gets NS_ERROR_NET_PARTIAL_TRANSFER. So the test fails.

I'm not sure how much partial caching really buys us anyway, and this isn't a correctness issue (we'll just download the whole object instead of the remainder). But I suspect it's an easy fix in the cache code, so let's try that. Assuming I'm right about the cause... :)
(In reply to Jason Duell (:jduell) from comment #254)
> I suspect that test_partial_content is failing when it tries to set up the
> partial cache entry. I.e. the cache code stores a partial entry when it gets
> NS_OK and a content length that != Content-Length. But it doesn't store
> one when it gets NS_ERROR_NET_PARTIAL_TRANSFER. So the test fails.

I'm not so sure about that. The test carries out 2 requests. The sjs checks that the first one is not a byte-range request, but only returns part of the data. This seems to prime the cache as expected. The 2nd request is a byte-range request for the rest of the data, and the sjs checks this. I don't see any error due to this part.

The problem that I see is to do with the test expecting but not getting a 'load' event being fired from the iframe in which it is loading data. I have a workaround that listens for 'load' on the window object instead, but we should make sure that the lack of these events is expected and desirable before making that change.

I'm going to look into these events later today.
Not a whole lot to add yet. I can see nsHTMLDocument::EndLoad and nsDocument::EndLoad being called for the first response, which has a content-length underrun. They're indirectly called by the nsHtml5StreamParser when it's done parsing. I see them being called with and without the patch, but, of course, no js 'load' event with the patch.

nsDocument::DispatchContentLoadedEvents looks interesting and I'd be looking there next. Daniel, maybe you could try looking at that code first. If you're stuck, maybe someone else looking at the bug can help.
Assignee

Comment 257

5 years ago
(In reply to Steve Workman [:sworkman] from comment #255)

> The problem that I see is to do with the test expecting but not getting a
> 'load' event being fired from the iframe in which it is loading data. I have
> a workaround that listens for 'load' on the window object instead, but we
> should make sure that the lack of these events is expected and desirable
> before making that change.

The need for having the load event on the window object and not on the iframe seems to me to be perfectly consistent and aligned with what Andrew says (and the specs he links to) in comment 253.
If a load event fires on the window in an iframe, a load event will also fire on the iframe.  If that's not happening, there's a bug somewhere
(In reply to Daniel Stenberg [:bagder] from comment #257)
> The need for having the load event on the window object and not on the
> iframe seems to me to be perfectly consistent and aligned with what Andrew
> says (and the specs he links to) in comment 253.

Cool - yes, no load on the embedded iframe, but a load for the main window/document.

(In reply to Boris Zbarsky [:bz] from comment #258)
> If a load event fires on the window in an iframe, a load event will also
> fire on the iframe.  If that's not happening, there's a bug somewhere

Ah, let me clarify, sorry: it's not the iframe's window on which the load event is firing; it's the owning window.

So, given all this, changing the test to listen for load on the main document window is ok. I have done this and the test passes.

In response to Jason's comment #254, which queries whether partial data should be cached: Pat and Honza have confirmed that resumable partial data is cacheable. Resumable means it can do byte-ranges, and this test case can, so we're good.

Try run of Daniel's patch and the amended test patch:
https://tbpl.mozilla.org/?tree=Try&rev=f1ee7285ebd4
Attachment #8439535 - Flags: review?(mcmanus)
Attachment #8439535 - Flags: review?(mcmanus) → review+
Assignee

Updated

5 years ago
Keywords: checkin-needed
Updating commit line with r=mcmanus.
Attachment #8439535 - Attachment is obsolete: true
Attachment #8440096 - Flags: review+

Comment 262

5 years ago
How does this affect XMLHttpRequest, streamability of CSS, etc?
Assignee

Comment 263

5 years ago
(In reply to Anne (:annevk) from comment #262)
> How does this affect XMLHttpRequest, streamability of CSS, etc?

If they get partial content they might be affected, otherwise not.

Code paths that really want to handle a partial transfer in some distinct way other than treating it as one of the other possible network errors need additional logic.

A partial transfer being one where the response promised to deliver more data than was actually delivered before the connection was closed.

Is there any particular use case you think could be affected?

Comment 264

5 years ago
What concerned me was your blog post and respecting Content-Length. Most of the stuff we have today assumes that if Content-Length was wrong, we don't need to clean up what we got so far.
Assignee

Comment 265

5 years ago
A partial error is actually not that common so first of all this really shouldn't trigger very often.

In my perception, the biggest pain this has caused is the subject of this bug: the download manager thinking a download worked fine even though it was cut off before the whole transfer was done.

I would like to view this from the other end: this change helps us across the board to avoid making wrong assumptions about partially received content. Incomplete data is now clearly signaled. This change should not add any extra need for cleanups compared to before. Unless I'm missing something.

Comment 266

5 years ago
I don't really care about frequency.

So it's only a signal, down-level APIs can decide whether to use or discard that information? That seems fine and would match the model described by Fetch.
Assignee

Comment 267

5 years ago
(In reply to Anne (:annevk) from comment #266)

> So it's only a signal, down-level APIs can decide whether to use or discard
> that information?

Yes

Comment 268

5 years ago
(In reply to Daniel Stenberg [:bagder] from comment #263)
> If they get partial content they might be affected, otherwise not.
> 
> Code paths that really want to handle a partial transfer in some distinct
> way other than treating it as one of the other possible network errors need
> additional logic.
> 
> A partial transfer being a transfer that has promised to deliver more data
> than what was actually delivered before it was closed.
> 
> Is there any particular use case you think could be affected?

Hope this can help fix bug 976608
https://hg.mozilla.org/mozilla-central/rev/bb7ae1cc7789
https://hg.mozilla.org/mozilla-central/rev/2707ae2da0eb
Status: NEW → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla33

Updated

5 years ago
Depends on: 1027353
Duplicate of this bug: 87151
Added in the release notes with the wording:
"Fix bad downloads by detecting broken HTTP1.1 transfers (237623)"
(better wording would be appreciated)
"Fix incomplete downloads being marked as complete by detecting broken HTTP1.1 transfers (bug 237623)"? (expanding on what 'bad' means here)

I'm not 100% on what the new behavior here is though: does a broken transfer now just properly cause a download to fail, or does the download manager also attempt to retry it?
Much better. Updated. Thanks :
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #272)

> I'm not 100% on what the new behavior here is though: does a broken transfer
> now just properly cause a download to fail, or does the download manager
> also attempt to retry it?

Previously a certain type of broken transfer could be reported as OK in the download manager. Now it is properly reported as broken and it can be resumed if the user clicks the resume button. (There was no resume button before because the download was incorrectly perceived as OK.)

Updated

5 years ago
Duplicate of this bug: 536916

Comment 276

5 years ago
I wonder myself if the release notes are correct.

https://www.mozilla.org/en-US/firefox/33.0a2/auroranotes/
lists this bug as fixed

https://www.mozilla.org/en-US/firefox/33.0beta/releasenotes/
lists this bug as unresolved

Updated

5 years ago
Duplicate of this bug: 1069220

Updated

5 years ago
Duplicate of this bug: 915829
(In reply to karl155 from comment #276)
> I wonder myself if the release notes are correct.
> 
> https://www.mozilla.org/en-US/firefox/33.0a2/auroranotes/
> lists this bug as fixed
> 
> https://www.mozilla.org/en-US/firefox/33.0beta/releasenotes/
> lists this bug as unresolved

Still listed as unresolved in 33.0beta
Flags: needinfo?(sledru)
Depends on: 1079136
That is a bug in our release notes interface... I reported it as bug 1079136

> Still listed as unresolved in 33.0beta
Yes and no. The tag is unresolved but the line under shows "Resolved in v33.0a2"
Flags: needinfo?(sledru)
Depends on: 1083090

Updated

5 years ago
Depends on: 1083740
Depends on: 1084395
Depends on: 1085094

Comment 281

5 years ago
This 'fix' has 'broken' a lot of web sites.

Older versions of Apache put the uncompressed length in the header (instead of the compressed length).
In particular this causes style sheets to be ignored and pages to be badly rendered.

Comment 282

5 years ago
Re. comment 281 - I would complain to those sites.  Then again, how hard would it be to add a right-click menu option, "Reload this page ignoring file sizes in headers"?

If you file a new bug, please tell us about it here.
Depends on: 1088850
Depends on: 1088940

Comment 284

5 years ago
How about checking both the compressed and uncompressed size?
reopened based on 1088850

(In reply to Peter da Silva from comment #284)
> How about checking both the compressed and uncompressed size?

that's only a subset of the problem. A large number of PDF issues are traced to invalid chunked encodings.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee

Comment 286

5 years ago
So how about that we, in the "relaxed" state (default), add two checks:

1) if gzipped content-encoding is present: we consider an exact match for the "other size" to also be fine if the transfer ended. There's a small risk of false positives and we only detect it at the end of an HTTP transaction. If the other size doesn't match either, it is a partial error.

2) when the transaction ends, we consider leftover non-received bytes in the last chunk to be an error. That'll make us accept responses that are just missing a chunk (it seems to be most commonly the final zero-sized one) but still detect most/many premature disconnects.

However, given how problematic this area is to get properly tested in real life, would it make sense to introduce these behind some additional pref checks or similar, so that they can be switched on/off in case of need?

BTW, the check for #2 above can probably be made as simple as this:

--- a/netwerk/protocol/http/nsHttpTransaction.cpp
+++ b/netwerk/protocol/http/nsHttpTransaction.cpp
@@ -893,11 +893,12 @@ nsHttpTransaction::Close(nsresult reason)
         (NS_SUCCEEDED(reason) && !mResponseIsComplete)) {
 
         NS_WARNING("Partial transfer, incomplete HTTP response received");
 
         if ((mHttpVersion >= NS_HTTP_VERSION_1_1) &&
-            gHttpHandler->GetEnforceH1Framing()) {
+            (gHttpHandler->GetEnforceH1Framing() ||
+             (mChunkedDecoder && mChunkedDecoder->GetChunkRemaining()) ) ) {
             reason = NS_ERROR_NET_PARTIAL_TRANSFER;
             LOG(("Partial transfer, incomplete HTTP response received: %s",
                  mChunkedDecoder ? "broken chunk" : "c-l underrun"));
         }
(In reply to Daniel Stenberg [:bagder] from comment #286)
> So how about that we, in the "relaxed" state (default) add two checks:
> 
> 1) if gzipped content-encoding is present: we consider an exact match
> for the "other size" to also be fine if the transfer ended. There's a small
> risk for false positives and we only detect it at the end of an HTTP
> transaction. If the other size doesn't match either, it is a partial error.
> 

Actually, we can probably consider it OK if either we read the right number of bytes (to match the CL) or the zlib decoder gives Z_STREAM_END. We don't need to try to match the mismatched lengths as the deflate stream has its own framing.

> 2) when the transaction ends, we consider leftover non-received bytes in the
> last chunk to be an error. That'll make us be fine with responses just
> missing a chunk (it seems to be most commonly the final zero sized one) but
> still detect most/many premature disconnects.
> 

I think the problem with the assertion about the missing last chunk is that we only have evidence it applies to the false positives that gave us fits (i.e. the PDFs). So I agree it would keep us away from the interop problems - but we don't really have any evidence to suggest that this will catch the kind of real failure that bug 237623 is trying to address. Nonetheless it's conservative, and that's good.
How about this: if the download manager downloads a file and gets the partial error, it triggers another download. If the size of the second download matches the first, there's probably a server-side issue, and we should keep the file.

Of course, downloading things twice might be excessive for multi-megabyte files.
Assignee

Comment 289

5 years ago
(In reply to Valentin Gosu [:valentin] from comment #288)
> How about if the download manager downloads a file, it gets the partial
> error, and triggers another download. If the size for the second download
> matches the first, it means there's probably a server side issue, and we
> should keep the file.
> 
> Of course, downloading things twice might be excessive for multi-megabyte files.

That, and it would only be a fix for the case #1 which I believe mcmanus's Z_STREAM_END check is a better work-around for.

Comment 290

5 years ago
How about at least getting the download manager to notify the user when there's an inconsistent download size (as was suggested earlier in this thread) and letting the user decide what to do? This wouldn't impact web pages and would avoid invisible data loss from incomplete downloads.

Comment 291

5 years ago
In addition to comment 290, it would be nice in general to be notified in some way that the server has this kind of problem, so that it would eventually get fixed. This information should be available at least in some Web Developer component and/or in the Page Info window.

It is quite possible that the person doing the development or server management does not even notice that their site is broken otherwise.
Given that we shipped Firefox 33 being unable to download from them, that should give them some idea they have a problem, right :-) 

Feel free to file a bug against our WebDev tools; it might be suitable for their network component. But such servers are thankfully a minority and I'm not sure we want to, or should, detect and warn for every possible broken server config in the browser.
Assignee

Updated

5 years ago
No longer blocks: 1008091
Comment hidden (me-too)
Assignee

Comment 295

4 years ago
Here's my plan to address this issue. I want to make the checks stricter but still allow some common protocol violations.

In particular I want to allow two flaws that have proven to be somewhat common in the wild and were the reasons for the previous fix being backed out again:

A - HTTP chunked responses that lack the last 0-sized chunk.

B - HTTP gzipped responses where the Content-Length is not the same as the actual contents.

So, in order to allow A + B and yet be able to detect prematurely cut off transfers I want to:

1 - Detect incomplete chunks when the transfer has ended. So, if a chunked-encoded transfer ends exactly on a chunk boundary we consider that fine. Good: this will allow case (A) to be considered fine. Bad: it will make us fail to detect a certain amount of cut-offs.

2 - When receiving a gzipped response, we consider a gzip stream that doesn't end cleanly according to the gzip decompressing state machine to be a partial transfer. IOW: if a gzipped transfer ends fine according to the decompressor, we do not check for size mismatches. This allows case (B) as long as the content could be decoded.

3 - When receiving HTTP that isn't content-encoded/compressed (like in case 2) and not chunked (like in case 1), perform the size comparison between Content-Length: and the actual size received and consider a mismatch to mean a NS_ERROR_NET_PARTIAL_TRANSFER error.

Not really decided yet: what to do with the pref previously added to enable the strict check, "network.http.enforce-framing.http1". Right now I keep the pref set to false by default to behave as described above, while setting it to true will make it act like the previous version of my patch, which is: strict checking for content-length == response-size and, if chunked-encoded, checking that the stream ended with a proper 0-chunk.

Stay tuned for the first take on the first version of patches for 1 - 3.
Assignee

Comment 296

4 years ago
When doing chunked decoding, detect and return an error for a cut-off last chunk.
Attachment #8559098 - Flags: review?(mcmanus)
Assignee

Comment 297

4 years ago
If the gzip decompression hasn't ended fine when the HTTP request is stopped, consider it partial.
Attachment #8559100 - Flags: review?(mcmanus)
Assignee

Comment 298

4 years ago
Reject "unaligned" HTTP transfers (where Content-Length does not match the size received) that are not content-decoded and not chunked-encoded.
Attachment #8559101 - Flags: review?(mcmanus)
Comment on attachment 8559098 [details] [diff] [review]
0001-bug-237623-check-for-incomplete-chunks-on-transactio.patch

Review of attachment 8559098 [details] [diff] [review]:
-----------------------------------------------------------------

the whole patchset needs to be controllable via pref
Attachment #8559098 - Flags: review?(mcmanus) → review+
Comment on attachment 8559100 [details] [diff] [review]
0002-bug-237623-check-for-incomplete-gzip-on-transaction-.patch

Review of attachment 8559100 [details] [diff] [review]:
-----------------------------------------------------------------

the whole patchset needs to be controllable via pref

::: netwerk/streamconv/converters/nsHTTPCompressConv.cpp
@@ +87,5 @@
>  NS_IMETHODIMP
>  nsHTTPCompressConv::OnStopRequest(nsIRequest* request, nsISupports *aContext, 
>                                    nsresult aStatus)
>  {
> +    if (!mStreamEnded) {

if (!mStreamEnded && NS_SUCCEEDED(aStatus)) {
Attachment #8559100 - Flags: review?(mcmanus) → review+
Comment on attachment 8559101 [details] [diff] [review]
0003-bug-237623-reject-unaligned-HTTP-transfers-that-are-.patch

Review of attachment 8559101 [details] [diff] [review]:
-----------------------------------------------------------------

I think its ok to morph EnforceH1Framing to mean enforce this new algorithm (which is a slight weakening) and then flip the pref to true. That's up to you if you want that or a new pref, but the new path definitely needs a pref based disabling mechanism.

::: netwerk/protocol/http/nsHttpTransaction.h
@@ +146,5 @@
>          mPushedStream = nullptr;
>          return r;
>      }
>      void SetPushedStream(Http2PushedStream *push) { mPushedStream = push; }
> +    void SetContentDecoding(bool decode) { mContentDecoding = decode; }

I think you can figure out mContentDecoding inside nsHttpTransaction without creating this method and having nsHttpChannel do it for you.. maybe the same place we make the pushback() decision

@@ +282,5 @@
>      bool                            mResponseTimeoutEnabled;
>      bool                            mDontRouteViaWildCard;
>      bool                            mForceRestart;
>      bool                            mReuseOnRestart;
> +    bool	                        mContentDecoding;

indentation
Attachment #8559101 - Flags: review?(mcmanus)
Assignee

Comment 302

4 years ago
Thanks for the review. Did you mean something like this?

(and yes, I'll make another iteration and make the set use a pref)
Attachment #8559101 - Attachment is obsolete: true
Attachment #8559810 - Flags: review?(mcmanus)
Assignee

Comment 303

4 years ago
Added a new pref for this. This is called "network.http.enforce-framing.soft" and is set true by default.

I made it this way so that the previous "network.http.enforce-framing.http1" pref still overrides this new pref if enabled, and the new pref sets a middle strictness level.
Attachment #8559875 - Flags: review?(mcmanus)
Assignee

Comment 304

4 years ago
With this additional patch that updates two tests as well to use the new pref, I get the following try-run:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ce6c67a8cf1e

There are some oranges, but I can't see that they are caused by my patch series.
Attachment #8561281 - Flags: review?(mcmanus)
daniel - can you mark old patches on this bug as obsolete? I'm getting confused :)

(its under details)
Comment on attachment 8559810 [details] [diff] [review]
v2-0003-bug-237623-reject-unaligned-HTTP-transfers-that-a.patch

Review of attachment 8559810 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpTransaction.cpp
@@ +913,5 @@
>  
>          if ((mHttpVersion >= NS_HTTP_VERSION_1_1) &&
>              (gHttpHandler->GetEnforceH1Framing() ||
> +             (mChunkedDecoder && mChunkedDecoder->GetChunkRemaining()) ||
> +             (!mChunkedDecoder && !mContentDecoding) ) ) {

!mChunkedDecoder && !mContentDecoding && mContentDecodingCheck  (maybe?)

@@ +1742,5 @@
>          }
> +
> +        if (!mContentDecodingCheck && mResponseHead) {
> +            mContentDecoding =
> +                (mResponseHead->PeekHeader(nsHttp::Content_Encoding) != nullptr);

write as !!foo not foo != nullptr
Attachment #8559810 - Flags: review?(mcmanus) → review+
Comment on attachment 8559875 [details] [diff] [review]
v3-0004-bug-237623-use-network.http.enforce-framing.soft-.patch

Review of attachment 8559875 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/init/all.js
@@ +1348,5 @@
>  
>  pref("network.http.tcp_keepalive.long_lived_connections", true);
>  pref("network.http.tcp_keepalive.long_lived_idle_time", 600);
>  
>  pref("network.http.enforce-framing.http1", false);

comment that this one should really be named hard

::: netwerk/protocol/http/nsHttpHandler.cpp
@@ +1537,4 @@
>          rv = prefs->GetBoolPref(HTTP_PREF("enforce-framing.http1"), &cVar);
> +        if (NS_SUCCEEDED(rv) && cVar) {
> +            mEnforceH1Framing = FRAMECHECK_STRICT;
> +        }

one line for } else {

@@ +1542,5 @@
> +            rv = prefs->GetBoolPref(HTTP_PREF("enforce-framing.soft"), &cVar);
> +            if (NS_SUCCEEDED(rv) && cVar) {
> +                mEnforceH1Framing = FRAMECHECK_BARELY;
> +            }
> +            else {

one line
Attachment #8559875 - Flags: review?(mcmanus) → review+
Attachment #8561281 - Flags: review?(mcmanus) → review+
Assignee

Updated

4 years ago
Attachment #8436705 - Attachment is obsolete: true
Assignee

Comment 308

4 years ago
Posted patch gzip-check-use-pref-too.patch (obsolete) — Splinter Review
Thanks for all the reviews Patrick!

It struck me that I didn't make the check for a clean gzip-stop use the new pref in my patch, so I wanted to amend the 0003-patch to do that. This is my attempt. I'm not entirely happy with reading out the http framing prefs from within the nsHTTPCompressConv code, but I also couldn't come up with a much better solution. What do you think? Do you have a better suggestion?

(This patch file is done on top of the 0003 one and I made it this delta only to make it easier to read and discuss.)
Attachment #8562023 - Flags: feedback?(mcmanus)
Comment on attachment 8562023 [details] [diff] [review]
gzip-check-use-pref-too.patch

Review of attachment 8562023 [details] [diff] [review]:
-----------------------------------------------------------------

can you actually just do a little random web browsing and ensure that this is main thread normally?

::: netwerk/streamconv/converters/nsHTTPCompressConv.cpp
@@ +70,5 @@
>  
>      else if (!PL_strncasecmp(aFromType, HTTP_DEFLATE_TYPE, sizeof(HTTP_DEFLATE_TYPE)-1))
>          mMode = HTTP_COMPRESS_DEFLATE;
>  
> +    mFailUncleanStops =

what do you think of?

mFailUncleanStops = false; // actually I would put this in the ctor
if (NS_IsMainThread()) {
    mFailUncleanStops = (Preferences::GetBool("network.http.enforce-framing.soft", false) ||
        Preferences::GetBool("network.http.enforce-framing.theonethatshouldbenamedhard", false));
}
Attachment #8562023 - Flags: feedback?(mcmanus) → feedback+
Assignee

Comment 310

4 years ago
(In reply to Patrick McManus [:mcmanus] from comment #309)

> what do you think of?

Looks good, and yeah putting it in the constructor makes sense.
Assignee

Comment 311

4 years ago
My patch series squashed into a single one. 73 insertions(+), 21 deletions

I obsoleted all other patches attached to this bug.

Try-run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=69bc13d9d5e3
Attachment #284191 - Attachment is obsolete: true
Attachment #8394104 - Attachment is obsolete: true
Attachment #8440096 - Attachment is obsolete: true
Attachment #8559098 - Attachment is obsolete: true
Attachment #8559100 - Attachment is obsolete: true
Attachment #8559810 - Attachment is obsolete: true
Attachment #8559875 - Attachment is obsolete: true
Attachment #8561281 - Attachment is obsolete: true
Attachment #8562023 - Attachment is obsolete: true
Attachment #8562776 - Flags: review+
Severity: normal → critical
Status: REOPENED → ASSIGNED
Keywords: dataloss

Comment 318

4 years ago
How about taking a step back and fixing the actual download manager problem, where the download manager has enough information to determine that a download has failed but throws that information away and presents a successful download to the user? Worry later about whether it's even practical to deal with all the weird edge cases that may or may not be issues for live content.
(In reply to Peter da Silva from comment #318)
>
> where the download manager actually has enough information to determine that

you misunderstand the patches. Currently the download manager does not have this information - the patches are designed to provide it.
Assignee

Updated

4 years ago
Keywords: checkin-needed

Comment 320

4 years ago
In the most common visible failure mode the download manager has enough information. It has the correct size of the download, and displays the correct size of the download during the download. The server or a proxy then terminates the download early, without crashing (eg, the server process terminates and the socket is closed normally during process cleanup, or the server crashes and the proxy closes the connection normally). The download manager then discards the original size and updates the display to show the download as having completed successfully.

This problem could be resolved _right now_ (or could have been resolved at any time in the past decade and change) without waiting for lower level fixes or breaking working web pages.
>In the most common visible failure mode the download manager has enough information.

From what I understand it's not only the failure modes, it's also the modes that work correctly now but *would break* with your proposal because the server gives the wrong length.

Comment 322

4 years ago
There are many possible user interface decisions that would avoid false positives being a problem. For example, a warning icon could be presented that would let the end user decide whether the download was successful or not, and accept it or request a retry.

Comment 323

4 years ago
In addition, the most likely false positives seen in the previous fix attempt rarely if ever apply to downloads, because downloads are rarely if ever dynamically compressed by the server, since most download formats are already statically compressed.
Target Milestone: mozilla33 → ---
https://hg.mozilla.org/mozilla-central/rev/7abb1cb9739e
Status: ASSIGNED → RESOLVED
Closed: 5 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Depends on: 1133156
backed it out for unintended regression on css transfer

https://bugzilla.mozilla.org/show_bug.cgi?id=1133156#c3
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 327

4 years ago
I don't get it.  Why can't we leave the fix in place and just have the Hallmark users set the preference?
Because most users wouldn't know where to look, and would either complain or (more likely) switch to another browser. In addition, Hallmark may simply be a popular example of a more widespread problem.
Guys... you know... it only landed on Nightly,
which is pre-alpha version for testers and should be used for testing purposes only.
So IMO backing this out was an exaggeration,
especially when you can disable it and it's not causing any crashes.

Comment 330

4 years ago
Fixing the underlying problem is hard and may not be generally solvable. That was clear the first time it had to be backed out.

Modifying the download manager to let the end user know when downloads _may_ be incomplete is straightforward and easily understood.
[1] is another example of a website that breaks with network.http.enforce-framing.soft set to true, so it's not just Hallmark. That being said, I think the patch could have stayed in with the default flipped to false.

[1] http://forum.bioware.com/topic/529838-a-collection-of-tweaks-and-fixes-for-the-pc-version/
Assignee

Comment 332

4 years ago
First I'll sob for a bit then I'll analyze these sites and see what made them end up as partial transfers, and come up with a new route forward.
(In reply to Virtual_ManPL [:Virtual] from comment #329)

> So IMO backing this out was an exaggeration,
> especially when you can disable it and it's not causing any crashes.

I tried disabling it, but it failed at least one test that way... so the disabling patch was going to have to be more than a pref change. It made more sense just to do a clean backout at that point.
Assignee

Comment 334

4 years ago
The part of the patch that checks that the nsHTTPCompressConv state machine ends cleanly, or otherwise reports an error, is clearly too trigger-happy. Removing that little piece seems to be what I need to do to have the sites that reportedly broke show up nicely again...