Last Comment Bug 789932 - nsExternalAppHandler downloads files on the main thread
: nsExternalAppHandler downloads files on the main thread
Status: VERIFIED FIXED
[Snappy:P1]
: dev-doc-needed, main-thread-io
Product: Core Graveyard
Classification: Graveyard
Component: File Handling (show other bugs)
: 16 Branch
: All All
: -- normal with 6 votes (vote)
: mozilla20
Assigned To: :Paolo Amadini
:
Mentors:
: 599991 812150 (view as bug list)
Depends on: 391975
Blocks: 551427 803570 817381
  Show dependency treegraph
 
Reported: 2012-09-10 07:34 PDT by Vladan Djeric (:vladan)
Modified: 2016-06-22 12:16 PDT (History)
49 users (show)
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
Skeleton (26.50 KB, patch)
2012-09-20 10:30 PDT, :Paolo Amadini
no flags Details | Diff | Splinter Review
Minimal implementation on top of the skeleton (16.78 KB, patch)
2012-10-04 14:42 PDT, :Paolo Amadini
no flags Details | Diff | Splinter Review
Interface for a component that saves downloaded data in the background (6.99 KB, patch)
2012-10-11 08:32 PDT, :Paolo Amadini
no flags Details | Diff | Splinter Review
Skeleton of a component that saves downloaded data in the background (28.32 KB, patch)
2012-10-13 10:32 PDT, :Paolo Amadini
no flags Details | Diff | Splinter Review
Patch that doesn't work (40.66 KB, patch)
2012-10-23 10:23 PDT, :Paolo Amadini
cbiesinger: feedback+
Details | Diff | Splinter Review
Patch that works (72.01 KB, patch)
2012-11-03 10:06 PDT, :Paolo Amadini
cbiesinger: review+
dteller: feedback+
Details | Diff | Splinter Review
Fixed patch (81.21 KB, patch)
2012-11-29 15:08 PST, :Paolo Amadini
dteller: feedback+
Details | Diff | Splinter Review
Revised patch (87.51 KB, patch)
2012-12-08 11:31 PST, :Paolo Amadini
no flags Details | Diff | Splinter Review
Final patch (89.35 KB, patch)
2012-12-10 11:31 PST, :Paolo Amadini
no flags Details | Diff | Splinter Review

Description Vladan Djeric (:vladan) 2012-09-10 07:34:37 PDT
We noticed a lot of main-thread hangs
Comment 1 Vladan Djeric (:vladan) 2012-09-10 07:35:59 PDT
(In reply to Vladan Djeric (:vladan) from comment #0)
> We noticed a lot of main-thread hangs

We noticed a lot of main-thread hangs in the Telemetry reports with the following stack. We believe it has to do with files being downloaded on the main thread when their filetype is handled by an external app.

FileSeek64 (in nspr4.dll)
 -> FileWrite (in nspr4.dll)
 -> nsFileStreamBase::Write(char const *,unsigned int,unsigned int *) (in xul.dll)
 -> nsFileOutputStream::Write(char const *,unsigned int,unsigned int *) (in xul.dll)
 -> nsBufferedOutputStream::Flush() (in xul.dll)
 -> nsBufferedOutputStream::Write(char const *,unsigned int,unsigned int *) (in xul.dll)
 -> nsExternalAppHandler::OnDataAvailable(nsIRequest *,nsISupports *,nsIInputStream *,unsigned int,unsigned int) (in xul.dll)
 -> nsDocumentOpenInfo::OnDataAvailable(nsIRequest *,nsISupports *,nsIInputStream *,unsigned int,unsigned int) (in xul.dll)
 -> mozilla::net::nsHttpChannel::OnDataAvailable(nsIRequest *,nsISupports *,nsIInputStream *,unsigned int,unsigned int) (in xul.dll)
 -> nsInputStreamPump::OnStateTransfer() (in xul.dll)
Comment 2 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-09-10 07:42:54 PDT
I think this is hit for basically all of our downloads.
Comment 3 Christian :Biesinger (don't email me, ping me on IRC) 2012-09-10 16:16:08 PDT
(not for "save target as", fwiw)
Comment 4 :Ehsan Akhgari 2012-09-18 20:13:16 PDT
Paolo, I think the easiest fix for this is to use NS_AsyncCopy to copy the input stream into the output stream asynchronously and then do the work at the end of OnDataAvailable in the callback depending on whether the copy was successful or not.
Comment 5 :Paolo Amadini 2012-09-20 10:30:54 PDT
Created attachment 663063 [details] [diff] [review]
Skeleton

(In reply to Ehsan Akhgari [:ehsan] from comment #4)
> Paolo, I think the easiest fix for this is to use NS_AsyncCopy to copy the
> input stream into the output stream asynchronously and then do the work at
> the end of OnDataAvailable in the callback depending on whether the copy was
> successful or not.

I thought this, too, when first looking at the code. But it turns out there
are at least two showstoppers here:

* NS_AsyncCopy writes without blocking only when the output stream supports
  nsIAsyncOutputStream, and native file streams don't, as they are main thread
  only.  They can be turned into asynchronous streams using the stream
  transport service, but this comes at the cost of some additional state
  management, especially when combined with the next bullet point.

* At some point, the file in the temporary directory may be moved near to its
  final location, when the user selects the final file destination. This is
  currently done by closing the current stream, moving the file, and reopening
  a new stream to the copied file. This is done on the main thread between
  calls to OnDataAvailable, for which the change is transparent. Obviously we
  should also move the file copy operation to another thread, and during that
  time we might still receive OnDataAvailable notifications, whose input
  stream we are _required_ to consume before returning, by interface contract.

So I started a separate C++ class (attached) to handle saving data on a
background thread. The code should be looked into by someone that could help
me with C++ style, conventions and design, as I'm not very familiar with them.
If you know someone that can help me, please redirect the review.

Currently, the code does nothing else than managing worker thread state. Our
C++ code is really verbose! I also thought about writing a JavaScript XPCOM
component for that, but it turns out that our JS workers aren't mature enough,
there are serialization overheads or at least overheads for copying large
amount of data to an ArrayBuffer. In the end, I went for direct memory buffer
marshaling across threads in a C++ component.

If you have any better idea that fits into the current architecture, feel free
to share!
Comment 6 Christian :Biesinger (don't email me, ping me on IRC) 2012-09-20 13:30:14 PDT
It may be easier if you created a pipe, did an NS_AsyncCopy from the pipe to the file stream (only one of the streams needs to be async), and then just wrote into the pipe from ODA.
Comment 7 :Ehsan Akhgari 2012-09-20 16:18:08 PDT
(In reply to comment #6)
> It may be easier if you created a pipe, did an NS_AsyncCopy from the pipe to
> the file stream (only one of the streams needs to be async), and then just
> wrote into the pipe from ODA.

Yeah that makes sense.  Let me know if you still want me to review this patch (I tried to do that today but it was too big and I didn't get enough time...)
Comment 8 :Paolo Amadini 2012-09-22 08:31:10 PDT
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #6)
> It may be easier if you created a pipe, did an NS_AsyncCopy from the pipe to
> the file stream (only one of the streams needs to be async), and then just
> wrote into the pipe from ODA.

I'm still looking at the how the streams code works and trying to understand if
the resulting state management would be actually easier here if I used a pipe
to communicate with a background thread, or even to off-load work using an async
file output stream created by the stream transport service. I would not want to
end up with more callbacks and state variables than with the current solution.

Sorry for the long post here, but documentation and examples are somewhat
lacking in this code area, and I prefer to make an informed decision before
starting to write something that could be more complex in the end.

As per comment 5, we need to close the file output stream at some point, copy
the file contents, and reopen the output stream. This copy operation should also
not block the main thread, thus a direct operating system copy call could only
be done in a dedicated background thread, or the file could be copied manually
using an output stream created by the stream transport service.

To detach temporarily the "input stream" of the pipe (the one I would be
attaching to the file output stream using NS_AsyncCopy) I've seen a method
called NS_CancelAsyncCopy, but it's not clear to me whether by calling that
method I could be losing some data on the input or output streams passed
to NS_AsyncCopy. If I'm not losing data, I would then need to wait for the
output stream to be completely written and the file closed (I presume, using
the callback passed to NS_AsynCopy), and distinguish this manual cancellation
condition from a real error occurred when writing the output stream. If the
output stream is handled through an nsITransport, I would also need to close
the transport and still be sure that this operation doesn't lose any data,
nor try to cancel or close any of the involved streams abruptly.

Then I would need to reopen a new file stream (and a newly associated
nsITransport, if I'm working on the main thread) and copy the data from the
previous temporary file to the new one, before reattaching the new output stream
to the original pipe. If I'm working on a background thread I need do the same
thing, though using a direct file copy and without the nsITransport complexity,
but with the same background thread handling code of the current patch.

All of this, while still checking for cancellation conditions on the original
external app handler object, and input or output errors.

Also, from OnDataAvailable I can only directly copy to a stream that has
blocking semantics but never blocks. This is because on one hand we are
required to consume all data from the input stream before returning, on
the other hand we should never block the main thread anymore.

So, even if I used a pipe, I would need one with an infinite buffer, and if we
start getting too much data, I should Suspend() the source request at some
point. Thus I would need, in any case, to keep track of how many bytes are in
the pipe's buffer on my own. Or is there a property I could use on some of the
streams, or another technique I missed?

A final note, since the stream transport service has a pool of four threads,
if we use it to off-load the actual file I/O to a background thread instead
of spawning our own thread, we should probably increase the pool size by one
for any file we are currently downloading, lest we block network I/O when
writing to slow media like USB sticks. And increasing the pool's size is not
a perfect solution either, since some network I/O might end up on the same
thread as a file operation regardless. That's why I'm more inclined to having
our own background thread for input/output in any case.
Comment 9 Christian :Biesinger (don't email me, ping me on IRC) 2012-09-24 15:25:20 PDT
(In reply to Paolo Amadini [:paolo] from comment #8)
> As per comment 5, we need to close the file output stream at some point, copy
> the file contents, and reopen the output stream. This copy operation should
> also
> not block the main thread, thus a direct operating system copy call could
> only
> be done in a dedicated background thread, or the file could be copied
> manually
> using an output stream created by the stream transport service.

I'll note that you could solve that problem by waiting until the download is done and only then moving the file. Could call nsIFile.moveTo from the background thread for that.

> To detach temporarily the "input stream" of the pipe (the one I would be
> attaching to the file output stream using NS_AsyncCopy) I've seen a method
> called NS_CancelAsyncCopy, but it's not clear to me whether by calling that
> method I could be losing some data on the input or output streams passed
> to NS_AsyncCopy. 

No, cancelling doesn't lose data (this is why it needs async/buffered streams). It does close the streams if that was requested.

> If I'm not losing data, I would then need to wait for the
> output stream to be completely written and the file closed (I presume, using
> the callback passed to NS_AsynCopy),

yes, using the callback

> and distinguish this manual cancellation
> condition from a real error occurred when writing the output stream.

that's easy since you can just pass an error code to NS_CancelAsyncCopy that wouldn't happen on its own

> If the
> output stream is handled through an nsITransport, I would also need to close
> the transport and still be sure that this operation doesn't lose any data,
> nor try to cancel or close any of the involved streams abruptly.

Closing a stream doesn't lose data, but maybe I misunderstood the question.

As a sidenote, using the stream transport service to make a stream async uses a pipe+NS_AsyncCopy under the hood.

As another (important) sidenote, I don't think necko guarantees that the stream will be useful after onDataAvailable returns, so if you just do NS_AsyncCopy(input_stream, some_output_stream); that may not work well. 

sidenote #3, you are not supposed to read more than aCount bytes in onDataAvailable; therefore, you can't use NS_AsyncCopy from the stream

> Then I would need to reopen a new file stream (and a newly associated
> nsITransport, if I'm working on the main thread) and copy the data from the
> previous temporary file to the new one, before reattaching the new output
> stream
> to the original pipe. If I'm working on a background thread I need do the
> same
> thing, though using a direct file copy and without the nsITransport
> complexity,
> but with the same background thread handling code of the current patch.

yes, though see my suggestion above

> Also, from OnDataAvailable I can only directly copy to a stream that has
> blocking semantics but never blocks.

(or to a nonblocking stream that never returns NS_ERROR_WOULD_BLOCK)

> This is because on one hand we are
> required to consume all data from the input stream before returning, on
> the other hand we should never block the main thread anymore.
> 
> So, even if I used a pipe, I would need one with an infinite buffer, and if
> we
> start getting too much data, I should Suspend() the source request at some
> point. Thus I would need, in any case, to keep track of how many bytes are in
> the pipe's buffer on my own. Or is there a property I could use on some of
> the
> streams, or another technique I missed?

That's essentially true, if you want to make sure to limit the memory usage, yes. (But that's an issue independent of what technique you use, right?)

> A final note, since the stream transport service has a pool of four threads,
> if we use it to off-load the actual file I/O to a background thread instead
> of spawning our own thread, we should probably increase the pool size by one
> for any file we are currently downloading, lest we block network I/O when
> writing to slow media like USB sticks.

Network I/O doesn't generally use the stream transport service, so that's not an issue.

> And increasing the pool's size is not
> a perfect solution either, since some network I/O might end up on the same
> thread as a file operation regardless. That's why I'm more inclined to having
> our own background thread for input/output in any case.

see above
Comment 10 Mike Hommey [:glandium] 2012-09-24 23:23:07 PDT
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #9)
> (In reply to Paolo Amadini [:paolo] from comment #8)
> > As per comment 5, we need to close the file output stream at some point, copy
> > the file contents, and reopen the output stream. This copy operation should
> > also
> > not block the main thread, thus a direct operating system copy call could
> > only
> > be done in a dedicated background thread, or the file could be copied
> > manually
> > using an output stream created by the stream transport service.
> 
> I'll note that you could solve that problem by waiting until the download is
> done and only then moving the file. Could call nsIFile.moveTo from the
> background thread for that.

Moving the file may require actually copying it if it is on a different file system (not so much of a stretch, the user may have chosen a separate drive), and if the file is big enough, that is going to be a noticeable delay at the end of the download, while copying what's already there at the beginning is going to be unnoticed.
Comment 11 :Ehsan Akhgari 2012-09-26 11:15:00 PDT
Comment on attachment 663063 [details] [diff] [review]
Skeleton

Clearing the request for now.
Comment 12 Christian :Biesinger (don't email me, ping me on IRC) 2012-09-26 15:10:54 PDT
(In reply to Mike Hommey [:glandium] from comment #10)
> Moving the file may require actually copying it if it is on a different file
> system (not so much of a stretch, the user may have chosen a separate
> drive), and if the file is big enough, that is going to be a noticeable
> delay at the end of the download, while copying what's already there at the
> beginning is going to be unnoticed.

Yes, that is a downside of my suggestion. Not as big a problem if we do it on a background thread, but not ideal.
Comment 13 :Paolo Amadini 2012-10-04 14:42:02 PDT
Created attachment 668190 [details] [diff] [review]
Minimal implementation on top of the skeleton

Thanks for the input! So it's safe to use a pipe and a cancelable NS_AsyncCopy
here, and I could expose the pipe's output stream to the consumers.

The advantage is that PartFileSaver would expose a standard interface to write
the data to the file (nsIOutputStream). The disadvantage is that basically
nsIOutputStream is more complex than we need right now (the Write method
doesn't necessarily accept all the data at once, we still need the loop in
OnDataAvailable), and we need more state management in the worker thread
because NS_AsyncCopy is canceled asynchronously and we may process more
events while we are waiting for the file stream to be closed.

Would you mind looking at the current code and see if the above is correct?
I'll probably be able to look into implementing the pipe next week.

Note that the memory limit with Suspend() is still missing (it's independent
from the rest) and there's still another file move on the main thread (as well
as some createUnique and similar) that we can easily handle on top of this
bug, in follow-ups.
Comment 14 :Paolo Amadini 2012-10-04 14:49:55 PDT
Comment on attachment 663063 [details] [diff] [review]
Skeleton

Ehsan, since it looks like we'll need the worker thread in any case to handle
file moves and avoid thread pool complexities, you'd do me a favor if you could
verify if this is handled correctly and the C++ looks good, or if you can
indicate me someone that can provide me some initial feedback.

I'm also thinking to turn this into an XPCOM component, so I could write some
tests for it in JavaScript. Tests would require some time to be written, but
it's probably better to have them, since this part of the code doesn't have
many already. What do you think?
Comment 15 :Ehsan Akhgari 2012-10-05 11:21:45 PDT
So I'm missing something here, I think.  Did we not decide to go with the suggestion in comment 6?  I really don't like the fact that we are creating a worker thread here, or that we're sort of reinventing what NS_AsyncCopy gives us...
Comment 16 :Paolo Amadini 2012-10-09 10:55:30 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #15)
> So I'm missing something here, I think.  Did we not decide to go with the
> suggestion in comment 6?  I really don't like the fact that we are creating
> a worker thread here, or that we're sort of reinventing what NS_AsyncCopy
> gives us...

NS_AsyncCopy needs a worker thread to handle the actual file write operation.

We also need a worker thread to handle the file move or copy operation.

The stream transport service provides a pipe and NS_AsyncCopy working on its own
background threads, but I found out that closing an output transport actually
breaks the "output stream" (aka input end) of the pipe so we simply can't
redirect the transport to a different file or suspend it temporarily.

http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsStreamTransportService.cpp#329
Comment 17 :Paolo Amadini 2012-10-09 11:08:16 PDT
While working on top of the current patch, I also realized that NS_AsyncCopy
continues to copy data uninterruptedly while data is available. With fast
network connections or slow file systems, this might actually prevent the
worker thread from processing any event.

This means I will need to call NS_CancelAsyncCopy directly from the main thread
when needed (I cannot be sure that an event posted on the worker thread is
actually handled while NS_AsyncCopy is in progress).

Instead, I still need to call NS_AsyncCopy after I've opened the output stream
on the worker thread (opening the file stream is a synchronous operation, this
is why I should avoid doing that on the main thread).

I'll probably need to implement shared state between the main and worker threads,
so the new design is probably going to be a bit different from the current patch.
Comment 18 :Paolo Amadini 2012-10-09 11:16:20 PDT
I'm also thinking that if we choose to expose nsIOutputStream to the consumers,
we could go for a standalone XPCOM component (implemented in netwerk?) so we can
reuse it for other download types too (for example resumed downloads, that don't
use this code path at all).
Comment 19 :Ehsan Akhgari 2012-10-09 11:24:17 PDT
Comment on attachment 663063 [details] [diff] [review]
Skeleton

Sorry for the delay here Paolo.  Thinking about this a bit more, I don't think I'm the right person to review this code...  I'd really want someone who knows more about this stuff to look at this, perhaps biesi, bz, or bsmedberg?
Comment 20 :Paolo Amadini 2012-10-09 11:33:31 PDT
Ok, I'll wait for Christian to comment on the general approach here before going
for the C++ reviews, thanks anyways!
Comment 21 :Paolo Amadini 2012-10-11 08:32:45 PDT
Created attachment 670403 [details] [diff] [review]
Interface for a component that saves downloaded data in the background

I'm now investigating the route of designing a component that handles our use
case but can also be used elsewhere (as well as tested from JavaScript).

I've created an nsIBackgroundFileSaver interface, designed to be decoupled
from the data feeding method. It could be used either with nsIStreamListener
(our case) or nsIAsyncOutputStream (for future uses where we aren't forced by
an input stream pump to consume all data immediately).

We might derive the implementation from the same class and then use different
contract IDs based on the component ("background-file-saver-stream-listener"
and "background-file-saver-output-stream", for example), or write a component
that implements all the interfaces the same time.

About naming, I've called this FileSaver rather than Downloader because it
handles only the local file part, does not take the URI that is the source of
the download. Also, this is not related to download management at all.

I've chosen Background rather than Async because this is specifically designed
for cases where we use a background thread (other Async components actually are
asynchronous but are often used on the main thread only).

I've called the feedback interface Observer rather than Callback since I've
seen Callback objects only used as function parameters, and a similar use of
the word Observer around (nsIStreamLoaderObserver for example). I've also seen
that the observer methods generally pass a reference to the calling object,
though we'll not use it probably.

In the future, we may also add the createUnique logic to this component (hence
the callback that tells the caller what the actual file name is). Meanwhile
I think we'll only use this to distinguish and report the real file name on
failure, when we might have already requested to change the file name in the
background. We may also just do without this notification for now.

Finally, while working on this I found two related interfaces around
(nsIDownloader and nsIIncrementalDownload) that, while named after downloads,
provide orthogonal functionality and I think aren't suitable for our use case.
Comment 22 :Paolo Amadini 2012-10-13 10:32:34 PDT
Created attachment 671111 [details] [diff] [review]
Skeleton of a component that saves downloaded data in the background

Now with XPCOM structure and fake tests!
Comment 23 Josh Aas 2012-10-16 10:48:31 PDT
Honza, Patrick, Jason - Can you guys offer advice here? Feedback requested from Christian, but we could use more.
Comment 24 :Paolo Amadini 2012-10-23 10:23:08 PDT
Created attachment 674271 [details] [diff] [review]
Patch that doesn't work

Josh, do you know someone who could help me with this code? I've started the
implementation, but I'm not sure I'm moving in the right direction. I'm trying
to achieve code that is the simplest that can cover the use cases, and it still
looks too complex to me. I'm currently struggling with state management (there
is surely something wrong in this patch since it doesn't pass its own test).
Comment 25 Jan Horak 2012-11-01 02:59:26 PDT
I've peeked into "Patch that doesn't work" and I see nsIFile as mTarget. I would recommend using nsIURI instead of nsIFile. I'm working on bug 682838 where I have to replace a lot of nsIFile with nsIURI to be able to support local file and remote location, see:
https://bugzilla.mozilla.org/attachment.cgi?id=642614&action=diff#a/toolkit/components/downloads/nsDownloadManager.cpp_sec4
I would like to avoid this kind of constructions.

By using GIO files can also be stored on network, like sftp:// or smb:// and these saving operations are even more required to run off the main thread and I would like to use this interface to do so.
Comment 26 Christian :Biesinger (don't email me, ping me on IRC) 2012-11-02 11:57:08 PDT
Comment on attachment 674271 [details] [diff] [review]
Patch that doesn't work

Discussed this on IRC with Paolo. I'll mark this feedback+ because I like the approach, though there's a few things to address.
Comment 27 :Paolo Amadini 2012-11-03 10:06:02 PDT
Created attachment 678028 [details] [diff] [review]
Patch that works

This patch will probably need minor adjustments, in particular some more tests,
but in general it is ready. Christian, do you think you can review it?
Comment 28 :Paolo Amadini 2012-11-03 10:16:30 PDT
(In reply to jhorak from comment #25)
> I would recommend using nsIURI instead of nsIFile. I'm working on bug 682838
> where I have to replace a lot of nsIFile with nsIURI to be able to support
> local file and remote location, see:

Thanks for pointing that out. Both this component and its primary consumer (in
the exthandler folder) are designed internally to use nsIFile features, so for
the moment I think using nsIURI would just add some complexity for wrapping and
unwrapping, that for now we can avoid.

Since we don't freeze interfaces anymore, we'll be able to update the interfaces
as needed to keep the code simple, if we decide to support nsIURI in the future
(that, for temporary files, are unsupported also with the patch in bug 682838).
Comment 29 Emanuel Hoogeveen [:ehoogeveen] 2012-11-08 07:22:31 PST
Is this patch expected to solve the behavior seen in bug 444467 as well? (which this bug blocks through bug 551427)
Comment 30 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-11-08 08:58:39 PST
Probably, given bug 444467 comment 30.
Comment 31 Christian :Biesinger (don't email me, ping me on IRC) 2012-11-13 11:00:06 PST
hey, sorry for being late with the review, but I have/had some work deadlines. those will be over tonight, so I should be able to deal with this tomorrow or the day after.
Comment 32 Ted Mielczarek [:ted.mielczarek] 2012-11-15 07:17:20 PST
*** Bug 812150 has been marked as a duplicate of this bug. ***
Comment 33 Ted Mielczarek [:ted.mielczarek] 2012-11-15 07:18:51 PST
Profile from my dupe:
http://people.mozilla.com/~bgirard/cleopatra/#report=ff45146a5759ef28274624b7ccd6e38fcb644236

I clicked a download link to a large (1.5GB) video file, got the "Open With" dialog, selected "Save", and then my browser was incredibly unresponsive while the file downloaded (saving to an SMB network share on my LAN).
Comment 34 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-11-26 13:28:44 PST
Comment on attachment 678028 [details] [diff] [review]
Patch that works

I found some victims to provide feedback on this one!
Comment 35 David Rajchenbach-Teller [:Yoric] (please use "needinfo") 2012-11-27 07:55:37 PST
Comment on attachment 678028 [details] [diff] [review]
Patch that works

Review of attachment 678028 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good. The synchronization between threads, however, is complex enough that I cannot be sure that I have understood every subtlety, so do not hesitate to add further tests and further documentation.

Also, let me suggest cutting the patch in three, to make the life of reviewers easier:
1. Stream Utils
2. Background file saver, nsNet*, xpcshell tests
3. External helper

::: netwerk/base/src/BackgroundFileSaver.cpp
@@ +72,5 @@
> +, mAssignedTarget(nullptr)
> +, mAssignedTargetKeepPartial(false)
> +, mAsyncCopyContext(nullptr)
> +, mActualTarget(nullptr)
> +, mActualTargetKeepPartial(false)

Fields mControlThread, mWorkerThread, mPipeInputStream and mPipeOutputStream are left uninitialized. I believe that this is on purpose, as they are initialized in Init(), but that might deserve some comments.

@@ +93,5 @@
> +
> +  rv = ::NS_GetCurrentThread(getter_AddRefs(mControlThread));
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  rv = ::NS_NewThread(getter_AddRefs(mWorkerThread));

Do I understand correctly that we are starting a new thread for every single download? I seem to remember that the hardware on most systems will not allow concurrent access to devices, so isn't this overkill/counterproductive?

@@ +126,5 @@
> +    aTarget->Clone(getter_AddRefs(mAssignedTarget));
> +    mAssignedTargetKeepPartial = aKeepPartial;
> +  }
> +
> +  return GetWorkerThreadAttention(true);

It took me some time to figure out through which code path the copy is resumed after |GetWorkerThreadAttention(true)| stops it. So, I believe that future readers of this code would appreciate more documentation.

@@ +155,5 @@
> +  return GetWorkerThreadAttention(NS_FAILED(aStatus));
> +}
> +
> +nsresult
> +BackgroundFileSaver::GetWorkerThreadAttention(bool aShouldInterruptCopy)

In the header, the argument is named |aDontInterrupt|, which I assume is wrong. Could you please harmonize both versions?

@@ +207,5 @@
> +  }
> +
> +  (void)self->ProcessAttention();
> +
> +  NS_RELEASE(self);

I believe that this release deserves a little documentation, in particular since it is so distant from the NS_ADDREF.

@@ +217,5 @@
> +  nsresult rv;
> +
> +  // It may happen that a change event was posted before we started copying
> +  // data. In this case, interrupt the copy so that we are able to check if we
> +  // were requested, for instance, to rename the target file.

I have puzzled some time on this comment and the block below.
Barring any race condition, |mAsyncCopyContext| is |nullptr| if |ProcessAttention| has been called by |AsyncCopyCallback|, and also |nullptr| if we have been called directly by dispatching an event. So the only way we can reach this point is if |ProcessAttention| has been called by dispatching an event, but we have called |NS_AsyncCopy| before the event is actually dispatched. Am I correct? Regardless, additional comments could be useful here.

@@ +265,5 @@
> +    // From now on, another attention event needs to be posted if state changes.
> +    mWorkerThreadAttentionRequested = false;
> +  }
> +
> +  // The target can only be null if it has never been assigned.

// (in which case mAssignedTargetKeepPartial has never been assigned either)
Am I right?

@@ +278,5 @@
> +
> +    // Verify whether we have actually been instructed to use a different file.
> +    bool equalToCurrent;
> +    rv = mActualTarget->Equals(target, &equalToCurrent);
> +    NS_ENSURE_SUCCESS(rv, rv);

I seem to remember that NS_ENSURE_SUCCESS is deprecated.

@@ +280,5 @@
> +    bool equalToCurrent;
> +    rv = mActualTarget->Equals(target, &equalToCurrent);
> +    NS_ENSURE_SUCCESS(rv, rv);
> +    if (!equalToCurrent)
> +    {

I realize that all this sync I/O takes place on the worker thread, but I would be reassured by a comment mentioning it.

@@ +348,5 @@
> +                        GetProgressCallback());
> +    if (NS_FAILED(rv)) {
> +      NS_WARNING("NS_AsyncCopy failed.");
> +      mAsyncCopyContext = nullptr;
> +      NS_RELEASE_THIS();

Do I understand correctly that the other release matching this |NS_ADDREF_THIS()| is the |NS_RELEASE(self)| of |AsyncCopyCallback|?

@@ +372,5 @@
> +    if (mComplete) {
> +      return true;
> +    }
> +
> +    // If an error occurred, the operation can be completed immediately.

That comment doesn't seem to fit the companion block.

@@ +609,5 @@
> +  // Read the requested data.  Since the pipe has an infinite buffer, we don't
> +  // expect any write error to occur here.
> +  uint32_t writeCount;
> +  rv = mPipeOutputStream->WriteFrom(aInputStream, aCount, &writeCount);
> +  NS_ENSURE_SUCCESS(rv, rv);

As above, I believe that NS_ENSURE_SUCCESS is deprecated.

::: netwerk/base/src/BackgroundFileSaver.h
@@ +37,5 @@
> +
> +  /**
> +   * Initializes the pipe and the worker thread on XPCOM construction.
> +   */
> +  nsresult Init();

Could you document somewhere the fact that this Init() method and the one below are called by the factory constructor?

@@ +45,5 @@
> +
> +  /**
> +   * Stream that receives data from derived classes.
> +   */
> +  nsCOMPtr<nsIAsyncOutputStream> mPipeOutputStream;

I realize it is probably obvious from the name, but perhaps you could mention explicitly that mPipeInputStream is connected to mPipeOutputStream.

@@ +119,5 @@
> +   */
> +  nsresult mStatus;
> +
> +  /**
> +   * New target file, or null if a target was not assigned yet.

Nit: I believe that "New" is a little confusing.

@@ +151,5 @@
> +  bool mActualTargetKeepPartial;
> +
> +  //////////////////////////////////////////////////////////////////////////////
> +  //// Private methods
> +

I like how you documented separately worker state, controller state and shared state. Could you do the same for methods and functions?

@@ +160,5 @@
> +
> +  /**
> +   * Called from a control thread after state changes, to ensure that the worker
> +   * thread will process the state change appropriately.  This cancels the
> +   * current NS_AsyncCopy, if any.

Could you please document aDontInterrupt?

@@ +170,5 @@
> +   */
> +  nsresult ProcessAttention();
> +
> +  /**
> +   * Called by ProcessAttention, failure causes the operation to be aborted.

That's not very clear.

::: netwerk/test/unit/test_backgroundfilesaver.js
@@ +27,5 @@
> +      "nsIBackgroundFileSaver");
> +
> +const BackgroundFileSaverStreamListener = Components.Constructor(
> +      "@mozilla.org/network/background-file-saver;1?mode=streamlistener",
> +      "nsIBackgroundFileSaver");

I have the impression that you never test this, do you?

@@ +34,5 @@
> +      "@mozilla.org/io/string-input-stream;1",
> +      "nsIStringInputStream",
> +      "setData");
> +
> +const REQUEST_SUSPEND_AT = 1024 * 1024 * 4

Nit: ;

@@ +76,5 @@
> +function run_test()
> +{
> +  run_next_test();
> +}
> +

Could you please add a description for each test?
Also, I believe that it might be interesting to add tests for the following behaviors:
- write some bytes, rename, then write some more bytes;
- rename several times.

It might be interesting to combine these tests with the various behaviors for the final file:
- cancel, leave the file as is;
- cancel, remove the file;
- finish correctly.

::: toolkit/components/osfile/tests/mochi/main_test_osfile_async.js
@@ +66,5 @@
>        });
>        return result;
>      } catch (x) {
>        utils.fail("Error " + x + " at " + x.stack);
> +      return null;Finish

Typo?

::: uriloader/exthandler/nsExternalHelperAppService.cpp
@@ +1373,2 @@
>  
> +/* TODO: This looks broken when final name is chosen.  Is this still supported?

Reminder: Don't forget to solve this TODO before checkin.

@@ +1761,3 @@
>      mProgress += count;
> +
> +    // TODO: Can we still detect the case?

I believe that the nsresult of |inStr->GetAvailable(...)| may be used for this purpose.
Comment 36 Christian :Biesinger (don't email me, ping me on IRC) 2012-11-28 16:04:38 PST
Comment on attachment 678028 [details] [diff] [review]
Patch that works

Review of attachment 678028 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/src/BackgroundFileSaver.cpp
@@ +42,5 @@
> +{
> +public:
> +  NotifyTargetChangeRunnable(
> +    BackgroundFileSaver *aSaver
> +  , nsIFile *aTarget

I think this would be easier to read if the parameter list was all on a single line. the initializer list is fine on separate lines.

@@ +85,5 @@
> +BackgroundFileSaver::Init()
> +{
> +  nsresult rv;
> +
> +  rv = ::NS_NewPipe2(getter_AddRefs(mPipeInputStream),

the :: prefix for global functions is not common in mozilla. just call NS_NewPipe2 etc. directly

@@ +401,5 @@
> +  if (failed && mActualTarget && !mActualTargetKeepPartial) {
> +    (void)mActualTarget->Remove(false);
> +  }
> +
> +  // Post an event to notify that the operation completed.

Doesn't this function get called on the background thread?

Come to think of it, can you add a comment above each function in this file mentioning which thread they get called on?

@@ +626,5 @@
> +    if (NS_SUCCEEDED(rv) && available > REQUEST_SUSPEND_AT) {
> +      mRequest = aRequest;
> +      mRequestSuspended = true;
> +      if (NS_FAILED(mRequest->Suspend())) {
> +        NS_WARNING("Unable to suspend the request.");

you should also set mRequestSuspended=false here, right?

::: uriloader/exthandler/nsExternalHelperAppService.cpp
@@ +1125,5 @@
>    
>    // Make sure extension is correct.
>    EnsureSuggestedFileName();
>  
>    mBufferSize = Preferences::GetUint("network.buffer.cache.size", 4096);

Is this still used by anything?

@@ +1373,2 @@
>  
> +/* TODO: This looks broken when final name is chosen.  Is this still supported?

huh, good question. are our mac builds 32-bit or 64-bit? clearly this code will only do anything on 32-bit...

@@ +1761,3 @@
>      mProgress += count;
> +
> +    // TODO: Can we still detect the case?

yeah - OnDataAvailable will return an error if the read failed. But all errors are write errors now as far as this function is concerned, so you probably don't need the readError variable anymore.

write errors would get reported asynchronously via the callbacks.

::: xpcom/io/nsStreamUtils.cpp
@@ +251,5 @@
>          mClosure = closure;
>          mChunkSize = chunksize;
>          mCloseSource = closeSource;
>          mCloseSink = closeSink;
> +        mProgressCallback = progressCallback;

You also have to add this to the initializer list in the constructor, around line 219
Comment 37 Christian :Biesinger (don't email me, ping me on IRC) 2012-11-28 16:06:20 PST
Comment on attachment 678028 [details] [diff] [review]
Patch that works

sorry for overwriting yoric's feedback+ :/
Comment 38 Christian :Biesinger (don't email me, ping me on IRC) 2012-11-28 16:26:44 PST
Josh (Aas), any thoughts on the TODO comment about the MACOS code in https://bugzilla.mozilla.org/attachment.cgi?id=678028&action=diff#a/uriloader/exthandler/nsExternalHelperAppService.cpp_sec4 ?
Comment 39 Josh Aas 2012-11-29 03:33:40 PST
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #38)
> Josh (Aas), any thoughts on the TODO comment about the MACOS code in
> https://bugzilla.mozilla.org/attachment.cgi?id=678028&action=diff#a/
> uriloader/exthandler/nsExternalHelperAppService.cpp_sec4 ?

I don't remember much about the history of that, though I used to. We don't use it any more in the vast majority of cases because we don't ship i386 by default on any platform, and we don't support OS X 10.5 any more. It can probably be removed.
Comment 40 Christian :Biesinger (don't email me, ping me on IRC) 2012-11-29 09:36:32 PST
>Fields mControlThread, mWorkerThread, mPipeInputStream and mPipeOutputStream >are left uninitialized. I believe that this is on purpose, as they are >initialized in Init(), but that might deserve some comments

Good point... comptrs are initialized to null automatically, so there's no need to explicitly initialize them.

>Do I understand correctly that we are starting a new thread for every single >download? I seem to remember that the hardware on most systems will not allow >concurrent access to devices, so isn't this overkill/counterproductive?

I think starting a new thread for each download is fine/better, because:
- If one of your downloads is to a really slow destination (network drive, maybe), then it is beneficial for other downloads to be unaffected
- The OS is in a better place to multiplex the concurrent writes for multiple files
- In the normal case, there are zero or one active downloads, so this makes no difference
Comment 41 :Paolo Amadini 2012-11-29 14:58:55 PST
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #40)
> I think starting a new thread for each download is fine/better, because:
> - If one of your downloads is to a really slow destination (network drive,
> maybe), then it is beneficial for other downloads to be unaffected

Yeah, this is the main reason why we do one thread per download.

> - In the normal case, there are zero or one active downloads, so this makes
> no difference

The only problematic case would be a resource consumption one if an add-on
launches thousands of downloads at the same time, but we should probably not
design for this case in advance, and we already use a lot of memory for the
data buffers.
Comment 42 :Paolo Amadini 2012-11-29 15:08:54 PST
Created attachment 686818 [details] [diff] [review]
Fixed patch

I've written a complete test suite and caught a nasty runtime "QueryInterface
needed" bug, as well as a minor race condition due to the fact that moveTo
surprisingly changes the nsIFile instance it's called on, making it point to
the new location (I just documented this in MDN).

I've addressed the various review comments and tried to clarify the code
comments where possible. A final round of feedback from either David or
Christian is welcome :-)
Comment 43 :Paolo Amadini 2012-11-29 15:10:53 PST
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #40)
> >Fields mControlThread, mWorkerThread, mPipeInputStream and mPipeOutputStream >are left uninitialized. I believe that this is on purpose, as they are >initialized in Init(), but that might deserve some comments
> 
> Good point... comptrs are initialized to null automatically, so there's no
> need to explicitly initialize them.

Oh, in the end I initialized them explicitly, it doesn't hurt.
Comment 44 :Irving Reid (No longer working on Firefox) 2012-11-30 10:59:17 PST
Comment on attachment 686818 [details] [diff] [review]
Fixed patch

Review of attachment 686818 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good over all. Can you describe (either here in the bug, or in code comments) the expected flow when the browser is shut down while an async download is in progress? I'm not sure if that's something we can cover in an xpcshell test, but I'd like to know we're not leaving things half done in that case.

::: netwerk/base/src/BackgroundFileSaver.cpp
@@ +37,5 @@
> +/**
> + * Runnable object used to notify the control thread that file contents will now
> + * be saved to the specified file.
> + */
> +class NotifyTargetChangeRunnable : public nsRunnable

Should this be declared MOZ_FINAL to avoid non-virtual destructor warnings?

@@ +132,5 @@
> +    aTarget->Clone(getter_AddRefs(mAssignedTarget));
> +    mAssignedTargetKeepPartial = aKeepPartial;
> +  }
> +
> +  return GetWorkerThreadAttention(true);

I'd like a comment on GetWorkerThreadAttention() calls saying what the worker thread is expected to do

@@ +143,5 @@
> +  nsresult rv;
> +
> +  // This will cause the NS_AsyncCopy operation, if it's in progress, to consume
> +  // all the data that is still in the pipe, and then finish.
> +  rv = mPipeOutputStream->Close();

This makes me nervous - we're closing the pipe on the control thread with no lock, can the worker thread be manipulating it at the same time?

@@ +171,5 @@
> +  MutexAutoLock lock(mLock);
> +
> +  if (mWorkerThreadAttentionRequested) {
> +    return NS_OK;
> +  }

If we get two calls to GetWorkerThreadAttention() on the control thread before the worker thread wakes up, and the first has aShouldInterruptCopy false and the second true, we won't interrupt the copy.

@@ +377,5 @@
> +    // We must ensure that we keep this object alive for the entire duration of
> +    // the copy, since the raw pointer will be provided as the argument of the
> +    // AsyncCopyCallback function.  We must add the reference before the call,
> +    // because NS_AsyncCopy might invoke the callback immediately, and we risk
> +    // that this object is released unnecessarily.

The comment on the definition of AsyncCopyCallback() says it is called only on the worker thread; is it possible for a control-thread call (NS_AsyncCopy() or NS_CancelAsyncCopy() in particular) to synchronously call AsyncCopyCallback()? If so, we're going to recursively lock on mLock.

::: netwerk/base/src/BackgroundFileSaver.h
@@ +219,5 @@
> +  BackgroundFileSaverOutputStream();
> +
> +protected:
> +  bool HasInfiniteBuffer();
> +  nsAsyncCopyProgressFun GetProgressCallback();

Should define these with NS_OVERRIDE to enforce that they match the abstract definition in BackgroundFileSaver (see https://developer.mozilla.org/en-US/docs/NS_OVERRIDE)

@@ +245,5 @@
> +  BackgroundFileSaverStreamListener();
> +
> +protected:
> +  bool HasInfiniteBuffer();
> +  nsAsyncCopyProgressFun GetProgressCallback();

NS_OVERRIDE
Comment 45 Christian :Biesinger (don't email me, ping me on IRC) 2012-11-30 11:39:39 PST
(In reply to Irving Reid (:irving) from comment #44)
> Looks good over all. Can you describe (either here in the bug, or in code
> comments) the expected flow when the browser is shut down while an async
> download is in progress? I'm not sure if that's something we can cover in an
> xpcshell test, but I'd like to know we're not leaving things half done in
> that case.

That's a good point. I think it may be ok because nsIChannels send an error status to their listeners in this case, but maybe we need to observe xpcom-shutdown-threads?

Come to think of it, don't you need to call shutdown() on the worker thread at some point?

> > +  rv = mPipeOutputStream->Close();
> 
> This makes me nervous - we're closing the pipe on the control thread with no
> lock, can the worker thread be manipulating it at the same time?

I'm not sure I understand the question correctly, but pipes are designed to support closing the output stream on one thread while the input stream is processed on another thread, so this ought to be fine.
Comment 46 David Rajchenbach-Teller [:Yoric] (please use "needinfo") 2012-12-02 04:36:17 PST
(In reply to Paolo Amadini [:paolo] from comment #41)
> (In reply to Christian :Biesinger (don't email me, ping me on IRC) from
> comment #40)
> > I think starting a new thread for each download is fine/better, because:
> > - If one of your downloads is to a really slow destination (network drive,
> > maybe), then it is beneficial for other downloads to be unaffected
> 
> Yeah, this is the main reason why we do one thread per download.
> 
> > - In the normal case, there are zero or one active downloads, so this makes
> > no difference
> 
> The only problematic case would be a resource consumption one if an add-on
> launches thousands of downloads at the same time, but we should probably not
> design for this case in advance, and we already use a lot of memory for the
> data buffers.

Let's use real-world data (in a followup bug) to determine this: bug 817381.
Comment 47 David Rajchenbach-Teller [:Yoric] (please use "needinfo") 2012-12-03 07:28:42 PST
(I am in conference this week, I might have difficulties providing feedback until at least Thursday.)
Comment 48 David Rajchenbach-Teller [:Yoric] (please use "needinfo") 2012-12-07 06:14:45 PST
Comment on attachment 686818 [details] [diff] [review]
Fixed patch

Review of attachment 686818 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, although I would like Irving's remarks to be addressed.
General remark: I am almost certain that NS_ENSURE_SUCCESS has been deprecated.

Due to general review overload, I have not had time to read carefully the tests. If you want me to, please re-f? me.

::: netwerk/base/src/BackgroundFileSaver.cpp
@@ +37,5 @@
> +/**
> + * Runnable object used to notify the control thread that file contents will now
> + * be saved to the specified file.
> + */
> +class NotifyTargetChangeRunnable : public nsRunnable

I agree with MOZ_FINAL.

@@ +238,5 @@
> +  // executed in the worker thread.  ProcessStateChange will then start the copy
> +  // before this function is called on the worker thread.  In this case, we
> +  // interrupt the copy and re-enter through AsyncCopyCallback.  This allows us
> +  // to check if, for instance, we should rename the target file.  This function
> +  // will then restart the copy if needed.

I *think* I understand that comment, but there must be a way to make it clearer :)

Is the following correct?

«
This function is called whenever the controller thread has requested the attention of the worker thread.
This may happen in the following circumstances:
1. we have interrupted the copy for some reason (in this case, we are called by AsyncCopyCallback and mAsyncCopyContext is null);
2. we are about to start the copy for the first time (in this case, we are called by GetWorkerThreadAttention and mAsyncCopyContext is null);
3. we are currently updating the state (in ProcessStateChange) and another state change request is received (in this case, we are called by GetWorkerThreadAttention but mAsyncCopyContext is not null).

In case 3., ProcessStateChange will first resume the copy, but we reinterrupt it immediately so as handle the state change.
»

@@ +356,5 @@
> +  if (CheckCompletion()) {
> +    return NS_OK;
> +  }
> +
> +  // Create the target file or append to it as requested.

Do I understand correctly that we append iff we have not changed the file but we had to cancel/restart the async copy for some other reason? If so, could you please mention this explicitly in the doc?

@@ +431,5 @@
> +      // output file, allow the copy operation to resume.  The Available getter
> +      // may return an error if one of the pipe's streams has been already closed.
> +      uint64_t available;
> +      rv = mPipeInputStream->Available(&available);
> +      if (NS_SUCCEEDED(rv) && available > 0) {

Nit: I would use |available != 0|.

@@ +672,5 @@
> +    if (NS_SUCCEEDED(rv) && available > REQUEST_SUSPEND_AT) {
> +      mShouldSuspendRequest = true;
> +
> +      // Try to suspend the request.  If this fails, don't try to resume later.
> +      if (NS_SUCCEEDED(aRequest->Suspend())) {

Does this callback to anything?

@@ +687,5 @@
> +// Called on the worker thread.
> +// static
> +void
> +BackgroundFileSaverStreamListener::AsyncCopyProgressCallback(void *aClosure,
> +                                                             uint32_t aCount)

I do not follow what can call |AsyncCopyProgressCallback| and unsuspend a request. Is this documented somewhere?

@@ +692,5 @@
> +{
> +  BackgroundFileSaverStreamListener *self =
> +    (BackgroundFileSaverStreamListener *)aClosure;
> +
> +  if (self->mShouldSuspendRequest) {

If I understand correctly, |mShouldSuspendRequest| is ill-named. In this context, |mShouldSuspendRequest| actually means that the request is currently suspended, is that it?

::: netwerk/base/src/BackgroundFileSaver.h
@@ +159,5 @@
> +  //////////////////////////////////////////////////////////////////////////////
> +  //// Private methods
> +
> +  /**
> +   * Called when NS_AsyncCopy completes.

Could you document |aClosure|?

::: netwerk/test/unit/test_backgroundfilesaver.js
@@ +23,5 @@
> +                                  "resource://gre/modules/NetUtil.jsm");
> +XPCOMUtils.defineLazyModuleGetter(this, "Promise",
> +                                  "resource://gre/modules/commonjs/promise/core.js");
> +XPCOMUtils.defineLazyModuleGetter(this, "Task",
> +                                  "resource://gre/modules/Task.jsm");

/me likes Task.jsm :)

@@ +85,5 @@
> + *
> + * @param aFile
> + *        nsIFile whose contents should be verified.
> + * @param aExpectedContents
> + *        String containing the octects that are expected in the file.

Nit: "octets".

@@ +102,5 @@
> +      do_check_eq(contents, aExpectedContents);
> +    } else {
> +      do_check_eq(contents.length, aExpectedContents.length);
> +      do_check_true(contents == aExpectedContents);
> +    }

ok
Comment 49 Mike Hommey [:glandium] 2012-12-07 07:10:34 PST
> > + *        String containing the octects that are expected in the file.
> 
> Nit: "octets".

8 people playing music?
Comment 50 :Irving Reid (No longer working on Firefox) 2012-12-07 12:45:59 PST
(In reply to Mike Hommey [:glandium] from comment #49)
> > > + *        String containing the octects that are expected in the file.
> > 
> > Nit: "octets".
> 
> 8 people playing music?

Maybe *that's* why my IETF RFC implementations never behave they way I expect them to...


One other thought: If the input side of the async download is coming from an HTTPS connection, the networking layer may make blocking calls back to the UI thread to interact with the user. I haven't spotted any places where this will cause us to deadlock, since all the inter-thread communication in this new code is non-blocking as far as I can tell, but I wanted to point the issue out just in case.

I've seen shutdown hangs in Thunderbird caused by this - the main thread is waiting for the worker thread to close an SSL connection, and for some reason that connection wants to notify the main thread about a certificate issue before it shuts down.
Comment 51 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-12-07 13:13:21 PST
As far as I know we've eliminated all PSM blocking calls back to the UI thread (in bug 682329 and maybe others).
Comment 52 :Paolo Amadini 2012-12-08 11:31:02 PST
Created attachment 690112 [details] [diff] [review]
Revised patch

I've updated this patch with all the comments, including calling Shutdown on
the thread (that isn't done by the destructor like I thought), and fixed a
few more issues I found out on my own, see the attached patch for details.

The only problem I've not identified yet is an intermittent failure on the
Linux build machines, in "test_backgroundfilesaver.js". The test times out.
Unfortunately, I just discovered that the xpcshell harness doesn't handle
timeouts, so I don't get any useful test output from the build log.

In fact, I've spent hours trying to chase a phantom main-thread mutex deadlock
(it would have caused the timeout detection code not to run) just to discover
later that there is simply no timeout detection code in xpcshell like there
is in mochitest.

Any suggestion on how to debug the issue on the build machines, other than
adding timeout support to the xpcshell test harness? The timeout never
happens on my machine, even if I download the builds and tests locally,
and run the exact commands listed in the build logs.
Comment 53 :Paolo Amadini 2012-12-08 11:36:28 PST
By the way, I'm not a fan of overly detailed comments, since they tend to become
outdated very quickly as other parts of the code change. But in this case, given
the complexity of the code, I think a 15-line comment here and there, based on
the specific feedback received by people who read the code, can be fine.
Comment 54 :Paolo Amadini 2012-12-08 11:39:46 PST
Failing build, for the record: https://tbpl.mozilla.org/?tree=Try&rev=1157533514d6
Comment 55 David Rajchenbach-Teller [:Yoric] (please use "needinfo") 2012-12-08 12:29:57 PST
(In reply to Paolo Amadini [:paolo] from comment #52)
> Any suggestion on how to debug the issue on the build machines, other than
> adding timeout support to the xpcshell test harness? The timeout never
> happens on my machine, even if I download the builds and tests locally,
> and run the exact commands listed in the build logs.

In this circumstance, I generally add a combo do_timeout/do_throw to the test. This ensures that xpcshell logs more details about the error.

If this is not sufficient, you can always request the loan of a test slave.
Comment 56 :Paolo Amadini 2012-12-10 11:31:01 PST
Created attachment 690489 [details] [diff] [review]
Final patch

(In reply to David Rajchenbach Teller [:Yoric] from comment #55)
> In this circumstance, I generally add a combo do_timeout/do_throw to the
> test. This ensures that xpcshell logs more details about the error.

That helped, I've also left the timeout check in this final patch.

The bug was that Resume was called on the wrong thread.

I think this is ready to land now. I'll do it soon, unless there's more
feedback related to the latest changes.
Comment 58 Ed Morley [:emorley] 2012-12-11 08:09:31 PST
https://hg.mozilla.org/mozilla-central/rev/eaa3e3765678
Comment 59 Christian :Biesinger (don't email me, ping me on IRC) 2012-12-11 16:36:00 PST
(In reply to Paolo Amadini [:paolo] from comment #52)
> I've updated this patch with all the comments, including calling Shutdown on
> the thread (that isn't done by the destructor like I thought), and fixed a
> few more issues I found out on my own, see the attached patch for details.

There is no destructor being called if you don't call Shutdown :) The wonders of refcounting...
Comment 60 Bogdan Maris, QA [:bogdan_maris] 2013-01-11 08:15:56 PST
Let me know if I can help you test this.
Comment 61 :Paolo Amadini 2013-01-17 07:10:32 PST
(In reply to Bogdan Maris [QA] from comment #60)
> Let me know if I can help you test this.

Help with testing and identifying edge cases would be much appreciated!

This change affects downloads started by clicking a link with the left mouse
button (as opposed to the "Save Link As" option). Examples of manual tests
that can complement the automated test suite:
 * Download a large file, when the dialog box for selecting what to do with
    the file is displayed (in this case, the download starts in the background
    while you select the action). Select to save the file and select the file
    name before the download in the background is finished. Check that the
    file size is as expected.
 * Download a large file, when the dialog box for selecting what to do with
    the file is NOT displayed (when "don't ask me again" was previously
    selected). Check that the file size is as expected.
 * Download a small file, when the dialog box for selecting what to do with
    the file is displayed (in this case, the download starts and finishes in
    the background while you select the action). Select to save the file and
    select the file name. Check that the file size is as expected.
 * Download a small file, when the dialog box for selecting what to do with
    the file is NOT displayed (when "don't ask me again" was previously
    selected). Check that the file size is as expected.
 * Download a small file directly to slow media (floppy drive).
 * Download a large file directly to slow media (USB stick).
 * Download a large file, and interrupt it by disconnecting from the network.
    Check that the error case is properly reported.

If you can execute other exploratory testing or analyze edge cases, please do!
Comment 62 Bogdan Maris, QA [:bogdan_maris] 2013-02-01 09:37:00 PST
(In reply to Paolo Amadini [:paolo] from comment #61)
> (In reply to Bogdan Maris [QA] from comment #60)
> > Let me know if I can help you test this.
> 
> Help with testing and identifying edge cases would be much appreciated!
> 
> This change affects downloads started by clicking a link with the left mouse
> button (as opposed to the "Save Link As" option). Examples of manual tests
> that can complement the automated test suite:
>  * Download a large file, when the dialog box for selecting what to do with
>     the file is displayed (in this case, the download starts in the
> background
>     while you select the action). Select to save the file and select the file
>     name before the download in the background is finished. Check that the
>     file size is as expected.
>  * Download a large file, when the dialog box for selecting what to do with
>     the file is NOT displayed (when "don't ask me again" was previously
>     selected). Check that the file size is as expected.
>  * Download a small file, when the dialog box for selecting what to do with
>     the file is displayed (in this case, the download starts and finishes in
>     the background while you select the action). Select to save the file and
>     select the file name. Check that the file size is as expected.
>  * Download a small file, when the dialog box for selecting what to do with
>     the file is NOT displayed (when "don't ask me again" was previously
>     selected). Check that the file size is as expected.
>  * Download a small file directly to slow media (floppy drive).
>  * Download a large file directly to slow media (USB stick).
>  * Download a large file, and interrupt it by disconnecting from the network.
>     Check that the error case is properly reported.
> 
> If you can execute other exploratory testing or analyze edge cases, please
> do!

Sorry for the delay, I was a bit busy. I tested all of those scenarios except the one with the floppy (I don't have such hardware). All tests were a pass. Will try to find some edge cases further on.
I tested on last Nightly (Build ID: 20130201030927). If you need any more help please let me know.
Comment 63 :Gavin Sharp [email: gavin@gavinsharp.com] 2013-02-01 09:40:54 PST
(In reply to Bogdan Maris [QA] from comment #62)
> Sorry for the delay, I was a bit busy. I tested all of those scenarios
> except the one with the floppy (I don't have such hardware). All tests were
> a pass. Will try to find some edge cases further on.

Awesome, thanks Bogdan!
Comment 64 Bogdan Maris, QA [:bogdan_maris] 2013-02-11 08:19:17 PST
Here are some edge cases: 
https://docs.google.com/spreadsheet/ccc?key=0AjbDkQ2Zlg86dDRPSjMyai05S0d1SGo5akNQSjQwbHc&usp=sharing#gid=0
I will try to update the document with more tests.
Comment 65 :Gavin Sharp [email: gavin@gavinsharp.com] 2013-04-12 15:57:55 PDT
*** Bug 599991 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.