Closed Bug 497003 Opened 15 years ago Closed 11 years ago

support sending OnDataAvailable() to other threads

Categories

(Core :: Networking, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: jduell.mcbugs, Assigned: sworkman)

References

(Blocks 1 open bug, )

Details

Attachments

(5 files, 26 obsolete files)

20.00 KB, text/plain
Details
5.23 KB, text/plain
Details
10.20 KB, patch
Details | Diff | Splinter Review
51.56 KB, patch
sworkman
: review+
Details | Diff | Splinter Review
25.55 KB, patch
sworkman
: review+
Details | Diff | Splinter Review
We are still trying to figure out if we want to implement sending OnDataAvailable() to other threads.

Design notes and discussion are on the wiki:

https://wiki.mozilla.org/Necko:_support_sending_OnDataAvailable%28%29_to_other_threads

Ben Newman has done some performance tests that seem to indicate that this would not be a very big win for the common case, so I'm still inclined to focus my energy on Electrolysis, which will make this API moot.   But comments from the peanut gallery are still welcome.

For now I'm guessing that the wiki page (and dev.planning) are probably better places for data and discussion, but we wanted a bug for this too, so here it is :)
(In reply to comment #0)
> Ben Newman has done some performance tests that seem to indicate that this
> would not be a very big win for the common case,

I've been trying to quantify the best possible performance gains, and a good (albeit uncommon) test-case is the bugzilla HTML diff for Henri's HTML5 parsing patch:

http://people.mozilla.org/~bnewman/html5diff.html

Although not a very web-representative page (9Mb, very table-oriented), this HTML document induces extreme event queue contention between PresShell::ReflowEvent::Run (49.8% according to Shark) and nsParserContinueEvent (23.9%).

More detailed numbers to come.
Lines beginning with a pipe ('|') were printf'd in nsParser::OnDataAvailable.  The data give statistics about how long the event currently on the stack had to wait in the main thread event queue before being executed.

The "last" value is the number of seconds waited.

The "count" value tracks how many distinct events have resulted in an OnDataAvailable call (when the count doesn't change from line to line, it's the same event resulting in another OnDataAvailable call).

The "mean" value is the average of all "last" values so far.

The "stdev" value is the standard deviation of those values (of dubious usefulness because these values are by no means normally distributed).

The "max" value is the longest time waited by any single OnDataAvailable-inducing event.

The "min" value is the shortest such time.

As indicated by the first line of the attachment, these data were obtained by loading html5diff.html from localhost, to minimize extraneous network overhead.  Also, the data were recorded during an initial page load, including browser startup time.  For that reason, not all the data are associated purely with page load (but startup performance is still relevant to this bug).

Interpretation to follow.
(In reply to comment #2)
> Created an attachment (id=382620) [details]
> main event queue latencies for OnDataAvailable-inducing events

The segment of these data that argues for delivering network events off the main thread is

| last:  0.36s   count:    90   mean: 0.023289s   stdev: 0.08148s   max:  0.44s   min:     0s
| last: 1.124s   count:    91   mean: 0.035385s   stdev: 0.14048s   max: 1.124s   min:     0s
| last: 0.529s   count:    92   mean: 0.04075s   stdev: 0.14879s   max: 1.124s   min:     0s
| last: 0.392s   count:    93   mean: 0.044527s   stdev: 0.15236s   max: 1.124s   min:     0s
| last: 0.905s   count:    94   mean: 0.053681s   stdev: 0.17538s   max: 1.124s   min:     0s
| last: 0.573s   count:    95   mean: 0.059147s   stdev: 0.18233s   max: 1.124s   min:     0s
| last: 0.296s   count:    96   mean: 0.061615s   stdev: 0.18297s   max: 1.124s   min:     0s
Reading symbols for shared libraries . done
| last: 0.902s   count:    97   mean: 0.070278s   stdev: 0.20084s   max: 1.124s   min:     0s
| last: 0.375s   count:    98   mean: 0.073388s   stdev: 0.20215s   max: 1.124s   min:     0s
| last: 0.329s   count:    99   mean: 0.07597s   stdev: 0.20274s   max: 1.124s   min:     0s
| last: 0.329s   count:   100   mean: 0.0785s   stdev: 0.20329s   max: 1.124s   min:     0s
Reading symbols for shared libraries . done
| last: 0.352s   count:   101   mean: 0.081208s   stdev: 0.20409s   max: 1.124s   min:     0s
| last: 0.356s   count:   102   mean: 0.083902s   stdev: 0.20488s   max: 1.124s   min:     0s
| last: 0.358s   count:   103   mean: 0.086563s   stdev: 0.20565s   max: 1.124s   min:     0s
| last: 0.375s   count:   104   mean: 0.089337s   stdev: 0.20658s   max: 1.124s   min:     0s
| last: 0.372s   count:   105   mean: 0.092029s   stdev: 0.20742s   max: 1.124s   min:     0s
| last: 0.376s   count:   106   mean: 0.094708s   stdev: 0.20826s   max: 1.124s   min:     0s
| last: 0.377s   count:   107   mean: 0.097346s   stdev: 0.20905s   max: 1.124s   min:     0s
| last: 0.381s   count:   108   mean: 0.099972s   stdev: 0.20985s   max: 1.124s   min:     0s
| last: 0.393s   count:   109   mean: 0.10266s   stdev: 0.21074s   max: 1.124s   min:     0s
| last: 0.388s   count:   110   mean: 0.10525s   stdev: 0.21153s   max: 1.124s   min:     0s
| last: 0.388s   count:   111   mean: 0.1078s   stdev: 0.21226s   max: 1.124s   min:     0s
| last: 0.515s   count:   112   mean: 0.11144s   stdev: 0.21475s   max: 1.124s   min:     0s
| last: 0.399s   count:   113   mean: 0.11398s   stdev: 0.21549s   max: 1.124s   min:     0s
| last: 0.402s   count:   114   mean: 0.11651s   stdev: 0.21622s   max: 1.124s   min:     0s
| last:  0.41s   count:   115   mean: 0.11906s   stdev: 0.21699s   max: 1.124s   min:     0s
| last: 0.076s   count:   116   mean: 0.11869s   stdev: 0.21609s   max: 1.124s   min:     0s
| last: 0.078s   count:   117   mean: 0.11834s   stdev: 0.2152s   max: 1.124s   min:     0s
| last: 0.085s   count:   118   mean: 0.11806s   stdev: 0.21431s   max: 1.124s   min:     0s
| last: 0.149s   count:   119   mean: 0.11832s   stdev: 0.21342s   max: 1.124s   min:     0s
| last: 0.098s   count:   120   mean: 0.11815s   stdev: 0.21254s   max: 1.124s   min:     0s
| last: 0.103s   count:   121   mean: 0.11802s   stdev: 0.21166s   max: 1.124s   min:     0s
| last: 0.106s   count:   122   mean: 0.11793s   stdev: 0.2108s   max: 1.124s   min:     0s
| last: 0.113s   count:   123   mean: 0.11789s   stdev: 0.20994s   max: 1.124s   min:     0s
| last: 0.123s   count:   124   mean: 0.11793s   stdev: 0.20909s   max: 1.124s   min:     0s
| last: 0.125s   count:   125   mean: 0.11798s   stdev: 0.20825s   max: 1.124s   min:     0s
| last: 0.134s   count:   126   mean: 0.11811s   stdev: 0.20743s   max: 1.124s   min:     0s
| last:  0.14s   count:   127   mean: 0.11828s   stdev: 0.20662s   max: 1.124s   min:     0s
| last: 0.147s   count:   128   mean: 0.11851s   stdev: 0.20583s   max: 1.124s   min:     0s
| last: 0.083s   count:   129   mean: 0.11823s   stdev: 0.20505s   max: 1.124s   min:     0s
| last: 0.076s   count:   130   mean: 0.11791s   stdev: 0.2043s   max: 1.124s   min:     0s
| last: 0.077s   count:   131   mean: 0.1176s   stdev: 0.20355s   max: 1.124s   min:     0s
| last: 0.141s   count:   132   mean: 0.11777s   stdev: 0.20278s   max: 1.124s   min:     0s

Multiple events are waiting hundreds of milliseconds, even full seconds, in the queue.  That seems worrisome, but there are several things to remember:

1. This test page is a BIT extreme.

2. While we might be able to start parsing sooner if we delivered events directly to a non-main thread, the parser would still have to post events back to the main thread event queue in order for the layout engine to display the page.  These events would presumably have to wait just as long before receiving rendering attention.

3. According to Shark profiles, most of the competing events are ReflowEvents.  While I'm willing to grant that layout probably deserves the lion's share of processing time, I would also just mention that it may be possible to reduce the OnDataAvailable latencies by interrupting reflow more often (the point being we can tune the balance between nsParserContinueEvents and ReflowEvents without modifying necko).
I can post the patch that generated these numbers if there's interest.

I also have a patch to quantify the cost of copying the "available data" into a separate buffer (presumably required if we're sending that data to another thread asynchronously).  Sneak preview: the additional overhead of using Read rather than ReadSegments is negligible compared with the numbers above.
Definitely post the patch for posterity. Might be useful for others that want to do similar things.
Obviously a much shorter request-to-render time, but still some worrisome latencies toward the end (events 4-8).
Attached patch the patch I usedSplinter Review
Not super pretty code, but cleaning it up would mean posting a slightly different patch from the one used to generate the data.
Assignee: jduell.mcbugs → nobody
Jason is planning to work on this in Q1 2013.
Assignee: nobody → jduell.mcbugs
And what about OnStopRequest on a background thread?  It is the most significant UI thread blocker.
Assignee: jduell.mcbugs → sworkman
Created 2 new IDLs
-- nsIThreadRetargetableRequest (with method redirectDeliveryTo, taking an nsIThread as input)
-- nsIThreadRetargetableStreamListener (with method 
(Not sure if WebIDL is needed here; assuming XPIDL is fine for this WIP)

Adjusted classes in the HTTP to HTML 5 Parser chain.
Implementing nsIThreadRetargetableRequest::RedirectDeliveryTo(nsIThread* aThread)
(called from nsHtml5StreamParser::OnStartRequest)
-- nsHttpChannel
-- nsInputStreamPump

Implementing nsIThreadRetargetableStreamListener::CheckListenerChain()
(called from nsInputStreamPump::RedirectDeliveryTo)
-- nsHttpChannel
-- nsDocumentOpenInfo
-- nsHtml5Parser

I had build errors with multiple inheritance (multiple paths to resolve nsIRequest and nsISupports for nsInputStreamPump and nsHttpChannel), so I had to do a couple of static_casts. I've read online that virtual inheritance is a way to get around this, but that doesn't seem possible with headers generated from IDLs. Hence static casting.
Attachment #732006 - Flags: feedback?(jduell.mcbugs)
Attachment #732006 - Flags: feedback?(hsivonen)
Attached patch WIP v1.0 Changes to events (obsolete) — Splinter Review
This patch adds and changes some async events. Also, it includes an async event that changes the nature of nsIWebProgressListeners, whereby they are notified asynchronously of transport status changes in OnDataAvailable. I don't know enough about nsIWebProgressListeners to know if this is ok or not ….

Also, not that this patch still fails running - there are still async events to be managed/changed. But I wanted to upload now to check the direction, mostly in reference to nsIWebProgressListener.

New async events/changes to events:

OnTransportStatusAsyncEvent added to nsHttpChannel::OnDataAvailable when called off the main thread. Dispatched to call OnTransportStatus on the main thread. This affects nsIWebProgressListeners and means they're not called synchronously before the next nsIStreamListener::OnDataAvailable (nsDocumentOpenInfo).

nsHtml5DataAvailable not dispatched not dispatched from main thread to parser thread. Instead, if nsHtml5StreamParser::OnDataAvailable is executing on the parser thread, DoDataAvailable is called directly.

Ditto for nsHtml5RequestStopper from ::OnStopRequest, but with the threads the other way around, so it's always called on the parser thread.

OnStateChangeAsyncEvent dispatched from nsDocLoader::OnStopRequest, to call FireOnStateChange on the main thread. This also affects nsIWebProgressListeners.

There is also a hack to avoid refcounting off the main thread for nsHtml5StreamParser in nsDocumentOpenInfo::OnDataAvailable. I need to think more about the solution for this one - maybe addrefing in OnStart and dispatching a release to the main thread.
Attachment #732007 - Flags: feedback?(jduell.mcbugs)
Attachment #732007 - Flags: feedback?(hsivonen)
Status: NEW → ASSIGNED
For non-main thread refcounting, consider nsMainThreadPtrHolder: http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsProxyRelease.h#67
Oops. Didn't hg add the new idls. Patch should have them now.
Attachment #732006 - Attachment is obsolete: true
Attachment #732006 - Flags: feedback?(jduell.mcbugs)
Attachment #732006 - Flags: feedback?(hsivonen)
Attachment #732029 - Flags: feedback?(jduell.mcbugs)
Attachment #732029 - Flags: feedback?(hsivonen)
(In reply to Josh Matthews [:jdm] from comment #12)
> For non-main thread refcounting, consider nsMainThreadPtrHolder:
> http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsProxyRelease.h#67

Thanks, Josh. I'll look into it.
Comment on attachment 732007 [details] [diff] [review]
WIP v1.0 Changes to events

For refcounting objects that can be refcounted only from the main thread, see nsHtml5RefPtr for a smart pointer that can be used as a field of a runnable posted from the main thread to a non-main thread.

The HTML parser parts of these patches look good to me, except the new code copies the data as many times as the old code, which probably now means one useless copy. In the new case, it should be possible to use a ReadSegments-like facility to pass data directly from Necko-owned buffers to chardet and WriteStreamBytes/SniffStreamBytes without making an intermediate copy (nsAutoArrayPtr<uint8_t> data(new (fallible) uint8_t[aLength]); in nsHtml5StreamParser::OnDataAvailable.).

Calling nsDocLoader::OnStopRequest from a non-main thread looks wrong, since the code touches flags on the docloader itself. It might be safer not to deliver the stream events off the main thread to listeners other than the parser and to make the HTML parser call the non-parser legacy listeners either from the preload operation queue to keep the semantics as close as possible to the current semantics (nsDocLoader::OnStopRequest when all data has been queued up in the parser but before the parse has finished) or from the tree op queue in case that actually results in more desirable semantics (nsDocLoader::OnStopRequest when the parse actually finishes).
Attachment #732007 - Flags: feedback?(hsivonen)
Attachment #732029 - Flags: feedback?(hsivonen)
Attachment #382639 - Attachment is patch: true
Attachment #382639 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 732007 [details] [diff] [review]
WIP v1.0 Changes to events

Canceling feedback request for Jason - the patch is out of date and another one coming in the next week or so.
Attachment #732007 - Flags: feedback?(jduell.mcbugs)
Attachment #732029 - Flags: feedback?(jduell.mcbugs)
Updated patch

-- nsInputStreamPump now correctly sets up the target thread as a callback from AsyncRead.
-- Ref ptr on non main thread issues fixed.
-- Removed DocLoader changes - these weren't needed.
Attachment #732007 - Attachment is obsolete: true
Attachment #732029 - Attachment is obsolete: true
Attachment #743325 - Flags: review?(jduell.mcbugs)
Attachment #743325 - Flags: review?(hsivonen)
-- Split previous patch into two (Necko code and Parser code).
-- Added some minor changes.
Attachment #743325 - Attachment is obsolete: true
Attachment #743325 - Flags: review?(jduell.mcbugs)
Attachment #743325 - Flags: review?(hsivonen)
Attachment #745958 - Flags: review?(jduell.mcbugs)
Comment on attachment 745959 [details] [diff] [review]
v2.1 Support delivery of OnDataAvailable on the HTML5 parser thread

> nsHtml5StreamParser::OnDataAvailable(nsIRequest* aRequest,
>                                nsISupports* aContext,
>                                nsIInputStream* aInStream,
>                                uint64_t aSourceOffset,
>                                uint32_t aLength)
> {
>   nsresult rv;
>-  if (NS_FAILED(rv = mExecutor->IsBroken())) {
>+  if (NS_FAILED(rv = mExecutor->IsBroken(false))) {
>     return rv;
>   }
> 
>   NS_ASSERTION(mRequest == aRequest, "Got data on wrong stream.");
>   uint32_t totalRead;
>   const mozilla::fallible_t fallible = mozilla::fallible_t();
>   nsAutoArrayPtr<uint8_t> data(new (fallible) uint8_t[aLength]);
>   if (!data) {
>     return mExecutor->MarkAsBroken(NS_ERROR_OUT_OF_MEMORY);
>   }

Please move the allocation of |data| to inside this |if| branch:

>+  if (NS_IsMainThread()) {

r+ with that change.

>+    inline nsresult IsBroken(bool aAssertMainThread = true) {
>+      if (aAssertMainThread) {
>+        NS_ASSERTION(NS_IsMainThread(), "Wrong thread!");
>+      }
>       return mBroken;
>     }

Probably better to remove the assertion than to bother with the added argument. But r+ either way.

>--- a/uriloader/base/nsURILoader.cpp
>+++ b/uriloader/base/nsURILoader.cpp

These changes should probably be reviewed by someone who is qualified to review stuff under uriloader/.
Attachment #745959 - Flags: review?(hsivonen) → review+
Comment on attachment 745959 [details] [diff] [review]
v2.1 Support delivery of OnDataAvailable on the HTML5 parser thread

Review of attachment 745959 [details] [diff] [review]:
-----------------------------------------------------------------

a couple nits...

::: parser/html/nsHtml5StreamParser.cpp
@@ +1138,5 @@
>  
>    NS_ASSERTION(mRequest == aRequest, "Got data on wrong stream.");
>    uint32_t totalRead;
>    const mozilla::fallible_t fallible = mozilla::fallible_t();
>    nsAutoArrayPtr<uint8_t> data(new (fallible) uint8_t[aLength]);

We can move the allocation of the 'data' field into the main-thread only case--it's not used otherwise, so it's a waste of CPU if we run on the parser thread.

::: uriloader/base/nsURILoader.cpp
@@ +277,5 @@
> +  nsresult rv = NS_OK;
> +  nsCOMPtr<nsIThreadRetargetableStreamListener> retargetableListener =
> +    do_QueryInterface(m_targetStreamListener, &rv);
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  if (retargetableListener) {

I should know this by heart already, but is there any reason to use the version of do_QueryInterface that takes &rv here?  Isn't it enough to just test whether retargetableListener winds up null or not?  (I was worried that if we could somehow get null but rv=NS_OK here we'd wind up returning NS_OK here accidentally.  From a brief wade into nsISupportsImpl.h it looks like rv will always be !NS_OK if we get null, but I'm not 100% sure.)
Thanks Henri!

(In reply to Henri Sivonen (:hsivonen) from comment #20)
> Comment on attachment 745959 [details] [diff] [review]

> Please move the allocation of |data| to inside this |if| branch:

D'oh. Of course.

> r+ with that change.

Thanks.

> >+    inline nsresult IsBroken(bool aAssertMainThread = true) {
> >+      if (aAssertMainThread) {
> >+        NS_ASSERTION(NS_IsMainThread(), "Wrong thread!");
> ...
> Probably better to remove the assertion than to bother with the added
> argument. But r+ either way.

Okie doke. I'll just remove it then.

> >--- a/uriloader/base/nsURILoader.cpp
> >+++ b/uriloader/base/nsURILoader.cpp
> 
> These changes should probably be reviewed by someone who is qualified to
> review stuff under uriloader/.

No problem.
Ehsan, can you take a look at the changes to nsURILoader.h/cpp please?
Attachment #745959 - Attachment is obsolete: true
Attachment #750157 - Flags: review?(ehsan)
Comment on attachment 750157 [details] [diff] [review]
v2.2 Support delivery of OnDataAvailable on the HTML5 parser thread

Wait - I uploaded the wrong patch. Hold on.
Attachment #750157 - Flags: review?(ehsan)
Ok - this should be the right one.
Attachment #750157 - Attachment is obsolete: true
Attachment #750168 - Flags: review?(ehsan)
Comment on attachment 750168 [details] [diff] [review]
v2.2 Support delivery of OnDataAvailable on the HTML5 parser thread

Bobby, can you take a look at the nsMainThreadPtrHandle changes please? I added a bool operator because the compiler was complaining about ambiguity for if statements.
Attachment #750168 - Flags: review?(bobbyholley+bmo)
Comment on attachment 750168 [details] [diff] [review]
v2.2 Support delivery of OnDataAvailable on the HTML5 parser thread

Review of attachment 750168 [details] [diff] [review]:
-----------------------------------------------------------------

r=bholley on nsProxyRelease

::: xpcom/glue/nsProxyRelease.h
@@ +202,5 @@
>  
>    operator T*() { return get(); }
>    T* operator->() { return get(); }
>  
> +  // Added to remove compiler ambiguity in condition statements.

I'd say probably say "conditional", but you can probably just nix this comment.

@@ +204,5 @@
>    T* operator->() { return get(); }
>  
> +  // Added to remove compiler ambiguity in condition statements.
> +  operator bool() { return get(); }
> +  

Whitespace error.
Attachment #750168 - Flags: review?(bobbyholley+bmo) → review+
(In reply to Jason Duell (:jduell) from comment #21)
> Comment on attachment 745959 [details] [diff] [review]
> v2.1 Support delivery of OnDataAvailable on the HTML5 parser thread
> 
> Review of attachment 745959 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> a couple nits...
> 
> ::: parser/html/nsHtml5StreamParser.cpp
> @@ +1138,5 @@
> >  
> >    NS_ASSERTION(mRequest == aRequest, "Got data on wrong stream.");
> >    uint32_t totalRead;
> >    const mozilla::fallible_t fallible = mozilla::fallible_t();
> >    nsAutoArrayPtr<uint8_t> data(new (fallible) uint8_t[aLength]);
> 
> We can move the allocation of the 'data' field into the main-thread only
> case--it's not used otherwise, so it's a waste of CPU if we run on the
> parser thread.

Yup - stupid mistake. Henri noticed it too.

> ::: uriloader/base/nsURILoader.cpp
> @@ +277,5 @@
> > +  nsresult rv = NS_OK;
> > +  nsCOMPtr<nsIThreadRetargetableStreamListener> retargetableListener =
> > +    do_QueryInterface(m_targetStreamListener, &rv);
> > +  NS_ENSURE_SUCCESS(rv, rv);
> > +  if (retargetableListener) {
> 
> I should know this by heart already, but is there any reason to use the
> version of do_QueryInterface that takes &rv here?  Isn't it enough to just
> test whether retargetableListener winds up null or not?  (I was worried that
> if we could somehow get null but rv=NS_OK here we'd wind up returning NS_OK
> here accidentally.  From a brief wade into nsISupportsImpl.h it looks like
> rv will always be !NS_OK if we get null, but I'm not 100% sure.)

That makes sense. I'm not sure about what values rv could hold, but in any case, I think we only care about null. I may have had a moment of over-exuberant error-checking.
Thanks Bobby!

(In reply to Bobby Holley (:bholley) from comment #27)
> Comment on attachment 750168 [details] [diff] [review]
> v2.2 Support delivery of OnDataAvailable on the HTML5 parser thread
> 
> Review of attachment 750168 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=bholley on nsProxyRelease
> 
> ::: xpcom/glue/nsProxyRelease.h
> @@ +202,5 @@
> >  
> >    operator T*() { return get(); }
> >    T* operator->() { return get(); }
> >  
> > +  // Added to remove compiler ambiguity in condition statements.
> 
> I'd say probably say "conditional", but you can probably just nix this
> comment.

Yeah - I put it there mainly for the review. I'll remove it.

> @@ +204,5 @@
> >    T* operator->() { return get(); }
> >  
> > +  // Added to remove compiler ambiguity in condition statements.
> > +  operator bool() { return get(); }
> > +  
> 
> Whitespace error.

Oops. Gone.
Comment on attachment 750168 [details] [diff] [review]
v2.2 Support delivery of OnDataAvailable on the HTML5 parser thread

Review of attachment 750168 [details] [diff] [review]:
-----------------------------------------------------------------

Not sure why you asked me to review this patch, but I'm not the right reviewer here.
Attachment #750168 - Flags: review?(ehsan)
Comment on attachment 750168 [details] [diff] [review]
v2.2 Support delivery of OnDataAvailable on the HTML5 parser thread

Bz, can you take a look at nsURILoader in this patch please? Mostly converting m_targetStreamListener to use an nsMainThreadPtrHandle so that OnDataAvailable can be called off the main thread.
Attachment #750168 - Flags: review+ → review?(bzbarsky)
(In reply to Steve Workman [:sworkman] from comment #28)
> (In reply to Jason Duell (:jduell) from comment #21)
> > Comment on attachment 745959 [details] [diff] [review]
> >
> > I should know this by heart already, but is there any reason to use the
> > version of do_QueryInterface that takes &rv here?  Isn't it enough to just
> > test whether retargetableListener winds up null or not?  (I was worried that
> > if we could somehow get null but rv=NS_OK here we'd wind up returning NS_OK
> > here accidentally.  From a brief wade into nsISupportsImpl.h it looks like
> > rv will always be !NS_OK if we get null, but I'm not 100% sure.)
> 
> That makes sense. I'm not sure about what values rv could hold, but in any
> case, I think we only care about null. I may have had a moment of
> over-exuberant error-checking.

Ah, so I remembered why I did this as I was looking at removing it. I wanted to return the rv obtained from do_QueryInterface in an error case (so I didn't have to specify one). I also wanted to get an error in the console log to tell me where in CheckListenerChain things failed. Otherwise, I wouldn't have used the rv version of do_QueryInterface.
I'm trying to read the first patch to understand the primitives used in the second, and I have some questions:

1)  Should nsIThreadRetargetableStreamListener really be scriptable?  Can it actually be
    implemented or used by scripts?
2)  The documentation there looks like it was copy/pasted from onDataAvailable; I can't
    tell what CheckListenerChain should actually be doing in implementations.
3)  Lowercase first 'c'?
(In reply to Boris Zbarsky (:bz) from comment #33)
> I'm trying to read the first patch to understand the primitives used in the
> second, and I have some questions:
> 
> 1)  Should nsIThreadRetargetableStreamListener really be scriptable?  Can it
> actually be implemented or used by scripts?

No - I will fix that.

> 2)  The documentation there looks like it was copy/pasted from
> onDataAvailable; I can't tell what CheckListenerChain should actually be doing in implementations.

Oh - sorry about that. I'll fix that too. It returns NS_OK if the listener is currently ok to receive ODA off the main thread, which is dependent on it's own listeners, if they exist. It is called in the context of nsIThreadRetargetableRequest::redirectDeliveryTo. Specifically, the last request in the chain of requests, which is nsInputStreamPump in this case. It will only redirect delivery if all the listeners in the chain are ok with off main thread delivery.
I'll work on that comment.

> 3)  Lowercase first 'c'?

Indeed. I'll fix that too.
Questions about the second patch:

4)  Why the changes to nsDocumentOpenInfo::OnStopRequest?  They seem wrong in the obvious
    case of multipart data and a callee that reenters the event loop from OnStopRequest.
5)  nsDocumentOpenInfo::CheckListenerChain only logs on success.  If it really means to
    log on failure, the code needs to be restructured accordingly.
6)  This code pattern:

      nsMainThreadPtrHolder<nsIStreamListener>* holder =
        new nsMainThreadPtrHolder<nsIStreamListener>(listener, false);
      m_targetStreamListener = nsMainThreadPtrHandle<nsIStreamListener>(holder);

    should just be:

      m_targetStreamListener = 
        new nsMainThreadPtrHolder<nsIStreamListener>(listener, false);

7) It's not quite clear to me why we want to pass false for that second arg to the
   nsMainThreadPtrHolde ctor.  Is that just so we can dereference it in OnDataAvailable
   and the like?  If so, worth documenting somewhere.
8) This code pattern:

       return (m_targetStreamListener);

   should be "return bool(targetStreamListener);" if the explicit cast is needed.  Why is
   it needed?
9) Please put the NS_DECL_NSITHREADRETARGETABLESTREAMLISTENER over by the other things
   like that.
10) Should CheckListenerChain be asserting that it's called on the main thread?  Again,
    the docs for it are unclear on exactly when it should be called and what it should do.
Attachment #750168 - Flags: review?(bzbarsky) → review-
Dealt with Bz's comments (see comment 33 and comment 44).
Bz, I added you as a reviewer on this one since you commented on the IDLs earlier.
Attachment #745958 - Attachment is obsolete: true
Attachment #745958 - Flags: review?(jduell.mcbugs)
Attachment #750783 - Flags: review?(jduell.mcbugs)
Attachment #750783 - Flags: review?(bzbarsky)
Thanks, bz.

(In reply to Boris Zbarsky (:bz) from comment #35)
> Questions about the second patch:
> 
> 4)  Why the changes to nsDocumentOpenInfo::OnStopRequest?  They seem wrong
> in the obvious case of multipart data and a callee that reenters the event loop 
> from OnStopRequest.

I adjusted this to minimize the changes. I wasn't working with nsMainThreadPtrHandle right here.

> 5)  nsDocumentOpenInfo::CheckListenerChain only logs on success.  If it
> really means to log on failure, the code needs to be restructured accordingly.

Good catch. I adjusted this to log the result in all cases. In addition, I added an NS_WARN_IF_FALSE to warn even when LOG is not enabled.

> 6)  This code pattern:
> 
>       nsMainThreadPtrHolder<nsIStreamListener>* holder =
>         new nsMainThreadPtrHolder<nsIStreamListener>(listener, false);
>       m_targetStreamListener =
> nsMainThreadPtrHandle<nsIStreamListener>(holder);
> 
>     should just be:
> 
>       m_targetStreamListener = 
>         new nsMainThreadPtrHolder<nsIStreamListener>(listener, false);

Indeed. Fixed in three places.

> 7) It's not quite clear to me why we want to pass false for that second arg
> to the nsMainThreadPtrHolde ctor.  Is that just so we can dereference it in
> OnDataAvailable and the like?  If so, worth documenting somewhere.

Correct - I added a comment.

> 8) This code pattern:
> 
>        return (m_targetStreamListener);
> 
>    should be "return bool(targetStreamListener);" if the explicit cast is
> needed.  Why is it needed?

The compiler complains about using != with nsMainThreadPtrHandle<T>. I went back to the original and added a .get() instead.

> 9) Please put the NS_DECL_NSITHREADRETARGETABLESTREAMLISTENER over by the
> other things like that.

Done.

> 10) Should CheckListenerChain be asserting that it's called on the main
> thread?  Again, the docs for it are unclear on exactly when it should be called and what it should do.

I added comments in the other patch, and checks for main thread in the functions.
Attachment #750168 - Attachment is obsolete: true
Attachment #750811 - Flags: review?(bzbarsky)
Added main thread check to CheckListenerChain.
Attachment #750783 - Attachment is obsolete: true
Attachment #750783 - Flags: review?(jduell.mcbugs)
Attachment #750783 - Flags: review?(bzbarsky)
Attachment #750814 - Flags: review?(jduell.mcbugs)
Attachment #750814 - Flags: review?(bzbarsky)
Comment on attachment 750814 [details] [diff] [review]
v2.31 Support delivery of OnDataAvailable off the main thread

r=me for just the idl bits; I didn't look at the rest.
Attachment #750814 - Flags: review?(bzbarsky) → review+
Comment on attachment 750811 [details] [diff] [review]
v2.3 Support delivery of OnDataAvailable on the HTML5 parser thread

>+  NS_WARN_IF_FALSE(NS_SUCCEEDED(rv), "CheckListenerChain failed");

Probably better not, since this can fail for totally legitimate reasons (e.g. unknown decoder, multipart, etc).

r=me on the uriloader changes.  The parser bits I can try to review if you can't get Henri to do it, but he's probably a better reviewer than me if he has the time.
Attachment #750811 - Flags: review?(bzbarsky) → review+
Thanks bz!

(In reply to Boris Zbarsky (:bz) from comment #39)
> Comment on attachment 750814 [details] [diff] [review]
> v2.31 Support delivery of OnDataAvailable off the main thread
> 
> r=me for just the idl bits; I didn't look at the rest.

That's great - Jason is looking at the rest.

(In reply to Boris Zbarsky (:bz) from comment #40)
> Comment on attachment 750811 [details] [diff] [review]
> v2.3 Support delivery of OnDataAvailable on the HTML5 parser thread
> 
> >+  NS_WARN_IF_FALSE(NS_SUCCEEDED(rv), "CheckListenerChain failed");
> 
> Probably better not, since this can fail for totally legitimate reasons
> (e.g. unknown decoder, multipart, etc).

Yeah - I was on the fence with this one. I'll just remove such error messages.

> r=me on the uriloader changes.  The parser bits I can try to review if you
> can't get Henri to do it, but he's probably a better reviewer than me if he
> has the time.

Thanks - Henri already r+'d it in comment 20.
Updated to attend to bz's comments (removed NS_WARN_IF_FALSE).
Attachment #750814 - Attachment is obsolete: true
Attachment #750814 - Flags: review?(jduell.mcbugs)
Attachment #751895 - Flags: review?(jduell.mcbugs)
Updated to remove NS_WARN_IF_FALSE.
Attachment #750811 - Attachment is obsolete: true
Attachment #751896 - Flags: review+
Comment on attachment 751895 [details] [diff] [review]
v2.4 Support delivery of OnDataAvailable off the main thread

Review of attachment 751895 [details] [diff] [review]:
-----------------------------------------------------------------

Good stuff.  Getting close I think.  I have various plans to make this code work for e10s too, some of which we might be able to do here. But I don't want to delay getting this to work for desktop ASAP.

Tests:  we'll want the following coverage (some of which may already be provided by existing mochitests):
  - test loads that come from cache, so we check blocking/streamtransportsvc path.  Should already have these I'd guess, but make sure.
  - test partial entries (where part of page comes from cache, part from network).  This is quite likely to be broken, as it involves two pumps, and we'd only have retargeted one AFAICT.  Ask Honza or Michal if they know of existing tests if you can't find any.
  - test something that Cancels and/or returns error from OnStartRequest after redirecting to new thread--make sure OnStop happens on new thread. Nice to have, not a deal breaker if hard.
  - test sniffers: test with a server handler that doesn't provide a "Content-Type" header in response.  As bz noted in the wiki page in comment 0, we may have to convert current sniffers like nsUnknownDecoder to nsIContentSniffers (or veto the retargeting) to make these work, since they essentially gobble up OnStart and some OnData events before they ever call the actual final listener's (HTTP Parser, etc) OnStart (which is when we'd actually be retargeting).  I'm fine with just vetoing retargeting for these cases, and either adding support in a followup, or just leaving them main-thread only, since they should be a minority of requests. (If you happen to find a fix for bug 748117 while you look into this, great).
  - I'm not sure what other sniffers or other code might be doing similar delaying of OnStart: "grep -r -- '->OnStartRequest'" in /network turns up a bunch of calls.  Take a look.

--
Let's add to the documentation for nsIInputStreamPump.idl--where it says

 * It utilizes the current thread's nsIEventTarget in order to make 
 * reading from the stream asynchronous.

add

 * (a different thread can be used if the pump also implements nsIThreadRetargetableRequest)

::: netwerk/base/public/nsIThreadRetargetableRequest.idl
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */

Include vim modeline too, both here and in other new files in patch.

  https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style#Mode_Line

@@ +15,5 @@
> +[uuid(27B84C48-5A73-4BA4-A8A4-8B5E649A145E)]
> +interface nsIThreadRetargetableRequest : nsISupports
> +{
> +    /**
> +     * Called to redirect delivery of OnDataAvailable to another thread.

"of OnDataAvailable and OnStopRequest to another thread"                                 
                                                                                         
We should also add info on when it's OK to call redirectDeliveryTo().  I'm fine with saying for now that we only support calling it during OnStartRequest() since that's the only use case right now.  

I think for e10s we may want to allow retargeting before AsyncOpen is called (and have OnStartRequest also delivered to a non-main thread): the goal would be to have the socket transport thread read stuff and sent it directly to the IPDL thread.  AFAICT this would mostly be a matter of 

1) providing a variant of nsIInputStreamPump::AsyncRead() that takes an eventTarget (instead of it assuming the calling thread is the target)

2) changing nsHttpChannel so that instead of failing in RedirectDeliveryTo() if there's no pump yet, keeping an eventTarget (mNewTarget) that, when we open the pump, we pass to the new function from #1.

Think about this--if it's easy enough to implement now, then let's do it. Otherwise we can do in a followup bug.  I'm guessing it'll be enough work that it's a followup.

Whatever we decide it would be nice to fail RedirectDeliveryTo() if the client calls it too early/late.  I hate APIs that do weird things w/o warning/failure when called at unplanned times.

@@ +24,5 @@
> +     * normal delivery to the main thread will continue. As such, listeners
> +     * should be ready to deal with OnDataAvailable and OnStopRequest on
> +     * either the main thread or the new target thread.
> +     */
> +    void redirectDeliveryTo(in nsIEventTarget aNewTarget);

Don't put "redirect" in the name please.  Muddies the waters...  How about "retargetTo"?

::: netwerk/base/public/nsIThreadRetargetableStreamListener.idl
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */

vim modeline

@@ +22,5 @@
> +     * possible, but it must check any and all nsIStreamListener objects that
> +     * might be called in the listener chain.
> +     *
> +     * An exception should be thrown if a listener in the chain does not
> +     * support retargeted delivery.

"does not support retargeted delivery, i.e. if the next listener does not implement nsIThreadRetargetableStreamListener, or a call to its checkListenerChain() fails."

::: netwerk/base/src/nsBaseChannel.cpp
@@ +47,5 @@
>  // Used to suspend data events from mPump within a function scope.  This is
>  // usually needed when a function makes callbacks that could process events.
>  #define SUSPEND_PUMP_FOR_SCOPE() \
> +  /* Static casting to avoid multiple inheritance 'Diamond Problem'. */ \
> +  ScopedRequestSuspender pump_suspender__(static_cast<nsIRequest*>(static_cast<nsIInputStreamPump*>(mPump)))

I'm surprised you need to do this.  nsInputStreamPump already inherits from multiple nsI** classes, and your class isn't adding concrete class double inheritance of nsISupports AFAICT.  Double check that you can't compile w/o this.

Hmm--I think this might be the issue

  https://developer.mozilla.org/en-US/docs/nsRefPtr

Try changing baseChannel::mPump to an nsCOMPtr--nsRefPtr is the wrong class to use for IDL pointers.

::: netwerk/base/src/nsInputStreamPump.cpp
@@ +127,2 @@
>  
> +    if (!mWaiting && mAsyncStream) {

Did you add a case where we could now be calling EnsureWaiting with mAsyncStream == null, or did you just figure it was good to check this?  We might want MOZ_ASSERT(mAsyncStream) since EnsureWaiting is not something we can casually return from--if we don't call AsyncWait the pump will never make progress again.

@@ +129,5 @@
> +        nsCOMPtr<nsIEventTarget> target
> +            = mDataTarget ? mDataTarget : do_QueryInterface(mTargetThread);
> +        if (!target) {
> +            NS_WARNING("thread target is null!");
> +            return NS_ERROR_NULL_POINTER;

MOZ_ASSERT(target) here--I don't see how a null target is possible, since we always set it to non-null (first in in AsyncRead then if RedirectDeliveryTo is called).

@@ +417,5 @@
> +            	NS_NewInputStreamReadyEvent(this, mDataTarget);
> +            // Update mState before dispatch.
> +            mState = nextState;
> +            // This function call will dispatch this->OnInputStreamReady.
> +            nsresult rv = event->OnInputStreamReady(stream);

It's not OK to call OnInputSteamReady blindly here: 1) the channel may have been suspended in OnStartRequest.  2) We also don't know for sure that the stream is really ready for more I/O (well, it probably is for the OnStart case, but I'd like to generalize the code).

It looks to me like you'd be better off with code like this:

  // If our eventTarget has been changed during OnStartRequest, call 
  // EnsureWaiting() to switch to the new thread now.  If we're suspended
  // this will happen in Resume() instead.
  if (mDataTarget &&
      mState == STATE_START &&
      nextState == STATE_TRANSFER &&
      mSuspendCount == 0) {
        EnsureWaiting();
        return rv;
      }
  }

Or, if you take my suggestion to get rid of mDataChannel and just have a bool 'mRetargeting' (in comment for .h file, below), it could just be

  // If our eventTarget has been changed, call EnsureWaiting() to switch to the
  // new thread now.  If we're suspended this will happen in Resume() instead.
  if (mRetargeting && 
      mSuspendCount == 0) {
        EnsureWaiting();
        // EnsureWaiting has retargeted for us, so clear flag.
        mRetargeting = false;    // [do this in Resume too]
        return rv;
      }
  }

In this case we can even support things like calling RedirectDeliveryTo() during an OnDataAvailable call and have the next ODA go to the new thread.  And the design is simpler.

@@ +600,5 @@
> +NS_IMETHODIMP
> +nsInputStreamPump::RedirectDeliveryTo(nsIEventTarget* aNewTarget)
> +{
> +    NS_ENSURE_ARG(aNewTarget);
> +    NS_ENSURE_FALSE(aNewTarget == NS_GetCurrentThread(), NS_ERROR_ILLEGAL_VALUE);

Seems harsh--maybe just warn "redirecting to same thread" and return?

@@ +617,5 @@
> +    LOG(("nsInputStreamPump::RedirectDeliveryTo [this=%x aNewTarget=%p] "
> +         "%s listener [%p] rv[%x]",
> +         this, aNewTarget, (mDataTarget == aNewTarget ? "success" : "failure"),
> +         (nsIStreamListener*)mListener, rv));
> +    return rv;

So if mListener isn't a nsIThreadRetargetableStreamListener you return NS_OK, but if it is and CheckListenerChain fails, you return an error. That seems weird: I'd return an error in both cases (i.e. set the initial value of rv to some error code--NS_ERROR_NOT_AVAILABLE?)

::: netwerk/base/src/nsInputStreamPump.h
@@ +86,5 @@
>      uint32_t                      mLoadFlags;
>      bool                          mIsPending;
>      bool                          mWaiting; // true if waiting on async source
>      bool                          mCloseWhenDone;
> +    nsCOMPtr<nsIEventTarget>      mDataTarget;

AFAICT you could switch mTargetThread to a nsIEventTarget (we never use any nsIThread methods on it), get rid of mDataTarget (we only ever have one target at a time, so we can just overwrite mTargetThread during RedirectDeliveryTo()), and set a bool ('mRetargeting') that you can set then clear when you've finished the steps needed to retarget (just the change in OnInputStreamReady AFAICT).  I think that's slightly cleaner.

If that's a bad idea for some reason, please make sure we also set mDataTarget = null in OnStateStop() as we do for mTargetThread.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +5148,5 @@
> +    } else {
> +        nsresult rv = NS_DispatchToMainThread(
> +            NS_NewRunnableMethod(this, &nsHttpChannel::ReleaseListeners));
> +        NS_ENSURE_SUCCESS(rv, rv);
> +    }

For efficiency (one less event to dispatch) let's do the isMainThread check once, and if false dispatch a single event that calls both RemoveLoadGroup and ReleaseListeners on the main thread.

@@ +5189,2 @@
>  NS_IMETHODIMP
>  nsHttpChannel::OnDataAvailable(nsIRequest *request, nsISupports *ctxt,

Ask Honza if it's OK to call the tracer macros off main thread:

>  MOZ_EVENT_TRACER_EXEC(this, "net::http::channel");

@@ +5269,5 @@
> +nsHttpChannel::RedirectDeliveryTo(nsIEventTarget* aNewTarget)
> +{
> +    NS_ENSURE_ARG(aNewTarget);
> +    NS_ENSURE_FALSE(aNewTarget == NS_GetCurrentThread(), NS_ERROR_ILLEGAL_VALUE);
> +    NS_ENSURE_TRUE(mTransactionPump || mCachePump, NS_ERROR_NULL_POINTER);

I think I'd use NS_ERROR_NOT_AVAILABLE here.

Also, this is the check we'd have to remove and replace with setting a "mNewTarget" if we want to support calling this before AsyncOpen.

@@ +5272,5 @@
> +    NS_ENSURE_FALSE(aNewTarget == NS_GetCurrentThread(), NS_ERROR_ILLEGAL_VALUE);
> +    NS_ENSURE_TRUE(mTransactionPump || mCachePump, NS_ERROR_NULL_POINTER);
> +    
> +    nsresult rv = NS_OK;
> +    nsRefPtr<nsInputStreamPump> pump = mCachePump ?

pointer to an interface, so nsCOMPtr not nsRefPtr (which is for concrete types).  Looks like we need to change nsHttpChannel.h's misuse of this for mTransactionPump/mCachePump too!

@@ +5276,5 @@
> +    nsRefPtr<nsInputStreamPump> pump = mCachePump ?
> +                                       mCachePump : mTransactionPump;
> +    nsCOMPtr<nsIThreadRetargetableRequest> threadRetargetableRequest;
> +    rv = pump->QueryInterface(NS_GET_IID(nsIThreadRetargetableRequest),
> +                              getter_AddRefs(threadRetargetableRequest));

Use do_QueryInterface: terser and more efficient:

  nsCOMPtr<nsIThreadRetargetableRequest> threadRetargetableRequest
    = do_QueryInterface(pump, &rv);

you could prob even do 

  nsCOMPtr<nsIThreadRetargetableRequest> threadRetargetableRequest
    = do_QueryInterface(mCachePump ? mCachePump : mTransactionPump, &rv);

and get rid of the 'pump' variable.

@@ +5277,5 @@
> +                                       mCachePump : mTransactionPump;
> +    nsCOMPtr<nsIThreadRetargetableRequest> threadRetargetableRequest;
> +    rv = pump->QueryInterface(NS_GET_IID(nsIThreadRetargetableRequest),
> +                              getter_AddRefs(threadRetargetableRequest));
> +    if (threadRetargetableRequest) {

MOZ_ASSERT(threadRetargetableRequest)?  If we're always using an nsInputStreamPump for our pump type (and I'm fairly certain we are) then this should always succeed.

::: netwerk/protocol/http/nsHttpChannel.h
@@ +72,5 @@
>      NS_DECL_NSIAPPLICATIONCACHECHANNEL
>      NS_DECL_NSIASYNCVERIFYREDIRECTCALLBACK
>      NS_DECL_NSITIMEDCHANNEL
> +    NS_DECL_NSITHREADRETARGETABLEREQUEST
> +    NS_DECL_NSITHREADRETARGETABLESTREAMLISTENER

Put inherited classes next to each other in inheritance order, i.e. put NS_DECL_NSITHREADRETARGETABLESTREAMLISTENER right after NS_DECL_NSISTREAMLISTENER.
Attachment #751895 - Flags: review?(jduell.mcbugs)
Attachment #751895 - Flags: review-
Attachment #751895 - Flags: feedback+
Comment on attachment 751895 [details] [diff] [review]
v2.4 Support delivery of OnDataAvailable off the main thread

Review of attachment 751895 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/src/nsBaseChannel.cpp
@@ +47,5 @@
>  // Used to suspend data events from mPump within a function scope.  This is
>  // usually needed when a function makes callbacks that could process events.
>  #define SUSPEND_PUMP_FOR_SCOPE() \
> +  /* Static casting to avoid multiple inheritance 'Diamond Problem'. */ \
> +  ScopedRequestSuspender pump_suspender__(static_cast<nsIRequest*>(static_cast<nsIInputStreamPump*>(mPump)))

As a general rule, you should never use static_cast to cast from one XPCOM interface to another. Instead you should use QueryInterface if a cast is necessary (especially since in general the object may be a JS object and not a native object).
Thanks for the review Jason!

(In reply to Jason Duell (:jduell) from comment #44)
> Comment on attachment 751895 [details] [diff] [review]
> v2.4 Support delivery of OnDataAvailable off the main thread
> 
> Tests:  we'll want the following coverage (some of which may already be
> provided by existing mochitests):
>   - test loads that come from cache, so we check blocking/streamtransportsvc
> path.  Should already have these I'd guess, but make sure.

I think this is implicitly tested in the new test (see next).

>   - test partial entries (where part of page comes from cache, part from
> network).  This is quite likely to be broken, as it involves two pumps, and
> we'd only have retargeted one AFAICT.  Ask Honza or Michal if they know of
> existing tests if you can't find any.

New test added in most recent patch for mochitest-plain:
-- test_partially_cached_content.js and partial_content.sjs
-- performs two requests for the same resource: first response is partial but header says it's full; second one is a range request and response.

>   - test something that Cancels and/or returns error from OnStartRequest
> after redirecting to new thread--make sure OnStop happens on new thread.
> Nice to have, not a deal breaker if hard.

I'll look into it, but I wanted to get the rest done first.

>   - test sniffers: …

Sniffers will need another bug. Currently they don't implement either of the new IDLs, so no retargeting will happen and main thread delivery will continue as normal.

> Let's add to the documentation for nsIInputStreamPump.idl--where it says …

Done.

> ::: netwerk/base/public/nsIThreadRetargetableRequest.idl
> @@ +1,1 @@
> > +/* -*- Mode: C++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */
> 
> Include vim modeline too, both here and in other new files in patch.

Done.

> @@ +15,5 @@
> > +[uuid(27B84C48-5A73-4BA4-A8A4-8B5E649A145E)]
> > +interface nsIThreadRetargetableRequest : nsISupports
> > +{
> > +    /**
> > +     * Called to redirect delivery of OnDataAvailable to another thread.
> 
> "of OnDataAvailable and OnStopRequest to another thread"                    

Done.

> We should also add info on when it's OK to call redirectDeliveryTo().  I'm
> fine with saying for now that we only support calling it during
> OnStartRequest() since that's the only use case right now.

In order to support partially cached entries it needs to be ok to call Retarget~ during STATE_START or STATE_TRANSFER. mCachePump will be in STATE_START, but mTransactionPump will be in STATE_TRANSFER. This is because the cache pump is put on hold before it calls OnStart, until the response headers are received from the network and it can determine if the content is valid. If it is valid, nsHttpChannel calls OnStart once, then OnData for the cache, then OnData for transaction pump, and then OnStop. OnStop from the cache pump is only heard by nsHttpChannel.

But the point is that Retarget~ may be in the context of OnStart called from nsHttpChannel, and mCachePump is in STATE_START, but mTransactionPump will be in STATE_TRANSFER.

I still think the comment is a good idea - I wrote "should", so it's more of a strong guideline :)

> I think for e10s we may want to allow retargeting before AsyncOpen is called
> …
> Think about this--if it's easy enough to implement now, then let's do it.
> Otherwise we can do in a followup bug.  I'm guessing it'll be enough work
> that it's a followup.

Yup - I'd prefer to land this part now and get it tested in Nightly for while.

> Whatever we decide it would be nice to fail RedirectDeliveryTo() if the
> client calls it too early/late.  I hate APIs that do weird things w/o
> warning/failure when called at unplanned times.

See my response above re STATE_START and STATE_TRANSFER.

> @@ +24,5 @@
> > +    void redirectDeliveryTo(in nsIEventTarget aNewTarget);
> 
> Don't put "redirect" in the name please.  Muddies the waters...  How about
> "retargetTo"?

Done.

> ::: netwerk/base/public/nsIThreadRetargetableStreamListener.idl
> @@ +1,1 @@
> > +/* -*- Mode: C++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */
> 
> vim modeline

Done.

> @@ +22,5 @@
> > +     * An exception should be thrown if a listener in the chain does not
> > +     * support retargeted delivery.
> 
> "does not support retargeted delivery, i.e. if the next listener does not
> implement nsIThreadRetargetableStreamListener, or a call to its
> checkListenerChain() fails."

Done.

> ::: netwerk/base/src/nsBaseChannel.cpp
> > +  /* Static casting to avoid multiple inheritance 'Diamond Problem'. */ \
> > +  ScopedRequestSuspender pump_suspender__(static_cast<nsIRequest*>(static_cast<nsIInputStreamPump*>(mPump)))
> 
> I'm surprised you need to do this.  nsInputStreamPump already inherits from
> multiple nsI** classes, and your class isn't adding concrete class double
> inheritance of nsISupports AFAICT.  Double check that you can't compile w/o
> this.

Artifact from earlier version. Removed.

> Try changing baseChannel::mPump to an nsCOMPtr--nsRefPtr is the wrong class
> to use for IDL pointers.

nsInputStreamPump implements PeekStream which is used by nsHttpChannel, but is not in the idl. Hence the existing use of nsRefPtrs and not nsCOMPtrs (and my hacky use of both). I could add PeekStream to the idl, but it seems ok to mix the ptrs here and I'm not sure we want to expose PeekStream.

> ::: netwerk/base/src/nsInputStreamPump.cpp
> > +    if (!mWaiting && mAsyncStream) {
> 
> Did you add a case where we could now be calling EnsureWaiting with
> mAsyncStream == null, or did you just figure it was good to check this?  We
> might want MOZ_ASSERT(mAsyncStream) since EnsureWaiting is not something we
> can casually return from--if we don't call AsyncWait the pump will never
> make progress again.

In an earlier version of the code it was null and caused a crash. I can't remember the details surrounding it. Nonetheless, it's good to check it's not null before calling one of it's functions. So, I added a MOZ_ASSERT.

> @@ +129,5 @@
> > +        if (!target) {
> 
> MOZ_ASSERT(target) here--I don't see how a null target is possible, since we
> always set it to non-null (first in in AsyncRead then if RedirectDeliveryTo
> is called).

Agreed.

> @@ +417,5 @@
> > +            	NS_NewInputStreamReadyEvent(this, mDataTarget);
> > +            // Update mState before dispatch.
> > +            mState = nextState;
> > +            // This function call will dispatch this->OnInputStreamReady.
> > +            nsresult rv = event->OnInputStreamReady(stream);
> 
> It's not OK to call OnInputSteamReady blindly here: 1) the channel may have
> ...
> In this case we can even support things like calling RedirectDeliveryTo()
> during an OnDataAvailable call and have the next ODA go to the new thread. 
> And the design is simpler.

Done.
> 
> @@ +600,5 @@
> > +NS_IMETHODIMP
> > +nsInputStreamPump::RedirectDeliveryTo(nsIEventTarget* aNewTarget)
> > +{
> > +    NS_ENSURE_ARG(aNewTarget);
> > +    NS_ENSURE_FALSE(aNewTarget == NS_GetCurrentThread(), NS_ERROR_ILLEGAL_VALUE);
> 
> Seems harsh--maybe just warn "redirecting to same thread" and return?

ZERO TOLERANCE!!! :)
Sigh, ok then - Changed to a warning and return NS_OK.

> @@ +617,5 @@
> > +    LOG(("nsInputStreamPump::RedirectDeliveryTo [this=%x aNewTarget=%p] "
> > …
> > +    return rv;
> 
> So if mListener isn't a nsIThreadRetargetableStreamListener you return
> NS_OK, but if it is and CheckListenerChain fails, you return an error. That
> seems weird: I'd return an error in both cases (i.e. set the initial value
> of rv to some error code--NS_ERROR_NOT_AVAILABLE?)

I changed this to get the rv using do_QueryInterface, and then return that.

> ::: netwerk/base/src/nsInputStreamPump.h
> > +    nsCOMPtr<nsIEventTarget>      mDataTarget;
> 
> AFAICT you could switch mTargetThread to a nsIEventTarget (we never use any
> nsIThread methods on it), get rid of mDataTarget (we only ever have one
> target at a time, so we can just overwrite mTargetThread during
> RedirectDeliveryTo()), and set a bool ('mRetargeting') that you can set then
> clear when you've finished the steps needed to retarget (just the change in
> OnInputStreamReady AFAICT).  I think that's slightly cleaner.

Yup - cool - done.

> ::: netwerk/protocol/http/nsHttpChannel.cpp
> @@ +5148,5 @@
> > +    } else {
> > +        nsresult rv = NS_DispatchToMainThread(
> > +            NS_NewRunnableMethod(this, &nsHttpChannel::ReleaseListeners));
> > +        NS_ENSURE_SUCCESS(rv, rv);
> > +    }
> 
> For efficiency (one less event to dispatch) let's do the isMainThread check
> once, and if false dispatch a single event that calls both RemoveLoadGroup
> and ReleaseListeners on the main thread.

Done.

> @@ +5189,2 @@
> >  NS_IMETHODIMP
> >  nsHttpChannel::OnDataAvailable(nsIRequest *request, nsISupports *ctxt,
> 
> Ask Honza if it's OK to call the tracer macros off main thread:
> 
> >  MOZ_EVENT_TRACER_EXEC(this, "net::http::channel");

There are notes by the macro definition - it's not thread safe, but there is a thread safe version. So, I changed it where appropriate. Note: there's a missing 'D' in MOZ_EVENT_TRACER_DONE_THREASAFE in the declaration, so it's not a typo on my behalf :)

I'm going to test those changes today, but I'll upload this now anyway.

> @@ +5269,5 @@
> > +nsHttpChannel::RedirectDeliveryTo(nsIEventTarget* aNewTarget)
> > +{
> > +    NS_ENSURE_ARG(aNewTarget);
> > +    NS_ENSURE_FALSE(aNewTarget == NS_GetCurrentThread(), NS_ERROR_ILLEGAL_VALUE);
> > +    NS_ENSURE_TRUE(mTransactionPump || mCachePump, NS_ERROR_NULL_POINTER);
> 
> I think I'd use NS_ERROR_NOT_AVAILABLE here.
> 
> Also, this is the check we'd have to remove and replace with setting a
> "mNewTarget" if we want to support calling this before AsyncOpen.

Hmm. With the quieter warning above, I just changed this to a warning and an NS_OK.

> @@ +5272,5 @@
> > +    nsresult rv = NS_OK;
> > +    nsRefPtr<nsInputStreamPump> pump = mCachePump ?
> 
> pointer to an interface, so nsCOMPtr not nsRefPtr (which is for concrete
> types).  Looks like we need to change nsHttpChannel.h's misuse of this for
> mTransactionPump/mCachePump too!

See explanation above.

> @@ +5276,5 @@
> > +    nsRefPtr<nsInputStreamPump> pump = mCachePump ?
> > +                                       mCachePump : mTransactionPump;
> > +    nsCOMPtr<nsIThreadRetargetableRequest> threadRetargetableRequest;
> > +    rv = pump->QueryInterface(NS_GET_IID(nsIThreadRetargetableRequest),
> > +                              getter_AddRefs(threadRetargetableRequest));
> 
> Use do_QueryInterface: terser and more efficient:

Yup, but this is where I still have an issue with multiple inheritance. The compiler doesn't know how to get an nsISupports - is it better to use QueryInterface off the objects, or do_QueryInterface with some static_casts?

> @@ +5277,5 @@
> > +                                       mCachePump : mTransactionPump;
> > +    nsCOMPtr<nsIThreadRetargetableRequest> threadRetargetableRequest;
> > +    rv = pump->QueryInterface(NS_GET_IID(nsIThreadRetargetableRequest),
> > +                              getter_AddRefs(threadRetargetableRequest));
> > +    if (threadRetargetableRequest) {
> 
> MOZ_ASSERT(threadRetargetableRequest)?  If we're always using an
> nsInputStreamPump for our pump type (and I'm fairly certain we are) then
> this should always succeed.

Yup - done.

> ::: netwerk/protocol/http/nsHttpChannel.h
> @@ +72,5 @@
> >      NS_DECL_NSIAPPLICATIONCACHECHANNEL
> >      NS_DECL_NSIASYNCVERIFYREDIRECTCALLBACK
> >      NS_DECL_NSITIMEDCHANNEL
> > +    NS_DECL_NSITHREADRETARGETABLEREQUEST
> > +    NS_DECL_NSITHREADRETARGETABLESTREAMLISTENER
> 
> Put inherited classes next to each other in inheritance order, i.e. put
> NS_DECL_NSITHREADRETARGETABLESTREAMLISTENER right after
> NS_DECL_NSISTREAMLISTENER.

OK done.



Some other necessary changes:

@@ -3889,17 +3890,18 @@ nsHttpChannel::InstallCacheListener(uint
-    mListener = tee;
+    mListener = do_QueryInterface(tee, &rv);

With the changes to nsStreamListenerTee's inheritance, the compiler now complains that it needs do_QueryInterface. It looks like it would have been good to have it there anyway.
Attachment #751895 - Attachment is obsolete: true
Attachment #755541 - Flags: review?(jduell.mcbugs)
> > @@ +5189,2 @@
> > >  NS_IMETHODIMP
> > >  nsHttpChannel::OnDataAvailable(nsIRequest *request, nsISupports *ctxt,
> > 
> > Ask Honza if it's OK to call the tracer macros off main thread:
> > 
> > >  MOZ_EVENT_TRACER_EXEC(this, "net::http::channel");
>
> There are notes by the macro definition - it's not thread safe, but there is a 
> thread safe version. So, I changed it where appropriate. Note: there's a missing 
> 'D' in MOZ_EVENT_TRACER_DONE_THREASAFE in the declaration, so it's not a typo on 
> my behalf :)
>
> I'm going to test those changes today, but I'll upload this now anyway.

Test done - no crashes or complaints about AddRef'ing/Releasing on the wrong thread.

Also, I was able to visually confirm ODA and OnStop on the HTML5 Parser thread - yay!
According the event tracer:
- "thread safety" means that an event may happen concurrently for a single OBJECT, this is probably not the case
- "normal" version can be used safely on multiple threads

I may need to adjust the comments if those are not well understood.

Please revert the macros back to the normal versions.
(In reply to Honza Bambas (:mayhemer) from comment #49)
> According the event tracer:
> - "thread safety" means that an event may happen concurrently for a single
> OBJECT, this is probably not the case
> - "normal" version can be used safely on multiple threads
> 
> I may need to adjust the comments if those are not well understood.
> 
> Please revert the macros back to the normal versions.

Ah I see. Cool - I'll revert those changes.
-- Reverted tracer macros, as per Honza's comment.
-- Added another do_QueryInterface for nsStreamListenerTee to InstallOfflineCacheListener (similar to the one added in InstallCacheListener).
Attachment #755541 - Attachment is obsolete: true
Attachment #755541 - Flags: review?(jduell.mcbugs)
Attachment #756211 - Flags: review?(jduell.mcbugs)
Includes some fixes to get all green on try.
Attachment #756211 - Attachment is obsolete: true
Attachment #756211 - Flags: review?(jduell.mcbugs)
Attachment #758729 - Flags: review?(jduell.mcbugs)
(In reply to Steve Workman [:sworkman] from comment #52)
> Created attachment 758729 [details] [diff] [review]
> v2.7 Support delivery of OnDataAvailable off the main thread
> 
> Includes some fixes to get all green on try.

Actually, only really one fix: removed return from mLoadGroup->RemoveRequest in nsHttpChannel::OnStopRequestCleanup. That stopped ReleaseListeners from being called and created a whole host of memory leaks.

Other changes are adding some thread checks and removing whitespace.
Updating patch; some minor fixes to get all green (one unrelated orange) on try.

https://tbpl.mozilla.org/?tree=Try&rev=6dcc8fda8f2d
Attachment #751896 - Attachment is obsolete: true
Attachment #758733 - Flags: review+
Comment on attachment 758729 [details] [diff] [review]
v2.7 Support delivery of OnDataAvailable off the main thread

Review of attachment 758729 [details] [diff] [review]:
-----------------------------------------------------------------

Good work! Very nice to have the test, too.

We're very close--if there's nothing significant to discuss about my comments you can mark the next patch +r w/o review.

::: netwerk/base/public/nsIInputStreamPump.idl
@@ +21,5 @@
>   * call the stream's AsyncWait method to drive the stream listener.  Otherwise,
>   * the stream will be read on a background thread utilizing the stream
>   * transport service.  More details are provided below.
>   */
> +[scriptable, uuid(f6567b72-533b-4682-a853-d04d6078f80c)]

don't change uuid since we're not changing the IDL, just tweaking a comment.

::: netwerk/base/src/nsInputStreamPump.cpp
@@ +131,5 @@
>          if (NS_FAILED(rv)) {
>              NS_ERROR("AsyncWait failed");
>              return rv;
>          }
> +        mRetargeted = false;

comment above here: 

  // Calling AsyncWait with the new mEventTarget completes any retargeting in progress

Also, rename to 'mRetargeting': retargeted implies "ever been retargeted", but the variable is only set during retargeting.

@@ +405,5 @@
> +            // EnsureWaiting() to switch to the new target now. If we're
> +            // suspended this will happen in Resume() instead.
> +            mState = nextState;
> +            mWaiting = false;
> +            return EnsureWaiting();

So... I notice that the block above this is really very similar to this code, except that this new code returns EnsureWaiting while the above block sets mStatus and continues the 'for' loop if it failed.  I believe the latter is correct--if for some reason AsyncWait fails we need to call OnStop, and I don't think your code would do that.

Also the if..else conditions here are getting really non-obvious to read.  Would you agree with me that this is cleaner and correct?

  bool stillTransferring = (nextState == STATE_TRANSFER) &&
                           (mState == STATE_TRANSFER);
  if (stillTransferring) {
      NS_ASSERTION(NS_SUCCEEDED(mStatus), "unexpected status");
  }

  // AsyncRead from stream if either 1) we're in TRANSFER and are not done
  // reading yet, or 2) we're retargeting the load to a different thread.
  if (!mSuspendCount && (stillTransferring || mRetargeting)) {
      mWaiting = false;
      mStatus = EnsureWaiting();
      if (NS_SUCCEEDED(mStatus))
          break;
      nextState = STATE_STOP;
  }

::: netwerk/base/src/nsInputStreamPump.h
@@ +73,5 @@
>      uint32_t                      mState;
>      nsCOMPtr<nsILoadGroup>        mLoadGroup;
>      nsCOMPtr<nsIStreamListener>   mListener;
>      nsCOMPtr<nsISupports>         mListenerContext;
> +    nsCOMPtr<nsIEventTarget>      mEventTarget;

nit: I'd keep the name 'mTargetThread': makes for less hg context changes, and it's still accurate (the nsIEvent will always be a thread, right?)

::: netwerk/base/src/nsStreamListenerTee.cpp
@@ +9,5 @@
> +NS_IMPL_THREADSAFE_ADDREF(nsStreamListenerTee)
> +NS_IMPL_THREADSAFE_RELEASE(nsStreamListenerTee)
> +
> +NS_INTERFACE_MAP_BEGIN(nsStreamListenerTee)
> +  NS_INTERFACE_MAP_ENTRY_AMBIGUOUS(nsISupports, nsIThreadRetargetableStreamListener)

ugh, this multiple IDL inheritance stuff is horrible.  Let's get rid of it by no longer making nsIThreadRetargetableStreamListener inherit from nsIStreamListener.  (Just as nsIThreadRetargetableRequest isn't inheriting from nsIRequest).

::: netwerk/base/src/nsStreamListenerWrapper.cpp
@@ +7,2 @@
>  
> +// Not using NS_IMPL_ISUPPORTS to deal with inheritance ambiguity.

FWIW this class didn't actually have multiple inheritance here so IMPL_ISUPPORTS works fine.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +3895,5 @@
>      }
>  
>      if (NS_FAILED(rv)) return rv;
> +    mListener = do_QueryInterface(tee, &rv);
> +    NS_ENSURE_SUCCESS(rv, rv);

Another reason to get rid of the multiple inheritance.  FWIW I would have used MOZ_ASSERT here since this couldn't have failed.

@@ +5138,5 @@
> +        OnStopRequestCleanup(status);
> +    } else {
> +        nsresult rv = NS_DispatchToMainThread(
> +            new OnStopRequestCleanupEvent(this, status));
> +        NS_ENSURE_SUCCESS(rv, rv);

If the dispatch fails (it basically never does and we'd be in a world of pain anyway, but still...) we're better off calling OnStopRequestCleanup on the main thread than just returning.  That way we at least deliver OnStopRequest.

@@ +5232,5 @@
> +        } else {
> +            nsresult rv = NS_DispatchToMainThread(
> +                new OnTransportStatusAsyncEvent(this, transportStatus,
> +                                                progress, progressMax));
> +            NS_ENSURE_SUCCESS(rv, rv);

Here it actually does something to return a failure code if dispatch fails (the pump will cancel the transaction).  yay

@@ +5275,5 @@
> +    nsresult rv = NS_OK;
> +    // If both cache pump and transaction pump exist, we're probably dealing
> +    // with partially cached content. So, we must be able to retarget both.
> +    // Note: Pumps are implemented as nsInputStreamPump objects, not
> +    // nsIInputStreamPump interfaces. Hence mix of nsCOMPtrs and nsRefPtrs.

don't need comment about mix of com/refPtrs..

@@ +5281,5 @@
> +    nsCOMPtr<nsIThreadRetargetableRequest> retargetableTransactionPump;
> +    if (mCachePump) {
> +        // Direct call to QueryInterface to avoid multiple inheritance issues.
> +        rv = mCachePump->QueryInterface(NS_GET_IID(nsIThreadRetargetableRequest),
> +                                        getter_AddRefs(retargetableCachePump));

yet another reason to ditch multiple inheritance!

@@ +5296,5 @@
> +        rv = retargetableTransactionPump->RetargetDeliveryTo(aNewTarget);
> +
> +        // If retarget fails for transaction pump, we must restore mCachePump.
> +        if (NS_FAILED(rv) && retargetableCachePump) {
> +            rv = retargetableCachePump->RetargetDeliveryTo(NS_GetCurrentThread());

Calling RetargetTo is only supported on main thread, so add an assert to that effect at top of the function, and use NS_GetMainThread() here.

::: netwerk/test/Makefile.in
@@ +54,5 @@
> +MOCHITEST_FILES = \
> +  partial_content.sjs \
> +  test_partially_cached_content.html \
> +  $(NULL)
> +

Let's put the mochitests in a separate directory ('netwerk/test/mochitests') instead of mixing them in with the C++ tests.  Yes, I know all other directories in the tree just call their mochitest directories 'test', but we also have xpcshell tests, so we're special.

::: netwerk/test/httpserver/httpd.js
@@ +3976,5 @@
>        // XXX assumes stream will always report the full amount of data available
> +      // Set "Content-Length" if not already set by request handler.
> +      if (!headers.hasHeader("Content-Length")) {
> +        headers.setHeader("Content-Length", "" + avail, false);
> +      }

Why do you need to modify httpd.js to always set content-length? We want to be able to test cases where servers don't pass it.  Both your test responses set Content-Length explcitly, so I don't see the reason for this change.

Also, there are six (!) copies of httpd.js in the tree (IIRC because windoze doesn't support symlinks), so some or all of them would need that fix (ask :ted if we really need to go there).

::: netwerk/test/partial_content.sjs
@@ +104,5 @@
> +
> +  DBG("totalLength: " + totalLength);
> +
> +  // Prepare common headers for the two responses.
> +  response.setHeader("Content-Length", "" + totalLength, false);

nit: the relevant RFCs say that Content-Length for a 206 response should be the length of the partial response, not the total length:

  http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.7

  http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p5-range-latest.html#status.206

::: netwerk/test/test_partially_cached_content.html
@@ +6,5 @@
> +  This test verifies that partially cached content is read from the cache first
> +  and then from the network. It is written in the mochitest framework to take
> +  thread retargeting into consideration of nsIStreamListener callbacks (inc.
> +  nsIRequestObserver). E.g. HTML5 Stream Parser requesting retargeting of
> +  nsIStreamListener callbacks to the parser thread.

Do we have any way to verify that the HTML parser actually winds up requesting and receiving its data off main thread? It would be great to guarantee that we never silently regress off-main thread delivery back to regular main-thread delivery and don't notice it.  (Perhaps a perf test could capture that too, but IIRC talos right now won't capture that because none of those tests clog the main thread's event queue, right?)

Let's ask henri if he can think of a way to do this quickly and easily (in a followup).  Perhaps we could set a bit somewhere and if we give the mochitest SpecialPowers it could read it.

@@ +25,5 @@
> +
> +
> +/* Check iframe loaded ok first time.
> + */
> +function onFrameLoadPartialContent(e) {

nit: I find the names a little confusing.  How about 'expectInitialContent' and then 'expectRemainderContent'?
Attachment #758729 - Flags: review?(jduell.mcbugs) → review-
Thanks Jason!

(In reply to Jason Duell (:jduell) from comment #55)
> Comment on attachment 758729 [details] [diff] [review]
> v2.7 Support delivery of OnDataAvailable off the main thread
> 
> Review of attachment 758729 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Good work! Very nice to have the test, too.
> 
> We're very close--if there's nothing significant to discuss about my
> comments you can mark the next patch +r w/o review.

Okie doke - I don't think there is any significant discussion to be had, so I'm marking this r+ and moving forward with landing. If I misjudged, I'll do follow-ups :)

> ::: netwerk/base/public/nsIInputStreamPump.idl
> @@ +21,5 @@
> > +[scriptable, uuid(f6567b72-533b-4682-a853-d04d6078f80c)]
> 
> don't change uuid since we're not changing the IDL, just tweaking a comment.

Done.

> ::: netwerk/base/src/nsInputStreamPump.cpp
> @@ +131,5 @@
> >          if (NS_FAILED(rv)) {
> >              NS_ERROR("AsyncWait failed");
> >              return rv;
> >          }
> > +        mRetargeted = false;
> 
> comment above here: 
> 
>   // Calling AsyncWait with the new mEventTarget completes any retargeting
> in progress

Done - different comment (cos I didn't see yours until I had written mine), but same meaning.

> Also, rename to 'mRetargeting': retargeted implies "ever been retargeted",
> but the variable is only set during retargeting.

OK, done.

> @@ +405,5 @@
> > +            // EnsureWaiting() to switch to the new target now. If we're
> > +            // suspended this will happen in Resume() instead.
> > +            mState = nextState;
> > +            mWaiting = false;
> > +            return EnsureWaiting();
> 
> So... I notice that the block above this is really very similar to this
> code, except that this new code returns EnsureWaiting while the above block
> sets mStatus and continues the 'for' loop if it failed.  I believe the
> latter is correct--if for some reason AsyncWait fails we need to call
> OnStop, and I don't think your code would do that.
> 
> Also the if..else conditions here are getting really non-obvious to read. 
> Would you agree with me that this is cleaner and correct?
> 
>   bool stillTransferring = (nextState == STATE_TRANSFER) &&
>                            (mState == STATE_TRANSFER);
>   if (stillTransferring) {
>       NS_ASSERTION(NS_SUCCEEDED(mStatus), "unexpected status");
>   }
> 
>   // AsyncRead from stream if either 1) we're in TRANSFER and are not done
>   // reading yet, or 2) we're retargeting the load to a different thread.
>   if (!mSuspendCount && (stillTransferring || mRetargeting)) {
>       mWaiting = false;
>       mStatus = EnsureWaiting();
>       if (NS_SUCCEEDED(mStatus))
>           break;
>       nextState = STATE_STOP;
>   }

I see what you're saying. I've done that and included a couple more assertions.

> ::: netwerk/base/src/nsInputStreamPump.h
> @@ +73,5 @@
> >      uint32_t                      mState;
> >      nsCOMPtr<nsILoadGroup>        mLoadGroup;
> >      nsCOMPtr<nsIStreamListener>   mListener;
> >      nsCOMPtr<nsISupports>         mListenerContext;
> > +    nsCOMPtr<nsIEventTarget>      mEventTarget;
> 
> nit: I'd keep the name 'mTargetThread': makes for less hg context changes,
> and it's still accurate (the nsIEvent will always be a thread, right?)

Yeah, I guess. Not too important to keep it.

> ::: netwerk/base/src/nsStreamListenerTee.cpp
> @@ +9,5 @@
> > +NS_IMPL_THREADSAFE_ADDREF(nsStreamListenerTee)
> > +NS_IMPL_THREADSAFE_RELEASE(nsStreamListenerTee)
> > +
> > +NS_INTERFACE_MAP_BEGIN(nsStreamListenerTee)
> > +  NS_INTERFACE_MAP_ENTRY_AMBIGUOUS(nsISupports, nsIThreadRetargetableStreamListener)
> 
> ugh, this multiple IDL inheritance stuff is horrible.  Let's get rid of it
> by no longer making nsIThreadRetargetableStreamListener inherit from
> nsIStreamListener.  (Just as nsIThreadRetargetableRequest isn't inheriting
> from nsIRequest).

Right - it IS horrible. I'll change the inheritance.

One thing, though. I had to change nsDocumentOpenInfo pointers to be nsRefPtrs. The following suggests it's actually preferable (and you've said it yourself for concrete types, nsRefPtr is better) and works around multiple interface inheritance. Also seems that mixing is ok too - I know I used that in nsHttpChannel - https://developer.mozilla.org/en-US/docs/nsRefPtr

> ::: netwerk/base/src/nsStreamListenerWrapper.cpp
> @@ +7,2 @@
> >  
> > +// Not using NS_IMPL_ISUPPORTS to deal with inheritance ambiguity.
> 
> FWIW this class didn't actually have multiple inheritance here so
> IMPL_ISUPPORTS works fine.

Opps. Yeah. Trigger happy.

> ::: netwerk/protocol/http/nsHttpChannel.cpp
> @@ +3895,5 @@
> >      }
> >  
> >      if (NS_FAILED(rv)) return rv;
> > +    mListener = do_QueryInterface(tee, &rv);
> > +    NS_ENSURE_SUCCESS(rv, rv);
> 
> Another reason to get rid of the multiple inheritance.  FWIW I would have
> used MOZ_ASSERT here since this couldn't have failed.

Done.

> @@ +5138,5 @@
> > +        OnStopRequestCleanup(status);
> > +    } else {
> > +        nsresult rv = NS_DispatchToMainThread(
> > +            new OnStopRequestCleanupEvent(this, status));
> > +        NS_ENSURE_SUCCESS(rv, rv);
> 
> If the dispatch fails (it basically never does and we'd be in a world of
> pain anyway, but still...) we're better off calling OnStopRequestCleanup on
> the main thread than just returning.  That way we at least deliver
> OnStopRequest.

This was discussed over email with biesi and bz.
-- loadGroup->RemoveRequest can't be called off main thread
-- dispatch failing is likely during shutdown, so it doesn't really matter

So, while it would be nice to handle this error case by calling OnStopRequestCleanup off the main thread, we can't do it, and it shouldn't matter too much anyway.

> @@ +5232,5 @@
> > +        } else {
> > +            nsresult rv = NS_DispatchToMainThread(
> > +                new OnTransportStatusAsyncEvent(this, transportStatus,
> > +                                                progress, progressMax));
> > +            NS_ENSURE_SUCCESS(rv, rv);
> 
> Here it actually does something to return a failure code if dispatch fails
> (the pump will cancel the transaction).  yay
> 
> @@ +5275,5 @@
> > +    nsresult rv = NS_OK;
> > +    // If both cache pump and transaction pump exist, we're probably dealing
> > +    // with partially cached content. So, we must be able to retarget both.
> > +    // Note: Pumps are implemented as nsInputStreamPump objects, not
> > +    // nsIInputStreamPump interfaces. Hence mix of nsCOMPtrs and nsRefPtrs.
> 
> don't need comment about mix of com/refPtrs..

Ok - comment removed.

> @@ +5281,5 @@
> > +    nsCOMPtr<nsIThreadRetargetableRequest> retargetableTransactionPump;
> > +    if (mCachePump) {
> > +        // Direct call to QueryInterface to avoid multiple inheritance issues.
> > +        rv = mCachePump->QueryInterface(NS_GET_IID(nsIThreadRetargetableRequest),
> > +                                        getter_AddRefs(retargetableCachePump));
> 
> yet another reason to ditch multiple inheritance!

This one is awkward and I don't think it can go, even with the change to nsIThreadRetargetableStreamListener's inheritance. mCachePump and mTransactionPump are of type nsRefPtr<nsInputStreamPump>, so the compiler complains "error: ambiguous conversion from derived class 'nsInputStreamPump' to base class 'nsISupports'". If they were nsCOMPtrs, it would be fine - it can deal with multiple interface inheritance. But they're not. So, this is a little ugly, but unless we change the pumps to nsCOMPtrs, and add PeekStream to the interface … which I don't think we want to do right? Is this an ugliness we can live with?

> @@ +5296,5 @@
> > +        rv = retargetableTransactionPump->RetargetDeliveryTo(aNewTarget);
> > +
> > +        // If retarget fails for transaction pump, we must restore mCachePump.
> > +        if (NS_FAILED(rv) && retargetableCachePump) {
> > +            rv = retargetableCachePump->RetargetDeliveryTo(NS_GetCurrentThread());
> 
> Calling RetargetTo is only supported on main thread, so add an assert to
> that effect at top of the function, and use NS_GetMainThread() here.

Done.

> ::: netwerk/test/Makefile.in
> @@ +54,5 @@
> > +MOCHITEST_FILES = \
> > +  partial_content.sjs \
> > +  test_partially_cached_content.html \
> > +  $(NULL)
> > +
> 
> Let's put the mochitests in a separate directory ('netwerk/test/mochitests')
> instead of mixing them in with the C++ tests.  Yes, I know all other
> directories in the tree just call their mochitest directories 'test', but we
> also have xpcshell tests, so we're special.

Done. And indeed, Necko is special.

> ::: netwerk/test/httpserver/httpd.js
> @@ +3976,5 @@
> >        // XXX assumes stream will always report the full amount of data available
> > +      // Set "Content-Length" if not already set by request handler.
> > +      if (!headers.hasHeader("Content-Length")) {
> > +        headers.setHeader("Content-Length", "" + avail, false);
> > +      }
> 
> Why do you need to modify httpd.js to always set content-length? We want to
> be able to test cases where servers don't pass it.  Both your test responses
> set Content-Length explcitly, so I don't see the reason for this change.

Existing code always sets Content-Length. I changed it so that it is set if it is not already present, i.e. the response handler in the sis hasn't already added one. The snippet above is not the same as the snippet in my diff - there is a missing line...

-      headers.setHeader("Content-Length", "" + avail, false);

> Also, there are six (!) copies of httpd.js in the tree (IIRC because windoze
> doesn't support symlinks), so some or all of them would need that fix (ask
> :ted if we really need to go there).

Fun. I emailed him for more info:

Hi Steve,

AFAIK only the httpd.js you pointed out is used for the tests you'd be
concerned with (Mochitest/Reftest/xpcshell tests). The other copies look
to be part of Mozmill and Jetpack tests, which I don't think you need to
worry about.

-Ted

So, no need to propagate to the other httpd.js files. Yay!

> ::: netwerk/test/partial_content.sjs
> @@ +104,5 @@
> > +
> > +  DBG("totalLength: " + totalLength);
> > +
> > +  // Prepare common headers for the two responses.
> > +  response.setHeader("Content-Length", "" + totalLength, false);
> 
> nit: the relevant RFCs say that Content-Length for a 206 response should be
> the length of the partial response, not the total length:
> 
>   http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.7
> 
>  
> http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p5-range-latest.
> html#status.206

Ah, ok. I updated the test. First load will still respond with a 200, with only the first bytes, but will set Content-Length to the full length of the resource (to simulate a partial download, but not a range request). Second load will respond with a 206 with the remaining bytes and Content-Length set to the remainder's length.

> ::: netwerk/test/test_partially_cached_content.html
> @@ +6,5 @@
> > +  This test verifies that partially cached content is read from the cache first
> > +  and then from the network. It is written in the mochitest framework to take
> > +  thread retargeting into consideration of nsIStreamListener callbacks (inc.
> > +  nsIRequestObserver). E.g. HTML5 Stream Parser requesting retargeting of
> > +  nsIStreamListener callbacks to the parser thread.
> 
> Do we have any way to verify that the HTML parser actually winds up
> requesting and receiving its data off main thread? It would be great to
> guarantee that we never silently regress off-main thread delivery back to
> regular main-thread delivery and don't notice it.  (Perhaps a perf test
> could capture that too, but IIRC talos right now won't capture that because
> none of those tests clog the main thread's event queue, right?)

Right re talos - it doesn't show anything for this change.

> Let's ask henri if he can think of a way to do this quickly and easily (in a
> followup).  Perhaps we could set a bit somewhere and if we give the
> mochitest SpecialPowers it could read it.

Sounds good for a followup.

> @@ +25,5 @@
> > +
> > +
> > +/* Check iframe loaded ok first time.
> > + */
> > +function onFrameLoadPartialContent(e) {
> 
> nit: I find the names a little confusing.  How about 'expectInitialContent'
> and then 'expectRemainderContent'?

Indeed. Changed to 'expectInitialContent' and 'expectFullContent', since all the content, not just the remainder is expected after the second load.
Attachment #758729 - Attachment is obsolete: true
Attachment #762146 - Flags: review+
Minor updates to deal with changes to underlying patches.
Attachment #758733 - Attachment is obsolete: true
Attachment #762149 - Flags: review+
Blocks: 882803
(In reply to Steve Workman [:sworkman] from comment #56)
> (In reply to Jason Duell (:jduell) from comment #55)
>
> > Do we have any way to verify that the HTML parser actually winds up
> > requesting and receiving its data off main thread? It would be great to
> > guarantee that we never silently regress off-main thread delivery back to
> > regular main-thread delivery and don't notice it.  (Perhaps a perf test
> > could capture that too, but IIRC talos right now won't capture that because
> > none of those tests clog the main thread's event queue, right?)
> 
> Right re talos - it doesn't show anything for this change.
> 
> > Let's ask henri if he can think of a way to do this quickly and easily (in a
> > followup).  Perhaps we could set a bit somewhere and if we give the
> > mochitest SpecialPowers it could read it.
> 
> Sounds good for a followup.

Bug 882803 - Investigate a test to confirm Off Main Thread ODA for nsHtml5StreamParser
> > + // Direct call to QueryInterface to avoid multiple inheritance issues.
> > + rv = mCachePump->QueryInterface(NS_GET_IID(nsIThreadRetargetableRequest),
> > +                                 getter_AddRefs(retargetableCachePump));
> 
> This one is awkward and I don't think it can go, even with the change to
> nsIThreadRetargetableStreamListener's inheritance. mCachePump and
> mTransactionPump are of type nsRefPtr<nsInputStreamPump>, so the compiler
> complains "error: ambiguous conversion from derived class 'nsInputStreamPump'
> to base class 'nsISupports'".

Hmm. There's usually a way around this sort of thing.  Ask :bent or :jst if you want to know.
(In reply to Jason Duell (:jduell) from comment #61)
> > > + // Direct call to QueryInterface to avoid multiple inheritance issues.
> > > + rv = mCachePump->QueryInterface(NS_GET_IID(nsIThreadRetargetableRequest),
> > > +                                 getter_AddRefs(retargetableCachePump));
> > 
> > This one is awkward and I don't think it can go, even with the change to
> > nsIThreadRetargetableStreamListener's inheritance. mCachePump and
> > mTransactionPump are of type nsRefPtr<nsInputStreamPump>, so the compiler
> > complains "error: ambiguous conversion from derived class 'nsInputStreamPump'
> > to base class 'nsISupports'".
> 
> Hmm. There's usually a way around this sort of thing.  Ask :bent or :jst if
> you want to know.

There's no super clean way around it, you'll need to somehow tell the compiler which version of nsISupports you want here, and the easiest way to do that is to simply cast to the version you want, i.e. static_cast<nsIThreadRetargetableRequest>(mCachePump.get()) should be close in this case. Another alternative is that you write an inline ToISupports() helper in the concrete class that does the cast for you and returns an nsISupports *.
Whiteboard: [leave open]
> Hmm. There's usually a way around this sort of thing.

  retargetableCachePump = do_QueryObject(mCachePump);

should do the trick.
Follow-up to avoid changing httpd.js.

-- Reverts httpd.js so that Content-Length is always set, unless js/sjs has called response.seizePower().
-- Adjusts partial_content.sjs to call response.seizePower() and writes out headers using .write and not setHeader.
-- (Minor) Adds some checks in test_partially_cached_content.js to avoid exceptions.
Attachment #762423 - Flags: review?(jduell.mcbugs)
Ooo ... do_QueryObject - nice.
Attachment #762431 - Flags: review?(jduell.mcbugs)
Blocks: 882996
Comment on attachment 762423 [details] [diff] [review]
v1.0 Adjust partial content test and revert httpd.js

Review of attachment 762423 [details] [diff] [review]:
-----------------------------------------------------------------

https://hg.mozilla.org/integration/mozilla-inbound/rev/26c1d80edf1f
Attachment #762423 - Flags: review?(jduell.mcbugs) → review+
Comment on attachment 762431 [details] [diff] [review]
v1.0 Replace some QueryInterface calls in nsHttpChannel with do_QueryObject

Review of attachment 762431 [details] [diff] [review]:
-----------------------------------------------------------------

https://hg.mozilla.org/integration/mozilla-inbound/rev/43223a927976
Attachment #762431 - Flags: review?(jduell.mcbugs) → review+
>  retargetableCachePump = do_QueryObject(mCachePump);

Good to know.  I updated the nsRefPtr docs:

  https://developer.mozilla.org/en-US/docs/nsRefPtr

This bug can close once these last patches reach m-c.  Yay!
Whiteboard: [leave open]
https://hg.mozilla.org/mozilla-central/rev/26c1d80edf1f
https://hg.mozilla.org/mozilla-central/rev/43223a927976
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Depends on: 882913
Depends on: 883607
Depends on: 883541
Depends on: 883779
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Thanks, Ryan!
Blocks: 867755
Looks like the bug was caused by nsInputStreamPump::Cancel managing to arrange for its HttpChannel to release the last reference to it, at which points you're operating on deleted memory. Just jumping to the end of EnsureWaiting allows the stack to unwind and the browser to continue, so one workaround is to include a death grip in Cancel.
Attachment #768431 - Attachment description: Dispatch nsInputStreamPump::OnStateStopCleanup to main thread to avoid crashing on null mAsyncStream → v1.0 Dispatch nsInputStreamPump::OnStateStopCleanup to main thread to avoid crashing on null mAsyncStream
Comment on attachment 768430 [details] [diff] [review]
v1.0 Add main thread assertions and null check to avoid crashing on null HttpBaseChannel::mProgressSink

Review of attachment 768430 [details] [diff] [review]:
-----------------------------------------------------------------

+r with nit fixed.

::: netwerk/protocol/http/HttpBaseChannel.cpp
@@ +198,5 @@
>  
>  NS_IMETHODIMP
>  HttpBaseChannel::SetLoadGroup(nsILoadGroup *aLoadGroup)
>  {
> +  MOZ_ASSERT(NS_IsMainThread(), "Should only be called on the main thread.");

nit: we traditionally throw a blank line after whatever section of ASSERTS, ENSURE_ARG etc checks we have at the beginning of a function.  Ditto for other uses.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +5303,5 @@
>  NS_IMETHODIMP
>  nsHttpChannel::OnTransportStatus(nsITransport *trans, nsresult status,
>                                   uint64_t progress, uint64_t progressMax)
>  {
> +    MOZ_ASSERT(NS_IsMainThread(), "Should be on main thread only");

blank line after
Attachment #768430 - Flags: review?(jduell.mcbugs) → review+
Comment on attachment 768431 [details] [diff] [review]
v1.0 Dispatch nsInputStreamPump::OnStateStopCleanup to main thread to avoid crashing on null mAsyncStream

Review of attachment 768431 [details] [diff] [review]:
-----------------------------------------------------------------

Neil--thanks for the pointer!
Attachment #768431 - Flags: review?(jduell.mcbugs) → review+
Comment on attachment 768432 [details] [diff] [review]
v1.0 Release nsHttpChannel::mDNSPrefetch, mTransaction and mTransactionPump on the main thread to avoid null ptr crashes

Review of attachment 768432 [details] [diff] [review]:
-----------------------------------------------------------------

+r with nit fixed.

Go ahead and just fix the nits in a big roll up patch with all the /netwerk bits, mark it +r, and let's land it with the HTML parser patch and see what happens :)

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +5056,5 @@
> +        // Release in OnStopRequestCleanupEvent if off main thread.
> +        if (NS_IsMainThread()) {
> +            mTransaction = nullptr;
> +            mTransactionPump = nullptr;
> +        }

You'll null these out in OnStopRequestCleanup.  No need to do it twice (and the timing of when it happens isn't critical).  So I think you can just ditch this.

Change the comments about "no longer needing transaction/prefetch" to 

  // get timings from transaction/prefetch

and just let nullification happen in the cleanup function.

@@ +5068,5 @@
>          }
> +        // Release in OnStopRequestCleanupEvent if off main thread.
> +        if (NS_IsMainThread()) {
> +            mDNSPrefetch = nullptr;
> +        }

This goes away too.
Comment on attachment 768432 [details] [diff] [review]
v1.0 Release nsHttpChannel::mDNSPrefetch, mTransaction and mTransactionPump on the main thread to avoid null ptr crashes

Review of attachment 768432 [details] [diff] [review]:
-----------------------------------------------------------------

+r with nit fixed.

Go ahead and just fix the nits in a big roll up patch with all the /netwerk bits, mark it +r, and let's land it with the HTML parser patch and see what happens :)

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +5056,5 @@
> +        // Release in OnStopRequestCleanupEvent if off main thread.
> +        if (NS_IsMainThread()) {
> +            mTransaction = nullptr;
> +            mTransactionPump = nullptr;
> +        }

You'll null these out in OnStopRequestCleanup.  No need to do it twice (and the timing of when it happens isn't critical).  So I think you can just ditch this.

Change the comments about "no longer needing transaction/prefetch" to 

  // get timings from transaction/prefetch

and just let nullification happen in the cleanup function.

@@ +5068,5 @@
>          }
> +        // Release in OnStopRequestCleanupEvent if off main thread.
> +        if (NS_IsMainThread()) {
> +            mDNSPrefetch = nullptr;
> +        }

This goes away too.
Attachment #768432 - Flags: review?(jduell.mcbugs) → review+
Thanks Jason! I'll roll these up, land them and see what happens.
After some discussion, it has become apparent that OnStopRequest should be called on the main thread, since OnProgress/OnStatus should be called before OnStopRequest ends.

This patch makes the necessary changes/reversions to nsHttpChannel and nsInputStreamPump.
Attachment #769202 - Flags: review?(jduell.mcbugs)
Reverted changes to nsHtml5StreamParser::OnStopRequest, and removed the need for m_targetStreamListener to be an nsMainThreadPtrHandle.
Attachment #769205 - Flags: review?(jduell.mcbugs)
Comment on attachment 769202 [details] [diff] [review]
v1.0 Force OnStopRequest to be called on the main thread

Review of attachment 769202 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.  Let's give it a go. 

Now that we're explcitly assuming we use main thread for OnStart/Stop, let's throw in a MOZ_ASSERT(isMainThread) where we set 

  mTargetThread = do_GetCurrentThread();

::: netwerk/base/src/nsInputStreamPump.cpp
@@ +408,5 @@
> +        if (nextState == STATE_STOP) {
> +            mRetargeting = true;
> +            nsCOMPtr<nsIThread> mainThread;
> +            nsresult rv = NS_GetMainThread(getter_AddRefs(mainThread));
> +            NS_ENSURE_SUCCESS(rv, rv);

None of the OnInputStreamReady callers check error returns.  No one can hear you scream :)    

Just leave out error checking--GetMainThread and the QI should never fail.

@@ +410,5 @@
> +            nsCOMPtr<nsIThread> mainThread;
> +            nsresult rv = NS_GetMainThread(getter_AddRefs(mainThread));
> +            NS_ENSURE_SUCCESS(rv, rv);
> +            mTargetThread = do_QueryInterface(mainThread);
> +            NS_ENSURE_TRUE(mTargetThread, NS_ERROR_NULL_POINTER);

ditto

@@ +417,2 @@
>          // Wait asynchronously if there is still data to transfer, or if
>          // delivery of data has been requested on another thread.

Change comment to

  // Wait asynchronously if there is still data to transfer, or we're switching event delivery to another thread.

::: netwerk/base/src/nsInputStreamPump.h
@@ +64,5 @@
>       */
>      void OnStateStopCleanup();
>  
> +    /**
> +     * Called on the main thread if EnsureWaiting fails.

"so that we always call OnStopRequest on main thread."
Attachment #769202 - Flags: review?(jduell.mcbugs) → review+
Attachment #769205 - Flags: review?(jduell.mcbugs) → review+
Attachment #762146 - Attachment is obsolete: true
Attachment #762423 - Attachment is obsolete: true
Attachment #762431 - Attachment is obsolete: true
Attachment #768430 - Attachment is obsolete: true
Attachment #768431 - Attachment is obsolete: true
Attachment #768432 - Attachment is obsolete: true
Attachment #769202 - Attachment is obsolete: true
Attachment #772431 - Flags: review+
Attachment #762149 - Attachment is obsolete: true
Attachment #769205 - Attachment is obsolete: true
Attachment #772433 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/57ffaa460a6b
https://hg.mozilla.org/mozilla-central/rev/056043e8570d
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Depends on: 891347
Steve: It's scary that onDataAvailable and onStopRequest goes to two different threads.

For example in workers we would like to send onDataAvailable directly to the worker thread when doing network requests from there. But if onStopRequest is going to the main thread, and then the main thread forwards that signal to the worker thread, is there anything that guarantees that the worker thread won't get the onStopRequest before the last onDataAvailable?
Depends on: 896241
(In reply to Jonas Sicking (:sicking) from comment #90)
> ... is there anything that guarantees that the worker
> thread won't get the onStopRequest before the last onDataAvailable?

Yes - this should not happen. OnStopRequest should only be called/dispatched to the main thread AFTER the last OnDataAvailable returns. If you want to look at the code, the main loop is in nsInputStreamPump::OnInputStreamReady.
Blocks: 897904
No longer depends on: 897904
So the intermittent crashes in bug 904474, bug 883779, and bug 896241 are all probably caused by this bug.  But their crash frequency is fairly low (a couple a day, and this patch is now on nightly, aurora, and beta).  I'd like an opinion on whether that's enough crashes to merit backing this out (or disabling use of off-main thread OnData delivery) or not.
Depends on: 904474
Flags: needinfo?(ryanvm)
I would like to do *something* for this at least on beta since Fx24 is the next ESR and we'll be living with it for the next year roughly.
Flags: needinfo?(ryanvm)
(In reply to Jason Duell (:jduell) from comment #92)
> So the intermittent crashes in bug 904474, bug 883779, and bug 896241 are
> all probably caused by this bug.  But their crash frequency is fairly low (a
> couple a day, and this patch is now on nightly, aurora, and beta).  I'd like
> an opinion on whether that's enough crashes to merit backing this out (or
> disabling use of off-main thread OnData delivery) or not.

Do we understand why this is happening? If so, I'd imagine that there should be a lower-risk fix than a backout.
Bug 904474 sure looks like us possibly doing OnStopRequest on a background thread?
Yes, we're doing OnStopRequest sometimes on a non-main thread, which shouldn't happen even when you've asked for OnDataAvailable off-main (we should still always deliver OnStop on the main thread, and after the last OnData has been called).

We don't really know at this point why this happens (and only rarely).  I suspect it's just some rarely trodden codepath.  We could see if turning off the HTML parser's use of off-main ODA stop the crashes (it's the only client at the moment).  If we're really unlucky we may have to back the whole patch out to stop the crashes.
> Yes, we're doing OnStopRequest sometimes on a non-main thread

That's pretty bad.  The stack in bug 904474 is definitely operating on refcounted non-threadsafe objects, so if someone figures out how to trigger this reliably they're likely to be able to exploit it...
Steve points out that for now we can simply redispatch OnStop to the main thread if we detect we're about to dispatch it on another thread.  That'll fix the crashes, and buy us some time to figure out the underlying issue.  Seems preferable to backing out the entire patch at this point.  We'll open a new bug for the redispatch patch.
(In reply to Jason Duell (:jduell) from comment #98)
> Steve points out that for now we can simply redispatch OnStop to the main
> thread if we detect we're about to dispatch it on another thread.  That'll
> fix the crashes, and buy us some time to figure out the underlying issue. 
> Seems preferable to backing out the entire patch at this point.  We'll open
> a new bug for the redispatch patch.

Bug 913151 - Always call nsInputStreamPump::OnStateTransfer on the main thread
Depends on: 960519
Target Milestone: mozilla24 → mozilla25
You need to log in before you can comment on or make changes to this bug.