Closed Bug 975338 Opened 10 years ago Closed 10 years ago

e10s: provide way to "Divert" a channel from child listener to one on parent

Categories

(Core :: Networking, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: jduell.mcbugs, Assigned: sworkman)

References

Details

Attachments

(6 files, 26 obsolete files)

6.77 KB, text/plain
mayhemer
: feedback+
Details
5.71 KB, patch
Details | Diff | Splinter Review
5.05 KB, patch
sworkman
: review+
Details | Diff | Splinter Review
51.82 KB, patch
jduell.mcbugs
: review+
Details | Diff | Splinter Review
27.57 KB, patch
sworkman
: review+
Details | Diff | Splinter Review
38.71 KB, patch
jduell.mcbugs
: review+
Details | Diff | Splinter Review
See bug 946239 comment 16 through 18.

The goal: during OnStartRequest, a new channel.DivertToParent() method can be called, which returns a new IPDL object of type PChannelDiverter. Clients are responsible for calling their own IPDL method with the PChannelDiverter as an argument.  On the parent, PChannelDiverterParent.Divert(nsIStreamListener) can be called, and magically OnStart/OnData/OnStop will all be called as if the channel were newly opened.

We need to do this for both FTP and HTTP (since either protocol could be used for a download).

I've attached a text file (with ASCII art for message passing!) with the architecture.  Asking honza for feedback since he always seems to find flaws in my e10s proposals :)  Honza, if you're too busy, maybe ask jdm.  Steve you should also look this over and see if it makes sense to you since you probably get to code this :O
Attachment #8379597 - Flags: feedback?(honzab.moz)
Honza: this is a B2G 1.4 blocker so it's fairly urgent (we only have 2 weeks).
A couple of additional design points:

- The ChildChannel should be considered essentially dead after DivertToParent() is called.  Let's start off by enforcing that ruthlessly--MOZ_ABORT if virtually anything (Suspend, Cancel, definitely anything else that would normally send IPDL to the parent).  We'll see if we can get away with that, and the aborts will tell us if there are use cases we need to support otherwise.  (I think the design can already handle the channel being suspended and resumed before/after DivertToParent, but if we can get away with banning it, I say let's go for it).

- I didn't mention when the new IPDL PChannelDiverter gets destroyed.  Looks like it can be right after ChannelDiverterParent::DivertTo() completes.  It may be easiest to just let the client code handle destroying it.  I'd guess it doesn't need to be refcounted.
Comment on attachment 8379597 [details]
Architecture description

f+ by means of "feedback done", not "granted" :) since there are questions.

Overall the algorithm is IMO correct in general, only some details IMO could be simplified.


># :vim set tw=999:
>
>I occasionally just refer to HTTP classes here, but generally I mean {HTTP or FTP}
>
>PARENT      IPDL                          CHILD
>--------------------------------------------------------------------------------
>            --> OnStartRequest -->        Decide to Divert to parent
>                                          nsIDivertableChannel dc = QI(channel)
>                                          [hmm, can't return IPDL actors from an IDL: 

Even when defined as a raw ptr?  We can return PRFileDesc, so why not an IPDL actor?

But in the first place, why do you actually need a new IPDL for this?  Why there is PChannelDiverter?  Cannot HttpChannel* implement this directly?  Or am I missing something?


>                                           - may need to do static_cast<> here to get concrete C++ class]
>                                          ChannelDiverterChild divertChild = dc.DivertToParent();
>                                            - DivertToParent() internals:
>                                              - suspend() called on channel 
>                                                - new version that doesn't send Suspend msg to parent,
>                                                  just sets mSuspended in child channel

This sounds like it could be exposed on the nsIChildChannel interface...  I'd call this un/blockDataIPDL().

Or, the whole Divert stuff ("from nsIDivertableChannel dc = QI(channel)" to "call PChannelDiverter constructor" might be hidden in nsIChildChannel.divertToParent(), that can be unimplemented only for HTTP and FTP child channels.  When it fails, do nothing and behave as usual.

>                                              - call PChannelDiverter constructor
>                                                - takes a union argument: { PHttpChannel or PFtpChannel }
>
>          <-- PChannelDiverter <--
>
>- PChannelDiverter constructor on parent side 
>  - has HttpParent pointer: call Suspend() on it.

How does PChannelDiverter reach the HttpParent ?

And how do we reach the target listener on the parent process? (which is ExternalHelperAppParent, IIUC)


>    - do Suspend here rather than using Child's Suspend
>      because that might not go to main thread once we're
>      using PBackground and we'd have a race with this IPDL msg.

Aren't you on the parent side here ? :)

>  - TODO: security issues? Think about attacks
>    - should be hard to fake/access a PHttpChannel that your own process didn't create.
>
>                                              - set an mDivertedFlag
>                                              - return ChannelDiverterChild

To whom?

>
>          --> OnDataAvailable...          // have been getting delivered (may be delivered on IPDL
>          --> possibly OnStopRequest      // thread before OnStart is even called) until we Suspend on parent.
>                                          // -- wind up first in IPDL queue, then will be stored in
>                                          // channelChild.ChannelEventQueue because it's suspended.
>
>
>                                          // up to client code to send some sort of IPDL msg to parent 
>                                          // that has PChannelDiverter as an argument
>                                          // -- require that IPDL has to arrive on main thread in parent
>                                          //    so no race with suspend.
>            <-- SomeClientMsg(Diverter)
>
>RecvSomeClientMsg: client code on parent calls
>DiverterParent->DivertTo(newListener)
>  - DivertTo() internals
>  - (Need to add a pointer from HttpChannelParent
>    to HttpChannelParentListener during AsyncOpen)
>    ParentListener->Divert
>      - sets mListener (maybe unsets mActiveChannel?)
>      - could set some mDiverted var if needed
>
>  - Call newListener->OnStartRequest(mChannel, null)
>    - note: we need to add some new channel->FakePending
>      method: if channel already hit OnStop, it will be marked
>      not pending, but during onStart Pending must be true,
>      so fake it during the method call.

Better wrap the channel in a class that overloads the IsPending() value?  I never like adding these single-purpose methods...  Overload by aggregation is better.

>    - Let's explicitly not allow ListenerContext to be set
>      for any channel that's diverted, or at least say that
>      we're going to pass null for it on the parent. IIRC almost
>      nothing uses context anyway.

Why do you actually consider this?

>
>  - Send a new PHttpChannel->Flushed message to child
>        
>          --> Flushed -->                 // Lets child know that no more OnData/OnStop will be arriving.
>                                          // Gets queued in ChannelEventQueue, after any ODA/OnStop msgs.
>
>  - Send a special DivertMessages IPDL msg
>
>          --> DivertMessages              // unlike all other IPDL msgs to ChannelChild, this one does 
>                                          // NOT get queued if we're suspended (and it must only arrive
>                                          // when we're suspended: ASSERT that). Instead it gets dispatched
>                                          // immediately, out of order.
>                                          DivertMessages internals
>                                            - resume() (but hack resume so it will never send Resume 
>                                              IPDL msg to parent if mDiverted=true, just resume dispaching
>                                              msgs on child.
>                                              - if something else has called Suspend() in the meantime, honor
>                                                that and wait for regular resume to happen. 
>                                          When channel resume hits 0 and we resume:
>                                            - Change ChannelEventQueue to know about mDiverted, and if true,
>                                              send ODA/OnStop back to child insted of delivering on child

wanted to say "back to parent" instead of back to child?

>
>          <-- DivertOnData(payload)       // new P{Http|Ftp}Channel methods
>          <-- DivertOnStop(payload)       // onStop may or may not get called (depends if we saw it on child)
>
>HttpChannelParent dispatches these to HttpChannelParentListener
>- HttpChannelParentListener calls mListener>OnData/OnStop 
>
>                                          // when Flushed message is dispatched from ChannelEventQueue
>                                          // we send a message to parent saying that we're done
>
>         <-- DivertComplete               // This could also possibly be __delete__?
>
>- if DivertOnStop was called, we're done
>- else mChannel.Resume() called:
>  - from now on OnData/OnStop go from mChannel -> HttpChannelParentListener -> mListener
>- (actually I guess call Resume either way: don't want to leave even a completed channel
>   in suspended state)

Definitely!

>- shut down PHttpChannel IPDL connection 
>
>                      --> __delete__ -->
>                                          // HttpChannelChild drops refcount, will go away normally
>                                          // We should null out ChildChannel's
>                                          // mListener/mCallbacks/mListenerContext at some point: maybe during 
>                                          // DivertToParent. If not then, do it now.
>                                          // We need to document that once DivertToParent called, we don't call 
>                                          // OnData/OnStop, and the listener's refcount is dropped.
>
>// PHttpChannelListener stays alive--refcounted and kept alive by mChannel

What's PHttpChannelListener?  Wanted to type HttpChannelParentListener ?

>// like any normal listener until OnStop is completed.
>// Just forwards all OnData/OnStop to mListener.
>
>// All very similar for FTP, expect there's no ChannelParentListener class, so it's FTPChannelParent
>// that's the listener for mChannel (and will be kept alive by mChannel ref even after IPDL destroyed)
>
Attachment #8379597 - Flags: feedback?(honzab.moz) → feedback+
Jason, correct me if I'm wrong in my responses here. I think I have a handle on your design, but I could be wrong with some things.

(In reply to Honza Bambas (:mayhemer) from comment #3)
> >                                              - call PChannelDiverter constructor
> >                                                - takes a union argument: { PHttpChannel or PFtpChannel }
> >
> >          <-- PChannelDiverter <--
> >
> >- PChannelDiverter constructor on parent side 
> >  - has HttpParent pointer: call Suspend() on it.
> 
> How does PChannelDiverter reach the HttpParent ?

NeckoParent::RecvPChannelDiverterParentConstuctor should have a PHttpChannel actor param, which (I think!) is the parent in the parent process.

> And how do we reach the target listener on the parent process? (which is
> ExternalHelperAppParent, IIUC)

That happens after we have created the diverter parent. The client (ExternalAppHelperChild) needs the child actor to be created before it can call its own SendSomeDivertMsg(diverterChild). Then in clientParent::RecvSomeDivertMsg it tells diverterParent actor where to divert.

> >    - do Suspend here rather than using Child's Suspend
> >      because that might not go to main thread once we're
> >      using PBackground and we'd have a race with this IPDL msg.
> 
> Aren't you on the parent side here ? :)

Yes, but I think Jason means that when the child's suspend is called earlier, we don't forward it to the parent. The point of the earlier suspend is just to block the queue on the child side. Then, at this point in the parent we block its queue.

> >  - TODO: security issues? Think about attacks
> >    - should be hard to fake/access a PHttpChannel that your own process didn't create.
> >
> >                                              - set an mDivertedFlag
> >                                              - return ChannelDiverterChild
> 
> To whom?

To the child. After gNeckoChild->SendPChannelDiverterChildConstructor is called, it should return a pointer to the ChannelDiverterChild actor.
In addition to Steve's comments:

>>  can't return IPDL actors from an IDL
> 
> Even when defined as a raw ptr?  

Maybe you're right.  I don't know exactly what you mean by a raw ptr though.  Just forward declare the IPDL type and we can declare it in IDL?

> why do you actually need a new IPDL for this?  Why there is PChannelDiverter?  
> Cannot HttpChannel* implement this directly?

2 reasons, neither of which may be right or good :)

1) the channel can be either FTP or HTTP, and there's no way IIRC to have an inheritance chain in IPDL, and so no way to write a single IPDL method that take both types unless you use an IPDL Union (which I have PChannelDiverter do under the covers).  We could force client code to do it, but if they forget it may be coredump time if they forget to handle the FTP case and try to handle an FTP channel as HTTP.

2) We've been talking about using the new "PBackground" stuff, where we'd have PHttpChannel connect to a background thread in the main process, instead of the main thread.  I have the PChannelDiverter doing a suspend on the parent, and I wouldn't want that to race with the the client code's main-thread IPDL.  But I suppose this is moot if we do the suspend during the Divert() call.  So if I'm wrong about #1 we could probably do without a separate IPDL channel.

> Better wrap the channel in a class that overloads the IsPending() value? 

That's cute.  I guess it would work, too.  Though it's a lot of forwarding code to write (i.e.  all the other IDL methods of an nsHttpChannel have to be forwarded) and maintain as we add new methods. I'm lazy so doing something less pure seems tempting.

>> Let's explicitly not allow ListenerContext to be set
>> for any channel that's diverted, or at least say that
>> we're going to pass null for it on the parent.
>
> Why do you actually consider this?

Because we can't support shipping some arbitrary context object across processes.

> Wanted to say "back to parent" instead of back to child?
> Wanted to type HttpChannelParentListener ?

Yes and yes :)
Ben: is there a way to return an IPDL object from an XPCOM method?

>>>  can't return IPDL actors from an IDL
> 
>> Even when defined as a raw ptr? 
>
> I don't know what you mean by a raw ptr.

If nothing else we should be able to declare an accessor method in a "%{C++" section (inside the XPCOM class definition) that returns a forward-declared IPDL class type.  That avoids a cast at least. E.g.

%{C++
  struct ChannelDiverterChild;

  // up to caller to destroy IPDL object when done with it.
  ChannelDiverterChild* DivertToParent();

%}
Flags: needinfo?(bent.mozilla)
(In reply to Jason Duell (:jduell) from comment #6)
> Ben: is there a way to return an IPDL object from an XPCOM method?
> 
> >>>  can't return IPDL actors from an IDL
> > 
> >> Even when defined as a raw ptr? 
> >
> > I don't know what you mean by a raw ptr.
> 
> If nothing else we should be able to declare an accessor method in a "%{C++"
> section (inside the XPCOM class definition) that returns a forward-declared
> IPDL class type.  That avoids a cast at least. E.g.
> 
> %{C++
>   struct ChannelDiverterChild;
> 
>   // up to caller to destroy IPDL object when done with it.
>   ChannelDiverterChild* DivertToParent();
> 
> %}

You're looking for http://mxr.mozilla.org/mozilla-central/source/dom/interfaces/events/nsIDOMEvent.idl#11 and http://mxr.mozilla.org/mozilla-central/source/dom/interfaces/events/nsIDOMEvent.idl#215 .
Flags: needinfo?(bent.mozilla)
So, here's an overview of the API (with some suggested renaming :) ). Read this in conjunction with Jason's design and let me know what you think.


== Necko Code ==

nsIDivertibleChannel (checked in dictionary, Merriam-Webster likes -ible no -able ;) )
// implemented by {Http|Ftp}Channel{Child|Parent}
{
  void suspendForDiversion();
  void resumeForDiversion();
};

nsIDivertibleChannelChild
// implemented by {Http|Ftp}ChannelChild
{
  PChannelDiverter* prepareChannelDiverter();
  // Was divertToParent, but it really only gets the channel diverter actor for the client to use.
};

Add PChannelDiverter protocol to PNecko
Needs two constructors for HTTP and FTP

PNecko
{
...
parent:
  ...
  PChannelDiverter(PHttpChannel)
  PChannelDiverter(PFtpChannel)
};
I think we need both of these because we need IPDL to give us the parent actor of the respective channels, right?

PChannelDiverter
{
child:
  // Parent has been suspended; no more events to be enqueued (was "Flushed")
  NotifyParentSuspendedForDiversion();
  // Child should resume processing the ChannelEventQueue, i.e. diverting any
  // OnDataAvailable and OnStopRequest messages in the queue back to the parent.
  // Was "DivertMessages".
  ResumeForDiversion();

parent:
  DivertOnDataAvailable(ODA params);
  DivertOnStopRequest(OnStop params);
  // Child has no more events/messages to divert to the parent.
  // Was "DivertComplete", but it's not really complete until the parent's ready to resume, I think.
  NotifyChildDone();
};

ChannelDiverterChild : PChannelDiverterChild
{
  // I don't think there are any public non-IPDL functions here, right?
};

ChannelDiverterParent : PChannelDiverterParent
{
  // Called by client parent with target listener.
  // Note, we may need this to be an IDL or use:
  //   static_cast<ChannelDiverterParent*>(channelDiverterActor)
  // ... so that the client parent can call diverter->DivertTo();
  void DivertTo(nsIStreamListener* newListener);
};


== Client Code ==

nsIDivertibleStreamListener
// implemented by ExternalAppHelper{Child|Parent}
{
  // This IDL and method are not strictly necessary, but they might encourage
  // a common naming convention among clients.
  // Requires a DivertUsing message to be created in the client's IPDL protocol.
  // In child, call SendDivertUsing(PChannelDiverterChild* childDiverter)
  // In parent, in RecvDivertUsing(PChannelDiverterParent* parentDiverter),
  //   call DivertUsing(parentDiverter);
  //   in that, call parentDiverter->DivertTo(nsIStreamListener* newListener).
  void divertUsing(PChannelDiverter);
};
Just chatted with Jonas offline about a concern related to content sniffers. So, let's say a content sniffer is involved in the nsIStreamListener chain - nsUnknownDecoder - something that doesn't just call PeekStream, but consumes an OnDataAvailable to find out content type. It gets OnStartRequest and one (maybe a few) OnDataAvailables from HttpChannelChild. When the httpChannelChild is told to ResumeForDiversion (my name)/DivertMessages (Jason's name) it has already dequeued and forgotten that OnStart and those ODAs; the content sniffer consumed them. So httpChannelChild cannot divert them back to the parent.

If I've checked this right, it's httpChannelChild that is supposed to call DivertOnData, DivertOnStop for the messages that have already made it to the child by the time ResumeForDiversion/DivertMessages is called. If that's the case, then what happens to the OnDataAvailable(s) that were already removed from httpChannelChild's ChannelEventQueue by the sniffer? They might have to be forwarded by the next listener (ExternalAppHelperChild).

Maybe we should have the listener (i.e. ExternalAppHelperChild) call DivertOnData and DivertOnStop once httpChannelChild resumes? But that seems contrary to the point of pushing this code down into Necko. And as Jonas was getting at, how does that listener know when to stop diverting?

I'm not an expert on the content sniffers, so I might have got this wrong. However, I imagine that it's still possible for some nsIStreamListener to basically buffer an OnStart and several ODAs before calling OnStart for another listener. And that's the same case - some of these callbacks will have been dequeued from httpChannelChild already and it won't know to forward them.

Thoughts?
I forgot about content sniffers...

But since content-sniffers are only created on the parent by nsHttpChannel, we should be fine.  If a sniffer diverts some OnStart/OnData calls, that happens before HttpChannelParent's OnStart/Data get called, but eventually the nsHttpChannel will call them, and we'll wind up proceeding with Divert exactly as in the non-sniffing case, i.e. we ship OnStart/Data across to the child process.

Note that sniffers have the same IsPending problem that I outline for diverted channels (they report Pending=false during a delayed OnStart/Data if they hit OnStop while they were queueing/sniffing data)--see bug 748117.  We may want to use the same mechanism to fix them both.
(In reply to Jason Duell (:jduell) from comment #10)
> I forgot about content sniffers...
> 
> But since content-sniffers are only created on the parent by nsHttpChannel,
> we should be fine.  If a sniffer diverts some OnStart/OnData calls, that
> happens before HttpChannelParent's OnStart/Data get called, but eventually
> the nsHttpChannel will call them, and we'll wind up proceeding with Divert
> exactly as in the non-sniffing case, i.e. we ship OnStart/Data across to the
> child process.

Ah, ok. If they're only in the parent, then we should be fine.

Are there any other cases of nsIStreamListeners buffering ODAs before calling OnStartRequest? Specifically in the child?

> Note that sniffers have the same IsPending problem that I outline for
> diverted channels (they report Pending=false during a delayed OnStart/Data
> if they hit OnStop while they were queueing/sniffing data)--see bug 748117. 
> We may want to use the same mechanism to fix them both.

Understood.
> Are there any other cases of nsIStreamListeners buffering ODAs before 
> calling OnStartRequest? Specifically in the child?

Dunno about parent, but definitely not in the child.

Now for the bike-shedding:

> nsIDivertibleChannel (checked in dictionary, Merriam-Webster likes -ible no -able ;) )

Refering to dictionaries won't help you in HTTP-land, Steve.

-ible is less common and the more common -able is generally used when the base phrase is a complete word (like "divert"):

   http://www.oxforddictionaries.com/us/words/words-ending-able-or-ible-american

   http://www.ecenglish.com/learnenglish/lessons/when-use-able-and-ible

I think Divertable is more friendly to non-native English speakers.  Shame on you for being Mozilla-community-unfriendly and Anglo-elitist :P

> PChannelDiverter* prepareChannelDiverter();
>  // Was divertToParent, but it really only gets the channel diverter actor 
> for the client to use.

I think DivertToParent implies a finality that 'prepareChannelDiverter' misses.  After the function is called, it will be illegal to do virtually anything with the channel--it's dead.  So we've done a lot more than just prepare a diverter object, even if it's true that the function doesn't complete all the work needed.

>   NotifyChildDone();
> Was "DivertComplete", but it's not really complete until the parent's ready to resume

Throw in a Divert somewhere (otherwise, at a glance, "done with what?") and I can live with it (NotifyChildDivertDone).  But really I'm fan of short names (you can remember them), hence DivertDone. And the method *is* where the parent resumes the channel (and the divert is complete).

// I don't think there are any public non-IPDL functions here, right?

Correct.

> nsIDivertibleStreamListener
> // implemented by ExternalAppHelper{Child|Parent}
> {
>   // This IDL and method are not strictly necessary

True--there's no inherent reason to add an IDL here--client code needs to add an IPDL method, but any additional fanciness is optional and I'd lean towards skipping it.  We can still find call sites easily by searching for PChannelDiverter.
Not much there yet for these patches, but I'm getting the IPDL auto-generated code and my repo is building.
Comment on attachment 8381598 [details] [diff] [review]
WIP 0.1 Skeleton code for nsIDivertibleChannel

Review of attachment 8381598 [details] [diff] [review]:
-----------------------------------------------------------------

Not sure you need a new IDL for the new suspend/resume?  We do them both on child, and both will be called within the channel code itself, so just new private class methods instead of an IDL?

If you do need them, does it need to be two separate IDLs?  AFAICT all these functions are called only on the child, and they're either all called or none of them, so I'd think one IDL.
Comment on attachment 8381599 [details] [diff] [review]
WIP 0.1 Skeleton code for PChannelDiverter

Review of attachment 8381599 [details] [diff] [review]:
-----------------------------------------------------------------

You've stuffed a lot of methods that need to be in the Channel IPDL into the Diverter IPDL. See comments.

::: netwerk/base/src/ChannelDiverterChild.cpp
@@ +21,5 @@
> +ChannelDiverterChild::AddIPDLReference()
> +{
> +  MOZ_ASSERT(!mIPCOpen, "Attempt to retain more than one IPDL reference");
> +  mIPCOpen = true;
> +  AddRef();

Can scrap add/removeIPDLReference methods for this class--no refcounting.

::: netwerk/ipc/NeckoParent.cpp
@@ +613,5 @@
> +NeckoParent::AllocPChannelDiverterParent()
> +{
> +  ChannelDiverterParent* p = new ChannelDiverterParent();
> +  p->AddRef();
> +  return p;

I'm not convinced that we want to refCount PChannelDiverter (on either Parent or child).  We only refcount channels because we have to (they need to be XPCOM objects).  Should be easy enough (?) for the client code on the child to get the allocated pointer, then have it go away magically when the client code on the parent deletes it after calling DivertTo().

So for now just call new() and return ptr, and delete() in the Dealloc function.

::: netwerk/ipc/PChannelDiverter.ipdl
@@ +16,5 @@
> +  manager PNecko;
> +
> +child:
> +  // Parent has been suspended; no more events to be enqueued.
> +  NotifyParentSuspendedForDiversion();

Nope--this is what I was calling "Flushed", right?  That's a message that should be sent to the child channel, so add it to P{HTTP|FTP}Channel.

@@ +21,5 @@
> +
> +  // Child should resume processing the ChannelEventQueue, i.e. diverting any
> +  // OnDataAvailable and OnStopRequest messages in the queue back to the parent.
> +  // Was "DivertMessages".
> +  ResumeForDiversion();

This also belongs in the channels.

The only IPDL methods here are really the constructor (to pass the channel to the parent) and the destructor.

The 
  void DivertTo(nsIStreamListener* newListener);

method that the client needs to call can just be a public method of ChannelDiverterParent (and I'm fine with requiring the client to cast the IPDL PChannelDiverterParent pointer it gets to the concrete class, so it can call DivertTo).

@@ +29,5 @@
> +
> +  // XXX Unsure about these: Initial workaround: IPDL doesn't like multiple
> +  // constructors.
> +  InitForHttpChannel(PHttpChannel httpChannel);
> +  InitForFTPChannel(PFTPChannel ftpChannel);

The constructor in PNecko.ipdl needs to take a Union type.  See how we've kludged this for FTPChannelCreationArgs, for instance (we define those Unions in netwerk/ipc/NeckoChannelParams.ipdlh. *.ipdlh files are magikal files that let you define struct/unions that IPDL then handles all the serialization, etc, details for).

@@ +40,5 @@
> +  // Child calls to send OnStopRequest to the parent.
> +  DivertOnStopRequest(nsresult statusCode);
> +  
> +  // Child has no more events/messages to divert to the parent.
> +  NotifyChildDone();

The DivertOn* and NotifyChildDone methods also need to be part of the FTP|HTTP channels.
Note: I should have mentioned when I uploaded the patches, that I hadn't taken your naming comments into consideration yet. But I will!

(In reply to Jason Duell (:jduell) from comment #15)
> Not sure you need a new IDL for the new suspend/resume?  We do them both on
> child, and both will be called within the channel code itself, so just new
> private class methods instead of an IDL?

Ah yes. Mostly private, except httpChannelParent.SuspendForDiversion - isn't it called from the ChannelDiverter when it's created/init'd? So, it should be public right?

> If you do need them, does it need to be two separate IDLs?  AFAICT all these
> functions are called only on the child, and they're either all called or
> none of them, so I'd think one IDL.

Again, httpChannelParent.SuspendForDiversion, no? Or did you mean its regular suspend?
> httpChannelParent.SuspendForDiversion - isn't it called from the ChannelDiverter
> when it's created/init'd

IDL hands do the devil's work.  Just make it a public function in HttpChannelParent (but not in any IDL).  That's what we do for intra-necko calls (at this point, it will actually work for anything that's only going to get called from within gecko C++, now that we don't enforce module symbol hiding any more.

Think of XPCOM as a beast that doesn't need to grow any more than necessary :)
This is where we do this for certain function in nsHttpChannel, for instance:

   http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.h#122
(In reply to Jason Duell (:jduell) from comment #16)
> ::: netwerk/base/src/ChannelDiverterChild.cpp
> @@ +21,5 @@
> > +ChannelDiverterChild::AddIPDLReference()
> > +{
> > +  MOZ_ASSERT(!mIPCOpen, "Attempt to retain more than one IPDL reference");
> > +  mIPCOpen = true;
> > +  AddRef();
> 
> Can scrap add/removeIPDLReference methods for this class--no refcounting.

Yeah - wasn't sure about that; just copied and pasted for now.

> ::: netwerk/ipc/NeckoParent.cpp
> @@ +613,5 @@
> > +NeckoParent::AllocPChannelDiverterParent()
> > +{
> > +  ChannelDiverterParent* p = new ChannelDiverterParent();
> > +  p->AddRef();
> > +  return p;
> 
> I'm not convinced that we want to refCount PChannelDiverter (on either
> Parent or child).  We only refcount channels because we have to (they need
> to be XPCOM objects).  Should be easy enough (?) for the client code on the
> child to get the allocated pointer, then have it go away magically when the
> client code on the parent deletes it after calling DivertTo().
> 
> So for now just call new() and return ptr, and delete() in the Dealloc
> function.

No problem.

> ::: netwerk/ipc/PChannelDiverter.ipdl
> @@ +16,5 @@
> > +  manager PNecko;
> > +
> > +child:
> > +  // Parent has been suspended; no more events to be enqueued.
> > +  NotifyParentSuspendedForDiversion();
> 
> Nope--this is what I was calling "Flushed", right?  That's a message that
> should be sent to the child channel, so add it to P{HTTP|FTP}Channel.

Well, yes, but if it's going to be the same for both channels, why not use the PChannelDiverter protocol for both?

> @@ +21,5 @@
> > +
> > +  // Child should resume processing the ChannelEventQueue, i.e. diverting any
> > +  // OnDataAvailable and OnStopRequest messages in the queue back to the parent.
> > +  // Was "DivertMessages".
> > +  ResumeForDiversion();
> 
> This also belongs in the channels.

Same question - can't we reuse PChannelDiverter for this too? And just call chanChild.ResumeForDiversion. If we do go this way, then nsIDivertableChannel.Resume could be useful.

> The only IPDL methods here are really the constructor (to pass the channel
> to the parent) and the destructor.
> 
> The 
>   void DivertTo(nsIStreamListener* newListener);
> 
> method that the client needs to call can just be a public method of
> ChannelDiverterParent (and I'm fine with requiring the client to cast the
> IPDL PChannelDiverterParent pointer it gets to the concrete class, so it can
> call DivertTo).

Agreed. I missed that in ChannelDiverterParent in this patch.

> @@ +29,5 @@
> > +
> > +  // XXX Unsure about these: Initial workaround: IPDL doesn't like multiple
> > +  // constructors.
> > +  InitForHttpChannel(PHttpChannel httpChannel);
> > +  InitForFTPChannel(PFTPChannel ftpChannel);
> 
> The constructor in PNecko.ipdl needs to take a Union type.  See how we've
> kludged this for FTPChannelCreationArgs, for instance (we define those
> Unions in netwerk/ipc/NeckoChannelParams.ipdlh. *.ipdlh files are magikal
> files that let you define struct/unions that IPDL then handles all the
> serialization, etc, details for).

FTPChannelCreationArgs looks like it doesn't contain actors. Serializing is one thing, but we need IPDL to convert PHttp|FTPChannelChild pointers to parent actor pointers - can this be done with something like FTPChannelCreationArgs?

> @@ +40,5 @@
> > +  // Child calls to send OnStopRequest to the parent.
> > +  DivertOnStopRequest(nsresult statusCode);
> > +  
> > +  // Child has no more events/messages to divert to the parent.
> > +  NotifyChildDone();
> 
> The DivertOn* and NotifyChildDone methods also need to be part of the
> FTP|HTTP channels.

Again, IPDL protocol reuse, no?
(In reply to Jason Duell (:jduell) from comment #18)
> > httpChannelParent.SuspendForDiversion - isn't it called from the ChannelDiverter
> > when it's created/init'd
> 
> IDL hands do the devil's work.  Just make it a public function in
> HttpChannelParent (but not in any IDL).  That's what we do for intra-necko
> calls (at this point, it will actually work for anything that's only going
> to get called from within gecko C++, now that we don't enforce module symbol
> hiding any more.
> 
> Think of XPCOM as a beast that doesn't need to grow any more than necessary
> :)

So, I get the arguments against XPCOM and agree. Avoid bloat. But if we're going to call SuspendForDiversion for http and ftp we need some kind of common pointer in ChannelDiverterParent. I was thinking of using an nsIDivertibleChannel pointer here. What were you thinking?
>  why not use the PChannelDiverter protocol for both?

Because we want to allow the client to do

   (On Parent)
   ParentChannelDiverter->DivertTo(newListener);
   delete ParentChannelDiverter

(sub in a variable name, of course).  And it's done.  Otherwise you're going to require the client code to keep the PChannelDiverter alive until some time (I guess you'd have to notify it, or have it delete it when OnStartRequest is called?  Seems messy). Avoiding having to type out Flush/etc in 2 IPDL files doesn't seem worth the client code memory managment inconvenience.

> we need IPDL to convert PHttp|FTPChannelChild pointers to parent actor pointers
> - can this be done with something like FTPChannelCreationArgs

I sure hope so :)  Ask bent if you have trouble with it and we can probably hack a workaround (we could have separate constructor and InitHTTP/InitFTP methods if nothing else).

>  But if we're going to call SuspendForDiversion for http and ftp we need some 
> kind of common pointer

Ah, good point.  You're right--we need polymorphism here, so an IDL is probably easiest. (we could do it with just a C++ abstract class, FWIW, but we haven't done it before.  I'm fine either way).
(In reply to Jason Duell (:jduell) from comment #22)
> >  why not use the PChannelDiverter protocol for both?
> 
> Because we want to allow the client to do
> 
>    (On Parent)
>    ParentChannelDiverter->DivertTo(newListener);
>    delete ParentChannelDiverter
> 
> (sub in a variable name, of course).  And it's done.  Otherwise you're going
> to require the client code to keep the PChannelDiverter alive until some
> time (I guess you'd have to notify it, or have it delete it when
> OnStartRequest is called?  Seems messy). Avoiding having to type out
> Flush/etc in 2 IPDL files doesn't seem worth the client code memory
> managment inconvenience.

Ah, I see what you're saying. So, you want the ChannelDiverter to live only as long as it takes to tell the channelChild to do DivertMessages? You know, I've been sitting here thinking about how we could keep it alive for reuse with refcounting, and assume ownership by the channel ... and I think you're right - it doesn't seem worth the bother or extra complexity.

> > we need IPDL to convert PHttp|FTPChannelChild pointers to parent actor pointers
> > - can this be done with something like FTPChannelCreationArgs
> 
> I sure hope so :)  Ask bent if you have trouble with it and we can probably
> hack a workaround (we could have separate constructor and InitHTTP/InitFTP
> methods if nothing else).

Ooo. Ben just told me that actors in unions are dealt with magically. Awesome.

> >  But if we're going to call SuspendForDiversion for http and ftp we need some 
> > kind of common pointer
> 
> Ah, good point.  You're right--we need polymorphism here, so an IDL is
> probably easiest. (we could do it with just a C++ abstract class, FWIW, but
> we haven't done it before.  I'm fine either way).

I'm fine with an abstract class too. But since we're already using nsIDivertableChannel initially in the client's OnStartRequest, I'm inclined to stick it in that one place.
All sounds good to me, except

>  But since we're already using nsIDivertableChannel initially in the 
> client's OnStartRequest, I'm inclined to stick it in that one place.

So we call 
  
  HttpChannelChild.DivertToParent()

but the function you're adding is

  HttpChannelParent.DivertTo()

I wouldn't use the same IDL for those simply because you don't want both classes to implement both methods.  They each implement only one.
2 interfaces:
-- nsIDivertableChannel, with one method: divertToParent()
(this could be called nsIDivertableChannelChild, but this is meant to be QI'd by the client, so dropping the ~Child might look cleaner. But I'll change it if you want).

-- nsIDivertableChannelParent, with two methods: suspendForDiversion() (called in ChannelDiverterParent constructor) and divertTo() (called by ChannelDiverterParent::DivertTo()).
Attachment #8381598 - Attachment is obsolete: true
Moved and renamed IPDL messages from PChannelDiverter to PHttpChannel.

Notes:
-- I think it's best to do PFtpChannel once we get this working for PHttpChannel.
-- I'll do the union param for the PChannelDiverter constructor tomorrow.
Attachment #8381599 - Attachment is obsolete: true
Implement nsIDivertableChannel in HttpChannelChild
-- HttpChannelChild::DivertToParent (Called by client in OnStartRequest()).

Implement nsIDivertableChannelParent in HttpChannelParent
-- HttpChannelParent::SuspendForDiversion() (Called by ChannelDiverterParent constructor).
-- HttpChannelParent::DivertTo(nsIStreamListener *aListener) (Called by ChannelDiverterParent::DivertTo()).

... and private, supporting functions
-- HttpChannelChild::SuspendForDiversion() (Called in DivertToParent, above).
-- HttpChannelChild::ResumeForDiversion() (Called in RecvDivertMessages()).

-- HttpChannelParent::ResumeForDiversion() (Called in RecvDivertComplete()).
Added PChannelDiverter constructor with union param for PHttpChannel and PFTPChannel.
Attachment #8381931 - Attachment is obsolete: true
Started coloring in the functions.

Note: This should be applied after attachment 8381934 [details] [diff] [review].
Comment on attachment 8381930 [details] [diff] [review]
WIP 0.2 Skeleton code for nsIDivertableChannel

Review of attachment 8381930 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/public/nsIDivertableChannel.idl
@@ +6,5 @@
> +
> +#include "nsISupports.idl"
> +
> +%{C++
> +#include "mozilla/net/ChannelDiverterChild.h"

Need the whole .h file, or could just predeclare 'struct ChannelDivererChild'? (note that predeclare would have to be inside namespace moz { net { ..)

@@ +14,5 @@
> +
> +interface nsIStreamListener;
> +
> +/**
> + * A channel implementing this interface allows one to divert its data from

allows diverting from an nsIStreamListener in the child..."

@@ +25,5 @@
> +   * CHILD ONLY.
> +   * Called by Necko client in child process during OnStartRequest to divert
> +   * nsIStreamListener and nsIRequest callbacks to the parent process.
> +   *
> +   * The process should look like the following:

1) divertToParent is called in the child process.  It can only be called during OnStartRequest().

2) The ChannelDiverterChild that is returned is an IPDL object. It should be passed via some other IPDL method of the client's choosing, and on the parent the ChannelDiverterParent's divertTo() function should be called with a listener that will then receive the OnStartRequest/OnDataAvailable/OnStopRequest for the channel.  The ChannelDiverterParent can then be deleted (which will also destroy the ChannelDiverterChild in the child).

After divertToParent() has been called, NO further function calls should be made on the channel.  It is a dead object for all purposes.

@@ +47,5 @@
> +   *       functions.
> +   *       Channel in parent should implement a private ResumeForDiversion
> +   *       function in addition to SuspendForDiversion.
> +   */
> +  void suspendForDiversion();

Not sure why this is part of the IDL.  Oh right, it's so we can call FTP/HTTP without knowing which one we've got.  Stick it after DivertTo in the IDL and mark it as "used only internally: do not call".  (Messy: maybe it's cleaner to just share an abstract C+ class that's not in any IDL).

@@ +51,5 @@
> +  void suspendForDiversion();
> +
> +  /**
> +   * Called to divert remaining nsIStreamListener callbacks to the specified
> +   * listener.

To complete the channel diversion, this function must be called in the parent on the parent end of the ChannelDiverterChild that was returned from divertToParent().   Once it returns the listener that is passed will eventually receive the usual notifications (OnStartRequest/OnDataAvailable/StopRequest) as if it had been passed to asyncOpen().  A reference to the listener will be added and kept until OnStopRequest has completed.
Attachment #8381930 - Flags: feedback+
Thanks for the feedback. Also, just wanted to say that I just coded up a really basic patch for ExternalHelperAppChild and I've been able to download (over HTTP) a zip file in an e10s window. (And I can show it going through the new code path).

Still work to do, of course, but that's a nice sign :)
Comment on attachment 8381934 [details] [diff] [review]
WIP 0.1 Start implementing nsIDivertableChannel in HttpChannelChild|Parent and supporting functions

Review of attachment 8381934 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +104,5 @@
>                     nsIRequestObserver,
>                     nsIStreamListener,
>                     nsIParentChannel,
> +                   nsIParentRedirectingChannel,
> +                   nsIDivertableChannelParent)

hmm, let's follow Honza's convention of putting "Parent" at the beginning (nsIParentChannel, nsIParentRedirectingChannel), to distinguish these classes from the "fooParent" clases that are parent ends of IPDL protocols.  So nsIParentDivertableChannel.

We've got a convention, so let's stick to it.
Attachment #8381934 - Flags: feedback+
Reuben, this is the really basic patch I used to add PChannelDiverter to ExternalHelperApp. It does no cleanup or anything like that, but it should give you an idea of how we think the API should be used.
Comment on attachment 8382488 [details] [diff] [review]
WIP 0.3 Skeleton code for PChannelDiverter and PHttpChannel modifications

Review of attachment 8382488 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good--mainly documentation comment suggestions.

Fun details I've thought of:

We probably shouldn't call OnStartRequest directly from within DivertTo() (we generally try to avoid re-entering client code). So schedule an event to run it (use AsyncCall?).

We need to handle the case where the new listener in the parent (or something else) either suspends or cancels the channel.  Cancel is easy--just keep an mCanceled/mStatus pair, when you RecvDivertOnData just toss data on the floor (don't call OnDataAvailable) and then either call OnStopRequest(mStatus) if we get RecvOnStop, or if we're flushed w/o OnStop, just Cancel(mStatus) the nsHttpChannel before you resume it.

Suspend is trickier: we need to queue the RecvDivertOn{Data|Stop} messages.  I suggest you look at adding a ChannelEventQueue to HttpChannelParent and use it to queue messages while we're suspended, just as we do on the child.

::: netwerk/ipc/NeckoParent.h
@@ +149,5 @@
> +  virtual bool
> +  RecvPChannelDiverterConstructor(PChannelDiverterParent* actor,
> +                                  const ChannelDiverterArgs& channel);
> +  virtual bool DeallocPChannelDiverterParent(PChannelDiverterParent* actor)
> +                                                                MOZ_OVERRIDE;

I forget when/why we use MOZ_OVERRIDE.  Why use for alloc/dealloc but not RecvConstructor?

::: netwerk/ipc/PChannelDiverter.ipdl
@@ +10,5 @@
> +
> +namespace mozilla {
> +namespace net {
> +
> +async protocol PChannelDiverter

// Used when diverting necko channels from child back to the parent.  See nsIDivertableChannel.

::: netwerk/protocol/http/HttpChannelParent.h
@@ +92,5 @@
> +  virtual bool RecvDivertOnDataAvailable(const nsCString& data,
> +                                         const uint64_t& offset,
> +                                         const uint32_t& count);
> +  virtual bool RecvDivertOnStopRequest(const nsresult& statusCode);
> +  virtual bool RecvDivertComplete();

MOZ_OVERRIDE for these?
Attachment #8382488 - Flags: feedback+
Don't let the suspend/queue stuff on the parent stop you from getting something working first that doesn't support suspend/resume.
Comment on attachment 8382597 [details] [diff] [review]
WIP 0.1 More implementation; flesh on the skeleton

Review of attachment 8382597 [details] [diff] [review]:
-----------------------------------------------------------------

Mostly looks good!

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +358,5 @@
>    LOG(("HttpChannelChild::OnTransportAndData [this=%p]\n", this));
>  
> +  // XXX SendOnTransportAndData or just SendDivertOnDataAvailable?
> +  if (mDivertingToParent) {
> +    SendDivertOnDataAvailable(data, offset, count);

probably want a "return" after this? :)

@@ +825,3 @@
>  bool
>  HttpChannelChild::RecvDivertMessages()
>  {

MOZ_ASSERT(mSuspended)

@@ +1523,5 @@
>  {
> +  mSuspendCount++;
> +  mEventQ->Suspend();
> +
> +  return NS_OK;

I imagined a common Suspend function that checks mDiverted and skips SendSuspend if true.

Maybe we should change RemoteChannelExists() to also check mDiverted?

And you're not setting any mDiverted/etc variable in DivertToParent--you should do that, we'll want to fail lots of functions if they're called after that, and RemoteChannelExists seems to be the place we can usually do the check.

@@ +1536,5 @@
> +  mSuspendCount--;
> +  // XXX Don't think we need this next block.
> +  /*if (mCallOnResume) {
> +    AsyncCall(mCallOnResume);
> +    mCallOnResume = nullptr;

mCallOnResume is used when we decide to abort the channel w/o doing any IPDL (ie. early fail from AsyncAbort).  For instance on-modify-request observers could have canceled the channel already (but another observer might have suspended it too, in which case we store the HandleAsyncAbort function ptr in mCallOnResume.

Fortunately for us all that crap happens before OnStartRequest is called and the client can call DivertToParent().  However, we *should* check mStatus and fail DivertToParent() if the channel has already been canceled and !RemoteChannelExists(), because if that's true there is no HttpChannelParent yet and won't ever be so we need to let client handle on child.

(Which means we need to document in nsIDivertableChannel() that it can fail and that means the channel was canceled early).

And yes, you don't need that block in ResumeForDiversion: we'll never hit it if we're canceled early since we'll fail DivertToParent.

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +447,5 @@
> +  nsresult rv = NS_NewByteInputStream(getter_AddRefs(stringStream), data.get(),
> +                                      count, NS_ASSIGNMENT_DEPEND);
> +  if (NS_FAILED(rv)) {
> +    if (mChannel) {
> +      mChannel->Cancel(rv);

nit: I don't expect this to ever fail, but if it did and we're going to get RecvOnStop (i.e. nsHttpChannel is already done), then you don't handle that case.  I already mentioned we need to handle Cancel, so maybe just

AGH--just realized that when the listener gets OnStartRequest(channel) they get the nsHttpChannel, and if they Cancel or Suspend it we don't find out about it.  I think we'll (eventually) need a proxy object that looks like a channel but forwards all calls to the nsHttpChannel (and also notifies ParentListener and/or HttpChannelParent) so we can suspend (or drop) RecvOn{Data|Stop} messages from the child.

meh.  

Too sleepy to think about this lucidly. I'll ponder more in the AM.  I smell "followup bug" unless the Download manager suspends/cancels the channel early in practice.  We can probably be done with Divert before a user can swipe-kill a download?

@@ +721,5 @@
>  HttpChannelParent::SuspendForDiversion()
>  {
> +  MOZ_ASSERT(mChannel);
> +  nsresult rv = mChannel->Suspend();
> +  if (NS_WARN_IF(NS_FAILED(rv))) {

might need to be MOZ_ASSERT--we don't have a story for the client as to what they should do if we fail, and I don't think this can really fail in practice?

@@ +728,5 @@
> +
> +  nsHttpChannel *httpChan = static_cast<nsHttpChannel *>(mChannel.get());
> +  nsCOMPtr<nsIInterfaceRequestor> callbacks;
> +  rv = httpChan->GetNotificationCallbacks(getter_AddRefs(callbacks));
> +  if (NS_WARN_IF(NS_FAILED(rv))) {

ditto

@@ +737,5 @@
> +    do_QueryInterface(callbacks);
> +  MOZ_ASSERT(parentListener);
> +
> +  mParentListener = do_QueryInterface(parentListener);
> +  MOZ_ASSERT(mParentListener);

You can get mParentListener more easily by just setting mParentListener to it in RecvAsyncOpen when we create it.

@@ +766,5 @@
> +    do_QueryInterface(mParentListener);
> +  MOZ_ASSERT(divertable);
> +
> +  nsresult rv = divertable->DivertTo(aListener);
> +  if (NS_WARN_IF(NS_FAILED(rv))) {

Cant' fail?  MOZ_ASSERT?

@@ +770,5 @@
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    return rv;
> +  }
> +
> +  // XXX null context ok?

yes

@@ +777,5 @@
> +    return rv;
> +  }
> +
> +  if (NS_WARN_IF(!SendFlushedForDiversion())) {
> +    return NS_ERROR_FAILURE;

I suppose we might want to document in nsIDivertableParentChannel that DivertTo() can fail if the child process has died.  In that case the channel should be considered dead (we should cancel it and throw away OnStart/Stop/Data msgs)

::: netwerk/protocol/http/HttpChannelParentListener.cpp
@@ +92,4 @@
>      return NS_ERROR_UNEXPECTED;
>  
>    LOG(("HttpChannelParentListener::OnDataAvailable [this=%p]\n", this));
> +  return mNextListener->OnDataAvailable(aRequest, aContext, aInputStream, aOffset, aCount);

It would be weird to pass a non-null context (i.e. that was set by the original listener) to the divertedTo listener.   But aContext is IIRC always null for the e10s case.  Not sure if it's worth explicitly nulling it out for future-proofing the code.  I don't expect us to ever use channel contexts in the e10s codepath...

@@ +211,5 @@
> +    parent = do_QueryInterface(mNextListener);
> +    MOZ_ASSERT(parent);
> +    parent->Delete();
> +    mNextListener = do_QueryInterface(redirectChannel);
> +    MOZ_ASSERT(mNextListener);

sigh--opt builds toss out the MOZ_ASSERT checks, so I wonder if we need to use NS_RUNTIMEABORT() instead everywhere that we would core dump otherwise in an opt build. (not sure if there's already a macro that does a check and calls RUNTIMEABORT if false).

@@ +234,5 @@
> +NS_IMETHODIMP
> +HttpChannelParentListener::DivertTo(nsIStreamListener* aListener)
> +{
> +  MOZ_ASSERT(aListener);
> +  if (mNextListener) {

Seems like we want to MOZ_ASSERT mNextListener here--things would be very weird w/o that being true

::: netwerk/protocol/http/HttpChannelParentListener.h
@@ +38,5 @@
>    HttpChannelParentListener(HttpChannelParent* aInitialChannel);
>    virtual ~HttpChannelParentListener();
>  
>  private:
> +  nsCOMPtr<nsIStreamListener> mNextListener;

// Can be original HttpChannelParent that created this object (normal case), a different {HTTP|FTP}ChannelParent that we've been redirected to, or some other listener that we have been diverted to via nsIDivertableChannel.
Attachment #8382597 - Flags: feedback+
I've thought of an easier (at least much less code) way to support handling the parent client code calling Suspend/Resume/Cancel on the channel before we're done Diverting/dispatching all the messages that were queued on the child.  For suspend/resume, just add a (public, non-IDL) "SuspendCount" method to nsHttpChannel (or HttpBaseChannel, I don't care which) and in our RecvOnData/Stop methods, query it to see if the channel suspend count is >1 (the 1 is for the suspend that we do as part of Divert). If so, queue the message instead of dispatching.  Ditto for Cancel--just get GetCanceled (already existing method) and drop ODA and/or dispatch OnStop with channel.GetStatus for the status if the nsHttpChannel has been canceled.  Make sense?  

Weird things would happen if a client calls resume() an extra time (we'd start dispatching msgs from the nsHttpChannel before/while we continue dispatching them from RecvDivertedOnData/Stop), but that's a fairly serious client programming error, so I don't care :)
-- Merged previous patches dealing with HttpChannelChild|Parent (attachment 8381934 [details] [diff] [review], attachment 8382597 [details] [diff] [review]).
-- Dealt with most of Jason's comments except the Suspend/Resume/Cancel question. On first reading, Jason's most recent suggestion for this sounds good.

FYI: PTO tomorrow, Friday Feb 28, and traveling over the weekend, so I'll be back at this on Mon.
Attachment #8381934 - Attachment is obsolete: true
Attachment #8382597 - Attachment is obsolete: true
Attachment #8383506 - Attachment description: WIP 0.3 nsIDivertableChannel and ADivertableParentChannel → Part 1 WIP 0.3 nsIDivertableChannel and ADivertableParentChannel
Attachment #8383507 - Attachment description: WIP 0.4 PChannelDiverter and modifications to PHttpChannel → Part 2 WIP 0.4 PChannelDiverter and modifications to PHttpChannel
Attachment #8383508 - Attachment description: WIP 0.2 Implement nsI|ADivertableChannel in HttpChannelChild|Parent → Part 3 WIP 0.2 Implement nsI|ADivertableChannel in HttpChannelChild|Parent
Comment on attachment 8383507 [details] [diff] [review]
Part 2 WIP 0.4 PChannelDiverter and modifications to PHttpChannel

Review of attachment 8383507 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/src/ChannelDiverterParent.cpp
@@ +29,5 @@
> +  switch (aChannel.type()) {
> +  case ChannelDiverterArgs::TPHttpChannelParent:
> +  {
> +    mDivertableChannelParent =
> +      dynamic_cast<ADivertableParentChannel*>(aChannel.get_PHttpChannelParent());

We build with -fno-rtti which means dynamic_cast is only allowed when it can be resolved statically (upcasting w/o multiple inheritance). In this case, it looks like we'll need XPCOM.
(In reply to Reuben Morais [:reuben] from comment #41)
> We build with -fno-rtti which means dynamic_cast is only allowed when it can
> be resolved statically (upcasting w/o multiple inheritance). In this case,
> it looks like we'll need XPCOM.

Err, or just walk through the inheritance chain manually:

mDivertableChannelParent =
      static_cast<ADivertableParentChannel*>(static_cast<HttpChannelParent*>(aChannel.get_PHttpChannelParent()));
Will the ExternalHelperAppService ever get a channel that isn't divertable after this bug is fixed?
> Will the ExternalHelperAppService ever get a channel that isn't divertable after 
> this bug is fixed?

It's *possible* long term--addons can provide their own JS implementation (which usually is a thin wrapper around an nsHttpChannel, but it's still not going to be QI-able to nsIDivertableChannel).  But I don't think you need to worry about it.  I'm hoping that the code up to DivertToParent() will be fail-safe, so it should always return OK and you can do the IPC to the parent with the ChannelDiverter.  On the parent, DivertTo() could still fail if the child has exited by the time you call it.  In that case it's just cleanup time.
>  Suspend/Resume/Cancel question. On first reading, Jason's most recent suggestion 
> for this sounds good.

Actually I've realized my plan in comment 37 has a big flaw: if we notice that the nsHttpChannel is suspended, and it's still suspended when RecvDivertOnStop arrives, we have no hook to notice when the channel is eventually resume()'d.  There's two possible fixes:

1) we implement a XPCOM nsIResumeListener IDL that notifies a listener whenever a channel has resume called on it (it can't be just when it's finally resumed, since we'll be holding one of the suspend counts: we'll have to check whether channel.resumeCount == 1 each time we're called).  It's gross, and I don't like it, but it's a lot less boilerplate then

2) We write out wrapper channels classes that implements all the interfaces that ns{Http|Ftp|Channel} do, which just proxy all calls to the real channel (and also notify us when resume is called, and I guess Suspend and Cancel, too).

I guess I lean towards #1, however gross it is, because it seems simpler.
Actually, maybe let's not make it XPCOM, but just a internal necko ResumeListener abstract class (with one method, NotifyResumed(suspendCount), that passes the new suspendCount of the channel). I don't want to deal with supporting multiple listeners for this, or figuring out what happens if a new listener stomps over an existing one.  Just make the (public, non XPCOM) channel.registerResumeListener() method fail if there's already a listener.
Comment on attachment 8383506 [details] [diff] [review]
Part 1 WIP 0.3 nsIDivertableChannel and ADivertableParentChannel

Review of attachment 8383506 [details] [diff] [review]:
-----------------------------------------------------------------

+r with a couple comment tweaks.

::: netwerk/base/public/nsIDivertableChannel.idl
@@ +44,5 @@
> +   *    ChannelDiverterParent can then be deleted (which will also destroy the
> +   *    ChannelDiverterChild in the child).
> +   *
> +   *    After divertToParent() has been called, NO further function calls
> +   *    should be made on the channel.  It is a dead object for all purposes.

The reference that the channel holds to the listener in the child is released when this method is called.

::: netwerk/base/src/ADivertableParentChannel.h
@@ +28,5 @@
> +  // added and kept until OnStopRequest has completed.
> +  //
> +  // Note: DivertTo() can fail if the child process has died.  In that case the
> +  // channel should be considered dead (we should cancel it and throw away
> +  // OnStart/Stop/Data messages).

Indeed "we should cancel it and throw away OnStart/Stop/Data messages".  But the language here should avoid those internals, and just say

  // ...considered dead (it will not provide any OnStart/OnDataAvailable/OnStop notifications).

@@ +36,5 @@
> +  //
> +  // It is expected that the parent channel will also implement a private
> +  // ResumeForDiversion function in addition to SuspendForDiversion.
> +  // Similarly, the child channel should implement private Suspend|
> +  // ResumeForDiversion functions.

I think you can just say

  // This function is provides a common way to call the ResumeForDiversion method for different channel types (HTTP/FTP).
Attachment #8383506 - Flags: review+
Comment on attachment 8383507 [details] [diff] [review]
Part 2 WIP 0.4 PChannelDiverter and modifications to PHttpChannel

Review of attachment 8383507 [details] [diff] [review]:
-----------------------------------------------------------------

All the implemented parts look good.

::: netwerk/base/src/ChannelDiverterChild.cpp
@@ +23,5 @@
> +{
> +}
> +
> +bool
> +ChannelDiverterChild::Init(const ChannelDiverterArgs& aChannel)

I don't think we need an Init() method?  AFAICT we don't ever need to reference the channel on the child.

::: netwerk/base/src/ChannelDiverterParent.cpp
@@ +35,5 @@
> +  }
> +  case ChannelDiverterArgs::TPFTPChannelParent:
> +  {
> +    mDivertableChannelParent =
> +      dynamic_cast<ADivertableParentChannel*>(aChannel.get_PFTPChannelParent());

I'm pretty sure Reuben is right about the dynamic casts--I wasn't aware we ever used them...

::: netwerk/protocol/http/PHttpChannel.ipdl
@@ +72,5 @@
> +                        uint32_t  count);
> +
> +  // Divert OnStopRequest to the parent.
> +  DivertOnStopRequest(nsresult statusCode);
> +  

extra whitespace
Attachment #8383507 - Flags: feedback+
Comment on attachment 8383508 [details] [diff] [review]
Part 3 WIP 0.2 Implement nsI|ADivertableChannel in HttpChannelChild|Parent

Review of attachment 8383508 [details] [diff] [review]:
-----------------------------------------------------------------

a few bugs and nits, but mostly looks great.

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +357,5 @@
>                                       const uint32_t& count)
>  {
>    LOG(("HttpChannelChild::OnTransportAndData [this=%p]\n", this));
>  
> +  // XXX SendOnTransportAndData or just SendDivertOnDataAvailable?

Good question.  I'm not sure of the answer.  nsHttpChannel already calls OnStatus/OnProgress for all the bits that are sent to the child, so in some sense observers have already been notified and it would be weird to notify them again.  OTOH it might be possible that the download manager gets these notifications and only signs up for them once OnStartRequest has been called (which we won't do until diverting has started).  IIRC the OnProgress API gives you how many bytes have been downloaded (not just a delta) so it shouldn't matter if a client misses some of them.  

So let's just send ODA back for now, and see if things break.  I'm reluctant to double-notify OnProgress/Status w/o evidence that we need to.

@@ +811,5 @@
>  {
> +  if (mEventQ->ShouldEnqueue()) {
> +    mEventQ->Enqueue(new FlushedForDiversionEvent(this));
> +  } else {
> +    FlushedForDiversion();

If I'm understanding the logic correctly, we should always be suspended when we get this msg, so ShouldEnqueue() should always be true.  So put a NSABORT or MOZ_ASSERT(0) here.

@@ +819,5 @@
> +
> +void
> +HttpChannelChild::FlushedForDiversion()
> +{
> +  //mFlushedForDiversion = true;

Don't need mFlushedForDiversion?

@@ +826,5 @@
>  
>  bool
>  HttpChannelChild::RecvDivertMessages()
>  {
> +  if (NS_WARN_IF(mSuspendCount <= 0)) {

Toss in MOZ_ASSERT too

@@ +830,5 @@
> +  if (NS_WARN_IF(mSuspendCount <= 0)) {
> +    return false;
> +  }
> +
> +  if (NS_WARN_IF(NS_FAILED(Resume()))) {

Throw in comment:

  // DivertTo() has been called on parent, so we can now start sending queued IPDL messages back to parent listener.

It looks like you're also going to need to set mDivertingToParent to false (or some other trick) to get Resume to work correctly.

@@ +1002,5 @@
>  NS_IMETHODIMP
>  HttpChannelChild::Suspend()
>  {
>    NS_ENSURE_TRUE(RemoteChannelExists(), NS_ERROR_NOT_AVAILABLE);
> +  if (!mDivertingToParent || !mSuspendCount++) {

Important!  This needs to be && not || :)

Also let's invert the order of the operands to

  !mSuspendCount++ && !mDivertingToParent

Reads better.  Also I think it's mildly brain-melting to let the mSuspendCount for the channel != that for the mEventQueue.

@@ +1018,5 @@
>    NS_ENSURE_TRUE(mSuspendCount > 0, NS_ERROR_UNEXPECTED);
>  
>    nsresult rv = NS_OK;
>  
> +  if (!mDivertingToParent || !--mSuspendCount) {

Same here: && and reverse order.

@@ +1510,5 @@
> +  MOZ_ASSERT(aChild);
> +  MOZ_ASSERT(gNeckoChild);
> +  MOZ_ASSERT(!mDivertingToParent);
> +
> +  if (NS_FAILED(mStatus) && (mCanceled || !RemoteChannelExists())) {

got a reason for the (mCanceled || !RemoteChannelExists())?  I would think we'd want just !RemoteChannelExists here, to catch early failure where we haven't opened IPDL yet.  If we have opened IPDL, then the channel should proceed regularly (it will deliver OnStart/Stop(canceled_status) on the parent listener).

Add comment:

  // We must fail DivertToParent() if there's no parent end of the channel (and won't be!) due to early failure.

@@ +1524,5 @@
> +
> +  PChannelDiverterChild* diverter =
> +    gNeckoChild->SendPChannelDiverterConstructor(this);
> +  MOZ_ASSERT(diverter);
> +  

extra whitespace.

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +447,5 @@
> +  if (NS_FAILED(rv)) {
> +    if (mChannel) {
> +      mChannel->Cancel(rv);
> +    }
> +    return false;

"return false" from an IPDL method kills the entire child process.  That's a bit harsh here.

@@ +450,5 @@
> +    }
> +    return false;
> +  }
> +
> +  // XXX nullptr for context.

// We deliberately pass null for aContext here.

@@ +458,5 @@
> +  if (NS_FAILED(rv)) {
> +    if (mChannel) {
> +      mChannel->Cancel(rv);
> +    }
> +    return false;

return false is definitely wrong here--it's fine for the listener to cancel the channel if it wants by returning !OK from ODA.  Don't kill the child because of that.

@@ +468,5 @@
>  HttpChannelParent::RecvDivertOnStopRequest(const nsresult& statusCode)
>  {
> +  MOZ_ASSERT(mParentListener);
> +
> +  // XXX nullptr for context.

// We deliberately pass null for aContext here.

@@ +736,5 @@
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    return rv;
> +  }
> +  // XXX may need to add delete in this direction??
> +  //Send__delete__();

I do think we may want to tear down IPDL here.  But use SendDeleteSelf() (which tells child to Send__delete).  I'm 99% sure we could get away with directly calling Send_delete from the parent, but for most regular channels that caused subtle races when we used to do it, so I don't want it in the *.ipdl file that it's possible to send_delete directly from parent.  We don't care about efficiency here anyway.

@@ +762,5 @@
> +void
> +HttpChannelParent::StartDiversion()
> +{
> +  // Call OnStartRequest for the "DivertTo" listener.
> +  nsresult rv = mParentListener->OnStartRequest(mChannel, nullptr);

If we NS_FAILED from OnStartRequest we need to Cancel the nsHttpChannel (with the rv returned from OnStart), set mCanceled(rv), drop RecvOnDataAvail, and change RecvOnStop (if we get it) to call OnStop(rv).

::: netwerk/protocol/http/HttpChannelParentListener.cpp
@@ +38,5 @@
>                     nsIStreamListener,
>                     nsIRequestObserver,
>                     nsIChannelEventSink,
> +                   nsIRedirectResultListener,
> +                   ADivertableParentChannel)

Erm. I'm not sure that would work since ADivertableParentChannel isn't an XPCOM class. But since we don't ever QI to it (just cast to it, right?) we shouldn't need it.

@@ +54,4 @@
>      return NS_ERROR_UNEXPECTED;
>  
>    LOG(("HttpChannelParentListener::OnStartRequest [this=%p]\n", this));
> +  return mNextListener->OnStartRequest(aRequest, aContext);

Still need isPending hack.  We may be able to get away w/o it for first cut (and do it when we do Cancel/Suspend/Resume correctly).

@@ +226,5 @@
> +//-----------------------------------------------------------------------------
> +nsresult
> +HttpChannelParentListener::SuspendForDiversion()
> +{
> +  //mSuspendedForDiversion = true;

yeah looks like we may not even need that.  Maybe we'll wind up needing it for cancel/suspend/resume.

@@ +236,5 @@
> +{
> +  MOZ_ASSERT(aListener);
> +  MOZ_ASSERT(mNextListener);
> +  
> +  mNextListener = aListener;

extra whitespace above and below this line.
Attachment #8383508 - Flags: feedback+
Thanks for the r+ Jason.

(In reply to Jason Duell (:jduell) from comment #47)
> @@ +36,5 @@
> > +  //
> > +  // It is expected that the parent channel will also implement a private
> > +  // ResumeForDiversion function in addition to SuspendForDiversion.
> > +  // Similarly, the child channel should implement private Suspend|
> > +  // ResumeForDiversion functions.
> 
> I think you can just say
> 
>   // This function is provides a common way to call the ResumeForDiversion
> method for different channel types (HTTP/FTP).

I'm not sure what you mean here, but I simplified the comment just to say "Called to suspend parent channel in ChannelDiverterParent constructor".
(In reply to Jason Duell (:jduell) from comment #44)
> > Will the ExternalHelperAppService ever get a channel that isn't divertable after 
> > this bug is fixed?
> 
> It's *possible* long term--addons can provide their own JS implementation
> (which usually is a thin wrapper around an nsHttpChannel, but it's still not
> going to be QI-able to nsIDivertableChannel).  But I don't think you need to
> worry about it.  I'm hoping that the code up to DivertToParent() will be
> fail-safe, so it should always return OK and you can do the IPC to the
> parent with the ChannelDiverter.  On the parent, DivertTo() could still fail
> if the child has exited by the time you call it.  In that case it's just
> cleanup time.

Wait, do you mean it's possible but I don't have to handle that case? I'm wondering if we could remove the current manual parent>child>parent forwarding of callbacks completely.
(In reply to Reuben Morais [:reuben] from comment #51)
> (In reply to Jason Duell (:jduell) from comment #44)
> 
> Wait, do you mean it's possible but I don't have to handle that case? I'm
> wondering if we could remove the current manual parent>child>parent
> forwarding of callbacks completely.

I think we would like to have one method of diverting data back to the parent in the long term, e.g. nsIDivertableChannel. And I think this is essentially what you're saying - let the channel handle the callback forwarding instead of ExternalAppHandler.

However, this fix is going in quite late, so I'd be wary of making a big change like that. It's possible that there are channels that we haven't thought about that still use ExAppHandler's manual forwarding; I don't think we want to risk breaking that functionality.

So, I suggest you add something like an NS_WARNING if the channel is not an nsIDivertableChannel, for those cases that are not HTTP and FTP. AND, add a MOZ_ASSERT to crash if any HTTP or FTP channels try to use ExAppHandler's manual forwarding: this will help us in testing and verification. And then we should come up with a long term plan to first verify that we can remove the ExAppHandler manual forwarding code, and then how and when to do it.

Jason and Gregor may disagree though.
For B2G currently we only use FTP|HTTP channels for downloads, so there should be no immediate issue if we remove the existing ExtApp code that forwards msgs back to the parent.

The issue is when we support e10s desktop (and/or addons in Firefox OS) and we have addons which provide their own protocols.  This is rare, but it seems to happen.  So the tradeoff is either 1) we fail downloads when that happens, or 2) we maintain a very rarely used codepath for ExtApp handler.

I'm not really happy with either option--I guess I lean towards keeping the existing codepath, because it works, and it won't do too much harm if it's not taken in practice (other than code obfuscation).  I could probably be persuaded either way though, especially if removing the code cleans things up a lot.  Reuben, do you have an opinion now that you've worked with the ExtApp code a bit?
Flags: needinfo?(reuben.bmo)
(In reply to Jason Duell (:jduell) from comment #53)
> I'm not really happy with either option--I guess I lean towards keeping the
> existing codepath, because it works, and it won't do too much harm if it's
> not taken in practice (other than code obfuscation).  I could probably be
> persuaded either way though, especially if removing the code cleans things
> up a lot.  Reuben, do you have an opinion now that you've worked with the
> ExtApp code a bit?

Removing this code would not be a huge clean up, so I also think keeping the code for now is the better option, specially considering the timeline of these bugs. I'll file a follow-up bug to investigate how common custom protocols are and decide if we want to remove this code in the future.
Flags: needinfo?(reuben.bmo)
(In reply to Jason Duell (:jduell) from comment #48)
> Comment on attachment 8383507 [details] [diff] [review]
> Part 2 WIP 0.4 PChannelDiverter and modifications to PHttpChannel
>
> > +bool
> > +ChannelDiverterChild::Init(const ChannelDiverterArgs& aChannel)
> 
> I don't think we need an Init() method?  AFAICT we don't ever need to
> reference the channel on the child.

You're right. Poof! It's gone!

> ::: netwerk/base/src/ChannelDiverterParent.cpp
> @@ +35,5 @@
> > +  }
> > +  case ChannelDiverterArgs::TPFTPChannelParent:
> > +  {
> > +    mDivertableChannelParent =
> > +      dynamic_cast<ADivertableParentChannel*>(aChannel.get_PFTPChannelParent());
> 
> I'm pretty sure Reuben is right about the dynamic casts--I wasn't aware we
> ever used them...

Yeah, I wasn't sure either; hadn't seen them much. But then a search showed up http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsAHttpTransaction.h#118

But I removed them anyway for the chained static_casts. We can pretty much guarantee that they should work.
Attachment #8383506 - Attachment is obsolete: true
Attachment #8383507 - Attachment is obsolete: true
Attachment #8385626 - Flags: review?(jduell.mcbugs)
(In reply to Jason Duell (:jduell) from comment #49)
> Comment on attachment 8383508 [details] [diff] [review]
> Part 3 WIP 0.2 Implement nsI|ADivertableChannel in HttpChannelChild|Parent
>
> > +  // XXX SendOnTransportAndData or just SendDivertOnDataAvailable?
> 
> Good question.  I'm not sure of the answer.  nsHttpChannel already calls
> OnStatus/OnProgress for all the bits that are sent to the child, so in some
> sense observers have already been notified and it would be weird to notify
> them again.  OTOH it might be possible that the download manager gets these
> notifications and only signs up for them once OnStartRequest has been called
> (which we won't do until diverting has started).  IIRC the OnProgress API
> gives you how many bytes have been downloaded (not just a delta) so it
> shouldn't matter if a client misses some of them.  
> 
> So let's just send ODA back for now, and see if things break.  I'm reluctant
> to double-notify OnProgress/Status w/o evidence that we need to.

Sounds good. Current code is working with my minimal ExAppHelper patch, but the spinner on the tab still goes round after a download completes. Otherwise, the download panel says the file is downloaded. So, not sure what's going on there, but hoping it's my minimal ExAppHelper patch.

> > +  //mFlushedForDiversion = true;
> 
> Don't need mFlushedForDiversion?

I looked at this and some other status flags - I've played around with setting them and checking them. Take a look and see what you think - I think I've covered the main assertion points for each one.

> >  HttpChannelChild::Suspend()
> >  {
> >    NS_ENSURE_TRUE(RemoteChannelExists(), NS_ERROR_NOT_AVAILABLE);
> > +  if (!mDivertingToParent || !mSuspendCount++) {
> 
> Important!  This needs to be && not || :)
> 
> Also let's invert the order of the operands to
> 
>   !mSuspendCount++ && !mDivertingToParent
> 
> Reads better.  Also I think it's mildly brain-melting to let the
> mSuspendCount for the channel != that for the mEventQueue.

So, I fixed the order and || to && (d'oh, Stephen!), but I'm not sure what you meant about brain melting. Perhaps mine is already a puddle?

> > +    return false;
> 
> "return false" from an IPDL method kills the entire child process.  That's a
> bit harsh here.

Zero tolerance, man! ;) Fixed - returning silently and not taking down IPDL now.

> > +HttpChannelParent::StartDiversion()
> > +{
> > +  // Call OnStartRequest for the "DivertTo" listener.
> > +  nsresult rv = mParentListener->OnStartRequest(mChannel, nullptr);
> 
> If we NS_FAILED from OnStartRequest we need to Cancel the nsHttpChannel
> (with the rv returned from OnStart), set mCanceled(rv), drop
> RecvOnDataAvail, and change RecvOnStop (if we get it) to call OnStop(rv).

Thanks for that. Added those in.

> ::: netwerk/protocol/http/HttpChannelParentListener.cpp
> @@ +38,5 @@
> >                     nsIStreamListener,
> >                     nsIRequestObserver,
> >                     nsIChannelEventSink,
> > +                   nsIRedirectResultListener,
> > +                   ADivertableParentChannel)
> 
> Erm. I'm not sure that would work since ADivertableParentChannel isn't an
> XPCOM class. But since we don't ever QI to it (just cast to it, right?) we
> shouldn't need it.

Indeed. Sometimes global replace works. Sometimes not.

> @@ +54,4 @@
> >      return NS_ERROR_UNEXPECTED;
> >  
> >    LOG(("HttpChannelParentListener::OnStartRequest [this=%p]\n", this));
> > +  return mNextListener->OnStartRequest(aRequest, aContext);
> 
> Still need isPending hack.  We may be able to get away w/o it for first cut
> (and do it when we do Cancel/Suspend/Resume correctly).

Added. I did the hacked version in nsHttpChannel, overriding HttpBaseChannel::IsPending. I played with a wrapper class as Honza had suggested, but I didn't like it. Lots of forwarding and a lot of code to keep in sync with nsHttpChannel's class definition. For now, for speed, hacking on nsHttpChannel seems quicker.
Attachment #8383508 - Attachment is obsolete: true
Attachment #8385627 - Flags: review?(jduell.mcbugs)
So, patches updated; everything seems to build on my mac with a repo updated yesterday; and downloads work in an e10s window with one caveat - the spinner doesn't stop.

Also, I haven't worked on the extra suspend/resume/cancel code yet. I've been reading that as a follow up, but maybe I'll get time to do something this week. That depends on what testing we can get done.

Reuben, please try these patches with yours and let me know if you see any problems. The never-stopping-spinner might be to do with my really basic ExAppHelper patch. Or it might be something more serious. So, it would be great to get some feedback from you.
(In reply to Steve Workman [:sworkman] from comment #58)
> Reuben, please try these patches with yours and let me know if you see any
> problems. The never-stopping-spinner might be to do with my really basic
> ExAppHelper patch. Or it might be something more serious. So, it would be
> great to get some feedback from you.

Yeah, after diverting we need to remove the request from the tab's docloader/loadGroup. That's what causes the never stopping spinner thing. Also, without fixing that, clicking stop on the tab or start a new download kills the download. I'm gonna try your updated patches.
> I haven't worked on the extra suspend/resume/cancel code yet. I've been reading 
> that as a follow up

A quick grep of uriloader/exthandler shows me that suspend is not implemented in nsExternalProtocolHandler.cpp.  And nsDownload::Pause() winds up Cancel()ing the channel.  So I suspect we're OK for this to be a followup.

> after diverting we need to remove the request from the tab's docloader/loadGroup

+1, yes the channel in the child that's diverted must be removed from the loadGroup. Otherwise the loadGroup will consider itself not finished (spinning) until we call OnStopRequest (which we never do). 

>  without fixing that, clicking stop on the tab or start a new download kills the download

Well, not after we have Divert() working, right?  We should be able to cancel in the child after Divert is complete and even if we forgot to remove from the loadGroup, the new listener in the parent shouldn't be canceled.  Anyway, should be a moot point :)

> I'm not sure what you meant about brain melting.

Oh, it was just that with !mDiverting as the 1st param there, we'd wind up not incrementing mSuspendCount in the channel, but then after the if loop we'd call mEventQ->Suspend, so the channel and it's eventQ would have different suspendCounts.  It wouldn't actually break anything (resume would work the same way), but it's not intuitive.
> And nsDownload::Pause() winds up Cancel()ing the channel. 
> So I suspect we're OK for this to be a followup.

I was sleepy when I wrote that--of course that means that we *will* have a narrow window where if the channel on the parent is canceled after DivertTo() but before the child has finished returning data to the parent, we'll deliver OnData/Stop regardless.  But as I've mentioned I expect that race to be hard to hit in practice (user would need to kill download almost instantly).  Followup.
Comment on attachment 8385626 [details] [diff] [review]
Part 2 v1.0 PChannelDiverter and modifications to PHttpChannel

Review of attachment 8385626 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.
Attachment #8385626 - Flags: review?(jduell.mcbugs) → review+
Comment on attachment 8385627 [details] [diff] [review]
Part 3 v1.0 Implement nsI|ADivertableChannel in HttpChannelChild|Parent

Review of attachment 8385627 [details] [diff] [review]:
-----------------------------------------------------------------

Getting very close!  All we need is FTP and some nits...

I don't see code that Release()s mListener on the child after DivertToParent() has been called.  As pointed out in bug 946239 comment 25, we should hold off on actually dropping the reference to the listener until OnStartRequest has finished (so the listener won't go away in the middle of OnStartRequest after it calls DivertToParent). 

I merrily inserted lots of "CRASH" comments in the comments below, on the theory that most B2G devices never run debug code, so we're better off getting a hard crash than a missed assert plus who-knows what state before we figure out what's wrong.  Let me refine that position: I think we should still crash in the child, but not in the parent (it's not worth taking the whole phone down).

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +365,5 @@
> +  if (NS_WARN_IF(mFlushedForDiversion)) {
> +    MOZ_ASSERT(!mFlushedForDiversion,
> +               "Should not be receiving any more callbacks from parent!");
> +    // Return silently; don't take down IPDL.
> +    return true;

might as well MOZ_CRASH here--if we get here our code is badly broken.  Or you can just keep assert and return false, which does much the same thing.

@@ +373,5 @@
>      mEventQ->Enqueue(new TransportAndDataEvent(this, status, progress,
>                                                progressMax, data, offset,
>                                                count));
>    } else {
> +    MOZ_ASSERT(!mDivertingToParent, "ShouldEnqueue when diverting to parent!");

crash

@@ +393,5 @@
>  
> +  // For diversion to parent, just SendDivertOnDataAvailable.
> +  if (mDivertingToParent) {
> +    if (NS_WARN_IF(mFlushedForDiversion)) {
> +      MOZ_ASSERT(!mFlushedForDiversion,

CRASH here too.

At the risk of angering the new coding style gods, since we're crashing so much, maybe a #define CRASH_IF(condition, "msg") macro? (hey we're not return-ing so we're not hiding a return statement!)

Also no need to WARN_IF before a CRASH.  CRASH gets all the attention we need to get noticed :)

@@ +481,5 @@
> +  if (NS_WARN_IF(mFlushedForDiversion)) {
> +    MOZ_ASSERT(!mFlushedForDiversion,
> +               "Should not be receiving any more callbacks from parent!");
> +    // Return silently; don't take down IPDL.
> +    return true;

CRASH or return false

@@ +488,4 @@
>    if (mEventQ->ShouldEnqueue()) {
>      mEventQ->Enqueue(new StopRequestEvent(this, statusCode));
>    } else {
> +    MOZ_ASSERT(!mDivertingToParent, "ShouldEnqueue when diverting to parent!");

CRASH

@@ +502,5 @@
>             this, statusCode));
>  
> +  if (mDivertingToParent) {
> +    if (NS_WARN_IF(mFlushedForDiversion)) {
> +      MOZ_ASSERT(!mFlushedForDiversion,

CRASH

@@ +1094,5 @@
> +  // SendResume only once, when suspend count drops to 0.
> +  // Don't SendResume at all if we're diverting callbacks to the parent (unless
> +  // suspend was sent earlier); otherwise, resume will be called at the correct
> +  // time in the parent itself.
> +  if (!--mSuspendCount && (!mDivertingToParent || mSuspendSent)) {

Nice catch handling the case where suspend() has been called already by the time OnStartRequest is called.  I suspect your code will even work!  It should be *very* unlikely that anything will suspend the channel before OnStartRequest is called, but you never know...

::: netwerk/protocol/http/HttpChannelChild.h
@@ +146,5 @@
>    bool mIPCOpen;
>    bool mKeptAlive;            // IPC kept open, but only for security info
>    nsRefPtr<ChannelEventQueue> mEventQ;
>  
> +  // Once set, OnData and OnStop will be diverted to the parent channel.

... and possibly OnStop...

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +42,5 @@
>    , mSentRedirect1BeginFailed(false)
>    , mReceivedRedirect2Verify(false)
>    , mPBOverride(aOverrideStatus)
>    , mLoadContext(aLoadContext)
> +  , mCanceled(NS_OK)

hmm, in nsHttpChannel we have separate bool mCanceled and nsresult mStatus.  That's to support the unusual (but used by XHR) pattern of Cancel(NS_OK) (a.k.a., "everything is fine but I don't need the data from the channel").

We shouldn't have to support that construct for the download manager... For now just rename mCanceled to mStatus and we'll add bool mCanceled if I'm wrong about that and we need the extra info.

@@ +445,5 @@
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot RecvDivertOnDataAvailable if diverting is not set!");
> +    // Return silently; don't take down IPDL channel.
> +    return true;

Bad error, so return false and kill the child.

@@ +461,5 @@
> +    if (mChannel) {
> +      mChannel->Cancel(rv);
> +    }
> +    // If we failed this, we should set mCanceled to drop future
> +    // OnDataAvailables.

scrap comment--implied.

@@ +474,5 @@
> +    if (mChannel) {
> +      mChannel->Cancel(rv);
> +    }
> +    // If we failed this, we should set mCanceled to drop future
> +    // OnDataAvailables.

ditto

@@ +489,5 @@
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot RecvDivertOnStopRequest if diverting is not set!");
> +    // Return silently; don't take down IPDL channel.
> +    return true;

return false

@@ +493,5 @@
> +    return true;
> +  }
> +
> +  // If statusCode is a failure, then we should use it; otherwise, we should use
> +  // the status stored in mCanceled.

My read of 

  http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp#5041

is that we should prefer the status used by Cancel() on the parent (not yet implemented) or error code returned from OnStart/Data, over an error status from the child channel.  So

   status = (NS_FAILED(mStatus)) ? mStatus : statusCode;

@@ +503,5 @@
> +  }
> +
> +  // Reset diverting status to unset fake pending status in the channel.
> +  if (mChannel) {
> +    static_cast<nsHttpChannel*>(mChannel.get())->SetDiverting(false);

hmm, maybe to keep all the static_cast of mChannel centralized, we should just make mChannel an nsRefPtr<nsHttpChannel> and init it here: 

  http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/HttpChannelParent.cpp#186

@@ +519,5 @@
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot RecvDivertComplete if diverting is not set!");
> +    // Return silently; don't take down IPDL channel.
> +    return true;

return false

@@ +857,5 @@
> +  }
> +
> +  // Call OnStartRequest for the "DivertTo" listener.
> +  nsresult rv = mParentListener->OnStartRequest(mChannel, nullptr);
> +  if (NS_WARN_IF(NS_FAILED(rv))) {

No need for NS_WARN_IF here: it's not unusual for channel to be canceled this way.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +6186,5 @@
> +NS_IMETHODIMP
> +nsHttpChannel::IsPending(bool *aIsPending)
> +{
> +  NS_ENSURE_ARG_POINTER(aIsPending);
> +  *aIsPending = mIsPending || mDiverting;

My only complain is that the way you've written this, we'll just wind up needing to rename SetDiverting/mDiverting to something like ForcePending in order to also fix bug 748117.  Not a big deal but maybe we should just fix that bug now too.
Attachment #8385627 - Flags: review?(jduell.mcbugs) → feedback+
Comment on attachment 8385625 [details] [diff] [review]
Part 1 v1.0 nsIDivertableChannel and ADivertableParentChannel

Review of attachment 8385625 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/public/nsIDivertableChannel.idl
@@ +46,5 @@
> +   *
> +   *    After divertToParent() has been called, NO further function calls
> +   *    should be made on the channel.  It is a dead object for all purposes.
> +   *    The reference that the channel holds to the listener in the child is
> +   *    released when this method is called.

"is released once OnStartRequest completes, and no other nsIStreamListener calls (OnDataAvailable, OnStopRequest) are made to it."
Attachment #8385625 - Attachment is obsolete: true
Attachment #8386406 - Flags: review+
Most recent patches uploaded. Not as much progress as I would have liked today. I had pushed a try run earlier, just with these three patches, to try to catch any regressions. There was a basic one, but it took some time to figure out that it was basic. I have pushed another try job, and there are a couple more things, but I think the problems should be easy to fix.

https://tbpl.mozilla.org/?tree=Try&rev=8dee28bf1498

As far as todo items, I see the following:
-- Add FTP
-- Test test test.

I'm still hopeful that we can have something in there by Friday, it's just the amount of testing we get done on it before handing off to QA that's an issue. But let's see what happens.
Attachment #8385627 - Attachment is obsolete: true
Quick update:
-- FTP coded and building locally; will test with my little ExAppHelper patch later this evening (it's 5:20pm with me now).
-- Tried fixing those Linux build errors; not reproducible in my VM - sigh.
-- Pushed to try again:
https://tbpl.mozilla.org/?tree=Try&rev=fdb22862f794

I'll upload the patches later once I've had a chance to look at the try results for regressions and do a bit of local testing.

We're getting there...
Attachment #8386406 - Attachment is obsolete: true
Attachment #8387205 - Flags: review+
Attachment #8386412 - Attachment is obsolete: true
Attachment #8387208 - Flags: review?(jduell.mcbugs)
Overall patch status:
-- FTP done.
-- FTP and HTTP working locally, in a Mac e10s window, using my basic ExAppHelper patch.
-- Pushed to try again: https://tbpl.mozilla.org/?tree=Try&rev=fdd66b150515
Reminder: I couldn't reproduce the Linux build errors locally earlier, so there may be more build errors to come. But we can work on that.

Reuben, it would be great if you could test these patches, especially for FTP downloads.
Attachment #8387215 - Flags: review?(jduell.mcbugs)
I'm not sure why gcc can't distinguish between overriding the "bool IsPending()" from the XPCOM "nsresult IsPending(bool*)". But it was easy enough to change the former to just "bool Pending()" :)

The HTTP error "mActiveChannel not declared" is weirder--looking into it.
Attachment #8387406 - Flags: review?(sworkman)
never mind the mActiveChannel comment--that was just bitrot on my box.

I think this patch should (fingers crossed) work:

  https://tbpl.mozilla.org/?tree=Try&rev=7882f86d845a
Attachment #8387406 - Attachment is obsolete: true
Attachment #8387406 - Flags: review?(sworkman)
Attachment #8387476 - Flags: review?(sworkman)
Comment on attachment 8387476 [details] [diff] [review]
Fix FTP and missing nsIStreamListener declaration.

Review of attachment 8387476 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Also looks like 2.30am in the morning with you. That's a dedicated manager.

r=me.

::: netwerk/base/src/nsBaseChannel.h
@@ +161,2 @@
>      return mPump || mWaitingOnAsyncRedirect;
> + }

Indentation change with closing brace.
Attachment #8387476 - Flags: review?(sworkman) → review+
Woohoo! Linux is building!
I tested the new versions with the patch in bug 946239, and things work, but we are leaking a JSRuntime somewhere. I'm checking if it's actually related to the new patches here. Oh, and it looks like FTP downloads keep trying to connect to the server after they started? Is that normal? If I deny subsequent requests in my firewall I get an alert saying "550 Failed to change directory", but the download still works fine.
Thanks for testing Reuben.

(In reply to Reuben Morais [:reuben] from comment #77)
> I tested the new versions with the patch in bug 946239, and things work,

That's good :) ...

> but we are leaking a JSRuntime somewhere. I'm checking if it's actually related
> to the new patches here.

I'll apply the patch and see what I can determine here.

> Oh, and it looks like FTP downloads keep trying to
> connect to the server after they started? Is that normal? If I deny
> subsequent requests in my firewall I get an alert saying "550 Failed to
> change directory", but the download still works fine.

I'm not super familiar with the FTP code, but it sounds suspect. Let me take a look at that too.

How are you testing, btw? e10s window or B2G?
Comment on attachment 8387208 [details] [diff] [review]
Part 3 v1.2 Implement nsI|ADivertableChannel in HttpChannelChild|Parent

Review of attachment 8387208 [details] [diff] [review]:
-----------------------------------------------------------------

Looking good. I mostly have corrections for how we handle edge case error handling (lots of ASSERTs that should never really fail).  So most of this should possibly not even hold landing up, and could be a followup if needed.  See what you can get to.

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +232,5 @@
>                                       const NetAddr& peerAddr)
>  {
> +  // mFlushedForDiversion and mDivertingToParent should NEVER be set at this
> +  // stage, as they are set in the listener's OnStartRequest.
> +  MOZ_RELEASE_ASSERT(!mFlushedForDiversion,

I didn't even know about MOZ_RELEASE_ASSERT--good to know.

@@ +867,5 @@
> +  {
> +    // We must flush the queue before remove ourselves from the load group, in
> +    // preparation for RecvDeleteSelf (although we really shouldn't receive any
> +    // msgs after FlushedForDiversion). So make sure this goes out of scope
> +    // before then.

I don't understand this comment: why do we need the AutoEventEnqueuer here? (No more messages should be arriving until IPDL shutdown). Why would we need to finish flushing before we remove the request from the LoadGroup? And if that's true, why are you calling LoadGroup->RemoveRequest while the enqueuer is still in scope?  That means the dequeuing happens after the load group removal.

I don't think the enqueuer is doing any harm, but I also don't think we need it, so I'd remove it and the comment, unless I'm misunderstanding.

@@ +871,5 @@
> +    // before then.
> +    AutoEventEnqueuer ensureSerialDispatch(mEventQ);
> +
> +    mListener = 0;
> +    mListenerContext = 0;

I don't think we necessarily need to wait this long to null out the listener/context, but I also don't see any reason why it should hurt.  We could do it as soon as the call to OnStartRequest (where DivertToParent was called) finishes.  But again, maybe this is fine.

@@ +874,5 @@
> +    mListener = 0;
> +    mListenerContext = 0;
> +    mCacheEntryAvailable = false;
> +    if (mLoadGroup) {
> +      mLoadGroup->RemoveRequest(this, nullptr, mStatus);

We could probably remove the channel from the loadgroup that early too. (and again, prob fine to do it here instead).

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +507,5 @@
> +  mParentListener = nullptr;
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot RecvDivertComplete if diverting is not set!");
> +    return false;

technically, in all of these places where we kill the child by return false we also need to make sure that we call OnStart/Stop on the listener.  They're all very unlikely, but if you wind up adding the infrastructure to do that easily (see other comments) it might be easy enough to add a one-line function call that does it.

@@ +603,5 @@
>  
> +  if (NS_WARN_IF(mDivertingFromChild)) {
> +    MOZ_ASSERT(!mDivertingFromChild,
> +               "Cannot call OnStopRequest if diverting is set!");
> +    return NS_ERROR_UNEXPECTED;

Need to make sure OnStop gets called in release mode.  (or we could use MOZ_RELEASE_ASSERT for this sort of case--there are lots more I point out.  I could be persuaded either way.)

@@ +625,5 @@
>  {
>    LOG(("HttpChannelParent::OnDataAvailable [this=%p]\n", this));
>  
> +  if (NS_WARN_IF(mDivertingFromChild)) {
> +    MOZ_ASSERT(!mDivertingFromChild,

ditto

@@ +794,5 @@
> +  MOZ_ASSERT(mChannel);
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot ResumeForDiversion if not diverting!");
> +    return NS_ERROR_UNEXPECTED;

you're not paying attention to the error that gets returned here (in RecvComplete).  Some way or another if anything goes wrong we need to 1) make sure OnStart + Stop have been called, and 2) IPDL gets shut down.

@@ +801,5 @@
> +  // Try resuming the channel. Allow it to fail since OnStopRequest may have
> +  // been called before the channel was diverted, and thus the earlier suspend
> +  // would have failed.
> +  nsresult rv = mChannel->Resume();
> +  if (NS_WARN_IF(NS_FAILED(rv) && rv != NS_ERROR_UNEXPECTED)) {

keep a mCalledOnStopFromChild variable?  And we can 1) skip Resume if it's true, and 2) Assert if for some very unlikely reason Resume() fails (and also, for release mode, generate our own OnStop(UNEXPECTED) in that case so that we still abide by the contract with the listener.)

@@ +802,5 @@
> +  // been called before the channel was diverted, and thus the earlier suspend
> +  // would have failed.
> +  nsresult rv = mChannel->Resume();
> +  if (NS_WARN_IF(NS_FAILED(rv) && rv != NS_ERROR_UNEXPECTED)) {
> +    return rv;

Don't return w/o destroying IPDL

@@ +822,5 @@
> +
> +  DebugOnly<nsresult> rv = mParentListener->DivertTo(aListener);
> +  MOZ_ASSERT(NS_SUCCEEDED(rv));
> +
> +  if (NS_WARN_IF(!SendFlushedForDiversion())) {

see SendFOO comment below.  I think that since we'll only have this function called during a RecvFOO method from an IPDL message, it may be safe to not check mIPCClosed here.  But I'd check it anyway--it's technically possible a client could call this at some random later time.

If mIPCClosed or SendFlushed fails, then returning a failure here works: any time after DivertTo returns, though, we need to guarantee OnStart/Stop will get called.

@@ +838,5 @@
> +{
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot StartDiversion if diverting is not set!");
> +    return;

see SendFOO comment below

@@ +853,5 @@
> +    if (mChannel) {
> +      mChannel->Cancel(rv);
> +    }
> +    mStatus = rv;
> +    return;

looks like this isn't right here--we'll cancel the nsHttpChannel (good) but then return without resuming it (and for all we know it already sent OnStop to the child).  I think if we get rid of the "return" here everything should work OK, though.

@@ +858,5 @@
> +  }
> +
> +  // After OnStartRequest has been called, tell HttpChannelChild to divert the
> +  // OnDataAvailables and OnStopRequest to this HttpChannelParent.
> +  if (NS_WARN_IF(!SendDivertMessages())) {

Generally, any SendFOO to child must check mIPCClosed first:

  if (NS_WARN_IF(mIPCClosed || !SendDivertMessages()))

(sending IPDL message to a dead child kills the parent--at least it did as of a year ago--rather harsh IMO but hey :)

That said, I *think* in this case HttpChannelParent would have been deleted if IPDL goes away (the reference to it in HttpChannelParentListener will have been replaced with the DivertedTo listener, so the IPDL ref should be the only one left AFAICT).  But can't hurt to check.

returning if this happens isn't enough--we'll leave the listener in limbo forever.  Instead we should ideally Resume() the channel if it's *really* pending, and if it's not (i.e. it's already called OnStop -> Child), we should call listener->OnStop().  Not sure what error code to use--NS_ERROR_UNEXPECTED is probably good enough.  Ideally we'd do this "resume or generate OnStop" in *all* places where we fail in some way that doesn't continue the child (such as the ASSERT(!mDivertingFromChild) at start of this function).  Think about it--should be easy to just make it a function ("FailChannel") that you can call in all those places.  But make sure that you also call OnStart iff it hasn't been called yet.

::: netwerk/protocol/http/HttpChannelParentListener.cpp
@@ +49,5 @@
>  {
> +  if (NS_WARN_IF(mSuspendedForDiversion)) {
> +    MOZ_ASSERT(!mSuspendedForDiversion,
> +               "Cannot call OnStartRequest if suspended for diversion!");
> +    return NS_ERROR_UNEXPECTED;

Hmm.  Not enough to just return error here in release mode: we have to still call nextListener->OnStart/Stop.

@@ +65,5 @@
>                                            nsISupports *aContext,
>                                            nsresult aStatusCode)
>  {
> +  if (NS_WARN_IF(mSuspendedForDiversion)) {
> +    MOZ_ASSERT(!mSuspendedForDiversion,

ditto

@@ +93,5 @@
>                                              uint64_t aOffset,
>                                              uint32_t aCount)
>  {
> +  if (NS_WARN_IF(mSuspendedForDiversion)) {
> +    MOZ_ASSERT(!mSuspendedForDiversion,

ditto

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +6188,5 @@
> +nsHttpChannel::IsPending(bool *aIsPending)
> +{
> +  NS_ENSURE_ARG_POINTER(aIsPending);
> +  *aIsPending = mIsPending || mForcePending;
> +  return NS_OK;

nit: 4 space indenting please
Attachment #8387208 - Flags: review?(jduell.mcbugs) → feedback+
(In reply to Steve Workman [:sworkman] from comment #78)
> How are you testing, btw? e10s window or B2G?

An OOP b2g-desktop build (set dom.ipc.tabs.disabled to false).
Oh, and I confirmed that the leak only happens if you download something over FTP.
Attachment #8387208 - Attachment is obsolete: true
Attachment #8387632 - Flags: review?(jduell.mcbugs)
Updated based on most recent comments for HttpChannel patch.

Doesn't include Jason's patch for build fixes.
Attachment #8387215 - Attachment is obsolete: true
Attachment #8387215 - Flags: review?(jduell.mcbugs)
Attachment #8387633 - Flags: review?(jduell.mcbugs)
Those updated patches are just for Jason's comments; I'm just starting to look at the leak now.
And here's a try push to make sure the changes are breaking anything:

https://tbpl.mozilla.org/?tree=Try&rev=49585555b5ea
Fixed the leak. Wasn't deleting either ChannelDiverterChild or ChannelDiverterParent in NeckoChild|Parent.cpp.

Reuben, please verify if you get time.
Attachment #8387206 - Attachment is obsolete: true
Attachment #8387711 - Flags: review?(jduell.mcbugs)
That fixed the leak, Steve. I did see a crash while downloading a file over FTP, though. Sadly I only had a debugger attached to the child process, and couldn't reproduce it, but here's the stacktrace:

mozilla::net::FTPChannelParent::SuspendForDiversion() + 500 (FTPChannelParent.cpp:416)
non-virtual thunk to mozilla::net::FTPChannelParent::SuspendForDiversion() + 28 (Unified_cpp_netwerk_protocol_ftp0.cpp:423)
mozilla::net::ChannelDiverterParent::Init(mozilla::net::ChannelDiverterArgs const&) + 453 (ChannelDiverterParent.cpp:48)
mozilla::net::NeckoParent::RecvPChannelDiverterConstructor(mozilla::net::PChannelDiverterParent*, mozilla::net::ChannelDiverterArgs const&) + 41 (NeckoParent.cpp:623)
mozilla::net::PNeckoParent::OnMessageReceived(IPC::Message const&) + 13382 (.PNeckoParent.cpp:1340)
mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) + 190 (.PContentParent.cpp:1987)
mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) + 276 (MessageChannel.cpp:1141)
mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message const&) + 207 (MessageChannel.cpp:1055)
mozilla::ipc::MessageChannel::OnMaybeDequeueOne() + 526 (MessageChannel.cpp:1038)
void DispatchToMethod<mozilla::ipc::MessageChannel, bool (mozilla::ipc::MessageChannel::*)()>(mozilla::ipc::MessageChannel*, bool (mozilla::ipc::MessageChannel::*)(), Tuple0 const&) + 131 (tuple.h:384)
RunnableMethod<mozilla::ipc::MessageChannel, bool (mozilla::ipc::MessageChannel::*)(), Tuple0>::Run() + 78 (task.h:308)
mozilla::ipc::MessageChannel::RefCountedTask::Run() + 40 (MessageChannel.h:378)
mozilla::ipc::MessageChannel::DequeueTask::Run() + 36 (MessageChannel.h:395)
MessageLoop::RunTask(Task*) + 96 (message_loop.cc:345)
MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) + 79 (message_loop.cc:355)
MessageLoop::DoWork() + 292 (message_loop.cc:452)
mozilla::ipc::DoWorkRunnable::Run() + 146 (MessagePump.cpp:229)
nsThread::ProcessNextEvent(bool, bool*) + 1561 (nsThread.cpp:644)
NS_ProcessPendingEvents(nsIThread*, unsigned int) + 154 (nsThreadUtils.cpp:210)
nsBaseAppShell::NativeEventCallback() + 201 (nsBaseAppShell.cpp:99)
nsAppShell::ProcessGeckoEvents(void*) + 440 (nsAppShell.mm:388)
Reuben, I don't see this leak with Desktop Firefox using an e10s Window. Hmmm. I assume when you say that you couldn't reproduce it it only happened once? If it's very intermittent like that, then I think we're ok attacking it on Monday.
Oh, and thanks for testing and verifying the leak fix!

FYI: I'll be mostly offline until Monday morning anyway.
(In reply to Steve Workman [:sworkman] from comment #88)
> Reuben, I don't see this leak with Desktop Firefox using an e10s Window.
> Hmmm. I assume when you say that you couldn't reproduce it it only happened
> once? If it's very intermittent like that, then I think we're ok attacking
> it on Monday.

Yep, only once, and I agree.
Awesome. Currently, we're hoping to land something Sunday evening Paris time. And then we can work on follow ups etc.
(In reply to Steve Workman [:sworkman] from comment #91)
> Awesome. Currently, we're hoping to land something Sunday evening Paris
> time. And then we can work on follow ups etc.

Sounds great! Thanks for the update.
Comment on attachment 8387205 [details] [diff] [review]
Part 1 v1.2 nsIDivertableChannel and ADivertableParentChannel

Review of attachment 8387205 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/src/ADivertableParentChannel.h
@@ +28,5 @@
> +  // added and kept until OnStopRequest has completed.
> +  //
> +  // Note: DivertTo() can fail if the child process has died.  In that case the
> +  // channel should be considered dead (it will not provide any OnStart/Stop/
> +  // Data messages).

Get rid of comment about DivertTo() failing.  We're making it infallible.

@@ +29,5 @@
> +  //
> +  // Note: DivertTo() can fail if the child process has died.  In that case the
> +  // channel should be considered dead (it will not provide any OnStart/Stop/
> +  // Data messages).
> +  virtual nsresult DivertTo(nsIStreamListener *aListener) = 0;

Change return type to void.  (nothing to do in ChannelDiverter class--it was already assuming DivertTo never fails! :)
Comment on attachment 8387632 [details] [diff] [review]
Part 3 v1.3 Implement nsI|ADivertableChannel in HttpChannelChild|Parent

Review of attachment 8387632 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +447,5 @@
> +  MOZ_ASSERT(mParentListener);
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot RecvDivertOnDataAvailable if diverting is not set!");
> +    ChildErrorDuringDiversion(NS_ERROR_UNEXPECTED);

Rename ChildErrorDuringDiversion --> FailDiversion().

@@ +538,5 @@
> +  if (NS_WARN_IF(mDivertingFromChild)) {
> +    MOZ_ASSERT(!mDivertingFromChild,
> +               "Cannot call OnStartRequest if diverting is set!");
> +    return NS_ERROR_UNEXPECTED;
> +  }

Violates contract with listener--we'll never call listener->OnStartRequest.

This should never happen--let's just do MOZ_RELEASE_ASSERT.

@@ +614,5 @@
> +  if (NS_WARN_IF(mDivertingFromChild)) {
> +    MOZ_ASSERT(!mDivertingFromChild,
> +               "Cannot call OnStopRequest if diverting is set!");
> +    return NS_ERROR_UNEXPECTED;
> +  }

same--MOZ_RELEASE_ASSERT.

@@ +638,5 @@
> +  if (NS_WARN_IF(mDivertingFromChild)) {
> +    MOZ_ASSERT(!mDivertingFromChild,
> +               "Cannot call OnDataAvailable if diverting is set!");
> +    return NS_ERROR_UNEXPECTED;
> +  }

MOZ_RELEASE_ASSERT

@@ +808,5 @@
> +    return NS_ERROR_UNEXPECTED;
> +  }
> +
> +  if (!mDivertedOnStopRequest) {
> +    // Try resuming the channel.

Change comment to

  // The nsHttpChannel will deliver remaining OnData/OnStop for the transfer.

@@ +831,5 @@
> +  MOZ_ASSERT(mParentListener);
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot DivertTo new listener if diverting is not set!");
> +    return NS_ERROR_UNEXPECTED;

Let's make DivertTo always return NS_OK, and just fail via OnStart/Stop.

So call to FailDiversion() here and return NS_OK.

@@ +839,5 @@
> +  MOZ_ASSERT(NS_SUCCEEDED(rv));
> +
> +  if (NS_WARN_IF(mIPCClosed || !SendFlushedForDiversion())) {
> +    ChildErrorDuringDiversion(NS_ERROR_UNEXPECTED);
> +    return NS_ERROR_UNEXPECTED;

return NS_OK;

@@ +845,5 @@
> +
> +  // Call OnStartRequest and SendDivertMessages asynchronously to avoid
> +  // reentering client context.
> +  return NS_DispatchToCurrentThread(
> +    NS_NewRunnableMethod(this, &HttpChannelParent::StartDiversion));

I guess might as well check rv here and FailDiversion() here if needed.  Return NS_OK either way.

@@ +881,5 @@
> +  }
> +}
> +
> +void
> +HttpChannelParent::ChildErrorDuringDiversion(nsresult aErrorCode,

rename:  "FailDiversion"


since FailDiversion can be called in DivertTo(), it needs to be called asynchronously (so make this function kick off event that does the real work.

@@ +889,5 @@
> +  MOZ_RELEASE_ASSERT(mDivertingFromChild);
> +  MOZ_RELEASE_ASSERT(mParentListener);
> +  MOZ_RELEASE_ASSERT(mChannel);
> +
> +  // Try to cancel the channel and have it deliver the callbacks.

scrap comment: we also want to cancel just so status is correct.

@@ +899,5 @@
> +  nsresult rv = mChannel->IsPending(&isPending);
> +  MOZ_RELEASE_ASSERT(NS_SUCCEEDED(rv));
> +
> +  if (!aSkipResume && isPending && mDivertingFromChild) {
> +    mChannel->Resume();

We always want to resume the channel (even if it delivered OnStop already) but only if we suspended it. So we need to start keeping track of whether our Suspend() call in SuspendForDiversion actually succeeded, and check that here.  If true, always resume()--regardless of isPending--and do it before OnStartRequest call.

@@ +900,5 @@
> +  MOZ_RELEASE_ASSERT(NS_SUCCEEDED(rv));
> +
> +  if (!aSkipResume && isPending && mDivertingFromChild) {
> +    mChannel->Resume();
> +    return;

Can't return w/o calling OnStart if needed.  So move code that checks isPending and returns w/o calling OnStopRequest down below the call to OnStartRequest.

@@ +907,5 @@
> +  // Channel is not pending, so OnStopRequest was sent to the child already.
> +  // Since child is gone/going, call OnStartRequest (if necessary) and
> +  // OnStopRequest.
> +  if (!mDivertedOnStartRequest) {
> +    mParentListener->OnStartRequest(mChannel, nullptr);

surround OnStartCall with ForcePending(true) ... (false).

::: netwerk/protocol/http/HttpChannelParentListener.cpp
@@ +250,5 @@
> +{
> +  if (NS_WARN_IF(!mSuspendedForDiversion)) {
> +    MOZ_ASSERT(mSuspendedForDiversion, "Must already be suspended!");
> +    return NS_ERROR_UNEXPECTED;
> +  }

MOZ_RELEASE_ASSERT. Too complicated to try to do equivalent of FailDiversion in ParentListener.

@@ +265,5 @@
> +  MOZ_ASSERT(aListener);
> +  if (NS_WARN_IF(!mSuspendedForDiversion)) {
> +    MOZ_ASSERT(mSuspendedForDiversion, "Must already be suspended!");
> +    return NS_ERROR_UNEXPECTED;
> +  }

MOZ_RELEASE_ASSERT.  Too complicated to try to do equivalent of FailDiversion in ParentListener.
Attachment #8387632 - Flags: review?(jduell.mcbugs) → feedback+
Comment on attachment 8387633 [details] [diff] [review]
Part 4 v1.1 Implement nsI|ADivertableChannel in FTPChannelChild|Parent

Review of attachment 8387633 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/ftp/FTPChannelChild.cpp
@@ +343,5 @@
>                                     const uint32_t& count)
>  {
>    LOG(("FTPChannelChild::RecvOnDataAvailable [this=%p]\n", this));
>  
> +  // For diversion to parent, just SendDivertOnDataAvailable.

scrap comment

@@ +537,5 @@
> +    mListenerContext = nullptr;
> +
> +    if (mLoadGroup) {
> +      mLoadGroup->RemoveRequest(this, nullptr, mStatus);
> +    }

Looks like you already null out mListener and remove from Loadgroup in RecvOnStartRequest, so remove all this code.

@@ +547,5 @@
>  {
> +  MOZ_RELEASE_ASSERT(mDivertingToParent);
> +
> +  if (NS_WARN_IF(mSuspendCount <= 0)) {
> +    MOZ_RELEASE_ASSERT(mSuspendCount > 0);

No need NS_WARN_IF--just RELEASE_ASSERT.

::: netwerk/protocol/ftp/FTPChannelParent.cpp
@@ +218,5 @@
> +    if (mChannel) {
> +      mChannel->Cancel(rv);
> +    }
> +    mStatus = rv;
> +    return true;

drop return true, just fall down to return true on next line :)

@@ +278,5 @@
> +  if (mDivertingFromChild) {
> +    if (NS_WARN_IF(!mDivertToListener)) {
> +      MOZ_ASSERT(mDivertToListener, "Cannot divert if listener is unset!");
> +      return NS_ERROR_UNEXPECTED;
> +    }

just MOZ_RELEASE_ASSERT(mDivertToListener)

@@ +332,5 @@
>  
> +  if (mDivertingFromChild) {
> +    if (NS_WARN_IF(!mDivertToListener)) {
> +      MOZ_ASSERT(mDivertToListener, "Cannot divert if listener is unset!");
> +      return NS_ERROR_UNEXPECTED;

ditto

@@ +360,5 @@
> +
> +  if (mDivertingFromChild) {
> +    if (NS_WARN_IF(!mDivertToListener)) {
> +      MOZ_ASSERT(mDivertToListener, "Cannot divert if listener is unset!");
> +      return NS_ERROR_UNEXPECTED;

ditto

@@ +444,5 @@
> +    return NS_ERROR_UNEXPECTED;
> +  }
> +
> +  if (!mDivertedOnStopRequest) {
> +    // Try resuming the channel.

scrap comment.

@@ +453,5 @@
> +      return rv;
> +    }
> +  }
> +
> +  if (NS_WARN_IF(NS_FAILED(Delete()))) {

comment

  // Delete() will tear down IPDL, but ref from underlying nsFTPChannel will keep us alive if there's more data to be delivered to listener.

@@ +467,5 @@
> +  MOZ_ASSERT(aListener);
> +  if (NS_WARN_IF(!mDivertingFromChild)) {
> +    MOZ_ASSERT(mDivertingFromChild,
> +               "Cannot DivertTo new listener if diverting is not set!");
> +    return NS_ERROR_UNEXPECTED;

FailDiversion()

make function return void.

@@ +480,5 @@
> +
> +  // Call OnStartRequest and SendDivertMessages asynchronously to avoid
> +  // reentering client context.
> +  return NS_DispatchToCurrentThread(
> +    NS_NewRunnableMethod(this, &FTPChannelParent::StartDiversion));

FailDiversion if that fails.

@@ +517,5 @@
> +}
> +
> +void
> +FTPChannelParent::ChildErrorDuringDiversion(nsresult aErrorCode,
> +                                            bool aSkipResume)

Same changes as for HTTP.

::: netwerk/protocol/ftp/FTPChannelParent.h
@@ +51,5 @@
> +  nsresult ResumeForDiversion();
> +
> +  // In case of IPDL error during diversion from child, this function tries to
> +  // resume the channel to send OnStart|Stop with an error code. If resume
> +  // fails, OnStart|Stop are called directly.

Handles calling OnStart/Stop if there are errors during diversion.
Attachment #8387633 - Flags: review?(jduell.mcbugs) → feedback+
Attachment #8387205 - Attachment is obsolete: true
Attachment #8387632 - Attachment is obsolete: true
Attachment #8388591 - Flags: review?(jduell.mcbugs)
Attachment #8387633 - Attachment is obsolete: true
Attachment #8388594 - Flags: review?(jduell.mcbugs)
Merged build fixes from attachment 8387476 [details] [diff] [review].
Attachment #8387711 - Attachment is obsolete: true
Attachment #8387711 - Flags: review?(jduell.mcbugs)
Attachment #8388600 - Flags: review+
Merged build fixes from attachment 8387476 [details] [diff] [review].
Attachment #8387476 - Attachment is obsolete: true
Attachment #8388594 - Attachment is obsolete: true
Attachment #8388594 - Flags: review?(jduell.mcbugs)
Attachment #8388601 - Flags: review?(jduell.mcbugs)
Comment on attachment 8388591 [details] [diff] [review]
Part 3 v1.4 Implement nsI|ADivertableChannel in HttpChannelChild|Parent

Review of attachment 8388591 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +929,5 @@
> +  bool isPending = false;
> +  nsresult rv = mChannel->IsPending(&isPending);
> +  MOZ_RELEASE_ASSERT(NS_SUCCEEDED(rv));
> +
> +  // Resume only we suspended earlier.

only /if/

::: netwerk/protocol/http/HttpChannelParent.h
@@ +158,5 @@
> +  bool mDivertedOnStartRequest;
> +  bool mDivertedOnStopRequest;
> +
> +  // Set if we successfully suspended the nsHttpChannel for diversion. Unset
> +  // when we call ResumeForDiversion.

Comments that have detailed info on when variables are used/manipulated are very vulnerable to bitrot.  I'd just nuke whole comment and let people figure it out from the name and the code uses.
Attachment #8388591 - Flags: review?(jduell.mcbugs) → review+
Comment on attachment 8388601 [details] [diff] [review]
Part 4 v1.3 Implement nsI|ADivertableChannel in FTPChannelChild|Parent

Review of attachment 8388601 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/ftp/FTPChannelParent.cpp
@@ +439,5 @@
> +               "Cannot ResumeForDiversion if not diverting!");
> +    return NS_ERROR_UNEXPECTED;
> +  }
> +
> +  if (mSuspendedForDiversion && !mDivertedOnStopRequest) {

like for HTTP don't check mDivertedOnStopRequest here (and get rid of variable if not needed)
Attachment #8388601 - Flags: review?(jduell.mcbugs) → review+
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/ce8ed7a0dfef
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/97858e5c1f13
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/1b4b0a309453
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/7a78f199e1cd

Reuben, patches landed on inbound. (Note: I've been having issues uploading to bugzilla, so I didn't update with the final changes - but, of course, they are on inbound).

Please test with your device. I've tested locally on Mac with e10s window; HTTP and FTP, downloading to completion, and also closing the window before completion - in all cases download finishes.

There's a possible race where one could cancel the download immediately after it is started, e.g. by swiping on the screen. In this case, the download manager would cancel the channel in the parent; however, OnDataAvailables and OnStop could be in transit already from the child and would still be called in ExternalAppHelper etc. It seems very unlikely, but it would be nice to make sure, so can you test if this is probable in the real world?
> There's a possible race where one could cancel the download immediately after it is started,

I expect it will be difficult if not impossible to hit this race--you'd need to swipe-cancel the download faster than the child can divert all the messages back to the parent.  I figure we should test it though.   I'm not sure what the symptoms would look like at the UI level--under the hood what would happen is that the download manager would Cancel() the channel but it could still get OnDataAvailable's (and possibly an OnStopRequest with a different status code than the one passed to Cancel).  That might not actually break anything.

You'd need very fast fingers to hit this :) but give it a try.
How fast are we talking here? The fastest way to cancel the download is by tapping the notification, then waiting for the Settings app to open, then clicking the X, then confirming the action. So at least 2 or 3 seconds.
It should be well under a second, so it sounds very unlikely. But give it a quick try just to make sure.
https://hg.mozilla.org/mozilla-central/rev/760131490c81
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Do you guys see this same issue recreated over here http://englishclassviaskype.com/
(In reply to susan.morsey from comment #110)
> Do you guys see this same issue recreated over here
> http://englishclassviaskype.com/

Hi Susan - can you be more specific please? Please provide some steps to reproduce the exact issue that you're seeing on that site. Thanks!
Depends on: 1416879
See Also: → 1582966
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: