Closed Bug 1297554 Opened 3 years ago Closed 3 years ago

e10s: remove all uses of gNeckoParent

Categories

(Core :: Networking, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
e10s - ---
firefox52 --- fixed

People

(Reporter: jduell.mcbugs, Assigned: u408661)

Details

(Whiteboard: [necko-active])

Attachments

(1 file)

58 bytes, text/x-review-board-request
mcmanus
: review+
billm
: review+
Details
:billm tells me 

"Use of gNeckoParent will surely break with multiple content processes. Even with a single content process, I think it's possible for there to be multiple content processes alive at once (if one is in the process of shutting down). So the code is probably a little bit broken even now. Can you please file a bug to eliminate gNeckoParent and change all the users so they can figure out which process they want to talk to?"

Here's a hopefully vaguely useful IRC convo we had about how to replace gNeckoParent:

jduell:  I’m filing that bug to remove gNeckoParent, and I realize I don’t know what to replace it with
billm : it looks like it's mostly used in the prediction stuff
jduell : Is there some call I can use that gives me the right NeckoParent instance?
billm : what sort of state do we have available?
jduell : What sort of state?
billm : well, for example, it looks like OnPredictPrefetch is called here: http://searchfox.org/mozilla-central/source/netwerk/base/Predictor.cpp#2376
jduell : It sounds like we need a way to know which child process (and thus which NeckoParent) we’ll be talking to.
billm : yeah
billm: if we pass in httpChannel, I think we can get to a NeckoParent maybe?
jduell : The obvious place for that might be the necko channel?
 jduell: Except the predictor isn’t a channel...
billm : right, but it looks like it deals with channels
billm: maybe we can always find one to base the decision on?
billm: I guess we'll have to see what events trigger prediction. hopefully it always happens off of something happening to an existing channel? 
jduell : I’ll ask Nick (who wrote that code) about it.  So from a channel (which is an IPDL channel too) there’s a way to ask for the parent/owner channel (which is PNecko), right? 
jduell goes to look at the generated IPDL code
billm : we have to get to an httpchannelparent
jduell : OK.  And from there what’s the incantation to get to the NeckoParent?  (or do we need to keep a pointer, etc)
billm : call Manager() on it
billm: not sure how to go from nsIHttpChannel to HttpChannelParent. hopefully we store a pointer somewhere.
jduell : OK cool.  I think this is enough to get us started.  I’ll cc you on the bug
billm : oh, the notification callbacks I guess
jduell : Yeah, we may have to add a pointer or something
billm : OK, sounds good. thanks.
jduell : Thanks bill!
Nick: a plurality of the callsites that use gNecko are in the predictor...

Not sure if the fix is to 1) have the predictor keep tabs on which requests belong to which process, or 2) just make a new IPDL PPredictor.ipdl protocol for the predictor (like we do for cookies) and we'll just automagically know which process we're dealing with.  My hunch is #2 may be easier.

There's a couple non-predictor cases, too--one that appears to be a hack just for test purposes in nsHttpHandler, and one use by FlyWeb.
Assignee: nobody → hurley
Whiteboard: [necko-next]
(In reply to Jason Duell [:jduell] (needinfo me) from comment #1)
> Nick: a plurality of the callsites that use gNecko are in the predictor...
> 
> Not sure if the fix is to 1) have the predictor keep tabs on which requests
> belong to which process, or 2) just make a new IPDL PPredictor.ipdl protocol
> for the predictor (like we do for cookies) and we'll just automagically know
> which process we're dealing with.  My hunch is #2 may be easier.

Having a new protocol doesn't really help. Having one Predictor object per process would help, though.
tracking-e10s: --- → ?
Whiteboard: [necko-next] → [necko-active]
(In reply to Bill McCloskey (:billm) from comment #2)
> Having a new protocol doesn't really help. Having one Predictor object per
> process would help, though.

So prediction (as well as the other non-predictor use of gNeckoParent) is not done from channels - it's kind of the whole point of both uses of gNeckoParent to be doing stuff when we don't (necessarily) have a channel present. So that sucks.

Bill, could you elaborate more on the object-per-process approach? Specifically, are there any examples of that already in the tree?

Alternately, since all these uses exist purely for testing purposes, we could just bite the bullet and not test.
Flags: needinfo?(wmccloskey)
I'll note, re: my last comment, the non-predictor use was added specifically because it was requested we get that test working in e10s mode (a few months back when we were apparently trying to get as many tests working with e10s as possible). So removing it may not be the best approach (though I'm not convinced that not having these tests in e10s would be the end of the world, given all the real work happens on the parent, and we just want to send verifications back to the child in e10s mode tests)
Hmm, I didn't realize these were all just for testing purposes. I still think it makes sense to remove gNeckoParent just so it's not used accidentally for some non-testing thing. But I guess it's less important.

One solution is to broadcast these messages to every content process. You can get them all via ContentParent::GetAll. Then you can get a NeckoParent from there via contentParent->ManagedPNeckoParent().

I'm a little concerned, though, that we seem to be sending all these messages in release builds. That seems inefficient (especially if we end up broadcasting). Can we do it based on a test pref or something?
Flags: needinfo?(wmccloskey)
(In reply to Bill McCloskey (:billm) from comment #5)
> Hmm, I didn't realize these were all just for testing purposes. I still
> think it makes sense to remove gNeckoParent just so it's not used
> accidentally for some non-testing thing. But I guess it's less important.

This is already the case for some FlyWeb code. We should definitely fix this.

> One solution is to broadcast these messages to every content process. You
> can get them all via ContentParent::GetAll. Then you can get a NeckoParent
> from there via contentParent->ManagedPNeckoParent().

I'd also like to plug ContentParent::AllProcesses which works nicely with C++11-style for loops. It would probably be worth it to add a function on ContentParent that returns an existing PNeckoParent if it exists and otherwise creates one and returns it.
(In reply to Blake Kaplan (:mrbkap) from comment #6)
> (In reply to Bill McCloskey (:billm) from comment #5)
> > Hmm, I didn't realize these were all just for testing purposes. I still
> > think it makes sense to remove gNeckoParent just so it's not used
> > accidentally for some non-testing thing. But I guess it's less important.
> 
> This is already the case for some FlyWeb code. We should definitely fix this.

So IIRC, they recently removed this (it was crashing). It certainly didn't show up in my local search of my repo. Maybe my repo was out of date, though.

> > One solution is to broadcast these messages to every content process. You
> > can get them all via ContentParent::GetAll. Then you can get a NeckoParent
> > from there via contentParent->ManagedPNeckoParent().
> 
> I'd also like to plug ContentParent::AllProcesses which works nicely with
> C++11-style for loops. It would probably be worth it to add a function on
> ContentParent that returns an existing PNeckoParent if it exists and
> otherwise creates one and returns it.

I'll take a look at this and what Bill suggested, and come up with a solution that rips gNeckoParent out.

Thanks to both of you!
(In reply to Bill McCloskey (:billm) from comment #5)
> I'm a little concerned, though, that we seem to be sending all these
> messages in release builds. That seems inefficient (especially if we end up
> broadcasting). Can we do it based on a test pref or something?

So one use of gNeckoParent is already based on a pref that's set during testing. The other is not based on a pref, but is based on an argument being non-null, which is only the case in tests, so we're fine there.

Bill, there wasn't anything else you're seeing that indicates that we're sending these message outside tests, right?
Flags: needinfo?(wmccloskey)
No, I guess I just didn't look carefully enough.
Flags: needinfo?(wmccloskey)
Bill - in the review above, I'm specifically looking for you opinion on my use of MOZ_CRASH when looking at the array of necko parents. I cribbed that bit from flyweb where they removed their use of gNeckoParent, and wanted to double-check that it was a reasonable thing to do. Thanks!
Comment on attachment 8793757 [details]
Bug 1297554 - Perform a gNeckoParent-ectomy

https://reviewboard.mozilla.org/r/80428/#review79254

::: netwerk/base/Predictor.cpp:2252
(Diff revision 2)
> -  MOZ_DIAGNOSTIC_ASSERT(gNeckoParent);
> -
>    ipc::URIParams serURI;
>    SerializeURI(aURI, serURI);
>  
> -  if (!gNeckoParent->SendPredOnPredictPrefetch(serURI, httpStatus)) {
> +  for (auto *cp : ContentParent::AllProcesses(ContentParent::eLive)) {

auto* cp?

::: netwerk/base/Predictor.cpp:2255
(Diff revision 2)
>    SerializeURI(aURI, serURI);
>  
> -  if (!gNeckoParent->SendPredOnPredictPrefetch(serURI, httpStatus)) {
> +  for (auto *cp : ContentParent::AllProcesses(ContentParent::eLive)) {
> +    nsTArray<PNeckoParent*> neckoParents;
> +    cp->ManagedPNeckoParent(neckoParents);
> +    if (neckoParents.Length() != 1) {

I think we should be more lenient here than in the FlyWeb code. If we don't have a NeckoParent (which is possible since they're created lazily) we should just skip this process. Maybe change the assert to a continue? I'm not too worried about there being more than one NeckoParent.

You could also implement this more easily with:
  NeckoParent* np = SingleManagedOrNull(cp->ManagedPNeckoParent());
  if (!np) {
    continue;
  }
Attachment #8793757 - Flags: review?(wmccloskey) → review+
Updated review request with Bill's suggestions.
Comment on attachment 8793757 [details]
Bug 1297554 - Perform a gNeckoParent-ectomy

https://reviewboard.mozilla.org/r/80428/#review80662
Attachment #8793757 - Flags: review?(mcmanus) → review+
Pushed by hurley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/67cd8ab2f98e
Perform a gNeckoParent-ectomy r=billm,mcmanus
https://hg.mozilla.org/mozilla-central/rev/67cd8ab2f98e
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
You need to log in before you can comment on or make changes to this bug.