Closed Bug 1391693 Opened 7 years ago Closed 7 years ago

move parent-side interception out of nsHttpChannel into a separate InterceptedHttpChannel

Categories

(Core :: DOM: Service Workers, enhancement, P2)

32 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

(Blocks 1 open bug, Regressed 1 open bug)

Details

Attachments

(13 files, 50 obsolete files)

9.65 KB, patch
jdm
: review+
Details | Diff | Splinter Review
1.39 KB, patch
smaug
: review+
Details | Diff | Splinter Review
1.21 KB, patch
ckerschb
: review+
Details | Diff | Splinter Review
2.49 KB, patch
bgrins
: review+
Details | Diff | Splinter Review
3.07 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
3.02 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
5.10 KB, patch
asuth
: review+
valentin
: review+
Details | Diff | Splinter Review
10.74 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
8.21 KB, patch
valentin
: review+
Details | Diff | Splinter Review
18.43 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
38.72 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
1.65 KB, patch
bgrins
: review+
Details | Diff | Splinter Review
37.12 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
Currently we have two pretty different paths for handling FetchEvent network interception:

1. HttpChannelChild and InterceptedChannelContent handle the interception using a pipe.
2. nsHttpChannel and InterceptedChannelChrome handle the interception using a synthetic cache entry.

There are a few significant consequences of this:

a. We can have bugs in one path, but not the other.
b. Piping the interception through these existing, complicated classes makes it very hard to understand how our interception works.  Maintenance is hard.
c. It makes other necko work harder because they have to work around the interception bits buried in the main classes.

I'd like to experiment with a different approach.  I propose that we:

i. Create a new SyntheticHttpChannel class.  This would contain only a response head and an nsIInputStream body.
ii. When a FetchEvent is to be fired we do an internal redirect from the original HttpChannelChild/nsHttpChannel to a SyntheticHttpChannel.
iii. If the service worker provides a Response, then we just fire OnStart, ODA, and OnStop, etc using the SyntheticHttpChannel.
iv. If the service worker passes on the channel or triggers a reset then we do another internal redirect back a real HttpChannelChild/nsHttpChannel.
v. If the service worker synthetic Response is a 30x redirect, then we do a redirect back to a real HttpChannelChild/nsHttpChannel.

There is probably a lot of complication I'm not seeing here, but I'd like to try this to help isolate the complexity.
Summary: consider unifying interception in a SyntheticHttpChannel class → consider unifying service worker interception in a SyntheticHttpChannel class
One question I have for necko folks:

Do you think you will want to reuse this "provide a simple head/body for a channel" mechanism for things other than service worker?

I could possibly break this down even further to make it easier to integrate with other things.  Consider these possible steps:

1. Original HttpChannelChild/nsHttpChannel detects the channel should be intercepted.
2. Internal redirect to InterceptedHttpChannel. (This is different from our existing InterceptedChannel stuff.)
3. One of the follow happens:
 a. The SW synthesizes a response triggering InterceptedHttpChannel to internal redirect to a SyntheticHttpChannel.
 b. The SW resets or 30x redirects triggering InterceptedHttpChannel to internal redirect back to an HttpChannelChild/nsHttpChannel

InterceptedHttpChannel never actually fires ODA type events itself.  Instead its a placeholder for a channel that can either be synthesized or reset back to an http channel.

This would let us make the synthesized channel really quite dumb and reusable.  The service worker FetchEvent integration would all be in InterceptedHttpChannel.

What do people think? Would this be good or going too small with each component?

I know internal redirects will have some perf cost, but at this point I'm much more concerned about maintainability and correctness.
(In reply to Ben Kelly [:bkelly] from comment #0)
> Currently we have two pretty different paths for handling FetchEvent network
> interception:
> 
> 1. HttpChannelChild and InterceptedChannelContent handle the interception
> using a pipe.
> 2. nsHttpChannel and InterceptedChannelChrome handle the interception using
> a synthetic cache entry.
> 
> There are a few significant consequences of this:
> 
> a. We can have bugs in one path, but not the other.
> b. Piping the interception through these existing, complicated classes makes
> it very hard to understand how our interception works.  Maintenance is hard.
> c. It makes other necko work harder because they have to work around the
> interception bits buried in the main classes.
> 
> I'd like to experiment with a different approach.  I propose that we:
> 
> i. Create a new SyntheticHttpChannel class.  This would contain only a
> response head and an nsIInputStream body.

What interfaces would this class implement?  The Http in the name suggests it should implement nsIHttpChannel (+nsIHttpChannelInternal) but that may not be necessary.  Anyway, what all this class should be capable of should be decided before you start writing it.

> ii. When a FetchEvent is to be fired we do an internal redirect from the
> original HttpChannelChild/nsHttpChannel to a SyntheticHttpChannel.

Yup, like it!

> iii. If the service worker provides a Response, then we just fire OnStart,
> ODA, and OnStop, etc using the SyntheticHttpChannel.

> iv. If the service worker passes on the channel or triggers a reset then we
> do another internal redirect back a real HttpChannelChild/nsHttpChannel.

As you write it - this will work ONLY if the real HttpChannelChild/nsHttpChannel has not been so far AsyncOpened.    But I'm afraid we passed that point here.  I think what you want is to just start getting the response from the network/cache (as if there were no interception at all) right?  If so, then you have to do the following:

- when FetchEvent is fired (this means "the channem MAY be intercepted", right?) suspend the real channel
- when "the service worker passes on the channel or triggers a reset", return the listener to get response from the real channel again and resume the real channel

We may want to discuss this F2F (vidyo).

> v. If the service worker synthetic Response is a 30x redirect, then we do a
> redirect back to a real HttpChannelChild/nsHttpChannel.

I don't follow this.  What exactly has to happen from the web content point of view it synthesizes a redirect?  (Sorry, my knowledge of SW is short)

> 
> There is probably a lot of complication I'm not seeing here, but I'd like to
> try this to help isolate the complexity.

I'm sorry for such late reply here, was really busy.  Maybe we should schedule a meeting for this all.


I try to answer the second comment as well:


(In reply to Ben Kelly [:bkelly] from comment #1)
> One question I have for necko folks:
> 
> Do you think you will want to reuse this "provide a simple head/body for a
> channel" mechanism for things other than service worker?

Not sure, but could be useful.  But it's not necessary if you need to implement this quickly.

> 
> I could possibly break this down even further to make it easier to integrate
> with other things.  Consider these possible steps:
> 
> 1. Original HttpChannelChild/nsHttpChannel detects the channel should be
> intercepted.
> 2. Internal redirect to InterceptedHttpChannel. (This is different from our
> existing InterceptedChannel stuff.)

Why it has to redirect?  Why it's not enough to just insert it to the nsIStreamListener chain so that it can actually intercept the response?  (We might have tried this in the past and moved to a different solution, tho.  jdm was working on this all)

> 3. One of the follow happens:
>  a. The SW synthesizes a response triggering InterceptedHttpChannel to
> internal redirect to a SyntheticHttpChannel.
>  b. The SW resets or 30x redirects triggering InterceptedHttpChannel to
> internal redirect back to an HttpChannelChild/nsHttpChannel

Why back?  A point I may not understand.  Just a note that it's perfectly OK to redirect to the same URL again, and just modify how the new channel will behave, when needed.

> 
> InterceptedHttpChannel never actually fires ODA type events itself.  Instead
> its a placeholder for a channel that can either be synthesized or reset back
> to an http channel.

Then why it has to be a channel at all?

> 
> This would let us make the synthesized channel really quite dumb and
> reusable.  The service worker FetchEvent integration would all be in
> InterceptedHttpChannel.
> 
> What do people think? Would this be good or going too small with each
> component?

I don't understand the question enough to answer.

> 
> I know internal redirects will have some perf cost, but at this point I'm
> much more concerned about maintainability and correctness.

Sure, internal redirs go via the redirect veto paths, but for internal redirs (this would be one) it's a bit faster.  And I think the any perf regression caused by a redirect will be negligible compared to the whole synthetization JS thing.
(In reply to Honza Bambas (:mayhemer) from comment #2)
> (In reply to Ben Kelly [:bkelly] from comment #0)
> > Currently we have two pretty different paths for handling FetchEvent network
> > interception:
> > 
> > 1. HttpChannelChild and InterceptedChannelContent handle the interception
> > using a pipe.
> > 2. nsHttpChannel and InterceptedChannelChrome handle the interception using
> > a synthetic cache entry.
> > 
> > There are a few significant consequences of this:
> > 
> > a. We can have bugs in one path, but not the other.
> > b. Piping the interception through these existing, complicated classes makes
> > it very hard to understand how our interception works.  Maintenance is hard.
> > c. It makes other necko work harder because they have to work around the
> > interception bits buried in the main classes.
> > 
> > I'd like to experiment with a different approach.  I propose that we:
> > 
> > i. Create a new SyntheticHttpChannel class.  This would contain only a
> > response head and an nsIInputStream body.
> 
> What interfaces would this class implement?  The Http in the name suggests
> it should implement nsIHttpChannel (+nsIHttpChannelInternal) but that may
> not be necessary.  Anyway, what all this class should be capable of should
> be decided before you start writing it.

At this point I am planning to inherit from HttpBaseChannel.  It needs to be an http channel in order to provide headers, etc to the consumer of the channel.

> > iii. If the service worker provides a Response, then we just fire OnStart,
> > ODA, and OnStop, etc using the SyntheticHttpChannel.
> 
> > iv. If the service worker passes on the channel or triggers a reset then we
> > do another internal redirect back a real HttpChannelChild/nsHttpChannel.
> 
> As you write it - this will work ONLY if the real
> HttpChannelChild/nsHttpChannel has not been so far AsyncOpened.    But I'm
> afraid we passed that point here.  I think what you want is to just start

This is what we do today for nsHttpChannel, so I hope that will work.  Not sure about HttpChannelChild.

> > v. If the service worker synthetic Response is a 30x redirect, then we do a
> > redirect back to a real HttpChannelChild/nsHttpChannel.
> 
> I don't follow this.  What exactly has to happen from the web content point
> of view it synthesizes a redirect?  (Sorry, my knowledge of SW is short)

A service worker can do `evt.respondWith(Response.redirect(someOtherURL))` which will synthesize a response with a 30x http status code.  Also, in some conditions service worker will not follow redirects and will instead pass them back through respondWith() for the outer channel to follow.  This is mainly used for navigations.

> > There is probably a lot of complication I'm not seeing here, but I'd like to
> > try this to help isolate the complexity.
> 
> I'm sorry for such late reply here, was really busy.  Maybe we should
> schedule a meeting for this all.

Thanks for your help!  I think I'd like to experiment for a bit before we do a meeting.

One thing I realized might be a problem today is divertToParent stuff.  If we synthesize a response and the content-disposition triggers a download the response must be processed in the parent.  Obviously a separate synthetic class would not do this at all.

Because of this I might lean towards only using this new mechanism in the parent instead of both the child and the parent.  In theory we want to remove the child intercept in the future anyway.
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Work-in-progress.  This kind of works in non-e10s.  I can load my blog via service worker with it.  It does not pass tests, though.  For example, I still need to handle synthesized redirects, etc.
Also a lot of commented out cruft I can probably remove now to make it easier to read...
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Slightly better patch.

Many tests seem to be failing because while the channel is synthesized and data comes through, something is missing such that window load events won't fire.
Attachment #8900466 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
This fixes a fair number of tests.  Previous patch did not call nsILoadGroup::AddRequest() which broke a lot.

Still a number of failures to fix, though.
Attachment #8900540 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Improve some of the redirect and opaque response paths.  More tests pass.  Still tests to fix.
Attachment #8900913 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Almost passes all WPT service worker tests.  There is a crash I still need to figure out, though:

### ### [0x7fffbb13f800] FinishSynthesizedResponse
[15501] WARNING: NS_ENSURE_TRUE(listener) failed: file /srv/mozilla-central/netwerk/base/nsInputStre
amPump.cpp, line 324
[15501] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070057: file /srv/mozilla-central/
netwerk/protocol/http/InterceptedHttpChannel.cpp, line 560
[15501] WARNING: 'NS_FAILED(rv)', file /srv/mozilla-central/dom/workers/ServiceWorkerEvents.cpp, lin
e 237
### ### [0x7fffbb13f800] CancelInterception
### ### [0x7fffb903e0c0] ListenerAdapter::OnStartRequest
### ### [0x7fffb903e0c0] ListenerAdapter::OnStartRequest mLoadFlags is document load false
### ### [0x7fffbb28d800] Cancel
### ### [0x7fffbb28d800] CancelInterception
Assertion failure: mRawPtr != nullptr (You can't dereference a NULL nsCOMPtr with operator->().), at
 /srv/mozilla-central/obj-x86_64-pc-linux-gnu-optdebug/dist/include/nsCOMPtr.h:763

Thread 1 "firefox" received signal SIGSEGV, Segmentation fault.
0x00007fffe691e925 in nsCOMPtr<nsIStreamListener>::operator-> (this=<optimized out>)
    at /srv/mozilla-central/obj-x86_64-pc-linux-gnu-optdebug/dist/include/nsCOMPtr.h:762
762         MOZ_ASSERT(mRawPtr != nullptr,
(gdb) bt
#0  0x00007fffe691e925 in nsCOMPtr<nsIStreamListener>::operator-> (this=<optimized out>)
    at /srv/mozilla-central/obj-x86_64-pc-linux-gnu-optdebug/dist/include/nsCOMPtr.h:762
#1  0x00007fffe6c1f0f5 in nsCORSListenerProxy::OnStartRequest (this=<optimized out>,
    aRequest=0x7fffbb28d858, aContext=<optimized out>)
    at /srv/mozilla-central/netwerk/protocol/http/nsCORSListenerProxy.cpp:478
#2  0x00007fffe6c268d0 in mozilla::net::(anonymous namespace)::ListenerAdapter::OnStartRequest (
    this=0x7fffb903e0c0, aRequest=<optimized out>, aContext=0x0)
    at /srv/mozilla-central/netwerk/protocol/http/InterceptedHttpChannel.cpp:53
#3  0x00007fffe6933113 in nsInputStreamPump::OnStateStart (this=0x7fffb7b39700)
    at /srv/mozilla-central/netwerk/base/nsInputStreamPump.cpp:537
#4  0x00007fffe6932e1e in nsInputStreamPump::OnInputStreamReady (this=<optimized out>,
    stream=<optimized out>) at /srv/mozilla-central/netwerk/base/nsInputStreamPump.cpp:440
#5  0x00007fffe687889c in nsInputStreamReadyEvent::Run (this=0x7fffb7b094c0)
    at /srv/mozilla-central/xpcom/io/nsStreamUtils.cpp:97
#6  0x00007fffe689f55d in nsThread::ProcessNextEvent (this=<optimized out>,
    aMayWait=<optimized out>, aResult=0x7fffffffc5d7)
    at /srv/mozilla-central/xpcom/threads/nsThread.cpp:1033
#7  0x00007fffe68a09eb in NS_ProcessNextEvent (aThread=0x7ffff7110540 <_IO_2_1_stderr_>,
    aMayWait=<optimized out>) at /srv/mozilla-central/xpcom/threads/nsThreadUtils.cpp:521
#8  0x00007fffe6d7b509 in mozilla::ipc::MessagePump::Run (this=<optimized out>,
    aDelegate=<optimized out>) at /srv/mozilla-central/ipc/glue/MessagePump.cpp:97
#9  0x00007fffe6d208ed in MessageLoop::RunHandler (this=0x7fffe3d6f1c0)
    at /srv/mozilla-central/ipc/chromium/src/base/message_loop.cc:319
#10 MessageLoop::Run (this=0x7fffe3d6f1c0)
    at /srv/mozilla-central/ipc/chromium/src/base/message_loop.cc:299
#11 0x00007fffe8b01a35 in nsBaseAppShell::Run (this=0x7fffde609270)
    at /srv/mozilla-central/widget/nsBaseAppShell.cpp:158
#12 0x00007fffe9ef82d8 in nsAppStartup::Run (this=0x7fffe1db2a10)
    at /srv/mozilla-central/toolkit/components/startup/nsAppStartup.cpp:288
#13 0x00007fffe9f8fa0e in XREMain::XRE_mainRun (this=<optimized out>)
    at /srv/mozilla-central/toolkit/xre/nsAppRunner.cpp:4646
Attachment #8901355 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
This seems to pass WPT tests locally.  Lets see how it does on try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c7f3dfedc5d9f271f52f555bde6235001b4d439
Attachment #8902433 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Took me a while to figure out how to integrate with the e10s bits.  This mostly works and even passes most redirect tests.  Still some more mochitest failures to fix.
Attachment #8902499 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Tracked down a tricky race during redirects.  WPT tests pass for both e10s and non-e10s.  Still some mochitests to fix.
Attachment #8903805 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Fixed test_fetch_event.html in non-e10s and a lingering crash in WPT e10s.  The crash seems like a race in existing code.
Attachment #8905323 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Fixed a few more test issues.  Lets see how bad try is over the weekend:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9d51b4a50f030b5eb4e5adaec525fc8fcefac74e
Attachment #8905683 - Attachment is obsolete: true
Attached patch bug1391693_p2.patch (obsolete) — Splinter Review
Updated with more fixed.  Parent diversion should now work.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=98cc0fa942a7730260d1d6e33a5b67c1e4692394
Attachment #8906131 - Attachment is obsolete: true
Attached patch bug1391693_p5.patch (obsolete) — Splinter Review
I broke out parts of the patch that I'm pretty sure about.  This P5 patch is my current speculative work-in-progress.
Attachment #8906756 - Attachment is obsolete: true
Comment on attachment 8907244 [details] [diff] [review]
P1 Rename nsIInterceptedChannel.cancel() to cancelInterception()

Josh, do you mind reviewing this mechanical change?  This patch simply renames nsIInterceptedChannel::Cancel() to nsIInterceptedChannel::CancelInterception().  This is more consistent with ResetInterception() and also makes it easier to distinguish with nsIChannel::Cancel().
Attachment #8907244 - Flags: review?(josh)
Attached patch bug1391693_p6.patch (obsolete) — Splinter Review
Attachment #8907250 - Attachment is obsolete: true
Attached patch bug1391693_p7.patch (obsolete) — Splinter Review
Attachment #8907258 - Attachment is obsolete: true
Attached patch bug1391693_p8.patch (obsolete) — Splinter Review
The patches up to P6 are refactoring without any intended functional change.  Lets make sure everything is green with just those changes:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1a5c8a1f40543b3897181531e2ed4f47034fc1ee
Attached patch bug1391693_p8_simple.patch (obsolete) — Splinter Review
This is an alternative approach to dealing with the extra internal redirect for interception in e10s mode.  Lets see how bad it is:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=bef9001e98139a6eb5c20b58760c8d101f08051c
Attachment #8907244 - Flags: review?(josh) → review+
Comment on attachment 8907246 [details] [diff] [review]
P2 Only validate docloader channel on redirect if its been set.

Olli, with the later patches in this bug we will be performing an additional internal redirect on the nsIChannel when its intercepted by a service worker.  The docloader does not currently expect this and triggers this assertion since mDocumentRequest is nullptr.  This patch simply bypasses this assertion when mDocumentRequest has not been set yet.

No functional change in this patch beyond assertions.
Attachment #8907246 - Flags: review?(bugs)
Comment on attachment 8907247 [details] [diff] [review]
P3 Allow CSP report channels to be internally redirected.

Christoph, later in this bug I want to make service worker interception perform an internal redirect on the nsIChannel.  This breaks the current CSP report code since it does not permit any redirects.  While CSP report channels should not allow real redirects, it seems it should be ok to do an internal redirect.

This patch makes CSP report requests permit internal redirects.

As a side note, perhaps we should move the "redirect mode = error" logic from fetch API into necko code so consumers don't have to implement their own AsyncOnChannelRedirect() methods like this to prevent redirects.
Attachment #8907247 - Flags: review?(ckerschb)
Comment on attachment 8907248 [details] [diff] [review]
P4 Don't count internal redirects towards Response.redirected

Tom, do you mind reviewing this patch that touches the Response.redirected code you added previously?  FetchDriver::AsyncOnChannelRedirect() is appending URLs to Response for each redirect.  I want to make it avoid doing this for internal redirects.  The internal redirects should generally not be visible by script.

I need this because I want to make service worker interception use an extra internal redirect later in this bug.
Attachment #8907248 - Flags: review?(ttung)
Comment on attachment 8907247 [details] [diff] [review]
P3 Allow CSP report channels to be internally redirected.

Review of attachment 8907247 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Ben Kelly [:bkelly] from comment #31)
> This patch makes CSP report requests permit internal redirects.

That's fine with me.
Attachment #8907247 - Flags: review?(ckerschb) → review+
Comment on attachment 8907246 [details] [diff] [review]
P2 Only validate docloader channel on redirect if its been set.

rs+
Attachment #8907246 - Flags: review?(bugs) → review+
Attached patch bug1391693_p7.patch (obsolete) — Splinter Review
Fix a place where I did not suspend the pump properly.
Attachment #8907280 - Attachment is obsolete: true
Attached patch bug1391693_p8_simple.patch (obsolete) — Splinter Review
The "simple" e10s integration is not quite as simple as I thought.  But its still much easier than my original e10s integration work.  I just had to add some extra conditionals to avoid later redirect IPC machinery.

This seems to fix the main failing tests in the previous try.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7738160d5714367766a933acfc69f5ae5013e354
Attachment #8907281 - Attachment is obsolete: true
Attachment #8907294 - Attachment is obsolete: true
Comment on attachment 8907248 [details] [diff] [review]
P4 Don't count internal redirects towards Response.redirected

Review of attachment 8907248 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me! Should we move the code for getting URI from the new channel to if-condition as well?

::: dom/fetch/FetchDriver.cpp
@@ +872,5 @@
>    }
>  
>    // "HTTP-redirect fetch": step 14 "Append locationURL to request's URL list."
>    nsCOMPtr<nsIURI> uri;
>    MOZ_ALWAYS_SUCCEEDS(aNewChannel->GetURI(getter_AddRefs(uri)));

I got the URI from the |aNewChannel| to get the |spec|, so we could record redirected URL into the request by |mRequest->AddURL()| and the rest of code in this function doesn't need to use the |uri|. Perhaps, we could get the URI from the channel only when it's needed?
Attachment #8907248 - Flags: review?(ttung) → review+
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #37)
> I got the URI from the |aNewChannel| to get the |spec|, so we could record
> redirected URL into the request by |mRequest->AddURL()| and the rest of code
> in this function doesn't need to use the |uri|. Perhaps, we could get the
> URI from the channel only when it's needed?

Yea, I should move that into the if-statement as well.  Good catch!  I'll fix this up tomorrow.
Priority: -- → P2
Looks like we mainly have a memory leak to fix.  Ran out of time this week, though.  Lets do a fresh try build over the weekend:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d41be395443855a5654c93091512a161fb42c72e
Attached patch bug1391693_p7.patch (obsolete) — Splinter Review
At least one leak was due to InterceptedHttpChannel holding a cycle with itself via its mPump member.  This should fix it by clearing in OnStopRequest and DoNotifyListeners.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c6eb5b9a60ba4aba0b63ab991351b1f5244d5f6d
Attachment #8907663 - Attachment is obsolete: true
Attached patch bug1391693_p8_simple.patch (obsolete) — Splinter Review
Fix another thorny memory leak issue.  We need to handle the case where the HttpChannelParent is destroyed before the parent-side channel is intercepted.  This didn't used to happen but can now that we have an async redirect step in place.  If this happens we'll just defer the cancel until the interception happens.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0c280780f4d14da3d9c06f6cccd04ed72e2b8a96
Attachment #8907664 - Attachment is obsolete: true
Depends on: 1402085
Patches here are causing us to trigger bug 1402085 in our test_request_* tests.
I still have one issue to solve here.  We are leaking a channel in:

  dom/html/test/test_formSubmission.html

I'll have to track it down on Monday.
Attached patch bug1391693_p7.patch (obsolete) — Splinter Review
Clear more stuff in ReleaseListeners().  This fixes the leaks on my machine.  Lets see what try says:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8a0c3bc4b1488b173156b152af97ee0fc82d48a4

If this is green, then I'll be on to cleanup and flagging for review.
Attachment #8909866 - Attachment is obsolete: true
Rearranged the patches a bit as the start to cleaning up.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=460030d132c9ba4fcc3216d2b7caecdf0b67eee7

Some known things still to fix:

* We're failing a devtools test now: test_console_serviceworker.html
* Non-e10s debug tests seem to run a lot slower.  I need to evaluate what kind of impact this has for opt builds on real sites.
Or figure out why they are so much slower.  I don't think the internal redirects should really be causing that, but maybe they are more expensive in DEBUG builds.
The extra internal redirect is only costing us ~15ms in DEBUG non-e10s.  I think this is reasonable.  I'm continuing to investigate where the extra time is going.
Notes:

We are running much slower in test_request_context_internal.html.  This is happens when we fire window.close() at just the right time during the window loading preventing it from closing.  Waiting for the window to load avoids the problem and we run the test at roughly the same speed as before.

I tried making a reduced test case, but can't reproduce:

https://window-close-while-loading.glitch.me/

When we trigger the problem we get:

GECKO(4920) | [4920, Main Thread] WARNING: NS_ENSURE_TRUE(mDocShell) failed: file /srv/mozilla-central/docshell/base/nsDSURIContentListener.cpp, line 79

If we don't trigger the problem and I refresh the test page I hit this NS_ASSERTION instead:

    GECKO(4920) | [4920, Main Thread] ###!!! ASSERTION: Adding a child where we already have a child? This may misbehave: 'Error', file /srv/mozilla-central/docshell/shistory/nsSHEntry.cpp, line 771
    GECKO(4920) | #01: nsDocShell::AddChildSHEntry(nsISHEntry*, nsISHEntry*, int, unsigned int, bool) (/srv/mozilla-central/docshell/base/nsDocShell.cpp:4427)
    GECKO(4920) | #02: nsDocShell::AddChildSHEntryToParent(nsISHEntry*, int, bool) (/srv/mozilla-central/docshell/base/nsDocShell.cpp:4519)
    GECKO(4920) | #03: nsDocShell::AddToSessionHistory(nsIURI*, nsIChannel*, nsIPrincipal*, nsIPrincipal*, bool, nsISHEntry**) (/srv/mozilla-central/docshell/base/nsDocShell.cpp:12749 (discriminator 1))
    GECKO(4920) | #04: nsDocShell::OnNewURI(nsIURI*, nsIChannel*, nsIPrincipal*, nsIPrincipal*, unsigned int, bool, bool, bool) (/srv/mozilla-central/docshell/base/nsDocShell.cpp:12028 (discriminator 1))
    GECKO(4920) | #05: nsDocShell::OnLoadingSite(nsIChannel*, bool, bool) (/srv/mozilla-central/docshell/base/nsDocShell.cpp:12114 (discriminator 1))
    GECKO(4920) | #06: nsDocShell::CreateContentViewer(nsTSubstring<char> const&, nsIRequest*, nsIStreamListener**) (/srv/mozilla-central/docshell/base/nsDocShell.cpp:9306 (discriminator 2))
    GECKO(4920) | #07: nsDSURIContentListener::DoContent(nsTSubstring<char> const&, bool, nsIRequest*, nsIStreamListener**, bool*) (/srv/mozilla-central/docshell/base/nsDSURIContentListener.cpp:114)
    GECKO(4920) | #08: nsDocumentOpenInfo::TryContentListener(nsIURIContentListener*, nsIChannel*) (/srv/mozilla-central/uriloader/base/nsURILoader.cpp:742 (discriminator 1))
    GECKO(4920) | #09: nsDocumentOpenInfo::DispatchContent(nsIRequest*, nsISupports*) (/srv/mozilla-central/uriloader/base/nsURILoader.cpp:420 (discriminator 1))
    GECKO(4920) | #10: nsDocumentOpenInfo::OnStartRequest(nsIRequest*, nsISupports*) (/srv/mozilla-central/uriloader/base/nsURILoader.cpp:295)
    GECKO(4920) | #11: nsBaseChannel::OnStartRequest(nsIRequest*, nsISupports*) (/srv/mozilla-central/netwerk/base/nsBaseChannel.cpp:837 (discriminator 2))
    GECKO(4920) | #12: nsInputStreamPump::OnStateStart() (/srv/mozilla-central/netwerk/base/nsInputStreamPump.cpp:537 (discriminator 2))
    GECKO(4920) | #13: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (/srv/mozilla-central/netwerk/base/nsInputStreamPump.cpp:440)
    GECKO(4920) | #14: nsCOMPtr<nsIInputStreamCallback>::assign_assuming_AddRef(nsIInputStreamCallback*) (/srv/mozilla-central/obj-x86_64-pc-linux-gnu-optdebug/dist/include/nsCOMPtr.h:396)
    GECKO(4920) | #15: nsThread::ProcessNextEvent(bool, bool*) (/srv/mozilla-central/xpcom/threads/nsThread.cpp:1039 (discriminator 1))
    GECKO(4920) | #16: NS_ProcessNextEvent(nsIThread*, bool) (/srv/mozilla-central/xpcom/threads/nsThreadUtils.cpp:521) 

I'll investigate further tomorrow.

I suspect maybe my InterceptedHttpChannel is doing something slightly different than the old code in this case.
Turns out the issues in the previous comment were due to a bug in InterceptedHttpChannel.  When canceled it was still reporting NS_OK in OnStopRequest(), etc.  Things run much smoother with this fixed.

This update also removes printfs, etc.
Attachment #8912012 - Attachment is obsolete: true
Fix bug number in commit message.
Attachment #8908819 - Attachment is obsolete: true
Attachment #8912470 - Flags: review+
Valentin, do you mind reviewing a few refactoring patches in this bug?

The overall goal here is to allow an nsHttpChannel to do an internal redirect to a new InterceptedHttpChannel class when we want to fire a service worker FetchEvent.  To do this, though, I need to break the strict dependency from HttpChannelParent to nsHttpChannel.

The first step towards breaking this dependency is moving a fewer helper routines from nsHttpChannel to HttpBaseChannel.  I will make HttpChannelParent operate on HttpBaseChannel in the next patch.
Attachment #8907257 - Attachment is obsolete: true
Attachment #8912479 - Flags: review?(valentin.gosu)
Valentin, this patch now changes the HttpChannelParent::mChannel to be an HttpBaseChannel instead of the specific nsHttpChannel type.  This will allow it to work with the InterceptedHttpChannel which inherits from the same base type.

Most of the operations are unchanged.  There are a few places where we must do some additional QI where we did not need to before.  Also, some things only make sense for real nsHttpChannel (like cache channel stuff).

This patch and the previous one should be only a refactoring with no current functional change.
Attachment #8907279 - Attachment is obsolete: true
Attachment #8912480 - Flags: review?(valentin.gosu)
Josh, do you mind reviewing this patch since you wrote the original e10s interception code?  Valentin, I'd also appreciate feedback from you here, but I thought perhaps Josh might know more about the service worker integration.  (If you guys want me to switch reviewers around, I'm happy to do so.)

The current service worker e10s intercept code performs most work in the child process.  If a service worker intercepts and does a simple response then the parent never even creates an InterceptedHttpChannel.

There are times, however, when we do need to create an InterceptedHttpChannel in e10s mode:

1. When the service worker synthesizes a redirect we fake a parent-side interception that results in the same redirect.  This then triggers a real redirected nsHttpChannel.
2. When the child needs to divert a synthesized result to the parent we create an intercepted channel in the parent that is immediately suspended.

The machinery to handle these cases is quite complex and fragile.  It does not currently expect to get an internal redirect when an nsHttpChannel triggers a interception.

To fix this I have decided to make HttpChannelParent and HttpChannelParentListener specifically detect this interception internal redirect.  When it occurs they will "eat" the redirect and immediately replace their internal channel with the InterceptedHttpChannel.  The child process never knows that the internal redirect took place.

While this sounds a bit odd and scary, it is dramatically simpler than the alternative.  I spent a week or two trying to make the child+parent channel pair handle the internal redirect.  Keeping the state in sync was very difficult and never did work right.

I believe this patch is the best way to go here.  As an added incentive, we very much want to remove the child process interception code in the near future.  Once that happens all of this code will go away.
Attachment #8912011 - Attachment is obsolete: true
Attachment #8912482 - Flags: review?(josh)
Attachment #8912482 - Flags: feedback?(valentin.gosu)
Comment on attachment 8912323 [details] [diff] [review]
P10 Don't expect extraneous console message from service worker any more. r=brgins

Brian, do you mind reviewing this devtools test change?  My changes here actually fix a problem we papered over before.

Bug 1187335 added an expected console message here even though it seemed like we should not be getting it.  My patches here cause us to teardown the channel more quickly which result in this stray message not getting reported.

This patch is basically a backout of the changes from bug 1187335 to this file.
Attachment #8912323 - Flags: review?(bgrinstead)
I've flagged the patches I think are ready for review.  The last couple patches still need some cleanup before I ask for review on them.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d892fead083ef0209fdbc6b3f6f856a86c58ba59
This bug doesn't actually unify our e10s and non-e10s approaches after all.  Changing the title to reflect that its mainly a cleanup of the parent-side interception code.
Summary: consider unifying service worker interception in a SyntheticHttpChannel class → move parent-side interception out of nsHttpChannel into a separate InterceptedHttpChannel
Comment on attachment 8912479 [details] [diff] [review]
P5 Move some helper methods from nsHttpChannel to HttpBaseChannel. r=valentin

Review of attachment 8912479 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpBaseChannel.h
@@ +371,5 @@
>        return mChannelId;
>      }
>  
> +    void InternalSetUploadStream(nsIInputStream *uploadStream)
> +      { mUploadStream = uploadStream; }

nit: Since we are changing it, let's update the coding style.
void InternalSetUploadStream(nsIInputStream *uploadStream)
{
  mUploadStream = uploadStream;
}

@@ +373,5 @@
>  
> +    void InternalSetUploadStream(nsIInputStream *uploadStream)
> +      { mUploadStream = uploadStream; }
> +    void SetUploadStreamHasHeaders(bool hasHeaders)
> +      { mUploadStreamHasHeaders = hasHeaders; }

nit: same here

@@ +379,5 @@
> +    MOZ_MUST_USE nsresult
> +    SetReferrerWithPolicyInternal(nsIURI *referrer, uint32_t referrerPolicy) {
> +        nsAutoCString spec;
> +        nsresult rv = referrer->GetAsciiSpec(spec);
> +        if (NS_FAILED(rv)) return rv;

nit: return on the next line + add braces
Attachment #8912479 - Flags: review?(valentin.gosu) → review+
Comment on attachment 8912480 [details] [diff] [review]
P6 Make HttpChannelParent operate on HttpBaseChannel objects instead of nsHttpChannel directly. r=valentin

Review of attachment 8912480 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with questions answered and nits fixed.

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +554,5 @@
>    }
>    if (apiRedirectToUri)
>      httpChannel->RedirectTo(apiRedirectToUri);
>    if (topWindowUri) {
> +    rv = httpChannel->SetTopWindowURIIfUnknown(topWindowUri);

Could you check if there's any way for this to fail? It seems to be performing several extra checks.

@@ +675,5 @@
>      return SendFailedAsyncOpen(rv);
>    }
>  
> +  nsCOMPtr<nsICacheInfoChannel> cacheChannel =
> +    do_QueryInterface(static_cast<nsIChannel*>(httpChannel));

mChannel.get()

@@ +1901,5 @@
>  
>    // MessageDiversionStarted call will suspend mEventQ as many times as the
>    // channel has been suspended, so that channel and this queue are in sync.
> +  nsCOMPtr<nsIChannelWithDivertableParentListener> divertChannel =
> +    do_QueryInterface(static_cast<nsIChannel*>(mChannel));

This should be mChannel.get()

@@ +1974,5 @@
>      return NS_ERROR_UNEXPECTED;
>    }
>  
> +  nsCOMPtr<nsIChannelWithDivertableParentListener> divertChannel =
> +    do_QueryInterface(static_cast<nsIChannel*>(mChannel));

mChannel.get()

@@ +2062,5 @@
>    // mDivertListener.
>    nsCOMPtr<nsIStreamListener> converterListener;
>    Unused << mChannel->DoApplyContentConversions(mDivertListener,
> +                                                getter_AddRefs(converterListener),
> +                                                nullptr);

Why the extra arg? Should we call DoApplyContentConversions with nullptr instead of mListenerContext?
I assume this was not intended?

@@ +2137,5 @@
>  
>    // Resume only if we suspended earlier.
>    if (mSuspendedForDiversion) {
> +    nsCOMPtr<nsIChannelWithDivertableParentListener> divertChannel =
> +      do_QueryInterface(static_cast<nsIChannel*>(mChannel));

mChannel.get()
Attachment #8912480 - Flags: review?(valentin.gosu) → review+
(In reply to Valentin Gosu [:valentin] from comment #67)
> ::: netwerk/protocol/http/HttpChannelParent.cpp
> @@ +554,5 @@
> >    }
> >    if (apiRedirectToUri)
> >      httpChannel->RedirectTo(apiRedirectToUri);
> >    if (topWindowUri) {
> > +    rv = httpChannel->SetTopWindowURIIfUnknown(topWindowUri);
> 
> Could you check if there's any way for this to fail? It seems to be
> performing several extra checks.

I thought this was ok because the channel was just created.  I see now that it does a bunch of work to try to find the window URI we probably don't want to do in the parent.  I'll move SetTopWindowURI() helper from nsHttpChannel to HttpChannelBase like the other helpers in P5.  Then I can leave this bit of code unmodified in this patch.

> @@ +2062,5 @@
> >    // mDivertListener.
> >    nsCOMPtr<nsIStreamListener> converterListener;
> >    Unused << mChannel->DoApplyContentConversions(mDivertListener,
> > +                                                getter_AddRefs(converterListener),
> > +                                                nullptr);
> 
> Why the extra arg? Should we call DoApplyContentConversions with nullptr
> instead of mListenerContext?
> I assume this was not intended?

I think I can remove this hunk from the patch.  I was previously trying to operate on idl interface types which required this.  Now that I am using HttpBaseChannel I can leave this code unchanged.

Thanks for the reviews!
Fixed style on moved methods.  Also moved SetTopWindowURI() from nsHttpChannel to HttpBaseChannel.
Attachment #8912479 - Attachment is obsolete: true
Attachment #8912695 - Flags: review+
This fixes most of the review comments.

Valentin, what did you mean by the various mChannel.get() comments?

The various static_cast<nsIChannel*>() things in the patch are to work around ambiguous inheritence to nsISupports that breaks QI.  Using mChannel.get() didn't seem to help with this.
Attachment #8912480 - Attachment is obsolete: true
Flags: needinfo?(valentin.gosu)
Attachment #8912696 - Flags: review+
Fix a problem I introduced in the last update.  The mIsPending flag must be cleared before firing OnStopRequest.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a4c0e82809341d467b1a5d8e71a00d83b6bbaa31
Attachment #8912467 - Attachment is obsolete: true
I did some performance measurements in non-e10s mode using the benchmark here:

https://jakearchibald.github.io/service-worker-benchmark/

Resulting data:

https://docs.google.com/spreadsheets/d/1X0erS7oQLM4T3945utTKDuQ5VOjmv-R_ipiKWLsHoHY/edit?usp=sharing

Overall the switch to InterceptedHttpChannel looks like a perf win:

Fall-through fetch event: ~80ms median, ~100ms mean improvement
Pass-through fetch event: ~70ms median, ~100ms mean improvement
Cache API fetch event: ~60ms median, ~70ms mean improvement
Data URL fetch event: ~150ms median, ~123ms mean improvement

This is a bit surprising to me given it potentially has an extra async runnable for the internal redirect.  I guess the redirects are only potentially async, though, and we don't actually trigger that async'ness in this case.

We also probably have a perf advantage with InterceptedHttpChannel because we avoid the overhead of creating an http cache entry.  I'm not sure how expensive that is, but it seems like it could potentially depend on some disk writes.

As an added bonus, one of the reasons we wanted to do this was so we could enable better streaming in bug 1390638.  So even better perf will follow on these improvements.
Attachment #8912323 - Flags: review?(bgrinstead) → review+
Make InterceptedHttpChannel retargetable to other threads.

I also remeasured the against the benchmark.

Fall-through fetch event: ~115ms median, ~150ms mean improvement
Pass-through fetch event: ~35ms median, ~160ms mean improvement
Cache API fetch event: ~45ms median, ~60ms mean improvement
Data URL fetch event: ~240ms median, ~260ms mean improvement

Note, I remeasured the base data as well which is why some of these numbers are worse than the last comment.
Attachment #8912759 - Attachment is obsolete: true
(In reply to Ben Kelly [:bkelly] from comment #70)
> Valentin, what did you mean by the various mChannel.get() comments?
> 
> The various static_cast<nsIChannel*>() things in the patch are to work
> around ambiguous inheritence to nsISupports that breaks QI.  Using
> mChannel.get() didn't seem to help with this.

static_cast<nsIChannel*>(mChannel.get()) is a bit clearer in showing that we are casting the pointer, not the RefPtr.
Flags: needinfo?(valentin.gosu)
Cleaned up InterceptedHttpChannel and squashed remaining TODO items.  I still need to add some comments.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=483ed2f2ddd900d63717393d77d23b180e0d01cb
Attachment #8912877 - Attachment is obsolete: true
(In reply to Ben Kelly [:bkelly] from comment #72)
> I did some performance measurements in non-e10s mode using the benchmark
> here:
> 
> https://jakearchibald.github.io/service-worker-benchmark/
> 
> Resulting data:
> 
> https://docs.google.com/spreadsheets/d/1X0erS7oQLM4T3945utTKDuQ5VOjmv-
> R_ipiKWLsHoHY/edit?usp=sharing
> 
> Overall the switch to InterceptedHttpChannel looks like a perf win:
> 
> Fall-through fetch event: ~80ms median, ~100ms mean improvement
> Pass-through fetch event: ~70ms median, ~100ms mean improvement
> Cache API fetch event: ~60ms median, ~70ms mean improvement
> Data URL fetch event: ~150ms median, ~123ms mean improvement

There are all great numbers!  How exactly are you collecting them?

> 
> This is a bit surprising to me given it potentially has an extra async
> runnable for the internal redirect.  I guess the redirects are only
> potentially async, though, and we don't actually trigger that async'ness in
> this case.

We definitely do some asyncness (at least one main thread runnable), but according what I have always seen in logs, redirects are surprisingly processed (approved) very quickly (~1ms tops)

> 
> We also probably have a perf advantage with InterceptedHttpChannel because
> we avoid the overhead of creating an http cache entry.  I'm not sure how
> expensive that is, but it seems like it could potentially depend on some
> disk writes.

I believe we were using memory only cache, there is absolutely no overhead, no disk IO.  I more think that the overall complexity was causing it.  Could some IPC be saved?

> 
> As an added bonus, one of the reasons we wanted to do this was so we could
> enable better streaming in bug 1390638.  So even better perf will follow on
> these improvements.
(In reply to Honza Bambas (:mayhemer) from comment #76)
> > Fall-through fetch event: ~80ms median, ~100ms mean improvement
> > Pass-through fetch event: ~70ms median, ~100ms mean improvement
> > Cache API fetch event: ~60ms median, ~70ms mean improvement
> > Data URL fetch event: ~150ms median, ~123ms mean improvement
> 
> There are all great numbers!  How exactly are you collecting them?

This site:

https://jakearchibald.github.io/service-worker-benchmark/

Install the corresponding service worker.  Then I hit reload to get a number.  Make sure not to wait too long between reloads or the service worker will idle out and you will see overhead from re-starting it.

> > This is a bit surprising to me given it potentially has an extra async
> > runnable for the internal redirect.  I guess the redirects are only
> > potentially async, though, and we don't actually trigger that async'ness in
> > this case.
> 
> We definitely do some asyncness (at least one main thread runnable), but
> according what I have always seen in logs, redirects are surprisingly
> processed (approved) very quickly (~1ms tops)

I was measuring ~5ms to ~15ms on an opt build in this linux VM.  From constructor to AsyncOpen().

> > We also probably have a perf advantage with InterceptedHttpChannel because
> > we avoid the overhead of creating an http cache entry.  I'm not sure how
> > expensive that is, but it seems like it could potentially depend on some
> > disk writes.
> 
> I believe we were using memory only cache, there is absolutely no overhead,
> no disk IO.  I more think that the overall complexity was causing it.  Could
> some IPC be saved?

This is non-e10s, so there should not be IPC.  Perhaps nsHttpChannel has more async runnables in play after AsyncOpen.  Also, it could just be how things line up on the main thread.
Andrew, Valentin, do you mind both reviewing this?  I figure Andrew can look at it from a service worker angle and Valentin could make sure its an acceptable nsIChannel implementation.  Does that sound reasonable?

This is the new InterceptedHttpChannel class.  I wrote a long block comment in the header about how it works, so I won't repeat that here.

The next patch will add the code that make nsHttpChannel perform an internal redirect to this channel when we do an interception.
Attachment #8912916 - Attachment is obsolete: true
Attachment #8913349 - Flags: review?(valentin.gosu)
Attachment #8913349 - Flags: review?(bugmail)
Comment on attachment 8913349 [details] [diff] [review]
P8 Add new InterceptedHttpChannel class to represent parent-side channel with a ServiceWorker FetchEvent. r=asuth r=valentin

Last minute change I made broke something.  Sorry for the flag churn.
Attachment #8913349 - Flags: review?(valentin.gosu)
Attachment #8913349 - Flags: review?(bugmail)
Comment on attachment 8913394 [details] [diff] [review]
P9 Make nsHttpChannel redirect to InterceptedHttpChannel to fire a ServiceWorker FetchEvent.

Asking for double review again.

This patch performs the internal redirect from nsHttpChannel to InterceptedHttpChannel.  This is done in Connect() just like in HttpChannelChild.  It needs to happen here instead of earlier in AsyncOpen so that things like HSTS upgrade can take place.

I also included a minor devtools fix here.  Since we are no longer performing the interception directly in nsHttpChannel using http cache we don't get the cache notifications any more.  The devtools change just makes us fake the cache notification like we currently do for e10s.
Attachment #8913394 - Flags: review?(valentin.gosu)
Attachment #8913394 - Flags: review?(bugmail)
Comment on attachment 8913395 [details] [diff] [review]
P11 Remove old nsHttpChannel interception bits. r=valentin

Remove the now-stale interception code from nsHttpChannel.  This is a mostly mechanical change, but I had to think about some of the combined logic statements to try to get them right.  A double-check is appreciated.
Attachment #8913395 - Flags: review?(valentin.gosu)
Updated to for review feedback to use mChannel.get(), etc.
Attachment #8912696 - Attachment is obsolete: true
Attachment #8913400 - Flags: review+
Try is green.  Looks like this just needs final reviews, etc.
(In reply to Ben Kelly [:bkelly] from comment #77)
> (In reply to Honza Bambas (:mayhemer) from comment #76)
> > > Fall-through fetch event: ~80ms median, ~100ms mean improvement
> > > Pass-through fetch event: ~70ms median, ~100ms mean improvement
> > > Cache API fetch event: ~60ms median, ~70ms mean improvement
> > > Data URL fetch event: ~150ms median, ~123ms mean improvement
> > 
> > There are all great numbers!  How exactly are you collecting them?
> 
> This site:
> 
> https://jakearchibald.github.io/service-worker-benchmark/
> 
> Install the corresponding service worker.  Then I hit reload to get a
> number.  Make sure not to wait too long between reloads or the service
> worker will idle out and you will see overhead from re-starting it.

BTW, I should mention these numbers are measuring overall load time for a page with a couple hundred small image requests.  The per-channel improvement is not nearly 100ms.  The changes in this bug seem to just seem to support this sites massively parallel image load better for some reason.
Blocks: 1204254
Comment on attachment 8912482 [details] [diff] [review]
P7 Make HttpChannelParent and HttpChannelParentListener allow an internal redirect during service worker interception. r=jdm

Review of attachment 8912482 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +1840,1 @@
>    if (mIPCClosed)

Let's check for a closed IPC first.

::: netwerk/protocol/http/HttpChannelParentListener.cpp
@@ +328,5 @@
>  {
> +  // Its possible for the child-side interception to complete and tear down
> +  // the actor before we even get this parent-side interception notification.
> +  // In this case just immediately cancel the channel.  We do this via a
> +  // runnable to avoid failing the interception redirect back in some cases.

I'm not sure what "failing the interception redirect back" means.

@@ +401,5 @@
>  {
>    MOZ_ASSERT(aListener);
>    MOZ_RELEASE_ASSERT(mSuspendedForDiversion, "Must already be suspended!");
>  
> +  mInterceptCanceled = false;

A comment about why this is necessary would be useful.
Attachment #8912482 - Flags: review?(josh) → review+
Note, I need to add another patch to get thread retargeting to really work with InterceptedHttpChannel.  I don't implement nsIThreadRetargetableStreamListener, so the pump will never permit retargeting right now.
(In reply to Ben Kelly [:bkelly] from comment #73)
> Cache API fetch event: ~45ms median, ~60ms mean improvement

> Note, I remeasured the base data as well which is why some of these numbers
> are worse than the last comment.

The reason the cache number went up is that its not a statistically significant change.
Valentin, this patch gets InterceptedHttpChannel to actually retarget OMT.  It:

1. Implements nsIRetargetableThreadStreamListener since it listens directly to the nsInputStreamPump.
2. Proxies OnStatus() and OnProgress() calls to the main thread when necessary.  Some atomic members are used to allow the current progress to update OMT and to opportunistically call OnStatus/OnProgress on the main thread.  Another atomic boolean is used to avoid dispatching too many runnables at once.

Note, most of the perf changes I saw in comment 72 and 73 go away with this patch.  Most of the benchmark cases are now statistically the same without this patchset.  The one case that still shows a change is the data URL case with ~200ms improvement on load time for the page.
Attachment #8914336 - Flags: review?(valentin.gosu)
I need to track down an intermittent:

devtools/shared/webconsole/test/test_console_serviceworker_cached.html | received correct number of console calls
got +0, expected 1
It doesn't look like review of this patch has started yet, so uploading a modified version.  This makes InterceptedHttpChannel::OnStopRequest() call gHttpHandler->DoStopRequest().
Attachment #8913358 - Attachment is obsolete: true
Attachment #8913358 - Flags: review?(valentin.gosu)
Attachment #8913358 - Flags: review?(bugmail)
Attachment #8914764 - Flags: review?(valentin.gosu)
Attachment #8914764 - Flags: review?(bugmail)
While I'm changing the flags here, lets fold P12 into P8.  Sorry for the churn.
Attachment #8914336 - Attachment is obsolete: true
Attachment #8914764 - Attachment is obsolete: true
Attachment #8914336 - Flags: review?(valentin.gosu)
Attachment #8914764 - Flags: review?(valentin.gosu)
Attachment #8914764 - Flags: review?(bugmail)
Attachment #8914768 - Flags: review?(valentin.gosu)
Attachment #8914768 - Flags: review?(bugmail)
Comment on attachment 8913394 [details] [diff] [review]
P9 Make nsHttpChannel redirect to InterceptedHttpChannel to fire a ServiceWorker FetchEvent.

Review of attachment 8913394 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +546,5 @@
>          return NS_ERROR_DOCUMENT_NOT_CACHED;
>      }
>  
> +    if (ShouldIntercept()) {
> +      return RedirectToInterceptedChannel();

nit: 4 space indent

@@ +9596,5 @@
>  
> +nsresult
> +nsHttpChannel::RedirectToInterceptedChannel()
> +{
> +  mInterceptCache = INTERCEPTED;

If rv is a failure code, should we still set this to intercepted?

@@ +9635,5 @@
> +    PopRedirectAsyncFunc(
> +        &nsHttpChannel::ContinueAsyncRedirectChannelToURI);
> +  }
> +
> +  return rv;

nit: 4 space indent

::: netwerk/protocol/http/nsHttpChannel.h
@@ +498,5 @@
>      void SetDoNotTrack();
>  
>      already_AddRefed<nsChannelClassifier> GetOrCreateChannelClassifier();
>  
> +    MOZ_MUST_USE nsresult RedirectToInterceptedChannel();

Add a 1-2 line comment explaining what this does
Comment on attachment 8913394 [details] [diff] [review]
P9 Make nsHttpChannel redirect to InterceptedHttpChannel to fire a ServiceWorker FetchEvent.

Review of attachment 8913394 [details] [diff] [review]:
-----------------------------------------------------------------

Also, it would be great if you could include the details from Comment 83 in the description of the patch.
Attachment #8913394 - Flags: review?(valentin.gosu) → review+
Attachment #8913395 - Flags: review?(valentin.gosu) → review+
(In reply to Valentin Gosu [:valentin] from comment #98)
> Comment on attachment 8913394 [details] [diff] [review]
> P9 Make nsHttpChannel redirect to InterceptedHttpChannel to fire a
> @@ +9596,5 @@
> >  
> > +nsresult
> > +nsHttpChannel::RedirectToInterceptedChannel()
> > +{
> > +  mInterceptCache = INTERCEPTED;
> 
> If rv is a failure code, should we still set this to intercepted?

This assignment is removed in P11, so I'm not going to change anything in this patch.  I'll address the other items.

Thanks!
So I've been investigating this error:

devtools/shared/webconsole/test/test_console_serviceworker_cached.html | received correct number of console calls
got +0, expected 1

I tracked down the problem to this check:

            // Filter out messages that came from a ServiceWorker but happened
            // before the page was requested.
            if (cachedMessage.innerID === "ServiceWorker" &&
                requestStartTime > cachedMessage.timeStamp) {
              return;
            }

Here:

https://searchfox.org/mozilla-central/source/devtools/server/actors/webconsole.js#821

When this test fails we are just missing the time comparison.  The requestStartTime is 1 millisecond or so less than cachedMessage.timeStamp.

Now, this should *not* be possible.  The requestStartTime comes from the window performanceTiming.requestStart value which in turn comes from the channel mAsyncOpenTime.  (Since we don't have a real transaction request time, etc)  We absolutely set mAsyncOpenTime before triggering the FetchEvent.

In addition, this test is only failing on windows taskcluster machines where its executed in a VM.  This is the same platform we have had trouble with Timestamp comparisons in other cases.  For example, see bug 1371438.  Our suspicion is that this is being caused by our windows TimeStamp code switching from high precision counters to its fallback mechanism.

I believe the reason this issue is effecting the test now is that the time between AsyncOpen and the FetchEvent is now much less.  On my local machine it used to be ~40ms.  With the new code it is only ~2ms (in the right direction).  So its now close enough that the TimeStamp precision code on windows could cause it fail.

I'm not going to solve this windows TimeStamp issue here, but I need to figure out what to do for this test.  My options seem to be:

1. Disable the test on windows.
2. Add a setTimeout() in the FetchEvent before doing the console.log().  This would probably fix it, but it would mean our checking is not as strict as it could be on other platforms.
3. Pass the original nsHttpChannel's creation time and mAsyncOpenTime into the InterceptedHttpChannel.  This would roughly maintain parity with the old code.  I'm not sure if its ok, though, since I don't see any other necko channels doing this.

Personally I like option (3) because ideally we should still track time from the initial open even if we do an internal redirect before the FetchEvent.  We shouldn't hide that time from performance timing or the devtools network monitor.

Valentin, what do you think?  Would you be ok with me passing the creation/start time from nsHttpChannel to InterceptedHttpChannel?  Brian, do you have a preference about what we do here?
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(bgrinstead)
(In reply to Ben Kelly [:bkelly] from comment #101)
> Valentin, what do you think?  Would you be ok with me passing the
> creation/start time from nsHttpChannel to InterceptedHttpChannel?  Brian, do
> you have a preference about what we do here?

I think that would be OK, this situation.
Come to think of it, there might be other cases when we should be doing exactly that.
Flags: needinfo?(valentin.gosu)
This propagates the start times as discussed in the previous comments.  If its green I'll flag for review.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d3598e653df1b30716a2a540583fb17063cc9de7
Flags: needinfo?(bgrinstead)
Forgot to fold another patch together with this one.
Attachment #8915132 - Attachment is obsolete: true
The assertion in the previous version was too strict.  It needs to account for mTimingEnabled being false.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=539896bfbbb679ee7f7e5b8470227032fb72cd9e
Attachment #8915133 - Attachment is obsolete: true
Re-base.
Attachment #8913395 - Attachment is obsolete: true
Attachment #8915353 - Flags: review+
Comment on attachment 8915164 [details] [diff] [review]
P12 Propagate creation and start times to InterceptedHttpChannel from original channel. r=valentin

This patch copies the original creation/AsyncOpen times from the nsHttpChannel into the InterceptedHttpChannel as discussed in comment 101.  This avoids hiding the original time via the internal redirect.  It also allows us to pass the devtools test on windows VMs where timestamp checking can be a bit fuzzy.
Attachment #8915164 - Flags: review?(valentin.gosu)
Sorry, one more small change to the InterceptedHttpChannel patch.  (If you guys start reviewing this, please let me know and I will post as follow-up patches.)

This update makes sure we call Performance::AddEntry() in OnStopRequest().
Attachment #8914768 - Attachment is obsolete: true
Attachment #8914768 - Flags: review?(valentin.gosu)
Attachment #8914768 - Flags: review?(bugmail)
Attachment #8915357 - Flags: review?(valentin.gosu)
Attachment #8915357 - Flags: review?(bugmail)
I think I also need to set responseStart and responseEnd.  Right now we are reporting a duration of zero which is not correct.
Ok, this should be the last version of InterceptedHttpChannel.  Sorry again for all the churn.

This last update makes us set synthetic responseStart/responseEnd times to properly communicate timings when interception happens.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e99b42fa4081e94905b7a77eddcd93983487034.
Attachment #8915357 - Attachment is obsolete: true
Attachment #8915357 - Flags: review?(valentin.gosu)
Attachment #8915357 - Flags: review?(bugmail)
Attachment #8915717 - Flags: review?(valentin.gosu)
Attachment #8915717 - Flags: review?(bugmail)
I realized that my changed in bug 1191943 effectively moved the PerformanceTiming.requestStart value such that it is almost identical to the SW console message in comment 101.  This means that problem would likely return.

I think we should probably compare against the navigationStart in webconsole.js.  This will encompass a more realistic time frame for the window load time.  The requestStart was somewhat artificial in the past.

Note, I still want to keep P12 even with this change.  I want to show the true AsyncOpen() time in PerformanceRequestTiming.startTime and not hide it behind the internal redirect here.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c42688a8a68fd90ca7a32938f5a07879854d72da
Attachment #8915724 - Flags: review?(bgrinstead)
Comment on attachment 8915724 [details] [diff] [review]
P13 Make webconsole.js compare messages against navigationStart when determining if service worker messages should be shown. r=bgrins

Review of attachment 8915724 [details] [diff] [review]:
-----------------------------------------------------------------

Seems fine to me assuming there's no case where a message that should be skipped is received between the prompt for unloading the previous document and the request to obtain the new document. I don't know of any reason why that would happen.
Attachment #8915724 - Flags: review?(bgrinstead) → review+
Comment on attachment 8915717 [details] [diff] [review]
P8 Add new InterceptedHttpChannel class to represent parent-side channel with a ServiceWorker FetchEvent. r=asuth r=valentin

Review of attachment 8915717 [details] [diff] [review]:
-----------------------------------------------------------------

Really good implementation. I didn't catch any issues with it. Just a few minor nits.

::: netwerk/protocol/http/InterceptedHttpChannel.cpp
@@ +542,5 @@
> +  }
> +
> +  nsresult rv = NS_OK;
> +  nsCOMPtr<nsIIOService> ioService;
> +  rv = gHttpHandler->GetIOService(getter_AddRefs(ioService));

nit: there's no point in passing in the ioService to NS_NewChannelInternal if we're not going to cache it or use it for something else as well.

::: netwerk/protocol/http/InterceptedHttpChannel.h
@@ +31,5 @@
> +//    FinishSynthesizedResponse() is called the synthesized data must be
> +//    reported back to the channel listener.  This is handled in a few
> +//    different ways:
> +//      a. If a redirect was synthesized, then we perform the redirect to
> +//         a new nsHttpChannel.  This new chnnale might trigger yet another

nit: channel typo

@@ +171,5 @@
> +  NS_DECL_NSIREQUESTOBSERVER
> +  NS_DECL_NSISTREAMLISTENER
> +  NS_DECL_NSICHANNELWITHDIVERTABLEPARENTLISTENER
> +  NS_DECL_NSITHREADRETARGETABLEREQUEST
> +  NS_DECL_NSITHREADRETARGETABLESTREAMLISTENER

nit: we normally put the NS_DECL_* at the beginning of the class definition.
Attachment #8915717 - Flags: review?(valentin.gosu) → review+
Attachment #8915164 - Flags: review?(valentin.gosu) → review+
Comment on attachment 8912323 [details] [diff] [review]
P10 Don't expect extraneous console message from service worker any more. r=brgins

Review of attachment 8912323 [details] [diff] [review]:
-----------------------------------------------------------------

note that you've got a typo in the r= line here.
Comment on attachment 8915717 [details] [diff] [review]
P8 Add new InterceptedHttpChannel class to represent parent-side channel with a ServiceWorker FetchEvent. r=asuth r=valentin

Review of attachment 8915717 [details] [diff] [review]:
-----------------------------------------------------------------

Woo!  This is so very awesome.

::: netwerk/protocol/http/InterceptedHttpChannel.cpp
@@ +290,5 @@
> +
> +  // Capture the current status from our atomic count.
> +  int64_t progress = mProgress;
> +
> +  // Note that any asynchronous progress update is now taking place.

nit: odd phrasing.

@@ +408,5 @@
> +  if (mLoadGroup) {
> +    mLoadGroup->AddRequest(this, nullptr);
> +  }
> +
> +  // If we already have a synthesized body then this we are pre-synthesized.

typo/phrasing nit: s/this// or similar.

Also, the contextual comments like this block here and elsewhere are fantastic and invaluable, thank you!

@@ +644,5 @@
> +  if (ShouldRedirect()) {
> +    return FollowSyntheticRedirect();
> +  }
> +
> +  SetApplyConversion(false);

nit: I liked the "// Intercepted responses should already be decoded." comment in InterceptedChannel.cpp as it helped contextualize that this wasn't pure boilerplate, maybe put it or something along those lines here?

@@ +647,5 @@
> +
> +  SetApplyConversion(false);
> +
> +  if (!mBodyReader) {
> +    nsresult rv = NS_NewCStringInputStream(getter_AddRefs(mBodyReader),

And a comment here along the lines of RespondWithHandler::ResolvedCallback's "// Errors and redirects may not have a body." seems desirable.
Attachment #8915717 - Flags: review?(bugmail) → review+
Attachment #8913394 - Flags: review?(bugmail) → review+
(In reply to Andrew Sutherland [:asuth] from comment #116)
> note that you've got a typo in the r= line here.

Thanks.  I fixed this locally.
Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/479b4df3b500
P1 Rename nsIInterceptedChannel.cancel() to cancelInterception() r=jdm
https://hg.mozilla.org/integration/mozilla-inbound/rev/0213c0750dfb
P2 Only validate docloader channel on redirect if its been set. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/9e9a3e50756b
P3 Allow CSP report channels to be internally redirected. r=ckerschb
https://hg.mozilla.org/integration/mozilla-inbound/rev/c617f24af61c
P4 Don't count internal redirects towards Response.redirected r=tt
https://hg.mozilla.org/integration/mozilla-inbound/rev/45731595169e
P5 Move some helper methods from nsHttpChannel to HttpBaseChannel. r=valentin
https://hg.mozilla.org/integration/mozilla-inbound/rev/f77fb4f77092
P6 Make HttpChannelParent operate on HttpBaseChannel objects instead of nsHttpChannel directly. r=valentin
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d908b514d1e
P7 Make HttpChannelParent and HttpChannelParentListener allow an internal redirect during service worker interception. r=jdm
https://hg.mozilla.org/integration/mozilla-inbound/rev/066c68db5772
P8 Add new InterceptedHttpChannel class to represent parent-side channel with a ServiceWorker FetchEvent. r=asuth r=valentin
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ff6aa3fae0b
P9 Make nsHttpChannel redirect to InterceptedHttpChannel to fire a ServiceWorker FetchEvent. r=asuth r=valentin
https://hg.mozilla.org/integration/mozilla-inbound/rev/5ad4700e9eec
P10 Don't expect extraneous console message from service worker any more. r=bgrins
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d6980bc0601
P11 Remove old nsHttpChannel interception bits. r=valentin
https://hg.mozilla.org/integration/mozilla-inbound/rev/cdf9bc83b20d
P12 Propagate creation and start times to InterceptedHttpChannel from original channel. r=valentin
Depends on: 1407186
Depends on: 1407302
Blocks: 1395837
Telemetry shows some overall service worker interception duration improvements as a result of this change:

https://mzl.la/2ymHV5b

Mean interception navigation time dropped about 50% on mobile.  The median stayed roughly the same, though, so this was almost all 95-percentile delays coming way down.  My guess would be that we are triggering fewer main thread runnables and avoiding jank that way.

There is also improvements on time dispatch the navigation fetch event:

https://mzl.la/2yndzQd

About 30ms mean and 10ms median.  This may be due to fewer main thread runnables conflicting with service worker script loader runnables.

And some slight improvements in service worker launch time in mobile at the same time which is consistent:

https://mzl.la/2ynFHCJ
Depends on: 1420672
Apparently test_workerupdatefound.html is failing on non-e10s.  I'll have to investigate that.
Depends on: 1465580
Depends on: 1465587
Regressions: 1589749
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: