Closed Bug 1220681 Opened 9 years ago Closed 9 years ago

DivertToParent() does not work with intercepted channels

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla46
Tracking Status
e10s + ---
firefox44 --- disabled
firefox45 --- fixed
firefox46 --- fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

()

Details

Attachments

(6 files, 14 obsolete files)

4.32 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
5.10 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
11.06 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
7.21 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
1.68 KB, patch
jdm
: review+
Details | Diff | Splinter Review
3.01 KB, patch
jdm
: review+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #1220678 +++

This bug is to make DivertToParent() actually work for synthesized responses in e10s mode.  Without it, things like the "Download Publication" link here:

 https://jakearchibald.github.io/ebook-demo/publisher-site/readme/

Won't work.  We must get the intercepted data to the parent process so it can be saved to disk.
Since v1 will ship without e10s, this is not a blocker.  We need to fix it before e10s is shipped, though.
Blocks: ServiceWorkers-postv1
No longer blocks: ServiceWorkers-v1
tracking-e10s: --- → ?
Ben, we're not going to block our rollout on this, are you planning on tracking + fixing this for e10sd rollout?
Flags: needinfo?(bkelly)
(In reply to Jim Mathies [:jimm] from comment #2)
> Ben, we're not going to block our rollout on this, are you planning on
> tracking + fixing this for e10sd rollout?

Yes.  What is the planned timeframe for the rollout?
Flags: needinfo?(bkelly)
(In reply to Ben Kelly [:bkelly] from comment #3)
> (In reply to Jim Mathies [:jimm] from comment #2)
> > Ben, we're not going to block our rollout on this, are you planning on
> > tracking + fixing this for e10sd rollout?
> 
> Yes.  What is the planned timeframe for the rollout?

Undetermined at this point. We're starting to prompt on beta and accumulating a lot of performance data. We'll probably start prompting on the release channel or doing some forced A/B experimentation there in Q1 2016. I would suggest that if this is a serious problem it should get fixed by the end of year.
Blocks: e10s-swbeta
I'm going to fix this next week.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Attached patch wip (obsolete) — Splinter Review
This is harder than I thought to fix in the short term.  Its going to get ugly.
Attachment #8698667 - Attachment is patch: true
I think I have a better plan for this now.  For a DivertToParent() call on a synthesized response channel:

1) Set a flag indicating the parent should be suspended on intercept.
2) Call ContinueAsyncOpen() to open the parent actor.  This includes the flag from (1)
3) The parent will then suspend itself when its synthesized.  (We need to make sure the status code is set properly before suspension, though.)
4) Callbacks from the child mSynthesizedStreamPump will be forward to the parent and its diversion listener.
This patch allows me to use this demo in e10s:

  https://jakearchibald.github.io/ebook-demo/publisher-site/readme/

I verified the resulting files were correct using:

  dd bs=1 skip=1 if=readme.pubarc of=readme.zip
  zipinfo readme.zip

Most of the code was already present in necko.  For example, there were already IPC calls for diverting the OnDataAvailable and OnStopRequest callbacks.  I basically just had to hook the mSynthesizedStreamPump up to the IPC calls.

Also, I had to create the parent IPC actor just-in-time for the diversion.
Attachment #8698667 - Attachment is obsolete: true
Attachment #8699170 - Flags: review?(josh)
Remove an unnecessary change.
Attachment #8699170 - Attachment is obsolete: true
Attachment #8699170 - Flags: review?(josh)
Attachment #8699174 - Flags: review?(josh)
Comment on attachment 8699170 [details] [diff] [review]
Make HttpChannelChild::DivertToParent() work with synthetic responses. r=jdm

Review of attachment 8699170 [details] [diff] [review]:
-----------------------------------------------------------------

This looks right, but we need to sort out what to do with Suspend/Resume.

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +1690,5 @@
>        SendSuspend();
>        mSuspendSent = true;
>      }
> +    if (mSynthesizedResponsePump) {
> +      mSynthesizedResponsePump->Suspend();

I'm worried that this behaviour doesn't match HttpChannelChild::Resume, and I'm having trouble reasoning about the change here.

@@ +2452,2 @@
>    // If we have a synthesized response, then there is no parent actor.  We
>    // need to make this work somehow, but for now avoid crashing.  (bug 1220681)

This comment should be updated.
Attachment #8699170 - Attachment is obsolete: false
Comment on attachment 8699174 [details] [diff] [review]
Make HttpChannelChild::DivertToParent() work with synthetic responses. r=jdm

Review of attachment 8699174 [details] [diff] [review]:
-----------------------------------------------------------------

Oh good, my only concern from the previous one has gone away.

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +2452,2 @@
>    // If we have a synthesized response, then there is no parent actor.  We
>    // need to make this work somehow, but for now avoid crashing.  (bug 1220681)

Update this comment.
Attachment #8699174 - Flags: review?(josh) → review+
Attachment #8699170 - Attachment is obsolete: true
Fixed the comment.

I'm working on a test now.
Attachment #8699174 - Attachment is obsolete: true
Attachment #8699183 - Flags: review+
Attached patch test wip (obsolete) — Splinter Review
Attached patch test wip (obsolete) — Splinter Review
Attachment #8699197 - Attachment is obsolete: true
Attached patch test wip (obsolete) — Splinter Review
Attachment #8699201 - Attachment is obsolete: true
I believe this test covers the same behavior in the demo that originated this bug.

It demonstrates, though, that my P1 patch is not adequate.  I can trigger crashes running this test in e10s.  Investigating.
Attachment #8699203 - Attachment is obsolete: true
Attachment #8700065 - Flags: review?(ehsan)
I think I need to make the channel suspend after synthesizing the response for this path.  There is a race between the AsyncOpen I am triggering and the DivertToParent() call to Suspend().
Comment on attachment 8700065 [details] [diff] [review]
P2 Test synthetic responses that trigger downloads. r=ehsan

Review of attachment 8700065 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/test/serviceworkers/browser.ini
@@ +2,5 @@
>  support-files =
>    browser_base_force_refresh.html
>    browser_cached_force_refresh.html
> +  browser_download_window.html
> +  browser_download_worker.js

Nit: as a convention, browser_%s.js is how we name tests.  Please drop the browser_ prefix from the js file.

::: dom/workers/test/serviceworkers/browser_download.js
@@ +45,5 @@
> +  SpecialPowers.pushPrefEnv({'set': [['dom.serviceWorkers.enabled', true],
> +                                     ['dom.serviceWorkers.exemptFromPerDomainMax', true],
> +                                     ['dom.serviceWorkers.testing.enabled', true],
> +                                     ['dom.serviceWorkers.interception.enabled', true],
> +                                     ['dom.caches.enabled', true]]},

Nit: the test is not using the Cache API.
Attachment #8700065 - Flags: review?(ehsan) → review+
So, the crash I was encountering actually triggers when the parent side of the divert runs before the response is synthesized in the parent channel.  This occurs because the diversion listener ends up clearing the channel callbacks effectively removing the intercept controller.

This patch makes us delay diversion if response synthesis is still in progress.

I also had to tweak the child-side DiverToParent() to call Suspend() after the mDivertingToParent flag is set.  This ensures we don't the normal SendSuspend() call for diversion.  The comments in HttpChannelChild::Suspend() suggest this was always the intention.
Attachment #8700189 - Flags: review?(josh)
Try hit an assertion on mac opt DEBUG build:

  Assertion failure: !mDivertingToParent (mDivertingToParent should be unset before OnStartRequest!), at /builds/slave/try-m64-0000000000000000000000/build/src/netwerk/protocol/http/HttpChannelChild.cpp:400

I think there is also a race in the other direction where the synthesizing process on the parent can complete before the diversion begins and suspends it.  I think we just need to ignore these spurious IPC events if we know we are in this condition.

Patch forthcoming.
It might be slightly more elegant to prevent the start request being sent from parent-to-child, but that would require changing the IPC messages with a new flag.  Since we're going to rip this code out soon, anyway, I think its easier to just ignore the duplicate message when it occurs.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=bd8da9e786e1
Attachment #8700329 - Flags: review?(josh)
There is a 10% intermittent on mac os opt.  I think this is timing out due to a test problem on this platform since I'm working with browser chrome and dialogs.  It does not reproduce on any other platform.

I'm just going to disable the test on mac for now and address in a follow-up bug.
An alternative P4 that suspends the parent automatically after synthesizing a response for diversion.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ee063e48a12
Comment on attachment 8700189 [details] [diff] [review]
P3 Delay diversion on parent side until response head has been synthesized. r=jdm

Review of attachment 8700189 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +1476,5 @@
> +  // to perform the diversion.  Some of our diversion listeners clear callbacks
> +  // which breaks the synthesis process.
> +  if (mSynthesizedResponseHead) {
> +    mDivertListener = aListener;
> +    mPendingDiversion = true;

Let's assert that this is true instead.

::: netwerk/protocol/http/HttpChannelParent.h
@@ +215,5 @@
>  
>    nsAutoPtr<nsHttpResponseHead> mSynthesizedResponseHead;
>  
>    RefPtr<HttpChannelParentListener> mParentListener;
> +  // This is listener we are diverting to or will divert to if mPendingDiversion

Change "This is" to "The"
Attachment #8700189 - Flags: review?(josh) → review+
Comment on attachment 8700329 [details] [diff] [review]
P4 Ignore duplicate RecvOnStartRequest() IPC message when diverting synthesized response. r=jdm

Review of attachment 8700329 [details] [diff] [review]:
-----------------------------------------------------------------

I'd like to try suspending the channel in the parent as part of creating it in P2, which should remove for this change entirely.
Attachment #8700329 - Flags: review?(josh) → review-
Update to actually do the suspend after synthesizing.  I missed a runnable async bounce.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=42ee5cdf056c
Attachment #8700759 - Attachment is obsolete: true
Comment on attachment 8700769 [details] [diff] [review]
P4 Automatically suspend the parent channel after synthesizing the response for diverison. r=jdm

Review of attachment 8700769 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +1887,5 @@
>  
>    if (mResponseHead) {
>      openArgs.synthesizedResponseHead() = *mResponseHead;
> +    openArgs.suspendAfterSynthesizeResponse() =
> +      mSuspendParentAfterSynthesizeResponse;

We'll need to initialize this field in the else as well.
Attachment #8700769 - Flags: review+
Actually, now its suspending too late in the patch in comment 28.
Grr... that patch didn't read from my smb share correctly either.
Had to use psftp to get this off my vm correctly.
Attachment #8700776 - Attachment is obsolete: true
Attachment #8700779 - Flags: review+
It seems my previous retriggers might not have been running the new browser_download.js on every platform.  On some platforms its in bc3 and on others its in bc4.  Trying to sort that out in the latest try.
Clean try to do more retriggers.  The previous one was getting quite cluttered:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=59b53ca0d59c
Another try separated out from bug 1233962.  That other bug had some instabilities in it.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=64b5e4f0c012
Updated for review feedback.
Attachment #8700189 - Attachment is obsolete: true
Attachment #8701070 - Flags: review+
The P4 patch automatically suspends the channel on the parent side, but we have no matching Resume() call.  This patch simply makes the SuspendForDiversion() adopt the ownership of the automatic suspend if it was performed.

This fixes the leak locally.  New try build:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=acaa89331178
Attachment #8701071 - Flags: review?(josh)
Comment on attachment 8701071 [details] [diff] [review]
P5 Don't double suspend parent channel during synthesized divert to parent. r=jdm

Review of attachment 8701071 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +1443,5 @@
>    // Try suspending the channel. Allow it to fail, since OnStopRequest may have
> +  // been called and thus the channel may not be pending.  If we've already
> +  // automatically suspended after synthesizing the response, then we don't
> +  // need to suspend again here.
> +  if (!mSuspendAfterSynthesizeResponse) {

Given that SuspendForDiversion is called via MaybeFlushPendingDiversion from SynthesizeResponse, why can't we remove the Suspend call there (and the whole mSuspendAfterSynthesizeResponse logic, including here)and leave this method unmodified? What am I missing?
(In reply to Josh Matthews [:jdm] from comment #39)
> Comment on attachment 8701071 [details] [diff] [review]
> P5 Don't double suspend parent channel during synthesized divert to parent.
> r=jdm
> 
> Review of attachment 8701071 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: netwerk/protocol/http/HttpChannelParent.cpp
> @@ +1443,5 @@
> >    // Try suspending the channel. Allow it to fail, since OnStopRequest may have
> > +  // been called and thus the channel may not be pending.  If we've already
> > +  // automatically suspended after synthesizing the response, then we don't
> > +  // need to suspend again here.
> > +  if (!mSuspendAfterSynthesizeResponse) {
> 
> Given that SuspendForDiversion is called via MaybeFlushPendingDiversion from
> SynthesizeResponse, why can't we remove the Suspend call there (and the
> whole mSuspendAfterSynthesizeResponse logic, including here)and leave this
> method unmodified? What am I missing?

Because the divert call might not have come in yet.  We can complete the synthesize before the diversion actor gets a chance to run.
So MaybeFlushPendingDiversion() might do nothing.  And then SuspendForDiversion() and DivertTo() are called later.
My theory for the timeouts in browser_download.js is that we're killing the worker after I call unregister() in the fetch event, but before I can return a response from the fetch event.  The unregister algorithm terminates the active worker if there are not controlled clients.  Since this fetch event is for the first navigation after registering the SW, there are no clients yet.

This patch uses clients.claim() to make the initial window a controlled client to avoid the race.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b146c64150a3

I've filed a spec issue to clarify behavior for this particular edge case:

https://github.com/slightlyoff/ServiceWorker/issues/804
If we are going to keep the worker alive for a client we also need to restrict which fetch events we treat as downloads.  Otherwise the service worker can live through to the next test case and block the load of its window.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d7a97c3f9202
Attachment #8701332 - Attachment is obsolete: true
Attachment #8701071 - Flags: review?(josh) → review+
Comment on attachment 8701482 [details] [diff] [review]
P6 Use clients.claim() in browser_download.js to avoid worker unregister race. r=ehsan

Review of attachment 8701482 [details] [diff] [review]:
-----------------------------------------------------------------

This is looking green on try.

See P2 for the rest of the test.
Attachment #8701482 - Flags: review?(josh)
Comment on attachment 8701482 [details] [diff] [review]
P6 Use clients.claim() in browser_download.js to avoid worker unregister race. r=ehsan

Review of attachment 8701482 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/test/serviceworkers/download_worker.js
@@ +13,2 @@
>  addEventListener('fetch', function(evt) {
> +  // This worked may live long enough to receive a fetch event from the next

s/worked/worker/
Attachment #8701482 - Flags: review?(josh) → review+
Comment on attachment 8699183 [details] [diff] [review]
P1 Make HttpChannelChild::DivertToParent() work with synthetic responses. r=jdm

Request approval for all patches, P1 to P6.

Approval Request Comment
[Feature/regressing bug #]: Service workers with e10s.
[User impact if declined]: Certain service worker sites will fail to work if the user's browser is configured to run in e10s mode.
[Describe test coverage new/current, TreeHerder]: test included
[Risks and why]: Moderate, while this change is for service workers it does touch code paths related to non-service worker netwerk operations.  For this reason I'd like to land it early in the aurora cycle.
[String/UUID change made/needed]: None.
Attachment #8699183 - Flags: approval-mozilla-aurora?
Comment on attachment 8699183 [details] [diff] [review]
P1 Make HttpChannelChild::DivertToParent() work with synthetic responses. r=jdm

ok, we are planning to do things with e10s in 45. Let's take it.
Attachment #8699183 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: