e10s breaks necko back pressure

RESOLVED FIXED in Firefox 63

Status

()

defect
P2
normal
RESOLVED FIXED
3 years ago
3 months ago

People

(Reporter: mcmanus, Assigned: junior)

Tracking

unspecified
mozilla63
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(e10s-, firefox63 fixed)

Details

(Whiteboard: [necko-next][necko-quantum])

Attachments

(2 attachments)

normal single process data flow is that the socket thread fills a pipe and the target thread drains the pipe in onDataAvailable. If the pipe fills up the network transaction associated with it also pauses and eventually (after a few buffers in between fill up) the server stops sending data.

The major use case for this is channel.Suspend() and that continues to work fine in e10s. However, this also comes into play simply when the channel code is janky and doesn't get to run for a while and drain the pipe that was filled by the socket thread (at least in single process).

in e10s if the parent process runs quickly it will consume data out of that pipe and push it into ipdl where it essentially lives in the event queue - that frees the socket thread up to keep downloading more data and keep putting it in the event queue no matter how big the event queue gets - there is no back pressure on the server. If the event queue runs quickly this is not a problem and we can probably live with this for a little while if the only issue is jank (given network timescales for backpressure are pretty big compared to cpu time scales).

however quantum dom considers pausing (or severely degrading) run queues for backgrounded tabs/etc. This means the queue can grow unbounded without creating backpressure. In the case of a large download or something that will massively increase buffering requirements.

we can fix this with some kind of window/ack across the ipdl channel and it can certainly be fixed before multiple quantum run queues.
Whiteboard: [necko-next][necko-quantum]
This will affect fetch body stream as well.  I'd like to be able to let js code stop reading the body and apply back pressure back to the necko socket.
so if the js calls channel suspend that should work fine for the channel in most cases

This bug is more about the lack of backpressure from a stalled event queue (particularly in a multi queue model). Suspending the channel is a possibility in that case too, but its a lot harder imagining how the scheduler knows which channel to suspend :)
tracking-e10s: --- → -
Plan for a patch would be:
- on the child process keep track of size of the data pending to be processed on the HttpChannelChild event target (main thread or the retarget thread)
- on reaching a certain amount suspend the parent channel
- on dropping to a certain (smaller :)) amount resume the channel again
- don't forget to resume it when the channel is canceled from the child side
- send the suspend/resume with hi-prio

Notes: size of the pipe on the parent process between a connection and a transaction is 32k x 24 = 0.75MB.
Duplicate of this bug: 1360591
gary is this something you're willing to look at related to https://bugzilla.mozilla.org/show_bug.cgi?id=1355782
Flags: needinfo?(xeonchen)
(In reply to Patrick McManus [:mcmanus] from comment #5)
> gary is this something you're willing to look at related to
> https://bugzilla.mozilla.org/show_bug.cgi?id=1355782

I'm not very familiar with this part right now, but I'm happy to take this :)
Assignee: nobody → xeonchen
Flags: needinfo?(xeonchen)
Depends on: 1367861
Honza, I guess this bug needs to be put on hold until bug 1367861 is resolved?
Flags: needinfo?(honzab.moz)
(In reply to Gary Chen [:xeonchen] (needinfo plz) from comment #7)
> Honza, I guess this bug needs to be put on hold until bug 1367861 is
> resolved?

Yes, it's directly dependent on it.
Flags: needinfo?(honzab.moz)
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P2
Moving to P3 as this bug depends on bug 1367861 which is P3.
Priority: P2 → P3
Moving back to P2, as bug 1367861 is now prioritized.
Priority: P3 → P2
:wiwang will work on it.
Assignee: xeonchen → wiwang
Assignee: wiwang → nobody
Assignee: nobody → honzab.moz
Jason, since I can't easily find the junior's bz account, assigning to you tentatively.  Please reassign for me, thanks.
Assignee: honzab.moz → jduell.mcbugs
Honza, FYI, Junior's account will be re-enabled very soon :) (should be around 7/16)
Assignee: jduell.mcbugs → juhsu
Notes from our design meeting:

1) We'll do proper flow control here, though a fairly simple version.  We'll keep a (pref-able) IPC buffer size (let's do it in Megabytes, not bytes?  I don't want to allow a footgun where users set 100 bytes as their buffer size, etc).  Let's call it "neckoIpcWindowSize".  We talked about using 1 MB in our chat--not clear when the right number is--and will probably vary on desktop/mobile--but if anything 1 MB may be on the small size on desktop?  I could imagine 10 MB even.

2) The parent will keep a "mSendWindowSize" variable per channel (or transaction?  Sounds like we may do it on the channel for now, but when we move to a separate necko process, we'll move this logic to the transaction--don't worry about that for now).  It will start with the value of the "neckoIpcWindowSize" pref, but for every OnDataAvailable it sends, subtract the size of the msg from sendWindowSize.  If SendWindowSize is <= 0, instead of sending the msg, Suspend() the nsHttpChannel on the parent (which will eventually result in TCP flow control becoming engaged if it's needed, which is the point of this bug).  I think we will also need to save the msg content when this happens (could save it into a class member variable, since we'll never need to save more than 1 at a time IIUC) so we send it properly when we resume.

3) The child will send a new IPDL message ("BytesRead" or something) every time it has read >= 1/4th of neckoIpcWindowSize  (1/4 is a magic number--we could pref that too? Not sure we need to, how could it be wrong? :)  When the parent gets that msg, it will increment mSendWindowSize, i.e. it can now send more data.   If mSendWindowSize was <=0, we'll also need to dispatch the saved msg buffer if we have one, and also Resume() the nsHttpChannel.  Note: most channels load less than 250KB, so most channels won't ever send these BytesReads messages, i.e. the overhead here is small and tolerable.

4) We've decided to disable this algorithm if the response is from the HTTP cache.  We only store 50 MB max in the cache, which is unlikely to trigger an OOM.  This bug is more about preventing huge network loads from OOM'ing the child, and we really don't want to suspend loads from the disk, as they may be slow to restart.

5) Telemetry.  For now the only telemetry we need here is a record from each channel during OnStopRequest. If the channel ever hit Suspend() as part of this algorithm, record 1, otherwise 0. We're expecting the '1's to be quite rare--if they're over 1% even we'll probably need to dig in deeper here to see what's going on. Skip telemetry for channels that loaded from cache (and that can include part-cache, part-network replies--those are rare and won't change the results significantly).

6) Preffable off:  Make sure we have an easy way to disable this algorithm in case we ship something bad :)   Setting the neckoIpcWindowSize=0 could be a fine way to disable, or we could have a separate pref.  Implementor's choice! :)
Oh, and I forgot to mention: if it makes it easier to code, it's fine to "overflow the window" a little, i.e. if we have an OnDataAvail with 15 KB but only 10 KB of window left, we have several options:

1) We could suspend the channel and store the 15 KB till we receive an Ack from the child

2) We could send 10 KB and store 5 KB for sending later

3) We could just send the whole 15KB, even though we're 5 KB over the window limit.

Any of these are OK.  I thought #3 might make code simpler (don't need to store messages), but I think we'll have to store messages sometimes no matter what (if we run out of window and the pump calls OnDataAvail on the parent I think we'll have to store those bytes till Resume() time).
From my point of view, I would outline the implementation like this (a bit more simply:)


IMO, the best place to have the 'sending' logic on the parent is:
HttpBackgroundChannelParent::OnTransportAndData.  

The class will keep the counter - int32_t mSendWindowSize, initially 1MB (the preferred limit).  After you call SendOnTransportAndData() you simply do mSendWindowSize -= aCount;  This may go to negative numbers, it's perfectly fine.

We are perfectly fine to Suspend() the channel AFTER we have sent (all!) the data and are now over the limit, i.e. mSendWindowSize <= 0.  When it's coded this way we don't need to keep any data to send next time we are resumed and keep the logic very simple.

Note that in case Suspend() (you have to check!) is posting to some other thread, there may be a race and OnDataAvailable() may still be called (once, twice) after calling Suspend().  We are fine to send the data (don't be strict about the limit!), but we are not fine to call Suspend() again.  Suspend() must be called only when mSendWindowSize has changed from >0 to <=0!

The "ack" message will simply do mSendWindowSize += aAcknowledged.  If the counter was <=0 before and now is >0, Resume() the channel, otherwise do nothing.  No need to do anything else.  

Make sure this all happens on only one thread.  Otherwise synchronization will be needed, but I think this is all just single thread (usually the socket thread).  Add assertions to check it.


The best place for the 'receive' logic is IMO:
HttpBackgroundChannelChild::RecvOnTransportAndData.  

It will keep a simple counter of bytes it has received, uint32_t, initially 0.  When it's over the 1/4 (or whatever we decide) of the overall limit, send the "ack" message to the parent with the value of the counter.  Then reset the counter back to 0.  That's it.
Also remember the main goal here: push back on the server!  Strict saving of memory on the child is not needed.
(In reply to Jason Duell [:jduell] (needinfo me) from comment #16)
> Oh, and I forgot to mention: if it makes it easier to code, it's fine to
> "overflow the window" a little, i.e. if we have an OnDataAvail with 15 KB
> but only 10 KB of window left, we have several options:
> 
> 1) We could suspend the channel and store the 15 KB till we receive an Ack
> from the child
> 
> 2) We could send 10 KB and store 5 KB for sending later
> 
> 3) We could just send the whole 15KB, even though we're 5 KB over the window
> limit.
> 
> Any of these are OK.  I thought #3 might make code simpler (don't need to
> store messages), but I think we'll have to store messages sometimes no
> matter what (if we run out of window and the pump calls OnDataAvail on the
> parent I think we'll have to store those bytes till Resume() time).

we should implement version 3. Data will need to be store somewhere, it does not matter if it is httpChannelParent or ipc queue, why make it complicated.
There are 2 suspends on httpChannel: Suspend() and SuspendInternal();. The reason is the DivertToParent logic. I think using SuspendInternal() is not enough, we will need some extra logic here.

HttpChannelParent calls SuspendInternal only in one case: when divertToParent starts. When a diversion to parent starts if httpChannel has already been suspended before the diversion we suspend HttpChannelParent the same amount of time (so that nsHttpChannel and HttpChannelParent have the same mSuspend count). nsHttpChannel will be then suspended one more time with SuspendInternal.

If nsHttpChannel has been suspended  before the diversion because of mSendWindowSize <= 0 we do not want to suspend HttpChannelParent for that suspend (so we will suspend HttpChannelParent (nsHttpChannel::mSuspend - 1) times).
Here's a test:
Request a resource in child process, and blocking the first ODA for 5 seconds to force a back pressure
https://phabricator.services.mozilla.com/D2369
I had a workable patch, based on Comment 17.
We found that the flow control is not triggered if we only blocking the main thread of child process.

The reason is the IPC goes to the background thread.
In the child process side, HttpBackgroundChild only post data to HttpChannelChild (on the target thread)

Therefore, we'd like to move the flow control to HttpChannelParent/Child
because HttpChannelChild will know the how many data are *really* consumed.
For the partial cache, I don't find an easy way to let the child process know all the situation.
In the view of child, the resource is not from the cache, trigger the flow control while needed.
(In reply to Junior Hsu from comment #23)
> For the partial cache, I don't find an easy way to let the child process
> know all the situation.
> In the view of child, the resource is not from the cache, trigger the flow
> control while needed.

WE have decided to treat partial cache responses as if they are from cache. The parent should now about this. The child can send acks for the partial responses as well, it does not matter. It is important that the parent does not stop sending data.
I trace the relationship between suspend/resume and diversion
(a) HttpChannelChild initiates the first nsHttpChannel::Suspend() in DivertToParent(), build the PChannelDiverter.ipdl channel
(b) ChannelDiverterParent::Init will nsHttpChannel::SuspendInternal() again.
(c) Resume() for (a) happens in HttpChannelChild::RecvDivertMessages(), indicating the parent get the diverter via DivertTo.
(d) ResumeInternal() for (b) happens in HttpChannelParent::DivertComplete(), indicating the ODA/onstop in child is flushed complete.
(e) I'm not sure if it's nsHttpChannel::MessageDiversionStop(), but PHttpChannel channel should be destroyed finally. No more e10s this channel.

Kinda unrelated to this bug, Bug 1201170 facilitates the suspend/resume during diversion.

It seems that no ODA/onstop goes to HttpChannelParent after (b)
https://searchfox.org/mozilla-central/rev/d0a0ae30d534dc55e712bd3eedc1b4553ba56323/netwerk/protocol/http/HttpChannelParent.cpp#2004

Therefore,
if Suspend() is called by mSendWindowSize <=0 before (b),
all the ODA-s on the way to child should arrived before (d) HttpChannelParent::DivertComplete().
That is, we can get the corresponding Resume() before the PHttpChannel is destroyed.
No need to change for the diversion case.

Does that make sense, Dragana?
Flags: needinfo?(dd.mozilla)
Have trouble in Phabricator.

Upload a WIP first
https://phabricator.services.mozilla.com/differential/diff/6547/

TODO:
telemetry
Split this patch for data review
(In reply to Junior Hsu from comment #25)
> I trace the relationship between suspend/resume and diversion
> (a) HttpChannelChild initiates the first nsHttpChannel::Suspend() in
> DivertToParent(), build the PChannelDiverter.ipdl channel
> (b) ChannelDiverterParent::Init will nsHttpChannel::SuspendInternal() again.
> (c) Resume() for (a) happens in HttpChannelChild::RecvDivertMessages(),
> indicating the parent get the diverter via DivertTo.
> (d) ResumeInternal() for (b) happens in HttpChannelParent::DivertComplete(),
> indicating the ODA/onstop in child is flushed complete.
> (e) I'm not sure if it's nsHttpChannel::MessageDiversionStop(), but
> PHttpChannel channel should be destroyed finally. No more e10s this channel.
> 
> Kinda unrelated to this bug, Bug 1201170 facilitates the suspend/resume
> during diversion.
> 
> It seems that no ODA/onstop goes to HttpChannelParent after (b)
> https://searchfox.org/mozilla-central/rev/
> d0a0ae30d534dc55e712bd3eedc1b4553ba56323/netwerk/protocol/http/
> HttpChannelParent.cpp#2004
> 
> Therefore,
> if Suspend() is called by mSendWindowSize <=0 before (b),
> all the ODA-s on the way to child should arrived before (d)
> HttpChannelParent::DivertComplete().
> That is, we can get the corresponding Resume() before the PHttpChannel is
> destroyed.
> No need to change for the diversion case.
> 
> Does that make sense, Dragana?

Yes. You do not really need to change anything for diversion case.
Diversion case can be optimized a bit. We can do that in a separate bug.

During diversion we need to suspend nsHttpChannel but we do not need to suspend HttpChannelParent, it can give data further to the consumer. For diversion the nsHttpChannel is suspended 2 times in (a) and (b). We do not need both. HttpChannelParent is suspended only once with (a). We could remove (a). If we remove (a) we could optimize back- pressure as well.
Flags: needinfo?(dd.mozilla)
Comment on attachment 8997607 [details]
Bug 1280629 - Part 2: telemetry of e10 back pressure suspension rate

Dragana Damjanovic [:dragana] has approved the revision.
Attachment #8997607 - Flags: review+
Comment on attachment 8997607 [details]
Bug 1280629 - Part 2: telemetry of e10 back pressure suspension rate


# Request for data collection review form

**All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.**

1) What questions will you answer with this data?

To know the suspension rate by collecting whether the flow control triggers the suspension (true) or not (false)

2) Why does Mozilla need to answer these questions?  Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:

If the suspension rate is too high (>1%), we need to adjust the flow control algorithm.

3) What alternative methods did you consider to answer these questions? Why were they not sufficient?

I don't know any alternative approach to collect these data.

4) Can current instrumentation answer these questions?

No.

5) List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox [data collection categories](https://wiki.mozilla.org/Firefox/Data_Collection) on the Mozilla wiki.   

**Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.**

<table>
  <tr>
    <td>NETWORK_BACK_PRESSURE_SUSPENSION_RATE</td>
    <td>Technical data</td>
    <td>1280629</td>
  </tr>
</table>


6) How long will this data be collected?  Choose one of the following:

To 68

7) What populations will you measure?

Pre-release, all countries, all locale

8) If this data collection is default on, what is the opt-out mechanism for users?

No

9) Please provide a general description of how you will analyze this data.

Simply calculate the suspension rate

10) Where do you intend to share the results of your analysis?

Internal use
Attachment #8997607 - Flags: review?(francois)
Comment on attachment 8997607 [details]
Bug 1280629 - Part 2: telemetry of e10 back pressure suspension rate

1) Is there or will there be **documentation** that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes, in Histograms.json.

2) Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, telemetry setting.

3) If the request is for permanent data collection, is there someone who will monitor the data over time?**

Not permanent.

4) Using the **[category system of data types](https://wiki.mozilla.org/Firefox/Data_Collection)** on the Mozilla wiki, what collection type of data do the requested measurements fall under?  **

Category 1

5) Is the data collection request for default-on or default-off?

Default ON only in pre-release channels.

6) Does the instrumentation include the addition of **any *new* identifiers** (whether anonymous or otherwise; e.g., username, random IDs, etc.  See the appendix for more details)?

No.

7) Is the data collection covered by the existing Firefox privacy notice?

Yes.

8) Does there need to be a check-in in the future to determine whether to renew the data?

No, telemetry alerts are fine.
Attachment #8997607 - Flags: review?(francois) → review+
BTW, for next time, it's easier if you attach your responses to the data review questions in a .txt. Then you can r? me just on that and it becomes clear that I'm only doing the data review, not the code review.
Comment on attachment 8997607 [details]
Bug 1280629 - Part 2: telemetry of e10 back pressure suspension rate

François Marier [:francois] has been removed from the revision.
Attachment #8997607 - Flags: review+
Comment on attachment 8997602 [details]
Bug 1280629 - Part 1: Suspend the http channel if the child process is not able to consume on time

Dragana Damjanovic [:dragana] has approved the revision.
Attachment #8997602 - Flags: review+
Keywords: checkin-needed
Blocks: 1483391
No longer blocks: 1483391
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e654f2b128f6
Part 1: Suspend the http channel if the child process is not able to consume on time. r=dragana
https://hg.mozilla.org/integration/mozilla-inbound/rev/64648fd6ef5e
Part 2: telemetry of e10 back pressure suspension rate. r=dragana
Keywords: checkin-needed
Please make sure we don't regress the work done in bug 1240547 with the newly added members to HttpChannelChild.  Thanks.
Flags: needinfo?(juhsu)
(In reply to Honza Bambas (:mayhemer) from comment #41)
> Please make sure we don't regress the work done in bug 1240547 with the
> newly added members to HttpChannelChild.  Thanks.

Thanks for notice.
Will commit a fixed patch on HttpChannelParent/Child right after patches here are landed to central or backout.
Flags: needinfo?(juhsu)
In wpt, there are 3-4 files between 1MB and 4MB.
I'll disable bp flow control in those wpt tests.

I'll try to find why the wpt doesn't honor the resume.
Flags: needinfo?(juhsu)
Keywords: checkin-needed
Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/45e540e5e0fc
Part 1: Suspend the http channel if the child process is not able to consume on time r=dragana
Keywords: checkin-needed
Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ec225468adb7
Part 2: telemetry of e10 back pressure suspension rate r=dragana
https://hg.mozilla.org/mozilla-central/rev/45e540e5e0fc
https://hg.mozilla.org/mozilla-central/rev/ec225468adb7
Status: NEW → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Blocks: 1487559
We have >4% suspension rate in nightly.
It would be fine if it's caused by local server testing, but we don't know.
We'll monitor if the rate goes higher or lower in beta.

In the meantime, file Bug 1487559 to do more investigation of the mime type distribution.
Blocks: 1494133
Blocks: 1498434
Depends on: 1513135
Depends on: 1514065

(In reply to Junior Hsu from comment #44)

In wpt, there are 3-4 files between 1MB and 4MB.
I'll disable bp flow control in those wpt tests.

I'll try to find why the wpt doesn't honor the resume.

Have it done? Why were test disabled to wallpaper the issue instead of inspecting the root cause? It would have caught bug 1513135 earlier.

Depends on: 1524154

(In reply to Masatoshi Kimura [:emk] from comment #50)

(In reply to Junior Hsu from comment #44)

In wpt, there are 3-4 files between 1MB and 4MB.
I'll disable bp flow control in those wpt tests.

I'll try to find why the wpt doesn't honor the resume.

Have it done? Why were test disabled to wallpaper the issue instead of inspecting the root cause? It would have caught bug 1513135 earlier.

Here's some note I made.
https://phabricator.services.mozilla.com/D2745#70783

What I can tell is I believe the handling in necko side is correct in those test case.
It's different from bug 1513135, which only happens on channel diversion.
If it's the same, testing/web-platform/meta/*.ini should be tangled with the pref.

Regressions: 1539766
You need to log in before you can comment on or make changes to this bug.