Closed
Bug 1263028
Opened 9 years ago
Closed 9 years ago
PHttpChannel::OnTransportAndData messages are often very large
Categories
(Core :: Networking: HTTP, defect, P1)
Core
Networking: HTTP
Tracking
()
RESOLVED
FIXED
mozilla48
People
(Reporter: mccr8, Assigned: mayhemer)
References
Details
(Whiteboard: [necko-active])
Attachments
(1 file, 2 obsolete files)
2.69 KB,
patch
|
michal
:
review+
ritu
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta-
|
Details | Diff | Splinter Review |
Looking at IPC_MESSAGE_SIZE telemetry for the last few days, I see about 150k messages for OnTransportAndData that are of size 844kb or larger. This is only a very tiny portion of these messages (around 0.02%) but because these messages are very common, this is probably the second-largest source of large IPC messages, behind the message manager (though it is something like 1/7th as common).
Large IPC messages require contiguous address space, which is likely contributing to our high rate of OOM crashes on the e10s beta experiment. While ideally the IPC layer would be smarter about how it represents these blocks of memory, it would be nice if there were some short-term fixes to reduce the size of the messages.
Reporter | ||
Updated•9 years ago
|
Assignee | ||
Comment 1•9 years ago
|
||
Same question as for PStorage - send in chunks?
Reporter | ||
Comment 2•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #1)
> Same question as for PStorage - send in chunks?
Yes, that would help. I guess the basic question is how difficult it is to fix for these two particular users versus doing it at the IPC layer for everybody (bug 1262671). I wouldn't expect the latter is much harder, so maybe this and the PStorage bug aren't worth fixing.
Assignee | ||
Comment 3•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #2)
> (In reply to Honza Bambas (:mayhemer) from comment #1)
> > Same question as for PStorage - send in chunks?
>
> Yes, that would help. I guess the basic question is how difficult it is to
> fix for these two particular users versus doing it at the IPC layer for
> everybody (bug 1262671). I wouldn't expect the latter is much harder, so
> maybe this and the PStorage bug aren't worth fixing.
It's pretty easy for HTTP to do it in HTTP directly. PStorage is harder...
Assignee | ||
Comment 4•9 years ago
|
||
Andrew, before we enhance IPC to do this, should I fix this?
Whiteboard: [necko-would-take]
Reporter | ||
Comment 5•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #4)
> Andrew, before we enhance IPC to do this, should I fix this?
If it is pretty easy, that would be good. You could also wait a week and see how things stand with the IPC work.
Well, we're always going to have to allocate that big nsString no matter how much we fix the IPC backend code. So it would be good to fix this regardless if it's not too hard.
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Whiteboard: [necko-would-take] → [necko-active]
Assignee | ||
Comment 7•9 years ago
|
||
Attachment #8740419 -
Flags: review?(michal.novotny)
Assignee | ||
Comment 8•9 years ago
|
||
Comment 9•9 years ago
|
||
Comment on attachment 8740419 [details] [diff] [review]
v1
Review of attachment 8740419 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +1146,5 @@
> + while (aCount) {
> + uint32_t toRead = std::min<uint32_t>(aCount, 256 * 1024);
> +
> + nsCString data;
> + nsresult rv = NS_ReadInputStreamToString(aInputStream, data, toRead);
It's too bad that we can't just read out of the stream directly into the buffer of the message we're about to send here, and avoid a memory copy. :(
Comment 10•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #9)
> ::: netwerk/protocol/http/HttpChannelParent.cpp
> @@ +1146,5 @@
> > + while (aCount) {
> > + uint32_t toRead = std::min<uint32_t>(aCount, 256 * 1024);
> > +
> > + nsCString data;
> > + nsresult rv = NS_ReadInputStreamToString(aInputStream, data, toRead);
Also, is it reasonable to pull |data| outside the loop and play tricks with SetCapacity and SetLength, so we're ideally only doing a single allocation, rather than several, if we have to go through the loop several times?
Assignee | ||
Comment 11•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6cd0ac101ff1
Let me think about it, the stream may actually be buffered.
Assignee | ||
Comment 12•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #9)
> It's too bad that we can't just read out of the stream directly into the
> buffer of the message we're about to send here, and avoid a memory copy. :(
It is, and it's probably inevitable. We need to pass nsCString with copy of the data because the IPC message is processed asynchronously. We cannot at all rely on the buffer given in ReadSegments... So, I'm afraid we can't do this, unless we come up with something smarter (shared memory or buffer that can be referenced?)
(In reply to Nathan Froyd [:froydnj] from comment #10)
> Also, is it reasonable to pull |data| outside the loop and play tricks with
> SetCapacity and SetLength, so we're ideally only doing a single allocation,
> rather than several, if we have to go through the loop several times?
Yes, that is true. Not sure why I did it the way I did.
Assignee | ||
Comment 13•9 years ago
|
||
as v1 just single allocation.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=82337c260df0
Attachment #8740419 -
Attachment is obsolete: true
Attachment #8740420 -
Attachment is obsolete: true
Attachment #8740419 -
Flags: review?(michal.novotny)
Attachment #8740479 -
Flags: review?(michal.novotny)
Comment 14•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #12)
> (In reply to Nathan Froyd [:froydnj] from comment #9)
> > It's too bad that we can't just read out of the stream directly into the
> > buffer of the message we're about to send here, and avoid a memory copy. :(
>
> It is, and it's probably inevitable. We need to pass nsCString with copy of
> the data because the IPC message is processed asynchronously. We cannot at
> all rely on the buffer given in ReadSegments... So, I'm afraid we can't do
> this, unless we come up with something smarter (shared memory or buffer that
> can be referenced?)
Oh, I wasn't talking about streaming the data out of the stream directly into what gets sent across the network. The current scheme we have is:
have a nsIInputStream
read from nsIInputStream into nsCString
read from nsCString into a Message (PHttpChannelParent::SendOnTransportAndData)
...something happens to the Message after that...
The Message might get sent across the wire in individual pieces, and that's OK, because it has allocated memory distinct from the nsIInputStream or the nsCString. My proposal was:
have a nsIInputStream
read from nsIInputStream into a Message
...something happens to the Message after that...
No need for the intermediate nsCString. I don't think it would actually be that hard to do, just introduce a type whose ParamTraits specialization allocated the appropriate buffer in the Message and read the data out of the associated stream, changing the definition in PHttpChannel appropriately. Though I guess you do want a string on the child side, rather than a stream? That might be tricky...
Assignee | ||
Comment 15•9 years ago
|
||
Got it. So it's not that simple anymore.
Assignee: honzab.moz → nobody
Status: ASSIGNED → NEW
Assignee | ||
Updated•9 years ago
|
Whiteboard: [necko-active] → [necko-would-take]
Assignee | ||
Comment 16•9 years ago
|
||
Comment on attachment 8740479 [details] [diff] [review]
v2
For now this would be better done in IPC directly than trying to implement Nathan's suggestions from comment 14.
I personally would like to start using shared buffers (single process) to avoid memory copying even more. It would eliminate even copy from stream's buffer to the message. But now it's more just a vision.
Attachment #8740479 -
Flags: review?(michal.novotny)
Reporter | ||
Comment 17•9 years ago
|
||
Oddly enough, I found bug 1110596 in my awesomebar, which is about avoiding the copy, and has some data about how much is being copied in a asm.js benchmark.
Reporter | ||
Comment 18•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #15)
> Got it. So it's not that simple anymore.
Is there something wrong with this patch as-is? The extra copy Nathan brought up seems like an issue that is orthogonal to the size of the message. I wouldn't worry about solving it in this bug, and anyways your patch seems to mostly eliminate at least the memory impact of the intermediate copy. If you don't have time to finish this patch, I could work on it.
Flags: needinfo?(honzab.moz)
Assignee | ||
Comment 19•9 years ago
|
||
Andrew, the patch (v2) is mostly finished. I tho can see that the try run suffers from few untracked issues, mainly image comparison failures and mainly the OS X 10.10 debug M(1) failure that may show its cause (a wild guess, tho).
Those failures should be looked at first.
From the comments it sounds like the goal of saving memory is not fully achieved with this patch.
Flags: needinfo?(honzab.moz)
Reporter | ||
Comment 20•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #19)
> From the comments it sounds like the goal of saving memory is not fully
> achieved with this patch.
There are always further improvements to memory usage to be made, but from my perspective of trying to eliminate very large contiguous allocations in e10s, your patch would solve the problem entirely for OnTransportAndData.
Reporter | ||
Comment 21•9 years ago
|
||
I'm nominating this for e10s tracking because I think this is one of the bigger simple wins we can have to reduce the size of IPC messages. The nice thing about it is that it is capping the size of a frequent message that gets copied in at least four layers: sender string, sender message, receiver message, receiver string.
tracking-e10s:
--- → ?
Comment 22•9 years ago
|
||
Honza, can we assign this to you and can you drive it home? It would be great if we can get a fix in and uplifted to 47.
Flags: needinfo?(honzab.moz)
Assignee | ||
Comment 23•9 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #22)
> Honza, can we assign this to you and can you drive it home? It would be
> great if we can get a fix in and uplifted to 47.
OK!
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Flags: needinfo?(honzab.moz)
Assignee | ||
Updated•9 years ago
|
Attachment #8740479 -
Flags: review?(michal.novotny)
Assignee | ||
Updated•9 years ago
|
Whiteboard: [necko-would-take] → [necko-active]
Assignee | ||
Comment 24•9 years ago
|
||
Looks like all of the oranges on related try run are tracked intermittents, so after an r+, we can land!
Comment 25•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #23)
> (In reply to Jim Mathies [:jimm] from comment #22)
> > Honza, can we assign this to you and can you drive it home? It would be
> > great if we can get a fix in and uplifted to 47.
>
> OK!
Thanks
Updated•9 years ago
|
Priority: -- → P1
Updated•9 years ago
|
Attachment #8740479 -
Flags: review?(michal.novotny) → review+
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 26•9 years ago
|
||
Keywords: checkin-needed
Comment 27•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Comment 28•9 years ago
|
||
Honza, should we uplift this fix to Beta 47 in preparation for our next e10s beta experiment?
Comment 29•9 years ago
|
||
Wrong Honza. ;)
Honza, comment 28?
Flags: needinfo?(odvarko) → needinfo?(honzab.moz)
Reporter | ||
Updated•9 years ago
|
Assignee | ||
Comment 30•9 years ago
|
||
(In reply to Chris Peterson [:cpeterson] from comment #28)
> Honza, should we uplift this fix to Beta 47 in preparation for our next e10s
> beta experiment?
I think it's fairly safe.
Flags: needinfo?(honzab.moz)
Assignee | ||
Comment 31•9 years ago
|
||
Comment on attachment 8740479 [details] [diff] [review]
v2
Approval Request Comment
[Feature/regressing bug #]: e10s
[User impact if declined]: OOMs
[Describe test coverage new/current, TreeHerder]: this is a widely used code hugely covered
[Risks and why]: very very low
[String/UUID change made/needed]: none
Attachment #8740479 -
Flags: approval-mozilla-beta?
Attachment #8740479 -
Flags: approval-mozilla-aurora?
Reporter | ||
Comment 32•9 years ago
|
||
It is a little early to check telemetry, but looking at the data from 4/16, this is working as expected:
http://mzl.la/1SO4zFz
The largest bucket that has any entries is the 123.03kb to 169.62kb one.
Compare this to the data from 4/15, where 0.02% or more of messages are in the bucket around 1mb or larger, and 0.6% of messages are in buckets larger than 162.62kb:
http://mzl.la/1WBkaOL
(As I said above, the percentages here are small, but the absolute number of messages is quite large.)
Comment 33•9 years ago
|
||
Comment on attachment 8740479 [details] [diff] [review]
v2
We aren't shipping e10s in 46 so we shouldn't need this uplift.
Attachment #8740479 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Assignee | ||
Comment 34•9 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #33)
> Comment on attachment 8740479 [details] [diff] [review]
> v2
>
> We aren't shipping e10s in 46 so we shouldn't need this uplift.
Please double check with comment 28.
Flags: needinfo?(lhenry)
Reporter | ||
Comment 35•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #34)
> Please double check with comment 28.
Chris really meant that this should get landed on Aurora 47, so that when it gets turned into Beta 47 it will have the fix. We don't need this on Beta 46.
Flags: needinfo?(lhenry)
Assignee | ||
Comment 36•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #35)
> (In reply to Honza Bambas (:mayhemer) from comment #34)
> > Please double check with comment 28.
>
> Chris really meant that this should get landed on Aurora 47, so that when it
> gets turned into Beta 47 it will have the fix. We don't need this on Beta 46.
Thanks for clarifying.
Comment on attachment 8740479 [details] [diff] [review]
v2
Andrew's verification on Nightly looks good, this helps e10s OOMs crash investigation/situation, Aurora47+
Attachment #8740479 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 38•9 years ago
|
||
bugherder uplift |
You need to log in
before you can comment on or make changes to this bug.
Description
•