PHttpChannel::OnTransportAndData messages are often very large

RESOLVED FIXED in Firefox 47

Status


defect
P1
normal
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: mccr8, Assigned: mayhemer)

Tracking

(Blocks 1 bug)

Trunk
mozilla48
Points:
---

Firefox Tracking Flags

(e10s+, firefox46 wontfix, firefox47 fixed, firefox48 fixed)

Details

(Whiteboard: [necko-active])

Attachments

(1 attachment, 2 obsolete attachments)

Looking at IPC_MESSAGE_SIZE telemetry for the last few days, I see about 150k messages for OnTransportAndData that are of size 844kb or larger. This is only a tiny fraction of these messages (around 0.02%), but because these messages are so common, this is probably the second-largest source of large IPC messages, behind the message manager (though it is something like 1/7th as common).

Large IPC messages require contiguous address space, which is likely contributing to our high rate of OOM crashes on the e10s beta experiment. While ideally the IPC layer would be smarter about how it represents these blocks of memory, it would be nice if there were some short-term fixes to reduce the size of the messages.
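For context, the chunking approach discussed in the comments below can be sketched roughly like this. This is an illustrative stand-alone sketch, not the actual patch: std::string stands in for the IPC payload type, and the 256 KiB cap mirrors the limit used in the later patches.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split one large payload into bounded chunks so that no single IPC
// message ever needs a huge contiguous buffer. Each chunk would then be
// sent as its own OnTransportAndData message.
std::vector<std::string> ChunkPayload(const std::string& payload,
                                      size_t maxChunk = 256 * 1024) {
  std::vector<std::string> chunks;
  for (size_t offset = 0; offset < payload.size(); offset += maxChunk) {
    chunks.push_back(payload.substr(offset, maxChunk));
  }
  return chunks;
}
```

With a 256 KiB cap, a 1 MB payload becomes four messages, each small enough to avoid the large contiguous allocations that contribute to OOMs.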
Blocks: 1262918
No longer blocks: 1262661
Same question as for PStorage - send in chunks?
(In reply to Honza Bambas (:mayhemer) from comment #1)
> Same question as for PStorage - send in chunks?

Yes, that would help. I guess the basic question is how difficult it is to fix for these two particular users versus doing it at the IPC layer for everybody (bug 1262671). I wouldn't expect the latter is much harder, so maybe this and the PStorage bug aren't worth fixing.
(In reply to Andrew McCreight [:mccr8] from comment #2)
> (In reply to Honza Bambas (:mayhemer) from comment #1)
> > Same question as for PStorage - send in chunks?
> 
> Yes, that would help. I guess the basic question is how difficult it is to
> fix for these two particular users versus doing it at the IPC layer for
> everybody (bug 1262671). I wouldn't expect the latter is much harder, so
> maybe this and the PStorage bug aren't worth fixing.

It's pretty easy for HTTP to do it in HTTP directly.  PStorage is harder...
Andrew, before we enhance IPC to do this, should I fix this?
Whiteboard: [necko-would-take]
(In reply to Honza Bambas (:mayhemer) from comment #4)
> Andrew, before we enhance IPC to do this, should I fix this?

If it is pretty easy, that would be good. You could also wait a week and see how things stand with the IPC work.
Well, we're always going to have to allocate that big nsString no matter how much we fix the IPC backend code. So it would be good to fix this regardless, if it's not too hard.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Whiteboard: [necko-would-take] → [necko-active]
Posted patch v1 (obsolete) — Splinter Review
Attachment #8740419 - Flags: review?(michal.novotny)
Posted patch v1 -w (obsolete) — Splinter Review
Comment on attachment 8740419 [details] [diff] [review]
v1

Review of attachment 8740419 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +1146,5 @@
> +  while (aCount) {
> +    uint32_t toRead = std::min<uint32_t>(aCount, 256 * 1024);
> +
> +    nsCString data;
> +    nsresult rv = NS_ReadInputStreamToString(aInputStream, data, toRead);

It's too bad that we can't just read out of the stream directly into the buffer of the message we're about to send here, and avoid a memory copy. :(
(In reply to Nathan Froyd [:froydnj] from comment #9)
> ::: netwerk/protocol/http/HttpChannelParent.cpp
> @@ +1146,5 @@
> > +  while (aCount) {
> > +    uint32_t toRead = std::min<uint32_t>(aCount, 256 * 1024);
> > +
> > +    nsCString data;
> > +    nsresult rv = NS_ReadInputStreamToString(aInputStream, data, toRead);

Also, is it reasonable to pull |data| outside the loop and play tricks with SetCapacity and SetLength, so we're ideally only doing a single allocation, rather than several, if we have to go through the loop several times?
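Comment 10's single-allocation idea can be sketched like this. This is a hedged stand-alone analogy, not the actual patch: std::string stands in for nsCString (reserve ≈ SetCapacity, resize ≈ SetLength), and `ReadSome` is a hypothetical stand-in for NS_ReadInputStreamToString.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for NS_ReadInputStreamToString: copies up to
// `max` bytes from the source cursor into `dst`.
size_t ReadSome(const char*& src, size_t remaining, char* dst, size_t max) {
  size_t n = std::min(remaining, max);
  memcpy(dst, src, n);
  src += n;
  return n;
}

// Hoist the chunk buffer out of the loop: one up-front allocation is
// reused for every chunk, instead of constructing a fresh string per
// iteration. Returns the number of messages that would be sent.
size_t SendInChunks(const std::string& input, size_t maxChunk) {
  const char* cursor = input.data();
  size_t remaining = input.size();
  size_t messagesSent = 0;

  std::string data;
  data.reserve(maxChunk);  // single allocation up front

  while (remaining) {
    size_t toRead = std::min(remaining, maxChunk);
    data.resize(toRead);   // reuses existing capacity, no reallocation
    size_t got = ReadSome(cursor, remaining, &data[0], toRead);
    remaining -= got;
    // ...SendOnTransportAndData(data) would go here...
    ++messagesSent;
  }
  return messagesSent;
}
```

Because the capacity is reserved once before the loop, the later resize calls never reallocate, which is exactly the SetCapacity/SetLength trick suggested above.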
See Also: → 1263916
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6cd0ac101ff1


Let me think about it; the stream may actually be buffered.
(In reply to Nathan Froyd [:froydnj] from comment #9)
> It's too bad that we can't just read out of the stream directly into the
> buffer of the message we're about to send here, and avoid a memory copy. :(

It is, and it's probably inevitable.  We need to pass nsCString with copy of the data because the IPC message is processed asynchronously.  We cannot at all rely on the buffer given in ReadSegments...  So, I'm afraid we can't do this, unless we come up with something smarter (shared memory or buffer that can be referenced?)

(In reply to Nathan Froyd [:froydnj] from comment #10)
> Also, is it reasonable to pull |data| outside the loop and play tricks with
> SetCapacity and SetLength, so we're ideally only doing a single allocation,
> rather than several, if we have to go through the loop several times?

Yes, that is true.  Not sure why I did it the way I did.
Posted patch v2Splinter Review
Same as v1, but with a single allocation.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=82337c260df0
Attachment #8740419 - Attachment is obsolete: true
Attachment #8740420 - Attachment is obsolete: true
Attachment #8740419 - Flags: review?(michal.novotny)
Attachment #8740479 - Flags: review?(michal.novotny)
(In reply to Honza Bambas (:mayhemer) from comment #12)
> (In reply to Nathan Froyd [:froydnj] from comment #9)
> > It's too bad that we can't just read out of the stream directly into the
> > buffer of the message we're about to send here, and avoid a memory copy. :(
> 
> It is, and it's probably inevitable.  We need to pass nsCString with copy of
> the data because the IPC message is processed asynchronously.  We cannot at
> all rely on the buffer given in ReadSegments...  So, I'm afraid we can't do
> this, unless we come up with something smarter (shared memory or buffer that
> can be referenced?)

Oh, I wasn't talking about streaming the data out of the stream directly into what gets sent across the network.  The current scheme we have is:

  have a nsIInputStream
  read from nsIInputStream into nsCString
  read from nsCString into a Message (PHttpChannelParent::SendOnTransportAndData)
  ...something happens to the Message after that...

The Message might get sent across the wire in individual pieces, and that's OK, because it has allocated memory distinct from the nsIInputStream or the nsCString.  My proposal was:

  have a nsIInputStream
  read from nsIInputStream into a Message
  ...something happens to the Message after that...

No need for the intermediate nsCString.  I don't think it would actually be that hard to do, just introduce a type whose ParamTraits specialization allocated the appropriate buffer in the Message and read the data out of the associated stream, changing the definition in PHttpChannel appropriately.  Though I guess you do want a string on the child side, rather than a stream?  That might be tricky...
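The proposal above can be sketched with stand-in types. This is a rough analogy, not Mozilla's actual ParamTraits machinery: `Message` stands in for the IPC message, `InputStreamParam` for a wrapper around nsIInputStream, and `WriteParam` for what a ParamTraits specialization's Write would do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the IPC message payload buffer.
struct Message {
  std::vector<char> buffer;
};

// Stand-in for a wrapper type around an nsIInputStream plus a byte count.
struct InputStreamParam {
  const char* src;
  size_t count;
};

// Analogue of a ParamTraits<InputStreamParam>::Write specialization:
// grow the message's own buffer once, then copy straight from the
// stream into it, skipping the intermediate nsCString entirely.
void WriteParam(Message& msg, const InputStreamParam& p) {
  size_t old = msg.buffer.size();
  msg.buffer.resize(old + p.count);               // allocate in the message
  memcpy(msg.buffer.data() + old, p.src, p.count); // stream -> message
}
```

The tricky part noted above remains: the child side wants a string, not a stream, so the read path would need a matching specialization as well.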
Got it.  So it's not that simple anymore.
Assignee: honzab.moz → nobody
Status: ASSIGNED → NEW
Whiteboard: [necko-active] → [necko-would-take]
Comment on attachment 8740479 [details] [diff] [review]
v2

For now this would be better done in IPC directly, rather than by trying to implement Nathan's suggestions from comment 14.

I personally would like to start using shared buffers (single process) to avoid memory copying even more. That would eliminate even the copy from the stream's buffer to the message. But for now it's more just a vision.
Attachment #8740479 - Flags: review?(michal.novotny)
Oddly enough, I found bug 1110596 in my awesomebar, which is about avoiding the copy, and has some data about how much is being copied in an asm.js benchmark.
See Also: → 1110596
(In reply to Honza Bambas (:mayhemer) from comment #15)
> Got it.  So it's not that simple anymore.

Is there something wrong with this patch as-is? The extra copy Nathan brought up seems like an issue that is orthogonal to the size of the message. I wouldn't worry about solving it in this bug, and anyways your patch seems to mostly eliminate at least the memory impact of the intermediate copy. If you don't have time to finish this patch, I could work on it.
Flags: needinfo?(honzab.moz)
Andrew, the patch (v2) is mostly finished.  I can see, though, that the try run suffers from a few untracked issues, mainly image comparison failures and the OS X 10.10 debug M(1) failure, which may be caused by this patch (a wild guess, though).

Those failures should be looked at first.

From the comments it sounds like the goal of saving memory is not fully achieved with this patch.
Flags: needinfo?(honzab.moz)
(In reply to Honza Bambas (:mayhemer) from comment #19)
> From the comments it sounds like the goal of saving memory is not fully
> achieved with this patch.

There are always further improvements to memory usage to be made, but from my perspective of trying to eliminate very large contiguous allocations in e10s, your patch would solve the problem entirely for OnTransportAndData.
I'm nominating this for e10s tracking because I think this is one of the bigger simple wins we can have to reduce the size of IPC messages. The nice thing about it is that it is capping the size of a frequent message that gets copied in at least four layers: sender string, sender message, receiver message, receiver string.
tracking-e10s: --- → ?
Honza, can we assign this to you and can you drive it home? It would be great if we can get a fix in and uplifted to 47.
Flags: needinfo?(honzab.moz)
Blocks: 1259183
(In reply to Jim Mathies [:jimm] from comment #22)
> Honza, can we assign this to you and can you drive it home? It would be
> great if we can get a fix in and uplifted to 47.

OK!
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Flags: needinfo?(honzab.moz)
Attachment #8740479 - Flags: review?(michal.novotny)
Whiteboard: [necko-would-take] → [necko-active]
Looks like all of the oranges on the related try run are tracked intermittents, so after an r+, we can land!
(In reply to Honza Bambas (:mayhemer) from comment #23)
> (In reply to Jim Mathies [:jimm] from comment #22)
> > Honza, can we assign this to you and can you drive it home? It would be
> > great if we can get a fix in and uplifted to 47.
> 
> OK!

Thanks
Priority: -- → P1
Blocks: e10s-oom
Attachment #8740479 - Flags: review?(michal.novotny) → review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/f901f3d964ab
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Honza, should we uplift this fix to Beta 47 in preparation for our next e10s beta experiment?
Flags: needinfo?(odvarko)
Wrong Honza. ;)

Honza, comment 28?
Flags: needinfo?(odvarko) → needinfo?(honzab.moz)
(In reply to Chris Peterson [:cpeterson] from comment #28)
> Honza, should we uplift this fix to Beta 47 in preparation for our next e10s
> beta experiment?

I think it's fairly safe.
Flags: needinfo?(honzab.moz)
Comment on attachment 8740479 [details] [diff] [review]
v2

Approval Request Comment
[Feature/regressing bug #]: e10s
[User impact if declined]: OOMs
[Describe test coverage new/current, TreeHerder]: this is widely used code with extensive coverage
[Risks and why]: very very low
[String/UUID change made/needed]: none
Attachment #8740479 - Flags: approval-mozilla-beta?
Attachment #8740479 - Flags: approval-mozilla-aurora?
It is a little early to check telemetry, but looking at the data from 4/16, this is working as expected:
  http://mzl.la/1SO4zFz
The largest bucket that has any entries is the 123.03kb to 169.62kb one.

Compare this to the data from 4/15, where 0.02% or more of messages are in the bucket around 1mb or larger, and 0.6% of messages are in buckets larger than 169.62kb:
  http://mzl.la/1WBkaOL

(As I said above, the percentages here are small, but the absolute number of messages is quite large.)
Comment on attachment 8740479 [details] [diff] [review]
v2

We aren't shipping e10s in 46 so we shouldn't need this uplift.
Attachment #8740479 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #33)
> Comment on attachment 8740479 [details] [diff] [review]
> v2
> 
> We aren't shipping e10s in 46 so we shouldn't need this uplift.

Please double check with comment 28.
Flags: needinfo?(lhenry)
(In reply to Honza Bambas (:mayhemer) from comment #34)
> Please double check with comment 28.

Chris really meant that this should get landed on Aurora 47, so that when it gets turned into Beta 47 it will have the fix. We don't need this on Beta 46.
Flags: needinfo?(lhenry)
(In reply to Andrew McCreight [:mccr8] from comment #35)
> (In reply to Honza Bambas (:mayhemer) from comment #34)
> > Please double check with comment 28.
> 
> Chris really meant that this should get landed on Aurora 47, so that when it
> gets turned into Beta 47 it will have the fix. We don't need this on Beta 46.

Thanks for clarifying.
Comment on attachment 8740479 [details] [diff] [review]
v2

Andrew's verification on Nightly looks good; this helps the e10s OOM crash investigation/situation. Aurora47+
Attachment #8740479 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+