[meta] Reduce IPC message size

NEW

3 years ago
3 months ago

People

(Reporter: mccr8, Assigned: mccr8)

Tracking

(Depends on: 1 bug, {meta})

Trunk

Firefox Tracking Flags

(e10s+, firefox48 affected)

Details

(Whiteboard: btpp-meta)

(Assignee)

Description

3 years ago
Bug 1260908 added MESSAGE_MANAGER_MESSAGE_SIZE and IPC_MESSAGE_SIZE telemetry for tracking the size of IPC messages. Some of these messages are very large, which may be contributing to the high rate of OOM crashes in the e10s beta experiment, through memory fragmentation.

While it would be nice to improve our handling of this at the IPC level, fixing individual IPC users might result in faster improvements in the short term. I don't know whether it makes sense to track both kinds of improvements in a single bug.

Updated

3 years ago
tracking-e10s: --- → +
(Assignee)

Updated

3 years ago
Whiteboard: btpp-meta
(Assignee)

Comment 1

3 years ago
sdk/remote/process/message is fairly common, but it looks like a generic pipe that the Addon SDK uses to send messages, so I'm not sure if it is worth filing a bug for.
(Assignee)

Updated

3 years ago
Depends on: 1263004
(Assignee)

Updated

3 years ago
Depends on: 1263027
(Assignee)

Updated

3 years ago
Depends on: 1263028
(Assignee)

Comment 2

3 years ago
I've now gone through the telemetry and filed bugs for all of the large messages that are at least somewhat common. Honestly, though, the large session store messages (bug 1262661) may outnumber all of the rest combined, so I'm not sure how worthwhile it is to work on the others until that is fixed.
(Assignee)

Comment 3

3 years ago
I'm adding the generic IPC-level improvements to this bug, too, which we may or may not want to do.
Depends on: 1262841, 1253131, 1262671
(Assignee)

Updated

3 years ago
Depends on: 1263235
(Assignee)

Updated

3 years ago
Depends on: 1263953

Updated

3 years ago
Depends on: 1259480
(Assignee)

Updated

3 years ago
Depends on: 1264642
(Assignee)

Updated

3 years ago
Depends on: 1264662
(Assignee)

Updated

3 years ago
Depends on: 1265045
(Assignee)

Updated

3 years ago
Assignee: nobody → continuation
(Assignee)

Updated

3 years ago
Depends on: 1264820
(Assignee)

Updated

3 years ago
Depends on: 1268662
(Assignee)

Updated

3 years ago
Depends on: 1268938
(Assignee)

Comment 4

3 years ago
To get a sense of what messages are problematic, I looked at all crashes in 47 beta with the signature [@ OOM | large | mozalloc_abort | mozalloc_handle_oom | moz_xrealloc | Pickle::WriteBytes ], which is a fairly common content process crash.

Across both processes, there were 265 total.

The breakdown for messages looks like:
- 114 contain PContentChild::SendAsyncMessage
- 46 contain PBrowserChild::SendAsyncMessage
- 14 contain PBackgroundChild::SendPBlobConstructor
- 11 contain HttpChannelChild::ContinueAsyncOpen
- 8 contain PBackgroundIDBRequestParent::Send__delete__
- 8 contain ContentParent::DoSendAsyncMessage
- 7 contain PExternalHelperAppChild::SendOnDataAvailable
- 4 contain PBrowserChild::SendInvokeDragSession
- 4 contain IDBTransaction::Write
- 3 contain PStorageParent::OnMessageReceived

The proto signature facet only contained the top 50, so this doesn't account for 46 of the crashes. There were a few one-off crash signatures involving things like IME that I didn't include in this list, but I'm still not sure what the rest of them are. I don't know if even the signatures with around 11 or 14 crashes are common enough to be worth fixing unless it is very easy.
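The arithmetic behind the 46 unaccounted crashes can be checked with a short tally. This is only an illustrative Python sketch, not part of the original analysis: the counts are copied from the list above, and it assumes each crash is counted under at most one frame, which the figure of 46 implies.

```python
# Crash counts per stack frame, copied from the breakdown above.
counts = {
    "PContentChild::SendAsyncMessage": 114,
    "PBrowserChild::SendAsyncMessage": 46,
    "PBackgroundChild::SendPBlobConstructor": 14,
    "HttpChannelChild::ContinueAsyncOpen": 11,
    "PBackgroundIDBRequestParent::Send__delete__": 8,
    "ContentParent::DoSendAsyncMessage": 8,
    "PExternalHelperAppChild::SendOnDataAvailable": 7,
    "PBrowserChild::SendInvokeDragSession": 4,
    "IDBTransaction::Write": 4,
    "PStorageParent::OnMessageReceived": 3,
}
TOTAL_CRASHES = 265

# Assumption: each crash is counted under at most one frame.
accounted = sum(counts.values())
unaccounted = TOTAL_CRASHES - accounted

# Crashes whose stack goes through one of the SendAsyncMessage entries.
async_message = sum(n for name, n in counts.items() if "SendAsyncMessage" in name)

# Print each frame with its share of all crashes, largest first.
for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{n:4d}  {n / TOTAL_CRASHES:6.1%}  {name}")
print(f"accounted for: {accounted}, unaccounted: {unaccounted}")
print(f"SendAsyncMessage variants: {async_message}")
```

Under that assumption the listed frames cover 219 crashes, leaving the 46 mentioned above, and the three SendAsyncMessage entries add up to 168 of the 265.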

Ideally, we'd stick some data somewhere that the crash reporter could find that would indicate the current message, to make figuring out what it was easier.
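As a rough illustration of that idea, the Python sketch below records the message currently being written in a place a crash handler could read. Everything in it is hypothetical: `crash_annotations`, `annotate_current_message`, `write_message`, and the annotation keys are made-up names for illustration, not existing crash reporter or IPC APIs, and the `MemoryError` only simulates an allocation failure.

```python
from contextlib import contextmanager

# Hypothetical stand-in for data a crash reporter could read after a crash.
crash_annotations = {}


@contextmanager
def annotate_current_message(name, size):
    """Record the IPC message being written for the duration of the body."""
    crash_annotations["current_ipc_message"] = name
    crash_annotations["current_ipc_message_size"] = size
    yield
    # Only reached when the body completes without raising, so a failure
    # leaves the annotation in place for the crash handler to pick up.
    crash_annotations.clear()


def write_message(name, payload, limit=1 << 20):
    """Toy serializer: fails for payloads above `limit` to simulate an OOM."""
    with annotate_current_message(name, len(payload)):
        if len(payload) > limit:
            raise MemoryError("simulated allocation failure")
        return len(payload)


# A small message succeeds and leaves no annotation behind.
write_message("small-message", b"x" * 16)

try:
    write_message("large-message", b"x" * (2 << 20))
except MemoryError:
    # A real crash handler would attach these annotations to the report.
    print(crash_annotations)
```

The sketch tracks only one message at a time; nested or concurrent sends would need a stack or per-thread storage.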
(Assignee)

Updated

3 years ago
Depends on: 1273301
(Assignee)

Updated

3 years ago
Depends on: 1273685
(Assignee)

Updated

3 years ago
Depends on: 1274706
Summary: Reduce IPC message size → [meta] Reduce IPC message size

Comment 5

3 months ago
Can I lend a hand?
I see that there has been no activity for a year, so I thought I'd try some part of the fix myself.