Bug 1260908 added MESSAGE_MANAGER_MESSAGE_SIZE and IPC_MESSAGE_SIZE telemetry for tracking the size of IPC messages. Some of these messages are very large, which may be contributing to the high rate of OOM crashes in the e10s beta experiment, through memory fragmentation. While it would be nice to improve our handling of this at the IPC level, fixing individual IPC users might result in faster improvements in the short term. I don't know if it makes sense to track both kinds of improvements in a single bug or not.
sdk/remote/process/message is fairly common, but it looks like a generic pipe that the Addon SDK uses to send messages, so I'm not sure if it is worth filing a bug for.
I've now gone through the telemetry and filed bugs for all of the large messages that are at least somewhat common. Honestly, though, the large session store messages (bug 1262661) may outnumber all of the rest combined, so I'm not sure how worthwhile it is to work on the others until that is fixed.
I'm adding the generic IPC-level improvements to this bug, too, which we may or may not want to do.
To get a sense of what messages are problematic, I looked at all crashes in 47 beta with the signature [@ OOM | large | mozalloc_abort | mozalloc_handle_oom | moz_xrealloc | Pickle::WriteBytes ], which is a fairly common content process crash. Across both processes, there were 265 total. The breakdown for messages looks like:
- 114 contain PContentChild::SendAsyncMessage
- 46 contain PBrowserChild::SendAsyncMessage
- 14 contain PBackgroundChild::SendPBlobConstructor
- 11 contain HttpChannelChild::ContinueAsyncOpen
- 8 contain PBackgroundIDBRequestParent::Send__delete__
- 8 contain ContentParent::DoSendAsyncMessage
- 7 contain PExternalHelperAppChild::SendOnDataAvailable
- 4 contain PBrowserChild::SendInvokeDragSession
- 4 contain IDBTransaction::Write
- 3 contain PStorageParent::OnMessageReceived
The proto signature facet only contained the top 50, so this doesn't account for 46 of the crashes. There were a few one-off crash signatures involving things like IME that I didn't include in this list, but I'm still not sure what the rest of it is. I don't know if even the signatures around 11 or 14 crashes are common enough to be worth fixing unless it is very easy. Ideally, we'd stick some data somewhere that the crash reporter could find that would indicate the current message, to make figuring out what it was easier.
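As a rough sanity check on the breakdown above, here is a small Python sketch that tallies the listed counts against the 265 total. The dictionary, the `summarize` helper, and the variable names are only illustrative and are not part of any crash-stats tooling; the numbers are copied from this comment.

```python
# Crash counts per frame, copied from the breakdown in this comment.
counts = {
    "PContentChild::SendAsyncMessage": 114,
    "PBrowserChild::SendAsyncMessage": 46,
    "PBackgroundChild::SendPBlobConstructor": 14,
    "HttpChannelChild::ContinueAsyncOpen": 11,
    "PBackgroundIDBRequestParent::Send__delete__": 8,
    "ContentParent::DoSendAsyncMessage": 8,
    "PExternalHelperAppChild::SendOnDataAvailable": 7,
    "PBrowserChild::SendInvokeDragSession": 4,
    "IDBTransaction::Write": 4,
    "PStorageParent::OnMessageReceived": 3,
}

# Total crashes with this signature across both processes.
TOTAL_CRASHES = 265


def summarize(counts, total):
    """Return (listed, unaccounted, shares) for a count breakdown.

    listed      -- sum of the counts that were attributed to a frame
    unaccounted -- crashes in the total that no listed frame covers
    shares      -- each frame's fraction of the total
    """
    listed = sum(counts.values())
    unaccounted = total - listed
    shares = {name: n / total for name, n in counts.items()}
    return listed, unaccounted, shares


listed, unaccounted, shares = summarize(counts, TOTAL_CRASHES)
print(f"listed: {listed}, unaccounted: {unaccounted}")

# Largest contributors first.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{share:6.1%}  {name}")
```

The listed frames sum to 219, which leaves the 46 unaccounted crashes mentioned above, and the two child-side SendAsyncMessage frames together cover roughly 60% of the total.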