Closed Bug 1261094 Opened 4 years ago Closed 4 years ago

MessageChannel::mInterruptStack makes a full copy of the message

Categories

(Core :: IPC, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla49
Tracking Status
firefox47 - wontfix
firefox48 + fixed
firefox49 --- fixed

People

(Reporter: mccr8, Assigned: baku)

References

Details

(Whiteboard: [MemShrink:P2][not e10s-specific] btpp-active)

Crash Data

Attachments

(1 file)

MessageChannel::Call() does this:
  mInterruptStack.push(*msg);

...which makes a full copy of |msg|.

This is pretty wasteful, and sometimes leads to OOM crashes when called from PPluginScriptableObjectParent::CallInvoke():
https://crash-stats.mozilla.com/report/index/4099f75e-69dc-4865-89c2-1cd772160326

The OOM allocation size ranges from a few hundred KB all the way up to 176MB (!!!):
https://crash-stats.mozilla.com/report/index/81060c17-7c20-4949-8b80-fe9132160330

As far as I can tell, the only thing anybody does to these messages is call |type()| and |seqno()| on them, so we shouldn't need to copy the full data. Maybe mInterruptStack could be made up of a new type that is just a pair of those bits of data.

This isn't a common OOM signature (I only see about 50), and it isn't e10s specific (the only callers of MessageChannel::Call() seem to be plugins and GMP), but it should be a nice improvement.
See Also: → 1261099
Whiteboard: [MemShrink] → [MemShrink][not e10s-specific]
(In reply to Andrew McCreight [:mccr8] from comment #0)
> As far as I can tell, the only thing anybody does to these messages is call
> |type()| and |seqno()| on them, so we shouldn't need to copy the full data.
> Maybe mInterruptStack could be made up of a new type that is just a pair of
> those bits of data.

IPC::Message::Header is one possibility for that type; it's currently `protected` in IPC::Message (except on OS X?) but that could be changed.  I haven't looked at this code so I don't know if it would make more sense to do that or just make a struct specifically for this.
Whiteboard: [MemShrink][not e10s-specific] → [MemShrink:P2][not e10s-specific]
baku, you're a fan of MessageChannel :)
Flags: needinfo?(amarchesini)
Whiteboard: [MemShrink:P2][not e10s-specific] → [MemShrink:P2][not e10s-specific] btpp-fixlater
MessageChannel is, confusingly, different from MessagePort. :)
I've always wanted to work with 'deep' IPC code. I take this bug. It seems a good first bug for me.
Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)
Attached patch message.patchSplinter Review
Attachment #8744562 - Flags: review?(continuation)
Whiteboard: [MemShrink:P2][not e10s-specific] btpp-fixlater → [MemShrink:P2][not e10s-specific] btpp-active
Comment on attachment 8744562 [details] [diff] [review]
message.patch

Thanks for fixing this. This looks good to me, but it should get a review from an IPC peer.
Attachment #8744562 - Flags: review?(jld)
Attachment #8744562 - Flags: review?(continuation)
Attachment #8744562 - Flags: feedback+
Attachment #8744562 - Flags: review?(jld) → review+
https://hg.mozilla.org/mozilla-central/rev/b05438670c10
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
[Tracking Requested - why for this release]: this may help e10s memory usage
Tracking for 48 since e10s may go to release in 48
Want to request uplift for this fix to 48?
Flags: needinfo?(amarchesini)
Comment on attachment 8744562 [details] [diff] [review]
message.patch

Approval Request Comment
[Feature/regressing bug #]: IPC
[User impact if declined]: This patch reduces the amount of memory that IPC code uses.
[Describe test coverage new/current, TreeHerder]: green on try.
[Risks and why]: none
[String/UUID change made/needed]: none
Flags: needinfo?(amarchesini)
Attachment #8744562 - Flags: approval-mozilla-aurora?
Comment on attachment 8744562 [details] [diff] [review]
message.patch

Let's try this in aurora. Maybe it will help with OOM crashes.
Attachment #8744562 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
At this point, I would like to only accept uplifts for critical recent regressions, severe stability, security, perf, mlk issues only. This is to minimize code churn and to ensure we ship a high-quality Fx47 in 2 weeks.

Since this is e10s related, which is turned off by default on Fx47, this is now a wontfix.
(In reply to Ritu Kothari (:ritu) from comment #15)
> Since this is e10s related, which is turned off by default on Fx47, this is
> now a wontfix.

I think it is reasonable to not take it on beta at this point, but this is actually related to NPAPI IPC, and is therefore not specific to e10s.
You need to log in before you can comment on or make changes to this bug.