Closed Bug 1604609 Opened 5 years ago Closed 5 years ago

Don't send large window actor messages

Categories

(Core :: DOM: Content Processes, task, P1)

task

Tracking

()

RESOLVED FIXED
mozilla73
Fission Milestone M5
Tracking Status
firefox72 + fixed
firefox73 + fixed

People

(Reporter: mccr8, Assigned: mccr8)

References

Details

Attachments

(2 files, 1 obsolete file)

No description provided.

We have some crashes where the crash reason is "IPC message size is too large" when sending a JS window actor message. The problem is that it is difficult to tell what the actual message is, because that comes from the JS stack and not the C++ stack. In the run up to e10s, we had the same issue with message manager messages, so it makes sense that we'd see the same issue with JS window actors, because window actor messages are kind of the Fission version of message manager messages. It is even possible that some of the crashs we're seeing are messages that were converted from the message manager.

What we did for e10s was twofold and is encapsulated in AllowMessage in nsFrameMessageManager.cpp:

  1. If the message large, return an error instead of attempting to send it.
  2. Record the name of the message that we failed to send. In this way, individual call sites can be fixed as desired. One important detail here is that we strip out all digits from the name of the message. The reason for this was that some addon was sending message manager messages with names like foo1, foo2, etc. and the huge number of messages was causing issues for telemetry (see bug 1275707). I don't know if that is still necessary, but it seems like it wouldn't hurt to keep.
Blocks: 1604564, 1576565

IPC crashes if you try to send large messages. The message manager
already drops messages that are too big (and used to collect telemetry
for which messages were being dropped).

I think that some message manager messages have been ported over to JS
window actors and started crashing on beta, so let's just give the
same behavior for JS window actors as we do for the message manager.

I also added telemetry for what messages are being rejected for being
too large, in case there are some new or old things that we want to
fix.

This also needs data review. I'll fill that out and put it up for review tomorrow, but it is nearing the end of my work day and jdai is in Berlin so I figured I would put it up for the rest of the review now.

Attached file data collection request (obsolete) —
Attachment #9116793 - Flags: data-review?(chutten)

Priority P1 and tracking for Fission dogfooding (M5) since this fix and telemetry probe will improve Fission stability.

status-firefox72: --- → affected

We probably don't need to uplift this fix or telemetry probe to 72 Beta.

Fission Milestone: --- → M5
Priority: -- → P1

(In reply to Chris Peterson [:cpeterson] from comment #5)

We probably don't need to uplift this fix or telemetry probe to 72 Beta.

This will fix a crash that is happening on beta. (See the bugs this is blocking.) Non-fission is also using a lot of window actors.

Comment on attachment 9116793 [details] data collection request Load balancing to Nicole.
Attachment #9116793 - Flags: data-review?(chutten) → data-review?(nshadowen)

I'm going to split out the telemetry in another bug, so we can get this on beta with less risk and delay. Sorry for the churn.

Blocks: 1605218
Summary: Don't send large window actor messages (and collect telemetry for rejected messages) → Don't send large window actor messages
Attachment #9116793 - Attachment is obsolete: true
Attachment #9116793 - Flags: data-review?(nshadowen)

:mccr8 Does this mean you don't need a data review on this bug anymore?

Flags: needinfo?(continuation)

(In reply to Nicole Shadowen from comment #9)

:mccr8 Does this mean you don't need a data review on this bug anymore?

I'm going to re-request data review on the new bug. Sorry for the confusion.

Flags: needinfo?(continuation)
Pushed by amccreight@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9bb2631238b5 Don't send large window actor messages. r=jdai

Comment on attachment 9116595 [details]
Bug 1604609 - Don't send large window actor messages.

Beta/Release Uplift Approval Request

  • User impact if declined: Crashes. See bug 1576565.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): On the one hand, this is low risk, because all it does is not send a message in the case where we were going to crash right after. Also, for code ported from message managers to window actors this just restores the old behavior. Making it riskier is that this affects a lot of code, so maybe something weird could happen if we do nothing than if we crash? The volume of this crash on Nightly is not very high so it isn't clear how much baking on Nightly will help us understand the risks.
  • String changes made/needed: none
Attachment #9116595 - Flags: approval-mozilla-beta?

By "covered by automated tests", I mean the non-failing cases are covered. The failing case, where we drop a message, is not tested. These crashes happen when JS window actors attempt to send a very large (256MB) message. The typical cases where I've seen this before are things like trying to send an entire downloaded file via IPC in a single message.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73

Comment on attachment 9116595 [details]
Bug 1604609 - Don't send large window actor messages.

avoid crashes in ipc, approved for 72.0b10

Attachment #9116595 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Thanks for the fixup, :zombie!

Flags: needinfo?(continuation)
See Also: → 1661384
See Also: 1661384
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: