Don't send large window actor messages
Categories
(Core :: DOM: Content Processes, task, P1)
Tracking
()
People
(Reporter: mccr8, Assigned: mccr8)
References
Details
Attachments
(2 files, 1 obsolete file)
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
|
Details | Review |
4.72 KB,
patch
|
Details | Diff | Splinter Review |
Assignee | ||
Comment 1•5 years ago
|
||
We have some crashes where the crash reason is "IPC message size is too large" when sending a JS window actor message. The problem is that it is difficult to tell what the actual message is, because that comes from the JS stack and not the C++ stack. In the run up to e10s, we had the same issue with message manager messages, so it makes sense that we'd see the same issue with JS window actors, because window actor messages are kind of the Fission version of message manager messages. It is even possible that some of the crashs we're seeing are messages that were converted from the message manager.
What we did for e10s was twofold and is encapsulated in AllowMessage in nsFrameMessageManager.cpp:
- If the message large, return an error instead of attempting to send it.
- Record the name of the message that we failed to send. In this way, individual call sites can be fixed as desired. One important detail here is that we strip out all digits from the name of the message. The reason for this was that some addon was sending message manager messages with names like foo1, foo2, etc. and the huge number of messages was causing issues for telemetry (see bug 1275707). I don't know if that is still necessary, but it seems like it wouldn't hurt to keep.
Assignee | ||
Comment 2•5 years ago
|
||
IPC crashes if you try to send large messages. The message manager
already drops messages that are too big (and used to collect telemetry
for which messages were being dropped).
I think that some message manager messages have been ported over to JS
window actors and started crashing on beta, so let's just give the
same behavior for JS window actors as we do for the message manager.
I also added telemetry for what messages are being rejected for being
too large, in case there are some new or old things that we want to
fix.
Assignee | ||
Comment 3•5 years ago
|
||
This also needs data review. I'll fill that out and put it up for review tomorrow, but it is nearing the end of my work day and jdai is in Berlin so I figured I would put it up for the rest of the review now.
Assignee | ||
Comment 4•5 years ago
|
||
Updated•5 years ago
|
Comment 5•5 years ago
|
||
Priority P1 and tracking for Fission dogfooding (M5) since this fix and telemetry probe will improve Fission stability.
status-firefox72: --- → affected
We probably don't need to uplift this fix or telemetry probe to 72 Beta.
Assignee | ||
Comment 6•5 years ago
|
||
(In reply to Chris Peterson [:cpeterson] from comment #5)
We probably don't need to uplift this fix or telemetry probe to 72 Beta.
This will fix a crash that is happening on beta. (See the bugs this is blocking.) Non-fission is also using a lot of window actors.
Comment 7•5 years ago
|
||
Assignee | ||
Comment 8•5 years ago
|
||
I'm going to split out the telemetry in another bug, so we can get this on beta with less risk and delay. Sorry for the churn.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Comment 9•5 years ago
•
|
||
:mccr8 Does this mean you don't need a data review on this bug anymore?
Assignee | ||
Comment 10•5 years ago
|
||
(In reply to Nicole Shadowen from comment #9)
:mccr8 Does this mean you don't need a data review on this bug anymore?
I'm going to re-request data review on the new bug. Sorry for the confusion.
Comment 11•5 years ago
|
||
Assignee | ||
Comment 12•5 years ago
|
||
Comment on attachment 9116595 [details]
Bug 1604609 - Don't send large window actor messages.
Beta/Release Uplift Approval Request
- User impact if declined: Crashes. See bug 1576565.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): On the one hand, this is low risk, because all it does is not send a message in the case where we were going to crash right after. Also, for code ported from message managers to window actors this just restores the old behavior. Making it riskier is that this affects a lot of code, so maybe something weird could happen if we do nothing than if we crash? The volume of this crash on Nightly is not very high so it isn't clear how much baking on Nightly will help us understand the risks.
- String changes made/needed: none
Assignee | ||
Comment 13•5 years ago
|
||
By "covered by automated tests", I mean the non-failing cases are covered. The failing case, where we drop a message, is not tested. These crashes happen when JS window actors attempt to send a very large (256MB) message. The typical cases where I've seen this before are things like trying to send an entire downloaded file via IPC in a single message.
Comment 14•5 years ago
|
||
bugherder |
Comment 15•5 years ago
|
||
Comment on attachment 9116595 [details]
Bug 1604609 - Don't send large window actor messages.
avoid crashes in ipc, approved for 72.0b10
Comment 16•5 years ago
|
||
bugherder uplift |
Comment 17•5 years ago
|
||
Backed out changeset b27da4e9e365 (Bug 1604609) for build bustages at JSWindowActorChild.cpp.
https://hg.mozilla.org/releases/mozilla-beta/rev/2bb7d07b805e5efbd097658b4736907ce10c2097
Failure log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=282153153&repo=mozilla-beta&lineNumber=36243
Comment 18•5 years ago
|
||
Comment 19•5 years ago
|
||
bugherder uplift |
Description
•