don't let postMessage() flood the main thread
Categories
(Core :: DOM: postMessage, defect, P3)
Tracking
()
People
(Reporter: bkelly, Unassigned)
References
(Blocks 2 open bugs)
Details
(Whiteboard: dom-lws-bugdash-triage)
Attachments
(4 obsolete files)
Reporter | ||
Updated•8 years ago
|
Comment 1•8 years ago
|
||
Updated•7 years ago
|
Reporter | ||
Updated•7 years ago
|
Assignee | ||
Updated•6 years ago
|
Comment 2•6 years ago
|
||
It's very similar to bug 1514413, this one is postMessage
, that bug is Http request. If the parent gets flooded by those events, the browser will completely hang, some extreme case may even force user to reboot the machine.
Chrome doesn't have this problem.
Comment 3•5 years ago
|
||
The existing two dependencies were resolved. What work is left here?
Updated•5 years ago
|
Comment 4•5 years ago
|
||
When trying to apply the patch to the current head revision, I found that
- Bug 1321903 resp. https://hg.mozilla.org/mozilla-central/rev/9439982efdc3a72db5ae3f798649fe558eea71c6 moved the back-pressure mechanism to TimeoutManager
- Bug 1363829 resp. https://hg.mozilla.org/mozilla-central/rev/bd452eda2e83eea0cabb28259106a854d8dcf847 then removed the back-pressure mechanism from TimeoutManager
- Bug 1453925 resp. https://hg.mozilla.org/mozilla-central/rev/71b23fab4c0b8a450cf906a725f55171c0e81638 removed the dispatching of PostMessage, since it already is asynchronous via the IPC mechanism
The last point removes the ability to directly put the ThrottledEventQueue
, but this change might be reverted.
Andrew, could you provide some guidance if implementing this still makes sense? I can supply a rebased patch if this makes answering this easier.
Comment 5•5 years ago
|
||
Depends on D57799
Updated•5 years ago
|
Comment 6•5 years ago
|
||
Depends on D57800
Comment 7•5 years ago
|
||
Depends on D57801
Comment 8•5 years ago
|
||
I now adapted the patch, split up into 3 patches, together with some cleanup, so maybe comment on Phabricator directly.
Comment 9•5 years ago
|
||
(In reply to Simon Giesecke [:sg] [he/him] from comment #4)
Andrew, could you provide some guidance if implementing this still makes sense? I can supply a rebased patch if this makes answering this easier.
I think we should abandon the patchset and close the bug in favor of other bugs that are about the problem than a patch holding bug. This is a real problem, but is likely best considered at a higher level more comprehensively as part of the various scheduler changes coming up and including the performance team. I'm planning to attend https://berlinallhandsjanuary2020.sched.com/event/YwoR/workshop-scheduler-design-update-and-next-steps where I'm interested in understanding if that work will interact with this problem space.
In particular:
- Various worker debugger changes have been happening to better support worker debugging and I wouldn't want to land random changes and risk complicating things for :bhackett.
- Throttling that doesn't actually provide some variant of backpressure or throttling (ex: token bucket) results in a memory leak, which isn't necessarily an improvement, and there's no analysis on this here. (Which is part of my prior paragraph point.)
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 10•3 years ago
|
||
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #9)
I think we should abandon the patchset and close the bug in favor of other bugs that are about the problem than a patch holding bug. This is a real problem, but is likely best considered at a higher level more comprehensively as part of the various scheduler changes coming up and including the performance team.
Hi Andrew, to be more concrete: Do we have a bug we can dupe this on then?
Updated•3 years ago
|
Comment 11•3 years ago
•
|
||
(In reply to Jens Stutte [:jstutte] from comment #10)
Hi Andrew, to be more concrete: Do we have a bug we can dupe this on then?
No, we would need to file new bugs. And this would also entail figuring out what we want our policy to be. This may be better spun off in favor of a product-level decision/investigation into whether there are cases where we should be throttling workers or the impact of workers. There's real potential for cross-browser compat issues here.
That said, e10s and Fission help alleviate much of the original concern here. Specifically:
- e10s means that workers aren't harming the general responsiveness as the browser as a whole because the parent process's main thread is not interfered with.
- Fission means thats the performance implications of spammed postMessage on windows should only be meaningfully impact that origin's main thread.
- That said, for BroadcastChannel (which this patch touched) and MessageChannel (related), those messages still do get routed through the PBackground thread so there is some potential concern there. But this patch did nothing to address that.
Comment 12•2 months ago
|
||
Restating:
- :bkelly's changes to BroadcastChannel and Window(Proxy).postMessages in the patch were to use ThrottledEventQueue or otherwise defer delivery of postMessage in order to avoid having the postMessage tasks dominating the main-thread run loop.
- This is a tricky scenario because doing something like this can potentially result in OOM without a capability to throttle the source of the postMessage calls. This stack did nothing to throttle the source; we only have such a mechanism in place for Workers using WorkerGlobalScope.postMessage, and it does accomplish that. This makes this a much harder problem.
- As I said in comment 11, in e10s and fission have largely mitigated concerns related to this.
From triage discussion:
- We are overhauling MessagePort in bug 1752287 which creates some new potential for backpressure, or at least in terms of allowing us to ensure that the OOM could be localized to the sender which is failing to utilize backpressure.
- We are investigating / working on worker throttling in bug 1818463 that could potentially integrate with backpressure signals, and this could potentially be hooked up to the changes above. But this would be done as new follow-ups.
- It would be desirable to have new real-world scenarios / profiling that shows that we are experiencing real problems here.
Description
•