Remove ChromeUtils.idleDispatch from the critical path for intercepted web requests
Categories
(WebExtensions :: General, defect, P2)
Tracking
(Performance Impact:high, firefox90 fixed)
Tracking | Status | |
---|---|---|
firefox90 | --- | fixed |
People
(Reporter: mayhemer, Assigned: zombie)
References
Details
(Keywords: perf:pageload)
Attachments
(1 file)
A side note: I can see we are triggering (from the parent process) the listener event chain via an idle dispatch. Here it is delayed only by 13 ms, but I've seen logs that took way longer. These notifications are critical to unblock network loads, so I wonder why we use idle dispatch when the next hop is an async IPC message to the web extension process. This is caused by the generic code in frameloader.
Sub log for examination: https://mayhemer.github.io/backtrack-analyzer/?../data/1567863-c23.zip
What was the reason to use idle dispatch for something that blocks (critical) network resource loading? Can we use priorities here?
Introduced in bug 1404652.
Assignee | ||
Comment 1•5 years ago
|
||
The reasons were explained in bug 1404652 comment 0: mainly to reduce CPU/memory/IPC overhead of sending multiple small messages, vs coalescing them into one. The IdleDispatch is capped at 250ms, so unless that's misbehaving, I doubt it could result in delays of multiple minutes.
https://searchfox.org/mozilla-central/source/toolkit/components/extensions/MessageChannel.jsm#324
I'm currently rewriting our IPC to use JSWindowActors instead MessageManagers, and Part 1 actually removes the IdleDispatch usage, with the intention of re-introducing a similar functionality in Part 2.
https://phabricator.services.mozilla.com/D44368
If you could give more details about why you think this approach is the culprit, or how we could "use priorities here", I can probably integrate it in the rewrite.
Reporter | ||
Comment 2•5 years ago
|
||
(In reply to Tomislav Jovanovic :zombie from comment #1)
I doubt it could result in delays of multiple minutes.
I wasn't trying to imply that, no. If coalescing is the case, then it is a good reason, still I think lowering the limit of 250ms to something smaller (few times, not by magnitude) when it would block an event on a critical path for a render-blocking resource load (top level, css, blocking script..) may be beneficial.
https://searchfox.org/mozilla-central/source/toolkit/components/extensions/MessageChannel.jsm#324
I'm currently rewriting our IPC to use JSWindowActors instead MessageManagers, and Part 1 actually removes the IdleDispatch usage, with the intention of re-introducing a similar functionality in Part 2.
https://phabricator.services.mozilla.com/D44368
If you could give more details about why you think this approach is the culprit, or how we could "use priorities here", I can probably integrate it in the rewrite.
Culprit of bug 1567863? No, this (=idledispatch) is definitely not. It's just that I saw it in the log, and not the first time, so I decided to file a bug eventually.
There is no generic approach to use priorities, AFAIK. We don't auto-propagate priorities on event trees. What we could try to do is to let the webext network request observer [1ab] check for some attributes on the channel [2] and possibly suggest a priority from that spot.
[1a] https://searchfox.org/mozilla-central/rev/5cb522c7baba24e55874809e0e206b001494c1e9/toolkit/components/extensions/webrequest/WebRequest.jsm#921
[1b] https://searchfox.org/mozilla-central/rev/5cb522c7baba24e55874809e0e206b001494c1e9/toolkit/components/extensions/ExtensionParent.jsm#1156
[2] https://searchfox.org/mozilla-central/rev/5cb522c7baba24e55874809e0e206b001494c1e9/netwerk/base/nsIClassOfService.idl#62,80
Reporter | ||
Comment 3•5 years ago
|
||
And btw, bug 1404652's title suggest to coalesce "low-priority" events. These are definitely not low priority!
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 4•5 years ago
|
||
- Reduce head-of-line blocking:
As Honza suggest, we probably want to not coalesce (or use a significantly shorter timeout), on certain types of requests, specifically any "top level, css, blocking script.." with a blocking listener.
- Make it efficient with Fission
In certain cases, new actor(conduit)-based IPC may be less efficient that MessageManagers because of the lack of broadcast support. When we need to deliver the exact same message bits to several listeners in the same process, we send a separate IPC message for each one.
The solution is to use a "process-level actor" from bug 1580448 to send messages grouped by the target listeners' process.
Assignee | ||
Comment 5•5 years ago
|
||
Found ClassOfService, and looks we should be able to QI a channel to a CAS to determine if we should prioritize a specific request:
https://searchfox.org/mozilla-central/rev/6566d92d/netwerk/base/nsIClassOfService.idl#25
Comment 6•5 years ago
|
||
Tentatively moving all bugs whose summaries mention "Fission" (or other Fission-related keywords) but are not assigned to a Fission Milestone to the "?" triage milestone.
This will generate a lot of bugmail, so you can filter your bugmail for the following UUID and delete them en masse:
0ee3c76a-bc79-4eb2-8d12-05dc0b68e732
Comment 7•5 years ago
|
||
kmag says this bug should block enabling Fission in Nightly (M6).
I'm not 100% sure this is related to 1624685 but will take your word on it. If so, I notice that the ticket has been open for 6 months without resolution. I would advise that the priority should be much higher because it's giving a very bad Firefox experience for anyone who has web extensions installed.
Here's why: 150ms of latency isn't that noticeable for a single request and a single listener. However, many web extensions have multiple listeners installed. In our case, we are regularly seeing 600ms of additional latency for a single request.
Now compound that for pages which make multiple requests (virtually all of them). Sites like Facebook and Gmail suffer horribly in Firefox. We are seeing 5-10 second delays. This is compounded by the fact that Gmail and other web apps will display "Still loading..." alerts to the user when this happens.
The same extension with the same listeners in Chrome does not have this problem at all. The experience there is very snappy.
Assignee | ||
Comment 10•5 years ago
|
||
You might be correct, what you are describing here (as opposed to what was reported in bug 1624685) does indeed sound different.
To recap, we batch (coalesce) all web request events, to send them in one message when the thread is idle, or a maximum of 250ms. In practice, on an modern laptop with unencumbered CPU/memory, this is usually under 15ms delay in my naive measurements. Only on lower end machines under heavy load is it supposed to go up to 250ms, and that is total, not per extension. To get a 10 second delay from this, the web site would have to issue dozen of requests sequentially, and test this on a encumbered old pc.
If you can reliably reproduce a 10 second difference between loading a website without extensions vs with one or more extensions with empty blocking webRequest listeners, I can easily provide a build that reduces this delay to 0 so we can confirm if this is the root cause.
Comment 12•5 years ago
|
||
Here is a minimal web extension that does nothing. It simply registers two blocking event listeners and returns a promise that allows the request through. Note that I'm using Firefox send and the link will expire in 7 days:
https://send.firefox.com/download/e560c24cd6620742/#i9bjcF528sH-EafgoNRZOg
Here's a sample benchmark page that we use to measure web extension latency:
http://demos.winstonprivacy.com/latency.html
You can run the page with and without the dummy extension enabled to see the impact of latency. I'm running a high end laptop and I'm getting 100-150ms unloaded latency above what FF without an extension exhibits.
Comment 13•5 years ago
|
||
Checking in to see if this conversation is appropriate here or if I should reopen the original ticket. I am happy to follow your lead!
Comment 14•5 years ago
|
||
The link above is about to expire. Here's a new one which will work for 7 days.
https://send.firefox.com/download/a4e5ccd22fe72017/#__BuuQYLBkNq6yiJFXlytA
kmag, is the current version of JSProcessActor sufficient for the needs of this bug?
Assignee | ||
Updated•5 years ago
|
Updated•4 years ago
|
Comment 17•4 years ago
|
||
(In reply to David Teller [:Yoric] (please use "needinfo") from comment #15)
kmag, is the current version of JSProcessActor sufficient for the needs of this bug?
Probably, but I'm not the person to ask at this point.
Comment 18•4 years ago
|
||
Do we know when/how the "events" to coalesce are created? Could we possibly coalesce in some other way?
There are various options, depending on how heavily we need to coalesce.
We could rely on end-of-task model, or perhaps dispatch a new normal priority runnable to the main thread and when that runs, it would
process all the "events".
This should be probably rather high priority bug.
See https://bugzilla.mozilla.org/show_bug.cgi?id=1684703#c17
This makes pages behave really badly on high end machines.
Comment 19•4 years ago
|
||
(In reply to Tomislav Jovanovic :zombie from comment #1)
The reasons were explained in bug 1404652 comment 0: mainly to reduce CPU/memory/IPC overhead of sending multiple small messages, vs coalescing them into one. The IdleDispatch is capped at 250ms, so unless that's misbehaving, I doubt it could result in delays of multiple minutes.
This is being applied on messages that are in the critical path to getting requests out to the network. We cannot have a 250ms delay here.
Comment 20•4 years ago
|
||
Lets bump the priority here. Tom, this is assigned to you, do we need to re-triage that?
Updated•4 years ago
|
Comment 21•4 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #18)
Do we know when/how the "events" to coalesce are created?
Those are network requests (potentially multiple independent ones). Each individual request can trigger event listeners in the following sequence: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/webrequest-flow.png
I have grouped events that are typically near each other (i.e. due to the absence of dependencies on external network activity). Events marked as "blocking" will suspend a request until the event has been handled.
- onBeforeRequest [blocking]
- onBeforeSendHeaders [blocking]
- onSendHeaders
- onHeadersReceived [blocking]
- onAuthRequired [blocking]
- onBeforeRedirect
- onResponseStarted
- onCompleted
- onErrorOccurred
Note that the above is about a single request. When a web page with multiple subresources is loading (or when many tabs are loading), then independent requests can trigger different events.
I suppose that the least that we can do is to use a significantly lower deadline when blocking event listeners are involved. That would then apply to all ongoing network requests: if we didn't and were to re-order events, then extensions can get confused.
Comment 22•4 years ago
|
||
zombie recommends we consider moving this bug from Fission Future to M7. He'll investigate how much work this bug is this week.
Comment 23•4 years ago
|
||
kmag thinks this should affect fission the same way as non-fission, so we shouldn't track it for Fission milestone. zombie?
Comment 24•4 years ago
|
||
zombie confirmed that M8 is a better target for this.
Comment 25•4 years ago
|
||
Sorry, that was incorrect. Not a Fission specific issue.
Assignee | ||
Comment 26•4 years ago
|
||
Use nsIClassOfService flags to check if a request is urgent, or can be batched:
Leader | Unblocked | UrgentStart | TailForbidden
These cover top frame html, <head> blocking css and js (and xhr), font preloads,
and everything else that can delay first paint or DOMContentLoaded.
Comment 27•4 years ago
|
||
Comment 28•4 years ago
|
||
bugherder |
Comment 29•4 years ago
|
||
== Change summary for alert #29853 (as of Wed, 28 Apr 2021 09:24:10 GMT) ==
Improvements:
Ratio | Suite | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|---|
6% | perf_reftest (docs) | some-descendants-1.html | linux1804-64-shippable-qr | e10s stylo webrender | 4.66 -> 4.40 |
4% | tp5o_webext (docs) | macosx1015-64-shippable-qr | e10s stylo webrender-sw | 231.04 -> 222.31 | |
4% | tp5o_webext (docs) | macosx1015-64-shippable-qr | e10s stylo webrender-sw | 230.73 -> 222.14 | |
4% | tp5o_webext (docs) | linux1804-64-shippable | e10s stylo | 308.61 -> 297.25 | |
3% | tp5o_webext (docs) | linux1804-64-shippable-qr | e10s stylo webrender | 291.27 -> 281.70 | |
3% | tp5o_webext (docs) | macosx1015-64-shippable-qr | e10s stylo webrender | 230.76 -> 223.64 |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=29853
Updated•3 years ago
|
Description
•