Set EventManager.resetIdleOnEvent = false by default.
Categories
(WebExtensions :: General, defect, P2)
Tracking
(Performance Impact:?, firefox117 fixed)
Tracking | Status | |
---|---|---|
firefox117 | --- | fixed |
People
(Reporter: dan+bugzilla.mozilla.org, Assigned: rpl)
References
Details
(Whiteboard: [addons-jira])
Attachments
(2 files, 1 obsolete file)
455 bytes,
patch
|
Details | Diff | Splinter Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
Steps to reproduce:
I ran Firefox with a few extensions and a lot of windows and tabs.
Actual results:
I found that Firefox was slow.
Expected results:
Firefox should have been fast.
I have analyzed this further using the Firefox profiler and found the following:
The constructor of class "EventManager" has an undocumented optional parameter "resetIdleOnEvent" which is set to true by default. (https://hg.mozilla.org/mozilla-central/file/a8939ff5236dad956af827235ceed7104e5e92c2/toolkit/components/extensions/ExtensionCommon.jsm#l2271)
If resetIdleOnEvent==true, then, whenever an event happens, a very expensive machinery is called. The machinery is expensive in terms of CPU resources and hence user time and battery charge. It includes at least 2 additional context switches per event. The CPU cost of handling an event seems to be roughly doubled if resetIdleOnEvent==true than if resetIdleOnEvent==false.
The increase in CPU expense does not seem to be justified. At least, I have not found any justification. Therefore, I believe that resetIdleOnEvent should be false by default. Any code which needs this feature should ask for it explicitly.
This can be accomplished with this patch:
--- a/toolkit/components/extensions/ExtensionCommon.jsm
+++ b/toolkit/components/extensions/ExtensionCommon.jsm
@@ -2268,7 +2268,7 @@
register,
extensionApi,
inputHandling = false,
-
resetIdleOnEvent = true,
-
} = params;resetIdleOnEvent = false,
this.context = context;
this.module = module;
With this patch, Firefox consumes less CPU resources and has less latency, it is more "snappy", even with a few extensions and a lot of windows and tabs.
The cost of resetIdleOnEvent==true
If resetIdleOnEvent==true, then whenever an event listener is called, then also a cross-process message named "background-script-reset-idle" is emitted (from the main process to the extension process) (https://hg.mozilla.org/mozilla-central/file/a8939ff5236dad956af827235ceed7104e5e92c2/toolkit/components/extensions/ExtensionCommon.jsm#l2603).
Once the extension process receives the message, the function "resetBackgroundIdle" is called (https://hg.mozilla.org/mozilla-central/file/a8939ff5236dad956af827235ceed7104e5e92c2/toolkit/components/extensions/parent/ext-backgroundPage.js#l397).
This function cancels an existing "backgroundTimer" and then resolves the "@mozilla.org/timer;1" interface and then instantiates a new instance of such a timer. It also computes data for telemetry. It barely does any useful work, except for the very unlikely case that backgroundState of the extension is BACKGROUND_STATE.SUSPENDING, in which case the a "background-script-suspend-canceled" message is emitted.
This code, including the cross-process context switches, is repeated every time when an event listener is called.
There are about 110 call sites of "new EventManager" in the current mozilla-central code (as of Firefox 112), and only 1 of them provides the "resetIdleOnEvent" optional parameter explicitly. This means that almost all event handlers are affected by this performance bug.
The intention of the resetIdleOnEvent feature seems to be to prevent suspending background pages of extensions in case they have recently received any event listener calls, and to otherwise permit suspending background pages (i.e. for saving CPU resources). However, if this is the intention, then the current implementation seems to consume more considerably more CPU resources than would be saved by suspending background pages. Also, if consumes CPU resources even if there was no background page or no CPU consumption by the background page. A more efficient implementation could trade suspend timing accuracy for efficiency, as follows:
Periodically (as defined in preference "extensions.background.idle.timeout", e.g. every 30 seconds), for each extension, it would be checked whether its background page shall be suspended. This check checks a flag "recentlyActive". If that flag was set to true, the check sets it to false. If that flag was false, then between the last check and now, no one has set it to true. Then, the check calls the suspend code. Whenever an event handler is called for that extension, the "recentlyActive" flag is set to true, which will prevent running the suspend code upon the next periodic check.
Such an implementation would call the suspend code maybe not immediately 30 seconds after the last action, but up to another period length (another 30 seconds) later. However, such an implementation would add a penalty of only one "extension.recentlyActive = true" call per event handler call, instead of adding the penalty of inter-process communication, JSON serialization and deserialization, 2 context switches, resolving the "@mozilla.org/timer;1" interface, and so on, per event handler call. Note that in the current implementation, in addition to the penalty described above, there is one timer per extension (e.g. there will be 20 timers if the user has 20 extensions), but in the suggested implementation, there would be only 1 timer.
Therefore, I believe that the current inefficient code which decreases performance shall only be enabled (again) when its effects on performance have been found to be more advantageous than without enabling it. Until then, it should be disabled (which the above patch "resetIdleOnEvent = false" achieves partially).
Reporter | ||
Comment 1•2 years ago
|
||
Comment 2•2 years ago
|
||
The Bugbug bot thinks this bug should belong to the 'WebExtensions::Untriaged' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Updated•2 years ago
|
Comment 3•2 years ago
|
||
Hello,
I’m from Webextensions QA and from what I’ve read throughout this report, I believe, what you are proposing is an enhancement.
As such I will mark this issue accordingly, however, in case of error, please do revert the changes. Thank you !
Reporter | ||
Comment 4•2 years ago
|
||
(In reply to Alex Cornestean from comment #3)
Hello,
I’m from Webextensions QA and from what I’ve read throughout this report, I believe, what you are proposing is an enhancement.
As such I will mark this issue accordingly
Hello Alex,
I believe I am reporting a regression defect, specifically a performance regression defect, not an enhancement. The introduction of sending "background-script-reset-idle" messages a year ago was made apparently without any assessment as to what performance effect it would have, and it has made the performance much worse. I am proposing to acknowledge that performance regression defect as such and to fix it.
The question is only how to fix the performance regression defect in the short term and in the long term. This could be accomplished
- by backing-out the offending changeset(s) entirely (which maybe would have to be done manually because of the time elapsed),
- by disabling the biggest part of the offending functionality, at least as a default (which is what I suggest with "resetIdleOnEvent = false" as a default, because this provides the greatest mitigation for such a small change, and therefore would serve well as a short term solution to the performance regression bug)
- by rewriting the background page suspension algorithm entirely (which would require substantial more work, and therefore is not available as a short term solution).
So the possible fixes to the defect may look like possible enhancements, but only if you believe that it is not a defect in the first place. However, the above reasons make me believe that it is a defect.
Therefore, I suggest to change of the type back to "defect".
however, in case of error, please do revert the changes. Thank you !
Unfortunately, my bugzilla account does not seem to have enough privileges to perform the revert of the type back to "defect" myself. Therefore, I ask you to change it back (or provide a reason why you believe it is not a defect).
Comment 5•2 years ago
|
||
Hello,
I changed the report back to “defect” and will continue to investigate and attempt to reproduce the issue.
Assignee | ||
Comment 6•2 years ago
|
||
needinfo-ing myself to investigate further.
Updated•2 years ago
|
Assignee | ||
Comment 7•2 years ago
|
||
Updated•2 years ago
|
Comment 8•1 years ago
|
||
There is an r+ patch which didn't land and no activity in this bug for 2 weeks.
:rpl, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit BugBot documentation.
Comment 10•1 year ago
|
||
bugherder |
Assignee | ||
Updated•1 year ago
|
Description
•