Closed Bug 1684965 Opened 5 months ago Closed 5 months ago

Multiple seconds of parent process jank from listServiceWorkerRegistrations/< when browser console is open - maybe unbounded growth of the "pool" map?

Categories

(DevTools :: Framework, defect, P2)

defect

Tracking

(firefox86 fixed)

RESOLVED FIXED
86 Branch
Tracking Status
firefox86 --- fixed

People

(Reporter: mstange, Assigned: nchevobbe)

Details

Attachments

(1 file)

Profile: https://share.firefox.dev/3rTWrxG

Steps to reproduce:

  1. Have a Firefox profile with about 500 registered service workers. (about:debugging#/runtime/this-firefox shows "Service Workers (535)" for me.)
  2. Make sure no about:debugging tab is open.
  3. Open the browser console.
  4. Surf around for a bit, visiting pages that use service workers. Alternatively, refresh this fuchsia.dev page about 10 times.
  5. Grab a profile of the pageload of the fuchsia.dev page. Alternatively, open the link to it in a background tab and try scrolling while it loads.

Expected results:
The parent process should stay very responsive during pageload of service worker pages.

Actual results:
The parent process janks or even hangs, depending on how long the error console has been open.

Furthermore, closing the error console window spends a long time in destroy methods (almost a full minute in this case): https://share.firefox.dev/38f3fOq

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: General → Console
Component: Console → Framework

Looks like another case where using Maps is slowing us down. Since the keys in this Map are simple strings, we can try to switch to plain objects and see if we can improve performance.
There's also the fact that when calling manage, we check if there's already a parent (and if so, we make the parent unmanage the new actor), and retrieving the parent actor will look through all the pool of a connection (https://searchfox.org/mozilla-central/rev/f8a41209af503016e78278774052d48d8c52b91c/devtools/shared/protocol/Pool.js#40). We could probably set a parentActorID on the child actor to speed up that part.

There's probably something we can do on listServiceWorkerRegistration as we're recreating a new pool each time, which manages all the registration actor, and destroy the existing pool https://searchfox.org/mozilla-central/rev/f8a41209af503016e78278774052d48d8c52b91c/devtools/server/actors/root.js#451-454,457,459

At the moment, the getParent method was using the connection poolFor method
to retrieve the pool that was managing it. This is quite costly as poolFor
loops through all the pools of the connection.
This patch adds a parent property to the Pool that is set in manage and
reset in unmanage, and used in the getParent method.
This speeds up getParent as well as the methods that call it (manage when the
actor was already managed, and destroy).

Depends on D100905

Assignee: nobody → nchevobbe
Status: NEW → ASSIGNED

(In reply to Nicolas Chevobbe [:nchevobbe] from comment #2)

Looks like another case where using Maps is slowing us down. Since the keys in this Map are simple strings, we can try to switch to plain objects and see if we can improve performance.

Switching to an object actually made things worse. The Map method shows up inthe profile only because they're called a lot.
The real culprit is getParent, that uses DevToolsServerConnection#poolFor, which loops through all existing pools. In a long debugging session, I guess we can have a large number of pools, making the iteration takes longer.

In another patch, I added a parent reference in Pool so we don't have to use poolFor and saw dramatic improvements, both for managing pools that were already manage (the cause of the slowness seen in Comment 0), as well as for destroy.

  • destroy.DAMP : 251.76 -> 48.75 (-80.64%)
  • getParentTest.DAMP : 1,002.94 -> 0.23 (-99.98%)
  • manage-already-managed.DAMP : 1,002.28 -> 2.49 (-99.75%)

Tests are fine (TRY), so I think we can go this way

Pushed by nchevobbe@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f849568cfe21
[devtools] Manage a reference to the parent in Pool. r=ochameau.
Severity: -- → S3
Priority: -- → P2

Thanks for fixing this!

I don't know the code, but what you said about the number of pools also sounds concerning. The code you linked to in comment 3 destroys and recreates a pool - is it possible that the old pool still "hangs around" somewhere and contributes to the slowness? It sounds troubling that the error console would get slower the longer it is open.

Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 86 Branch
You need to log in before you can comment on or make changes to this bug.