Open Bug 1823729 Opened 2 years ago Updated 7 months ago

parent process can consume many GB of memory if google document active

Categories

(WebExtensions :: General, defect, P2)

defect

Tracking

(Performance Impact:low, firefox-esr102 unaffected, firefox111 unaffected, firefox112 affected, firefox113 affected)

REOPENED
Performance Impact low
Tracking Status
firefox-esr102 --- unaffected
firefox111 --- unaffected
firefox112 --- affected
firefox113 --- affected

People

(Reporter: aryx, Unassigned, NeedInfo)

References

(Blocks 1 open bug)

Details

(Keywords: perf:resource-use, perf:responsiveness, top50, Whiteboard: [addons-jira])

Attachments

(1 file)

Firefox 113.0a1 20230321091911 on Windows 10.

With a Firefox profile with ~12 windows which contain ~3000 tabs (unknown if a requirement to reproduce the issue), switching to tabs which contain Google Docs content (doc, sheet) can trigger the following behavior:

  • parent process firefox.exe starts to consume more memory (have observed >21GB) which will return to its idle ~1GB if the user does not interact with it.
  • Firefox UI (controlled by the parent process) often gets unresponsive or switching between tabs is delayed.

Recorded profile: https://share.firefox.dev/40qCHlM

The behavior got also observed in bug 1818758 but was actually unrelated.

LibreOffice (7.3.4.2) got installed on this computer some weeks ago, and since then fonts can look different (e.g. Treeherder task symbols, failure lines, Twitter single message view). Could additional fonts cause this issue?

The issue did not reproduce in a new Nightly profile with 1 Google sheet and 2 Google docs.

Performance Impact: --- → ?

Zombie, I see crazy lots of time being spent when ConduitsParent/Child.jsm calls sendQuery.
What is Conduits and why does it pass so much data around?

Performance Impact: ? → high
Flags: needinfo?(tomica)

The Bugbug bot thinks this bug should belong to the 'Core::Layout: Text and Fonts' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Layout: Text and Fonts
Product: Firefox → Core
Component: Layout: Text and Fonts → Untriaged
Product: Core → Firefox

Are you still experiencing this issue? A memory report from about:memory might help diagnose the issue.

Flags: needinfo?(aryx.bugmail)
Component: Untriaged → Performance
Product: Firefox → Core

This bug was moved into the Performance component.

:aryx, could you make sure the following information is on this bug?

  • ✅ For slowness or high CPU usage, capture a profile with http://profiler.firefox.com/, upload it and share the link here.
  • For memory usage issues, capture a memory dump from about:memory and attach it to this bug.
  • Troubleshooting information: Go to about:support, click "Copy raw data to clipboard", paste it into a file, save it, and attach the file here.

If the requested information is already in the bug, please confirm it is recent.

Thank you.

Flags: needinfo?(aryx.bugmail)

Based on this profile this very much looks like we're doing something bad in the WebExtensions support code.

Does this reproduce without add-ons?

Component: Performance → General
Product: Core → WebExtensions
Version: unspecified → Trunk

The Performance Impact Calculator has determined this bug's performance impact to be high. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

Platforms: [x] Windows [x] macOS
Impact on browser: Causes noticeable jank
Configuration: Rare
Websites affected: Major
[x] Causes severe resource usage

This does not reproduce anymore. Because no other person reported the issue, I will close the bug. Feel encouraged to reopen if you want to continue the investigation.

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(aryx.bugmail)
Resolution: --- → WORKSFORME

Observed the issue again and with 16GB used on startup + tab switching, had to let it load all pinned and active tabs and idle to not pump the memory usage with tab switching anymore.

Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #1)

Zombie, I see crazy lots of time being spent when ConduitsParent/Child.jsm calls sendQuery.
What is Conduits and why does it pass so much data around?

I just took a look to the profiler data linked in comment 0 and I see that the Conduits sendQuery calls are related to a flood of tabs.onUpdated events for a listener registered by the multi account container tab (which can be seen in the "Marker Table" tab for the parent process, the amount of markers is so high that the "Marker Chart" instead is getting stuck, likely while trying to render too many of them in the generated chart).

The amount of tab events being notified to the extensions seems definitely too high, suggests something is triggering a pretty continuous sequence of tabs.onUpdated event, I'm not yet sure if it may be multi-accounts-container itself or if it is hit when the tabs.onUpdated event listener registered by the multi account container (which I'd guess it would be this one registered here) is combined with some specific webpage triggering an tabs.onUpdated event as a side effect of something being done continuously (e.g. changing the tab title, tab favicon or some of its states) or with something being done by another extension installed along multi account container.

Based on the quick look I gave to the code around the tabs.onUpdated listener that multi-account-containers registers here) is expected to be:

  • only registered when a newly opened tab is a container tab
  • unregistered as soon as the tab load is completed

And so it seems that to be able to hit the behavior described in comment 0 the following conditions may have to be all met:

  • multi-account-container has registered a listener for a newly opened container tabs
  • the webpage document being loaded in the new container tabs to never reach the "complete" readyState
  • a frequent (and likely never ending?) sequence of tabs.onUpdated events to be triggered for that same tabs

We can try to confirm if this STR would already allow us to reproduce the issue consistently and assess what we may be able to do on the WebExtensions internals side to reduce the impact on the parent process when this kind of misbehavior may be hit, or if we need to gather some more details to be able to reproduce it consistently.

In the shorter term, given that the extension that based on the current report provided by this issue seems to be one of our own extensions, it may also be worth look into which kind of action we may consider on the extension side:

Currently multi-account-containers is registering tabs.onUpdate without any extra parameters (see MDN docs here) and so the listener is going to be called for any tabs update event, but it seems we are mainly interested to be notified on changes to the tab status and so we may consider registering that listener with the extra parameter set to { properties: ["status"] } to be only notified when the tab status is changing and not be notified at all for any other change to the tab properties unrelated to that.

Flags: needinfo?(tomica)

Hi Sebastian,
would you mind to share with us details from about:support? currently I'd be mainly interested in the list of addons enabled (e.g. in case that may be useful to get to a complete STR if we are still unable to reproduce it consistently with the STR I mentioned in comment 9, drafted based on the details available at the moment), but the user prefs and some other details included in about:support may also end up to be useful.

Flags: needinfo?(aryx.bugmail)
Attached file aboutsupport.txt

The Firefox Translations has been turned off after the about:support information got copied - will monitor if this fixes the issue.

Flags: needinfo?(aryx.bugmail)

Observe again, this time with Firefox Translate disabled.

The severity field is not set for this bug.
:willdurand, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(wdurand)

This issue reproduces (almost?) 100% for the first run each time after Nightly got updated now. Could MAC not be the root cause but we a consequence of it?

With that many windows and tabs, there will be many events on every startup, and MAC being inefficient sure seems as a likely cause.

Hey Danny, could you look into recommended optimizations from Luca above please?

Flags: needinfo?(wdurand) → needinfo?(contact)

:robwu can you review this bug and determine if the severity matches the performance impact? A High impact would translate to a S2. If you feel this is an S3, can you comment on the mismatch between our calculations and yours and we can reassess the impact?

Flags: needinfo?(rob)

We downgrade from S2 to S3 because the condition for triggering this perf issue is exceptional (3k tabs, installing a specific extension).

In comment 9 and comment 15 we recommended the Multi Accounts Container extension to patch themselves to address this issue. Regardless of what we're doing with this bug to optimize for the situation, a fix in the MAC add-on seems useful either way.

Severity: -- → S3
Flags: needinfo?(rob)
Priority: -- → P2
Whiteboard: [addons-jira]

I've reviewed the available data, and comment 17. The Performance Impact Calculator has determined this bug's performance impact to be low. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

Platforms: [x] Windows [x] macOS
Impact on browser: Causes noticeable jank
Configuration: Rare
Resource impact: Severe

Performance Impact: high → low

:zombie It seems like limiting tabs.onUpdated to only look for the "status" doesn't have any drawback so I pushed a PR on https://github.com/mozilla/multi-account-containers/pull/2659. :lgreco is there any way to realibly reproduce the perf issue so I can confirm limiting the eventlistener is the only fix needed?

Flags: needinfo?(contact) → needinfo?(lgreco)

(In reply to Danny Colin [:sdk] from comment #19)

:zombie It seems like limiting tabs.onUpdated to only look for the "status" doesn't have any drawback so I pushed a PR on https://github.com/mozilla/multi-account-containers/pull/2659. :lgreco is there any way to realibly reproduce the perf issue so I can confirm limiting the eventlistener is the only fix needed?

Honestly I can't recall anymore which kind of attempts I may already have done when I wrote comment 9, but what I wrote in comment 9 and comment 10 makes me pretty sure that I was unable to reproduce it yet, and so I tried to describe in comment 9 what conditions may make us able to trigger the issue in case we needed to force it to be reproduced "artificially".

Ideas for an "artificially" recreated STR

Having said that, looking back to the multi-account-containers code I linked in comment 9 and the details described in comment 0 makes me think that restoring a session with a big enough list of container tabs may be an important part of an STR able to trigger the issue consistently (because the restored tabs not explicitly activated will be likely staying discarded, and so they will never be reaching the "complete" status, and if they are all container tabs then we should end up with one of these tabUpdateHandler listeners per discarded container tab to stay around).

I think that recreating the pre-conditions described above along with a test webpage or (a test extension) making sure to forcefully triggering a flood of tabs.onUpdated events may have a reasonable chance to be able to trigger it consistently.

Additional thought: is multi-account-containers also leaking tabs.onUpdate listeners when multiple tabs are opened closely to each other?

Another detail that I just noticed in that part of the multi-account-containers logic (unless I'm misreading that logic):

  • tabs.onCreated listener is re-assigning this.tabUpdateHandler to a new arrow function on each tabs.onCreated event, then it adds that as a tabs.onUpdated listener
  • then the arrow function will be then removing from the tabs.onUpdated listeners the arrow function currently set on this.tabUpdateHandler

And so I'm wondering if that is also part of the issue, e.g. if tabs.onCreated is called for multiple tabs being opened almost at the same time (e.g. as part of session restore), then the extension may end up leaking the arrow functions registered as tabs.onUpdated listener if this.tabUpdateHandler was already re-assigned to a different arrow function by another call to the tabs.onCreated listener.

wdyt? is the issue I'm describing above something that you also think that can technically be hit by that part of the multi-account-containers logic or may I be misreading that logic?

Flags: needinfo?(lgreco) → needinfo?(contact)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: