Open Bug 1769763 Opened 7 months ago Updated 18 days ago

Massive memory use in structured-clone-holder

Categories

(WebExtensions :: General, defect)

Firefox 100
defect

Tracking

(Performance Impact:?)

UNCONFIRMED
Performance Impact ?

People

(Reporter: matias.kiviniemi, Assigned: kmag, NeedInfo)

References

Details

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:100.0) Gecko/20100101 Firefox/100.0

Steps to reproduce:

Opening new Firefox window with Gmail inbox and leave it be unused for ~15minutes. Tried updating to 100.1 but did not help.

Actual results:

Firefox browser uses 4-10GB of memory depending on the moment. If left for hours it went upto 20+GB. Looking at about:memory measure, most of the 10GB at the moment was used by structured-clone-holder.

Main Process (pid 14624)
Explicit Allocations

10,717,631,166 B (100.0%) -- explicit
├──10,264,294,656 B (95.77%) -- dom
│ ├──10,263,966,944 B (95.77%) ── structured-clone-holder [22642]
│ └─────────327,712 B (00.00%) ── event-listener-managers-hash
├─────145,407,808 B (01.36%) ++ heap-overhead
├──────73,629,988 B (00.69%) ++ js-non-window
├──────70,240,848 B (00.66%) ++ window-objects
├──────44,165,224 B (00.41%) ── heap-unclassified
├──────33,074,848 B (00.31%) ++ storage
├──────20,375,888 B (00.19%) ++ startup-cache
├──────13,264,432 B (00.12%) ++ workers/workers(chrome)
├──────11,520,352 B (00.11%) ++ script-preloader
├──────10,746,160 B (00.10%) ++ network
├───────7,695,760 B (00.07%) ++ atoms
├───────4,849,568 B (00.05%) ++ threads
├───────4,458,784 B (00.04%) ++ images
├───────3,047,829 B (00.03%) ── cert-storage/storage
├───────2,754,512 B (00.03%) ++ xpconnect
├───────2,598,192 B (00.02%) ++ layout
├───────1,124,544 B (00.01%) ── cookie-service
├─────────983,312 B (00.01%) ++ cycle-collector
├─────────976,480 B (00.01%) ++ telemetry
├─────────658,224 B (00.01%) ── memory-reporter-manager
├─────────540,944 B (00.01%) ── icu
├─────────487,730 B (00.00%) ++ preferences
├─────────277,584 B (00.00%) ++ gfx
├─────────198,224 B (00.00%) ++ string-bundles
├─────────103,360 B (00.00%) ++ data-storage
├──────────71,056 B (00.00%) ── profiler/profiler-state
├──────────61,403 B (00.00%) ── skia-font-cache
├──────────22,080 B (00.00%) ++ xpcom
├─────────────592 B (00.00%) ── xul-prototype-cache/other
├─────────────512 B (00.00%) ── url-preloader/other
├─────────────272 B (00.00%) ── history-links-hashtable
├───────────────0 B (00.00%) ── media/libogg
└───────────────0 B (00.00%) ── spell-check

Expected results:

Memory use with a few webpages should not be that high.

The Bugbug bot thinks this bug should belong to the 'Core::Gecko Profiler' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Gecko Profiler
Product: Firefox → Core
Component: Gecko Profiler → DOM: Performance
Performance Impact: --- → ?

Who might know about structured-clone-holder?

Component: DOM: Performance → Performance
Flags: needinfo?(sphink)

baku would be my first guess, but there's a good chance the problem is more of keeping a large number of clones lying around, not the structured clone mechanism itself. But maybe baku could shed light onto what could be hanging onto that many clones?

For the reporter: what extensions do you have? This memory usage is extreme and I haven't heard of this symptom before, so I'm trying to think of what might be different in your situation. You can attach the output of about:support using the "Copy raw data to clipboard" button, or just the "Add-ons" portion of that or the text output if you're uncomfortable sharing all of it.

Another request: go to about:telemetry, and send over a copy of what the "Raw JSON" button in the bottom left gives you. Or if you want to send minimal data: choose the "Histogram" section, and mouse over the DESERIALIZE_BYTES histogram. Click the Copy button in the bottom right and put that output here. Do the same with DESERIALIZE_ITEMS. (I'm not sure these will be useful, and I've never looked at a single person's output specifically before. But it would rule out some things, perhaps.)

Flags: needinfo?(sphink)
Flags: needinfo?(matias.kiviniemi)
Flags: needinfo?(amarchesini)

This seems very likely to be webextensions related.

I think there's a bit of terminology/naming collision going on here. The "explicit/dom/structured-clone-holder" is actually tracking StructuredCloneBlob instances per the use of the string in mozilla::dom::StructuredCloneBlob::CollectReports, not StructuredCloneHolder instances which provides DOM object structured (de)serialization support on top of what JS already provides.

StructuredCloneBlob instances are created via the chrome WebIDL StructuredCloneHolder. A searchfox search for StructuredCloneHolder in js files seems to indicate that the mechanism is exclusively used by web-extensions and devtools.

Component: Performance → General
Flags: needinfo?(amarchesini)
Product: Core → WebExtensions

It was called StructuredCloneHolder when I first wrote the patch, but Bill asked me to rename it.

Anyway, the last time I saw something like this happen, an extension was sending a ton of very large messages. I couldn't find anything keeping them alive, but the cycle collector was taking ages to clean them up for some reason. I added code to explicitly drop the structured clone contents and neuter the objects earlier in most cases when they were used for messaging, but not all.

There are some cases where we intentionally keep things alive in memory for storage APIs, though, so it could be that too.

Hi,

Extensions seems to be the issue, specifically Lastpass is the only extension I have (visible in about:preferences) and disabling that one seems to solve the problem :) I'll report to last pass but below are the parts about lastpass in case you want to look for the cause. System should probably be a bit sturdier so that a single addon doesn't take the whole house down. And/or a warning ("This extension is taking a lot of resources") would be nice.

Matias

about:support
{
"name": "LastPass: Free Password Manager",
"type": "extension",
"version": "4.95.0.3",
"isActive": true,
"id": "support@lastpass.com"
},

about:telemetry
"WEBEXT_CONTENT_SCRIPT_INJECTION_MS_BY_ADDONID": {
"support@lastpass.com": {
"bucket_count": 100,
"histogram_type": 0,
"sum": 722,
"range": [
1,
50000
],
"values": {
"37": 0,
"41": 1,
"45": 1,
"50": 3,
"55": 3,
"61": 1,
"67": 1,
"74": 1,
"89": 1,
"98": 0
}
}
}
...
"support@lastpass.com": {
"bucket_count": 100,
"histogram_type": 0,
"sum": 1151,
"range": [
1,
60000
],
"values": {
"964": 0,
"1064": 1,
"1174": 0
}
}
...
"support@lastpass.com": {
"bucket_count": 100,
"histogram_type": 0,
"sum": 263,
"range": [
1,
50000
],
"values": {
"232": 0,
"255": 1,
"281": 0
}
},
...
"browser.ui.toolbar_widgets": {
"library-button_pinned_nav-bar-end": true,
"bookmark_pinned_pageaction-urlbar": true,
"save-to-pocket-button_pinned_nav-bar-end": true,
"fxa-toolbar-menu-button_pinned_nav-bar-end": true,
"stop-reload-button_pinned_nav-bar-start": true,
"managed-bookmarks_pinned_bookmarks-bar": true,
"downloads-button_pinned_nav-bar-end": true,
"personal-bookmarks_pinned_bookmarks-bar": true,
"menu-toolbar_pinned_off": true,
"tabbrowser-tabs_pinned_tabs-bar": true,
"titlebar_pinned_off": true,
"new-tab-button_pinned_tabs-bar": true,
"alltabs-button_pinned_tabs-bar": true,
"sidebar-button_pinned_nav-bar-end": true,
"forward-button_pinned_nav-bar-start": true,
"menubar-items_pinned_menu-bar": true,
"support-lastpass-com_pinned_nav-bar-end": true,
"back-button_pinned_nav-bar-start": true,
"bookmarks-bar_pinned_newtab": true
},
...
"support@lastpass.com": {
"version": "4.95.0.3",
"scope": 1,
"type": "extension",
"updateDay": 19128,
"isSystem": false,
"isWebExtension": true,
"multiprocessCompatible": true,
"blocklisted": false,
"description": "Last Password you will ever need",
"name": "LastPass: Free Password Manager",
"userDisabled": false,
"appDisabled": false,
"foreignInstall": false,
"hasBinaryComponents": false,
"installDay": 18897,
"signedState": 2
},

Flags: needinfo?(matias.kiviniemi)

Hello,

I’m from Webextensions QA and I’ve attempted to reproduce the issue on the latest Release (100.0.2/20220519220738) under Windows 10 x64, as per the provided STR, however without success.

I’ve created a new profile, installed LastPass and logged in. Afterwards, I’ve logged in to Gmail in a new window and waited for ~30 minutes.
Initially, Firefox memory usage was at about 1GB and then started to decrease to 700 MB. Throughout the 30 minutes, small fluctuations of memory usage did occur however nothing as reported – 700 MB to ~1 GB.

So I did not manage to reproduce this issue on my end.

Hi Matias,
could you try to collect a profile using Firefox Profile (https://profiler.firefox.com)? (by activating it before the issue and triggering and keeping it active while the memory usage is starting to grow)

A profile may help us to gather some more details about which internal usage of the StructuredCloneHolder is actually triggering it (I suspect it may be related to NativeMessaging but it would be good to confirm it).

Flags: needinfo?(matias.kiviniemi)

Here's the profile link https://share.firefox.dev/3lXuhA4

Flags: needinfo?(matias.kiviniemi)

(In reply to matias.kiviniemi from comment #9)

Here's the profile link https://share.firefox.dev/3lXuhA4

Thanks a lot Matias!

Based on a quick look to the profile data, it looks like:

  • in the google.com child process, lastpass is calling a function named LP_elt_get_text (defined in its content script named onloadwff.js and injected in all web and file urls) quite a lot and that seems to be already triggering the garbage collector as well as a side-effect every time it is being called: https://share.firefox.dev/3GGrp48

  • from a quick look to the callers of LP_elt_get_text itself, I got the impression that at least some of its caller may be sending the textContent of the entire document.body to the background page over WebExtensions messaging API, which may be quite an amount of data on websites where document.body is growing a lot in size (and it may also explain what may have prevented the issue to be triggered when Alex tried to reproduce it, the reproducibility of the issue may be tied to the size that document.body may be reaching over the time)

    • I'd be curious to know which is the value returned by document.body.textContent.length when executed in the DevTools webconsole tab opened on the mail.google.com tab.
  • in the extension process, last pass background page will be receiving this messages over the WebExtensions messaging API, and there are quire some event delay and garbage collector activity, likely triggered by the amount of data being deserialized: https://share.firefox.dev/3x18vl5

The data sent from the content child process to the extension child process will go through the main process, and so the huge amount of StructuredCloneHolder in the parent process is not unlikely to be a side-effect of this messaging between the content and extension child processes.

Based on what I've observed, I got the impression that the amount of data sent all at once by the LastPass content script over the WebExtensions messaging API may be one of the main reasons for the high memory usage, and that would be something that seems worth investigating on the extension side, and based on that investigation consider alternative approaches that the extension could use to achieve the same goal without hitting this kind issue when the content page is getting too big to be sent across messaging all at once.

Besides reaching out the extension developers, this additional details may help us to try to force the issue to be triggered in a reduced test case, which would make it easier to see if there is anything that could be done on the WebExtensions internals side to reduce the impact, and so I'll take a look into that.

Hi Andreas,
could you help us to reach out LastPass developers?

It would be helpful to confirm and get their perspective about this issue and what I described in comment 10 based on the Firefox profile collected by the reporter.

Flags: needinfo?(awagner)

(In reply to Luca Greco [:rpl] [:luca] [:lgreco] from comment #11)

Hi Andreas,
could you help us to reach out LastPass developers?

It would be helpful to confirm and get their perspective about this issue and what I described in comment 10 based on the Firefox profile collected by the reporter.

Done. I sent a message through the review system pointing to this bug. Hope to hear from them here soon.

Flags: needinfo?(awagner)

Hello from LastPass!

Thank you so much for letting us know about the issue and providing the detailed profiler data (cool app by the way 🙌).

From what I could gather, it looked like a large amount of time (and possibly memory?) was spent in the setprefs / sendCS function in the background page. Inside those functions, we are sending a rather large string (250kB) to the content script on each page load.

I imagine that could be the source of this problem? We will investigate what we can do to remove this from our payload. Thanks!

P.S. You may also be interested to know that we are planning to retire this version of our content script in the near future.

The Firefox Profiler also has some experimental support for memory profiling that you can turn on in about:profiling: JS Allocations and Native allocations. In this case I'd use the second one. This can give some insights about which code triggers some memory that's retained over time.

Take care that this skews the timing data a lot, so don't look at that data too much when Native allocations is enabled.

We can probably also improve this on our side. The StructuredCloneHolder.deserialize method throws away its data unless the keepData argument is true. The extension API tries to leave it false in most of the easy cases, but there are still cases where it isn't always obvious whether it's going to need to be deserialized again, so it passes true. If we're triggering some of those, it may be worth trying to eliminate them.

The severity field is not set for this bug.
:mixedpuppy, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(mixedpuppy)
Flags: needinfo?(mixedpuppy)

The severity field is not set for this bug.
:mixedpuppy, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(mixedpuppy)

I want to chime in and say that this is likely not exclusive to LastPass. I've been experiencing the same issue for weeks/months without LastPass enabled. I have a couple similar extensions though - 1Password, Keeper (and others not as likely to be doing the same types of DOM processing). I've noticed it also on Google properties - typically end up seeing the issue when opening Google Docs, though not consistently.

The Keeper Password Manager extension causes this issue for me. I was seeing regular crashes on of Firefox 102 on Linux (once every day or two with heavy usage), caused by the main process hitting the maximum number of allowed memory mappings. The largest allocation by far in about:memory was structured-clone-holder, and a profile indicated that Keeper was causing those allocations.

Prior to disabling the extension, the mapping count (wc -l /proc/<pid>/mappings) of the main process would usually hover around 10,000 but would frequently (every few minutes during regular usage) spike up to 30,000 or even 50,000, increasing at a rate of a thousand or so per second before decreasing back to the baseline at an even faster rate. The maximum mapping count on Linux by default is around 60,000, so it makes sense that some of the spikes would make it even higher and cause a crash.

After disabling the Keeper extension, the spikes don't happen at all and the mapping count remains stable around 10,000.

Duplicate of this bug: 1799313
Assignee: nobody → kmaglione+bmo
You need to log in before you can comment on or make changes to this bug.