Huge amount of heap-unclassified memory for mozilla::gfx::NativeFontResourceMac::Create() when using drawSnapshot
Categories
(Core :: Graphics: Text, defect, P3)
Tracking
()
People
(Reporter: whimboo, Assigned: bradwerth)
References
(Regression)
Details
(Keywords: memory-footprint, regression)
Attachments
(7 files)
8.34 KB,
text/plain
|
Details | |
380.84 KB,
image/png
|
Details | |
487 bytes,
text/plain
|
Details | |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta-
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta-
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta-
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta-
|
Details | Review |
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:76.0) Gecko/20100101 Firefox/76.0 ID:20200326130624
Via bug 1611467 I reported a massive heap-unclassified usage for fonts when using Google docs including Emojis. Lee fixed everything what he was able to on the Font side. But even after the fix I still see a huge amount of heap-unclassified memory even with no tab for Google docs open for a while, or even forcing a minimize memory in about:memory.
Also clearing the font cache by flipping the gfx.downloadable_fonts.enabled
preference as Lee suggested doesn't help.
So this might indicate that something in CrossProcessPaint keeps that font data alive. I'm running a DMD build and here is one unreported entry of the processed DMD of the parent process:
Unreported {
93 blocks in heap block record 1 of 24,152
248,586,240 bytes (248,398,568 requested / 187,672 slop)
Individual block sizes: 207,929,344; 2,170,880 x 7; 774,144 x 6; 753,664 x 5; 749,568 x 2; 741,376 x 2; 630,784 x 2; 606,208; 438,272 x 3; 421,888 x 2; 417,792; 405,504 x 3; 401,408 x 3; 397,312; 393,216 x 3; 389,120 x 3; 233,472 x 3; 217,088 x 3; 212,992 x 4; 196,608; 167,936 x 3; 139,264 x 3; 106,496; 102,400; 40,960 x 10; 32,768 x 6; 24,576 x 13
23.13% of the heap (23.13% cumulative)
36.34% of unreported (36.34% cumulative)
Allocated at {
#01: __CFDataInit (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x12326a)
#02: mozilla::gfx::NativeFontResourceMac::Create(unsigned char*, unsigned int) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd9621e)
#03: mozilla::gfx::RecordedFontData::PlayEvent(mozilla::gfx::Translator*) const (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd30b73)
#04: std::__1::__function::__func<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0, std::__1::allocator<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0>, bool (mozilla::gfx::RecordedEvent*)>::operator( (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd36e6c)
#05: bool mozilla::gfx::RecordedEvent::DoWithEvent<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader>(mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader&, mozilla::gfx::RecordedEvent::EventType (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd275f9)
#06: mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd23c55)
#07: mozilla::gfx::CrossProcessPaint::ResolveInternal(mozilla::dom::IdType<mozilla::dom::BrowserParent>, nsRefPtrHashtable<nsUint64HashKey, mozilla::gfx::SourceSurface>*) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x10ca7f2)
#08: mozilla::gfx::CrossProcessPaint::MaybeResolve() (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x10ca238)
#09: mozilla::gfx::CrossProcessPaint::ReceiveFragment(mozilla::dom::WindowGlobalParent*, mozilla::gfx::PaintFragment&&) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x10c9e71)
#10: mozilla::MozPromise<mozilla::gfx::PaintFragment, mozilla::ipc::ResponseRejectReason, true>::ThenValue<mozilla::dom::WindowGlobalParent::DrawSnapshotInternal(mozilla::gfx::CrossProcessPaint*, mozilla::Maybe<mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownU (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x27d44b6)
#11: mozilla::MozPromise<mozilla::gfx::PaintFragment, mozilla::ipc::ResponseRejectReason, true>::ThenValueBase::ResolveOrRejectRunnable::Run() (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x9908c5)
#12: nsThread::ProcessNextEvent(bool, bool*) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x104bd7)
#13: NS_ProcessPendingEvents(nsIThread*, unsigned int) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x103082)
#14: nsBaseAppShell::NativeEventCallback() (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x2a6f281)
#15: nsAppShell::ProcessGeckoEvents(void*) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x2ace1dd)
#16: __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x84b21)
#17: __CFRunLoopDoSource0 (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x84ac0)
#18: __CFRunLoopDoSources0 (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x848d4)
#19: __CFRunLoopRun (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x83740)
#20: CFRunLoopRunSpecific (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x82bd3)
#21: RunCurrentEventLoopInMode (/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox + 0x2f65d)
}
}
Matt, do you have an idea what's wrong with that, and why the memory is not getting released?
Comment 1•5 years ago
|
||
Can you send me the full DMD report file please?
I can't see anything obvious that would be leaking from that code.
Reporter | ||
Comment 2•5 years ago
|
||
Sure, I sent it to you via Matrix. Thanks.
Updated•5 years ago
|
Reporter | ||
Comment 3•5 years ago
|
||
Hey Matt. Have you already had the time to check my sent DMD logs?
Reporter | ||
Comment 4•5 years ago
|
||
Here some new numbers after a week of usage. It's 2.4GB of heap-unclassified again!
Unreported {
626 blocks in heap block record 1 of 26,864
2,310,172,672 bytes (2,308,983,516 requested / 1,189,156 slop)
Individual block sizes: 207,929,344 x 10; 7,340,032 x 3; 2,170,880 x 20; 1,142,784 x 4; 811,008 x 3; 806,912 x 2; 774,144 x 53; 753,664 x 29; 749,568 x 4; 741,376 x 5; 630,784 x 6; 606,208 x 2; 557,056; 552,960; 536,576 x 2; 438,272 x 37; 421,888 x 35; 417,792 x 34; 405,504 x 11; 401,408; 393,216 x 13; 389,120 x 13; 323,584 x 3; 233,472 x 8; 217,088 x 6; 212,992 x 7; 208,896; 196,608; 180,224 x 2; 176,128; 172,032 x 2; 167,936 x 6; 139,264 x 13; 131,072 x 2; 110,592 x 4; 106,496 x 5; 102,400; 49,152 x 2; 40,960 x 107; 32,768 x 26; 24,576 x 136; 20,480 x 4
77.76% of the heap (77.76% cumulative)
89.71% of unreported (89.71% cumulative)
Allocated at {
#01: __CFDataInit (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x12326a)
#02: mozilla::gfx::NativeFontResourceMac::Create(unsigned char*, unsigned int) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xdc005e)
#03: mozilla::gfx::RecordedFontData::PlayEvent(mozilla::gfx::Translator*) const (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd56983)
#04: std::__1::__function::__func<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0, std::__1::allocator<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0>, bool (mozilla::gfx::RecordedEvent*)>::operator( (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd5cc9c)
#05: bool mozilla::gfx::RecordedEvent::DoWithEvent<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader>(mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader&, mozilla::gfx::RecordedEvent::EventType (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd4d4ab)
#06: mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0xd49ac5)
#07: mozilla::gfx::CrossProcessPaint::ResolveInternal(mozilla::dom::IdType<mozilla::dom::BrowserParent>, nsRefPtrHashtable<nsUint64HashKey, mozilla::gfx::SourceSurface>*) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x10ed7f2)
#08: mozilla::gfx::CrossProcessPaint::MaybeResolve() (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x10ed19b)
#09: mozilla::gfx::CrossProcessPaint::ReceiveFragment(mozilla::dom::WindowGlobalParent*, mozilla::gfx::PaintFragment&&) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x10ecdd1)
#10: mozilla::MozPromise<mozilla::gfx::PaintFragment, mozilla::ipc::ResponseRejectReason, true>::ThenValue<mozilla::dom::WindowGlobalParent::DrawSnapshotInternal(mozilla::gfx::CrossProcessPaint*, mozilla::Maybe<mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownU (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x27d0b16)
#11: mozilla::MozPromise<mozilla::gfx::PaintFragment, mozilla::ipc::ResponseRejectReason, true>::ThenValueBase::ResolveOrRejectRunnable::Run() (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x99dee5)
#12: nsThread::ProcessNextEvent(bool, bool*) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x1052eb)
#13: NS_ProcessPendingEvents(nsIThread*, unsigned int) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x103782)
#14: nsBaseAppShell::NativeEventCallback() (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x2a6bfe1)
#15: nsAppShell::ProcessGeckoEvents(void*) (/Users/henrik/code/gecko/Firefox Nightly.app/Contents/MacOS/XUL + 0x2acf26d)
#16: __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x84b21)
#17: __CFRunLoopDoSource0 (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x84ac0)
#18: __CFRunLoopDoSources0 (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x848d4)
#19: __CFRunLoopRun (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x83740)
#20: CFRunLoopRunSpecific (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x82bd3)
#21: RunCurrentEventLoopInMode (/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox + 0x2f65d)
}
}
Reporter | ||
Comment 5•5 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #4)
Here some new numbers after a week of usage. It's 2.4GB of heap-unclassified again!
2,310,172,672 bytes (2,308,983,516 requested / 1,189,156 slop)
4 days later this is now at:
4,754.40 MB (100.0%) -- explicit
├──3,845.64 MB (80.89%) ── heap-unclassified
Reporter | ||
Comment 6•5 years ago
|
||
Matt mentioned to me that this might be related to the screenshot code, which might be relevant to thumbnail generation. What I can say is that I'm using the tab panel (browser.ctrlTab.mostRecentlyUsed=true
), which shows thumbnails for the pages when switching between tabs. Maybe that might be related here.
Reporter | ||
Comment 7•5 years ago
|
||
Since I'm using a new DMD nightly build of Firefox, which contains the fixes from bug 1628419, it seems that I'm no longer able to reproduce this problem. I will keep poking the next days.
Reporter | ||
Comment 8•5 years ago
|
||
The problem still persists:
1,685.49 MB (100.0%) -- explicit
├──1,030.86 MB (61.16%) ── heap-unclassified
I'm going to disable ctrlTab now to see if that is causing the problem.
Reporter | ||
Comment 9•5 years ago
|
||
I'm running this build for 5 days now without having Ctrl+Tab
enabled, and here is the result:
1,005.84 MB (100.0%) -- explicit
[..]
├────172.16 MB (17.12%) ── heap-unclassified
I will now re-enable it just to verify that memory usage for heap-unclassified goes up again.
Dao, maybe you already have an idea?
Reporter | ||
Comment 10•5 years ago
|
||
Since I enabled the Ctrl+Tab feature about 3h ago the heap unclassified usage has increased a lot:
1,219.09 MB (100.0%) -- explicit
├────383.33 MB (31.44%) ── heap-unclassified
Note that this feature has been enabled by default on bug 1473595 for Firefox 63.
Comment 11•5 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #10)
Since I enabled the Ctrl+Tab feature about 3h ago the heap unclassified usage has increased a lot:
1,219.09 MB (100.0%) -- explicit
├────383.33 MB (31.44%) ── heap-unclassified
How many open tabs did you have at that point?
Reporter | ||
Comment 12•5 years ago
|
||
I have about 30 open tabs in that browser instance. Since my last comment I mostly opened new ones (mainly Bugzilla), made comments and switching between tabs, and closed those new tabs.
Reporter | ||
Updated•5 years ago
|
Comment 13•5 years ago
|
||
This might be due to this perf optimization: https://searchfox.org/mozilla-central/rev/d2cec90777d573585f8477d5170892e5dcdfb0ab/browser/base/content/browser-ctrlTab.js#51-52
Comment 14•5 years ago
|
||
I had a play with this locally, and I do see growing heap-unclassified memory (with browser.ctrlTab.recentlyUsedOrder=true, and ctrl-tabbing lots, plus tab changes).
I don't see us accumulating FontResourceMac or UnscaledFontMac though, so I don't think this is the screenshot code holding onto those objects longer than necessary.
Lee, is this possibly related to other bugs where we're holding onto Mac fonts for too long?
Comment 15•5 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #14)
I had a play with this locally, and I do see growing heap-unclassified memory (with browser.ctrlTab.recentlyUsedOrder=true, and ctrl-tabbing lots, plus tab changes).
I don't see us accumulating FontResourceMac or UnscaledFontMac though, so I don't think this is the screenshot code holding onto those objects longer than necessary.
Lee, is this possibly related to other bugs where we're holding onto Mac fonts for too long?
The problem is CrossProcessPaint which keeps alive draw target recordings indefinitely, which have to hold onto the fonts because the recording is still alive.
Comment 16•5 years ago
|
||
(In reply to Lee Salzman [:lsalzman] from comment #15)
The problem is CrossProcessPaint which keeps alive draw target recordings indefinitely, which have to hold onto the fonts because the recording is still alive.
Do you have STR for reproducing that?
In my testing all CrossProcessPaint objects are deleted immediately after we take a snapshot, which releases all contained FontResourceMac/UnscaledFontMac objects that were created during their lifetime.
Comment 17•5 years ago
|
||
Henrik, do you have a quick STR that reliably reproduces the heap-unclassified bloat?
Reporter | ||
Comment 18•5 years ago
|
||
My steps are similar to those from Matt in comment 14. I'm using my daily profile for that, so with sessionstore I already have about 30 tabs open. Then I just hold Ctrl+Shift so that the panel opens, keep it holding to let it cycle through the displayed tabs, and select one. Maybe repeat by opening and closing tabs, so that the list of tabs gets updated.
I just triggered another DMD build in case someone also wants to use one:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0ebcd12d44561f35ce58b6d873101b1b1ba7f172
And that is the command I use to run it:
DMD=1 Firefox\ Nightly.app/Contents/MacOS/firefox --mode=dark-matter --stack=full
So what do I have to look out for? Any line of mozilla::gfx::NativeFontResourceMac::Create
in the unreported section? I mean the amount of memory is low at the beginning and raises, so what would be a threshold?
Comment 19•5 years ago
|
||
You can check how many thumbnails are kept in memory running this in the browser console:
gBrowser.tabs.filter(t => t.__thumbnail).length
Reporter | ||
Comment 20•5 years ago
|
||
(In reply to Dão Gottwald [::dao] from comment #19)
gBrowser.tabs.filter(t => t.__thumbnail).length
Checked that and there are currently 14 thumbnails cached. When I checked the type of __thumbnail
I can see that those are not only img
but also of type canvas
. Is that expected?
This is for:
Unreported {
357 blocks in heap block record 1 of 24,886
523,554,816 bytes (522,863,264 requested / 691,552 slop)
..
Allocated at {
#01: __CFDataInit (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x12326a)
#02: mozilla::gfx::NativeFontResourceMac::Create(unsigned char*, unsigned int) (/Users/henrik/code/gecko/Firefox Nightly DMD.app/Contents/MacOS/XUL + 0xdcf01e)
#03: mozilla::gfx::RecordedFontData::PlayEvent(mozilla::gfx::Translator*) const (/Users/henrik/code/gecko/Firefox Nightly DMD.app/Contents/MacOS/XUL + 0xd65323)
#04: std::__1::__function::__func<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0, std::__1::allocator<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0>, bool (mozilla::gfx::RecordedEvent*)>::operator( (/Users/henrik/code/gecko/Firefox Nightly DMD.app/Contents/MacOS/XUL + 0xd6b63c)
#05: bool mozilla::gfx::RecordedEvent::DoWithEvent<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader>(mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader&, mozilla::gfx::RecordedEvent::EventType (/Users/henrik/code/gecko/Firefox Nightly DMD.app/Contents/MacOS/XUL + 0xd5be4b)
#06: mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long) (/Users/henrik/code/gecko/Firefox Nightly DMD.app/Contents/MacOS/XUL + 0xd58465)
#07: mozilla::gfx::CrossProcessPaint::ResolveInternal(mozilla::dom::IdType<mozilla::dom::BrowserParent>, nsRefPtrHashtable<nsUint64HashKey, mozilla::gfx::SourceSurface>*) (/Users/henrik/code/gecko/Firefox Nightly DMD.app/Contents/MacOS/XUL + 0x10f6862)
#08: mozilla::gfx::CrossProcessPaint::MaybeResolve() (/Users/henrik/code/gecko/Firefox Nightly DMD.app/Contents/MacOS/XUL + 0x10f620b)
#09: mozilla::gfx::CrossProcessPaint::ReceiveFragment(mozilla::dom::WindowGlobalParent*, mozilla::gfx::PaintFragment&&) (/Users/henrik/code/gecko/Firefox Nightly DMD.app/Contents/MacOS/XUL + 0x10f5e41)
Comment 21•5 years ago
|
||
For what it is worth, I'm also seeing gigs of heap-unclassified in the main process on OSX.
Comment 22•5 years ago
|
||
I also have large heap-unclassified on OSX on the main process, but I don't use ctrl-tab.
Reporter | ||
Comment 23•5 years ago
|
||
Thanks. Maybe the part with Ctrl+Tab is a red herring. But my question for Dao still stands above. I don't think that we should store canvas elements as thumbnails...
Otherwise could one of you also use a DMD build of Firefox for a while? I just triggered a new build for MacOS with DMD enabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8b0c3c0957236388c8ed8089856c30f51e249308
It would be good to see if you all can see the same unreported memory as I do.
Comment 24•4 years ago
|
||
Because this bug's Severity has not been changed from the default since it was filed, and it's Priority is P3
(Backlog,) indicating it has been triaged, the bug's Severity is being updated to S3
(normal.)
Reporter | ||
Comment 25•4 years ago
|
||
Bug 1635681 sounds interesting because I'm also keeping Firefox open for days and use the sleep mode a lot.
Comment 26•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #20)
(In reply to Dão Gottwald [::dao] from comment #19)
gBrowser.tabs.filter(t => t.__thumbnail).length
Checked that and there are currently 14 thumbnails cached. When I checked the type of
__thumbnail
I can see that those are not onlyimg
but also of typecanvas
. Is that expected?
Yes, that's expected. If you want to give it a try, here's a build with the canvas caching disabled: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3ff2f7d149a18e64baefe5f8bf7e1ce079a07ba9
Comment 27•4 years ago
|
||
I'm seeing something similar, with about 200MB of heap-unclassified under mozilla::gfx::NativeFontResourceMac::Create(). I don't have any Ctrl-Tab thing enabled. I do use Command-` to switch between windows, if that matters. This is with WebRender and Fission force-enabled.
I also have hundreds of MB of web render heap-unclassified, under WebRenderBridgeParent::UpdateResources and the like, but maybe that should be a separate issue.
Also, it would be nice if somebody wrote a memory reporter for this font data so people don't have to bend over backwards with a DMD build to understand what is happening. I've seen a number of reports of high main process memory on OSX, and I can't tell if they are related or not.
Comment 28•4 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #27)
I'm seeing something similar, with about 200MB of heap-unclassified under mozilla::gfx::NativeFontResourceMac::Create().
Filed bug 1639385 for getting this reported.
Reporter | ||
Comment 29•4 years ago
|
||
Ok, so given that no-one else uses Ctrl+Tab but still see this problem, we should update the bug's summary to not include this specific feature in Firefox.
Reporter | ||
Comment 30•4 years ago
|
||
Is there a way to give an web extension access to the heap unclassified usage? Even through an experimental API. That way we could easily show the value in the toolbar for easier inspection. So something similar as what my https://github.com/mozilla/memchaser extension did in the past. This might help to further investigate possible steps because the value is always in the view, and it doesn't force you to go to about:memory
and create a report.
Reporter | ||
Updated•4 years ago
|
Reporter | ||
Comment 31•4 years ago
|
||
With having the memory reporter for NativeFontResourceMac now, the verbose logging shows 0% for it, while there is still a good amount of heap unclassified:
1,019,141,922 B (100.0%) -- explicit
├────307,251,713 B (30.15%) ── heap-unclassified
├────257,523,664 B (25.27%) -- gfx
│ ├──257,204,224 B (25.24%) ── heap-textures
│ ├──────178,864 B (00.02%) ── font-list
│ ├───────87,408 B (00.01%) ── font-charmaps
│ ├───────34,416 B (00.00%) ── font-cache
│ ├───────18,752 B (00.00%) ── font-shaped-words
│ └────────────0 B (00.00%) ── native-font-resource-data
What could be done next here?
Comment 32•4 years ago
|
||
I don't know if it is the same issue, but I'm still seeing very high heap-unclassified on OSX with WebRender enabled. Yesterday I had about 3.4GB of it. I'll try to run with DMD at some point.
Comment 33•4 years ago
|
||
I had 3.6GB of heap-unclassified in the parent process on OSX just now. I've restarted with DMD enabled so we'll see if that turns up anything informative.
Comment 34•4 years ago
|
||
Lee, is there any further work that can be done here? This is a ton of memory that's being used, and your other patch doesn't seem to have fixed it. Let me know if I can provide further information. Thanks.
Comment 35•4 years ago
•
|
||
(In reply to Andrew McCreight [:mccr8] from comment #34)
Lee, is there any further work that can be done here? This is a ton of memory that's being used, and your other patch doesn't seem to have fixed it. Let me know if I can provide further information. Thanks.
I've already spent a lot of time doing a ton of investigation and speculative fixes for this by now that I don't know there's anything left that can be done. If someone else wants to dig further and see if they can find something I missed, they are welcome to try.
Comment 36•4 years ago
|
||
I'll see if I can come up with some actual steps to reproduce.
I tried disabling Fission to see if that makes it worse. I've now had my Firefox session open for about a day, and there's 1.5GB of heap-unclassified in the parent process.
Reporter | ||
Comment 37•4 years ago
|
||
I updated this morning to the latest Nightly. And since then (~12h) I put my MacBook a couple of times into hibernation mode (similar to bug 1635681?), and used Gmail, and Treeherder quite a fair amount of time (note that the latter is a memory / performance hog for me). Maybe one of these is related.
I wished that my memchaser extension would work with the WebExtension API and a button in the toolbar could visualize the memory usage. I always forget to regularly check about:memory.
Reporter | ||
Comment 38•4 years ago
|
||
Since my last comment no hibernation took place and I basically continued work in Gmail and Bugzilla. Unclassified memory went further up 200MB in that short time-span and now has reached: 954.01 MB (62.02%) ── heap-unclassified
Reporter | ||
Comment 39•4 years ago
|
||
FYI I was running Firefox in safe mode today with a single hibernation, and I can still see the same behavior.
Reporter | ||
Comment 40•4 years ago
|
||
As Jonathan pointed out on Matrix one of my recent observations shows sizes of 6 times the Apple Color Emoji font. Maybe we stash the file instead of the file descriptor into the shared memory?
Unreported {
104 blocks in heap block record 1 of 32,100
1,297,989,632 bytes (1,297,792,900 requested / 196,732 slop)
Individual block sizes: 207,929,344 x 6; 2,170,880 x 9; 819,200; 811,008 x 4;
Here the font file on the local disk:
➜ ls -l /System/Library/Fonts/Apple\ Color\ Emoji.ttc
-rw-r--r-- 1 root wheel 207931344 Apr 17 23:10 /System/Library/Fonts/Apple Color Emoji.ttc
Reporter | ||
Comment 41•4 years ago
|
||
I just experienced that issue again:
5,159.63 MB (100.0%) -- explicit
├──4,256.90 MB (82.50%) ── heap-unclassified
This is unbelievable high and also kinda freezes not only Firefox but any other application on my MBP too. I see the beach ball a lot and responses of the ui take forever.
I attached a DMD report of the unreported memory and as it can be seen we have the 208MB memory block 12 times in memory! If it's really the emoticon font file I might know where it actually comes from. Earlier today I tried to find some emojis on Slack. After a restart I will try again and clearly keep an eye out for this website.
Comment 42•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #41)
This is unbelievable high and also kinda freezes not only Firefox but any other application on my MBP too. I see the beach ball a lot and responses of the ui take forever.
This sounds very much like what I see when I encounter bug 1652631. If you run a copy of Firefox from the command line during this period it would output the error message to the terminal if it was happening on your system.
Reporter | ||
Comment 43•4 years ago
|
||
No, I don't see this failure listed in my console and I always run Nightly through it to enable DMD mode.
Comment 45•4 years ago
|
||
Bug 1669286 should help diagnose this a bit.
Comment 46•4 years ago
|
||
Also, it looks like the OS will hold on to font stuff without us wanting it. The private API CTFontRemoveFromCaches fixes this.
Comment 47•4 years ago
|
||
Henrik the symbols in your DMD report look wrong. Also, can you check your about:memory in a Nightly at least as new as 20201007094223 and see if heap unclassified is still high? If it's high what's amount in gfx.webrender.weak-fonts?
Reporter | ||
Comment 48•4 years ago
|
||
I was running without webrender the last days. So had to restart Firefox. I will let you know early next week how it looks like for me.
Reporter | ||
Comment 49•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #47)
Henrik the symbols in your DMD report look wrong.
What specifically do you mean with wrong? That's the output from running dmd.py, so maybe there's a problem with the script?
Also, can you check your about:memory in a Nightly at least as new as 20201007094223 and see if heap unclassified is still high? If it's high what's amount in gfx.webrender.weak-fonts?
I'm now using Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:83.0) Gecko/20100101 Firefox/83.0 ID:20201009041754 with webrender enabled for a while, and I still see high heap unclassified memory. Right now it's around 50% and not 80% - which is still not less:
1,579.78 MB (100.0%) -- explicit
├────777.15 MB (49.19%) ── heap-unclassified
├────255.91 MB (16.20%) -- gfx
│ ├──245.82 MB (15.56%) -- webrender
│ │ ├──230.35 MB (14.58%) -- resource-cache
│ │ │ ├──221.20 MB (14.00%) ── weak-fonts
Note that minimizing memory doesn't free up the memory as used by weak-fonts.
Here the details for the largest unreported memory block from the DMD build:
Unreported {
112 blocks in heap block record 1 of 30,133
454,107,136 bytes (453,890,748 requested / 216,388 slop)
Individual block sizes: 207,929,344 x 2; 2,170,880 x 4; 819,200; 811,008 x 2; 806,912; 774,144 x 9; 753,664 x 4; 749,568; 741,376 x 2; 630,784 x 2; 557,056; 552,960; 536,576; 438,272 x 6; 421,888 x 2; 417,792 x 2; 405,504 x 4; 233,472 x 2; 217,088 x 3; 212,992 x 3; 196,608 x 2; 167,936 x 3; 139,264 x 3; 131,072; 110,592; 106,496 x 3; 102,400 x 2; 40,960 x 18; 32,768 x 6; 24,576 x 20
34.49% of the heap (34.49% cumulative)
55.41% of unreported (55.41% cumulative)
Allocated at {
#01: __CFDataInit (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x1221f9)
#02: DumpCompleteHeap (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x3250376)
#03: DumpCompleteHeap (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x320f123)
#04: DumpCompleteHeap (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x32125bf)
#05: DumpCompleteHeap (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x320776c)
#06: DumpCompleteHeap (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x3205aed)
#07: mozilla_dump_image (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x341b640)
#08: mozilla_dump_image (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x341b12b)
#09: mozilla_dump_image (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x341ac73)
#10: mac_plugin_interposing_child_OnShowCursor (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x4759ddd)
#11: NS_NewLocalFileWithCFURL (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x30c24d0)
#12: NS_NewLocalFileWithFSRef (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x4019f7)
#13: NS_NewLocalFileWithFSRef (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x3fdb5c)
#14: nsXPTCStubBase::Stub249() (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0xfda050)
#15: __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x83d52)
#16: __CFRunLoopDoSource0 (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x83cf1)
#17: __CFRunLoopDoSources0 (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x83b0b)
#18: __CFRunLoopRun (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x8283a)
#19: CFRunLoopRunSpecific (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x81e3e)
#20: RunCurrentEventLoopInMode (/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox + 0x2fabd)
#21: ReceiveNextEventCommon (/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox + 0x2f7d5)
}
}
But I question this output...
Comment 50•4 years ago
|
||
Yeah, it looks like you're missing symbols for XUL. Still it's interesting to see weak-fonts relatively high. I wonder if there's something we can do to encourage the OS to release them.
Comment 51•4 years ago
|
||
Nightlies no longer come with symbols, see bug 1667170. (I'm surprised DMD can be used on regular Nightlies, I thought you still needed a custom local build for that. Cool!)
Updated•4 years ago
|
Comment 52•4 years ago
|
||
So here's my understanding of what's happening:
- When we create a CTFont from a CGFont, a descriptor that holds on to the CGFont gets put in an NSCache
- We create a lot of CTFonts from CGFonts with WebRender (we have a separate CGFont for each thread that does glyph rasterization)
- This cache is cleared when you call CTFontRemoveFromCaches(nullptr)
- The cache will also be pruned on memory pressure. This prevents it totally exhausting memory on the system.
If we create the CGFont from the CTFont we don't have this problem. WebKit and Chromium both do things this way from what I understand.
Comment 53•4 years ago
|
||
Henrik, can you try today's nightly and report how much heap-unclassified and weak-fonts you have?
Reporter | ||
Comment 54•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #53)
Henrik, can you try today's nightly and report how much heap-unclassified and weak-fonts you have?
After running for a day I still have a lot of heap-unclassified memory, but at least no weak-fonts! \o/
972.03 MB (100.0%) -- explicit
├──390.65 MB (40.19%) ── heap-unclassified
[..]
├───32.55 MB (03.35%) -- gfx
│ ├──27.30 MB (02.81%) -- webrender
│ │ ├──18.32 MB (01.88%) -- resource-cache
│ │ │ ├──18.32 MB (01.88%) ── fonts
│ │ │ └───0.00 MB (00.00%) ++ (3 tiny)
When I find the time today I will create a new DMD report.
Reporter | ||
Comment 55•4 years ago
|
||
Here the current DMD report:
Unreported {
97 blocks in heap block record 1 of 27,158
246,390,784 bytes (246,203,824 requested / 186,960 slop)
Individual block sizes: 207,929,344; 7,340,032; 2,170,880 x 4; 774,144 x 4; 753,664 x 3; 688,128; 630,784; 438,272 x 8; 421,888 x 7; 417,792 x 7; 405,504 x 7; 389,120; 372,736; 233,472; 217,088; 212,992; 167,936; 139,264 x 4; 40,960 x 16; 32,768 x 8; 28,672; 24,576 x 18
34.84% of the heap (34.84% cumulative)
59.10% of unreported (59.10% cumulative)
Allocated at {
#01: __CFDataInit (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x1221f9)
#02: mozilla::gfx::NativeFontResourceMac::Create(unsigned char*, unsigned int) (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x3262166)
#03: mozilla::gfx::RecordedFontData::PlayEvent(mozilla::gfx::Translator*) const (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x3220b73)
#04: std::__1::__function::__func<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0, std::__1::allocator<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0>, bool (mozilla::gfx::RecordedEvent*)>::operator( (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x3223f8f)
#05: bool mozilla::gfx::RecordedEvent::DoWithEvent<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader>(mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader&, mozilla::gfx::RecordedEvent::EventType (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x3218f8c)
Comment 56•4 years ago
|
||
Ok, yeah, that's probably more of the same problem. It's the blob image version.
Reporter | ||
Comment 57•4 years ago
|
||
After 5 more days it still looks better than before:
1,164.78 MB (100.0%) -- explicit
├────461.66 MB (39.64%) ── heap-unclassified
(In reply to Jeff Muizelaar [:jrmuizel] from comment #56)
Ok, yeah, that's probably more of the same problem. It's the blob image version.
Mind explaining that a bit more? What is the blob image version?
Comment 58•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #57)
Mind explaining that a bit more? What is the blob image version?
Bug 1672088, which has the blob image version of this. Can you get a DMD report from today's Nighty and check if things have improved further?
Reporter | ||
Comment 59•4 years ago
|
||
Sure. Give me some time. I will report on Friday.
Reporter | ||
Comment 60•4 years ago
|
||
Tested with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:84.0) Gecko/20100101 Firefox/84.0 ID:20201021095128
935.64 MB (100.0%) -- explicit
├──367.44 MB (39.27%) ── heap-unclassified
The DMD report still shows the same stack. Here the top 5 frames of the report with updated addresses:
#01: __CFDataInit (/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation + 0x1221f9)
#02: mozilla::gfx::NativeFontResourceMac::Create(unsigned char*, unsigned int) (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x2bf3369)
#03: mozilla::gfx::RecordedFontData::PlayEvent(mozilla::gfx::Translator*) const (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x2bb22f3)
#04: std::__1::__function::__func<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0, std::__1::allocator<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::$_0>, bool (mozilla::gfx::RecordedEvent*)>::operator( (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x2bb570f)
#05: bool mozilla::gfx::RecordedEvent::DoWithEvent<mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader>(mozilla::gfx::InlineTranslator::TranslateRecording(char*, unsigned long)::MemReader&, mozilla::gfx::RecordedEvent::EventType (/Applications/FirefoxNightly.app/Contents/MacOS/XUL + 0x2baa70c)
Note that it was working fine for about a day but right now when I checked again the heap-unclassified jumped from around 20 to these 40%.
Comment 61•4 years ago
|
||
I also see this stack as the top "unreported" stack in a DMD report, from a build that was only running for a few hours.
Comment 62•4 years ago
|
||
I did some investigation and it looks like we are leaking fonts in a similar way with WebRender off. That being said, the leaks only happen in the content processes instead of with WebRender where it happens in child and parent processes.
Reporter | ||
Comment 63•4 years ago
|
||
This morning while working a while without AC the heap-unclassified raised again to a huge value:
2,258.29 MB (100.0%) -- explicit
├──1,439.03 MB (63.72%) ── heap-unclassified
There is the mozilla::gfx::NativeFontResourceMac::Create
block reported twice with each around 650MB. Not sure what actually triggered that.
Comment 64•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #63)
This morning while working a while without AC the heap-unclassified raised again to a huge value:
2,258.29 MB (100.0%) -- explicit ├──1,439.03 MB (63.72%) ── heap-unclassified
There is the
mozilla::gfx::NativeFontResourceMac::Create
block reported twice with each around 650MB. Not sure what actually triggered that.
I'd be really interested to know what triggers this jump. Do you have any guess what it could be?
Reporter | ||
Comment 65•4 years ago
|
||
I did a fair amount of test failure investigation via Treeherder today including filtering jobs, opening the log viewer, and copying/pasting content to Bugzilla. That's what I usually do only on Monday due to the incoming intermittent bug comments. Otherwise no idea.
Reporter | ||
Comment 66•4 years ago
•
|
||
And with a restarted instance of Firefox yesterday, I'm back at over 70% heap-unclassified again (webrender enabled):
1,378.27 MB (100.0%) -- explicit
├──1,035.95 MB (75.16%) ── heap-unclassified
Sadly I missed to start Firefox with DMD enabled, so that I cannot provide more details.
Reporter | ||
Comment 67•4 years ago
|
||
Maybe related here... I had https://addons.mozilla.org/en-US/firefox/addon/no-transition/ installed all the last months and disabled it yesterday because lots of sites are broken when CSS transitions are disabled. Maybe that opened up some other leaky code paths for me.
Reporter | ||
Comment 68•4 years ago
|
||
I can confirm that with the changes from bug 1672899 nothing has changed. Having a memory reporter would be indeed very helpful.
Reporter | ||
Comment 69•4 years ago
|
||
Jeff, do you have an idea meanwhile when this could have been regressed, especially for non-webrender? I wonder if some of the already landed patches might need an uplift to at least 83. Probably not given that the still to fix case seems to be the most important one.
Comment 70•4 years ago
|
||
Do we have evidence that this regressed for non-WebRender?
Reporter | ||
Comment 71•4 years ago
|
||
I'm fairly sure that I haven't seen around 80% of heap-unclassified usage in the past. But it would be hard to do such a regression check given that it's a fair amount of time someone would have to spent.
Comment 72•4 years ago
|
||
Based on when I saw high heap-unclassified I suspect that changes in macOS caused my problem. My recollection is that I started seeing the problem as soon as I started using a new macbook pro, which has a newer version of macOS on it then my older macbook pro (I'm a little slow/cautious to update macOS). I guess the other possibility is that changes in the hardware (gfx drivers?) could be another change between the good and bad configurations that I observed.
Comment 73•4 years ago
|
||
Henrik, are you running a new enough build that you have bug 1674175? If so can you share how much heap-unclassified you have, a long with native-font-resource-mac?
Reporter | ||
Comment 74•4 years ago
•
|
||
Yeah, I'm running it for a while and just this evening I now hit the following:
1,320.11 MB (100.0%) -- explicit
├────450.01 MB (34.09%) -- gfx
│ ├──420.90 MB (31.88%) ── native-font-resource-mac
│ ├───23.22 MB (01.76%) ++ webrender
│ └────5.88 MB (00.45%) ++ (5 tiny)
├────370.71 MB (28.08%) ── heap-unclassified
I'm fairly sure that this might be related to IRCCloud or Matrix. Both tabs I haven't used at all over the weekend and earlier today. Until then I didn't see a jump of both. But now after reading through the notifications in both tabs, the memory usage seems to have been jumped.
And as it can be seen a quite fair amount of memory is used by native font resources.
Reporter | ||
Comment 75•4 years ago
•
|
||
Do we actually have the capability to add the kind of native font that has been created? Similar to user-fonts
, which can be expanded and show the font family, the URL, and the principal.
Reporter | ||
Comment 77•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #76)
We could possibly do that.
Filed as bug 1674868. Until then I will keep an eye on the Matrix and IRCCloud tabs.
Comment 78•4 years ago
|
||
It looks CoreText has an LRU cache of 12 descriptors per thread.
For the numbers to get as big as they are it suggests that fonts are very large instead of us accumulating lots of fonts.
Comment 79•4 years ago
|
||
So this is almost certainly caused by the tab thumbnails used during ctrl+tab. If one of the tabs uses the apple emjoi font we can end up with it in a a recording.
Reporter | ||
Comment 80•4 years ago
|
||
Oh wow. So my assumption from around 7 months ago is actually correct, and indeed I was just able to reproduce the problem. I measured memory before and after opening the Ctrl+Tab panel, and native font resources climbed from 0 bytes up to 204 MB:
1,007.35 MB (100.0%) -- explicit
├────231.37 MB (22.97%) -- gfx
│ ├──204.50 MB (20.30%) ── native-font-resource-mac
Reporter | ||
Comment 81•4 years ago
|
||
Reporter | ||
Comment 82•4 years ago
|
||
The tab that is actually causing this problem is the notification center from Github:
https://github.com/notifications?query=is%3Aunread
I assume the emojis as used under filter
are causing this.
Reporter | ||
Comment 83•4 years ago
|
||
This is actually a regression that I can perfectly replicate now! By using ESR68 it works all fine, but 78ESR is broken. I will run mozregression to find out the causing change.
Reporter | ||
Comment 84•4 years ago
•
|
||
This is a year old regression:
7:46.35 INFO: Narrowed nightly regression window from [2019-11-04, 2019-11-07] (3 days) to [2019-11-06, 2019-11-07] (1 days) (~0 steps left)
7:46.35 INFO: Got as far as we can go bisecting nightlies...
7:46.35 INFO: Last good revision: 96b58f95ed7333672e6dba134d091015328d299b (2019-11-06)
7:46.35 INFO: First bad revision: 7748cc7e9b63d86a40eb4799c0830172da579c84 (2019-11-07)
7:46.35 INFO: Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=96b58f95ed7333672e6dba134d091015328d299b&tochange=7748cc7e9b63d86a40eb4799c0830172da579c84
There is only bug 1576911 (Page thumbnailer should use Fission for messaging) for thumbnail related changes that could have caused that. Gijs, given that you mentored that bug, what do you think? Sadly mozregression cannot continue given that it cannot find any further builds from autoland (bug 1674970).
Reporter | ||
Comment 85•4 years ago
|
||
By manually testing d271c572a9bcd008ed14bf104b2eb81949952e4c
it's indeed the Nightly containing the above change:
Reporter | ||
Comment 86•4 years ago
|
||
Also note that with builds having this regression the generation of the thumbnail takes a moment, and there is a delay after opening Ctrl-Tab and getting the thumbnail displayed. Before the change the thumbnail was presented right away.
Reporter | ||
Comment 87•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #84)
There is only bug 1576911 (Page thumbnailer should use Fission for messaging) for thumbnail related changes that could have caused that.
That change actually switched from drawImage() to drawSnapshot(). So I assume some code in GFX is leaking here.
Matt, can you have a look again?
Reporter | ||
Comment 88•4 years ago
|
||
The same also happens when using Marionette. So it's not related to Ctrl+Tab but the drawSnapshot
API.
Reporter | ||
Comment 89•4 years ago
|
||
I spoke with Matt on Matrix and as he mentioned we should wait for Jeff's investigation.
Reporter | ||
Comment 90•4 years ago
|
||
Updating to bug 1499845 which actually covered the implementation of the drawSnapshot API.
Updated•4 years ago
|
Comment 91•4 years ago
|
||
I was able to get the memory usage as high as 5,182.07MB which is more than the 12x200MB that I would expect as the maximum from the descriptor cache. This suggest that there's something else going on to cause the leak.
Comment 92•4 years ago
|
||
Matt, the best way to mitigate this problem right now is to avoid including system fonts in the recording that we use for thumbnails. This is what we do with WebRender and blob images.
Comment 93•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #91)
I was able to get the memory usage as high as 5,182.07MB which is more than the 12x200MB that I would expect as the maximum from the descriptor cache. This suggest that there's something else going on to cause the leak.
That being said, commenting out the body of DrawTargetSkia::DrawGlyphs is still sufficient to avoid the leak.
Comment 94•4 years ago
|
||
If I comment out DrawTargetSkia::DrawGlyphs but add a CFRelease(CTFontCopyFontDescriptor(ctFont)); to NativeFontResourceMac::Create I can get the leak to go as high as 12x200MB but not higher as would be expected from the descriptor cache. I suspect the other cache is the Skia strike cache. If I "Minimize memory usage" (which clears that cache) my native-font-resource-mac memory usage drops down to the more expected 1,986.84 MB.
All of this means, we're probably not actually leaking, we just have a couple of caches that keep copies of a 200MB font file alive.
Reporter | ||
Comment 95•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #91)
I was able to get the memory usage as high as 5,182.07MB which is more than the 12x200MB that I would expect as the maximum from the descriptor cache. This suggest that there's something else going on to cause the leak.
Not sure what I'm about to do here. Let me know specifics in case you still need some feedback from me.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #94)
All of this means, we're probably not actually leaking, we just have a couple of caches that keep copies of a 200MB font file alive.
Is that only a problem on Mac, or do other platforms also have those caches? Also why do we cache the same font multiple times? Is it because of the different content processes? If yes, could it be maybe cached via shmem?
Comment 96•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #95)
Is that only a problem on Mac, or do other platforms also have those caches? Also why do we cache the same font multiple times? Is it because of the different content processes? If yes, could it be maybe cached via shmem?
Mac is the only place where we we're likely to run into a commonly used 200MB font.
The mitigation to this high memory usage is to record the system fonts in the recording by name instead of by contents. This way we'll only have one Apple Emoji font.
Updated•4 years ago
|
Reporter | ||
Comment 97•4 years ago
|
||
Jeff, would bug 1625476 be similar to the caching as you described above? At least it sounds like to me.
Comment 98•4 years ago
|
||
Some background:
There are two systems in place for handling fonts when making a recording.
- External fonts - This is what WebRender uses. WebRender creates a font key when a font is first used and this key is maintained between the content process and the gpu process. This key persists beyond the lifetime of the recording so that if we record the same content again we don't need to recreate the key.
- Internal fonts - This is the default and what printing and screenshots use. A copy of the font data is included in the recording. Nothing persists beyond the lifetime of the recording.
To minimize this particular issue it's probably ok to only worry about avoiding copying the font data for system fonts. This avoids the complexity of managing external fonts like WebRender does. This means that we'll still end up with duplicate copies of the data for data fonts, but they tend to be smaller so hopefully it won't be as much of an issue. Plus we have better memory tracking infrastructure to keep an eye on the problem now.
The font data serialization happens on demand when we try to draw glyphs with a particular font here:
https://searchfox.org/mozilla-central/source/gfx/2d/DrawTargetRecording.cpp#252
WebRender takes the WantsExternalFonts()
branch, screenshots takes the else. Currently it tries to get the data for the font and record that, falling back to the name if it's unable to get the data. We should be able to swap the order here and try to use the name first falling back to grabbing the data otherwise.
For the curious, here's how WebRender handles managing the external keys:
We set up a callback to handle external keys here:
https://searchfox.org/mozilla-central/source/gfx/layers/wr/WebRenderCommandBuilder.cpp#636
That code calls into:
https://searchfox.org/mozilla-central/source/gfx/layers/wr/WebRenderBridgeChild.cpp#313
which calls GetFontDescriptor().
GetFontDescriptor() will return false for data fonts and otherwise return the font-name.
Assignee | ||
Comment 100•4 years ago
|
||
Assignee | ||
Comment 101•4 years ago
|
||
I think attachment 9186397 [details] implements the strategy in comment 98. I'm not sure how to test it definitively, but it does seem prevent the Emoji font from remaining in-cache when I create a Ctrl-Tab snapshot of the Apple Emoji font page. I'm also not sure how to create a test for it.
Assignee | ||
Comment 102•4 years ago
|
||
Comment 103•4 years ago
|
||
I don't think there's a great way to test it so I wouldn't worry about it.
Assignee | ||
Comment 104•4 years ago
|
||
This seems to cause a failure on Windows for test browser_ext_tabs_saveAsPDF.js. I'll see if I can understand the logs. I'm not developing on a Windows machine:
Assertion failure: [GFX1 22]: Failed creating UnscaledFont of type 0 from font descriptor, at /builds/worker/workspace/obj-build/dist/include/mozilla/gfx/Logging.h:754
[task 2020-11-06T23:32:42.417Z] 23:32:42 INFO - GECKO(3864) | #01: mozilla::gfx::RecordedFontDescriptor::PlayEvent(mozilla::gfx::Translator*) const [gfx/2d/RecordedEventImpl.h:3459]
[task 2020-11-06T23:32:42.417Z] 23:32:42 INFO - GECKO(3864) | #02: static mozilla::gfx::RecordedEvent::DoWithEvent<mozilla::gfx::EventStream>(mozilla::gfx::EventStream&, mozilla::gfx::RecordedEvent::EventType, std::function<bool (mozilla::gfx::RecordedEvent *)> const&) [gfx/2d/RecordedEventImpl.h:3912]
[task 2020-11-06T23:32:42.417Z] 23:32:42 INFO - GECKO(3864) | #03: mozilla::layout::PrintTranslator::TranslateRecording(mozilla::layout::PRFileDescStream&) [layout/printing/PrintTranslator.cpp:50]
[task 2020-11-06T23:32:42.418Z] 23:32:42 INFO - GECKO(3864) | #04: mozilla::layout::RemotePrintJobParent::PrintPage(mozilla::layout::PRFileDescStream&, nsRefPtrHashtable<nsUint64HashKey,mozilla::gfx::RecordedDependentSurface>*) [layout/printing/ipc/RemotePrintJobParent.cpp:167]
[task 2020-11-06T23:32:42.418Z] 23:32:42 INFO - GECKO(3864) | #05: mozilla::layout::RemotePrintJobParent::FinishProcessingPage(nsRefPtrHashtable<nsUint64HashKey,mozilla::gfx::RecordedDependentSurface>*) [layout/printing/ipc/RemotePrintJobParent.cpp:146]
[task 2020-11-06T23:32:42.419Z] 23:32:42 INFO - GECKO(3864) | #06: mozilla::layout::RemotePrintJobParent::RecvProcessPage(nsTArray<unsigned long long>&&) [layout/printing/ipc/RemotePrintJobParent.cpp:121]
[task 2020-11-06T23:32:42.419Z] 23:32:42 INFO - GECKO(3864) | #07: mozilla::layout::PRemotePrintJobParent::OnMessageReceived(IPC::Message const&) [s3:gecko-generated-sources-l1:46f001374024ac4ec33b95387c7bb26f8ae1f460b841cfcca8bb518d1e03ea9260db5e5539b1942f4e7b2117a2a5a1d6432c73e7711b401d67ad5080b527bc58/ipc/ipdl/PRemotePrintJobParent.cpp::295]
Comment 105•4 years ago
|
||
(In reply to Brad Werth [:bradwerth] from comment #104)
This seems to cause a failure on Windows for test browser_ext_tabs_saveAsPDF.js. I'll see if I can understand the logs. I'm not developing on a Windows machine:
I'd suggest adding some assertions to UnscaledFontDWrite::CreateFromFontDescriptor to get a better idea where it's failing and what the name of the font is.
Reporter | ||
Comment 106•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #103)
I don't think there's a great way to test it so I wouldn't worry about it.
Isn't having a minimal HTML page with emojis enough and internally call the drawSnapshot command? After that we could request a memory dump and analyze the native-font-resource-mac
sub entries (with bug 1674868 fixed).
Comment 107•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #105)
(In reply to Brad Werth [:bradwerth] from comment #104)
This seems to cause a failure on Windows for test browser_ext_tabs_saveAsPDF.js. I'll see if I can understand the logs. I'm not developing on a Windows machine:
I'd suggest adding some assertions to UnscaledFontDWrite::CreateFromFontDescriptor to get a better idea where it's failing and what the name of the font is.
Looking at the UnscaledFontDWrite code, I see that CreateFromFontDescriptor
expects the aData
parameter to be a null-terminated WCHAR string (note that IDWriteFactory::CreateFontFileReference
accepts a WCHAR const *
parameter but no corresponding length).
However, it's not clear to me that what we pass it is guaranteed to be null-terminated. In UnscaledFontDWrite::GetFontDescriptor
, we get the filename via GetFontFileName, which reads the name into an array with space for a null terminator, checks that the terminator is in fact still null (so the name didn't overflow), and then pops off the terminator before returning.
What gets serialized, then, is the unterminated WCHAR
string together with its size (in bytes). When deserializing, though, we just check that the size is non-zero, but we don't actually use it to determine the length of the string.
(We don't have this problem when the descriptor is passed to webrender, because the Rust code explicitly adds a null terminator to the path before passing it to IDWriteFactory::CreateFontFileReference
.)
So I think a simple fix is to always serialize the path with a null terminator, so that it can be used directly when deserializing (without needing to copy it into a new temporary string in order to add a terminator). The extra null-termination that the Rust code adds will then be redundant, but harmless.
I pushed a try run with your patch plus this potential fix, and this seems to have resolved the browser_ext_tabs_saveAsPDF.js failure:
https://treeherder.mozilla.org/jobs?repo=try&revision=bb1b9f7c6a5c0bff63e33dec49420667a2b39c7b
Assignee | ||
Comment 108•4 years ago
|
||
This ensures that CreateFromFontDescriptor gets the null-terminated WCHAR
that it expects.
Depends on D96274
Assignee | ||
Comment 109•4 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #107)
I pushed a try run with your patch plus this potential fix, and this seems to have resolved the browser_ext_tabs_saveAsPDF.js failure:
https://treeherder.mozilla.org/jobs?repo=try&revision=bb1b9f7c6a5c0bff63e33dec49420667a2b39c7b
Thank you, Jonathan for your insight into this. I pushed an updated two-part patch to try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=82d238106e36c427392787e750522334900da402.
Comment 110•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #106)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #103)
I don't think there's a great way to test it so I wouldn't worry about it.
Isn't having a minimal HTML page with emojis enough and internally call the drawSnapshot command? After that we could request a memory dump and analyze the
native-font-resource-mac
sub entries (with bug 1674868 fixed).
Do we have any example tests that look at the results of about:memory?
I suppose we could ensure that the font doesn't show up in native-font-resource-mac. That kind of tests does feel pretty targeted at the specific solution though. i.e. it doesn't feel that likely to catch regressions. I'll leave it up to Brad though.
Assignee | ||
Comment 111•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #110)
I suppose we could ensure that the font doesn't show up in native-font-resource-mac. That kind of tests does feel pretty targeted at the specific solution though. i.e. it doesn't feel that likely to catch regressions. I'll leave it up to Brad though.
I'll attempt to build such a test.
Comment 112•4 years ago
|
||
toolkit/components/aboutmemory/tests/chrome.ini has some tests that appear to look at about:memory
Reporter | ||
Comment 113•4 years ago
|
||
Bred, https://phabricator.services.mozilla.com/D96478 needs review but no reviewer is set. I assume that's the only patch which blocks us from getting all landed? Or does it wait for a test?
Assignee | ||
Comment 114•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #113)
Bred, https://phabricator.services.mozilla.com/D96478 needs review but no reviewer is set. I assume that's the only patch which blocks us from getting all landed? Or does it wait for a test?
I'm trying to put a test together that does the drawSnapshot call + parse the about:memory output. I'll set review flags for everything when I have that ready -- hopefully soon.
Assignee | ||
Comment 115•4 years ago
|
||
Depends on D96478
Assignee | ||
Comment 116•4 years ago
|
||
Comment 117•4 years ago
|
||
I think this will need the GDI change that I talked about to make sure we always return false from GetDescriptor.
Assignee | ||
Comment 118•4 years ago
|
||
Depends on D96478
Updated•4 years ago
|
Comment 119•4 years ago
|
||
Comment 120•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/cfe4fe8f1bee
https://hg.mozilla.org/mozilla-central/rev/0f878ffc3020
https://hg.mozilla.org/mozilla-central/rev/e27e9c12a5e7
https://hg.mozilla.org/mozilla-central/rev/42b9dbf35ce8
Reporter | ||
Comment 121•4 years ago
|
||
That works pretty great with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:85.0) Gecko/20100101 Firefox/85.0 ID:20201120094511. I no longer see native fonts allocated anymore after using Ctrl+Tab with the Github notification page captured.
Given that this patch has a massive impact on memory usage, I wonder if we should uplift to 84 beta.
Comment 122•4 years ago
|
||
Yes, let's do that.
Assignee | ||
Comment 123•4 years ago
|
||
Comment on attachment 9188187 [details]
Bug 1625590 Part 4: Add a test of native font cache behavior on macOS.
Beta/Release Uplift Approval Request
- User impact if declined: Snapshots taken of pages with native fonts will cause memory bloat for macOS users.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): This has impacts for all users because it favors sending font descriptions over font data when transporting across processes. We have identified cases (covered by tests) where font descriptors were handled differently than the font data. We believe we have patched these cases, but we may not have adequate test coverage to cover all the ways fonts are transferred between processes.
- String changes made/needed:
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 124•4 years ago
|
||
Comment on attachment 9186397 [details]
Bug 1625590 Part 1: Make DrawTargetRecording::FillGlyphs prefer sending font description over font data.
This is an old bug, it seems better to let it ride the trains.
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Reporter | ||
Comment 125•4 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #124)
Comment on attachment 9186397 [details]
Bug 1625590 Part 1: Make DrawTargetRecording::FillGlyphs prefer sending font description over font data.This is an old bug, it seems better to let it ride the trains.
Well, it's old but the situation is even worse with Webrender enabled, or am I wrong? Jeff, can you give details? If that is the case we have to take into account that we recently started to ship WebRender enabled by default on MacOS.
Reporter | ||
Comment 126•4 years ago
|
||
Setting as firefox-84 affected again, to keep it on our radar.
Comment 127•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #125)
(In reply to Julien Cristau [:jcristau] from comment #124)
Comment on attachment 9186397 [details]
Bug 1625590 Part 1: Make DrawTargetRecording::FillGlyphs prefer sending font description over font data.This is an old bug, it seems better to let it ride the trains.
Well, it's old but the situation is even worse with Webrender enabled, or am I wrong? Jeff, can you give details? If that is the case we have to take into account that we recently started to ship WebRender enabled by default on MacOS.
No, WebRender already avoided the code path fixed by this patch. The problem should be the same with WebRender on vs off.
Reporter | ||
Comment 128•4 years ago
|
||
I see. Thanks.
Comment 129•4 years ago
|
||
Sorry for the bug spam, but I wanted to say thank you as this made memory usage so much better! No need to restart FF anymore 1-2 per day \o/
Description
•