Bug 1436250 (memshrink-content)

[meta] Reduce content process memory overhead

NEW
Unassigned

Status

()

enhancement
Last year
12 days ago

People

(Reporter: bzbarsky, Unassigned)

Tracking

(Depends on 51 bugs, Blocks 1 bug, {meta})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [MemShrink:meta])

User Story

https://treeherder.mozilla.org/perf.html#/graphs?timerange=604800&series=mozilla-inbound,1684808,1,4&series=mozilla-inbound,1684802,1,4

Attachments

(1 attachment)

Going to use this to track various specific bits.

We have a lot of heap-unclassified (almost 40% of the heap) in a vanilla content process with nothing really loaded.

Also, I haven't even looked at the non-heap-allocated overhead.
Depends on: 1436179
Whiteboard: [memshrink]
Keywords: meta
Summary: Reduce content process memory overhead → [meta] Reduce content process memory overhead
I have a dump of memory allocated by content processes using ASAN's __sanitizer_print_memory_profile(); it's a little confused because both content processes interleave their dumps (I'll need to find a way to separate them, or only dump 1).  The two processes have allocated (still live) ~17MB and ~21.5MB; for reference the two processes show a size in System Monitor (on Fedora) of ~24.5 and 28MB when I looked in a different run with the same profile.  

Probably the 17MB is the 'warm' process that hasn't loaded any content, and the other is showing a blank page.  Comparing the two will also be interesting.

The dump is large (since I told it to dump the allocation stacks of *all* allocations; total was ~77K for the small, and 100K for the larger).  I'll upload the raw files, but the highlights:

We're spending a lot on alignment(?)
We have a LOT of power-of-2-sized buffers -- IIRC jemalloc isn't efficient on powers-of-two (not unusual)  -- Glandium?
The profiler is allocating a bunch of memory up front in case it needs it when turned on (I presume)
Lots of HashTables - many probably far from filled, and some are static once created
Prefs.... (njn is working on this!)
fontconfig is a PIG!!!!
Telemetry is a non-0 %-age
Quite a bit (scattered) of script data/source/etc



~8% is in posix_memalign from slab_allocator_alloc_chunk() in gslice.c (2500+ allocations)
~3% (~850K) in ~25 allocations from ThreadInfo::ThreadInfo in tools/profiler/core/ThreadInfo.cpp, allocated when the threads are registered with the profiler, or 32808 bytes per thread.  That's a lot to spend for the profiler when I haven't installed it in that profile, let alone used it.  Lazy allocation, perhaps?
~3% in many allocations from g_realloc() (no further backtrace)
~3% (650K) in 21 allocations from performXDR<> called from js::XDRScript<> 
~2% in 5 allocations (of ~88K each) from js::DuplicateString(), called from js::ScriptSource::setSourceCopy()
~1% (262144) in 1 allocation from PLDHashTable::ChangeTable from Preferences(!) (SetLatePreferences)
~1% (262144) in 16 allocations from DoInterfaceDescriptor(XPTArena...), called a ways above from DoRegisterXPT()
~1% in ~8000 allocations from FcCharSetFindLeafCreate() (fontconfig)
~1% in ~7700 allocations from FcValueListCreate/
~1% in 621 allocations from  JSScript::createScriptData() (from XDRScript<>)
~0.5% in 2 allocations from __strtof_l()
~0.5% in FcPatternObjectInsertElt()
~0.5% (131072) from js::detail::HashTable<>::changeTableSize()
~0.5% (131072) in 2 allocations from PLDHashTable::Add() (an XPTInterfaceInfoManager table)
~0.5% (131072) in 1 allocation from PLDHashTable::Add() from GetAtomHashEntry() when in RegisterStaticAtoms
~0.5% (131072) in PLDHashTable::Add() called from TelemetryHistogram::InitializeGlobalState()
(perhaps a couple more 131072 or 262144 allocations)
~0.5% (122K) in XPT_DoCString() from XPTInterfaceInfoManager::RegisterBuffer()
~0.4% (113K) in 95 allocations from _dl_new_object()
~0.4% (111K) from FcCharSetPutLeaf
~0.4% in 17xx allocations from PLDHashTable::Add for strings from TelemetryHistogram::InitializeGlobalState()
~0.4% (102K) in 50 allocations from  nsPersistentProperties::SetStringProperty()
~0.4% (101K) in many allocations from FcValueSave()
~0.4% (98K) in 3 allocations from ThreadInfo::ThreadInfo()
~0.4% (98304) in 6 allocations from xptiInterfaceEntry::Create()
~0.4% (98304) in 2 allocations from PLDHashTable::Add() 
98K in 12 allocs from FcConfigAllocExpr
81920 (10*8192!) in 10 allocations from DuplicateString<char, 8192ul, 1ul> from Pref::Pref()
Bunch more allocs from ThreadInfo::ThreadInfo() (profiler)
65536 in 1 alloc from HashTable<>::createTable() from AtomizeAndCopyChars<>
65536 in 1 allocation from nsAtomFriend::RegisterStaticAtoms()
65536 in 2 allocations from gfxFcPlatformFontList::AddPatternToFontList()/InitFontListForPlatform()
65536 in 2 allocs from js::LifoAlloc::newChunkWithCapacity()
65536 in 1 alloc from nsComponentManagerImpl::RegisterCIDEntryLocked()
65520 in 2 allocs from nsPurpleBuffer::Put()
60K in 65 allocs from ft_mem_qalloc() (freetype)
60K in ~2500 allocs from nsAtomFriend::RegisterStaticAtoms()
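
The ThreadInfo item above floats lazy allocation as a possible fix. A minimal sketch of that idea, written in Python for brevity rather than the profiler's C++. The class `LazyThreadBuffer` and the helper `live_bytes` are hypothetical names for illustration; only the 32808-byte per-thread figure comes from the dump.

```python
class LazyThreadBuffer:
    """Per-thread buffer that is only allocated on first use (hypothetical)."""

    def __init__(self, size=32808):
        self.size = size
        self._buf = None  # nothing is allocated at thread registration

    @property
    def allocated(self):
        return self._buf is not None

    def buffer(self):
        # Allocate on first access, e.g. when profiling is actually started.
        if self._buf is None:
            self._buf = bytearray(self.size)
        return self._buf


def live_bytes(buffers):
    """Sum of bytes actually allocated across all registered threads."""
    return sum(b.size for b in buffers if b.allocated)


# Register 25 threads; no buffer memory is used until one is touched.
threads = [LazyThreadBuffer() for _ in range(25)]
print(live_bytes(threads))
threads[0].buffer()
print(live_bytes(threads))
```

With this shape, registering a thread costs only the small wrapper object, and the 32808 bytes are paid per thread only once something asks for the buffer.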
Flags: needinfo?(mh+mozilla)
> We have a LOT of power-of-2-sized buffers -- IIRC jemalloc isn't efficient on powers-of-two (not unusual)  -- Glandium?

No, powers-of-two are the best case, along with everything that's exactly matching a class size, or is a multiple of the page size for larger sizes.
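
To make the point about class sizes concrete, here is a rough Python sketch of how rounding a request up to a size class produces slop. The class boundaries used (16-byte spacing up to 512, then 1 KiB and 2 KiB, then multiples of a 4 KiB page) are a simplified assumption for illustration, not a statement of what jemalloc does in any particular version; `round_up_to_class` and `slop` are hypothetical names.

```python
QUANTUM = 16      # assumed spacing of the small size classes
SMALL_MAX = 512   # assumed largest quantum-spaced class
PAGE = 4096       # assumed page size


def round_up_to_class(size):
    """Return the assumed size class that a request of `size` bytes lands in."""
    if size <= 0:
        raise ValueError("size must be positive")
    if size <= SMALL_MAX:
        # Quantum-spaced classes: 16, 32, ..., 512.
        return -(-size // QUANTUM) * QUANTUM
    if size <= PAGE // 2:
        # Sub-page classes: next power of two (1024 or 2048).
        return 1 << (size - 1).bit_length()
    # Larger requests: round up to a multiple of the page size.
    return -(-size // PAGE) * PAGE


def slop(size):
    """Bytes wasted by rounding the request up to its class."""
    return round_up_to_class(size) - size


# Sizes taken from the dump: two power-of-two buffers and one odd size.
for request in (131072, 262144, 32808):
    print(request, round_up_to_class(request), slop(request))
```

Under these assumptions the power-of-two requests waste nothing, while a request slightly over a page multiple (such as 32808) is rounded up to the next page.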
Flags: needinfo?(mh+mozilla)
> No, powers-of-two are the best case, along with everything that's exactly
> matching a class size, or is a multiple of the page size for larger sizes.

Good.  (IIRC at one point it was better to be power-of-2-minus-n; though perhaps I'm thinking of some other system/allocator)
You might be thinking about things like nsTAutoArray, which have an embedded header, so a better size for it is jemalloc_class_size - header_size.
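
The arithmetic behind that: if the header lives in the same allocation as the elements, the element storage should be sized so that header plus elements lands exactly on a class size. A small Python sketch, where the 8-byte header and the 64-byte class are assumed values for illustration only and `elements_that_fit` is a hypothetical helper.

```python
def elements_that_fit(class_size, header_size, element_size):
    """How many elements fit next to an embedded header in one size class.

    Returns (count, leftover_bytes). All sizes are in bytes.
    """
    if header_size > class_size:
        raise ValueError("header does not fit in the size class")
    usable = class_size - header_size
    return usable // element_size, usable % element_size


# Assumed numbers: a 64-byte class, an 8-byte header, 8-byte elements.
count, leftover = elements_that_fit(64, 8, 8)
print(count, leftover)

# Asking for 8 elements instead needs 8 + 8 * 8 bytes in total,
# which no longer fits in the assumed 64-byte class.
needed = 8 + 8 * 8
print(needed > 64)
```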
(In reply to Randell Jesup [:jesup] from comment #1)
> I have a dump of memory allocated by content processes using ASAN's
> __sanitizer_print_memory_profile()

You'll want to be careful with that -- I'm pretty sure ASAN will be using its own allocator instead of jemalloc, so it's not a representative run.

You can use DMD for vanilla heap profiling. It works with jemalloc so will give representative results. See the docs about "live mode" at https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD.
So, DMD results (I was using ASAN): similar of course, since I don't care much about a few bytes - biggest difference would be in slop and alignment I imagine.

Comments above are still valid; we can now see that gtk is using a moderate amount, and no surprise the fontconfig stuff is called from InitFontList.
Lots in total in js::ScriptSource, various things involving Atoms, and XPTInterfaceInfoManager::RegisterBuffer() is a big hotspot
LifoAlloc then comes in with a lot of different little allocations (probably not surprising)

6% (811K) in ThreadInfo::ThreadInfo (note: another couple % below with different stacks)
4.5% in js::ScriptSource::performXDR<>
3.6% (491K) from XPTInterfaceInfoManager::RegisterBuffer() (a few % more below)
2.3% in js::ScriptSource::setSourceCopy()
2% in js::SharedScriptData::new_ from JSScript::createScriptData()
1.9% (262144) in PLDHashTable::ChangeTable() from SetLatePreferences()
1.7% in js::SharedScriptData::new_ from JSScript::fullyInitFromEmitter()
1.6% in FcPatternObjectAddWithBinding() (from InitFontList())
1.5% in gtk_css_selector_tree_builder_build from (near top) gtk_settings_get_for_display()
1.4% from js::ScriptSource::setSourceCopy()
1.2% in glibc _dl_new_object()
1% (~30% cumulative) in nsPersistentProperties::SetStringProperty() from nsStringBundle::LoadProperties()
1% from XPTInterfaceInfoManager::RegisterBuffer() (again, different stack slightly)
1% in PLDHashTable::Add() from nsAtomFriend::RegisterStaticAtoms()
1% (131072, 1 alloc) in gtk_css_provider_load_internal() from gtk_settings_get_for_display
1% (ditto) in PLDHashTable::Add() from TelemetryHistogram::InitializeGlobalState()
1% (ditto) in js::AtomizeChars from the frontend::GeneralParser<>
1% (ditto) in js::AtomizeChars from js::XDRAtom<>/js::XDRScript<>
1% in FcPatternObjectInsertElt from InitFontList()
0.85% in PLDHashTable::Add from XPTInterfaceInfoManager::RegisterBuffer() (different stack)
0.8% in gtk_css_ruleset_add() from gtk_settings_get_for_display
0.8% in js::SharedScriptData::new_() from JSScript::fullyInitFromEmitter()
0.8% DuplicateString() from pref_SetPref()
0.7% in FcCharSetPutLeaf from InitFontList
0.7% in js::LifoAlloc::newChunkWithCapacity() from js::frontend::PerHandlerParser
0.6% from js::SharedScriptData::new_ from JSScript::createScriptData() (different stack)
0.6% from nsAtomFriend::RegisterStaticAtoms()
0.5% from FcPatternObjectAddWithBinding
0.5% from ThreadInfo::ThreadInfo (different stack)
0.5% from XPTInterfaceInfoManager::RegisterBuffer() (different stack)
0.5% from call_init() (dl_init.c) in glibc
0.5% (45% cumulative) in js::AtomizeChars (different stack)

<bunch of 65536 byte allocs from Component Manager, HashTables for StaticAtoms, JSScript::shareScriptData()>

<several 61440-byte totals (15 allocs) from LifoAlloc, and a bunch in the 50K region with 13 allocs from LifoAlloc>

<4 ~36K alloc stacks from ThreadInfo::ThreadInfo -- different callers - HangMonitor, WatchdogMain, BackgroundHangManager -- I wonder if there's some duplication here that could be eliminated>
Bug 1436179 tracks the ThreadInfo/profiler bits.
Raw data.  Note lsan4_xaa is the first 100K lines (which goes down to ~1K total allocation/stack); the tail is LONG, and xaa is only about 1/15th of the full file.  Also note that lsan has a mix of two content processes: one that is displaying a blank page, one that hasn't been used yet.

https://app.box.com/folder/46288716831
Assignee: nobody → rjesup
Status: NEW → ASSIGNED
Assignee: rjesup → nobody
Status: ASSIGNED → NEW
Depends on: 1437168
Depends on: 1438088
Depends on: 1438287
Whiteboard: [memshrink] → [MemShrink:meta]
Depends on: 1441290
Depends on: 1441292
Depends on: 529808
Depends on: 1441736
No longer depends on: 1441290
Depends on: 1441754
Depends on: 1442361
Depends on: 1442433
Depends on: 1442737
Depends on: 1425524
Duplicate of this bug: 1444751
Depends on: 1446519
Depends on: 1254777
Depends on: 1449288
Depends on: MinGCMem
Depends on: 1440336
Depends on: 786819
Depends on: 1451524
Depends on: 1451568
Depends on: 1452786
Depends on: 1452862
Depends on: 833098
Depends on: 1458339
Depends on: 1460304
Alias: memshrink-content
Depends on: 1460416
Depends on: 1460002
Depends on: 1463569
Depends on: 1463587
Depends on: 1463908
Depends on: 1464542
See Also: → 1350472
Depends on: 1464548
Depends on: 1464552
cc'ing felipe who might want to be in the loop on this.
Depends on: angle-62
No longer depends on: angle-62
Depends on: 1443077
Depends on: 1469719
Depends on: 648417
No longer depends on: 1439412
Depends on: 1470023
Depends on: 1470324
Depends on: 1470333
Depends on: 1470339
Depends on: 1470365
Depends on: 1470591
Depends on: 1470783
Depends on: 1470793
Depends on: 1470983
Depends on: 1471025
Depends on: 1471062
Depends on: 1471091
Depends on: 1471102
Depends on: 1472491
Depends on: 1472523
Depends on: 1473414
Depends on: 1473631
Depends on: 1473634

Updated

11 months ago
Depends on: 1474130

Depends on: 1474139

Depends on: 1474140

Depends on: 1474143

Depends on: 1474155

Depends on: 1474163

Depends on: 1258781
Depends on: 1474400
Depends on: 1240547
Depends on: 1474793

Depends on: 1474918
Depends on: 1446831
Depends on: 1471535
Depends on: 1475091
Depends on: 645563

Depends on: 1475290
Depends on: 1475518

Depends on: 1475700

Depends on: 1475899

Depends on: 1476403

Depends on: 1476405

Depends on: 1476416

Depends on: 1476432

Depends on: 1477393

Depends on: 1477576

Depends on: 1477579

Depends on: 1478124

Depends on: 1416723

Depends on: 1479236

Depends on: 1479241

Depends on: 1479245

Depends on: 1479250

Depends on: 1479309

Depends on: 1479310

Depends on: 1479312

Depends on: 1479313

Depends on: 1479318

Depends on: 1446940

Depends on: 1480244

Depends on: 1480319

Depends on: 1480327

Depends on: 1471878

Depends on: 1479569
User Story: (updated)

Updated

10 months ago
Depends on: 1481321

Depends on: 1475571
Depends on: 1481975

Depends on: 1481998

Depends on: 1483363

Depends on: 1483414

Depends on: 1483664

Depends on: 1483738

Depends on: 1484373

Depends on: 1484413

Depends on: 1484415

Depends on: 1484466

Depends on: 1484496
Depends on: 1485347

Depends on: 1486182

Depends on: 1486444

Depends on: 1477213
Depends on: 1487137
Depends on: 1487135
Depends on: 1487146
Depends on: 1487198
Depends on: 1487212
Depends on: 1487214
Depends on: 1487216
Depends on: 1487217
Depends on: 1487221
Depends on: 1487223
Depends on: 1487228
Depends on: 1487234
Depends on: 1487235
Depends on: 1487237
Depends on: 1489315

Updated

9 months ago
Depends on: 1475556

Updated

8 months ago
Depends on: 1257388

Depends on: 1497729
No longer depends on: 1497729

Depends on: 1482153

Depends on: 1498278
Depends on: 1501438
Depends on: 1419091
Depends on: 1482091
Depends on: 1505522

Updated

7 months ago
Depends on: 1503496
Depends on: 1505689
Depends on: 1505690
Depends on: 1344428
Depends on: 1502284
Depends on: 1506763
No longer depends on: 1502284
Depends on: 1507434

Depends on: 1477432
Depends on: 1514869

Updated

5 months ago
Blocks: fission
Depends on: 1523749
Depends on: 1524687
Depends on: 1524688
Depends on: 1529551
No longer depends on: 1529551

Updated

3 months ago
Depends on: 1527532
Depends on: 1540301
Depends on: 1540824
Depends on: 1541208
Depends on: 1543777

Updated

23 days ago
Depends on: 1544371
Depends on: 1556539