Closed Bug 1520366 Opened 5 years ago Closed 5 years ago

LifoAlloc can allocate crazy amounts of memory loading a Facebook page (temporarily)

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla66
Root Cause Design Error
Performance Impact high
Tracking Status
firefox-esr60 --- unaffected
firefox64 --- unaffected
firefox65 --- unaffected
firefox66 blocking verified

People

(Reporter: jesup, Assigned: nbp)

References

(Blocks 1 open bug, )

Details

(Keywords: memory-footprint, perf:pageload, regression)

Attachments

(3 files, 5 obsolete files)

Attached file backtrace

See the profile https://perfht.ml/2TPNmnF -- the Memory track for the content process (if you hover it) shows a spike to 22GB(!) The spikes don't happen on every run.

Running with MOZALLOC_LOG=filename, I saw in the traces allocations up to 285MB at a time (and quite a few over 100MB). Using GDB, I caught a >100MB allocation (111MB in this case) from LifoAlloc (for a 48-byte request). See attachment for the backtrace.

Note that a number of the profiles I took showed spikes where the stack was in FinishCompilation or some other CompilerConstraint method.

Whiteboard: [qf]
Flags: needinfo?(nicolas.b.pierron)
Blocks: 1473892

[Tracking Requested - why for this release]: Memory spikes like this could cause large numbers of OOMs

There may be more bugs leading to the need for so much memory, or just a bug somewhere in the calculations, or facebook could be doing something incredibly stupid - or some combination. In any case, we need to control the memory spike here.

From what I can read from the log you pasted on IRC:

47566:6373 140629615007552 malloc(98566144)=0x7fe67c900000
91818:6373 140629615007552 malloc(111149056)=0x7fe675f00000
126608:6373 140629615007552 malloc(124780544)=0x7fe66e700000
142633:6373 140629615007552 malloc(140509184)=0x7fe666100000
169520:6373 140629615007552 malloc(158334976)=0x7fe65c800000
182772:6373 140629615007552 malloc(178257920)=0x7fe651b00000
227485:6373 140629615007552 malloc(200278016)=0x7fe645c00000
233444:6373 140629615007552 malloc(225443840)=0x7fe638500000
254436:6373 140629615007552 malloc(253755392)=0x7fe629300000
286596:6373 140629615007552 malloc(285212672)=0x7fe618300000

This looks like this is following the formulae which is implemented here:
https://searchfox.org/mozilla-central/rev/c21d6620d384dfb13ede6054015da05a6353b899/js/src/ds/LifoAlloc.cpp#162

Which suggest that the whole LifoAlloc contains 2.3 GB in terms of aggregated small allocations (less than the chunk size).

So, my guess is that we are looking for a LifoAlloc buffer which keeps growing and is never reclaimed at any time.
The backtrace from comment 0 suggest that this LifoAlloc is the cx->typeLifoAlloc().

However, this LifoAlloc is supposed to be reclaimed, by copying everything from an old LifoAlloc to a new one, and freeing the old chunks:
https://searchfox.org/mozilla-central/source/js/src/vm/TypeInference.cpp#4774,4780
https://searchfox.org/mozilla-central/source/js/src/gc/GC.cpp#3614
https://searchfox.org/mozilla-central/source/js/src/gc/GC.cpp#3660,3667

As we do not see any dent in the memory profile. My guess would be that js::TypeZone::beginSweep is never called. And no copying from the sweepTypeLifoAlloc are made to copy over to a new typeLifoAlloc. A dent in the memory profile would be expected as iterating over 22GB is supposed to take time, and the dent would be expected to be at most twice as large as the memory consumed at a given time.

In addition I will notice that on the profile from comment 0, the memory drop corresponds to the end of the last GC slice / GC major.

Unless this is a LifoAlloc issue, this memory increase sounds legitimate and we should probably look either at ways to reduce our memory allocations related to types, or trigger a GC if we allocate too much type constraints. The problem with the later is that we might end-up with a GC & Compile loop.

Jon and Jan, which way do you think we should take?

Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(jdemooij)
Flags: needinfo?(jcoppeard)

I think we should first try to understand why this is a problem now. Can we try to get data from before/after bug 1489572 landed so we can see what the effects of that bug are?

Flags: needinfo?(jdemooij)

(In reply to Nicolas B. Pierron [:nbp] from comment #3)

Unless this is a LifoAlloc issue, this memory increase sounds legitimate and we should probably look either at ways to reduce our memory allocations related to types, or trigger a GC if we allocate too much type constraints.

If we to allocate a significant amount of memory for type constraints we should tell the GC about it. Maybe we could give LifoAlloc an optional Zone* member and call Zone::updateMallocCounter() when we allocate?

(In reply to Jan de Mooij [:jandem] from comment #4)

Can we try to get data from before/after bug 1489572 landed so we can see what the effects of that bug are?

Randell tried once more after reverting the heuristic change, and was not able to reproduce the memory consumption.

However, after reviewing the logic twice. I am still unable to understand how this could be related.

Marking this as a release blocker for now, though we can change that if you think it isn't necessary.

Maybe bug 1489572 explains why our oom|small rate just jumped up massively.

I was remembering wrong; the spike in oom | small was in 65 beta 8 and 9 (bug 1519129).

Ok, based on the log provided by Randell, I isolated the problem to our usage of transferFrom & alloc on the same LifoAlloc:

0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f649ab00020,0x7f649ab07a18, n=32
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f649ab00020,0x7f649ab07a40, n=40
transferFrom(0x7f654c7214d8): 0x7f6557d894d8: curSize = 0
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f654c727020,0x7f654c727308, n=40
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f654c727020,0x7f654c727350, n=72
[…]
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f654c727020,0x7f654c727cc0, n=40
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f654c727020,0x7f654c727ce8, n=40
transferFrom(0x7f654c7604d8): 0x7f6557d894d8: curSize = 0
transferFrom(0x7f654d0d84d8): 0x7f6557d894d8: curSize = 0
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f6547ce4020,0x7f6547ce7700, n=48
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f6547ce4020,0x7f6547ce7730, n=48
[…]
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f6547ce4020,0x7f6547ce7fd0, n=48
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f6547ce4020,0x7f6547ce8000, n=48
0x7f6557d894d8 (cold): chunks_.last()->begin(),end()=0x7f6483500020,0x7f6483500050, n=48
0x7f6557d894d8 (hot): chunks_.last()->begin(),end()=0x7f6483500020,0x7f6483500080, n=48

When transferring from one LifoAlloc to another, the targetted LifoAlloc appends chunks at the end.
Then we also allocate data in the same LifoAlloc used as a transfer target.

Having a LifoAlloc both as a target and as an allocation source will cause issues as the curSize_ of the LifoAlloc would be artificially increased each time another LifoAlloc is transferred. Therefore, when we transfer a small LifoAlloc at the end of the current one, the last chunk of the smaller LifoAlloc would be used, and as we fail to allocate in it, we would allocate a new chunk which size is based on the size of the aggregated allocation and not the proper allocations of the current LifoAlloc.

I see few ways to address this issue:

  • We forbid to have a LifoAlloc as an allocation base and as a transfer target.
  • We append transferred chunks to the oversize_ list of chunks instead of the chunks_ list.
  • We prepend transferred chunks and keep a counter for transferred LifoAlloc sizes, to remove it from curSize_ in the NextSize call.
  • We copy the data instead of transferring the chunks.

This problems appears on Facebook more than other website because it uses a lot of large scripts. This problem is cause by the typeLifoAlloc [2] which is using ordinary allocations and LifoAlloc::transferFrom [1] when merging off-thread parsing of large scripts in js::gc::GCRuntime::mergeRealms.

This problem did not appear before the heuristic changes because all blocks were small and not based on the size of previous allocations. We were wasting memory but at a smaller pace than what is done with the geometric growth.

[1] https://searchfox.org/mozilla-central/search?q=symbol:_ZN2js9LifoAlloc12transferFromEPS0_&redirect=false
[2] https://searchfox.org/mozilla-central/search?q=symbol:_ZN2js8TypeZone13typeLifoAllocEv&redirect=false

Flags: needinfo?(jcoppeard)
Priority: -- → P1
Attachment #9037179 - Flags: review?(jdemooij)
Attachment #9037179 - Flags: feedback?(tcampbell)
Attachment #9037179 - Flags: feedback?(rjesup)
Severity: normal → critical
Comment on attachment 9037179 [details] [diff] [review]
LifoAlloc::transferFrom move all the chunks to the oversize_ list. r=

Review of attachment 9037179 [details] [diff] [review]:
-----------------------------------------------------------------

Just some extremely minor issues with the comments

::: js/src/ds/LifoAlloc.cpp
@@ +364,5 @@
>  void LifoAlloc::transferFrom(LifoAlloc* other) {
> +  // Note: When transferring chunks from another LifoAlloc, all chunks are
> +  // transferred to the oversize_ list instead of the chunk_ list. The reason is
> +  // that some user of LifoAlloc might use transferFrom and allocImpl at the
> +  // same time. Using both at the same time add issues about wasted memory as

either "adds issues about" (very minor english correction), or replace with "can cause"

@@ +368,5 @@
> +  // same time. Using both at the same time add issues about wasted memory as
> +  // chunks would be allocated based on the size of the transferred chunks and
> +  // not exclusively based on the size of the actual allocations. Also,
> +  // appending transferred chunks to the chunks_ list will prevent allocation to
> +  // happen in the last allocated chunk.

"prevent allocation from happening" (minor english)
Attachment #9037179 - Flags: feedback?(rjesup) → feedback+

Optionally you could include a ref to the bug number, but I don't think that's needed

Assignee: nobody → nicolas.b.pierron
Status: NEW → ASSIGNED
Whiteboard: [qf] → [qf:p1:pageload]
Comment on attachment 9037179 [details] [diff] [review]
LifoAlloc::transferFrom move all the chunks to the oversize_ list. r=

Review of attachment 9037179 [details] [diff] [review]:
-----------------------------------------------------------------

Forwarding to Ted because he reviewed bug 1489572 and I'm not as familiar with LifoAlloc internals these days. Fortunately we don't have to uplift to beta and he'll be back in a few days.
Attachment #9037179 - Flags: review?(tcampbell)
Attachment #9037179 - Flags: review?(jdemooij)
Attachment #9037179 - Flags: feedback?(tcampbell)
Comment on attachment 9037179 [details] [diff] [review]
LifoAlloc::transferFrom move all the chunks to the oversize_ list. r=

Review of attachment 9037179 [details] [diff] [review]:
-----------------------------------------------------------------

This would break the invariant that oversize_ list contains only singleton chunks. While not strictly necessary, it was a nice conceptual invariant that we had considered possible future uses for.

Instead, Nicolas has alternate proposal with prepended chunks that sounds worth exploring. If that doesn't work out, I'll reconsider this patch.
Attachment #9037179 - Flags: review?(tcampbell)
This patch implements the third bullet point from comment 9. Each time chunks
are being transferred from one LifoAlloc to another, the transferred chunks are
prepend to the list of chunks_ which maintains the same last chunk for making
allocations.

In order to avoid the allocation of larger chunks than necessary, the size of
the non-oversize (chunks + unused) chunks is computed and added to the
transferredChunksSize counter. Note, we should not account for oversize chunks
in this counter in order to be able to sum the amount (disjoint sets) in
LifoAlloc::newChunkWithCapacity.
Attachment #9038018 - Flags: review?(tcampbell)
(Oops, I forgot to commit before attaching the patch)
Delta:
 - prependAll call steal with std::move.
 - Add size of unused chunks to the measuredSize in trandferFrom.
Attachment #9038026 - Flags: review?(tcampbell)
Attachment #9038018 - Attachment is obsolete: true
Attachment #9038018 - Flags: review?(tcampbell)
Comment on attachment 9038026 [details] [diff] [review]
Count and prepend transferred chunks when merging realms.

Review of attachment 9038026 [details] [diff] [review]:
-----------------------------------------------------------------

I think the general approach makes sense.

I'd like to consider using (something like) |smallAllocsSize| instead of both |transferedSize| / |oversizeSize|. This should also solve the issue about unused chunks.

Here is a try link with my counter proposal: https://hg.mozilla.org/try/pushloghtml?changeset=9969f0216393

Part 1 - Pre-existing typo we should fix
Part 2 - Optional consistency changes I made while trying this
Part 3 - Your prepend code taken directly
Part 4 - The alternate counters that I'm proposing.

(Sorry for the iterations.. this has more subtlety than I realized)

::: js/src/ds/LifoAlloc.h
@@ +526,5 @@
>  
>    size_t markCount;
>    size_t defaultChunkSize_;
>    size_t oversizeThreshold_;
> +  // Size of all chunks in both chunks_ list and oversize_ list.

|curSize_| includes unused chunks as well.
Attachment #9038026 - Flags: review?(tcampbell)

(In reply to Ted Campbell [:tcampbell] from comment #17)

Here is a try link with my counter proposal:
https://hg.mozilla.org/try/pushloghtml?changeset=9969f0216393

Part 1 - Pre-existing typo we should fix
Part 2 - Optional consistency changes I made while trying this
Part 3 - Your prepend code taken directly
Part 4 - The alternate counters that I'm proposing.

These sounds definitely better, than having a transferred counter and oversize counter.

I suggest you attach these patches here, and I do the review of the parts which I did not made in the first place, or we go on with the current patch and we fix it later on in a less constraint time-frame.

When using LifoAlloc::transferFrom to merge allocators (such as for
types), prepend the chunks to avoid wasted space at end of current last
chunk. This helps usecases such as TypeInference to behave more
predictably when merging small allocators.

(Original implementation by :nbp)
Attachment #9038216 - Flags: review?(nicolas.b.pierron)
- Replace oversizeSize with smallAllocsSize. This will track sizes of
  'normal' chunks used for small allocations. It excludes unused chunks,
  oversize chunks and chunks transfered from other LifoAlloc. This new
  counter is used to determine chunk size growth heuristics. This aims
  to reduce memory spikes due to transferFrom allocation patterns that
  we see in the wild.
- Also fix a pre-existing typo in LifoAlloc::freeAll
Attachment #9038218 - Flags: review?(nicolas.b.pierron)
Attachment #9038216 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 9038218 [details] [diff] [review]
Update LifoAlloc growth heuristics to track smallAllocsSize

Review of attachment 9038218 [details] [diff] [review]:
-----------------------------------------------------------------

nit: Replace "normal" by what it means "non-transferred small allocations"

::: js/src/ds/LifoAlloc.cpp
@@ +118,5 @@
>  
>  void LifoAlloc::freeAll() {
> +  // When free-ing all chunks, we can no longer determine which chunks were
> +  // normal and which were not, so simply clear the heuristic to zero right
> +  // away.

nit: s/normal/transferred/

@@ +373,5 @@
>  void LifoAlloc::transferFrom(LifoAlloc* other) {
>    MOZ_ASSERT(!markCount);
>    MOZ_ASSERT(!other->markCount);
>  
> +  // Transfered chunks are not considered "normal" because they used their own

nit: Transferred (I keep making the same typo)

nit: Transferred chunks are not counted as part of the |smallAllocSize_| as this could introduce a bias in the |NextSize| heuristics and lead to over allocations in |this| LifoAlloc, which might only expect a few allocations.

@@ +375,5 @@
>    MOZ_ASSERT(!other->markCount);
>  
> +  // Transfered chunks are not considered "normal" because they used their own
> +  // growth rates. As well, the final block may be almost empty and should this
> +  // should not impact future non-transfered allocations.

nit: Replace "As well, …" by something like:

Also, to avoid interference with small allocations made with |this|, the last chunk of the |chunks_| list should remain the last chunk. Therefore, the transferred chunks are prepend to the |chunks_| list.

::: js/src/ds/LifoAlloc.h
@@ +534,5 @@
>    size_t peakSize_;
> +
> +  // Size of all chunks containing 'normal' bump allocations. This heuristic is
> +  // used to compute growth rate while ignoring chunks such as oversized,
> +  // now-unused, or transfered (which followed their own growth patterns).

nit: s/normal/small/
Attachment #9038218 - Flags: review?(nicolas.b.pierron) → review+
Version: 52 Branch → unspecified
- Replace oversizeSize with smallAllocsSize. This will track sizes of
  'normal' chunks used for small allocations. It excludes unused chunks,
  oversize chunks and chunks transfered from other LifoAlloc. This new
  counter is used to determine chunk size growth heuristics. This aims
  to reduce memory spikes due to transferFrom allocation patterns that
  we see in the wild.
- Also fix a pre-existing typo in LifoAlloc::freeAll
- Fix comments based on review-nits
- Carrying r+ from Comment 21
Attachment #9038240 - Flags: review+
Attachment #9037179 - Attachment is obsolete: true
Attachment #9038026 - Attachment is obsolete: true
Attachment #9038218 - Attachment is obsolete: true
Attachment #9038236 - Attachment is obsolete: true
Pushed by tcampbell@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/13e451a69083
Prepend LifoAlloc chunks when transferring them. r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/acd0dadf8013
Update LifoAlloc growth heuristics to track smallAllocsSize. r=nbp

Randell, can you please verify that you're seeing saner behavior now?

Flags: needinfo?(rjesup)

Yes, it looks sane now!

Flags: needinfo?(rjesup)
Status: RESOLVED → VERIFIED

Please specify a root cause for this bug. See :tmaity for more information.

Root Cause: --- → ?
Root Cause: ? → Design Error
Performance Impact: --- → P1
Keywords: perf:pageload
Whiteboard: [qf:p1:pageload]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: