Closed Bug 1392928 Opened 7 years ago Closed 3 years ago

Consider keeping unused Ion LifoAllocs because freeing them is not cheap

Categories

(Core :: JavaScript Engine, defect, P3)

x86_64
Windows
defect

Tracking


RESOLVED INCOMPLETE
Performance Impact medium
Tracking Status
firefox57 --- affected

People

(Reporter: ting, Assigned: nbp)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(1 file, 1 obsolete file)

I saw LifoAlloc::freeAll on both VTune and WPA; the cost is mainly in VirtualFree and in memset for poisoning. I wonder: 1) can we keep unused BumpChunks on a linked list instead of freeing them directly, and 2) is poisoning necessary?

xul.dll!js::jit::FreeIonBuilder
  xul.dll!js::LifoAlloc::freeAll
    mozglue.dll!free_impl
      mozglue.dll!arena_run_dalloc
        mozglue.dll!arena_purge [1]
          mozglue.dll!pages_decommit
            KernelBase.dll!VirtualFree
      vcruntime140.dll!memset [2][3]

[1] http://searchfox.org/mozilla-central/rev/48ea452803907f2575d81021e8678634e8067fc2/memory/mozjemalloc/mozjemalloc.cpp#2967
[2] http://searchfox.org/mozilla-central/rev/48ea452803907f2575d81021e8678634e8067fc2/memory/mozjemalloc/mozjemalloc.cpp#3562
[3] http://searchfox.org/mozilla-central/rev/48ea452803907f2575d81021e8678634e8067fc2/memory/mozjemalloc/mozjemalloc.cpp#3636
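Roughly, idea 1) could look like the sketch below. This is only a minimal illustration with made-up types (this BumpChunk and ChunkFreeList are not the real js/src/ds/LifoAlloc.h classes): freed chunks go onto a singly linked free list and are handed back out on the next allocation, so the hot path skips both the jemalloc free (and its decommit) and the poisoning memset.

// Hypothetical sketch; not the real LifoAlloc/BumpChunk API.
#include <cstddef>
#include <cstdlib>
#include <new>

struct BumpChunk {
  BumpChunk* next;
  size_t size;
  // A real chunk would also carry the bump-pointer bookkeeping and payload.
};

class ChunkFreeList {
  BumpChunk* head_ = nullptr;

 public:
  // Instead of free(): keep the chunk around for reuse.
  void recycle(BumpChunk* chunk) {
    chunk->next = head_;
    head_ = chunk;
  }

  // Instead of malloc(): reuse a pooled chunk when one is available.
  // (A real version would also check that the pooled chunk is big enough.)
  BumpChunk* acquire(size_t size) {
    if (head_) {
      BumpChunk* chunk = head_;
      head_ = chunk->next;
      return chunk;  // no VirtualFree/decommit, no poisoning memset
    }
    void* raw = std::malloc(size);  // size would cover header + payload
    return raw ? new (raw) BumpChunk{nullptr, size} : nullptr;
  }

  ~ChunkFreeList() {
    while (BumpChunk* chunk = head_) {
      head_ = chunk->next;
      std::free(chunk);
    }
  }
};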
(In reply to Ting-Yu Chou [:ting] from comment #0)
> I saw LifoAlloc::freeAll on both VTune and WPA, the cost is mainly in
> VirtualFree and memset for poisoning. I wonder 1) can we have a unused
> BumpChunks linked list to keep them instead of freeing them directly, 2) is
> poisoning necessary?

(2) is moot: there won't be any poisoning if we don't free them.
Assignee: nobody → janus926
Attached patch experiment patch (obsolete) — Splinter Review
On my desktop, running https://mozilla.github.io/arewefastyet-speedometer/2.0/ with attachment 8904212 [details] [diff] [review] I got:

  83.88, 81.81, 89.16, 89.65, 87.98

m-c 1401e3eec44d got:

  86.78, 87.84, 87.20, 87.68, 82.13
With this experiment patch, I got the following numbers on Ubuntu for https://mozilla.github.io/arewefastyet-speedometer/2.0, which shows Speedometer 2.0-r221659:

  46.36, 45.55, 45.95, 45.45, 45.97

and without the patch:

  43.91, 43.73, 43.68, 44.59, 43.40

The experiment patch does not free any BumpChunk in the pool, but there should be a size limit like ChunkPool [1]. I am thinking of freeing chunks in BumpChunk::delete_ when the pool reaches its size limit and can't take any more chunks, rather than freeing them at GC time, so we don't free many of them at once.

What do you think?

[1] http://searchfox.org/mozilla-central/rev/67f38de2443e6b613d874fcf4d2cd1f2fc3d5e97/js/src/jsgc.cpp#726
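A sketch of that size-limited variant, again with made-up names (BoundedChunkPool, kMaxPooledChunks, and this deleteChunk are illustrative, not the real BumpChunk::delete_): destruction pools the chunk and only falls back to free() once the pool is full, so chunks are released one at a time instead of in a big batch during GC.

// Hypothetical sketch of the size-limited pool idea.
#include <cstddef>
#include <cstdlib>

struct BumpChunk {
  BumpChunk* next = nullptr;
};

struct BoundedChunkPool {
  static constexpr size_t kMaxPooledChunks = 16;  // illustrative limit
  BumpChunk* head = nullptr;
  size_t count = 0;
};

// What a pool-aware BumpChunk::delete_ could do.
inline void deleteChunk(BoundedChunkPool& pool, BumpChunk* chunk) {
  if (pool.count < BoundedChunkPool::kMaxPooledChunks) {
    chunk->next = pool.head;   // pool has room: keep the chunk for reuse
    pool.head = chunk;
    pool.count++;
    return;
  }
  std::free(chunk);  // pool is full: give this one back immediately
}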
Attachment #8904212 - Attachment is obsolete: true
Attachment #8905421 - Flags: feedback?(luke)
Whiteboard: [qf]
(In reply to Ting-Yu Chou [:ting] from comment #4)
> With this experiment patch, I got following numbers on Ubuntu for
> https://mozilla.github.io/arewefastyet-speedometer/2.0, which shows
> Speedometer 2.0-r221659:
>   46.36, 45.55, 45.95, 45.45, 45.97
> and without the patch:
>   43.91, 43.73, 43.68, 44.59, 43.40

The numbers on Windows with the patch are:

  43.87, 43.16, 43.59, 43.53, 43.41

m-c:

  41.99, 41.30, 41.53, 41.46, 41.98
Comment on attachment 8905421 [details] [diff] [review]
experiment patch v2

Thanks for the preliminary investigation here, but I don't think this is the right strategy to remove this overhead.

Normally, LifoAllocs *do* keep BumpChunks around during normal use, via mark()+release().  freeAll() is explicitly intended to release memory, with the goal of avoiding memory that stays permanently allocated in a long-lived LifoAlloc and never gets given back.  In particular, if you have a one-time spike in usage, you don't want that to hang around forever.

So if you're seeing freeAll() hot in profiles, the solutions that make sense to me are:
 - can we call freeAll() less aggressively (e.g., wait until a GC and then use the GC's background free thread to do the actual free() calls)?
 - can we create some freeExcessAllocation() that just frees down to a moderate allocation, so that in most cases we're not freeing anything but we still avoid permanently wasting memory because of a one-time spike in usage?

Jan, WDYT?
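To make the second option concrete, here is a minimal sketch under made-up assumptions (the real js::LifoAlloc has no freeExcessAllocation(), and its chunk list is more involved than this): keep up to `budget` bytes of chunks for reuse and free the rest, so a one-time spike is eventually given back while the common case never reaches free().

// Hypothetical sketch of a freeExcessAllocation()-style method.
#include <cstddef>
#include <cstdlib>

struct Chunk {
  Chunk* next;
  size_t size;
};

struct SimpleLifoAlloc {
  Chunk* chunks = nullptr;

  void freeExcessAllocation(size_t budget) {
    size_t retained = 0;
    Chunk** link = &chunks;
    while (Chunk* chunk = *link) {
      if (retained + chunk->size <= budget) {
        retained += chunk->size;
        link = &chunk->next;   // keep this chunk for reuse
      } else {
        *link = chunk->next;   // unlink and release the excess
        std::free(chunk);
      }
    }
  }
};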
Flags: needinfo?(jdemooij)
Attachment #8905421 - Flags: feedback?(luke) → feedback-
A good start to this would be to collect the total size of these buffers for all compiles across a browsing session, look for a good 80th or 90th percentile cutoff, and, if that's reasonable, use that as the "standard" size.
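For illustration only, computing that cutoff from collected per-compilation sizes could be as simple as the following (collectedSizes is a hypothetical name for whatever telemetry gathers the data):

#include <algorithm>
#include <cstddef>
#include <vector>

// Sort the observed sizes and pick the value at the requested percentile.
size_t percentileCutoff(std::vector<size_t> sizes, double percentile) {
  if (sizes.empty()) {
    return 0;
  }
  std::sort(sizes.begin(), sizes.end());
  size_t index = static_cast<size_t>(percentile * (sizes.size() - 1));
  return sizes[index];
}

// e.g. size_t standardSize = percentileCutoff(collectedSizes, 0.90);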
Whiteboard: [qf] → [qf:p2]
(In reply to Luke Wagner [:luke] from comment #6)
> freeAll() is explicitly intended to release memory
> with the goal of avoiding permanently keeping memory allocated in a
> long-lived LifoAllocs that never gets given back. 

I think the problem here is the Ion compilation LifoAlloc and not freeAll() on long-lived LifoAllocs.

What we could do is have a free-list of 1 or 2 Ion LifoAllocs and then reuse them (making sure we don't waste a lot of memory, of course). (Luke and I were actually discussing this a while ago.)
Flags: needinfo?(jdemooij)
(In reply to Jan de Mooij [:jandem] from comment #8)
> What we could do is have a free-list of 1 or 2 Ion LifoAllocs and then reuse
> them (making sure we don't waste a lot memory, of course). (Luke and I were
> actually discussing this a while ago.)

Yeah, I still like this idea (just haven't gotten around to it yet).  (Except I was imagining a pool of up to numCpus LifoAllocs, culled during GC.)
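A sketch of what that pool could look like, with made-up names (LifoAllocPool, cullDuringGC) and a forward-declared LifoAlloc standing in for js::LifoAlloc: hold up to numCpus allocators for reuse between Ion compilations and hand them all back for freeing at GC time.

// Hypothetical sketch; not an existing SpiderMonkey class.
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

class LifoAlloc;  // stand-in for js::LifoAlloc

class LifoAllocPool {
  std::vector<LifoAlloc*> pool_;
  size_t maxEntries_;

 public:
  LifoAllocPool()
      : maxEntries_(std::max(1u, std::thread::hardware_concurrency())) {}

  // Reuse a pooled allocator for the next compilation, if one is available.
  LifoAlloc* take() {
    if (pool_.empty()) {
      return nullptr;  // caller allocates a fresh LifoAlloc
    }
    LifoAlloc* lifo = pool_.back();
    pool_.pop_back();
    return lifo;
  }

  // Return an allocator after a compilation; refuse it if the pool is full.
  bool put(LifoAlloc* lifo) {
    if (pool_.size() >= maxEntries_) {
      return false;  // caller frees it (ideally off-thread)
    }
    pool_.push_back(lifo);
    return true;
  }

  // Culled during GC: hand everything back so the GC's background free
  // thread can release it.
  std::vector<LifoAlloc*> cullDuringGC() {
    std::vector<LifoAlloc*> culled;
    culled.swap(pool_);
    return culled;
  }
};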
Summary: Consider to keep unused BumpChunks when LifoAlloc::freeAll because freeing it is not cheap → Consider to keep unused Ion LifoAllocs because freeing it is not cheap
Keeping Ion LifoAllocs means the BumpChunks in different LifoAllocs can't be shared, even when they're unused.
Priority: -- → P3
Assignee: janus926 → nobody
Keywords: perf
Whiteboard: [qf:p2] → [qf:i60][qf:p1]
Whiteboard: [qf:i60][qf:p1] → [qf:f60][qf:p1]
Whiteboard: [qf:f60][qf:p1] → [qf:f61][qf:p1]
Whiteboard: [qf:f61][qf:p1] → [qf:f64][qf:p1]
Jan: Do you know if this is still an issue, or if it was already fixed in another way (or separately)?  Luke thought there was a chance this could have been fixed already.
Flags: needinfo?(jdemooij)
(In reply to Steven DeTar [:sdetar] from comment #11)
> Jan: Do you know if this is still a issue or if it was already fixed in
> another way (or seperately)?  Luke thought there was maybe a chance this
> could have been fixed already.

The fact that we do not have a pool of LifoAlloc chunks is still an issue.

Also, the current patch is outdated, since I rewrote the BumpChunk structure to ensure that we have a single owner.

I think a simpler solution would be to transfer the unused compilation chunks to the next Ion compilation, and reclaim any unused chunks at the end of each compilation.
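As a rough illustration of that idea, with made-up types (CompilationArena and takeUnusedChunksFrom are not the real IonBuilder/LifoAlloc interfaces): when a compilation finishes, its never-used chunks move into the arena of the next compilation, and only the chunks that actually held compilation data are released.

// Hypothetical sketch of handing spare chunks to the next compilation.
#include <cstddef>
#include <cstdlib>

struct Chunk {
  Chunk* next = nullptr;
};

struct CompilationArena {
  Chunk* used = nullptr;    // chunks holding this compilation's data
  Chunk* unused = nullptr;  // chunks allocated but never written to

  void takeUnusedChunksFrom(CompilationArena& finished) {
    // Keep the finished compilation's spare chunks for our own allocations.
    while (Chunk* chunk = finished.unused) {
      finished.unused = chunk->next;
      chunk->next = unused;
      unused = chunk;
    }
    // Reclaim the chunks that actually held the finished compilation's data.
    while (Chunk* chunk = finished.used) {
      finished.used = chunk->next;
      std::free(chunk);
    }
  }
};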
Flags: needinfo?(jdemooij)
We now free the Ion LifoAllocs off-thread so that at least avoids the problem in comment 0. I agree it might be nice to reuse (a small number of) allocators or pages.
Whiteboard: [qf:f64][qf:p1] → [qf:p1:f64]
I think Bug 1489572 might be a better approach than reusing existing LifoAlloc pages, as it reduces the number of BumpChunk allocations.

This should remove the VirtualFree, but it would keep anything that iterates over the pages to poison them.
I've been doing some profiling work today, re-evaluating some old QF bugs.  As :nbp noted earlier in a meeting today, these _are_ showing up in profiles in a reasonably significant way.

I left my notes at work, so I'll post an update tomorrow with details of where I observed it.
When I was investigating LifoAlloc-related crashes, I noticed that TI, IonMonkey, and the parser were the main consumers. TI is probably the worst offender, as it will create a new LifoAlloc and move everything to the new one in order to "sweep" the unused data. Using Bug 1489572 will probably solve this issue.

Another aspect of TI's use of LifoAlloc is that it relies on resizable data structures, which can waste a lot of memory.
Flags: needinfo?(nicolas.b.pierron)
Assignee: nobody → nicolas.b.pierron
Whiteboard: [qf:p1:f64] → [qf:p2]
As expected, Bug 1489572 reduces the number of chunk allocations by a factor of 1000 for small block sizes such as TI's, and should correspondingly reduce the number of frees we have to do.

Once Bug 1489572 is unblocked, we should re-evaluate this issue to verify that the signature reported in comment 0 has disappeared as expected.
Flags: needinfo?(nicolas.b.pierron)
(In reply to Nicolas B. Pierron [:nbp] from comment #17)
> Once Bug 1489572 is unblocked, we should re-evaluate this issue to verify
> that the signature reported in comment 0 has disappeared as expected.

Needinfo bas to re-evaluate: as I recall, you used VTune recently and might be able to reproduce the analysis described in comment 0.

If this signature disappeared or is not interesting anymore, then we should close this issue as a duplicate of Bug 1489572.
Flags: needinfo?(bas)

I don't have current data on this.

Flags: needinfo?(bas)
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE
Performance Impact: --- → P2
Whiteboard: [qf:p2]