Closed Bug 1489572 Opened 6 years ago Closed 6 years ago

LifoAlloc: double chunk size on each allocation

Categories

(Core :: JavaScript Engine, defect, P1)

Tracking

RESOLVED FIXED
mozilla66
Performance Impact high
Tracking Status
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 --- wontfix
firefox66 --- fixed

People

(Reporter: jorendorff, Assigned: nbp)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(8 files, 9 obsolete files)

6.46 KB, text/plain (Details)
10.39 KB, patch, tcampbell: review+ (Details | Diff | Splinter Review)
20.50 KB, patch, tcampbell: review+ (Details | Diff | Splinter Review)
6.83 KB, patch, tcampbell: review+ (Details | Diff | Splinter Review)
1.90 KB, patch, jorendorff: review+ (Details | Diff | Splinter Review)
26.09 KB, patch, decoder: feedback+ (Details | Diff | Splinter Review)
47 bytes, text/x-phabricator-request (Details | Review)
756 bytes, patch, tcampbell: review+ (Details | Diff | Splinter Review)
The initial compilation of self-hosted code results in the allocation of 722 chunks. Not a big deal. But it does seem a bit silly. The real reason to care about this is that I'd like to add an assertion that all ParseNodes seen by the BytecodeEmitter are actually located in one of the LifoAlloc's chunks. This is easier if there are fewer of them.
Flags: needinfo?(nicolas.b.pierron)
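For illustration, here is a toy sketch of the growth policy the summary asks for. The names are made up and this is not SpiderMonkey's LifoAlloc; it only shows that doubling each new chunk keeps the chunk count roughly logarithmic in the total bytes allocated, so a workload that currently needs hundreds of fixed-size chunks would need far fewer.

  #include <cstddef>
  #include <cstdlib>
  #include <vector>

  // Toy bump allocator (hypothetical names, alignment handling omitted)
  // showing geometric chunk growth: each new chunk is twice as large as the
  // previous one, so the chunk count grows with the log of the total size.
  class ToyLifoAlloc {
    struct Chunk {
      char* base;
      size_t size;
      size_t used;
    };
    std::vector<Chunk> chunks_;
    size_t nextChunkSize_;

   public:
    explicit ToyLifoAlloc(size_t firstChunkSize) : nextChunkSize_(firstChunkSize) {}
    ~ToyLifoAlloc() {
      for (Chunk& c : chunks_) {
        std::free(c.base);
      }
    }

    void* alloc(size_t n) {
      if (chunks_.empty() || chunks_.back().size - chunks_.back().used < n) {
        size_t size = nextChunkSize_ < n ? n : nextChunkSize_;
        char* base = static_cast<char*>(std::malloc(size));
        if (!base) {
          return nullptr;
        }
        chunks_.push_back(Chunk{base, size, 0});
        nextChunkSize_ *= 2;  // the doubling this bug is about
      }
      Chunk& c = chunks_.back();
      void* p = c.base + c.used;
      c.used += n;
      return p;
    }

    size_t chunkCount() const { return chunks_.size(); }
  };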
doesn't do doubling, but does round up to the jemalloc chunk size (which above 4K is the next multiple of 4K)
Attachment #9007384 - Flags: review?(jorendorff)
Assignee: nobody → rjesup
Status: NEW → ASSIGNED
Silly not to at least round to the amount of memory that actually will be allocated for it
Comment on attachment 9007384 [details] [diff] [review]
round LifoAlloc sizes up to the jemalloc chunk size

Review of attachment 9007384 [details] [diff] [review]:
-----------------------------------------------------------------

This is a good idea. Clearing the review flag, because the change needs to be made in a different method.

::: js/src/ds/LifoAlloc.cpp
@@ +268,2 @@
>     // Allocate a new BumpChunk with enough space for the next allocation.
> +   UniqueBumpChunk newChunk = newChunkWithCapacity(size);

newChunkWithCapacity itself contains some code to round `size` up, so please merge this with the logic that's already there. (`BumpChunk::newWithCapacity(size)` does, finally, pass `size` directly to `js_malloc`.)
Attachment #9007384 - Flags: review?(jorendorff)
This is the output of starting the browser, going to arstechnica, then loading 1 article, with a small patch to dump out LifoAlloc sizes from newChunkWithCapacity, run through 'sort -n -k 3 | uniq -c'. (So each line is prefixed by the number of times it occurs.) ^2 is the result of RoundUpPow2(allocSizeWithCanaries), and "final:" is what you get if you call MallocGoodSize() on that. If it says "final (default):" it's defaultChunkSize_.

I'll note that no chunk has a final size (^2 size) of <4K, and all are ^2 at or above 4K, even without MallocGoodSize, and MallocGoodSize would only apply in a few cases (and there we're always ^2 anyways).

The 4096 sizes seem to come from:
JSContext.h: static const size_t TEMP_LIFO_ALLOC_PRIMARY_CHUNK_SIZE = 4 * 1024;
and ProtectedRegionTree(): alloc(4096, false),
and vm/Stack.h: static const size_t DEFAULT_CHUNK_SIZE = 4 * 1024;
and:
jit/IonControlFlow.h: static const size_t DEFAULT_CHUNK_SIZE = 4096;
jit/ICStubSpace.h: static const size_t STUB_DEFAULT_CHUNK_SIZE = 4096;
jit/ICStubSpace.h: static const size_t STUB_DEFAULT_CHUNK_SIZE = 4096;

So given the above, what change are you requesting here?
Flags: needinfo?(jorendorff)
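For reference, a rough model of the size pipeline described in the dump above: the requested size (with canaries) is rounded up to a power of two, and the allocator's good-size rounding above 4 KiB goes to the next multiple of 4 KiB. These helpers are illustrative approximations, not the real RoundUpPow2 or MallocGoodSize; note that once a size is a power of two at or above 4 KiB, the good-size rounding is a no-op, which is the point made above.

  #include <cstddef>
  #include <cstdio>

  // Approximation of RoundUpPow2: smallest power of two >= n (for n > 0).
  static size_t RoundUpPow2Approx(size_t n) {
    size_t p = 1;
    while (p < n) {
      p <<= 1;
    }
    return p;
  }

  // Approximation of jemalloc's good-size rounding for sizes above 4 KiB:
  // round up to the next multiple of 4 KiB (small size classes not modeled).
  static size_t MallocGoodSizeApprox(size_t n) {
    const size_t page = 4 * 1024;
    if (n <= page) {
      return n;
    }
    return (n + page - 1) & ~(page - 1);
  }

  int main() {
    const size_t requests[] = {3000, 4096, 5000, 20000};
    for (size_t req : requests) {
      size_t pow2 = RoundUpPow2Approx(req);
      printf("request %zu -> ^2 %zu -> final %zu\n", req, pow2, MallocGoodSizeApprox(pow2));
    }
    return 0;
  }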
Assignee: rjesup → nicolas.b.pierron
Flags: needinfo?(nicolas.b.pierron)
Whiteboard: [qf:p1:f64]
Depends on: 1491413
Attached patch Add LifoAlloc::free_ no-op (obsolete) — Splinter Review
This is a small patch to add free_ no-op on LifoAlloc structure. Mostly made to remove noise from the upcoming patches.
Attachment #9009948 - Flags: review?(tcampbell)
When testing with extra assertions, I noticed that AsmJS creates extra LifoAlloc::Marks but never releases the content of the LifoAlloc on errors, which implies that we could potentially pile up memory in LifoAlloc chunks.
Attachment #9009949 - Flags: review?(bbouvier)
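A minimal sketch of the pattern being fixed, with a stand-in allocator; the real LifoAlloc/Mark API differs and the function names are placeholders. The point is that the mark must also be released on the error path, otherwise failed validation keeps its temporary allocations alive.

  #include <cstddef>
  #include <vector>

  // Minimal stand-in for a LIFO allocator with mark/release; it only
  // illustrates balancing mark() and release().
  struct MiniLifo {
    std::vector<size_t> entries;  // pretend each entry is an allocation
    size_t mark() const { return entries.size(); }
    void release(size_t m) { entries.resize(m); }
    void alloc(size_t n) { entries.push_back(n); }
  };

  static bool ValidateFunctionSketch(MiniLifo& lifo, bool fail) {
    lifo.alloc(4096);  // temporary data built up while validating
    return !fail;
  }

  // Hypothetical shape of the fix: release back to the mark on the error path
  // too, so a failed validation does not keep its temporaries alive.
  static bool CheckFunctionSketch(MiniLifo& lifo, bool fail) {
    size_t mark = lifo.mark();
    if (!ValidateFunctionSketch(lifo, fail)) {
      lifo.release(mark);  // previously missing on the error path
      return false;
    }
    lifo.release(mark);
    return true;
  }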
This patch adds a call to LifoAlloc::free_, such that we can reclaim the memory of large allocations made in LifoAlloc chunks. Note: I am investigating an unhandled OOM on Speedometer which might be related to this change, but I do not see how it would be.
Attachment #9009950 - Flags: review?(jdemooij)
This patch is made such that we do not restrict ourselves to power-of-2 sizes for all allocations. This code will be used in the follow-up patch to allocate space for large allocations, which will default to a best-fit allocation instead of a first-come allocation.
Attachment #9009956 - Flags: review?(tcampbell)
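For context, a sketch of the two policies mentioned, assuming "first-come" means taking the first unused chunk that can hold the request and "best-fit" means taking the smallest one that can; the types here are placeholders, not the LifoAlloc internals.

  #include <cstddef>
  #include <list>

  struct FreeChunk {
    size_t capacity;
  };

  // First-come: return the first unused chunk that fits the request.
  static std::list<FreeChunk>::iterator
  FirstFit(std::list<FreeChunk>& unused, size_t n) {
    for (auto it = unused.begin(); it != unused.end(); ++it) {
      if (it->capacity >= n) {
        return it;
      }
    }
    return unused.end();
  }

  // Best-fit: return the unused chunk whose capacity is closest to the
  // request, which wastes the least space when serving large allocations.
  static std::list<FreeChunk>::iterator
  BestFit(std::list<FreeChunk>& unused, size_t n) {
    auto best = unused.end();
    for (auto it = unused.begin(); it != unused.end(); ++it) {
      if (it->capacity >= n && (best == unused.end() || it->capacity < best->capacity)) {
        best = it;
      }
    }
    return best;
  }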
This patch moves the TempAllocator to be part of the CFGSpace, and calls its destructor before reclaiming memory.
Attachment #9009957 - Flags: review?(jdemooij)
Use a TempAllocator for CFGSpace and properly call its destructor when reclaiming the space.
Attachment #9009958 - Flags: review?(jdemooij)
This patch adds one additional list of live chunks for large allocations. Each allocation larger than the default chunk size is allocated in the new list, with a single allocation per chunk. Having a single allocation per chunk allows us to reclaim the memory when free_ is called. Freeing data in a LifoAlloc structure is not the most recommended operation, and as such I made some effort to prevent freeing data which is still marked. Unfortunately, we cannot yet assert that all marks are released before any call to freeAll. Note: releasing a vector which got reallocated would have been an issue before as well, and these assertions would not prevent such an issue. They would only keep the old memory around, as our LifoAlloc behaves today, instead of attempting to free any allocations which come before any marks.
Attachment #9009959 - Flags: review?(tcampbell)
Double the space reserved for small allocations if we are running out of space, unless there is already a buffer which we can use.
Attachment #9009960 - Flags: review?(tcampbell)
(In reply to Randell Jesup [:jesup] from comment #4)
> so given the above, what change are you requesting here?

I had in mind the change nbp has made in attachment 9009960 [details] [diff] [review].
Flags: needinfo?(jorendorff)
Comment on attachment 9009949 [details] [diff] [review] AsmJS CheckFunction: Reclaim LifoAlloc space on errors. Review of attachment 9009949 [details] [diff] [review]: ----------------------------------------------------------------- Good catch, thanks for fixing.
Attachment #9009949 - Flags: review?(bbouvier) → review+
Depends on: 1496402
Comment on attachment 9009960 [details] [diff] [review]
LifoAlloc: Double the size of the small-space to reduce the number of chunks.

Review of attachment 9009960 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ds/LifoAlloc.cpp
@@ +172,5 @@
>     // for the next allocation.
>     minSize = MallocGoodSize(minSize);
>     const size_t chunkSize = fit || minSize > defaultChunkSize_
>         ? minSize
> +       : Max((curSize_ - reallocCurSize_) * 2, defaultChunkSize_);

self-nit: no need for the *2, as this would triple the current size.
Priority: -- → P2
Comment on attachment 9009949 [details] [diff] [review] AsmJS CheckFunction: Reclaim LifoAlloc space on errors. I ran into the same issue in fuzz bug 1493475 and ended up with the same solution, which has since landed. Marking this as obsolete.
Attachment #9009949 - Attachment is obsolete: true
Maybe LifoAlloc::mark() should return an RAII auto-releasing object that one could manually get Mark objects out of if you really needed to manage Mark lifetimes, rather than people having to remember to call release()?
(In reply to Nathan Froyd [:froydnj] from comment #17)
> Maybe LifoAlloc::mark() should return an RAII auto-releasing object that one
> could manually get Mark objects out of if you really needed to manage Mark
> lifetimes, rather than people having to remember to call release()?

Ideally yes. Unfortunately we have more groundwork to do, such as ensuring that the destructors are effectively called. Some of the LifoAllocScope allocations are made within a LifoAlloc chunk, which does not handle any calls to the destructors, and as such we have no way to ensure that these calls would be balanced at the moment.

I want to change that in the future, but we should first ensure that we do it properly. And balancing these mark() and release(Mark) calls feels less important than implementing this feature at the moment.
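A hypothetical shape for the guard froydnj describes, as a sketch only; per the comment above, it would only be safe once destructors of objects living in LifoAlloc chunks are reliably run.

  #include <cstddef>
  #include <vector>

  // Same kind of stand-in allocator as in the earlier sketch.
  struct MiniLifo {
    std::vector<size_t> entries;
    size_t mark() const { return entries.size(); }
    void release(size_t m) { entries.resize(m); }
  };

  // Hypothetical RAII guard: releases the mark when the scope ends, while
  // still exposing the mark for callers that need to manage it manually.
  class AutoLifoMark {
    MiniLifo& lifo_;
    size_t mark_;

   public:
    explicit AutoLifoMark(MiniLifo& lifo) : lifo_(lifo), mark_(lifo.mark()) {}
    ~AutoLifoMark() { lifo_.release(mark_); }

    size_t mark() const { return mark_; }

    AutoLifoMark(const AutoLifoMark&) = delete;
    AutoLifoMark& operator=(const AutoLifoMark&) = delete;
  };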
Priority: P2 → P1
After discussion with :nbp, the work for this bug will be scoped to:
- Cleanup cold code to LifoAlloc.cpp
- Oversize flag for allocations
  - store in a separate list of chunks
  - always alloc and always free
  - Mark will have to track this list as well
- Doubling up to a limit (1MB or so)

Follow-up:
- Move TempAllocator patches to another bug
- Move LifoAlloc::free to another bug / discussion
- Fix mark/release balancing.
Split hot/cold code into different functions such that the cold code will not get inlined. This should make the generated code smaller and potentially faster too.
Attachment #9017915 - Flags: review?(tcampbell)
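A sketch of the split being described, with placeholder names: the common in-chunk bump allocation stays small enough to inline, while the chunk-creation slow path becomes an out-of-line call (the real patch keeps the hot path in LifoAlloc.h and moves the cold path into LifoAlloc.cpp).

  #include <cstddef>
  #include <cstdlib>

  // Placeholder allocator showing the shape of the hot/cold split; this is
  // not the LifoAlloc API, and chunk bookkeeping is elided.
  class SplitAlloc {
    char* cur_ = nullptr;
    char* end_ = nullptr;

    void* allocColdPath(size_t n);  // out of line: not inlined into callers

   public:
    // Hot path: a bounds check and a pointer bump, small enough to inline.
    void* alloc(size_t n) {
      if (static_cast<size_t>(end_ - cur_) >= n) {
        void* p = cur_;
        cur_ += n;
        return p;
      }
      return allocColdPath(n);
    }
  };

  // Cold path: allocate a fresh chunk (error handling and tracking of old
  // chunks elided), then satisfy the request from it.
  void* SplitAlloc::allocColdPath(size_t n) {
    size_t chunkSize = n < 4096 ? 4096 : n;  // illustrative sizing only
    char* base = static_cast<char*>(std::malloc(chunkSize));
    if (!base) {
      return nullptr;
    }
    cur_ = base + n;
    end_ = base + chunkSize;
    return base;
  }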
Attachment #9007384 - Attachment is obsolete: true
Attachment #9009948 - Attachment is obsolete: true
Attachment #9009950 - Attachment is obsolete: true
Attachment #9009956 - Attachment is obsolete: true
Attachment #9009957 - Attachment is obsolete: true
Attachment #9009958 - Attachment is obsolete: true
Attachment #9009959 - Attachment is obsolete: true
Attachment #9009960 - Attachment is obsolete: true
Attachment #9009948 - Flags: review?(tcampbell)
Attachment #9009950 - Flags: review?(jdemooij)
Attachment #9009956 - Flags: review?(tcampbell)
Attachment #9009957 - Flags: review?(jdemooij)
Attachment #9009958 - Flags: review?(jdemooij)
Attachment #9009959 - Flags: review?(tcampbell)
Attachment #9009960 - Flags: review?(tcampbell)
LifoAlloc bump allocations are now stored in 2 different lists instead of 1. The first list contains small allocations, allocated in small chunks, while the second list holds large allocations, each stored in its own chunk. Splitting the list in two should prevent BumpChunks from being composed mostly of wasted space when a large allocation happens.
Attachment #9017916 - Flags: review?(tcampbell)
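A sketch of the resulting layout, with illustrative names: allocations up to an oversize threshold land in the shared small-chunk list, while anything larger gets a dedicated chunk in a second list, so one big allocation no longer strands the unused tail of a small chunk.

  #include <cstddef>
  #include <cstdlib>
  #include <vector>

  // Illustrative two-list layout only; the real LifoAlloc bump-allocates
  // inside shared small chunks, while this sketch mallocs every request and
  // merely models which list the resulting chunk lands in.
  class TwoListAllocSketch {
    struct Chunk {
      void* base;
      size_t size;
    };
    std::vector<Chunk> small_;     // chunks holding ordinary allocations
    std::vector<Chunk> oversize_;  // one allocation per chunk
    size_t oversizeThreshold_;

   public:
    explicit TwoListAllocSketch(size_t threshold) : oversizeThreshold_(threshold) {}
    ~TwoListAllocSketch() {
      for (Chunk& c : small_) std::free(c.base);
      for (Chunk& c : oversize_) std::free(c.base);
    }

    void* alloc(size_t n) {
      void* base = std::malloc(n);
      if (!base) {
        return nullptr;
      }
      // Allocations above the threshold get a dedicated chunk in oversize_,
      // which can later be freed individually without touching small chunks.
      std::vector<Chunk>& list = n > oversizeThreshold_ ? oversize_ : small_;
      list.push_back(Chunk{base, n});
      return base;
    }
  };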
Add recording of the quantity of oversize allocations, such that the quantity of small allocations can be deduced by taking the difference. As discussed, this implements the heuristic of doubling the size until we reach a factor of 1024 of the original size, at which point chunks of 1024 times the default chunk size are allocated instead of 1024 chunks of the default chunk size.
Attachment #9017920 - Flags: review?(tcampbell)
(In reply to Nicolas B. Pierron [:nbp] from comment #22)
> As discussed, this implements the heuristic of doubling the size until we
> reach a factor of 1024 of the original size, at which point chunks of 1024
> times the default chunk size are allocated instead of 1024 chunks of the
> default chunk size.

If we start with 4 KB, we will double until 4 MB, right? I'm a bit worried about wasting a lot of (unused) space and this backfiring on 32-bit with its limited/fragmented virtual address space. Is one 4 MB chunk instead of 2 separate 2 MB chunks really a measurable win?
(In reply to Jan de Mooij [:jandem] from comment #23)
> (In reply to Nicolas B. Pierron [:nbp] from comment #22)
> > As discussed, this implements the heuristic of doubling the size until we
> > reach a factor of 1024 of the original size, at which point chunks of 1024
> > times the default chunk size are allocated instead of 1024 chunks of the
> > default chunk size.
>
> If we start with 4 KB, we will double until 4 MB, right? I'm a bit worried
> about wasting a lot of (unused) space and this backfiring on 32-bit with its
> limited/fragmented virtual address space. Is one 4 MB chunk instead of 2
> separate 2 MB chunks really a measurable win?

If you are going to need 4 MB chunks, that's because you already consumed at least (4 MB - 4 kB). Maybe we can change this heuristic to not double but follow the Fibonacci sequence.

I did observe some crash reports where TI was consuming 64 MB of 4 KB chunks. So, I do not think this would waste much memory compared to our actual memory consumption, and I do think this would reduce the number of frees which are potentially slowing us down.

If this wasted memory becomes an issue to be addressed, then we can change the logic of the NextSize function later, in a follow-up bug. At the moment, the biggest issue is to get this in place, more than fine-tuning these heuristics. If you have time to fine-tune these, great! In my case I have other things to address for 64.
FWIW, for xpcom strings, we double until a certain threshold (8MB), then x1.125 (cheap to compute with shift and add) rounded to the nearest MB. Maybe a similar strategy will work here?
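A sketch of the growth curve described above, under the stated parameters; the function name and the choice to round up (rather than to the nearest MB) are illustrative assumptions, and the actual xpcom string code differs in detail.

  #include <cstddef>

  // Grows the capacity by doubling below 8 MB, then by 1.125x (shift and
  // add), rounded up here to a whole number of MB.
  static size_t NextCapacitySketch(size_t current) {
    const size_t mb = 1024 * 1024;
    const size_t threshold = 8 * mb;
    if (current < threshold) {
      return current * 2;
    }
    size_t grown = current + (current >> 3);  // x1.125 via shift and add
    return (grown + mb - 1) & ~(mb - 1);      // round up to the next MB boundary
  }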
Comment on attachment 9017915 [details] [diff] [review]
LifoAlloc: Move cold code to LifoAlloc.cpp.

Review of attachment 9017915 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ds/LifoAlloc.cpp
@@ +321,5 @@
>     other->decrementCurSize(size);
> }
> +
> +void
> +LifoAlloc::setReadOnly()

self-nit: #ifdef LIFO_CHUNK_PROTECT
Comment on attachment 9017916 [details] [diff] [review]
Split LifoAlloc chunks in 2 lists.

Review of attachment 9017916 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/StoreBuffer.h
@@ +155,5 @@
>     storage_ = js_new<LifoAlloc>(LifoAllocBlockSize);
> +   // This prevents LifoAlloc::Enum to crash with a release
> +   // assertion if we ever allocate one entry larger than
> +   // LifoAllocBlockSize.
> +   storage_->setOversizeThreshold(SIZE_MAX);

self-nit: Check the returned value before accessing.
Comment on attachment 9017915 [details] [diff] [review] LifoAlloc: Move cold code to LifoAlloc.cpp. Review of attachment 9017915 [details] [diff] [review]: ----------------------------------------------------------------- r+ with your macro nit fixed
Attachment #9017915 - Flags: review?(tcampbell) → review+
Comment on attachment 9017916 [details] [diff] [review]
Split LifoAlloc chunks in 2 lists.

Review of attachment 9017916 [details] [diff] [review]:
-----------------------------------------------------------------

Having a list for singleton chunks makes a lot of sense to me. I wonder if we should investigate using setOversizeThreshold() on a few consumers of LifoAlloc to control heuristics around fragmentation.

I want the hack in the StoreBuffer to be fixed with a better abstraction before landing.

::: js/src/ds/LifoAlloc.cpp
@@ +159,2 @@
> LifoAlloc::UniqueBumpChunk
> +LifoAlloc::newChunkWithCapacity(size_t n, bool oversize)

Ideally we would have MOZ_ASSERT_IF(!oversize, n < oversizeThreshold_), but I'm not sure we can get away with it in all cases due to allocSizeWithReadZone.

@@ +170,5 @@
> {
>     return nullptr;
> }
>
> + const size_t chunkSize = oversize || minSize > defaultChunkSize_

Nit: parentheses around (oversize || minSize > defaultChunkSize_)

@@ +222,5 @@
> LifoAlloc::allocImplColdPath(size_t n)
> {
>     void* result;
> +   UniqueBumpChunk newChunk = getOrCreateChunk(n);
> +   if (!newChunk) {

I like how this cleanup worked out.

@@ +281,2 @@
> }
> + detail::BumpChunk::Mark oversizeMark;

I think this line is dead code.

@@ +287,4 @@
> }
>
> void
> LifoAlloc::release(Mark mark)

The duplicated code is getting long. Maybe we can have a helper function or lambda here.

::: js/src/ds/LifoAlloc.h
@@ +534,4 @@
>     BumpChunkList chunks_;
>
> +   // List of chunks containing allocated data where each allocation is larger
> +   // than the default chunk size. Each chunk in this list contains no more

- larger than oversize threshold
- each chunk contains exactly one allocation
- This reduces wasted space in the normal chunk list

@@ +596,5 @@
>
>     MOZ_ALWAYS_INLINE
>     void* allocImpl(size_t n) {
>         void* result;
> +       if (MOZ_UNLIKELY(n > oversizeThreshold_)) {

Probably would be good to have a comment re-iterating this heuristic. Perhaps something like:
// Give oversized allocations their own chunk instead of wasting space due to fragmentation at the end of a normal chunk.

@@ +621,5 @@
>
> +   // Set the threshold to allocate data in its own chunk outside the space for
> +   // small allocations.
> +   void setOversizeThreshold(size_t oversizeThreshold) {
> +       oversizeThreshold_ = oversizeThreshold;

I'm wondering if we should MOZ_ASSERT(oversizeThreshold <= defaultChunkSize, "Allocations larger than defaultChunkSize should always be oversize");

@@ +796,5 @@
> +   bool empty = chunks_.empty() ||
> +                (chunks_.begin() == chunks_.last() && chunks_.last()->empty());
> +   MOZ_ASSERT_IF(!oversize_.empty(), !oversize_.last()->empty());
> +   empty = empty && oversize_.empty();
> +   return empty;

Nit: Directly |return empty && oversize_.empty()|

@@ +909,5 @@
>     : chunkIt_(alloc.chunks_.begin()),
>       chunkEnd_(alloc.chunks_.end()),
>       head_(nullptr)
> {
> +   MOZ_RELEASE_ASSERT(alloc.oversize_.empty());

Good idea!

::: js/src/gc/StoreBuffer.h
@@ +152,5 @@
>     MOZ_MUST_USE bool init() {
>         MOZ_ASSERT(!head_);
>         if (!storage_) {
>             storage_ = js_new<LifoAlloc>(LifoAllocBlockSize);
> +           // This prevents LifoAlloc::Enum to crash with a release

s/to crash/from crashing/

@@ +155,5 @@
>     storage_ = js_new<LifoAlloc>(LifoAllocBlockSize);
> +   // This prevents LifoAlloc::Enum to crash with a release
> +   // assertion if we ever allocate one entry larger than
> +   // LifoAllocBlockSize.
> +   storage_->setOversizeThreshold(SIZE_MAX);

Add a disableOversize() method so that this mechanism is inside LifoAlloc instead of in the StoreBuffer. We could then consider asserting in Enum that disableOversize was set (by checking SIZE_MAX).
Attachment #9017916 - Flags: review?(tcampbell) → review+
Looking at the try jobs, the AWSY results seem problematic (25% explicit memory regression). I'm going to fire off a few try runs with and without the patches to see what is going on.
I did some try runs to compare with and without the patches. For AWSY, this heuristic change gives an 8.5% regression in JS heap size. It is looking like we should do more investigation and push back to FF65. I think we need better measurements on how many LifoAllocs are typically live, what unused-but-available space exists in large chunks (and the unused_ list), and what the total size of LifoAlloc becomes.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e2e3de91e6bd6b735bb6bedd70a0d9a2f14a486a
https://treeherder.mozilla.org/#/jobs?repo=try&revision=65ce0d32b4c365a7a25951db738e2237c573e4b6
(In reply to Nicolas B. Pierron [:nbp] from comment #24)
> I did observe some crash reports where TI was consuming 64 MB of 4 KB
> chunks. So, I do not think this would waste much memory compared to our
> actual memory consumption,

I admit 64 MB of 4 KB chunks is a bit silly, but switching to 8 MB chunks (TYPE_LIFO_ALLOC_PRIMARY_CHUNK_SIZE = 8 KB) at a time *will* leave some unused *megabytes* in most cases. These allocations are also more likely to fail on 32-bit systems. Maybe all that is fine and worth it, but the Try results do suggest it's a valid concern.
I have made 2 try pushes, one with the current set of patches:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9a9344e687b4a1935217fe1bb25d7b78af33737e

And another with Nathan's suggestion (comment 25):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=69585aca7bea588ee12514948ce6b87037ae2e03

(In reply to Ted Campbell [:tcampbell] from comment #31)
> I did some try runs to compare with and without the patches. For AWSY, this
> heuristic change gives an 8.5% regression in JS heap size. It is looking
> like we should do more investigation and push back to FF65.

I do not see any JS heap size increase in the AWSY reports from my try pushes; I only see "Base Content", which is bimodal. My guess would be that we are not able to reclaim some memory when a GC is triggered, and one of the LifoAllocs keeps 7 MB of unused chunks.

I will also note that looking at the graph over the last 14 days seems to highlight that this is a Try issue related to AWSY:
https://treeherder.mozilla.org/perf.html#/graphs?series=try,1706923,1,4&series=mozilla-central,1707015,1,4&highlightAlerts=0

Therefore, I do not think we should worry about these AWSY results unless they appear on mozilla-central.
I'm not comfortable with being the one to sign off on this change so close to merge. I'm fine with the correctness of the patches, if there are others who have a better feel for perf vs memory tradeoffs. I would strongly like to push this back to next week so we can see how it behaves on nightly in practice, especially with the talk that NextSize may need heuristic tuning.
Depends on: 1500176
Comment on attachment 9018333 [details] [diff] [review] Try Nathan heuristics Review of attachment 9018333 [details] [diff] [review]: ----------------------------------------------------------------- Note: I will change the patch message before pushing it.
Attachment #9018333 - Flags: review?(jorendorff)
Comment on attachment 9017920 [details] [diff] [review] Adaptively double LifoAlloc chunks to reduce the number of chunks. Review of attachment 9017920 [details] [diff] [review]: ----------------------------------------------------------------- Chatting with nbp, I'll mark this r+ if we leave the NextSize as |return start| and split the heuristic off to another patch. The NextSize trick here seems like a reasonable way to organize our scaling code.
Attachment #9017920 - Flags: review?(tcampbell) → review+
Comment on attachment 9018333 [details] [diff] [review]
Try Nathan heuristics

Review of attachment 9018333 [details] [diff] [review]:
-----------------------------------------------------------------

r=me on the idea generally.

::: js/src/ds/LifoAlloc.cpp
@@ +161,4 @@
> }
>
> static size_t
> NextSize(size_t start, size_t used)

This code was hard for me to read, so I rewrote it. This produces a very similar sequence, not identical:

// Heuristic to choose the size of the next BumpChunk for small allocations.
// `start` is the size of the first chunk. `used` is the total size of all
// BumpChunks in this LifoAlloc so far.
static size_t
NextSize(size_t start, size_t used)
{
    // Double the size, up to 1 MB.
    const size_t mb = 1 * 1024 * 1024;
    if (used < mb) {
        return Max(start, used);
    }

    // After 1 MB, grow more gradually, to waste less memory.
    // The sequence (in megabytes) begins:
    // 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, ...
    return JS_ROUNDUP(used / 8, mb);
}
Attachment #9018333 - Flags: review?(jorendorff) → review+
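For illustration, a small standalone driver (not part of any patch) that prints the sequence the proposed NextSize produces, assuming a 4 KB starting chunk; JS_ROUNDUP is replaced with an explicit round-up so the snippet compiles on its own.

  #include <algorithm>
  #include <cstddef>
  #include <cstdio>

  // Same heuristic as in the review comment above, with JS_ROUNDUP expanded
  // so this compiles outside the tree.
  static size_t NextSize(size_t start, size_t used) {
    const size_t mb = 1 * 1024 * 1024;
    if (used < mb) {
      return std::max(start, used);
    }
    return ((used / 8) + mb - 1) / mb * mb;
  }

  int main() {
    const size_t start = 4 * 1024;
    size_t used = 0;
    for (int i = 0; i < 20; i++) {
      size_t next = NextSize(start, used);
      printf("chunk %2d: %8zu bytes (total %9zu)\n", i, next, used + next);
      used += next;
    }
    return 0;
  }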
(In reply to Ted Campbell [:tcampbell] from comment #29)
> ::: js/src/ds/LifoAlloc.cpp
> @@ +159,2 @@
> > LifoAlloc::UniqueBumpChunk
> > +LifoAlloc::newChunkWithCapacity(size_t n, bool oversize)
>
> Ideally we would have MOZ_ASSERT_IF(!oversize, n < oversizeThreshold_), but
> I'm not sure we can get away with it in all cases due to
> allocSizeWithReadZone.

I was not able to add this assertion, as the trampolines created in the JitRuntime make use of a TempAllocator initialized with the JSContext LifoAlloc. The assertion fails because the TempAllocator uses the 16 KB ensureBallast calls, while the LifoAlloc of the JSContext is initialized with a defaultChunkSize of 4 KB.
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f5973de3f6e8
LifoAlloc: Move cold code to LifoAlloc.cpp. r=tcampbell
https://hg.mozilla.org/integration/mozilla-inbound/rev/683e3e0eaee1
Split LifoAlloc chunks in 2 lists. r=tcampbell
https://hg.mozilla.org/integration/mozilla-inbound/rev/cbfcb4b2b5c9
Add LifoAlloc instrumentation for adding heuristics. r=tcampbell
https://hg.mozilla.org/integration/mozilla-inbound/rev/f3aa68c506e0
Add LifoAlloc heuristic to increase the reserved space based on the consumed space. r=jorendorff
Backed out for valgrind failures at js::jit::LIRGenerator::visitBlock Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&searchStr=linux%2Cx64%2Copt%2Cvalgrind-linux64-valgrind%2Fopt%2C%28v%29&fromchange=f3aa68c506e01055a7d7d20eb4fb672c511b9caf&tochange=58bcea5ede75c573fc8b8b20380b17d19b6f50ca&selectedJob=207271995 Failure log: https://hg.mozilla.org/integration/mozilla-inbound/rev/58bcea5ede75c573fc8b8b20380b17d19b6f50ca Backout link: https://hg.mozilla.org/integration/mozilla-inbound/rev/58bcea5ede75c573fc8b8b20380b17d19b6f50ca [task 2018-10-23T14:48:49.551Z] 14:48:49 INFO - 0:33.48 TEST-UNEXPECTED-FAIL | valgrind-test | Conditional jump or move depends on uninitialised value(s) at js::jit::LIRGenerator::visitBlock / js::jit::LIRGenerator::generate / js::jit::GenerateLIR / js::jit::CompileBackEnd [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== Conditional jump or move depends on uninitialised value(s) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== at 0xFEF30DC: js::jit::LIRGenerator::visitBlock(js::jit::MBasicBlock*)+204 (Lowering.cpp:5324) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFEF3344: js::jit::LIRGenerator::generate()+180 (Lowering.cpp:5398) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE09223: js::jit::GenerateLIR(js::jit::MIRGenerator*)+403 (Ion.cpp:1784) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE09C6D: js::jit::CompileBackEnd(js::jit::MIRGenerator*)+29 (Ion.cpp:1889) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x100C709F: js::HelperThread::handleIonWorkload(js::AutoLockHelperThreadState&)+191 (HelperThreads.cpp:2199) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x100C69E5: js::HelperThread::threadLoop()+325 (HelperThreads.cpp:2643) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x100CACCE: callMain<0> (Thread.h:243) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x100CACCE: js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::Start(void*)+14 (Thread.h:236) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x4E3BB4F: start_thread+207 (pthread_create.c:304) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x5CDCFBC: clone+108 (clone.S:112) [task 2018-10-23T14:48:49.552Z] 14:48:49 INFO - 0:33.48 ==3591== Uninitialised value was created by a client request [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== at 0xFE0B452: setBump (LifoAlloc.h:329) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE0B452: tryAlloc (LifoAlloc.h:472) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE0B452: allocImpl (LifoAlloc.h:608) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE0B452: alloc (LifoAlloc.h:669) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE0B452: new_<js::jit::IonBuilder, JSContext *, js::jit::CompileRealm *, const js::jit::JitCompileOptions &, js::jit::TempAllocator *&, js::jit::MIRGraph *&, js::CompilerConstraintList *&, js::jit::BaselineInspector *&, js::jit::CompileInfo *&, const js::jit::OptimizationInfo *&, js::jit::BaselineFrameInspector *&> (LifoAlloc.h:867) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE0B452: IonCompile (Ion.cpp:2110) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE0B452: js::jit::Compile(JSContext*, JS::Handle<JSScript*>, 
js::jit::BaselineFrame*, unsigned char*, bool)+3250 (Ion.cpp:2439) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE0BD7E: BaselineCanEnterAtEntry (Ion.cpp:2564) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE0BD7E: js::jit::IonCompileScriptForBaseline(JSContext*, js::jit::BaselineFrame*, unsigned char*)+286 (Ion.cpp:2697) [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x3CD593301691: ??? [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x3CD593307700: ??? [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x16F792E7: ??? [task 2018-10-23T14:48:49.553Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x3CD593307700: ??? [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x1718D6E7: ??? [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x3CD5932FF4C1: ??? [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE6F8EE: EnterJit (Jit.cpp:103) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFE6F8EE: js::jit::MaybeEnterJit(JSContext*, js::RunState&)+910 (Jit.cpp:170) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103E9A8E: Interpret(JSContext*, js::RunState&)+35278 (Interpreter.cpp:3505) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103E0E1B: js::RunScript(JSContext*, js::RunState&)+379 (Interpreter.cpp:447) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103F0F03: js::ExecuteKernel(JSContext*, JS::Handle<JSScript*>, JSObject&, JS::Value const&, js::AbstractFramePtr, JS::Value*)+275 (Interpreter.cpp:813) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFCCE04A: ExecuteInExtensibleLexicalEnvironment(JSContext*, JS::Handle<JSScript*>, JS::Handle<JSObject*>)+298 (Eval.cpp:480) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFCCE2BE: js::ExecuteInJSMEnvironment(JSContext*, JS::Handle<JSScript*>, JS::Handle<JSObject*>, JS::AutoVector<JSObject*>&)+254 (Eval.cpp:589) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFCCE18B: js::ExecuteInJSMEnvironment(JSContext*, JS::Handle<JSScript*>, JS::Handle<JSObject*>)+91 (Eval.cpp:544) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xD075306: mozJSComponentLoader::ObjectForLocation(ComponentLoaderInfo&, nsIFile*, JS::MutableHandle<JSObject*>, JS::MutableHandle<JSScript*>, char**, bool, JS::MutableHandle<JS::Value>)+1878 (mozJSComponentLoader.cpp:939) [task 2018-10-23T14:48:49.554Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xD0772D5: mozJSComponentLoader::Import(JSContext*, nsTSubstring<char> const&, JS::MutableHandle<JSObject*>, JS::MutableHandle<JSObject*>, bool)+2357 (mozJSComponentLoader.cpp:1372) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xD0764C3: mozJSComponentLoader::ImportInto(nsTSubstring<char> const&, JS::Handle<JSObject*>, JSContext*, JS::MutableHandle<JSObject*>)+99 (mozJSComponentLoader.cpp:1166) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xD0762F3: mozJSComponentLoader::ImportInto(nsTSubstring<char> const&, JS::Handle<JS::Value>, JSContext*, unsigned char, JS::MutableHandle<JS::Value>)+435 (mozJSComponentLoader.cpp:1032) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xD6E469E: mozilla::dom::ChromeUtils::Import(mozilla::dom::GlobalObject const&, nsTSubstring<char16_t> const&, mozilla::dom::Optional<JS::Handle<JSObject*> > const&, JS::MutableHandle<JSObject*>, 
mozilla::ErrorResult&)+542 (ChromeUtils.cpp:412) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xDC6338C: mozilla::dom::ChromeUtils_Binding::import(JSContext*, unsigned int, JS::Value*)+1644 (ChromeUtilsBinding.cpp:3220) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103EFB3B: CallJSNative (Interpreter.cpp:468) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103EFB3B: js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct)+923 (Interpreter.cpp:560) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103E9DE3: CallFromStack (Interpreter.cpp:620) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103E9DE3: Interpret(JSContext*, js::RunState&)+36131 (Interpreter.cpp:3462) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103E0E1B: js::RunScript(JSContext*, js::RunState&)+379 (Interpreter.cpp:447) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103EFE8A: js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct)+1770 (Interpreter.cpp:587) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x103F03B8: js::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>)+40 (Interpreter.cpp:633) [task 2018-10-23T14:48:49.555Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFFC2025: JS_CallFunctionValue(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>)+1269 (jsapi.cpp:2917) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xD0BAD2A: nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, nsXPTMethodInfo const*, nsXPTCMiniVariant*)+3690 (XPCWrappedJSClass.cpp:1206) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xC9796AD: PrepareAndDispatch+493 (xptcstubs_x86_64_linux.cpp:127) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xC978B1E: SharedStub+90 (in /builds/worker/workspace/build/src/obj-firefox/toolkit/library/libxul.so) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFBFEB24: nsXREDirProvider::DoStartup()+324 (nsXREDirProvider.cpp:1090) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFBF505D: XREMain::XRE_mainRun()+2269 (nsAppRunner.cpp:4607) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFBF5C73: XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&)+947 (nsAppRunner.cpp:4922) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0xFBF6258: XRE_main(int, char**, mozilla::BootstrapConfig const&)+216 (nsAppRunner.cpp:5014) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x10EC8B: do_main (nsBrowserApp.cpp:237) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== by 0x10EC8B: main+795 (nsBrowserApp.cpp:329) [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 ==3591== [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 { [task 2018-10-23T14:48:49.556Z] 14:48:49 INFO - 0:33.48 <insert_a_suppression_name_here> [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 Memcheck:Cond [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:_ZN2js3jit12LIRGenerator10visitBlockEPNS0_11MBasicBlockE [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:_ZN2js3jit12LIRGenerator8generateEv [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 
fun:_ZN2js3jit11GenerateLIREPNS0_12MIRGeneratorE [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:_ZN2js3jit14CompileBackEndEPNS0_12MIRGeneratorE [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:_ZN2js12HelperThread17handleIonWorkloadERNS_25AutoLockHelperThreadStateE [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:_ZN2js12HelperThread10threadLoopEv [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:callMain<0> [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:_ZN2js6detail16ThreadTrampolineIRFvPvEJPNS_12HelperThreadEEE5StartES2_ [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:start_thread [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 fun:clone [task 2018-10-23T14:48:49.557Z] 14:48:49 INFO - 0:33.48 }
Flags: needinfo?(nicolas.b.pierron)
Attached patch to be fuzzed (Splinter Review)
This patch is the aggregation of all other patches which landed. I was unable to reproduce the Valgrind failure reported before, and suspect that fuzzing this patch with an ASan build and running with the following command line options is likely to trigger these issues: --ion-eager --no-threads
Attachment #9020291 - Flags: feedback?(nth10sd)
Attachment #9020291 - Flags: feedback?(choller)
Flags: needinfo?(nicolas.b.pierron)
Valgrind seemed to fail on https://hg.mozilla.org/try/rev/683e3e0eaee1 with the split-into-two-lists patch. Failure is pretty strange.
Whiteboard: [qf:p1:f64] → [qf:p1:f65]
Comment on attachment 9020291 [details] [diff] [review] to be fuzzed I tested this over the weekend and couldn't find anything.
Attachment #9020291 - Flags: feedback?(choller) → feedback+
Comment on attachment 9020291 [details] [diff] [review] to be fuzzed Review of attachment 9020291 [details] [diff] [review]: ----------------------------------------------------------------- Tested this for a day or so and didn't find anything either.
Attachment #9020291 - Flags: feedback?(nth10sd) → feedback+
Trying to run the JS shell with the jit-test suite, increasing the LifoAlloc red-zone size to 128 bytes does not seem to highlight any issues either.
I managed to reproduce this issue with the JS shell compiled with clang 5.0.2:

$ valgrind -q --smc-check=all-non-file --error-exitcode=1 --gen-suppressions=all --show-possibly-lost=no --leak-check=full \
  ./js --ion-eager --ion-offthread-compile=off /dev/null

When looking at the first reported error on visitInstruction, the code:

  MOZ_ASSERT(!errored());

is compiled to:

  d7e340: 0f b7 48 30             movzwl 0x30(%rax),%ecx
  d7e344: 0f b6 50 32             movzbl 0x32(%rax),%edx
  d7e348: c1 e2 10                shl    $0x10,%edx
  d7e34b: 09 ca                   or     %ecx,%edx
  d7e34d: 81 fa ff ff 00 00       cmp    $0xffff,%edx
  d7e353: 0f 86 47 01 00 00       jbe    d7e4a0 <_ZN2js3jit12LIRGenerator16visitInstructionEPNS0_12MInstructionE+0x170>

This is the code generated for checking the value of:

  offThreadStatus_.isErr()

In this case offThreadStatus_ is of type AbortReasonOr<Ok> (i.e. Result<Ok, AbortReason>), which is compiled with the packing strategy:

  ResultImplementation<Ok, AbortReason, PackingStrategy::PackedVariant>

where the data are stored with IsPackableVariant<Ok, AbortReason>::VEbool:

  struct VEbool { V v; E e; bool ok; };

In the above assembly, the code of isErr():

  return !data.ok;

is compiled as the following expression:

  ecx = load.16 [rax + 0x30]  // Load data.v and data.e values (data.e is uninitialized).
  edx = load.8  [rax + 0x32]  // Load data.ok value.
  edx = edx << 16 | ecx       // edx now holds the packed VEbool structure.
  if (edx <= 0xffff)          // Test whether data.ok is 0.

On the other hand, gcc compiles the previous code to:

  c0740a: 0f b6 42 32             movzbl 0x32(%rdx),%eax
  c0740e: 84 c0                   test   %al,%al
  c07410: 0f 84 fa 00 00 00       je     c07510 <_ZN2js3jit12LIRGenerator16visitInstructionEPNS0_12MInstructionE+0x110>

  eax = load.8 [rdx + 0x32]   // Load data.ok value.
  if ((eax & eax) == 0)       // Test whether data.ok is 0.

The problem is that the clang-ism manipulates undefined bits, which seems to confuse Valgrind at some point and raises a false-positive error, at least in the LIRGenerator::visitInstruction case.

To me, we should figure out why Valgrind did not manage to understand this clang-ism, and also disable this clang-ism, as it seems to generate inefficient code (two loads and one extra register, instead of one load and one register with gcc).
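To make the layout concrete, here is a reduced model with hypothetical names (the real mozilla::Result machinery is more involved, and the uint8_t underlying type for AbortReason is an assumption); it does not claim to reproduce Valgrind's exact behavior, only to contrast the two access patterns described above.

  #include <cstdint>
  #include <cstring>

  // Reduced model of the PackedVariant storage described above.
  struct Ok {};
  enum class AbortReason : uint8_t { NoAbort, Disable, Error };

  struct VEbool {
    Ok v;           // empty success value, 1 byte
    AbortReason e;  // error value: left uninitialized when ok == true
    bool ok;
  };

  // What gcc's codegen effectively does: a single byte load of `ok`.
  static bool isErrByteLoad(const VEbool& d) {
    return !d.ok;
  }

  // What clang's codegen effectively does: load v+e as 16 bits, load ok as
  // 8 bits, combine, and compare against 0xffff. The result only depends on
  // `ok`, but the uninitialized `e` byte participates in the comparison,
  // which is what Valgrind objects to.
  static bool isErrWideLoad(const VEbool& d) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&d);
    uint16_t low = 0;
    std::memcpy(&low, p, 2);                         // v and (uninitialized) e
    uint32_t packed = (uint32_t(p[2]) << 16) | low;  // ok lands in bits 16..23
    return packed <= 0xffff;                         // true exactly when ok == 0
  }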
I uploaded a patch to fix the issue reported by Valgrind and detailed in comment 47. This Valgrind fix is being tracked at: https://bugs.kde.org/show_bug.cgi?id=401112
Mike, do you know if it is possible to use a custom version of Valgrind for running our test suite? Or should we wait until the next Valgrind release? (see comment 47 and comment 48)
Flags: needinfo?(mh+mozilla)
(In reply to Nicolas B. Pierron [:nbp] from comment #47)
> I managed to reproduce this issue with the JS shell compiled with clang
> 5.0.2:

For information, clang 7.0.0 generates the same code and has the same false-positive report when executed with Valgrind. My blind guess would be an issue in the optimization, which applies this transformation but does not check whether the compared structure can be read with a single load.
We can apply patches, but I'd rather wait for it to be reviewed upstream first.
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #51)
> We can apply patches, but I'd rather wait for it to be reviewed upstream
> first.

Done: https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=f2c03ce3babe51eecbf03735f726c4028a162857
Flags: needinfo?(mh+mozilla)
Add leave-open keyword for landing LifoAlloc changes after attachment 9028867 [details] is landed.
Keywords: leave-open
Pushed by mh@glandium.org: https://hg.mozilla.org/integration/autoland/rev/134001f7ca7f Apply upstream patch to valgrind to fix some false positives. r=froydnj
Changing this test case, which I added earlier this year to check that the LifoAlloc mprotect instrumentation works and does not crash. Knowing that the general stack protection is no longer present, I am almost tempted to remove this test case.

In the meantime, this test fails with --no-baseline --no-ion, which is the intended test configuration, as it takes too much time on our CI. I am hoping that adding the "slow" tag will work (currently running on Try).

The reason this test fails is likely the increased LifoAlloc chunk sizes, since interpreter frames are allocated within these LifoAlloc chunks. Having more contiguous space with this bug likely prevents us from wasting interpreter stack space, which delays stack-exhaustion errors.
Flags: needinfo?(mh+mozilla)
Attachment #9031228 - Flags: review?(tcampbell)
Attachment #9031228 - Flags: review?(tcampbell) → review+
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d9dea0ec29b0
LifoAlloc: Move cold code to LifoAlloc.cpp. r=tcampbell
https://hg.mozilla.org/integration/mozilla-inbound/rev/6cc520c2871e
Split LifoAlloc chunks in 2 lists. r=tcampbell
https://hg.mozilla.org/integration/mozilla-inbound/rev/11c134af2134
Add LifoAlloc instrumentation for adding heuristics. r=tcampbell
https://hg.mozilla.org/integration/mozilla-inbound/rev/aaddfe44c11b
Add LifoAlloc heuristic to increase the reserved space based on the consumed space. r=jorendorff
https://hg.mozilla.org/integration/mozilla-inbound/rev/c39cd7438b60
Fix test case running longer than expected due to increased interpreter stack. r=tcampbell
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
Some perf wins in Ion:

== Change summary for alert #18398 (as of Thu, 20 Dec 2018 23:21:11 GMT) ==

Improvements:

3% raptor-wasm-godot-ion-firefox linux64 opt     2,132.13 -> 2,070.39
3% raptor-wasm-godot-ion-firefox linux64-qr opt  2,165.06 -> 2,108.11
3% raptor-wasm-godot-ion-firefox linux64 pgo     2,084.79 -> 2,030.47

For up to date results, see:
https://treeherder.mozilla.org/perf.html#/alerts?id=18398
Performance Impact: --- → P1
Whiteboard: [qf:p1:f65]
Duplicate of this bug: 1475561