Closed Bug 580803 Opened 11 years ago Closed 9 years ago

TM: de-crazy-ify memory pressure handling

Categories

(Core :: JavaScript Engine, defect)

Other Branch
x86
macOS
defect
Not set
normal

Tracking


RESOLVED WONTFIX

People

(Reporter: gal, Assigned: gal)

References

Details

Attachments

(1 file, 4 obsolete files)

We currently have 4 memory pressure heuristics applied at 6 different points, and 2 of them are partially overridden by the DOM code. The 4 heuristics are spread out over 5 files, which makes it hard to see just how redundant they are. This first patch tries to clean up the mess a bit: it moves all the heuristics into one central place (JS_MaybeGC), makes the DOM use the same set of heuristics, and establishes a single trigger point (also JS_MaybeGC!). This means that the embedding must call JS_MaybeGC or JS_GC to make GCs happen. To be more explicit, this eliminates last-ditch GCs. The allocator path will try to grow the GC heap and the malloc heap until we get a hard error (malloc or mmap failure). If we run into the soft quotas (GC malloc bytes or GC heap bytes limits), the next call to JS_MaybeGC will cause a GC and hopefully we will get back into compliance with the soft limits.

When and from where should we call JS_MaybeGC? The shell currently doesn't call it at all, which means the shell will start accumulating unlimited memory. I will fix this in the next version. The DOM already calls JS_MaybeGC in the operation callback. We should probably move this into a parallel thread that wakes up occasionally and calls JS_MaybeGC. This would have the advantage that the GC runs on that thread and doesn't thrash the main thread's cache.
Attached patch patch (obsolete) — Splinter Review
Assignee: general → gal
Attachment #459179 - Attachment is patch: true
Attachment #459179 - Attachment mime type: application/octet-stream → text/plain
Blocks: 580458
I do not agree on removing the last ditch GC, since many embeddings, in all likelihood including Firefox, rely on it.

Think about it: you have a GC heap. You need more memory. But you're up against the runtime heap-size limit. If you OOM while tons of garbage are stinking up the heap, you'll hard-stop the script and break the web. Better to GC unless you can prove there's no (or too little) garbage amid the megabytes of heap.

/be
(De-crazifying otherwise sounds good.)
Attached patch patch (obsolete) — Splinter Review
This patch schedules an operation callback if we go over quota. Ideally we probably want to do this with setrlimit and trigger the callback whenever the OS detects that we suck up too much memory, but we can do that in a second patch.

The new OOM behavior is as follows:
- If we notice that we are going over quota (GC heap or malloc), trigger an operation callback.
- If the callback notices that we are over quota, it forces a GC. If we are still over quota after the GC, we throw an OOM exception.

The callback also does MaybeGC. The DOM no longer does it.
Attachment #459179 - Attachment is obsolete: true
#2: There is minimal behavioral change here. I still force a GC at approximately the same time. The main difference is that the GC now happens in one place in all cases.

I am also moving MaybeGC into the event loop in gecko. With that we will have a shallow stack to scan and hopefully no pending requests, instead of scanning a deep stack in LastDitch.
(In reply to comment #5)
> #2: There is minimal behavioral change here.

How do you know? I mean, measure and stuff. Science!

/be
I am on it. Measuring, measuring, measuring! :) The shell behaves the same. You can allocate in a loop to your heart's desire with the JIT on or off. Trying the browser now.
(In reply to comment #6)
> (In reply to comment #5)
> > #2: There is minimal behavioral change here.
> 
> How do you know? I mean, measure and stuff. Science!

The problem can only come from a long trace that allocates a lot of stuff. With fatvals this is much less likely, and, if problematic, an extra operation count guard could be inserted into the middle of such a trace after some threshold of potentially allocating operations.

But this is a moot problem since we neither enforce any malloc quotas nor run the last-ditch GC when malloc returns null. For this reason embeddings already face the problem of getting OOM reports while having tons of garbage. So gal's plan is very sound.
Comment on attachment 459232 [details] [diff] [review]
patch

>diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h
>--- a/js/src/jscntxt.h
>+++ b/js/src/jscntxt.h
>@@ -1565,29 +1545,45 @@ struct JSRuntime {
> 
>     JSWrapObjectCallback wrapObjectCallback;
> 
>     JSRuntime();
>     ~JSRuntime();
> 
>     bool init(uint32 maxbytes);
> 
>+    void triggerGC() {
>+        gcIsNeeded = true;

The function must still trigger the operation callback. Otherwise the runtime may not notice the breach of GC limits for quite some time.

>diff --git a/js/src/jsgc.cpp b/js/src/jsgc.cpp
>--- a/js/src/jsgc.cpp
>+++ b/js/src/jsgc.cpp
>@@ -636,27 +636,16 @@ ReleaseGCChunk(JSRuntime *rt, jsuword ch
>     METER(rt->gcStats.nchunks--);
>     rt->gcChunkAllocator->free(p);
> }
> 
> static JSGCArena *
> NewGCArena(JSContext *cx)
> {
>     JSRuntime *rt = cx->runtime;
>-    if (!JS_THREAD_DATA(cx)->waiveGCQuota && rt->gcBytes >= rt->gcMaxBytes) {
>-        /*
>-         * FIXME bug 524051 We cannot run a last-ditch GC on trace for now, so
>-         * just pretend we are out of memory which will throw us off trace and
>-         * we will re-try this code path from the interpreter.
>-         */
>-        if (!JS_ON_TRACE(cx))
>-            return NULL;
>-        js_TriggerGC(cx, true);

I do not spot the place that would trigger the GC when the limit is breached. So this code should remain in a simpler form that just does js_TriggerGC(cx, true) when rt->gcBytes >= rt->gcMaxBytes.

> static JSGCThing *
> RefillFinalizableFreeList(JSContext *cx, unsigned thingKind)
> {
>-        bool canGC = !JS_ON_TRACE(cx) && !JS_THREAD_DATA(cx)->waiveGCQuota;
>-        bool doGC = canGC && IsGCThresholdReached(rt);

This is righteous. We do not need to check IsGCThresholdReached, as we rely, both in the GC and in malloc, on triggerGC scheduling the GC ASAP.
(In reply to comment #8)
> But this is a moot problem since we do not enforce any malloc quotas nor run
> the last ditch GC when malloc returns null.

We've always run it when rt->gcMaxBytes is hit, of course.

> For this reason embeddings already
> faces that problem of getting OOM reports while having tons of garbage.

That's not true if gcMaxBytes is << memory available from the OS to the process data segment. Like in Firefox.

Anyway, science will save us! Gal has a white smock on.

/be
Attached patch patch (obsolete) — Splinter Review
The patch wasn't ready for review yet. This one should be close.
Attachment #459232 - Attachment is obsolete: true
This patch does triggerGC() if we go over the malloc quota while malloc-ing or over the GC heap quota while getting a new arena. Ideally, instead of checking directly, this should be done through a setrlimit/getrlimit scheme where we chase an ever-rising memory ceiling, and once we hit either of the quotas we trigger the GC. This is a bit harder to do on Windows, so I am leaving this for a follow-up bug.
Attachment #459479 - Flags: review?(brendan)
Could we get some fuzzing for this (especially gczeal)?
(In reply to comment #13)
> Could we got some fuzzing for this (especially gczeal).

Early info: this patch seems to hold up fine after 10+ minutes of preliminary fuzzing. :)
Comment on attachment 459479 [details] [diff] [review]
patch

Bouncing to Igor, he is the right reviewer and I'm focused on my patch queue.

/be
Attachment #459479 - Flags: review?(brendan) → review?(igor)
Comment on attachment 459479 [details] [diff] [review]
patch

>diff --git a/js/src/jscntxt.cpp b/js/src/jscntxt.cpp
>--- a/js/src/jscntxt.cpp
>+++ b/js/src/jscntxt.cpp
>@@ -1888,43 +1887,36 @@ js_InvokeOperationCallback(JSContext *cx
>+#ifdef JS_THREADSAFE
>+    JS_YieldRequest(cx);
>+#endif

If the callback is called under memory pressure, yielding here just delays the GC. If simpler logic is desired, at least call this after the GC, and add a comment about that!

>@@ -1573,16 +1553,36 @@ struct JSRuntime {
>     void setGCTriggerFactor(uint32 factor);
>     void setGCLastBytes(size_t lastBytes);
> 
>-    void* malloc(size_t bytes) { return ::js_malloc(bytes); }
>-
>-    void* calloc(size_t bytes) { return ::js_calloc(bytes); }
>-
>-    void* realloc(void* p, size_t bytes) { return ::js_realloc(p, bytes); }
>+    bool gcQuotaReached() {
>+        return gcBytes >= gcMaxBytes;
>+    }
>+
>+    size_t updateMallocCounter(size_t bytes) {

Returning bytes from the function complicates following the code, so keep the return type void here and update the callers accordingly.

>+        gcMallocBytes -= bytes; /* We tolerate races and lost counts here. */

gcMallocBytes should be volatile.

AFAICS it is better to call triggerGC here so the compiler does not need to load gcMallocBytes twice, once before and once after the malloc/realloc, when checking gcMallocBytes again.

>+    inline void triggerGC() {
>+        if (!runtime->gcIsNeeded) {
>+            runtime->gcIsNeeded = true;
>+            JS_TriggerOperationCallback(this);

This is incompatible with thread workers. The GC callback in Firefox prevents the GC from running on non-main threads. So if a worker thread hits the GC limit while the main thread is busy in a non-allocating loop, the GC can be delayed for a long time. For now just keep the triggerAllOperationCallbacks here and file a follow-up bug about better scheduling of the GC from non-main threads. For example, we can add a hook that Firefox can use to trigger the GC only on the main thread.

>@@ -1774,45 +1732,21 @@ RefillFinalizableFreeList(JSContext *cx,

Comment in the function that we check for memory pressure only if we need to get a new arena.
Forgot to write: r+ with the above addressed.
http://hg.mozilla.org/tracemonkey/rev/b4aeef9c6c91
Whiteboard: fixed-in-tracemonkey
I forgot to change triggerGC to notify all callbacks, not just the current context's

http://hg.mozilla.org/tracemonkey/rev/277470ab29c8
Attachment #459479 - Flags: review?(igor) → review+
The change to JS_TriggerAllOperationCallbacks introduces a deadlock.
I think you might have turned a test intermittently orange (at least as far as I can tell, it started with your first push here):

http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1279929778.1279930497.23454.gz&fulltext=1

The test in question is here, and I think you're causing the first check in checkVeryLongRequestLine to fail:

http://hg.mozilla.org/tracemonkey/file/default/netwerk/test/httpserver/test/test_request_line_split_in_two_packets.js

If this problem was caused by this patch, or by another patch of yours, don't hesitate to poke me for debugging help.  If the problem is far upstream of that failure point, as I suspect it is, I can probably point you at the code that generated the faulty data more quickly than you can from reading the code.  I'm unlikely to be on IRC until Monday; catch me then if you haven't figured it out by then.
(In reply to comment #19)
> I forgot to change triggerGC to notify all callbacks, not just the current
> context's
> 
> http://hg.mozilla.org/tracemonkey/rev/277470ab29c8

In comment 16 I asked to move the triggerGC call into updateMallocCounter for code simplicity and to avoid reloading the malloc counter. Either do it here or in a follow-up.
All the code is inlined. The location of the access doesn't matter.
(In reply to comment #24)
> All the code is inlined. The location of the access doesn't matter.

malloc itself is not inlined, so to implement the current code:

gcMallocBytes -= size;
p = malloc(size);
if (gcMallocBytes < 0)
    triggerGC();

the compiler will have to reload gcMallocBytes. This extra load is clearly visible in the following example, where test1 follows the code from the patch and test2 is the suggested change:

#include <stdlib.h>

struct A {
    long buffer[100];
    volatile long gcMallocBytes;
    volatile bool triggerGC;
};
    
void report_oom(A *a);

    
void *test1(A *a, size_t size) {
    a->gcMallocBytes -= size;
    void *p = malloc(size);
    if (a->gcMallocBytes < 0)
        a->triggerGC = true;
    if (!p)
        report_oom(a);
    return p;
}


void *test2(A *a, size_t size) {
    a->gcMallocBytes -= size;
    if (a->gcMallocBytes < 0)
        a->triggerGC = true;
    void *p = malloc(size);
    if (!p)
        report_oom(a);
    return p;
}

I compile the code with GCC for i386 using:

g++ -Os -fno-rtti -fno-exceptions -m32 -S x.cpp -Wall -Wextra

The assembly for test1:

	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ebx
	subl	$32, %esp
	movl	8(%ebp), %ebx
	movl	12(%ebp), %edx
	movl	400(%ebx), %eax
	subl	%edx, %eax
	movl	%eax, 400(%ebx)
	pushl	%edx
	call	malloc
	movl	400(%ebx), %edx
	addl	$16, %esp
	testl	%edx, %edx
	jns	.L7
	movb	$1, 404(%ebx)
.L7:
	testl	%eax, %eax
	jne	.L8
	subl	$12, %esp
	pushl	%ebx
	movl	%eax, -12(%ebp)
	call	_Z10report_oomP1A
	movl	-12(%ebp), %eax
	addl	$16, %esp
.L8:
	movl	-4(%ebp), %ebx
	leave
	ret

for test2:

	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ebx
	subl	$20, %esp
	movl	8(%ebp), %ebx
	movl	12(%ebp), %edx
	movl	400(%ebx), %eax
	subl	%edx, %eax
	movl	%eax, 400(%ebx)
	movl	400(%ebx), %eax
	testl	%eax, %eax
	jns	.L2
	movb	$1, 404(%ebx)
.L2:
	subl	$12, %esp
	pushl	%edx
	call	malloc
	addl	$16, %esp
	testl	%eax, %eax
	jne	.L3
	subl	$12, %esp
	pushl	%ebx
	movl	%eax, -12(%ebp)
	call	_Z10report_oomP1A
	movl	-12(%ebp), %eax
	addl	$16, %esp
.L3:
	movl	-4(%ebp), %ebx
	leave
	ret

Here, for test1, the compiler needs to issue movl	400(%ebx), %eax twice, while for test2 only a single load is present.
Are we seriously arguing here about an extra load surrounding a malloc? You know, malloc, the thing that does mmap, locking, touching lots of memory, and stuff like that. This branch is predicted with near 100% accuracy. What problem are we solving here?
Attachment #460066 - Flags: review?(igor)
(In reply to comment #26)
> Are we seriously arguing here about an extra load surrounding a malloc?

Comment 24 contains:
> All the code is inlined. The location of the access doesn't matter.

My previous comment shows that it does matter. I was not arguing about the relevance of the optimization as the comment 24 also does not talk about it.

In comment 23 I asked about this change because the resulting code is simpler and gives the compiler more opportunities to optimize. It is simpler because updating and checking the counter in the same function is easier to follow than looking into two functions, one of which is named reportIfOutOfMemory, a name that does not hint at the counter check.
Comment on attachment 460066 [details] [diff] [review]
move gc triggering into the runtime

Thanks, this is much nicer!
Attachment #460066 - Flags: review?(igor) → review+
(In reply to comment #12)
> Ideally, instead of checking directly
> this should be done through a setrlimit/getrlimit scheme where we chase an ever
> raising memory ceiling, and once we hit either of the quota we trigger the GC.
> This is a bit harder to do on windows, so I am leaving this for a follow-up
> bug.

Filed yet?

/be
Whiteboard: fixed-in-tracemonkey
Attached patch patch — Splinter Review
Backout patch.

This sadly also loses the waiveGC elimination. Sigh.
Attachment #459479 - Attachment is obsolete: true
Attachment #460066 - Attachment is obsolete: true
Backed out.

http://hg.mozilla.org/tracemonkey/rev/babf32a45349
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → INVALID
Why INVALIDate this bug? No one wants crazy-fied code. We may have to defer it to post firefox 4, if there's no benchmark-based need, but why throw the bug away? It loses information. I would hate to see someone reinvent as much of the wheel as we have here.

Bugzilla is a searchable knowledge base. I even save bugmail for faster searching of my well-curated corpus.

/be
Could we have for the record the list of issues that triggered orange?
Understanding all the orange would be good but might take a while.

My best guess is that it also would not eliminate the need for the last-ditch GC in the current state of the codebase, even setting aside whatever bugs caused the orange. I could be wrong, but the whole "precognitive GC scheduling" idea is unquantified and probabilistic.

Even without bugs, it has non-zero likelihood of resulting in OOM with heap full of garbage. I don't think we can take that chance, and again: I have seen no profile data showing we need to take that chance *right now*.

/be
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
(In reply to comment #35)
> Understanding all the orange would be good but might take a while.

There are two unrelated parts in the patch. The first one is eliminating the last-ditch GC. The second is changing JSContext::malloc and friends to use simplified memory pressure detection. To make sure that at least that part is sound I am going to factor it out into a separate bug.

> It also in my best guess would not leave us with no need for last ditch GC in
> the current state of the codebase minus whatever bugs result in the orange. I
> could be wrong, but the whole "precognitive GC scheduling" idea is unquantified
> and probabilistic.

I do not see how that could be unsound. If js_TriggerGC does not work, it implies that operation callback support is broken. But those bugs can be fixed. At worst it may be necessary to inject the callback checks into long traces or a few places like the parser.

> 
> Even without bugs, it has non-zero likelihood of resulting in OOM with heap
> full of garbage.

Yet SpiderMonkey has not run the last-ditch GC when malloc returns NULL for ages. So I really see no fundamental difference between that and OOM in the GC when it tries to bring in more pages.

> I don't think we can take that chance, and again: I have seen
> no profile data showing we need to take that chance *right now*.

Eliminating the last-ditch GC helps make the code safer since it reduces the number of places where the GC can run. Granted, with a conservative GC this is less relevant, but the knowledge that NewGCThing does not run the GC still makes it much easier to prove correctness.
Depends on: 582718
(Igor, if you want to grab this bug, go for it)
(In reply to comment #36)
> There are two unrelated parts in the patch. The first one is eliminating the
> last ditch GC. The second is changing JSContext::malloc and friends to use
> simplified memory pressure detection. To make sure that at least that part is
> sound I am going to factor that out into separated bug.

Great -- smaller is better if possible.

> > It also in my best guess would not leave us with no need for last ditch GC in
> > the current state of the codebase minus whatever bugs result in the orange. I
> > could be wrong, but the whole "precognitive GC scheduling" idea is unquantified
> > and probabilistic.
> 
> I do not see how that could be unsound. If js_TriggerGC does not work, it
> implies that operation callback support is broken.

No, I'm saying if you don't run a last ditch GC from the allocator slow path, you will hit OOM right there -- triggering a GC later or earlier may not help.

/be
(In reply to comment #38)
> No, I'm saying if you don't run a last ditch GC from the allocator slow path,
> you will hit OOM right there -- triggering a GC later or earlier may not help.

Yes, but (I have to repeat myself) how is this different from hitting OOM on malloc?
(In reply to comment #39)
> (In reply to comment #38)
> > No, I'm saying if you don't run a last ditch GC from the allocator slow path,
> > you will hit OOM right there -- triggering a GC later or earlier may not help.
> 
> Yes, but (I have to repeat myself) how this is different from hitting OOM on
> malloc?

It is very different! If the OS actually fails malloc (many do not), you've used up the heap due to a leak or real workload demand.

If we hit the last ditch, the heap may instead be full of dead temps and ripe for GC and OOM is not only user-hostile or embedding-hostile, it is unnecessary and completely backward-incompatible.

/be
(In reply to comment #40)
> (In reply to comment #39)
> > (In reply to comment #38)
> > > No, I'm saying if you don't run a last ditch GC from the allocator slow path,
> > > you will hit OOM right there -- triggering a GC later or earlier may not help.
> > 
> > Yes, but (I have to repeat myself) how this is different from hitting OOM on
> > malloc?
> 
> It is very different! If the OS actually fails malloc (many do not), you've
> used up the heap due to a leak or real workload demand
> 
> If we hit the last ditch, the heap

Oh, and different heap here -- sorry. JS GC heap != OS data segment. JS GC heap << OS data segment.

/be
(In reply to comment #41)
> Oh, and different heap here -- sorry. JS GC heap != OS data segment. JS GC heap
> << OS data segment.

The patch explicitly turns the hard GC heap quota into a soft one that can be breached. Thus with the patch NewGCThing can only return null when the mmap fails which is exactly the same condition for jemalloc to return null.

So I see no difference here, especially given that the trace code was operating under this soft-quota regime for quite some time; see the waiveGC usage in jsgc.cpp.
We need to figure out why JS_GetStringBytes returned "" instead of the expected big string. I have a hard time believing mmap returned 0, but it's possible.

/be
Still a valid non-TM bug?
This is no longer relevant - the compartment GC changed a lot in this area.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 9 years ago
Resolution: --- → WONTFIX