Closed Bug 965745 Opened 6 years ago Closed 6 years ago

operation callback not always run when running Ion code, leading to high memory usage

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

VERIFIED FIXED
mozilla30
Tracking Status
firefox27 --- wontfix
firefox28 --- fixed
firefox29 --- fixed
firefox30 --- verified
firefox-esr24 --- wontfix
b2g18 --- wontfix
b2g-v1.1hd --- wontfix
b2g-v1.2 --- wontfix
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- fixed

People

(Reporter: bbouvier, Assigned: jonco)

References

Details

(Whiteboard: [MemShrink:P2][qa-])

Attachments

(1 file)

function *gen() {
  for(;;)
    yield gen;
}
var f = gen;
while (f) {
  f = gen();
}

When running this code for a few minutes, it fills up memory. nbp and wingo also think that it might be a memory leak.
Simpler case:

function *gen() { while (1) yield gen }
while (1) gen()

Also occurs with legacy generators:

function gen() { while (1) yield gen }
while (1) gen()
WTF?  This is pretty basic:

function *gen() { yield 1 }
while (1) gen()
Summary: Memory leak involving self yielding generator → Generators leak memory
dumpHeap() says that GC is collecting the objects, and valgrind memcheck does not detect extra leaks when I allocate more generators rather than less. I must be doing something wrong.
Whiteboard: [MemShrink]
Steps to reproduce:

 1. Build a fresh release-mode shell.
 2. Open a system memory monitor (e.g. top, sorted by RSS).
 3. Run the script.

Expectations: The js process should consume a constant small amount of memory (up to a few dozen MB).

Actual results: The RSS of the JS process continues to rise.  After 5 seconds on my machine it takes 5.5GB of private dirty memory.

Note that if you limit the heap size using the relevant function whose name I don't recall right now, the RSS doesn't rise.
function *gen() { yield 1 }
var count = 0;
while (1) {
    gen()
    if (count++ == 1000) {
        gc();
        count = 0;
    }
}

With this test case, the memory doesn't increase at all, except for the few MB needed for the runtime. I am not really aware of the conditions under which a GC occurs, but h4writer tells me that a GC should naturally occur after X allocations, which is not the case in the initial test case.
Attached patch patch-backedgesSplinter Review
After adding some tracing, it seems that we are triggering the operation callback (due to memory allocation), but it is never running.

Aparrently we check the interrupt flag at script entry and loop backedges, although we use mprotect and/or patch loop backedges when running Ion-generated code.

I tried with --no-ion and --ion-eager flags and the problem seems to be Ion specific.

Looking at the JitRuntime::ensureIonCodeAccessible() it seems we only patch the backedges if the code was previously protected, which is not always the case.

I don't know this code particularly, but it seems like we would always want to patch the backedges if the interrupt flag is set.

How does this look for a fix?  I moved some of the comments around as well to make the make more sense.

Also, do we ever re-patch the backedges to not check the interrupt flag?  I couldn't see where this was done.
Attachment #8369372 - Flags: review?(jdemooij)
Comment on attachment 8369372 [details] [diff] [review]
patch-backedges

Review of attachment 8369372 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, nice catch!

(In reply to Jon Coppeard (:jonco) from comment #6)
> Looking at the JitRuntime::ensureIonCodeAccessible() it seems we only patch
> the backedges if the code was previously protected, which is not always the
> case.

Yeah, the assumption is that ensureIonCodeAccessible is only called from the signal handler but that's not true.

> Also, do we ever re-patch the backedges to not check the interrupt flag?  I
> couldn't see where this was done.

This is done in InterruptCheck in jit/VMFunctions.cpp:

    cx->runtime()->jitRuntime()->patchIonBackedges(cx->runtime(),
                                                   JitRuntime::BackedgeLoopHeader);
Attachment #8369372 - Flags: review?(jdemooij) → review+
Whiteboard: [MemShrink] → [MemShrink:P2]
Not knowing much about this, this sounds kind of bad.  Is this only a problem for generators?  Should it be backported?  "After 5 seconds on my machine it takes 5.5GB of private dirty memory" does not sound good. ;)
Jon, do you have any idea about the questions in comment 8?  Thanks.
Assignee: nobody → jcoppeard
Flags: needinfo?(jcoppeard)
QA Contact: jcoppeard
It's not only a problem for generators.  I'm not really sure how often it will occur in practice - I guess you would have to only run Ion code otherwise the interpreter would end up invoking the callback.  But it's a simple fix, so assuming no problems show up in the next few days I think we should backport.
Flags: needinfo?(jcoppeard)
Summary: Generators leak memory → operation callback not always run when running Ion code, leading to high memory usage
> But it's a
> simple fix, so assuming no problems show up in the next few days I think we
> should backport.

Yes. FWIW, the highest-rated comment on the Hacker News discussion of Firefox 27's release was about how it's the first major browser release to support generators: https://news.ycombinator.com/item?id=7177607
https://hg.mozilla.org/mozilla-central/rev/357895a6aaac
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Doesn't seem like any problems showed up, right?
Flags: needinfo?(jcoppeard)
Comment on attachment 8369372 [details] [diff] [review]
patch-backedges

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Bug 864220 - IonMonkey: Implement loop interrupt checks using mprotect
User impact if declined: Possibility of high memory usage
Testing completed (on m-c, etc.): On m-c since 2014-02-05
Risk to taking this patch (and alternatives if risky): Low
String or IDL/UUID changes made by this patch: None
Attachment #8369372 - Flags: approval-mozilla-beta?
Attachment #8369372 - Flags: approval-mozilla-aurora?
Flags: needinfo?(jcoppeard)
Blocks: 864220
Attachment #8369372 - Flags: approval-mozilla-beta?
Attachment #8369372 - Flags: approval-mozilla-beta+
Attachment #8369372 - Flags: approval-mozilla-aurora?
Attachment #8369372 - Flags: approval-mozilla-aurora+
Is the operation callback where we are doing our stack overflow checks in Ion code by any chance?  There are some top crashes that seem to involve JITted code (bug 856670, bug 969441) that could be related to this, if so.  I guess we can see if crash volume for those signatures goes down on non-release branches.
Flags: needinfo?(jdemooij)
(In reply to Andrew McCreight [:mccr8] from comment #18)
> Is the operation callback where we are doing our stack overflow checks in
> Ion code by any chance?

No it's the other way around: when we trigger the operation callback, we modify the stack limit for JIT code so that the check always fails and we call into the VM. There we handle the operation callback and reset the stack limit.

(This only applies to script entry; loops rely on mprotect/signal handlers or an explicit check if that's disabled.)

> There are some top crashes that seem to involve
> JITted code (bug 856670, bug 969441) that could be related to this, if so. 
> I guess we can see if crash volume for those signatures goes down on
> non-release branches.

I'll take a look at these bugs.
Flags: needinfo?(jdemooij)
Benjamin, can you please confirm this is fixed for you now in Firefox 28, 29, and 30? Should this get an in-testsuite testcase?
Flags: needinfo?(benj)
Whiteboard: [MemShrink:P2] → [MemShrink:P2][qa-]
This is fixed in the latest nightly, couldn't say for the other releases.

I think it cannot get in an automatically reproducible testcase without being run in valgrind, since as far as i can tell, the shell cannot say whether it leaks memory or not (and if it can, I would love to know how to do that).
Flags: needinfo?(benj)
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.