Open Bug 846985 Opened 11 years ago Updated 5 months ago

Allow introspection of stack growth

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: gps, Unassigned)

References

(Blocks 1 open bug)

Details

In bug 845842 we are tackling issues with Firefox Health Report (mostly JavaScript with some calls into C++) exhausting the stack (presumably the C++ one) and running into the dreaded "too much recursion" error.

After a very long and productive IRC chat with jimb, we speculate the reason is the accumulation of js::Interpret frames on the C++ stack. Our specific scenario involves generators (each generator appears to add its own js::Interpret frame). But, the issue of new js::Interpret frame creation is beyond just generators, apparently.

According to jimb's measurements, each js::Interpret frame is quite large - 37k or so! Accumulated js::Interpret frames (such as in a loop) can quickly blow out the C++ stack!

While jimb speculated stack blow out probably doesn't happen very often, we don't really know. Bug 846979 has been added to collect telemetry so we can find out.

But even if it doesn't happen very often, as bug 845842 demonstrates, it does happen. And, finding out why is very difficult: it requires either deep knowledge of the JS engine internals or doing something involved, such as running under gdb or hacking up the source to instrument js::Interpret. As a JS-minded developer with very little knowledge of the JS engine internals, this is sub-par to say the least. Even if I do educate myself, continued application of this knowledge would require keeping apprised of happenings in SpiderMonkey - no easy task! I'd much prefer The JS engine just give me an API that tells me what's going on.

In this bug, I'm requesting a mechanism in SpiderMonkey and/or Gecko to allow JS developers to better understand how the code they write impacts the stack size (C++ or JS stacks).

As a JavaScript developer, here are a few scenarios I think would be rad:

1) If my code causes excessive accumulation of stack size (C++ or JS), a warning is issued before the stack limit is reached. What would be really cool is if the engine recognizes that each iteration of a loop "leaks" a js::Interpret or similar large amounts of stack and proactively notifies me that the loop will not scale.

2) If I would be able to instrument my code to record how much the stack grows or shrinks by (presumably by obtaining raw stack sizes), I could add probes all over my code and isolate "leaks."

3) If a tick grows the stack by more than X kb, an alert is automatically issued. Said alert would contain the JS frames responsible for the significant stack growth (presumably adding js::Interpret frames).

4) If we could hook stack size monitoring into unit tests to better identify regressions in stack sizes.

5) If the engine told me why the stack grew by so much (possibly why a new js::Interpret frame was created) and how to avoid that (perhaps a link to an MDN page or something).

6) If I could receive a notification every time a new js::Interpret (or similar large frame) were added to the stack, I could look at wasteful code and fix it. i.e. give me a low-level API to build awesome on top of.

The gist is I want to know when the code I write blows up the stack before the stack actually blows up. I want to catch issues during code development and during testing, not after it has been deployed to the wild.

I think a first natural step is for some form of API in the JS engine and/or Gecko. After that, Devtools and Automation can look at hooking it up for world good.

How all this would work, I don't know. I don't touch C++ for the tree. I know little about the JS engine beside would I periodically glean from people much more knowledgeable than me. Perhaps large parts of this bug request are based on faulty or incorrect knowledge, I dunno. What I do know is that we currently have a problem of which JavaScript developers have little to no insight. Solving problems inside a black box is very difficult. Please give me a flashlight.
(In reply to Gregory Szorc [:gps] from comment #0)

> 1) If my code causes excessive accumulation of stack size (C++ or JS), a
> warning is issued before the stack limit is reached. What would be really
> cool is if the engine recognizes that each iteration of a loop "leaks" a
> js::Interpret or similar large amounts of stack and proactively notifies me
> that the loop will not scale.
> 
> ...
> 3) If a tick grows the stack by more than X kb, an alert is automatically
> issued. Said alert would contain the JS frames responsible for the
> significant stack growth (presumably adding js::Interpret frames).

All I want for Christmas is for my console to stop spewing totally useless stuff like

  pldhash: for the table at address 0xa20d978, the given entrySize of 48 probably favors chaining over double hashing.

and instead start spewing

  foo/bar/baz.js:123: generator expression causing stack growth on each iteration; estimate 3 more iterations before exhaustion.

I'm totally serious. I will buy a bottle of something for whoever makes that happen.
I don't think that enormous stack consumption for js::Interpret is set in stone.  We could just take that memory and put it in a heap allocated structure and access it through a pointer on the stack.
I feel like there's a rumor here getting started that I would really like to strangle in the crib:

Generator expression is not any worse for stack growth than any number of other commonly used JS constructs. The troubles in bug 845842 - the immediate motivation for this bug - are due to our promises being sometimes asynchronous and sometimes synchronous, and can arise without any generator use at all. In the deep stacks we saw there, there were no generator frames on the stack.

The stack consumption for a generator resumption is the same as that consumed for a getter or setter property invocation, or invoking a proxy handler, or a valueOf or toString function call.

I *would* agree that generator use makes it easier to trip over the bad case in our promise code. Caution combining Task.jsm with our unrevised promise implementation makes sense. But I'm persuaded, by the list of four reasons to be always asynchronous that Joe cites in bug 845842 comment 36, that we need an always-async promises implementation. Such an implementation would be fine for use with Task.jsm.
(In reply to Brian Hackett (:bhackett) from comment #2)
> I don't think that enormous stack consumption for js::Interpret is set in
> stone.  We could just take that memory and put it in a heap allocated
> structure and access it through a pointer on the stack.

What is using all that memory?
Jim: OK, so maybe I'm unfairly targeting generators. Sorry about that. Even if we fix promise and/or Task.jsm to be less stack abusive, we still have a general problem of "some JS constructs result in a lot more memory use than others." From the outside, you really don't know which. Even if we educate developers on what causes significant memory growth today and train people to avoid those scenarios, there's no guarantee that behavior of the JS engine will remain that way forever. Tools/alerts on what is actually going on are the most reliable way to keep this problem in check long term. That is what this bug asks for.

Alternatively, I will also settle for a resolution where the JS engine will never grow the stack enough for this to be an issue. But, I suspect that's a lot more work (if it's even possible) and ask that this bug request be a stop-gap until we reach that end goal (if ever).
I would also appreciate the api requested by gps. In addition, I believe that this API (or a variant thereof) will be very even more useful for JS developers of all sorts once bug 723959 has contributed to making JavaScript memory usage more efficient but also more subtle.
I suspect that, should we provide an API for somehow watching stack space, the right place to patch would be http://dxr.mozilla.org/mozilla-central/js/src/jsfriendapi.h#l573
I do like the idea here; of course we should expose aspects of the engine's performance that create restrictions on the design of the code it's meant to run. (If it weren't for bug 716647, this would be natural to add to the Debugger API.)

On the other hand, I wonder if we should just treat it as a bug: you had a right to expect "plenty" but you didn't get "plenty". Perhaps it's not hard to shrink js::Interpret's frames. We need to do some investigation here.
Whiteboard: [MemShrink]
Depends on: 716647
Assignee: general → nobody
Severity: normal → S3
Blocks: js-debugger
Severity: S3 → N/A
Type: defect → enhancement
Priority: -- → P3
You need to log in before you can comment on or make changes to this bug.