Closed Bug 687951 Opened 13 years ago Closed 13 years ago

"too much recursion" errors with large compiled JS files

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla11
Tracking Status
firefox7 - wontfix
firefox8 - wontfix
firefox9 - wontfix
firefox10 - wontfix
firefox11 + verified

People

(Reporter: azakai, Assigned: cdleary)

References

()

Details

(Keywords: regression, Whiteboard: [qa!])

Attachments

(3 files)

http://repl.it is a new open source site with REPL environments for various languages. Python, Ruby and Lua are compiled from C to JS. Python and Ruby do not work for me on FF9 on Linux, I get

  too much recursion

during load.

The site works fine in other browsers.

The same problem appears to show up in sqlite_0_1_cc_pretty.js in bug 687127 (which is another large JS file generated by compiled C code). Interestingly there, running minification on the file prevents the problem.
Works for me on FF6. Fails on FF7, FF8 and as already mentioned FF9, all on Linux (Ubuntu 32-bit).

Nominating for tracking, since this is an innovative use of JavaScript that is becoming more and more common. Other websites that use the same technology include for example http://zodb.ws, http://pythonfiddle.com/ and others (sometimes the same compiled C code works on those, though, might be different compilation options and/or minification options). So this regression is concerning, especially when the sites work properly on other browsers: We shouldn't be the browser that holds sites back from pushing the web platform forward.

I am hoping that the problem is a quota of some sort in the JS engine and not too hard to fix.
repl.it WFM on Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:9.0a1) Gecko/20110920 Firefox/9.0a1
Thanks for the info Aaron. Just to be sure we are talking about the same thing, the steps to reproduce that I should have elaborated on before are:

1. load repl.it
2. click on Python

You should get an interpreter environment at that point, where you can enter commands and get responses, if everything worked. For example,

  print 5

should lead to 5 being printed.
Yes, I tested with |print "hello"| but it did in fact respond with |hello|. Ruby worked as well.
Since this is a regression, can you bisect it to find the regressing changeset?
Keywords: qawanted
It's working fine for me on Firefox 7 on Mac....platform/arch specific?
Works for me, using:
Build identifier: Mozilla/5.0 (Windows NT 6.1; rv:9.0a1) Gecko/20110906 Firefox/9.0a1
I followed the steps in comment 3.
Tested using Firefox 9.0a1 20110920085517 on Ubuntu 11.04 64-bit. It appears like this might be Linux only. Just to confirm, here is what I see:

1) Load http://repl.it
2) Select "Python"
3) Wait for module to load

Result:
Module never loads completely; progress bar nears 100% but never quite finishes.

Is this what I should be looking for when I search the regression range?
Yes, that is it. To be absolutely sure, you can open the web console before loading the python page (control-shift-K), and you should see "too much recursion" as an error in the console. So I guess the full STR should be

1. Open web console (control-shift-K)
2. Load http://repl.it
3. Select Python
4. Wait for load, see that it stops at some point, with "too much recursion" in the web console. Or, if it worked, you will see a prompt, and you can continue to 5,
5. Enter |print 5| and press enter. You should see 5 printed.
Just a quick status update...I've narrowed down to sometime between 2011-05-31 and 2011-06-30. I'm still working to narrow it down further.
Last good nightly: 2011-06-27
First bad nightly: 2011-06-28

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5e072732cb45&tochange=383e60bc9089
Thanks Anthony!
Thanks from me as well! I am bisecting this with local builds now. So far I can confirm that the first changeset passes and the last fails, as expected.
Finished bisecting, bug 649576 is responsible,

changeset:   71894:3d646df22a4b
user:        Chris Leary <cdleary@mozilla.com>
date:        Fri Jun 24 14:22:30 2011 -0700
summary:     Bug 649576: Extricate JSHashTable from JSAtomList death grip. (r=luke)

Looks like that bug changed parsing code which does seem very relevant to this issue. No idea why the problem would only show up on Linux as it does, though. (Different stack sizes perhaps? Linux's are very large though...)
Blocks: 649576
Here is a stack trace during a "too much recursion error". 85 recursive calls to js_EmitTree there.
Assigning to Chris. Chris, we're worried about this for Firefox 7's release...can you take a loop ASAP?
Assignee: general → cdleary
It looks like js_EmitTree has a 1328 byte frame. That's high for an inherently recursive function. My backtrace has 380 frames with main locals at 0xffffd1c8 and leaf function locals at 0xfff83998, which averages to ~1300 bytes per frame.

By contrast, my debug build only seems to have 448 bytes per js_EmitTree frame. When I add an 800B stack allocated buffer it fails the same way in debug.

We should factor out cases in js_EmitTree to alleviate the locals pressure. I'll argue that this isn't high priority, but I'll see what I can whip up in a few hours.
If you don't think it's high priority we don't either...you're the expert.

To make sure we are on the same page, I would like your assessment of:

1. If it affects all platforms (not clear as some of us can't reproduce)
2. If you think this type of issue would be hit a lot on the open web
3. General feel of engineering effort required to fix it (trivial/low/medium/high/tons)
3. General feeling of risk for a possible fix (trivial/low/medium/high/tons)

Thanks!
Factors a bunch of case arms out of js_EmitTree which appears to let better inlining decisions occur. Opt 32 bit I get a 416B frame, debug is down to 384B.
Attachment #561804 - Flags: feedback?(dmandelin)
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

Looks good. I like breaking up giant functions anyway.
Attachment #561804 - Flags: feedback?(dmandelin) → feedback+
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

So, to recap per comment 18:

1. Doesn't affect all platforms, it's dependent on the compiler and optimization level. (The specific compiler determines the frame size, and Linux GCC seems to be making an agressively-sized frame for whatever reasons: inlining, register allocation spills, etc. The OS X GCC is a much older version than Linux GCC and MSVC is a different beast altogether.)
2. No, just for emscripten-like generated workloads that haven't been run through some kind of normalization, like the closure compiler. We should make sure to qualify GWT-generated sites to make sure it's not an issue there, but I think we would have run into that at this point in the cycle if it were an issue.
3. Low (patch made).
4. Low (pushed to try).
Attachment #561804 - Flags: feedback?(clegnitto)
Just to chime in with a note here about point #2. Passing the code through the Closure compiler does not in fact eliminate the issue. The live code on repl.it that's triggering the problem is Closure-compiled.
As Max said, closure compiler is not a complete workaround for this problem. It works in some cases but not others.

I'd like to also add that not being able to run large compiled JS files is limiting in several areas. Aside from various language runtimes which started this bug, for example Mozilla's Paladin gaming project uses such files (so far without issue, thankfully). Such compiled files might also be useful for things like text-to-speech (bug 525444).
Attachment #561804 - Flags: feedback?(clegnitto) → review?(dmandelin)
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

dmandelin said giving this review to Waldo would be okay.
Attachment #561804 - Flags: review?(dmandelin) → review?(jwalden+bmo)
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

Review of attachment 561804 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsemit.cpp
@@ +5084,5 @@
> +    JSObjectBox *prevBox = NULL;
> +    /* If this try has a catch block, emit it. */
> +    JSParseNode *pn2 = pn->pn_kid2;
> +    JSParseNode *lastCatch = NULL;
> +    if (pn2) {

|if (JSParseNode *pn2 = ...)| would clarify that the variable's only needed in that one case.

@@ +5331,5 @@
> +
> +        /*
> +         * Emit a JSOP_BACKPATCH op to jump from the end of our then part
> +         * around the else part.  The js_PopStatementCG call at the bottom
> +         * of this switch case will fix up the backpatch chain linked from

"at the bottom of this switch case" needs updating.
Attachment #561804 - Flags: review?(jwalden+bmo) → review+
Whiteboard: [inbound]
https://hg.mozilla.org/mozilla-central/rev/4ecbff48c0f1
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

Too late for FF7 I guess, but can we get this into FF8 and FF9? cdleary estimates the risk of this patch as low in comment 21.
Attachment #561804 - Flags: approval-mozilla-beta?
Attachment #561804 - Flags: approval-mozilla-aurora?
We want to have this bake a little more on nightly before approving anywhere. We'll let the requests stand.
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

We decided because we haven't seen any reports of this and there is worry about testing that we'd rather have it work its way up through the whole process. Thanks for the quick turnaround and investigation Chris!
Attachment #561804 - Flags: approval-mozilla-beta?
Attachment #561804 - Flags: approval-mozilla-beta-
Attachment #561804 - Flags: approval-mozilla-aurora?
Attachment #561804 - Flags: approval-mozilla-aurora-
I am seeing this on latest Nightly once again (Python on repl.it fails on load with "too much recursion" in the error console).
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [inbound]
(In reply to Alon Zakai (:azakai) from comment #30)

Alon, the sqlite shell example still seems to work okay -- is there a way we can get the Python one in shell form for testing?
Yes, here is the python file used on repl.it:

https://github.com/replit/empythoned/blob/master/dist/python.opt.js

(download the "view raw" link there).

That file fails in the console with "too much recursion".
I'm seeing this on the examples on http://g.raphaeljs.com/ on Mac, yikes. Running Beta8, I'll check trunk
Happens on latest trunk too :-( The news is gRaphaël isn't as popular as Raphaël (which seemed to work when I tried it). The bad news is this doesn't look fixed (and is more widespread than initially thought).
Although I should note it might be an issue with the library (https://github.com/DmitryBaranovskiy/g.raphael/issues/123)
OK, I'm told the real fix here is complex enough that we won't take it on branches other than trunk. But we should get it fixed on trunk.
Depends on: 704369
Now that bug 704369 has landed on mozilla-central, could somebody verify that this issue is resolved? I don't have a test case, though I would be happy to add one to the suite if it were easily identifiable.
I am on the very latest nightly, and repl.it still fails for me on "too much recursion".
(In reply to Alon Zakai (:azakai) from comment #38)
> I am on the very latest nightly, and repl.it still fails for me on "too much
> recursion".

Push from m-i to m-c happened at 01:30 so it will probably only be in a mozilla-central build at this point.
Sorry for jumping the gun here.

Ok, I tested it now on today's nightly, and it works! :)
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Target Milestone: mozilla10 → mozilla11
Whiteboard: [qa+]
Attached image Screenshot
Using the scenario from comment #9, the module never ends loading, although "too much recursion" is not returned in the web console. Instead, there is an uncaught exception and some parsing errors (see attached screenshot)
repl.it is now broken because of recent XHR changes with typed arrays. I filed an issue with the website. This is unrelated to the "too much recursion" error.
Scenario from comment #9 works fine on other browsers (Chrome): the python module loads and is functional. In this case, do you want another bug to be opened for the issues mentioned in comment #41?
Thanks!
I think we can mark this verified simply due to the fact that the "too much recursion" errors are no longer reproducible; in spite of the other issues reported. Any further issues should be reported to a new bug.
Status: RESOLVED → VERIFIED
Keywords: qawanted
Whiteboard: [qa+] → [qa!]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: