Last Comment Bug 687951 - "too much recursion" errors with large compiled JS files
: "too much recursion" errors with large compiled JS files
Status: VERIFIED FIXED
[qa!]
: regression
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86 Linux
: -- normal with 1 vote (vote)
: mozilla11
Assigned To: Chris Leary [:cdleary] (not checking bugmail)
:
Mentors:
http://repl.it
Depends on: 704369
Blocks: 644244 649576
  Show dependency treegraph
 
Reported: 2011-09-20 11:59 PDT by Alon Zakai (:azakai)
Modified: 2012-03-01 12:46 PST (History)
14 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
wontfix
-
wontfix
-
wontfix
-
wontfix
+
verified


Attachments
stack trace during "too much recursion" (17.25 KB, text/plain)
2011-09-21 11:17 PDT, Alon Zakai (:azakai)
no flags Details
WIP: simple refactoring (53.70 KB, patch)
2011-09-22 11:34 PDT, Chris Leary [:cdleary] (not checking bugmail)
jwalden+bmo: review+
dmandelin: feedback+
christian: approval‑mozilla‑aurora-
christian: approval‑mozilla‑beta-
Details | Diff | Review
Screenshot (116.75 KB, image/png)
2012-02-24 07:03 PST, Mihaela Velimiroviciu (:mihaelav)
no flags Details

Description Alon Zakai (:azakai) 2011-09-20 11:59:43 PDT
http://repl.it is a new open source site with REPL environments for various languages. Python, Ruby and Lua are compiled from C to JS. Python and Ruby do not work for me on FF9 on Linux, I get

  too much recursion

during load.

The site works fine in other browsers.

The same problem appears to show up in sqlite_0_1_cc_pretty.js in bug 687127 (which is another large JS file generated by compiled C code). Interestingly there, running minification on the file prevents the problem.
Comment 1 Alon Zakai (:azakai) 2011-09-20 12:15:06 PDT
Works for me on FF6. Fails on FF7, FF8 and as already mentioned FF9, all on Linux (Ubuntu 32-bit).

Nominating for tracking, since this is an innovative use of JavaScript that is becoming more and more common. Other websites that use the same technology include for example http://zodb.ws, http://pythonfiddle.com/ and others (sometimes the same compiled C code works on those, though, might be different compilation options and/or minification options). So this regression is concerning, especially when the sites work properly on other browsers: We shouldn't be the browser that holds sites back from pushing the web platform forward.

I am hoping that the problem is a quota of some sort in the JS engine and not too hard to fix.
Comment 2 Aaron Kaluszka 2011-09-20 12:29:06 PDT
repl.it WFM on Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:9.0a1) Gecko/20110920 Firefox/9.0a1
Comment 3 Alon Zakai (:azakai) 2011-09-20 13:17:45 PDT
Thanks for the info Aaron. Just to be sure we are talking about the same thing, the steps to reproduce that I should have elaborated on before are:

1. load repl.it
2. click on Python

You should get an interpreter environment at that point, where you can enter commands and get responses, if everything worked. For example,

  print 5

should lead to 5 being printed.
Comment 4 Aaron Kaluszka 2011-09-20 14:26:28 PDT
Yes, I tested with |print "hello"| but it did in fact respond with |hello|. Ruby worked as well.
Comment 5 David Mandelin [:dmandelin] 2011-09-20 14:32:01 PDT
Since this is a regression, can you bisect it to find the regressing changeset?
Comment 6 christian 2011-09-20 14:38:39 PDT
It's working fine for me on Firefox 7 on Mac....platform/arch specific?
Comment 7 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2011-09-20 14:55:47 PDT
Works for me, using:
Build identifier: Mozilla/5.0 (Windows NT 6.1; rv:9.0a1) Gecko/20110906 Firefox/9.0a1
I followed the steps in comment 3.
Comment 8 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-09-20 14:57:49 PDT
Tested using Firefox 9.0a1 20110920085517 on Ubuntu 11.04 64-bit. It appears like this might be Linux only. Just to confirm, here is what I see:

1) Load http://repl.it
2) Select "Python"
3) Wait for module to load

Result:
Module never loads completely; progress bar nears 100% but never quite finishes.

Is this what I should be looking for when I search the regression range?
Comment 9 Alon Zakai (:azakai) 2011-09-20 15:04:26 PDT
Yes, that is it. To be absolutely sure, you can open the web console before loading the python page (control-shift-K), and you should see "too much recursion" as an error in the console. So I guess the full STR should be

1. Open web console (control-shift-K)
2. Load http://repl.it
3. Select Python
4. Wait for load, see that it stops at some point, with "too much recursion" in the web console. Or, if it worked, you will see a prompt, and you can continue to 5,
5. Enter |print 5| and press enter. You should see 5 printed.
Comment 10 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-09-20 15:29:11 PDT
Just a quick status update...I've narrowed down to sometime between 2011-05-31 and 2011-06-30. I'm still working to narrow it down further.
Comment 11 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-09-20 15:45:37 PDT
Last good nightly: 2011-06-27
First bad nightly: 2011-06-28

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5e072732cb45&tochange=383e60bc9089
Comment 12 christian 2011-09-20 18:13:35 PDT
Thanks Anthony!
Comment 13 Alon Zakai (:azakai) 2011-09-20 18:31:08 PDT
Thanks from me as well! I am bisecting this with local builds now. So far I can confirm that the first changeset passes and the last fails, as expected.
Comment 14 Alon Zakai (:azakai) 2011-09-21 11:11:58 PDT
Finished bisecting, bug 649576 is responsible,

changeset:   71894:3d646df22a4b
user:        Chris Leary <cdleary@mozilla.com>
date:        Fri Jun 24 14:22:30 2011 -0700
summary:     Bug 649576: Extricate JSHashTable from JSAtomList death grip. (r=luke)

Looks like that bug changed parsing code which does seem very relevant to this issue. No idea why the problem would only show up on Linux as it does, though. (Different stack sizes perhaps? Linux's are very large though...)
Comment 15 Alon Zakai (:azakai) 2011-09-21 11:17:31 PDT
Created attachment 561524 [details]
stack trace during "too much recursion"

Here is a stack trace during a "too much recursion error". 85 recursive calls to js_EmitTree there.
Comment 16 christian 2011-09-21 11:44:12 PDT
Assigning to Chris. Chris, we're worried about this for Firefox 7's release...can you take a loop ASAP?
Comment 17 Chris Leary [:cdleary] (not checking bugmail) 2011-09-21 15:00:36 PDT
It looks like js_EmitTree has a 1328 byte frame. That's high for an inherently recursive function. My backtrace has 380 frames with main locals at 0xffffd1c8 and leaf function locals at 0xfff83998, which averages to ~1300 bytes per frame.

By contrast, my debug build only seems to have 448 bytes per js_EmitTree frame. When I add an 800B stack allocated buffer it fails the same way in debug.

We should factor out cases in js_EmitTree to alleviate the locals pressure. I'll argue that this isn't high priority, but I'll see what I can whip up in a few hours.
Comment 18 christian 2011-09-21 15:05:11 PDT
If you don't think it's high priority we don't either...you're the expert.

To make sure we are on the same page, I would like your assessment of:

1. If it affects all platforms (not clear as some of us can't reproduce)
2. If you think this type of issue would be hit a lot on the open web
3. General feel of engineering effort required to fix it (trivial/low/medium/high/tons)
3. General feeling of risk for a possible fix (trivial/low/medium/high/tons)

Thanks!
Comment 19 Chris Leary [:cdleary] (not checking bugmail) 2011-09-22 11:34:15 PDT
Created attachment 561804 [details] [diff] [review]
WIP: simple refactoring

Factors a bunch of case arms out of js_EmitTree which appears to let better inlining decisions occur. Opt 32 bit I get a 416B frame, debug is down to 384B.
Comment 20 David Mandelin [:dmandelin] 2011-09-22 13:36:53 PDT
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

Looks good. I like breaking up giant functions anyway.
Comment 21 Chris Leary [:cdleary] (not checking bugmail) 2011-09-22 14:11:07 PDT
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

So, to recap per comment 18:

1. Doesn't affect all platforms, it's dependent on the compiler and optimization level. (The specific compiler determines the frame size, and Linux GCC seems to be making an agressively-sized frame for whatever reasons: inlining, register allocation spills, etc. The OS X GCC is a much older version than Linux GCC and MSVC is a different beast altogether.)
2. No, just for emscripten-like generated workloads that haven't been run through some kind of normalization, like the closure compiler. We should make sure to qualify GWT-generated sites to make sure it's not an issue there, but I think we would have run into that at this point in the cycle if it were an issue.
3. Low (patch made).
4. Low (pushed to try).
Comment 22 Max Shawabkeh 2011-09-22 16:54:13 PDT
Just to chime in with a note here about point #2. Passing the code through the Closure compiler does not in fact eliminate the issue. The live code on repl.it that's triggering the problem is Closure-compiled.
Comment 23 Alon Zakai (:azakai) 2011-09-23 10:06:14 PDT
As Max said, closure compiler is not a complete workaround for this problem. It works in some cases but not others.

I'd like to also add that not being able to run large compiled JS files is limiting in several areas. Aside from various language runtimes which started this bug, for example Mozilla's Paladin gaming project uses such files (so far without issue, thankfully). Such compiled files might also be useful for things like text-to-speech (bug 525444).
Comment 24 Chris Leary [:cdleary] (not checking bugmail) 2011-09-26 15:31:40 PDT
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

dmandelin said giving this review to Waldo would be okay.
Comment 25 Jeff Walden [:Waldo] (remove +bmo to email) 2011-09-26 19:03:43 PDT
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

Review of attachment 561804 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsemit.cpp
@@ +5084,5 @@
> +    JSObjectBox *prevBox = NULL;
> +    /* If this try has a catch block, emit it. */
> +    JSParseNode *pn2 = pn->pn_kid2;
> +    JSParseNode *lastCatch = NULL;
> +    if (pn2) {

|if (JSParseNode *pn2 = ...)| would clarify that the variable's only needed in that one case.

@@ +5331,5 @@
> +
> +        /*
> +         * Emit a JSOP_BACKPATCH op to jump from the end of our then part
> +         * around the else part.  The js_PopStatementCG call at the bottom
> +         * of this switch case will fix up the backpatch chain linked from

"at the bottom of this switch case" needs updating.
Comment 26 Marco Bonardo [::mak] 2011-09-28 02:09:50 PDT
https://hg.mozilla.org/mozilla-central/rev/4ecbff48c0f1
Comment 27 Alon Zakai (:azakai) 2011-09-28 10:08:45 PDT
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

Too late for FF7 I guess, but can we get this into FF8 and FF9? cdleary estimates the risk of this patch as low in comment 21.
Comment 28 christian 2011-09-29 14:35:30 PDT
We want to have this bake a little more on nightly before approving anywhere. We'll let the requests stand.
Comment 29 christian 2011-10-04 14:53:57 PDT
Comment on attachment 561804 [details] [diff] [review]
WIP: simple refactoring

We decided because we haven't seen any reports of this and there is worry about testing that we'd rather have it work its way up through the whole process. Thanks for the quick turnaround and investigation Chris!
Comment 30 Alon Zakai (:azakai) 2011-10-11 10:14:42 PDT
I am seeing this on latest Nightly once again (Python on repl.it fails on load with "too much recursion" in the error console).
Comment 31 Chris Leary [:cdleary] (not checking bugmail) 2011-10-11 11:20:47 PDT
(In reply to Alon Zakai (:azakai) from comment #30)

Alon, the sqlite shell example still seems to work okay -- is there a way we can get the Python one in shell form for testing?
Comment 32 Alon Zakai (:azakai) 2011-10-11 11:27:18 PDT
Yes, here is the python file used on repl.it:

https://github.com/replit/empythoned/blob/master/dist/python.opt.js

(download the "view raw" link there).

That file fails in the console with "too much recursion".
Comment 33 christian 2011-10-13 19:48:16 PDT
I'm seeing this on the examples on http://g.raphaeljs.com/ on Mac, yikes. Running Beta8, I'll check trunk
Comment 34 christian 2011-10-13 19:50:55 PDT
Happens on latest trunk too :-( The news is gRaphaël isn't as popular as Raphaël (which seemed to work when I tried it). The bad news is this doesn't look fixed (and is more widespread than initially thought).
Comment 35 christian 2011-10-13 19:52:45 PDT
Although I should note it might be an issue with the library (https://github.com/DmitryBaranovskiy/g.raphael/issues/123)
Comment 36 David Mandelin [:dmandelin] 2011-11-21 16:43:52 PST
OK, I'm told the real fix here is complex enough that we won't take it on branches other than trunk. But we should get it fixed on trunk.
Comment 37 Chris Leary [:cdleary] (not checking bugmail) 2011-12-07 08:28:46 PST
Now that bug 704369 has landed on mozilla-central, could somebody verify that this issue is resolved? I don't have a test case, though I would be happy to add one to the suite if it were easily identifiable.
Comment 38 Alon Zakai (:azakai) 2011-12-07 13:35:30 PST
I am on the very latest nightly, and repl.it still fails for me on "too much recursion".
Comment 39 Chris Leary [:cdleary] (not checking bugmail) 2011-12-07 14:36:51 PST
(In reply to Alon Zakai (:azakai) from comment #38)
> I am on the very latest nightly, and repl.it still fails for me on "too much
> recursion".

Push from m-i to m-c happened at 01:30 so it will probably only be in a mozilla-central build at this point.
Comment 40 Alon Zakai (:azakai) 2011-12-08 10:04:04 PST
Sorry for jumping the gun here.

Ok, I tested it now on today's nightly, and it works! :)
Comment 41 Mihaela Velimiroviciu (:mihaelav) 2012-02-24 07:03:29 PST
Created attachment 600382 [details]
Screenshot

Using the scenario from comment #9, the module never ends loading, although "too much recursion" is not returned in the web console. Instead, there is an uncaught exception and some parsing errors (see attached screenshot)
Comment 42 Alon Zakai (:azakai) 2012-02-24 13:26:02 PST
repl.it is now broken because of recent XHR changes with typed arrays. I filed an issue with the website. This is unrelated to the "too much recursion" error.
Comment 43 Mihaela Velimiroviciu (:mihaelav) 2012-02-28 01:03:44 PST
Scenario from comment #9 works fine on other browsers (Chrome): the python module loads and is functional. In this case, do you want another bug to be opened for the issues mentioned in comment #41?
Thanks!
Comment 44 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-03-01 12:46:27 PST
I think we can mark this verified simply due to the fact that the "too much recursion" errors are no longer reproducible; in spite of the other issues reported. Any further issues should be reported to a new bug.

Note You need to log in before you can comment on or make changes to this bug.