Last Comment Bug 678623 - "too much recursion" with physics engine in worker thread
: "too much recursion" with physics engine in worker thread
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
-- normal (vote)
: mozilla8
Assigned To: Alon Zakai (:azakai)
: Jason Orendorff [:jorendorff]
Depends on: 653967
  Show dependency treegraph
Reported: 2011-08-12 14:16 PDT by Alon Zakai (:azakai)
Modified: 2011-08-16 10:53 PDT (History)
5 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

stack trace (17.25 KB, text/plain)
2011-08-12 14:16 PDT, Alon Zakai (:azakai)
no flags Details
patch (1005 bytes, patch)
2011-08-12 15:54 PDT, Alon Zakai (:azakai)
no flags Details | Diff | Splinter Review
patch v2 (1.94 KB, patch)
2011-08-15 16:48 PDT, Alon Zakai (:azakai)
bent.mozilla: review+
Details | Diff | Splinter Review

Description User image Alon Zakai (:azakai) 2011-08-12 14:16:50 PDT
Created attachment 552763 [details]
stack trace

The worker version of the physics engine demo in bug 653967,

fails with "too much recursion" happening in the worker thread. However the non-worker version of that demo, which loads the same physics engine (just not in a worker) works ok,

This problem is sensitive to something, not sure what: I get it on Nightly on linux and when I build with default settings, but I do not get it when I build with --disable-optimize for example. bent can't see it on os x.

Attaching a stack trace for this error. js_EmitTree occurs dozens of times calling itself recursively, eventually leading to js_ReportOverRecursed being run.
Comment 1 User image Alon Zakai (:azakai) 2011-08-12 15:54:45 PDT
Created attachment 552785 [details] [diff] [review]

This patch sets the stack quota to 128*sizeof(size_t)*1024, which is the same as elsewhere (TabChild, nsInProcessTabChildGlobal). With this the demo works fine.
Comment 2 User image Ben Turner (not reading bugmail, use the needinfo flag!) 2011-08-12 16:24:06 PDT
I don't know if this is safe... Does the mochitest dom/workers/test/test_recursion.html pass without crashing? I tuned this to 32 on OS X because that was the largest value I could use without that test crashing...
Comment 3 User image Alon Zakai (:azakai) 2011-08-12 16:47:28 PDT
I sent to try half an hour ago. I guess I'll see it crash on OS X there...

Do you know if the code in the worker in that test crashes OS X when run normally (not in a worker)? If so, then the other places we currently set the stack size to the value in the patch are dangerous. If not, then I wonder what is different between the worker and non-worker case.
Comment 4 User image Alon Zakai (:azakai) 2011-08-12 18:22:42 PDT
As far as I can see, we never pick a stack size - all our thread creations use the default platform values. And I believe those values are 8MB for Linux, 1MB for Windows, 512K for OS X. So I guess that can explain why OS X would be most sensitive to this problem - the JS stack size limit we set on a 32-bit platform is 512K.

If that is the situation, perhaps a solution is to set the stack size to 1MB on OS X? I worry about us doing such a change though.
Comment 5 User image Emanuel Hoogeveen [:ehoogeveen] 2011-08-13 06:20:46 PDT
Bug 664341 reduced the stack size for media threads on Linux.
Comment 6 User image Alon Zakai (:azakai) 2011-08-15 16:48:35 PDT
Created attachment 553309 [details] [diff] [review]
patch v2

This patch sets the stack size to 1MB and the JS stack limit to 512K (on 32-bit systems), and passes try.

The patch uses the same stack size on all OSes instead of using the defaults. This seems safer since we are now able to be sure that the JS stack limit will be smaller (the crashes before were apparently due to the JS stack limit being equal to the native stack size, and we overflowed).
Comment 7 User image Ben Turner (not reading bugmail, use the needinfo flag!) 2011-08-15 16:58:16 PDT
Comment on attachment 553309 [details] [diff] [review]
patch v2

Review of attachment 553309 [details] [diff] [review]:

Sounds good! Thanks.

::: dom/workers/RuntimeService.cpp
@@ +78,5 @@
>  #define WORKER_RUNTIME_HEAPSIZE 32 * 1024 * 1024
> +// The C stack size. We use the same stack size on all platforms for
> +// consistency.
> +#define WORKER_STACKSIZE 256 * sizeof(size_t) * 1024


@@ +804,5 @@
>    if (!thread) {
> +    if (NS_FAILED(NS_NewThread(getter_AddRefs(thread),
> +                               nsnull,
> +                               WORKER_STACKSIZE))) {

Nit: keep nsnull on the first line, only wrap what you have to.
Comment 9 User image Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-08-16 04:18:07 PDT

Could we check in a test here?
Comment 10 User image Ben Turner (not reading bugmail, use the needinfo flag!) 2011-08-16 07:57:48 PDT
The test is "Does dom/workers/test/test_recursion.html crash the browser?"
Comment 11 User image Alon Zakai (:azakai) 2011-08-16 10:03:39 PDT
We can also add a test for the other direction, that is, to test if we can run huge scripts in workers without hitting "too much recursion". I don't see how to do that though except to add a test with the demo in this bug (2MB), and I assumed people would object to that. Or not?

Note You need to log in before you can comment on or make changes to this bug.