Closed
Bug 477021
Opened 15 years ago
Closed 15 years ago
JSRuntime.contextList mutations must not happen when GC is running
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: igor, Assigned: igor)
References
Details
(Keywords: fixed1.9.0.11, fixed1.9.1, Whiteboard: fixed-in-tracemonkey)
Attachments
(2 files, 4 obsolete files)
4.81 KB,
patch
|
brendan
:
review+
|
Details | Diff | Splinter Review |
4.19 KB,
patch
|
igor
:
review+
samuel.sidler+old
:
approval1.9.0.11+
|
Details | Diff | Splinter Review |
As it was discovered in the bug 476934 the xpc code always calls JS_DestroyContext APIs outside JS request violating the JS API requirement. One way to fix this is to add a call to JS_BeginRequest to ReleaseJSContext implementation. But this would effectively mean taking an extra lock when destroying the context. Thus I suggest to add a code to js_DestroyContext to wait for the GC to finish before unlinking the context from the runtime list. It will explicitly allow to call JS_DestroyContext outside requests. It will also allow for a C++ embedding to always use JSAutoRequest and similar classes removing a possibility of missing JS_EndRequest calls.
Assignee | ||
Comment 1•15 years ago
|
||
There is similar threading bug in js_NewContext. The function updates JSRuntime.contextList to add the new context there when the GC may run on a separated thread accessing the field when iterating over contexts. I nominate this for 1.9.1 as DOM workers may expose the bug. I also make this bug to depend on the bug 476934 as for the fix I plan to use js_WaitIfGC from that bug.
Flags: blocking1.9.1?
Summary: js_DestroyContext should wait for GC to finish → JSRuntime.contextList must not happen when GC is running
Assignee | ||
Comment 2•15 years ago
|
||
About embeddings forgetting to call JS_DestroyContext from a request: it is not only xpconnet but also shell/js.cpp who does exactly that in all 3 cases when it calls JS_DestroyContext* APIs. At this point I would even suggest to require that JS_DestroyContext must be called outside the request.
Assignee | ||
Updated•15 years ago
|
Summary: JSRuntime.contextList must not happen when GC is running → JSRuntime.contextList mutations must not happen when GC is running
Assignee | ||
Comment 3•15 years ago
|
||
The patch fixes js_NewContext issue and allows to call js_DestroyContext outside a request. The patch depends on the patch from bug 476934 comment 12 so I will ask for its review after addressing the bug 476934.
Assignee | ||
Comment 4•15 years ago
|
||
The new version of the patch is against TM tip. It makes sure that js_(New|Destroy)Context wait for the GC if necessary. I also changed the initialization order for the context to make sure that it is fully initialized before it is put on the runtime list.
Attachment #360901 -
Attachment is obsolete: true
Attachment #361029 -
Flags: review?(brendan)
Comment 5•15 years ago
|
||
Comment on attachment 361029 [details] [diff] [review] v2 Looks good -- best to get MikeM, Wes, and anyone else who can help to test hard on their MT server-side embeddings. /be
Attachment #361029 -
Flags: review?(brendan) → review+
Comment 6•15 years ago
|
||
Context list mutations are almost certainly causing this bug too: Bug 466182 Check the patch in. I'll test monday. Is the trunk otherwise stable for MT guys?
Comment 7•15 years ago
|
||
(In reply to comment #6) > Context list mutations are almost certainly causing this bug too: Bug 466182 > Check the patch in. I'll test monday. Is the trunk otherwise stable for MT > guys? Trunk for us is http://hg.mozilla.org/tracemonkey -- we have full tinderbox mochi and unit test. As for MT testing, it's getting better all the time since Firefox 3.1 has web worker threads. Finally! But please help us test in any way you can. Thanks, MikeM -- and thanks for all your years of patience while we were idling in single-thread land. /be
Comment 8•15 years ago
|
||
Hope this is the right bug to block bug 466182. If not, someone will fix the deps I'm sure. /be
Blocks: 466182
Assignee | ||
Comment 9•15 years ago
|
||
landed to TM - http://hg.mozilla.org/tracemonkey/rev/7153a04e0968
Whiteboard: fixed-in-tracemonkey
Assignee | ||
Comment 10•15 years ago
|
||
It turned out that I could not wait until the full TM tinderbox cycle finishes. So if the patch causes troubles, just back it out.
Updated•15 years ago
|
Flags: blocking1.9.1? → blocking1.9.1+
Comment 11•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/7153a04e0968
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•15 years ago
|
Flags: in-testsuite-
Flags: in-litmus-
Comment 12•15 years ago
|
||
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/4c12b34b1fea
Keywords: fixed1.9.1
Comment 13•15 years ago
|
||
To land this in the 1.9.0 branch, I think we must first land bug 467441 there. igor, does that sound right to you?
Comment 14•15 years ago
|
||
Comment 15•15 years ago
|
||
The interdiff is like:
>v1 added this extraneous line, which isn't relevant in the branch.
>- JS_ASSERT(cx->resolveFlags == 0);
Attachment #366892 -
Attachment is obsolete: true
Assignee | ||
Comment 16•15 years ago
|
||
(In reply to comment #13) > To land this in the 1.9.0 branch, I think we must first land bug 467441 there. > igor, does that sound right to you? Hm, does the trunk patch in this bug depends on that and not, for example, on bug 477021? In any case, if that makes backport life easier, I am very fine with that.
Comment 17•15 years ago
|
||
Did you mean some other bug? Bug 477021 is this bug!
Assignee | ||
Comment 18•15 years ago
|
||
(In reply to comment #17) > Did you mean some other bug? Bug 477021 is this bug! Yes, I meant some other bug - one from the list that you discovered for 1.9.0 backporting.
Comment 19•15 years ago
|
||
Comment on attachment 366905 [details] [diff] [review] backport to 1.9.0 (for SpiderMonkey 1.8 source release) v2 Patch 3/4 to fix a crashing bug for embedders that blocks the SpiderMonkey 1.8 source release. See bug 478336 comment 21 for the full list.
Attachment #366905 -
Flags: approval1.9.0.10?
Updated•15 years ago
|
Assignee: general → jorendorff
Comment 20•15 years ago
|
||
Comment on attachment 366905 [details] [diff] [review] backport to 1.9.0 (for SpiderMonkey 1.8 source release) v2 Approved for 1.9.0.10, a=dveditz for release-drivers
Attachment #366905 -
Flags: approval1.9.0.10? → approval1.9.0.10+
Assignee | ||
Comment 21•15 years ago
|
||
landed to 1.9.0: Checking in jscntxt.c; /cvsroot/mozilla/js/src/jscntxt.c,v <-- jscntxt.c new revision: 3.139; previous revision: 3.138 done
Keywords: fixed1.9.0.10
Assignee | ||
Comment 22•15 years ago
|
||
Backed out from 1.9.0 - the landed patch depends on the patch from bug 476934 that has caused compilation error on Windows.
Keywords: fixed1.9.0.10
Assignee | ||
Comment 23•15 years ago
|
||
A "final" glance before landing the patch has revealed that it used the JS_STATIC_ASSERT macro after some code. This breaks older MSVC versions that are still supported on 1.9.0 because JS_STATIC_ASSERT is implemented via an extern declaration that must come before any code in older versions of C. The new patch just drops that static assert and removes `< garbage text in the comment.
Attachment #366905 -
Attachment is obsolete: true
Attachment #373852 -
Flags: review+
Attachment #373852 -
Flags: approval1.9.0.10?
Assignee | ||
Updated•15 years ago
|
Assignee: jorendorff → igor
Assignee | ||
Comment 24•15 years ago
|
||
Here is the diff between v2 and v3 for 1.9.0: --- jscntxt.c.base 2009-04-21 17:36:40.000000000 +0200 +++ jscntxt.c 2009-04-21 17:37:32.000000000 +0200 @@ -228,7 +228,6 @@ #ifdef JS_THREADSAFE js_InitContextThread(cx, thread); #endif - JS_STATIC_ASSERT(JSVERSION_DEFAULT == 0); JS_ASSERT(cx->version == JSVERSION_DEFAULT); JS_INIT_ARENA_POOL(&cx->stackPool, "stack", stackChunkSize, sizeof(jsval), &cx->scriptStackQuota); @@ -352,7 +351,7 @@ if (last) { #ifdef JS_THREADSAFE - /*`< + /* * If cx is not in a request already, begin one now so that we wait * for any racing GC started on a not-last context to finish, before * we plow ahead and unpin atoms. Note that even though we begin a
Comment 25•15 years ago
|
||
Comment on attachment 373852 [details] [diff] [review] backport to 1.9.0 (for SpiderMonkey 1.8 source release) v3 Yeah, this is fine. Approved for 1.9.0.10. a=ss
Attachment #373852 -
Flags: approval1.9.0.10? → approval1.9.0.10+
Assignee | ||
Comment 26•15 years ago
|
||
landed to 1.9.0: Checking in jscntxt.c; /cvsroot/mozilla/js/src/jscntxt.c,v <-- jscntxt.c new revision: 3.143; previous revision: 3.142 done
Keywords: fixed1.9.0.10
Comment 27•15 years ago
|
||
Dear Developers, I have got some updates with the latest js code in CVS, and I make sure these code has incorporated the final patch for the Bug 477021. However, now I can use only two threads and three semaphores (the method is similar to my patches published in bug #133773) to generate a order violation (js_FinishRuntimeScriptState can happen first to destroy the scriptFilenameTable hash table, and then js_MarkScriptFilenames and js_SweepScriptFilenames are called to try to access this hash table). Although the NULL check at the beginning of the js_MarkScriptFilenames and js_SweepScriptFilenames functions can hide this segmentation fault, but I am afraid the NULL check is not a perfect solution to this order violation. Also, it seems to me that the js_WaitForGC() before removing context from the contextlist does not help. Below is my output. I will submit my patch to the bug page. If you have any concerns, please let me know. ---------------------------------------------------------------------- heming@rcs:~/bugs/firefox-133773$ sh bug Tid -138024048 END js_WaitForGC Tid -146416752 END js_WaitForGC DestroyContext tid -138024048 in critical region, js_ContextNum 2 Tid -138024048 sets -138024048 to sem 1 tid1 Tid -138024048 sets -138024048 to sem 2 tid2 Tid -138024048 js_WaitForGC in js_DestroyContext Tid -138024048 END js_WaitForGC enter js_GC tid -138024048, rt state 2, gcLevel 0 Tid -138024048 post on sem 1 Tid -138024048 wait on sem 2 DestroyContext tid -146416752 in critical region, js_ContextNum 1 Tid -146416752 sets -146416752 to sem 1 tid2 Tid -146416752 sets -146416752 to sem 2 tid1 Tid -146416752 wait on sem 1 Tid -146416752 wakes up on sem 1 Tid -146416752 js_WaitForGC in js_DestroyContext Tid -146416752 END js_WaitForGC enter js_GC tid -146416752, rt state 3, gcLevel 0 Tid -146416752 js_MarkScriptFilenames Tid -146416752 destroy js_FinishRuntimeScriptState Tid -146416752 post on sem 2 Tid -138024048 wakes up on sem 2 Tid -138024048 js_MarkScriptFilenames Tid -138024048 js_MarkScriptFilenames NUUUUUUUL, will CRASH without the scriptFilenameTable check! 0 Tid -138024048 js_SweepScriptFilenames NUUUUUUUL, will CRASH without the scriptFilenameTable check!
Attachment #399119 -
Flags: review+
Updated•15 years ago
|
Attachment #399119 -
Attachment is patch: true
Attachment #399119 -
Attachment mime type: application/octet-stream → text/plain
Comment 28•15 years ago
|
||
Here is the seg fault back trace if I remove the "if (!rt->scriptFilenameTable)" check in the js_SweepScriptFilenames() function (this check is added to fix another bug 131246, not for this bug or bug 133773). The order violation in a legal schedule is: one thread is allowed to call js_FinishRuntimeScriptState() (in js_DestroyContext() function) to destroy the hash table, then another thread is allowd to call js_SweepScriptFilenames() (in js_GC() function) to access this table. (gdb) bt #0 0x080a5bbb in JS_HashTableEnumerateEntries (ht=0x0, f=0x80feaa0 <js_script_filename_sweeper>, arg=0x815e008) at jshash.c:360 #1 0x080feb2b in js_SweepScriptFilenames (rt=0x815e008) at jsscript.c:1283 #2 0x080a4e2a in js_GC (cx=0x8175708, gckind=GC_NORMAL) at jsgc.c:3426 #3 0x08071798 in js_DestroyContext (cx=0x8175708, mode=JSDCM_FORCE_GC) at jscntxt.c:453 #4 0x08058cb3 in JS_DestroyContext (cx=0x8175708) at jsapi.c:1035 #5 0x0804bf37 in RunJavascript ()
Updated•15 years ago
|
Attachment #399119 -
Flags: review+ → review?(igor)
Comment 29•15 years ago
|
||
Comment on attachment 399119 [details] [diff] [review] The js_WaitForGC() in js_DestroyContext does not help. You can not review your own patch. A JavaScript peer must review your patch instead. I'll ask Igor to look at it but I'm not sure the problem you are trying to fix is clearly identified. It may be better to file a new bug rather than try to hijack this bug.
Comment 30•15 years ago
|
||
Yes, please file a new bug, attach a patch with r?igor@mir2.org set there, as I suggested elsewhere. Thanks, /be
Assignee | ||
Comment 31•15 years ago
|
||
Comment on attachment 399119 [details] [diff] [review] The js_WaitForGC() in js_DestroyContext does not help. The patch went to bug 515403.
Attachment #399119 -
Attachment is obsolete: true
Attachment #399119 -
Flags: review?(igor)
You need to log in
before you can comment on or make changes to this bug.
Description
•