Closed Bug 70240 Opened 24 years ago Closed 24 years ago

liveconnect should not cache JSContext pointers!

Categories

(Core Graveyard :: Java: Live Connect, defect)

defect
Not set
critical

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: xiaobin.lu, Assigned: beard)

Details

Attachments

(7 files)

From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
BuildID:    

Reload the applet, cause nsLiveconnect::Getwindow is called, it actullay uses 
the cached JSContext from the thread list. Unfortunately, the cx->runtime has 
been changed into an invalid state which causes JS_malloc call crash. 

Reproducible: Always
Steps to Reproduce:
1.Unzip the testcase
2.Launch the index.html,click the link
3.You may see another window is pop up and an applet will be shown in a 
seperate frame.
4.Go ahead and click File->End Seesion
5.Click reload button in the loading page
6.Another applet will be shown, click File->End Session again
7.The browser crashes right away 

Actual Results:  Everything should be OK.

Expected Results:  The browser crash.

  I have a chance to debug into the code. Actually when we hit File->End 
Session, it will call JSObject.getWindow function which in turn calls 
nsLiveconnect::Getwindow function.
  The logic to calculate the JSContext is if it is not in the thread list, it 
calculate a new one for that, so when the second time we call this function, we 
actullay used the JSContxet from the cached list. However, this value should 
not be changed. In fact, cx->runtime changed to an invalid state and it will 
cause the JS_malloc( a sequent call in the Getwindow()) crash.
  Since this bug is very urgent to our 6.0A OEM release, I provided a simple 
workaround for this. Would you guys review this workaround and if there is no 
problem( it will not cause other problems, I mean), we will put it into our OEM 
branch and we may sit to wait for a better solution.
 
beard, you want this one, right?  Either to fix or to reassign to whoever owns
the relevant liveconnect code?

It may be that the workaround only makes for inefficiency, creating a new
context all the time instead of reusing it.  A better fix might watch for
js_DestroyContexts and clear the cached context pointer.  I don't see a hook to
watch for js_DestroyContexts off-hand (thought maybe there was on in jsdbgapi.h
and .c).  We can add one easily.

/be
Assignee: rogerl → beard
Confirming based on xiaobin.lu's testimony.

/be
Status: UNCONFIRMED → NEW
Ever confirmed: true
This is the stack trace when the second time we hit the reload button:

JS_DestroyContext(JSContext * 0x14b03648) line 832
nsJSContext::~nsJSContext() line 374 + 13 bytes
nsJSContext::`scalar deleting destructor'(unsigned int 1) + 15 bytes
nsJSContext::Release(nsJSContext * const 0x14b035d0) line 382 + 154 bytes
nsCOMPtr<nsIScriptContext>::assign_assuming_AddRef(nsIScriptContext * 
0x00000000) line 472
nsCOMPtr<nsIScriptContext>::assign_with_AddRef(nsISupports * 0x00000000) line 
849
nsCOMPtr<nsIScriptContext>::operator=(nsIScriptContext * 0x00000000) line 584
nsDocShell::Destroy(nsDocShell * const 0x142a4bf4) line 1706
nsWebShell::Destroy(nsWebShell * const 0x142a4bf4) line 1396
nsHTMLFrameInnerFrame::~nsHTMLFrameInnerFrame() line 489
nsHTMLFrameInnerFrame::`scalar deleting destructor'(unsigned int 1) + 15 bytes
nsFrame::Destroy(nsFrame * const 0x13de8f88, nsIPresContext * 0x14a4cfe0) line 
428 + 34 bytes
nsFrameList::DestroyFrames(nsIPresContext * 0x14a4cfe0) line 36
nsContainerFrame::Destroy(nsContainerFrame * const 0x13de8f20, nsIPresContext * 
0x14a4cfe0) line 98
nsFrameList::DestroyFrames(nsIPresContext * 0x14a4cfe0) line 36
nsContainerFrame::Destroy(nsContainerFrame * const 0x13de8e24, nsIPresContext * 
0x14a4cfe0) line 98
nsLineBox::DeleteLineList(nsIPresContext * 0x14a4cfe0, nsLineBox * 0x13de8ed0) 
line 252
nsBlockFrame::Destroy(nsBlockFrame * const 0x13de8d9c, nsIPresContext * 
0x14a4cfe0) line 1230 + 16 bytes
nsFrameList::DestroyFrames(nsIPresContext * 0x14a4cfe0) line 36
nsContainerFrame::Destroy(nsContainerFrame * const 0x13de8d64, nsIPresContext * 
0x14a4cfe0) line 98
nsFrameList::DestroyFrames(nsIPresContext * 0x14a4cfe0) line 36
nsContainerFrame::Destroy(nsContainerFrame * const 0x13de8d28, nsIPresContext * 
0x14a4cfe0) line 98
ViewportFrame::Destroy(ViewportFrame * const 0x13de8d28, nsIPresContext * 
0x14a4cfe0) line 144
FrameManager::~FrameManager() line 405
FrameManager::`scalar deleting destructor'(unsigned int 1) + 15 bytes
FrameManager::Release(FrameManager * const 0x14261988) line 384 + 157 bytes
PresShell::~PresShell() line 1328 + 36 bytes
PresShell::`scalar deleting destructor'() + 15 bytes
PresShell::Release(PresShell * const 0x13f62060) line 1238 + 158 bytes
nsCOMPtr<nsIPresShell>::~nsCOMPtr<nsIPresShell>() line 490
DocumentViewerImpl::~DocumentViewerImpl() line 449 + 97 bytes
DocumentViewerImpl::`scalar deleting destructor'(unsigned int 1) + 15 bytes
DocumentViewerImpl::Release(DocumentViewerImpl * const 0x14af1858) line 357 + 
154 bytes
nsCOMPtr<nsIContentViewer>::assign_assuming_AddRef(nsIContentViewer * 
0x00000000) line 472
nsCOMPtr<nsIContentViewer>::assign_with_AddRef(nsISupports * 0x00000000) line 
849
nsCOMPtr<nsIContentViewer>::operator=(nsIContentViewer * 0x00000000) line 584
nsDocShell::SetupNewViewer(nsDocShell * const 0x149fe6d8, nsIContentViewer * 
0x14b2b4d0) line 2868
nsWebShell::SetupNewViewer(nsWebShell * const 0x149fe6d8, nsIContentViewer * 
0x14b2b4d0) line 350 + 13 bytes
nsDocShell::Embed(nsDocShell * const 0x149fe6f8, nsIContentViewer * 0x14b2b4d0, 
const char * 0x01ea06ac, nsISupports * 0x00000000) line 2501 + 23 bytes
nsWebShell::Embed(nsWebShell * const 0x149fe6f8, nsIContentViewer * 0x14b2b4d0, 
const char * 0x01ea06ac, nsISupports * 0x00000000) line 383
nsDocShell::CreateContentViewer(nsDocShell * const 0x149fe6d8, const char * 
0x0012fa9c, nsIChannel * 0x14b22d18, nsIStreamListener * * 0x0012faf0) line 
2692 + 32 bytes
nsDSURIContentListener::DoContent(nsDSURIContentListener * const 0x149feb30, 
const char * 0x0012fa9c, int 3, const char * 0x100a56c8 gCommonEmptyBuffer, 
nsIChannel * 0x14b22d18, nsIStreamListener * * 0x0012faf0, int * 0x0012fa80) 
line 103 + 33 bytes
nsDocumentOpenInfo::DispatchContent(nsIChannel * 0x14b22d18, nsISupports * 
0x00000000) line 367 + 109 bytes
nsDocumentOpenInfo::OnStartRequest(nsDocumentOpenInfo * const 0x14b22fa8, 
nsIChannel * 0x14b22d18, nsISupports * 0x00000000) line 241 + 16 bytes
nsFileChannel::OnStartRequest(nsFileChannel * const 0x14b22d20, nsIChannel * 
0x1497f038, nsISupports * 0x00000000) line 634
nsOnStartRequestEvent::HandleEvent(nsOnStartRequestEvent * const 0x14982a48) 
line 210 + 26 bytes
nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x149825a0) line 97 + 12 bytes
PL_HandleEvent(PLEvent * 0x149825a0) line 580 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00bd6500) line 513 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x00020220, unsigned int 49424, unsigned int 0, 
long 12412160) line 1049 + 9 bytes
USER32! 77e148dc()
USER32! 77e14aa7()
USER32! 77e266fd()
nsAppShellService::Run(nsAppShellService * const 0x00c181d0) line 408
main1(int 1, char * * 0x00497a60, nsISupports * 0x00000000) line 990 + 32 bytes
main(int 1, char * * 0x00497a60) line 1171 + 37 bytes
mainCRTStartup() line 338 + 17 bytes
KERNEL32! 77e992a6()

  It seems that when we hit the reload button, the FrameManager is elicted to 
destroy the old frames which leads to destroy the JSContext. While the next 
time we get the JSContext, it was reset to an invalid state.
  
So, this may be a problem with the caching of an applet in the nsILiveconnect 
instance, but I each time you call JSObject.getWindow(), the applet/plugin 
instance you pass in should overwrite the previous one. 
Status: NEW → ASSIGNED
  Patrick:
    I think the problem is why the js_DestroyContext was called when we hit the 
reload button. For a page which has an applet embedded in, each time we hit the 
reload button, that function was not called. So everything is OK. For our case, 
the applet is shown in a seperate frame and when we hit the reload button in 
the page which loads that applet, the js_DestroyContext was called so that the 
next time when it gets used, it picks up a invalid one.
   
  After debugging further, I found I made a wrong comment. Actually everytime, 
when a webpage is reloaded, it will elict the frame manager to do some clean 
stuff. 
  The only difference of our case is the frame is called nsHTMLFrameInnerFrame 
because it is generated by clicking a link in the parent window. When reloaded 
that kind of page, nsHTMLFrameInnerFrame destructor was called and so 
nsDocshell destroy is called which leads the JSContext is cleaned.
  I found that after we hit File->End Session to end the applet of the test 
case, if at this time we went to an applet which introduces some liveconnect 
call, it will pick up a wrong JSContext->runtime cached in the thread_list, 
then it will crash the browser.
  I think the better solution is in the liveconnect side, either we check the 
validity before we use the JSContext or do some backup process to save the 
context, so if next time we only need to check if they are equal or not, if 
they are equal, we can use that JSContext, otherwise, generate a new one.
  Patrick: Would you mind looking for a better solution for this bug? Thanks!
  The workaround proposed before is not very efficient. I tried to figure out 
how to notify the jsj_env when the js_DestroyContext is called. However, that 
will affect the performance even more because the js_DestroyContext must be 
called in many situations. So, instead of doing that, I decided to sacrify some 
efficiency in the liveconnect side. Originally, we got jsj_env from the 
thread_list. To consider that the jscontext might be changed, the patch simply 
recalculate the jscontext everytime we want to use it. The workaround is just 
recalculate the whole jsj_env structure. This is why I said it is better than 
the workaround. 
Brendan:
   Would you be so kind to review the patch and give us a sr and approve ?
   Thanks in advance! I owe you a lot!
Fixing summary to describe the problem.  I'll attach a patch soon.

/be
Summary: JS_malloc cause the browser crash → liveconnect needs to be notified about js_DestroyContext events
Ok, no patch from me.  A question instead.  The liveconnect/jsjava.h API export
header file has this comment and prototype:

/* This function is used to specify a particular JSContext as *the* JavaScript
   execution environment to be used when LiveConnect is accessed from the given
   Java thread, i.e. when one of the methods of netscape.javascript.JSObject
   has been called.  There can only be one such JS context for any given Java
   thread at a time.  (To multiplex JSContexts among a single thread, this
   function could be called before Java is invoked on that thread.)  The return
   value is the previous JSContext associated with the given Java thread.

   If this function has not been called for a thread and a crossing is made
   into JavaScript from Java, the map_jsj_thread_to_js_context() callback will
   be invoked to determine the JSContext for the thread.  The purpose of the 
   function is to improve performance by avoiding the expense of the callback.
*/
JS_EXPORT_API(JSContext *)
JSJ_SetDefaultJSContextForJavaThread(JSContext *cx, JSJavaThreadState *jsj_env);

I don't find any calls to this method from OJI code.  What am I missing?

/be
   Are you suggesting me to call this function instaed of calling the 
map_jsj_thread_to_js_context()? But I think we need to pass a JS contxet to 
this function,right? However, this function can not help the Java thread to get 
a valid JSContext. I am sorry if I miss what you said. 
I'm suggesting that someone needs to call JSJ_SetDefaultJSContextForJavaThread
or map_jsj_thread_to_js_context will be called every time, as the comments near
line 94 in liveconnect/jsjava.h make clear.

If no one ever calls JSJ_SetDefaultJSContextForJavaThread, then the liveconnect
code will, when it finds no better JSContext to use (i.e., when Java is calling
JS "spontaneously"), call map_jsj_thread_to_js_context.   Oops, looks like both
calls to that callback *do* update jsj_env->cx, which means there is *no point
at all* in an implementation of that callback invoking
JSJ_SetDefaultJSContextForJavaThread.

Ah, here is the
impl:http://lxr.mozilla.org/seamonkey/source/modules/oji/src/lcglue.cpp#112. 
The comment there about "invalid under Gecko" is wrong -- the old layout
codebase did not have a 1:1 relationship between thread and JSContext either --
but never mind.  It looks like, if all goes well, the applet's plugin instance
peer is consulted at
http://lxr.mozilla.org/seamonkey/source/modules/oji/src/lcglue.cpp#140 to find
the JSContext to use.  This must be the one that's going away.

Note right here: the liveconnect API suggests that this
map_jsj_thread_to_js_context_impl function *could* call
JSJ_SetDefaultJSContextForJavaThread, once it has an answer it likes.  But I see
the problem the comment about "invalid under Gecko" alludes to: there can be
many JSContexts in use by many applets in the one Gecko/UI/main thread.

So we need to ask every time, or else let liveconnect cache the last returned
context in jsj_env->cx -- but we must be careful to invalidate and refill that
cache when asked for a different applet's context (safe because applets never
"switch contexts").

Anyway, on to find the GetJSContext impl....

http://lxr.mozilla.org/seamonkey/source/modules/plugin/nglsrc/nsPluginInstancePeer.cpp#871

This looks like it.  The JSContext to use is gleaned from the global (window)
object of the document containing this plugin instance (peer).

So, I'm breaking this down in tedious detail to rediscover what we already know:
we're stashing a JSContext* that then goes away behind liveconnect's back.

From the comments in liveconnect/jsjava.h, it seems to me liveconnect is *not*
supposed to cache the cx returned by map_jsj_thread_to_js_context in
jsj_env->cx.  Instead, *implementations* of map_jsj_thread_to_js_context that
can safely cache may do so by calling JSJ_SetDefaultJSContextForJavaThread.  So
here's a patch that does that.

/be
Really fixing summary.  Patch next.

/be
Summary: liveconnect needs to be notified about js_DestroyContext events → liveconnect should not cache JSContext pointers!
I wrote:

>Oops, looks like both calls

I mean "callers" here, not "calls" -- "callers within js/src/liveconnect/*.c".

>to that callback *do* update jsj_env->cx, which means there is *no point at
>all* in an implementation of that callback invoking
>JSJ_SetDefaultJSContextForJavaThread.

I'm commenting again to make it crystal-clear that the jsjava.h comments about
how map_jsj_thread_to_js_context and JSJ_SetDefaultJSContextForJavaThread (which
is misspelled as JSJ_SetJSContextForJavaThread, I'll fix that too) work do not
match how the code works.  The comments and the design are the right way; the
little lines after each map_jsj_thread_to_js_context call succeeds that set
jsj_env->cx = cx; are at fault.

Caching must be optional, at the discretion of the map_jsj_thread_to_js_context
implementation.

/be
   Brendan
       Thank you very much!
brendan: uhhhhh, okay.  You're too smart for me.  I just read your last four
comments on this bug, and barely got done with one comment before another came
in!  :-)

Thank you.  Now what?  If Xiaobin verifies that your patch works, does this mean
that it's automatically r'd and sr'd?  I don't know how it works in this case.
beard should get r= and sr= from people who should review js/src/liveconnect. 
Maybe himself and jband?

/be
   I have verified the patch. It does work now.
Beard:
   Would you give us a sr for the patch as soon as possible? Thank you very 
much!
sr=beard, thanks brendan for making the code prettier as well. Here's an updated 
patch to remove a few more hard tabs and make everything more better.
Beard: why don't you check the latest patch in and mark this closed.  Between
you, me, and xiabin, it has enough review.  Thanks,

/be
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
I am unable to unzip the testcase to verify the fix - for some reason
WinZip thinks the file is corrupt...

Can anyone with access to the testcase verifiy that this bug is fixed? 
If so, please mark it "Verified"; if not, please reopen it - thanks! 
Based on Xiaobin's comment at 2001-03-01 17:59 above, marking VERIFIED - 
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: