Closed
Bug 82034
Opened 24 years ago
Closed 24 years ago
JavaScript to Java broken after lazy start of JVM (REQUESTOR: Xiaobin)
Categories
(Core Graveyard :: Java: OJI, defect, P1)
Core Graveyard
Java: OJI
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: joe.chou, Assigned: jband_mozilla)
References
()
Details
(Keywords: crash, Whiteboard: [oji_escalation] oji_working)
Attachments
(7 files)
4.50 KB,
text/plain
|
Details | |
814 bytes,
text/plain
|
Details | |
2.42 KB,
patch
|
Details | Diff | Splinter Review | |
2.75 KB,
patch
|
Details | Diff | Splinter Review | |
11.31 KB,
patch
|
Details | Diff | Splinter Review | |
11.36 KB,
patch
|
Details | Diff | Splinter Review | |
649 bytes,
patch
|
Details | Diff | Splinter Review |
After lazy start of JVM, JavaScript to Java communication somehow was broken. A
simple test case in the URL can help to reproduce the problem: go to the URL,
bring up Java console, and press the print button.
Actual happened: no thing showed on the concole.
Expected to happen: a message, "HelloWorld.print() called" should be shown.
Assign and to myself and accepted.
Assignee: edburns → joe.chou
Comment 2•24 years ago
|
||
Add Stanley and Patrick to the cc list.
Comment 3•24 years ago
|
||
Joe, would you please zip the testcase and post it to here since javaweb is an
internal website. Thanks!
Comment 6•24 years ago
|
||
Marking as OJI hot bugs also.
Xiaobin and I did some more debugging, and found out that currently the browser
crashed when pressing the print button in the test case attached. The crash
happened on Windows and Solaris (On Linux, somehow, instead of crashing, the
browser was hung.) Tracing on Windows and Solaris, the crash point is the same:
on line 2179 of jsobj.c:
ok = resolve(cx, obj, js_IdToValue(id));
and the reason of crashing was that resolve somehow was null (0x0). Further
tracing found out that the reason of resove being null was that, on line 2083 of
the same file:
/* Try obj's class resolve hook if id was not found in obj's scope. */
if (!sym) {
clasp = LOCKED_OBJ_GET_CLASS(obj);
resolve = clasp->resolve;
resolve was assigned with a null value. Obj in LOCKED_OBJ_GET_CLASS(obj) was
javaObject. It seemed the javaObject class somehow was not initialized.
Any ideas?
Comment 8•24 years ago
|
||
I think this bug should also be marked as one of a OJI hot bugs.
Changing severity of this bug to "blocker" so that it's known to be stopping
people from getting their needed functionality out of the browser. Xiaobin,
this is the way I'll mark it as a hot bug for now, unless we get a specific,
customer-driver case for putting the "oji_escalation" status on it.
If you query the OJI "blocker" bugs, you should see a fairly well-contained
list. We'll start managing that more closely to help us keep our focus.
Severity: critical → blocker
Summary: JavaScript to Java broken after lazy start of JVM → JavaScript to Java broken after lazy start of JVM (REQUESTOR: Xiaobin)
Whiteboard: [oji_escalation]
Comment 10•24 years ago
|
||
George, I think this bug warrants escalation status. It blocks Gary Kind.
Comment 11•24 years ago
|
||
Gary, will you please add a comment to this bug stating why this is a blocker
for you? I'd like to start collecting customer "testimonials" about the
important bugs. It's good data, and you've been giving us great data in the past.
Comment 12•24 years ago
|
||
Jst:
Would you please look at this bug? I suspect that it has something to do
with your stuff.
Reporter | ||
Comment 13•24 years ago
|
||
Per my further testing, the May 7 trunk build seemed working OK, but May 8 trunk
build had problems for the same test case attached. When the print button was
pressed, the browser was hung, instead of crashed as in the later builds (as
described earlier). Therefore, it looked like the test case started having
problem after the landing of XPCDOM_20010329_BRANCH.
Jst, would you take a look and see if you can find some clue? Thanks.
Comment 14•24 years ago
|
||
This patch fixes *a* crash that I'm seeing, but not really knowing alot about
java I'm not sure I'm even doing the right thing when testing this out. Let me
know if this makes things better.
Index: dom/src/base/nsDOMClassInfo.cpp
===================================================================
RCS file: /cvsroot/mozilla/dom/src/base/nsDOMClassInfo.cpp,v
retrieving revision 1.23
diff -u -r1.23 nsDOMClassInfo.cpp
--- nsDOMClassInfo.cpp 2001/05/22 10:48:22 1.23
+++ nsDOMClassInfo.cpp 2001/05/25 10:10:04
@@ -2987,7 +2987,10 @@
jobject appletObject = nsnull;
nsresult rv = javaPluginInstance->GetJavaObject(&appletObject);
- NS_ENSURE_SUCCESS(rv, rv);
+
+ if (NS_FAILED(rv) || !appletObject) {
+ return rv;
+ }
nsCOMPtr<nsILiveConnectManager> manager =
do_GetService(nsIJVMManager::GetCID());
Comment 15•24 years ago
|
||
Johnny:
Thanks for you quick response!
However, I don't think it is a problem of Java Plugin. Actually I applied
your patch, it still can not avoid crash. The appletObject returns from plugin
is a valid object, anyway.
Reporter | ||
Comment 16•24 years ago
|
||
Here is some more testing results of the attached test case:
- Running May 7 build (with lazy start, and XPCDOM landing not checked in yet),
the browser hung at xpcom/base/nsCOMPtr.h line 649, "You can't dereference a
NULL nsCOMPtr with operatior->().".
- Running May 7 build (backed out lazy start, and without XPCDOM), it worked.
- Runing May 8 build (with lazy start and XPCDOM landing), the browser hung.
- Running May 8 build (backed out lazy start, but with XPCDOM), the browser
still hung.
- Running May 21 build (with lazy start, XPCDOM, and whatever already checked
in), crashed (see detail at the beginning of the comments).
Keywords: crash
Priority: -- → P1
Comment 17•24 years ago
|
||
Comment 18•24 years ago
|
||
This bug is due to XPCDOM which changes the way of native JS applet object
being wrapped, I think.
The reason of crash is "resolve call back function" is NULL for JavaObject
class which is not valid. I posted a patch which adds this function.
After this patch, we can resolve the function of the applet object. However,
there is another issue needs to be resolved. The issue is the js_Invoke function
can not find the java object reference from the Javascript call stack.
Johnny:
Any ideas on solving this?
Comment 19•24 years ago
|
||
jband, help! :-)
Comment 20•24 years ago
|
||
Adding status to indicate that it's actively being worked on by Joe.
Whiteboard: [oji_escalation] → [oji_escalation] oji_working
Assignee | ||
Comment 21•24 years ago
|
||
Yuck!
Note that xiaobin's patch should not return JS_FALSE if the property is not
found. This will terminate the script if the property is not found. If the
function is not going to define a JS property anyway then it could just as well
be implemented as:
JSBool JS_DLL_CALLBACK
JavaObject_resolve(JSContext *cx, JSObject *obj, jsval id)
{
return JS_TRUE;
}
The new DOM code sets the liveconnect wrapped JavaObject as the JS __proto__ of
the xpconnect wrapped node using the code in nsHTMLExternalObjSH::PostCreate...
http://lxr.mozilla.org/seamonkey/search?string=nsHTMLExternalObjSH%3A%3APost
Note that the decriptive comments there are for the plugin case where the plugin
instance is an xpconnect wrapped object.
So the problem we get into is that JS starts out by working on a JSClass-ish
xpconnect wrapped native JSObject and that object delegates to a JSOps-ish
liveconnect wrapped java object. Once we get into js_GetProperty we are on our
way down a slippery slope.
The JavaObject_resolve implementation *could* do a lookup and do a call to
define a JS property on the object. And you could implement a
JSClass::getProperty callback that uses the code in JavaObject_getPropertyById.
I think Brendan has a bug (or at least a plan) to get rid of the related
JavaObject_lookupProperty "*propp = (JSProperty*)1;" hack.
I wonder if the whole JSOps-based scheme should be dumped rather than trying to
patch this a piece at a time to interoperate.
Comment 22•24 years ago
|
||
jband, thanks for your comment here.
Two questions for your comment:
* Is it worthing to add resolve function?
* Why we need to define the JS property to the plugin object? By the way,
what's the kind of property we need to define.
Assignee | ||
Comment 23•24 years ago
|
||
Ok. So I'm debugging this and realizing that this all *should* just work right
and nver even try to cal the JavaObject resolver because the code in
js_LookupProperty knows how to detect a non-NATIVE JSObject as do the right
thing. But it was not doing that. Then I discovered that shaver had made the
JavaObject *be* a native object! (here we are saying 'native' to mean a JSObject
that uses the JSObjectOps code of jsobj.c - it has slots and all that jazz.
Specifically this means the the OBJ_IS_NATIVE(obj) macro resolves to true).
shaver checked in fixes to bug 74848 that make these objects be native even
though they have a JSClass riddled with NULL callbacks! Note my astonishment.
This was *such* a bomb waiting to go off.
I'm not seeing yet exactly why shaver felt the need to go this far.
Reporter | ||
Comment 24•24 years ago
|
||
More testing results (using same test case attached earlier):
1) With May 7 trunk build (including 26516(lazy start JVM) and others so far,
but no XPCDOM landing yet), if applied patch of 77600, then the test case
worked. Without the patch of 77600, the browser hung (as described in earlier
comments). The reason of the browser being bung was that the lazy start of JVM
started too late, and the patch of 77600 fixed that problem.
2) With May 8 trunk build (including XPCDOM landing, plus other checkins during
the day), even applied patch of 77600, the browser crashed (see stack below).
The cause of the crash was that somehow JSJ hash table was null. Was there any
change (s) in XPCDOM might be related to that? This crash is different from the
crash of May 21 trunk build (due to null resolve method of javaObject, as
described in earlier comments), and it may reveal some clues of the root cause
of the problem.
Stack of crash:
...
Program received signal SIGSEGV, Segmentation fault.
0xff2dac20 in JSJ_HashTableAdd (ht=0x0, key=0x1b01c4, value=0x76e808,
arg=0x78f4a8) at jsj_hash.c:285
285 keyHash = (*ht->keyHash)(key, arg);
Current language: auto; currently c
(gdb) where
#0 0xff2dac20 in JSJ_HashTableAdd (ht=0x0, key=0x1b01c4, value=0x76e808,
arg=0x78f4a8) at jsj_hash.c:285
#1 0xff2d58a4 in new_class_descriptor (cx=0x5d9f28, jEnv=0x78f4a8,
java_class=0x1b01c4) at jsj_class.c:384
#2 0xff2d5b90 in jsj_GetJavaClassDescriptor (cx=0x5d9f28, jEnv=0x78f4a8,
java_class=0x1b02a8) at jsj_class.c:475
#3 0xff2d14a0 in jsj_WrapJavaObject (cx=0x5d9f28, jEnv=0x78f4a8,
java_obj=0xf855c, java_class=0x1b02a8) at jsj_JavaObject.c:155
#4 0xff2d86b4 in jsj_ConvertJavaObjectToJSValue (cx=0x5d9f28, jEnv=0x78f4a8,
java_obj=0xf855c, vp=0xffbec19c) at jsj_convert.c:861
#5 0xff2cbfb8 in JSJ_ConvertJavaObjectToJSValue (cx=0x5d9f28,
java_obj=0xf855c, vp=0xffbec19c) at jsj.c:861
#6 0xfd046dcc in nsJVMManager::WrapJavaObject (this=0x1ebfd0,
context=0x5d9f28, javaObject=0xf855c, outJSObject=0xffbec2b8)
at nsJVMManager.cpp:377
#7 0xfcf9de9c in nsHTMLAppletElementSH::GetPluginJSObject (this=0x7c2920,
cx=0x5d9f28, obj=0x600e38, plugin_inst=0x7d7284, plugin_obj=0xffbec2b8,
plugin_proto=0xffbec2b4) at nsDOMClassInfo.cpp:2815
#8 0xfcf9c898 in nsHTMLExternalObjSH::PostCreate (this=0x7c2920,
wrapper=0x731ca0, cx=0x5d9f28, obj=0x600e38) at nsDOMClassInfo.cpp:2530
#9 0xfdf6c824 in XPCWrappedNative::GetNewOrUsed (ccx=@0xffbec4b0,
Object=0x76f198, Scope=0x5c28f0, Interface=0x2c2bd0,
resultWrapper=0xffbec41c) at xpcwrappednative.cpp:388
...
Assignee | ||
Comment 25•24 years ago
|
||
I think the fix to bug 74848 was simply wrong. AFAICT shaver and brendan decided
to make these objects be native simply to uphold the assertions in jslock.c that
would fire when the finalizer took us through a path that called
js_GetRequiredSlot and thus did the locking.
I think they failed to take into account that there are categories of JSObjects
that are not native, but have legitimate reasons for using JSScopes and having
slots and all that. We don't have a reasonable way to detect objects
like that currently. Perhaps the right thing is to really make these
liveconnect objects be native. But there is much more to that than the changes
shaver made.
I propose to undo shaver's change that made them native and (more
controversially) remove the JS_ASSERT(OBJ_IS_NATIVE(obj)); checks from
js_LockObj and js_UnlockObj. I'll attach a patch for testing and if brendan or
shaver surface we can discuss whether or not this is the right thing to do.
Assignee | ||
Comment 26•24 years ago
|
||
Reporter | ||
Comment 27•24 years ago
|
||
I applied the John's patch in the May 8th trunk build, it crashed with the same
stack as the one I just posted.
Assignee | ||
Comment 28•24 years ago
|
||
So that's a different bug, right? Why not file a new report? Does the current
trunk crash for you in this way?
Comment 29•24 years ago
|
||
Thanks, John. Would you make your patch be appoved ASAP?
Comment 30•24 years ago
|
||
Applied John's patch, it works for me now!
Comment 31•24 years ago
|
||
I didn't test in Unix. In windows, it works now. Joe, did you still get crash
in Unix?
Reporter | ||
Comment 32•24 years ago
|
||
I tried John's patch on May 21 trunk build on Solaris, and it hung, with and
without the patch of 77600. The May 21 build was the one that being used when
the bug was filed. I'll try a more recent build later today, and see if the
result may be different.
Reporter | ||
Comment 33•24 years ago
|
||
I tried the patch on May 29 build on Solaris, still hung. Looking for more
details.
Assignee | ||
Comment 34•24 years ago
|
||
Joe, do you have any way to break into the hung process and see what it is
doing? Are you sure it is really stuck? How long do you wait? On a 400mhz NT
machine I see a circa 15 second pause the first time I hit the print button. No
pause on subsequent button presses or after page reload. FWIW, I see the same
sort of pause the first time I run javac on the command line.
Reporter | ||
Comment 35•24 years ago
|
||
I tried the patch again, this time I did not apply the patch of 77600, only
applied John's patch, and it worked!
John, in your patch, I noticed that you made the change in the init method for:
jsj_JavaArray
jsj_JavaClass
jsj_JavaObject
what about the other ones like:
jsj_JavaMember
jsj_JavaPackage
jsj_JSObject
Do they need the same change as well?
Assignee | ||
Comment 36•24 years ago
|
||
Joe: AFAICT I changed all the places that use JSObjectOps.
Reporter | ||
Comment 37•24 years ago
|
||
I found out that if I did not open the Java console before going to the test
page, the browser would then be hung. Patch of 77600 seemed not to be the quilty
party. Do you see that on Windows, John and Xiaobin? It seems somehow I need to
open the Java console first, then bring up the test page, and the test case will
work. Otherwise, if I bring up the test page then open the Java console, the
browser will get stuck. At least that's what happened on Solaris.
Comment 38•24 years ago
|
||
It is fine in Windows with or without openning Java console first. Actually it
does not matter whether you apply the "patch" for 77600.
Reporter | ||
Comment 39•24 years ago
|
||
Xiaobin, did you try with the test case in this bug (press print button)?
Reporter | ||
Comment 40•24 years ago
|
||
I found out the reason of "requiring" to start Java console first, and came up
with a patch which seemed to resolve the problem. If the problem turns out to be
Unix only, then I may open a seperate bug to deal with the issue.
Assignee | ||
Comment 41•24 years ago
|
||
shaver is on vacation and brendan is working on a secret project and wishes to
not be disturbed. However, they both responded to my email ragging on them for
their whacky change. As it happens, neither of them liked my proposed solution.
Patrick and I also discussed this in depth. The consensus is that these objects
should not sit on the fence trying to both be non-native (as far as JS is
concerned) yet using JS native handlers to manage their slot accesses. The best
long term solution may be to get rid of the JSObjectOps stuff and make these
liveconnect wrappers just *be* JS natives.
But for now the clean and simple thing seems to be to make them be non-native
and have then *not* use the JS engine slot accessors. This is easy enough to do.
Patrick and I agreed that it is OK to make these slot accessors be non-locking
since there is no slot array reallocing to race with. And (we assume) setting
and getting jsvals from memory locations is atomic on the platforms that matter.
Otherwise I think my patch is non-controversial. It works for me. Attachment
coming...
Assignee | ||
Comment 42•24 years ago
|
||
Comment 43•24 years ago
|
||
Joe:
I tested both your testcase and mine, both work OK.
Assignee | ||
Comment 44•24 years ago
|
||
Assignee | ||
Comment 45•24 years ago
|
||
Oops, I realized that in the absense of a JSObjectOps::mark function the jsgc
will use obj->map->freeslot to determine how many slots to mark. So, we must set
that. Also, I changed from having a magic number for the slot count to using
(JSSLOT_PRIVATE+1).
Comment 46•24 years ago
|
||
De-lurk: looks ok to me, but the jsj_wrapper_[gs]etRequiredSlot names do not
need to end in "Op", which would have saved some whitespace-only changes in the
diff (eyestrain whine only). r/sr=brendan@mozilla.org
<relurk/>
/be
Comment 47•24 years ago
|
||
Comment 48•24 years ago
|
||
The patch I just attached should fix the crash you'll see if an applet fails to
load, that patch should go in with the other patch in this bug.
Assignee | ||
Comment 49•24 years ago
|
||
r/sr=jband on jst's patch.
Reporter | ||
Comment 50•24 years ago
|
||
With the patches here, on Windows platform the problem seems resolved, but on
Unix, unless the Java console is open first, the browser will be hung. A new bug
has been filed to address the problem, and the bug number is: 83698, which will
have a patch on top of the ones here to make Unix work.
Comment 51•24 years ago
|
||
r=beard for DOM helper code patch.
Comment 52•24 years ago
|
||
A nit, the patch to jsj_JavaClass.c introduces an extra copy of the comment:
/* Mandatory non-null function pointer members. */
in the declaration of JavaClass_ops. At least that's how patch seems to be
interpreting it, and I think that's what the diff indicates. r=beard
Assignee | ||
Comment 53•24 years ago
|
||
Thanks Patrick. Nit fixed.
I'll take this bug and try to get both my last patch and jst's patch approved
for 0.9.1. Both patches are r='d and sr='d.
Assignee: joe.chou → jband
Status: ASSIGNED → NEW
Comment 54•24 years ago
|
||
a= asa@mozilla.org for checkin to the 0.9.1 branch and the trunk.
(on behalf of drivers)
Assignee | ||
Comment 55•24 years ago
|
||
My patch and jst's patch checked in to 0.9.1 branch and trunk.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 57•24 years ago
|
||
Verified on the commercial trunk (2001-09-06-05-trunk) and branch build
(2001-09-07-05-0.9.4).
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•