Bug 352064 - Error finalizing JS objects causes LiveConnect crash (possibly exploitable)
Status: VERIFIED FIXED
Whiteboard: [sg:critical?] uses freed memory
Keywords: verified1.8.0.9, verified1.8.1.1
Product: Core
Classification: Components
Component: JavaScript Engine
Version: 1.8 Branch
Platform: All All
Importance: -- critical
Target Milestone: mozilla1.8.1
Assigned To: Steven Michaud [:smichaud] (Retired)
Reported: 2006-09-10 13:44 PDT by Steven Michaud [:smichaud] (Retired)
Modified: 2007-08-09 13:53 PDT
dveditz: blocking1.8.1.1+
dveditz: blocking1.8.0.9+
bob: in‑testsuite+


Attachments
Java applet plus HTML code test case (2.14 KB, application/zip)
2006-09-10 13:49 PDT, Steven Michaud [:smichaud] (Retired)
no flags
OS X crash log (37.92 KB, text/plain)
2006-09-10 13:52 PDT, Steven Michaud [:smichaud] (Retired)
no flags
Linux crash logs (3.41 KB, text/plain)
2006-09-10 13:55 PDT, Steven Michaud [:smichaud] (Retired)
no flags
Windows crash logs (32.56 KB, text/plain)
2006-09-10 13:56 PDT, Steven Michaud [:smichaud] (Retired)
no flags
Patch that fixes the problem (3.87 KB, patch)
2006-09-10 14:25 PDT, Steven Michaud [:smichaud] (Retired)
no flags
Same patch without printf statements (3.15 KB, patch)
2006-09-14 09:58 PDT, Steven Michaud [:smichaud] (Retired)
jhpedemonte: review+
brendan: superreview+
mconnor: approval1.8.0.9+
mtschrep: approval1.8.1-
mconnor: approval1.8.1.1+

Description Steven Michaud [:smichaud] (Retired) 2006-09-10 13:44:14 PDT
I will attach a test-case Java applet (plus invoking HTML code) that
demonstrates the problem, plus crash logs (generated in Firefox
1.5.0.6 on Mac OS X, SuSE Linux 9.2 and Windows XP Pro) that
demonstrate the potential for exploitability.

I'll also attach a patch to JavaObject_finalize() and
jsj_GC_callback() in js/src/liveconnect/jsj_JavaObject.c that seems to
fix the problem.

At least in the context of my test case, JavaObject_finalize() is
called from js_FinalizeObject() in js/src/jsobj.c when the JavaScript
garbage collector wants to get rid of a "JavaObject".  Whether or not
JavaObject_finalize() has succeeded, js_FinalizeObject() drops the
object map and frees the slots for the JavaObject (or any other kind
of JavaScript object) that it's trying to finalize.  But, under some
circumstances, JavaObject_finalize() _doesn't_ succeed, and _doesn't_
stop the JavaObject from continuing to be used.  As it does keep
getting re-used, eventually you get a crash.
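
The failure mode described above can be modeled in a few lines of C. This is only a toy sketch: ToyObject, reflection_table and the toy_* functions are hypothetical names invented for illustration, and a boolean flag stands in for the real freeing of the map and slots so that the example itself never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not Mozilla code. */
typedef struct ToyObject {
    unsigned id;
    bool slots_freed;   /* stands in for "map dropped, slots freed" */
} ToyObject;

#define TABLE_SIZE 8u
/* Plays the role of a table of live reflections. */
static ToyObject *reflection_table[TABLE_SIZE];

static void table_add(ToyObject *obj)
{
    assert(obj != NULL);
    reflection_table[obj->id % TABLE_SIZE] = obj;
}

static void table_remove(const ToyObject *obj)
{
    reflection_table[obj->id % TABLE_SIZE] = NULL;
}

/* Models a finalizer that gives up before removing the table entry
 * whenever it cannot enter Java. */
static void toy_finalize_original(ToyObject *obj, bool enter_java_ok)
{
    if (!enter_java_ok)
        return;             /* early return: the entry stays in the table */
    table_remove(obj);
}

/* Models the engine: it runs the finalizer, then releases the object's
 * storage whether or not the finalizer did its job. */
static void toy_engine_finalize(ToyObject *obj, bool enter_java_ok)
{
    toy_finalize_original(obj, enter_java_ok);
    obj->slots_freed = true;
}

/* True if a lookup would hand back an object whose storage is gone. */
static bool toy_lookup_is_stale(unsigned id)
{
    const ToyObject *obj = reflection_table[id % TABLE_SIZE];
    return obj != NULL && obj->slots_freed;
}
```

In this model, finalizing an object with enter_java_ok set to false leaves a table entry that still points at released storage, which is the stale reference the crashes appear to come from.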

The crashes are almost always under
JS_EvaluateUCScriptForPrincipals(), and in turn under js_Invoke(),
js_Execute() or js_Interpret().  They appear always to be access
violations, and often (but not always) the instruction pointer gets
set to values that appear to be inside strings (e.g. 0x20202020) --
which suggests heap corruption and/or buffer overflows.  But I suspect
that these crashes would be hard to exploit -- since it would probably
be hard to control _which_ data structure (and which part of that
structure) the instruction pointer ends up being set to.

I don't really know who to cc this bug to -- so feel free to add cc's
for others you think should be aware of it.  (I just chose the last
two people who'd made changes to jsj_JavaObject.c, plus Jesse
Ruderman, who I know is interested in finding security holes at least
in the Mac versions of Mozilla.org browsers.)
Comment 1 Steven Michaud [:smichaud] (Retired) 2006-09-10 13:49:23 PDT
Created attachment 237646
Java applet plus HTML code test case

Here's my test case.  You'll have to reload the applet at least once
to get a crash.  In my experience, crashes are easiest to get on Mac
OS X and Windows (I'm not sure why).  On Linux I generally have to
reload the applet many times, quickly, before I get a crash.
Comment 2 Steven Michaud [:smichaud] (Retired) 2006-09-10 13:52:53 PDT
Created attachment 237648
OS X crash log

These crash logs are selected to demonstrate the potential for
exploitation.  Only about half the crashes I've seen set the
instruction pointer to what's obviously the inside of some data
structure.
Comment 3 Steven Michaud [:smichaud] (Retired) 2006-09-10 13:55:30 PDT
Created attachment 237649
Linux crash logs

The Linux and Windows crash logs were generated by running Firefox
1.5.0.6 inside gdb.  Without gdb, the browser always dies without
putting up any OS or Talkback crash dialog.
Comment 4 Steven Michaud [:smichaud] (Retired) 2006-09-10 13:56:00 PDT
Created attachment 237650
Windows crash logs
Comment 5 Steven Michaud [:smichaud] (Retired) 2006-09-10 14:25:00 PDT
Created attachment 237654
Patch that fixes the problem

In the original code for JavaObject_finalize(), jsj_EnterJava() is
always called -- regardless of whether that's actually necessary
(regardless of whether or not jEnv needs to be set).  But
jsj_EnterJava() can fail if it's re-entered from a different JSContext
-- and that can happen while a JavaScript "alert" window is open
during a Java-to-JavaScript call to "alert()" (e.g. while a call to
JS_CallFunctionValue() from nsCLiveconnect::Call() still hasn't
returned).  (I'm still not sure _why_ this happens, but I've seen that
it does happen.  It may have something to do with the fact that the
"alert window" and the window it was popped up from have different
JSContexts.)

If the original code's call to jsj_EnterJava() fails for any reason,
the JavaObject that's being finalized doesn't get removed from the
java_obj_reflections hashtable -- so it keeps getting re-used.  But,
though its ClassDescriptor also doesn't get released,
js_FinalizeObject() still drops its map and frees its slots -- which
means that when the JavaObject is re-used, references will be made to
structures which have already been finalized.
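
The ordering that the patch relies on can be sketched as follows. This is a simplified model under stated assumptions, not the patch itself: Wrapper, java_available and finalize_patched are hypothetical names, and boolean fields stand in for the real hashtable entry and for the cleanup that needs a Java environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; these names are illustrative, not the LiveConnect API. */
typedef struct Wrapper {
    bool in_table;   /* still reachable through the reflections table */
    bool released;   /* Java-side cleanup done immediately */
    bool deferred;   /* Java-side cleanup postponed */
} Wrapper;

/* Models whether an attempt to enter Java would succeed right now. */
static bool java_available = true;

static void finalize_patched(Wrapper *w)
{
    assert(w != NULL);

    /* Step 1: always drop the table entry. This step needs no Java
     * environment, so it must not sit behind the "enter Java" check. */
    w->in_table = false;

    /* Step 2: only the cleanup that really needs Java is conditional. */
    if (java_available)
        w->released = true;
    else
        w->deferred = true;   /* to be cleaned up at a later point */
}
```

Whichever branch step 2 takes, the wrapper is no longer reachable through the table once the finalizer returns, so it cannot be handed out again after the engine frees its storage.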

I've tested this patch with Firefox 1.5.0.6 on Mac OS X, Windows XP
and SuSE Linux.  Since jsj_JavaObject.c is the same on the trunk and
all branches, this patch should apply to any of them without offsets
(and should work without problems).
Comment 6 Steven Michaud [:smichaud] (Retired) 2006-09-10 20:25:28 PDT
Here's something I forgot to mention that makes it impossible to test
this bug on Mac OS X on the trunk -- a change to the
nsIScriptSecurityManager interface on 2006-08-21 broke the Java
Embedding Plugin's support for JavaScript-to-Java LiveConnect.  For
more information see bug 350664.
Comment 7 Steven Michaud [:smichaud] (Retired) 2006-09-14 08:47:19 PDT
Well?
Comment 8 Igor Bukanov 2006-09-14 09:15:48 PDT
(In reply to comment #7)
> Well?
> 

Could you attach the patch without printf statements? 

Comment 9 Steven Michaud [:smichaud] (Retired) 2006-09-14 09:58:31 PDT
Created attachment 238446
Same patch without printf statements

Here it is.
Comment 10 Igor Bukanov 2006-09-14 13:37:54 PDT
Guys, who should review this? Also, this probably belongs to Java-LiveConnect, right?
Comment 11 Brendan Eich [:brendan] 2006-09-14 14:16:55 PDT
Comment on attachment 238446
Same patch without printf statements


> JS_EXPORT_API(void)
> JavaObject_finalize(JSContext *cx, JSObject *obj)
> {
>     JavaObjectWrapper *java_wrapper;
>     jobject java_obj;
>     JNIEnv *jEnv;
>     JSJavaThreadState *jsj_env;
> 
>     java_wrapper = JS_GetPrivate(cx, obj);
>     if (!java_wrapper)
>         return;
>     java_obj = java_wrapper->java_obj;
> 
>-    jsj_env = jsj_EnterJava(cx, &jEnv);
>-    if (!jEnv)
>-        return;
>-
>     if (java_obj) {
>         remove_java_obj_reflection_from_hashtable(java_obj, java_wrapper->u.hash_code);
>-

Nit: please don't remove this blank line.

Patch looks good to me.  Sorry for the delay in responding to bugmail; the problem is that liveconnect is ownerless at the moment (the Sun folks who owned it changed jobs).  Javier, could you review?  If you r+, please land it on the trunk with the nit picked, or ask someone else to do it if you are as jammed up with other bugs as I am ;-).

/be
Comment 12 Mike Schroepfer 2006-09-15 10:35:38 PDT
Pushing to 1.8.1.1 - can you land on trunk ASAP so we can start the baking?
Comment 13 Mike Schroepfer 2006-09-15 10:36:18 PDT
Comment on attachment 238446
Same patch without printf statements

pushing to 1.8.1.1 since we need some time on trunk to bake this...
Comment 14 jhp (no longer active) 2006-09-18 08:55:32 PDT
Comment on attachment 238446
Same patch without printf statements

>+        jsj_env = jsj_EnterJava(cx, &jEnv);
>+        if (jEnv) {
>+            jsj_ReleaseJavaClassDescriptor(cx, jEnv, java_wrapper->class_descriptor);
>+            JS_free(cx, java_wrapper);
>+            jsj_ExitJava(jsj_env);
>+        } else {
>+            java_wrapper->u.next = deferred_wrappers;
>+            deferred_wrappers = java_wrapper;
>+        }

Why do you have the 'deferred_wrappers' code in the jsj_EnterJava 'failure' case?
Comment 15 Steven Michaud [:smichaud] (Retired) 2006-09-18 09:05:54 PDT
> Why do you have the 'deferred_wrappers' code in the jsj_EnterJava
> 'failure' case?

Because otherwise the ClassDescriptor, "java_wrapper" and global ref
will never be freed.
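
A deferral list of this kind can be sketched as below. This is a minimal model, assuming the deferred entries are drained later at some point where Java can be entered (for example from a GC callback); ToyWrapper, defer_wrapper and drain_deferred are hypothetical names, and the union only mirrors the u.hash_code and u.next fields visible in the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a wrapper whose storage for the hash code is
 * reused as a list link once the wrapper has left the hashtable. */
typedef struct ToyWrapper {
    union {
        unsigned hash_code;         /* meaningful while in the table */
        struct ToyWrapper *next;    /* meaningful once deferred */
    } u;
} ToyWrapper;

/* Head of the singly linked list of wrappers awaiting cleanup. */
static ToyWrapper *deferred = NULL;

static ToyWrapper *new_wrapper(void)
{
    return calloc(1, sizeof(ToyWrapper));
}

/* Push a wrapper whose cleanup could not be done right away. */
static void defer_wrapper(ToyWrapper *w)
{
    assert(w != NULL);
    w->u.next = deferred;
    deferred = w;
}

/* Free everything on the list; returns how many wrappers were freed. */
static size_t drain_deferred(void)
{
    size_t freed = 0;
    while (deferred != NULL) {
        ToyWrapper *w = deferred;
        deferred = w->u.next;   /* read the link before freeing */
        free(w);
        freed++;
    }
    return freed;
}
```

Without the push in the failure branch, nothing would hold a pointer to the wrapper after the finalizer returns, so nothing could free it later.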
Comment 16 jhp (no longer active) 2006-09-18 09:11:04 PDT
Comment on attachment 238446
Same patch without printf statements

Looks good.  I can't check in and keep track of tboxen today, so if someone else could check this in, that would be great.  Otherwise, I'll check it in tomorrow.
Comment 17 Steven Michaud [:smichaud] (Retired) 2006-09-18 09:11:45 PDT
> and global ref

Oops, this doesn't need to be freed in this case (since it doesn't
exist).  And I suppose it _might_ be the case that, in the absence of
a java_obj you'll never have a ClassDescriptor or "java_wrapper" that
need to be freed.  But I was just being very conservative.
Comment 18 timeless 2006-09-18 11:56:55 PDT
i don't understand why this didn't block the release.
Comment 19 Brendan Eich [:brendan] 2006-09-18 12:43:31 PDT
(In reply to comment #18)
> i don't understand why this didn't block the release.

I think you should know why.  You've been around long enough.  The security bug policy requires us to keep bugs security-sensitive until patches are well-proven and distributed downstream.  That's not obviously the case here.  I say this having sr+'ed the patch; my point is that the hour is late and liveconnect is unowned and undertested.  The testimony of the high-quality submitter (smichaud) and code review by people associated with js/src (me) only go so far in the end-game of a release.

What's more, we will always have bugs such as this coming in as we try to freeze and finalize a release candidate.  If we do as you seem to suggest, we will never ship any software -- and that's completely counterproductive to improving the real world security of the products that people actually use.

With the patch update system, we can guarantee to fix this bug for everyone who downloads Firefox 2 in six to eight weeks, at the first regularly scheduled dot release.  That's what we should do too, in order to get all the other security fixes that have been in 1.8.1/Firefox 2, baking for weeks or months, out into users' hands.

So what exactly did you not understand, really?

/be
Comment 20 Brendan Eich [:brendan] 2006-09-18 12:56:37 PDT
(In reply to comment #17)
> > and global ref
> 
> Oops, this doesn't need to be freed in this case (since it doesn't
> exist).  And I suppose it _might_ be the case that, in the absence of
> a java_obj you'll never have a ClassDescriptor or "java_wrapper" that
> need to be freed.  But I was just being very conservative.

Can you prove one way or the other whether this code is needed?  If not, can you prove it doesn't hurt -- that it's conservative in all cases?  "Prove" meaning make a convincing argument, nothing fancy :-).

I'll try to find time to read the relevant code, but I thought I would throw this out now, since at least you and Javier may have more time to look.  Thanks,

/be
Comment 21 Steven Michaud [:smichaud] (Retired) 2006-09-18 13:22:52 PDT
(In reply to comment #20)

I'll assume that you're talking about the patch as a whole, and not
just part of it (e.g. not just the jsj_EnterJava failure case).

I think I've already proven that you can't put the call to
remove_java_obj_reflection_from_hashtable() inside a
jsj_EnterJava()/jsj_ExitJava() block.  In other respects I tried to
leave the code as close as possible to what it was originally.

Aside from having too inclusive a jsj_EnterJava()/jsj_ExitJava()
block, I think the original code already behaved quite safely:  All
possibilities were covered with regard to java_obj (whether or not it
was NULL) and jsj_EnterJava() (whether or not it succeeded).  I don't
know for sure whether it's necessary to postpone deleting the
java_obj's global ref -- but it's not unreasonable, and doesn't cost
much.  And in any case that's how the code was written before I got to
it.  I've added the possibility of deferring part of the finalization
of a java_wrapper object even when no java_obj exists -- but again I
don't think this is unreasonable or costs too much (especially since
the deferral mechanism already existed, and was already being used).

I think this takes care of the "conservative" side -- I think I've
been as conservative as possible.

As to whether the code might be (and might have been) doing more than
is strictly necessary, I don't yet know.  (I'll look into it over the
next few days.)  But it's always better to err on the side of caution.
Comment 22 Steven Michaud [:smichaud] (Retired) 2006-09-18 15:26:10 PDT
Let me say that I think Mike has the best strategy (from comment 12
and comment 13):  If no-one can find something specifically wrong with
the patch (or can come up with a better alternative), please land
it on the trunk as soon as possible, so it can "bake" there.

I don't know the JavaScript code very well (either the Liveconnect
part or the rest of it).  And it doesn't seem like _anyone_ currently
working on Mozilla.org browsers knows the Liveconnect part very well.

But I've found a serious problem, plus a very simple and
straightforward solution to it.

So clearly you guys need to do _something_, and your best bet is
probably my patch (or some variation on it).  But it should also be in
the hands of testers for a while before it gets into a release
version.
Comment 23 Steven Michaud [:smichaud] (Retired) 2006-09-19 08:50:58 PDT
> And I suppose it _might_ be the case that, in the absence of a
> java_obj you'll never have a ClassDescriptor or "java_wrapper" that
> need to be freed.

I've found out that you _can_ have a "java_wrapper" and
ClassDescriptor without a java_obj.  So my patch (and the previous
code) was right to assume this might be possible.

Starting at line 245 in jsj_JavaObject.c (in jsj_WrapJavaObject()) you
have the following three lines:

    java_obj = (*jEnv)->NewGlobalRef(jEnv, java_obj);
    java_wrapper->java_obj = java_obj;
    if (!java_obj)
        goto out_of_memory;

So if (*jEnv)->NewGlobalRef() fails, goto out_of_memory.  At
out_of_memory you have:

  out_of_memory:
    /* No need to free js_wrapper_obj, as it will be finalized by GC. */
    JS_ReportOutOfMemory(cx);
    return NULL;

This code assumes the garbage collector will free js_wrapper_obj --
i.e. that JavaObject_finalize() will be called on it.
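
The half-built case can be modeled as follows. This is a sketch under stated assumptions: ToyWrapper, toy_wrap, toy_finalize and live_wrappers are hypothetical names, a plain heap allocation stands in for the global ref, and toy_wrap returns the wrapper directly so the example can inspect what the garbage collector would later see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a wrapper that may lack its Java object. */
typedef struct ToyWrapper {
    int *java_obj;               /* NULL if the "global ref" was never made */
    bool has_class_descriptor;
} ToyWrapper;

/* Counts wrappers that have been created but not yet finalized. */
static int live_wrappers = 0;

/* Builds a wrapper; when global_ref_ok is false the result is left
 * half-built, with a class descriptor but no java_obj. */
static ToyWrapper *toy_wrap(bool global_ref_ok)
{
    ToyWrapper *w = calloc(1, sizeof *w);
    if (w == NULL)
        return NULL;
    live_wrappers++;
    w->has_class_descriptor = true;
    if (global_ref_ok)
        w->java_obj = malloc(sizeof *w->java_obj);
    return w;
}

/* A finalizer that tolerates a missing java_obj and still releases
 * the class descriptor and the wrapper itself. */
static void toy_finalize(ToyWrapper *w)
{
    if (w == NULL)
        return;
    if (w->java_obj != NULL)
        free(w->java_obj);
    w->has_class_descriptor = false;
    free(w);
    live_wrappers--;
}
```

In this model the cleanup of the class descriptor and of the wrapper does not depend on java_obj being present, which matches the conservative assumption made in the patch.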
Comment 24 Steven Michaud [:smichaud] (Retired) 2006-09-19 09:00:11 PDT
> Starting at line 245 in jsj_JavaObject.c (in jsj_WrapJavaObject())
> you have the following three lines:

Should be line 238 (I was looking at my patched copy of this file).
Comment 25 jhp (no longer active) 2006-09-19 11:27:05 PDT
I'll take care of checking this into the trunk shortly.
Comment 26 jhp (no longer active) 2006-09-19 11:57:06 PDT
Checked in to trunk. ->FIXED
Comment 27 Daniel Veditz [:dveditz] 2006-09-19 16:02:35 PDT
Restoring lost blocking flag
Comment 28 Bob Clary [:bc:] 2006-09-20 10:17:31 PDT
test added to an internal test suite.
Comment 29 Bob Clary [:bc:] 2006-09-29 13:17:09 PDT
verified fixed 1.9 20060929 windows/linux
Comment 30 Daniel Veditz [:dveditz] 2006-11-03 11:29:17 PST
Any volunteers for checking into the branches?
Comment 31 Mike Connor [:mconnor] 2006-11-07 06:43:55 PST
Comment on attachment 238446
Same patch without printf statements

a=mconnor on behalf of drivers for 1.8.0.9 and 1.8.1.1 checkin
Comment 32 Brendan Eich [:brendan] 2006-11-21 13:15:40 PST
Could use help with branch landings.

/be
Comment 33 :Gavin Sharp [email: gavin@gavinsharp.com] 2006-11-21 14:17:49 PST
mozilla/js/src/liveconnect/jsj_JavaObject.c 	1.40.20.1
mozilla/js/src/liveconnect/jsj_JavaObject.c 	1.40.12.1
Comment 34 Bob Clary [:bc:] 2006-11-30 08:31:47 PST
no crash in 20061130 1.8.0.9, 1.8.1.1, 1.9 nightly or debug builds on winxp but with a debug 1.8.0.9 build I see the following on shutdown:

JS engine warning: 257 GC roots remain after destroying the JSRuntime.
                   These roots may point to freed memory. Objects reachable
                   through them have not been finalized.
Comment 35 Steven Michaud [:smichaud] (Retired) 2006-11-30 09:04:44 PST
I don't think this patch can be the cause of your problem -- the patch
never prevents a "JavaObject" from being garbage-collected.

In any case, does backing out the patch make any difference to your
results?
Comment 36 Steven Michaud [:smichaud] (Retired) 2006-11-30 09:13:51 PST
I should have been clearer:  This patch never prevents a "JavaObject"
from being garbage-collected, and (as far as I can tell) never even
postpones its garbage-collection.  At most it may postpone freeing the
"java_wrapper" and its "class_descriptor".  But the "GC root" has
already been freed.
Comment 37 Bob Clary [:bc:] 2006-11-30 09:57:01 PST
(In reply to comment #35)

> In any case, does backing out the patch make any difference to your
> results?

I assume it will crash without the patch.
Comment 38 Steven Michaud [:smichaud] (Retired) 2006-11-30 10:21:21 PST
And if you disable the test for this bug (which you mentioned in
comment #28) you don't see the "GC roots remain" problem?
Comment 39 Bob Clary [:bc:] 2006-11-30 10:40:26 PST
(In reply to comment #38)
> And if you disable the test for this bug (which you mentioned in
> comment #28) you don't see the "GC roots remain" problem?
> 

I'm not quite sure what you mean by disable the test, but if I just start and stop the browser, or load the java implementation test at java.com I don't see the GC roots issue.
Comment 40 Steven Michaud [:smichaud] (Retired) 2006-11-30 10:48:51 PST
So I guess the _only_ formal test you were running (with which you got
the "GC roots remain" problem) was the test for this bug (mentioned in
comment #28).  What happens if you run a bunch of other JavaScript
tests (not necessarily involving LiveConnect)?  I'll bet you still see
the "GC roots remain" problem.
Comment 41 Bob Clary [:bc:] 2006-11-30 14:41:11 PST
(In reply to comment #40)
> So I guess the _only_ formal test you were running (with which you got
> the "GC roots remain" problem) was the test for this bug (mentioned in
> comment #28).  

yes.

> What happens if you run a bunch of other JavaScript
> tests (not necessarily involving LiveConnect)?  I'll bet you still see
> the "GC roots remain" problem.
> 

Nope. I don't see that message on any of the non-liveconnect js tests from this morning's run on any of the branches|platforms.
Comment 42 Steven Michaud [:smichaud] (Retired) 2006-11-30 15:20:16 PST
I still don't think there's any way my patch can be causing the "GC
roots remain" problem.  I believe you've found some other bug that's
also triggered by the test(s) for this bug.

But, since backing out my patch will just bring back the old crashes,
my hypothesis is difficult to test.

Please point me to the test(s) you've been running.  I'm not familiar
with Mozilla.org's testing infrastructure.  But if I can look at your
test script(s), and (ideally) also play with them, I may be able to
find a way to trigger the "GC roots remain" bug without triggering the
crashes for which I originally opened this bug ... which (of course)
would make it possible to test what happens when you back out my
patch.
Comment 43 Bob Clary [:bc:] 2006-11-30 15:29:18 PST
I'm just using attachment 237646.
Comment 44 Steven Michaud [:smichaud] (Retired) 2006-11-30 15:46:43 PST
> I'm just using attachment 237646 [details].

Oh.  And here I thought you'd been using all the fancy testing
infrastructure that I just heard about at the Firefox Summit :-)

I have a bad feeling about this -- I suspect I'll only be able to
prove my point by finding the other bug(s) and fixing it (or them).
But teasing out my fix for _this_ bug took a solid two weeks, and it's
going to be a while before I once more have that kind of time to
spend.

In the meantime, does the "GC roots remain" problem happen only on the
1.8.0.9 branch?  (And is this branch what will become Firefox
1.5.0.9?)
Comment 45 Bob Clary [:bc:] 2006-11-30 19:41:27 PST
(In reply to comment #44)
> 
> In the meantime, does the "GC roots remain" problem happen only on the
> 1.8.0.9 branch?  (And is this branch what will become Firefox
> 1.5.0.9?)
> 

yes (windows and linux) and yes. Do you have a 1.8.0.9 build without this patch handy? Maybe the GC root thing will show up if you don't do the multiple refresh? If you don't have a build, let me know and I'll make one.

verified fixed 20061130 1.8.1.1 windows/linux

note I get ASSERTION: QueryInterface needed: 'query_result.get() == mRawPtr', file ../../dist/include/xpcom/nsCOMPtr.h, line 594 on the trunk on linux but not Windows. Not sure if it is related though.
Comment 46 Steven Michaud [:smichaud] (Retired) 2006-12-01 08:05:35 PST
> yes (windows and linux)

So you don't see the problem on OS X, even with the 1.8.0.9 builds?
Or is it just that you haven't tested on OS X?

> Do you have a 1.8.0.9 build without this patch handy?

I don't (though I could make one sometime this weekend).  Please make
one for me, and let me know where I can download it.  That way you can
test it, too :-)

> ASSERTION: QueryInterface needed: 'query_result.get() == mRawPtr'

I doubt this is related.  But try breaking on Assert_NoQueryNeeded()
(while running in gdb), and get a stack trace.
Comment 47 Bob Clary [:bc:] 2006-12-03 15:25:13 PST
without this patch I still see the GC root problem in 1.8.0.9 with this testcase. I'm looking for the cause, but it is not related to this patch.

verified fixed 1.8.0.9
Comment 48 Daniel Veditz [:dveditz] 2007-03-20 17:00:50 PDT
Bob: do you still see a GC root problem with the testcase? If so please spin off another bug to cover it.
Comment 49 Bob Clary [:bc:] 2007-03-20 17:27:02 PDT
(In reply to comment #48)
> Bob: do you still see a GC root problem with the testcase? If so please spin
> off another bug to cover it.
> 

filed Bug 374680 on a thebes crash in trunk and Bug 374681 for the GC roots on 1.8.0.

