Last Comment Bug 514554 - (CVE-2009-3371) Crash with recursive web-worker calls [@XPCNativeSet::Mark]
(CVE-2009-3371)
: Crash with recursive web-worker calls [@XPCNativeSet::Mark]
Status: RESOLVED FIXED
[sg:critical?]
: crash, fixed1.9.0.15, testcase, verified1.9.1
Product: Core
Classification: Components
Component: XPConnect (show other bugs)
: Trunk
: All Windows XP
: P2 normal (vote)
: mozilla1.9.2
Assigned To: Blake Kaplan (:mrbkap)
:
: Andrew Overholt [:overholt]
Mentors:
jar:https://bug514554.bugzilla.mozill...
Depends on:
Blocks: 522839
  Show dependency treegraph
 
Reported: 2009-09-03 15:35 PDT by Daniel Veditz [:dveditz]
Modified: 2011-06-13 10:01 PDT (History)
12 users (show)
jst: blocking1.9.2+
dveditz: blocking1.9.0.15+
dveditz: wanted1.9.0.x+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
beta1-fixed
.4+
.4-fixed


Attachments
testcase (open either of the two index files) (1.29 KB, application/java-archive)
2009-09-03 15:35 PDT, Daniel Veditz [:dveditz]
no flags Details
Proposed fix (958 bytes, patch)
2009-09-08 14:41 PDT, Blake Kaplan (:mrbkap)
peterv: review+
Details | Diff | Splinter Review
With that (897 bytes, patch)
2009-09-09 14:08 PDT, Blake Kaplan (:mrbkap)
mrbkap: review+
jst: superreview+
dveditz: approval1.9.1.4+
dveditz: approval1.9.0.15+
Details | Diff | Splinter Review
crash report dmp file for 1.9.1.4 crash (68.76 KB, application/octet-stream)
2009-10-16 17:49 PDT, Al Billings [:abillings]
no flags Details
crash report extra file for 1.9.1.4 crash (399 bytes, application/octet-stream)
2009-10-16 17:49 PDT, Al Billings [:abillings]
no flags Details

Description Daniel Veditz [:dveditz] 2009-09-03 15:35:41 PDT
Created attachment 398496 [details]
testcase (open either of the two index files)

Orlando Berrera from sectheory.com sent me this testcase, a variant (likely the original) was posted by Malloc(i) at http://sla.ckers.org/forum/read.php?14,30128 so this is a public crash.

Orlando and Malloci report this as a DoS, a near-null read access violation. When I tried it I saw a few scarier and suspicious things, a near-null write access violation due to an integer overflow (reading from 8 + register containing 0xffffffff), crashes in GC, and crashes in the cycle collector.

I'm not at all sure web workers are a required part of this testcase. Would we get the same results appending to the document in a simple setInterval() ? This testcase seems to crash faster than I'd expect that approach to, but maybe that's just because the threads can generate events while layout is reflowing instead of having to wait for it.
Comment 1 Daniel Veditz [:dveditz] 2009-09-03 15:39:27 PDT
You should be able to run the testcase using a jar: url like
  jar:https://bug514554.bugzilla.mozilla.org/attachment.cgi?id=398496!/index.html
  (also index2.html)

if that doesn't work you can download and run it locally.
Comment 2 Samuel Sidler (old account; do not CC) 2009-09-04 01:55:26 PDT
I thought web workers were a feature of 1.9.1?
Comment 3 Daniel Veditz [:dveditz] 2009-09-04 10:06:29 PDT
They are, I nominated it for 1.9.0 because it might not be the workers themselves that are the problem. But we can wait until we identify a fix and decide then.
Comment 4 Samuel Sidler (old account; do not CC) 2009-09-04 10:09:22 PDT
Ben: Can you put this on the top of your list to investigate?
Comment 5 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-09-04 17:04:27 PDT
I could only get this to crash on windows, here:

http://crash-stats.mozilla.com/report/index/d45fc94d-dabf-4019-ac6d-a728d2090904

Other than that I saw a few OOM-ish crashes (bad_alloc exceptions in gfx fonts).
Comment 6 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-09-04 17:08:41 PDT
(In reply to comment #5)
And the caller of XPCNativeSet::Mark is XPCJSRuntime::GCCallback
Comment 7 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-09-04 17:10:59 PDT
I'll try a debug build over the weekend, but for now moving this over to XPConnect since it looks to be the culprit.
Comment 8 Blake Kaplan (:mrbkap) 2009-09-08 14:41:57 PDT
Created attachment 399320 [details] [diff] [review]
Proposed fix

The code in XPC_WN_CallMethod does:

    XPCCallContext ccx(JS_CALLER, cx, obj, funobj, 0, argc, argv, vp);
...
    if(!XPCNativeMember::GetCallInfo(ccx, funobj, &iface, &member))
        return Throw(NS_ERROR_XPC_CANT_GET_METHOD_INFO, cx);
    ccx.SetCallInfo(iface, member, JS_FALSE);

Because XPCNativeMember::GetCallInfo must enter a request, it is a potential GC point. XPCCallContext::Init has the following code:

    if(name)
        SetName(name);

    if(argc != NO_ARGS)
        SetArgsAndResultPtr(argc, argv, rval);

Note that SetArgsAndResultPtr sets the call context's state to HAVE_ARGS, which is > HAVE_NAME. So here we end up with a call context which is HAVE_NAME but has not initialized mSet. XPC_WN_CallMethod works because it then calls XPCCallContext::SetCallInfo, which initializes mSet (amongst other members).

So, if GC happens between the construction and the call to SetCallInfo, then we could attempt to mark a garbage set and crash. This patch catches this case and initializes the offending members to avoid such crashes.
Comment 9 Blake Kaplan (:mrbkap) 2009-09-08 14:43:18 PDT
Comment on attachment 399320 [details] [diff] [review]
Proposed fix

>+            mStaticMemberIsLocal = nsnull;

Please pretend this says 'JS_FALSE' instead of nsnull.
Comment 10 Peter Van der Beken [:peterv] 2009-09-09 03:16:20 PDT
Comment on attachment 399320 [details] [diff] [review]
Proposed fix

Would it make sense to do it in SetArgsAndResultPtr instead (if mState < HAVE_NAME)? That would also avoid the problem for callers who don't pass argc, etc to the constructor and then call SetArgsAndResultPtr without calling SetName. Not sure if we care about those.
This looks like an old bug, probably need to take this on branches.
Comment 11 Blake Kaplan (:mrbkap) 2009-09-09 14:08:13 PDT
Created attachment 399572 [details] [diff] [review]
With that
Comment 12 Daniel Veditz [:dveditz] 2009-09-10 10:54:29 PDT
mrbkap: does this also affect xpconnect in 1.9.0? Might be harder to get the timing right without workers, but it's a small fix and might be good to take even if we're not sure it can be triggered.
Comment 13 Blake Kaplan (:mrbkap) 2009-09-10 11:02:55 PDT
(In reply to comment #12)
> mrbkap: does this also affect xpconnect in 1.9.0? Might be harder to get the
> timing right without workers, but it's a small fix and might be good to take
> even if we're not sure it can be triggered.

The fix applied to 1.9.0 (with a little massaging). In order to hit this bug though, you have to run JS on not the main thread, which I don't think we do on the 1.9.0 branch. I'll attach a backported patch in a little bit.
Comment 14 Blake Kaplan (:mrbkap) 2009-09-10 14:13:22 PDT
http://hg.mozilla.org/mozilla-central/rev/709036ca5768
Comment 15 Blake Kaplan (:mrbkap) 2009-09-10 14:32:14 PDT
Comment on attachment 399572 [details] [diff] [review]
With that

This actually applies on 1.9.0.
Comment 16 Samuel Sidler (old account; do not CC) 2009-09-11 02:07:19 PDT
Comment on attachment 399572 [details] [diff] [review]
With that

This doesn't need approval on 1.9.2 since it's blocking.
Comment 17 Daniel Veditz [:dveditz] 2009-09-14 15:09:52 PDT
Comment on attachment 399572 [details] [diff] [review]
With that

Approved for 1.9.1.4 and 1.9.0.15, a=dveditz
Comment 18 Blake Kaplan (:mrbkap) 2009-09-15 17:00:15 PDT
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/61015c1c2156
Comment 19 Blake Kaplan (:mrbkap) 2009-09-16 11:34:13 PDT
Checking in js/src/xpconnect/src/xpccallcontext.cpp;
/cvsroot/mozilla/js/src/xpconnect/src/xpccallcontext.cpp,v  <--  xpccallcontext.cpp
new revision: 1.27; previous revision: 1.26
done
Comment 20 Al Billings [:abillings] 2009-09-17 11:05:59 PDT
(In reply to comment #13)
> The fix applied to 1.9.0 (with a little massaging). In order to hit this bug
> though, you have to run JS on not the main thread, which I don't think we do on
> the 1.9.0 branch. I'll attach a backported patch in a little bit.

 The attached PoC doesn't crash 1.9.0.14, either run via Dan's jar: url or running the files locally.
Comment 21 Blake Kaplan (:mrbkap) 2009-09-17 12:27:38 PDT
(In reply to comment #20)
>  The attached PoC doesn't crash 1.9.0.14, either run via Dan's jar: url or
> running the files locally.

I don't know if it's possible to reproduce this bug on 1.9.0. It requires multiple threads running JS which don't exist without web workers in Firefox proper.
Comment 22 Al Billings [:abillings] 2009-10-16 17:28:25 PDT
Using the attached testcase, I peg my CPU at above 95% with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4) Gecko/20091016 Firefox/3.5.4 (.NET CLR 3.5.30729). I assume that the DoS is expected while it runs the script.

I had it crash randomly once with 1.9.1.4 but I haven't been able to do it again. I'm not 100% convinced that this is fixed.
Comment 23 Al Billings [:abillings] 2009-10-16 17:32:51 PDT
If I run the index.html file in Dan's attached testcase and then shut down Firefox after a couple of minutes, it sometimes crashes.

It doesn't crash immediately like it does in 1.9.1.3.
Comment 24 Blake Kaplan (:mrbkap) 2009-10-16 17:39:54 PDT
Al, What stacks are you crashing with?
Comment 25 Al Billings [:abillings] 2009-10-16 17:43:36 PDT
It didn't submit them for some reason and they aren't in the local directory. I'm trying again.
Comment 26 Al Billings [:abillings] 2009-10-16 17:49:32 PDT
Created attachment 406824 [details]
crash report dmp file for 1.9.1.4 crash
Comment 27 Al Billings [:abillings] 2009-10-16 17:49:53 PDT
Created attachment 406825 [details]
crash report extra file for 1.9.1.4 crash
Comment 28 Al Billings [:abillings] 2009-10-16 17:50:25 PDT
My upload of the crash is being throttled for some reason.
Comment 29 Al Billings [:abillings] 2009-10-16 17:53:55 PDT
There we go.

I can do it about half or a third of the time by letting it run with index.html and then just clicking the red x to close Firefox.

http://crash-stats.mozilla.com/report/index/62c11801-4dbe-403b-811f-3d2ad2091016?p=1
Comment 30 Blake Kaplan (:mrbkap) 2009-10-16 17:59:34 PDT
(In reply to comment #29)
> http://crash-stats.mozilla.com/report/index/62c11801-4dbe-403b-811f-3d2ad2091016?p=1

This shows a null pointer deref in the web worker code. I think it should get its own bug.
Comment 31 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-10-17 01:23:06 PDT
(In reply to comment #30)
> This shows a null pointer deref in the web worker code. I think it should get
> its own bug.

Except for the fact that the stack is totally wacky... Looks more like some kind of stack corruption.
Comment 32 Blake Kaplan (:mrbkap) 2009-10-17 01:30:58 PDT
(In reply to comment #31)
> Except for the fact that the stack is totally wacky... Looks more like some
> kind of stack corruption.

Heh, ignoring the XPC_WN_NoMods_Proto_Resolve, the stack actually looks fairly believable. Anyway, let's take this to bug 522839.
Comment 33 Al Billings [:abillings] 2009-10-19 10:35:21 PDT
So do you think it is safe to mark this as verified for 1.9.1 then, Blake? The crash is really a separate issue?
Comment 34 Blake Kaplan (:mrbkap) 2009-10-19 11:50:25 PDT
I think so, yes.
Comment 35 Orlando Barrera II 2009-10-21 11:12:36 PDT
So you guys found that the bug was causing stack corruption?  Could I get access to the new bug filed https://bugzilla.mozilla.org/show_bug.cgi?id=522839
Comment 36 juan becerra [:juanb] 2009-10-29 21:57:58 PDT
When I load the test case in 3.5.3 I crash immediately and the stacks are consistent for me: http://crash-stats.mozilla.com/report/index/27587aae-b48b-4494-84e5-e5d712091029

However, when I load the test case in 3.5.4 the application does not crash right away, but it becomes non-responsive and after a couple of minutes it crashes:
http://crash-stats.mozilla.com/report/index/32b482a9-6ae0-4ee5-9bc0-f76412091029

These two look different, but unfortunately the latter one does not look like the stack signature in bug 522839, which would make the results of this investigation a little tidier.
Comment 37 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-10-29 22:32:31 PDT
bp-32b482a9-6ae0-4ee5-9bc0-f76412091029 is just a fancy "out of memory" crash. If you've made it that far then you've avoided this bug.
Comment 38 Johnny Stenback (:jst, jst@mozilla.com) 2009-11-03 09:16:35 PST
Ben, Blake, can this land for 1.9.2? Or do we need to dig in deeper here before we land this? Seems like it should be landable given that we shipped 1.9.1.4 with this...
Comment 39 Blake Kaplan (:mrbkap) 2009-11-04 05:58:16 PST
Apparently I landed this in time to make beta 1: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/8319e4e20c0e

Note You need to log in before you can comment on or make changes to this bug.