Closed Bug 344759 Opened 18 years ago Closed 18 years ago

EIP hijack in FF 1.5.0.5 -- crash [@ 0x20202113] called by JS_SetPrivate

Categories

(Core :: Security, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.8.1beta2

People

(Reporter: darin.moz, Assigned: mrbkap)

References

Details

(Keywords: crash, fixed1.8.1, verified1.8.0.5, Whiteboard: [sg:critical?])

Crash Data

Attachments

(6 files, 3 obsolete files)

From H D Moore <hdm@metasploit.com>:
----------------------------------------------------------------------------

This bug is really annoying to track down, it seems like changing anything
in the script prevents the EIP hijack from working. Hope it works for
you, I can reliably reproduce here (Windows XP SP2 - FF 1.5.0.4).

-HD

(a18.470): Access violation - code c0000005 (!!! second chance !!!)
eax=20202020 ebx=00000000 ecx=20202020 edx=022e5308 esi=022e5310
edi=022e5308
eip=20202113 esp=0012d7c4 ebp=0012d7e8 iopl=0         nv up ei pl nz na pe
nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000
efl=00000202
*** ERROR: Module load completed but symbols could not be loaded for C:
\WINDOWS\system32\xpsp2res.dll
xpsp2res+0x202113:
20202113 008000000606     add     [eax+0x6060000],al
ds:0023:26262020=??
0:000> k
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be
wrong.
0012d7e8 60082df1 xpsp2res+0x202113
0012d834 6009c92d js3250!JS_SetPrivate+0x47
0012d864 60082a79 js3250!js_GetSrcNoteOffset+0x400f
0012d8a4 6009c148 js3250!JS_InitClass+0xa3
0012d8e0 60081fd4 js3250!js_GetSrcNoteOffset+0x382a
0012d90c 60082321 js3250!JS_SetGlobalObject+0xc4
*** WARNING: Unable to verify checksum for C:\Program Files\Mozilla
Firefox\firefox.exe
*** ERROR: Symbol file could not be found.  Defaulted to export symbols
for C:\Program Files\Mozilla Firefox\firefox.exe -
0012d928 0041021f js3250!JS_ResolveStandardClass+0x17c
0012d940 600addd2 firefox!NS_RegistryGetFactory+0x5ee0
0012d980 600ada90 js3250!js_LookupProperty+0x35b
0012d9a0 600ae29d js3250!js_LookupProperty+0x19
0012d9e0 004162e7 js3250!js_FindProperty+0x31b
0012da20 00416243 firefox!NS_RegistryGetFactory+0xbfa8
0012da4c 0040ce48 firefox!NS_RegistryGetFactory+0xbf04
0012dae0 0040cfb3 firefox!NS_RegistryGetFactory+0x2b09
0012db98 005364c5 firefox!NS_RegistryGetFactory+0x2c74
0012dbd8 00536e34 firefox!nsPrintOptions::GetNewPrintSettings+0x9baa
0012dbf4 005f298c firefox!nsPrintOptions::GetNewPrintSettings+0xa519
0012dc24 00732737 firefox!DeviceContextImpl::GetCanonicalPixelScale+0x9c53
0012dc4c 007247a5 firefox!nsPrintSettings::AddRef+0xb752
*** ERROR: Symbol file could not be found.  Defaulted to export symbols
for C:\Program Files\Mozilla Firefox\xpcom_core.dll -
0012dc90 60311954 firefox!nsPrintSettings::GetPrintOptionsBits+0x14d0

----------------------------------------------------------------------------

See attached testcase.

I was able to reproduce this exploit using FF 1.5.0.4 and the pre-release version of FF 1.5.0.5 from today (July 14).
Attached file testcase
When successful, this webpage causes a find dialog to appear.  In other cases, it has caused by browser to crash or hang.
this got fixed on trunk after  2006-01-11
Ok, I crash with a 2006-03-23 build, but don't crash with a 2006-03-24 trunk build. I tested this multiple times and on 2 different machines.
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2006-03-23+05&maxdate=2006-03-24+06&cvsroot=%2Fcvsroot

Note, I had to disable Java, because otherwise it would crash on the Java part in trunk (I think that's a known bug on trunk, at least there are some known Java crashers on trunk).
I managed to crash once with Firefox1.5RC3 (talkback ID: TB20767470W), but not anymore, afterwards.
Attached file testcase2 (obsolete) —
I think I may have reduced it to this. However, I'm not sure (since I can't really reproduce the crash with the original testcase in FF1.5.0.5RC3).
But I am sure, that the first button is crashing also Firefox1.5.0.5RC3.
The second button is only crashing current trunk builds.
A typical talkback I get from testcase2 in Firefox1.5 is TB20978230Z:
kernel32.dll + 0x1eb33 (0x7c81eb33)
_CxxThrowException
_com_error::_com_error
_com_issue_error
CommonConstructor  [c:/builds/tinderbox/Fx-Mozilla1.8.0-Release/WINNT_5.2_Depend/mozilla/js/src/xpconnect/src/XPCIDispatchExtension.cpp, line 90]
ActiveXConstructor 
etc...

That's quite different from the original testcase, so that's why I'm not sure it's the same bug.
afaik we don't normally build w/ activex support. the build i have doesn't seem to:

 --enable-application=browser --enable-update-channel=release --enable-optimize --disable-debug --disable-tests --enable-static --disable-shared --enable-official-branding --enable-svg --enable-canvas --enable-update-packaging
CCing the reporter, H D Moore.
Attached file testcase 2, trunk-only version (obsolete) —
This version doesn't hang Firefox and Mac OS X for 30 seconds while it loads ;)  I haven't actually tested it, because I don't have ActiveX enabled.
Filed bug 344804 with a reduced testcase for a Java crash this turns up.  I don't know if it's the same Java crash Martijn mentioned in comment 3.
I can't reproduce the crash in comment 0.  With Firefox 1.5.0.4 on both Windows XP and Mac, as well as with yesterday's 1.5.0.x nightly on Mac, I get a few alert-like things and a Find dialog, and my browser window is resized, but Firefox does not crash.
Crashing [@ 0x20202113] in the JavaScript engine probably means this is a GC hazard bug.  GC hazard bugs tend to be hard to reduce :(  I think the fastest way to get this fixed on 1.5.0.x is to determine what trunk patch fixed it.

Martijn, is the "fixed on trunk" period you gave in comment 3 for the GC-hazard-looking crash in comment 0, or is it for the ActiveX crash in comment 5?  I'm hoping it's the latter, since there weren't any changes to the JavaScript engine or XPConnect on that day that look like GC hazard fixes.
Keywords: crash
Summary: EIP hijack in FF 1.5.0.5 → EIP hijack in FF 1.5.0.5 -- crash [@ 0x20202113] called by JS_SetPrivate
Whiteboard: [sg:critical?]
Version: Trunk → 1.8 Branch
Flags: blocking1.9a1+
Flags: blocking1.8.1?
Flags: blocking1.8.0.5+
I couldn't reproduce the original testcase either until I created a new profile, then *boom*. So if you're having trouble reproducing, try that. There's probably some common power-user pref setting a lot of us have that's interfering.

I died in JS_SetPrivate a lot (different call stacks, though): TB20989667, TB20989639, TB20989449

Once under ActiveXConstructor (TB20987029), similar to Martijn's comment 6.

Have not been able to reproduce with a debug build. I get a single "NATIVE:window" line added each time I press the button, and on the console "An error occurred updating the cmd_copy command". Otherwise nothing.
(In reply to comment #12)
> Martijn, is the "fixed on trunk" period you gave in comment 3 for the
> GC-hazard-looking crash in comment 0, or is it for the ActiveX crash in comment
> 5?
It's not for the ActiveX crash, I just tested that. Not sure what it is.
A typical talkback ID, I get is: TB20990092X
js3250.dll + 0x44f0e (0x60104f0e)
firefox.exe + 0x48f9c4 (0x0088f9c4)
xpcom_core.dll + 0x354e9 (0x604954e9)
firefox.exe + 0x1de24 (0x0041de24)

> I'm hoping it's the latter, since there weren't any changes to the
> JavaScript engine or XPConnect on that day that look like GC hazard fixes.

The bonsai link doesn't go back far enough, just misses Igor's 3:50 check-in for bug 330692 ("GC_MARK_DEBUG: using GC_MARK, not JS_MarkGCThing internally"). Let me see if applying that does the trick (may need to drag bug 324278 along, too).
Igor's patches don't apply cleanly to my tree (not unexpected). I'm too sleepy to fix it up tonight and travelling most of tomorrow, can anyone else look at it? If not I'll try again Monday.
Blake - can you take a look?
Ok, the "fixed on trunk" period I gave in comment 3 appears to come from the fix from bug 330900. I've checked that by building a 2006-03-23 build and manually applying the patch from that bug.
That means that the regression range in comment 3 is not relevant for this bug :(
Flags: blocking1.8.1? → blocking1.8.1+
Blocks: 344950
Do we have a fixed-on-trunk window for testcase 2?
Flags: blocking1.8.1+ → blocking1.8.1?
Comment on attachment 229352 [details]
testcase2

(In reply to comment #19)
> Do we have a fixed-on-trunk window for testcase 2?

Sorry, testcase 2 is most likely not related to this bug (I thought it was, first).
Testcase2 is not fixed on trunk, I'll file a new bug on it.
Attachment #229352 - Attachment is obsolete: true
Attachment #229367 - Attachment is obsolete: true
Sorry, I somehow reset the blocking1.8.1+ back to blocking1.8.1?
I filed bug 344957 and bug 344960 for testcase2, which isn't related to this bug.
Flags: blocking1.8.1? → blocking1.8.1+
(In reply to comment #18)
> Ok, the "fixed on trunk" period I gave in comment 3 appears to come from the
> fix from bug 330900. I've checked that by building a 2006-03-23 build and
> manually applying the patch from that bug.
> That means that the regression range in comment 3 is not relevant for this bug

Why do you say that? The stacks from bug 330900 were different from this bug, and in addition the current 1.8/1.8.0 branches also have the 330900 fix but still crash on the first testcase here. I've not been able to reproduce this crash with any debug build.

I confirm your regression range in comment 3 (2006-03-23-05 broken, 2006-03-24-05 fixed) but my money's still on Igor's GC patch.
(In reply to comment #13)
> Have not been able to reproduce with a debug build. I get a single

I saw the same problem.  mrbkap explained it to me:  the string representations have some more information in them, which includes the word "native".  So this change makes it run in a DEBUG build:

@@ -26,7 +26,7 @@

 function msftestIsNative(obj) {
        var txt = obj + '';
-       if (txt.indexOf('native') != -1) {
+       if (txt.indexOf('[native') != -1) {
                return true;
        }
        return false;
(In reply to comment #22)
> Why do you say that? The stacks from bug 330900 were different from this bug,
> and in addition the current 1.8/1.8.0 branches also have the 330900 fix but
> still crash on the first testcase here. I've not been able to reproduce this
> crash with any debug build.

I don't crash with 1.8/1.8.0 branches (only very sporadically). When I add  && e.indexOf('CRMFRequest') == -1 to the testcase on the relevant spots, then I don't crash anymore with a 2006-03-23 trunk build.
(In reply to comment #26)
> Created an attachment (id=229642) [edit]
> reduced testcase invoke window.find on buf

works in 1.8.0.5, 1.8, 1.9 winxp
(In reply to comment #13)
> I couldn't reproduce the original testcase either until I created a new
> profile, then *boom*. So if you're having trouble reproducing, try that.
> There's probably some common power-user pref setting a lot of us have that's
> interfering.
> 
> I died in JS_SetPrivate a lot (different call stacks, though): TB20989667,
> TB20989639, TB20989449

The stack traces shows 

JS_SetPrivate  [c:/builds/tinderbox/Fx-Mozilla1.8.0-Release/WINNT_5.2_Depend/mozilla/js/src/jsapi.c, line 2164]
js_DefineFunction  [c:/builds/tinderbox/Fx-Mozilla1.8.0-Release/WINNT_5.2_Depend/mozilla/js/src/jsfun.c, line 2125]
JS_InitClass  [c:/builds/tinderbox/Fx-Mozilla1.8.0-Release/WINNT_5.2_Depend/mozilla/js/src/jsapi.c, line 2045]

but js_DefineFunction does not call JS_SetPrivate directly, so the stack traces are not complete. Why it is so?
FWIW, I ran mrbkap's variant of the original testcase under WAY_TOO_MUCH_GC on my 1.8.0 branch Windows build, and it didn't crash.
> this change makes it run in a DEBUG build:

Thank you (and mrbkap) for that!

> > reduced testcase invoke window.find on buf
> works in 1.8.0.5, 1.8, 1.9 winxp

Define "works"? I don't crash using that testcase on either trunk or 1.5.0.4
Attached patch Potential fixSplinter Review
This is the only object that could possibly be the cause of the crash based on the stacks that we have seen. The basic principal here is that we cannot trust newborn roots!
Assignee: dveditz → mrbkap
Status: NEW → ASSIGNED
Attachment #229759 - Flags: review?(brendan)
Attachment #229765 - Flags: review?
Attachment #229765 - Flags: approval1.8.0.5?
This fixes the problem in my 1.8.0.5 builds (debug and optimized). Reproducing the original crash in the debug build was iffy, but optimized was a consistent crasher (despite the occassional need to create new profiles, or delete the prefs.js file) and it's fixed now.
Comment on attachment 229759 [details] [diff] [review]
Potential fix

The comment might say "any potential GC callback."

r=me, looks good.

/be
Attachment #229759 - Flags: review?(brendan) → review+
Comment on attachment 229765 [details] [diff] [review]
merged to 1.8 branches

r=me with same comment suggestion as for the trunk/1.8-branch.

/be
Attachment #229765 - Flags: review? → review+
Fix checked into trunk.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Comment on attachment 229765 [details] [diff] [review]
merged to 1.8 branches

approved for 1.8.0 branch, a=dveditz
Attachment #229765 - Flags: approval1.8.1?
Attachment #229765 - Flags: approval1.8.0.5?
Attachment #229765 - Flags: approval1.8.0.5+
Fix checked into 1.8.0 branch
Keywords: fixed1.8.0.5
(In reply to comment #31)
> The basic principal here is that we cannot trust newborn roots!

But that's a bug in itself, and a regression at some point that we did not catch.  We need to fix it, too, and centrally.  Short of fixing bug 313437, we should not allow JS_ClearNewbornRoots to do anything when called from the GC allocator.

Would you please file that bug?

/be
(In reply to comment #39)
> (In reply to comment #31)
> > The basic principal here is that we cannot trust newborn roots!
> 
> But that's a bug in itself, and a regression at some point that we did not
> catch.  We need to fix it, too, and centrally.  Short of fixing bug 313437, we
> should not allow JS_ClearNewbornRoots to do anything when called from the GC
> allocator.
> 
> Would you please file that bug?

This JS_ClearNewbornRoots hazard arose due to the null-script branch callback case, added for bug 203278.  But it exists also in theory due to the JSGC_END mode GC-callback, used by Venkman and other modules.

/be
Attachment #229765 - Flags: approval1.8.1? → approval1.8.1+
(In reply to comment #39)
> Would you please file that bug?

bug 345230 has been filed on this.
Blocks: sbb?
v.fixed on 1.8.0 branch with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.5) Gecko/20060719 Firefox/1.5.0.5, no crash with testcase.  I was able to reliably crash with rc3 with the following steps:

1. launch with profile manager flag (-P)
2. create new profile
3. go to this bug, login to bugzilla
4. open testcase and click button
5. boom!

I crashed about 5-6 times in a row with the previous build (rc3), but have not been able to reproduce with latest rc4 build after about 15 tries (same basic steps above with a few variations thrown in).
In Firefox 1.0.8 this particular testcase crashes on bug 330900 instead. Once that's fixed on that branch this should be retested. It may or may not crash due to the timing issues involved.
Flags: blocking1.7.14?
Flags: blocking-aviary1.0.9?
Not needed on the aviary or moz1.7 branches since bug 203278 was not fixed until after those branches.
Flags: blocking1.7.14?
Flags: blocking1.7.14-
Flags: blocking-aviary1.0.9?
Flags: blocking-aviary1.0.9-
Fix checked into the 1.8 branch.
Keywords: fixed1.8.1
OS: Windows XP → All
Priority: -- → P1
Hardware: PC → All
Target Milestone: --- → mozilla1.8.1beta2
Version: 1.8 Branch → Trunk
Unfortunately, this bug does exist for 1.7.x and 1.0.x. 

Dan, are you sure this was introduced with fix for bug #203278 ? I have not backported that, nevertheless *all* testcases do apply, so probably it was a different fix that triggered this. Any idea?
Did anyone verify this on Linux?  Our QA seems to suggest that this is still not fixed on Linux... Will try to get more details.
(To clarify, they say that 1.5.0.5 still exhibits this bug on Linux...)  Will update with details as I get them.
(In reply to comment #48)
> (To clarify, they say that 1.5.0.5 still exhibits this bug on Linux...)  Will
> update with details as I get them.
> 

At least backporting the branches fix attached did not help for 1.0.x ... so probably its really not fixed for linux.
(In reply to comment #49)
> At least backporting the branches fix attached did not help for 1.0.x ... so
> probably its really not fixed for linux.

Do you have backtraces of any crashes you see? The testcase for this bug is *extremely* generic, and it's possible that fixing this bug is exposing another bug.
Attached file strace output
So I went and had a look at the testing boxes, and it appears to not be a crash anymore.  It's simply an exit(1) which is an improvement,  but we should still get it to work properly.  Anyway, here's the strace output.
(In reply to comment #51)
> So I went and had a look at the testing boxes, and it appears to not be a crash
> anymore.  It's simply an exit(1) which is an improvement,  but we should still
> get it to work properly.  Anyway, here's the strace output.

If I'm reading the strace correctly, it seems that X becomes unhappy with us for whatever reason (the end of the log is opening XErrorDB and writing out something about 'Gecko') and kills us. I think that deserves its own bug.
Attached file Backtrace 1.0.x (without patch) (obsolete) —
This is a backtrace for 1.0.8 without the backported patch.

Further there was no patch for bug 203278 applied - see comment #44
In response to comment 43  I used the patch from bug 330900 ... but it makes no difference at all. The backtrace looks similar.
(In reply to comment #53)
> Further there was no patch for bug 203278 applied - see comment #44
> In response to comment 43  I used the patch from bug 330900 ... but it makes no
> difference at all. The backtrace looks similar. 

The backtrace looks similar to what? That looks like a very different problem, presumably fixed by some other bug on the trunk (and probably on the 1.8 branch). This particular crash is actually controlled, it will assert (abort in debug builds, and return false, silently halting the script's execution).
(In reply to comment #54)
> (In reply to comment #53)
> > Further there was no patch for bug 203278 applied - see comment #44
> > In response to comment 43  I used the patch from bug 330900 ... but it makes no
> > difference at all. The backtrace looks similar. 
> 
> The backtrace looks similar to what? 

I meant similar to the backtrace from a build without the patch from bug 330900. 


> branch). This particular crash is actually controlled, it will assert (abort in
> debug builds, and return false, silently halting the script's execution).
> 

Fine, that's good enough for me I guess.

Attachment #231138 - Attachment is obsolete: true
Group: security
Flags: in-testsuite?
Crash Signature: [@ 0x20202113]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: