Open Bug 581539 (crossfuzz) Opened 14 years ago Updated 2 years ago

[meta] Bugs found by Michal Zalewski's cross_fuzz

Categories

(Core :: Fuzzing, defect)

Tracking

()

People

(Reporter: lcamtuf, Unassigned)

References

(Depends on 3 open bugs, Blocks 1 open bug)

Details

(Keywords: meta, Whiteboard: [sg:nse meta])

Running this fuzzer:

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_mozilla.html

...tends to crash Firefox with calls to invalid memory locations, etc:

FAULT ->101ab450 ff5118           call dword ptr [ecx+0x18] ds:0023:00003c4c=????????

This fuzzer is still very much under construction, but I thought you might want to know ASAP.
Michal, which version of Firefox have you been testing?

And about how long do I need to run the fuzzer?
Current 3.6, as indicated in bug flags; Windows 32-bit. About 10-15 minutes works for me. Same location, another crash:

FAULT ->101ab450 ff5118           call dword ptr [ecx+0x18] ds:0023:5741525b=????????

The address it is trying to read the call target from is just "[RAW" in ASCII, so it is likely attacker-controlled. Looks major. Full crash dump (sorry, not a debug build):

*----> State Dump for Thread Id 0x3e4 <----*

eax=046d976c ebx=0012ef58 ecx=57415243 edx=0012ec74 esi=0012ed40 edi=00000000
eip=101ab450 esp=0012ec60 ebp=0012ec6c iopl=0         nv up ei pl zr na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00200246

*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Program Files\Mozilla Firefox\xul.dll - 
function: xul!gfxFontCache__AddNew
        101ab43a 846234           test    [edx+0x34],ah
        101ab43d 1e               push    ds
        101ab43e 00568b           add     [esi-0x75],dl
        101ab441 750c             jnz     xul!gfxFontCache__AddNew+0x277 (101ab44f)
        101ab443 832600           and     dword ptr [esi],0x0
        101ab446 8b4008           mov     eax,[eax+0x8]
        101ab449 8b08             mov     ecx,[eax]
        101ab44b 8d5508           lea     edx,[ebp+0x8]
        101ab44e 52               push    edx
        101ab44f 50               push    eax
FAULT ->101ab450 ff5118           call dword ptr [ecx+0x18] ds:0023:5741525b=????????
        101ab453 85c0             test    eax,eax
        101ab455 0f8850341e00     js    xul!gfxPlatform__operator=+0x4b090 (1038e8ab)
        101ab45b 8b4508           mov     eax,[ebp+0x8]
        101ab45e 854510           test    [ebp+0x10],eax
        101ab461 7406             jz      xul!gfxFontCache__AddNew+0x291 (101ab469)
        101ab463 c70601000000     mov     dword ptr [esi],0x1
        101ab469 33c0             xor     eax,eax
        101ab46b 5e               pop     esi
        101ab46c 5d               pop     ebp
        101ab46d c20c00           ret     0xc

*----> Stack Back Trace <----*
WARNING: Stack unwind information not available. Following frames may be wrong.
ChildEBP RetAddr  Args to Child              
0012ec6c 101fc9f8 0466b9c0 0012ed40 00000040 xul!gfxFontCache__AddNew+0x278
0012ec90 100fa387 0466b9c0 00000003 00000001 xul!nsExpirationTracker<gfxFont,3>__TimerCallback+0x18ab
0012ef28 100ef78e 0012ef58 00000001 01b59000 xul!gfxSkipCharsIterator__SetOffsets+0x2e967
0261f254 03c1c950 03cce348 03cbd2c0 00000001 xul!gfxSkipCharsIterator__SetOffsets+0x23d6e
00000000 00000000 00000000 00000000 00000000 0x3c1c950
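As a quick sanity check on the "[RAW" reading, the faulting address can be recomputed from the register dump and decoded as little-endian ASCII. This is only a sketch in Python; the register value is copied from the dump above, and the helper name is made up for illustration.

```python
def as_ascii_le(value: int) -> str:
    """Decode a 32-bit value as four little-endian ASCII bytes."""
    return value.to_bytes(4, "little").decode("ascii", errors="replace")


ecx = 0x57415243           # ecx from the register dump
fault_addr = ecx + 0x18    # operand of: call dword ptr [ecx+0x18]

print(hex(fault_addr))           # 0x5741525b, matching the FAULT line
print(as_ascii_le(fault_addr))   # [RAW
print(as_ascii_le(ecx))          # CRAW
```

So ecx itself also decodes as printable text, which is consistent with the object pointer having been overwritten by string data rather than being merely stale.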
Hmm, something to do with fonts...
No, the stacks are bogus because there are no symbols. We'll either need crashreport IDs or stacks with symbols (they can be pulled from our symbol server), or we need to reproduce locally.

I suspect reproducing either in valgrind or in a recording would be our best option here.
This is definitely not the only scary-looking crash, though. A slightly improved version of the fuzzer:

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_fixed.html

...also crashes here:

eax=00000033 ebx=ccccc3c0 ecx=00340ff0 edx=0012d1c4 esi=090ed000 edi=0012d1c4
eip=003190a1 esp=0012d098 ebp=0012d0b0 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00200206

function: js3250!JS_CallTracer
        00319086 5b               pop     ebx
        00319087 8be5             mov     esp,ebp
        00319089 5d               pop     ebp
        0031908a c3               ret
        0031908b 8b450c           mov     eax,[ebp+0xc]
        0031908e 8bc8             mov     ecx,eax
        00319090 81c9ff0f0000     or      ecx,0xfff
        00319096 8b59f1           mov     ebx,[ecx-0xf]
        00319099 83e90f           sub     ecx,0xf
        0031909c 25ff0f0000       and     eax,0xfff
FAULT ->003190a1 837b0820      cmp dword ptr [ebx+0x8],0x20 ds:0023:ccccc3c8=????????
        003190a5 7553             jnz     js3250!JS_CallTracer+0xfa (003190fa)

Let me know if you can't repro.
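For anyone reading the dump without symbols, the registers are enough to replay the few instructions before the fault. The following Python sketch does only that arithmetic; the register values are copied from the dump, and the interpretation (a lookup near the end of the 4 KiB page containing the pointer argument) is an assumption based on the disassembly, not on the source.

```python
PAGE_MASK = 0xFFF

# Register values copied from the dump at the time of the fault.
eax = 0x00000033
ecx = 0x00340FF0
ebx = 0xCCCCC3C0

# eax was masked with 0xfff and ecx is (ptr | 0xfff) - 0xf, so the original
# pointer argument can be rebuilt from the page bits of ecx and the low bits of eax.
ptr = (ecx & ~PAGE_MASK) | eax

# Replay the instructions leading up to the fault.
replayed_ecx = (ptr | PAGE_MASK) - 0xF   # or ecx,0xfff / sub ecx,0xf
replayed_eax = ptr & PAGE_MASK           # and eax,0xfff
fault_addr = ebx + 0x8                   # cmp dword ptr [ebx+0x8],0x20

print(hex(ptr))          # 0x340033
print(hex(fault_addr))   # 0xccccc3c8
assert replayed_ecx == ecx and replayed_eax == eax
```

The replay is consistent with the dump: the pointer argument was 0x340033, the value loaded from 0x340ff0 into ebx was 0xccccc3c0, and dereferencing ebx+0x8 is what faulted.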
Product: Firefox → Core
QA Contact: firefox → toolkit
Version: 3.6 Branch → 1.9.2 Branch
Just FYI, to help prioritize: my current plan is to publish this fuzzer in approx 60 days (or earlier, if all vendors address the problems sooner).
So far I haven't managed to reproduce the crash on a debug OS X build (1.9.2); I tested both links.

Testing now using a non-debug build.

Of course, it is possible that the crash is Windows-only, or at least does not happen on OS X.
On a non-debug build I did get a crash:
0   libxpconnect.dylib            	0x130b36bd WrappedNativeMarker(JSDHashTable*, JSDHashEntryHdr*, unsigned int, void*) + 109
1   libmozjs.dylib                	0x00199747 JS_DHashTableEnumerate + 135
2   libxpconnect.dylib            	0x130b32ec XPCWrappedNativeScope::MarkAllWrappedNativesAndProtos() + 76
3   libxpconnect.dylib            	0x1309972f XPCJSRuntime::GCCallback(JSContext*, JSGCStatus) + 751
4   libgklayout.dylib             	0x11fe6cd3 DOMGCCallback(JSContext*, JSGCStatus) + 51
5   libmozjs.dylib                	0x001bfa40 js_GC + 3408
6   libmozjs.dylib                	0x00178f58 JS_GC + 120
7   libxpconnect.dylib            	0x1307cb27 nsXPConnect::Collect() + 199
8   libxpcom_core.dylib           	0x00343ae6 nsCycleCollector::Collect(unsigned int) + 278
9   libxpcom_core.dylib           	0x00343ce9 nsCycleCollector_collect() + 41
10  libgklayout.dylib             	0x11feba7a nsJSContext::LoadEnd() + 202
11  libgklayout.dylib             	0x11c70e8d DocumentViewerImpl::LoadComplete(unsigned int) + 429
12  libdocshell.dylib             	0x006d950a nsDocShell::EndPageLoad(nsIWebProgress*, nsIChannel*, unsigned int) + 570
13  libdocshell.dylib             	0x006d6ef0 nsDocShell::OnStateChange(nsIWebProgress*, nsIRequest*, unsigned int, unsigned int) + 560
14  libdocshell.dylib             	0x006e9bfc nsDocLoader::FireOnStateChange(nsIWebProgress*, nsIRequest*, int, unsigned int) + 236
15  libdocshell.dylib             	0x006eabc4 nsDocLoader::DocLoaderIsEmpty(int) + 340
...which looks a lot like Bug 520554
We need to hook this into our fuzzer harness to catch future regressions too. I'll let this bug track the fuzzer itself and spin off dependent bugs for the instances uncovered.
Blocks: fuzz
Depends on: 520554
Keywords: meta
Whiteboard: [sg:nse meta]
Just FYI, this is by far the best variant of the fuzzer, bringing Firefox down in a couple of minutes, much of the time with a call to a bad, seemingly attacker-controlled memory address:

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_final_20100728.html

All previous versions should be considered obsolete :P
I still get the same stack as in comment 8.
(Not that it matters much, but WebKit and Presto seem to crash in under 5 seconds; Gecko stays up for a few minutes.)
Yup.
Depends on: 582601
Depends on: 582649
What are 582601 and 582649? I don't have access.
I caught a crash in recording with the fuzzer in comment 5, looks like we're not removing an XPCWrappedNative from the hash before deleting it. Here's my guess:

1. We create a wrapped native, refcount is artificially bumped to 2.
2. Release is called when C++ is done with the wrapper, refcount goes to 1.
3. GC happens.
4. FlatJSObjectFinalized checks IsWrapperExpired, which returns false, so we don't remove the wrapper from the hash.
5. FlatJSObjectFinalized then calls Release, deleting the wrapper.
6. Later enumeration of hash hits deleted wrapper.

The question is why ExpireWrapper was never called. It looks like that is only called from RootAndUnlinkJSObjects, so is it possible that the cycle collector never knew about this wrapper? I'll dig a little more, but I'm thinking that we might be able to fix this by setting the expired bit at creation time and only clearing it in Traverse. Then this case would work like normal, I think.

Any thoughts?
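To make the guessed sequence easier to follow, here is a toy model of steps 1 to 6 in Python. It is not XPConnect code: the class, the dictionary standing in for the hash, and the `expired_at_creation` switch are all hypothetical, and the switch only approximates the proposed "set the expired bit at creation, clear it in Traverse" idea for a wrapper the cycle collector never traverses.

```python
class Wrapper:
    """Stand-in for an XPCWrappedNative; only tracks what the steps need."""

    def __init__(self, name, expired_at_creation=False):
        self.name = name
        self.refcount = 2                  # step 1: artificially bumped to 2
        self.expired = expired_at_creation
        self.deleted = False

    def release(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.deleted = True            # models the wrapper being deleted


def flat_js_object_finalized(table, wrapper):
    """Models steps 4 and 5: unhash only if expired, then Release."""
    if wrapper.expired:
        table.pop(wrapper.name, None)
    wrapper.release()


def run(expired_at_creation):
    table = {}
    w = Wrapper("w", expired_at_creation)
    table[w.name] = w
    w.release()                            # step 2: C++ is done, refcount 2 -> 1
    flat_js_object_finalized(table, w)     # steps 3-5: GC finalizes the JS object
    # step 6: a later enumeration of the hash
    return [entry.name for entry in table.values() if entry.deleted]


print(run(expired_at_creation=False))  # ['w']: a deleted wrapper is still hashed
print(run(expired_at_creation=True))   # []: the entry was removed before deletion
```

In the model, the stale entry only survives when the expired bit was never set, which is the case the guess above attributes to ExpireWrapper not being called.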
bent, is that bug 520554? It's similar, but not exactly the same.
Most likely, yes. Peterv and I have been discussing it and it seems to be the same root problem.
Sorry for the spam, but this is a new, greatly improved version of the fuzzer. The two most significant changes include randomizing DOM crawl order, and limiting object crawl fanout.

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_randomized_20100729.html

It seems to trigger a more varied set of faults, including:

FAULT ->104d6002 c706d8bba510 mov dword ptr [esi],0x10a5bbd8 ds:0023:10a5bbd8=104d6038

...or this weird abort trap:

ABORT: `OnError' called on non-toplevel actor: file e:/builds/moz2_slave/win32_build/build/obj-firefox/ipc/ipdl/PPluginScriptableObjectChild.cpp, line 1822.
(It also crashes Firefox much faster)
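For readers unfamiliar with the two changes, the sketch below shows what "randomized crawl order" and "limited fanout" could look like on a plain object graph. It is written in Python rather than the fuzzer's JavaScript and is not taken from cross_fuzz; the function name, the limits, and the dictionary standing in for a DOM tree are all assumptions.

```python
import random


def crawl(obj, rng, max_fanout=3, max_depth=4, path="root", visited=None):
    """Yield paths of reachable members in random order with limited fanout."""
    if visited is None:
        visited = set()
    if id(obj) in visited or max_depth == 0 or not isinstance(obj, dict):
        return
    visited.add(id(obj))
    keys = list(obj)
    rng.shuffle(keys)                  # randomize the crawl order
    for key in keys[:max_fanout]:      # follow at most max_fanout members
        child_path = f"{path}.{key}"
        yield child_path
        yield from crawl(obj[key], rng, max_fanout, max_depth - 1,
                         child_path, visited)


# Hypothetical stand-in for a DOM-like object graph.
document = {
    "body": {"firstChild": {}, "lastChild": {}, "style": {}, "attributes": {}},
    "head": {"title": {}},
    "forms": {},
    "links": {},
}
document["body"]["ownerDocument"] = document   # back-references form cycles

paths = list(crawl(document, random.Random(1234)))
print(paths)
```

Capping the fanout keeps each pass short, while shuffling means different members are reached on different passes instead of always the first few.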
Depends on: 583225
One more request: please keep this bug locked until other vendors have a chance to catch up (the fuzzer also affects most other browsers). Releasing individual fixes and unlocking their respective bugs is of course OK.
Alias: crossfuzz
It's been several weeks, so just wanted to check progress. Looks like 520554 is the one remaining scary crash; is this being worked on?

Also, were you able to check out cross_fuzz_randomized_20100729.html (see above)?
I'm waiting on bug 582649 to get fixed before I'll continue testing.
Depends on: 590291
bug 582649 is fixed now.
Just a gentle ping; the 60-day mark is about two weeks away. I can wait a bit more, but are we making any progress on bug 520554, and is cross_fuzz_randomized_20100729.html not causing any other major crashes on trunk?
looks like 520554 might have been fixed by Bug 583225.
More specifically, the fix in bug 583225 is expected to take care of the crashes [@ WrappedNativeMarker] hit by this fuzzer.  Bug 520554 isn't much of a bug report.
No longer depends on: 520554
After a quick chat with Jesse, I put together a probably close-to-usable variant of cross_fuzz_randomized_20100729.html that may be minimally more useful in troubleshooting crashes, by logging all the evals() in a manner that can be used to construct repros:

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_randomized_20100729_logging.html

You probably want to replace <textarea> with console output. I haven't tested this much, so there is a chance it's not a perfect copy, though; I can't spend more time on this ATM :-(
Here's an (IMO more useful) version that uses reproducible seeds instead:

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_randomized_20100729_seed.html

It accepts seed in location.hash, say:

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_randomized_20100729_seed.html#1234

If no seed is found, it picks one based on system time, and updates location.hash accordingly.

Unlike cross_fuzz_randomized_20100729_logging.html, it's guaranteed to map 1:1 to the original cross_fuzz_randomized_20100729.html fuzzer.
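The seed handling described above can be sketched roughly as follows. This is Python, not the fuzzer's JavaScript, and it is only an illustration: `seed_from_url` is a made-up helper, and treating the fragment as a plain decimal number is an assumption about the format.

```python
import random
import time
from urllib.parse import urlparse


def seed_from_url(url, now=None):
    """Return (seed, url_with_seed); fall back to the clock if no seed is given."""
    parts = urlparse(url)
    if parts.fragment.isdigit():
        return int(parts.fragment), url
    seed = int(time.time() if now is None else now)
    # Mirror "updates location.hash accordingly" by writing the seed back.
    return seed, parts._replace(fragment=str(seed)).geturl()


base = "http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_randomized_20100729_seed.html"

seed, url = seed_from_url(base + "#1234")
first = random.Random(seed)
second = random.Random(seed)
# Two generators with the same seed make the same choices, which is what
# allows a crashing run to be replayed from its URL.
same = [first.randrange(100) for _ in range(3)] == [second.randrange(100) for _ in range(3)]
print(seed, same)

seed, url = seed_from_url(base, now=1280361600)
print(url)   # the base URL with "#1280361600" appended
```

Writing the chosen seed back into the URL means that a crash report only needs to include the address bar contents to be replayable.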
Hi folks,

We're nearing the 60 day boundary, and according to Jesse, there still are new, exploitable crashes seen with this fuzzer on trunk.

Is the seed-based variant any improvement?
Jesse, are they filed? I'm happy to debug stuff in recording but I thought this bug was taken care of, based on the lack of any other blocking bugs.
I was still getting various crashes while testing (something like bug 582649, for instance).
Unfortunately, I don't have time to look into them further.
But I guess the unrandomized version of the fuzzer mentioned in comment 30 might make them simpler to reproduce.
Depends on: 596988
Depends on: 599966
Depends on: 600064
Just an obligatory biweekly ping to check the status of this bug ;-)
Hi folks,

I am slowly getting to a point where I would prefer to release this fuzzer externally. Are there any immediate, outstanding things that need to be done? I know Jesse incorporated some of the logic into his fuzzers and is filing bugs.
Michal, 

http://twitter.com/#!/lcamtuf (latest tweet) sounds like you are possibly getting ready to talk about this one.  Is that true?
Yes, the plan is to release the tool in early January, as most of the vendors are either OK with it, or at least haven't objected to repeated inquiries. I had no response to my Dec 3 note that this is going to happen, and January is already way past the originally proposed deadline (comment 6).

Here's the draft blurb I have for Mozilla:

"Firefox: Mozilla notified in July 2010. About 10 crashes addressed in bug 581539, with attribution in security bulletins. Fuzzing approach subsequently rolled into Jesse Ruderman's fuzzing infrastructure under bug 594645 in September; from that point on, about three dozen additional bugs identified (generally with no specific attribution at patch time). Several difficult to diagnose crashes may be still occurring on trunk."

This tally does not differentiate between exploitable and non-exploitable crashes; this fact is explained in the intro to my post.

FWIW, the blurb does not make you look bad in comparison; Microsoft will probably come off poorly, all other browsers are roughly in the same shape. The announcement will also have this preamble:

"This design makes it unexpectedly difficult to get clean, deterministic repros; to that effect, in the current versions of all the affected browsers, we are still seeing a collection of elusive problems when running the tool. I believe that at this point, a broader community involvement [link to vulnerability bounty programs] may be instrumental to tracking down and resolving these bugs."

I hope that's OK. I do not expect this to realistically have any damaging effects on Firefox or any other project, given that tracking down cross_fuzz crashes is harder than recreating such a fuzzer.
OK, Jesse is doing a test run on Firefox 3.6.13 to see if any unexpected problems surface. If you can also let us know the exact timing, we can give our PR folks a heads-up about any inbound questions that might result from your posting.
I'm probably going to go with Jan 3 or so.
Michal,

Can you send us a link to the exact fuzzer(s) that you plan to release so we can run some tests against important firefox releases?

https://bugzilla.mozilla.org/show_bug.cgi?id=581539#c30 is the last comment that talks about any updates.  Are there changes you have added beyond that?
The last meaningful change to the fuzzer, that resulted in some new crashes in Firefox, is that in comment 12. The one in comment 30 simply adds seeds for easier repros. I plan to release the contents of this directory:

http://lcamtuf.coredump.cx/cross_fuzz/

If you want to double-check, I am fairly confident that you only need to test the "canonical" version, identical to comments 12 and 30, that is:

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_randomized_20100729_seed.html
Err, should be comment 20, not comment 12.
Depends on: 622165
Depends on: 622197
I released it a bit earlier because I suspect one of the vulns in MSIE may be known to third parties. Sorry bout that.

http://lcamtuf.blogspot.com/2011/01/announcing-crossfuzz-potential-0-day-in.html
Group: core-security
Depends on: 622456
Depends on: 622466
Depends on: 622483
Depends on: 622596
Depends on: 623070
No longer depends on: 622165, 622456, 622483, 622596, 623070
Depends on: 635539
Depends on: 671484
Depends on: 674189
Depends on: 693316
Severity: critical → normal
OS: Windows XP → All
Hardware: x86 → All
Summary: cross_fuzz: crashes with evidence of memory corruption → Bugs found by Michal Zalewski's cross_fuzz
Version: 1.9.2 Branch → Trunk
Depends on: 1408488
Depends on: 1565631
Summary: Bugs found by Michal Zalewski's cross_fuzz → [meta] Bugs found by Michal Zalewski's cross_fuzz
Depends on: 1566678
Depends on: 1566684
Depends on: 1567350
Depends on: 1567351
Depends on: 1571037
Depends on: 1650447
Depends on: 680745
Depends on: 1753493
Component: Security → Fuzzing
Keywords: sec-other
Severity: normal → S3