Closed Bug 790826 Opened 12 years ago Closed 12 years ago

Heap-use-after-free in mozilla::plugins::parent

Categories

(Core Graveyard :: Plug-ins, defect)

x86_64
All
defect
Not set
normal

Tracking

(firefox16 unaffected, firefox17+ fixed, firefox18+ fixed, firefox-esr10 unaffected)

RESOLVED FIXED
mozilla19
Tracking Status
firefox16 --- unaffected
firefox17 + fixed
firefox18 + fixed
firefox-esr10 --- unaffected

People

(Reporter: inferno, Assigned: gfritzsche)

References

Details

(4 keywords, Whiteboard: [adv-track-main17+][asan])

Attachments

(6 files, 1 obsolete file)

Attached file Testcase
Reproduces on trunk. For reproducing reliably, just run > 10 simultaneous firefox instances. ================================================================= ==21539== ERROR: AddressSanitizer heap-use-after-free on address 0x7f8bcb1c0088 at pc 0x7f8c1f38cac0 bp 0x7fffbeb1a2f0 sp 0x7fffbeb1a2e8 WRITE of size 4 at 0x7f8bcb1c0088 thread T0 #0 0x7f8c1f38cac0 in mozilla::plugins::parent::_releaseobject(NPObject*) src/dom/plugins/base/nsNPAPIPlugin.cpp:1389 0x7f8bcb1c0088 is located 8 bytes inside of 32-byte region [0x7f8bcb1c0080,0x7f8bcb1c00a0) freed by thread T0 here: #0 0x42b630 in free ??:0 #1 0x7f8c1f3e57bd in NPObjWrapperPluginDestroyedCallback(PLDHashTable*, PLDHashEntryHdr*, unsigned int, void*) src/dom/plugins/base/nsJSNPRuntime.cpp:1944 previously allocated by thread T0 here: #0 0x42b6f0 in __interceptor_malloc ??:0 #1 0x7f8c236873a8 in moz_xmalloc src/memory/mozalloc/mozalloc.cpp:57 #2 0x7f8c1f96a10b in mozilla::plugins::PluginScriptableObjectParent::CreateProxyObject() src/dom/plugins/ipc/PluginScriptableObjectParent.cpp:557 #3 0x7f8c1f9c4f32 in mozilla::plugins::PPluginModuleParent::OnMessageReceived(IPC::Message const&) src/objdir-ff-asan/ipc/ipdl/PPluginModuleParent.cpp:858 #4 0x7f8c1f978d8d in mozilla::ipc::AsyncChannel::OnDispatchMessage(IPC::Message const&) src/ipc/glue/AsyncChannel.cpp:473 #5 0x7f8c1f9d3228 in mozilla::plugins::PPluginInstanceParent::CallNPP_GetValue_NPPVpluginScriptableNPObject(mozilla::plugins::PPluginScriptableObjectParent**, short*) src/objdir-ff-asan/ipc/ipdl/PPluginInstanceParent.cpp:374 #6 0x7f8c1f9490a0 in mozilla::plugins::PluginInstanceParent::NPP_GetValue(NPPVariable, void*) src/dom/plugins/ipc/PluginInstanceParent.cpp:1104 #7 0x7f8c1f95a48d in mozilla::plugins::PluginModuleParent::NPP_GetValue(_NPP*, NPPVariable, void*) src/dom/plugins/ipc/PluginModuleParent.cpp:709 Shadow byte and word: 0x1ff179638011: fd 0x1ff179638010: fd fd fd fd fd fd fd fd More shadow bytes: 0x1ff179637ff0: fd fd fd fd fd fd fd fd 0x1ff179637ff8: fd fd fd fd fd fd fd fd 0x1ff179638000: fa fa fa fa fa fa fa fa 0x1ff179638008: fa fa fa fa fa fa fa fa =>0x1ff179638010: fd fd fd fd fd fd fd fd 0x1ff179638018: fd fd fd fd fd fd fd fd 0x1ff179638020: fa fa fa fa fa fa fa fa 0x1ff179638028: fa fa fa fa fa fa fa fa 0x1ff179638030: fd fd fd fd fd fd fd fd Stats: 331M malloced (365M for red zones) by 595516 calls Stats: 49M realloced by 25909 calls Stats: 292M freed by 330351 calls Stats: 158M really freed by 206564 calls Stats: 536M (137309 full pages) mmaped in 134 calls mmaps by size class: 8:344043; 9:49146; 10:16380; 11:20470; 12:4096; 13:2048; 14:1536; 15:384; 16:576; 17:1280; 18:272; 19:40; 20:20; mallocs by size class: 8:461364; 9:70728; 10:23906; 11:25624; 12:5405; 13:2947; 14:2126; 15:554; 16:764; 17:1730; 18:303; 19:45; 20:20; frees by size class: 8:220906; 9:56741; 10:19146; 11:21782; 12:4052; 13:2704; 14:1865; 15:492; 16:612; 17:1711; 18:281; 19:42; 20:17; rfrees by size class: 8:136492; 9:35705; 10:11257; 11:16093; 12:2517; 13:1371; 14:1234; 15:312; 16:395; 17:1147; 18:34; 19:6; 20:1; Stats: malloc large: 2098 small slow: 3007 ==21539== ABORTING
Component: General → Plug-ins
Keywords: sec-critical
Product: Firefox → Core
bc, could you try and make this testcase smaller? I can't tell whether the aria or CSS is really relevant to the crash (I suspect it's not).
Assignee: nobody → georg.fritzsche
after crashing immediately on my first try I'm having problems reproducing at all. :-( I'll see about getting better reproducibility and then will work on reduction.
Blocks: 791798
(In reply to Bob Clary [:bc:] from comment #2) > I'll see about getting better reproducibility Did you have any luck with this?
No, but I've been out with the Flu for a few days. I'll try to look at this again today.
Georg: does the reduced testcase demonstrate the problem for you? You can find links to ASAN try builds here: http://people.mozilla.org/~choller/firefox/asan/
(In reply to Daniel Veditz [:dveditz] from comment #7) > Georg: does the reduced testcase demonstrate the problem for you? Sorry for the delay - yes, it does reproduce (inconsistently) and also with custom builds. However, i haven't been able to resolve the issue yet.
The heap-use-after-free is apparently caused by a race nsJSNPRuntime: * NPObjWrapperPluginDestroyedCallback() - deallocating the NPObject vs. * NPObjWrapper_Finalize() leading to DelayedReleaseGCCallback() and releaseref on the NPObject This patches fixes the issue, but i'm unsure if i'm overlooking any side-effects. Benjamin, can you review this?
Attachment #666632 - Flags: review?(benjamin)
Keywords: csec-uaf
This code is all running on the main thread. Which function is first for this crash scenario? If it's NPObjWrapperPluginDestroyedCallback then NPObjWrapper_Finalize, this shouldn't happen. NPObjWrapperPluginDestroyedCallback calls ::JS_SetPrivate(entry->mJSObj, nullptr); So the finalizer should do nothing. If it's the other way around, I'm still not clear on what's happening: * NPObjWrapper_Finalize. Enqueues a sDelayedReleases->AppendElement(npobj). Removes the object from sNPObjectWrappers. * Then plugin destruction, NPObjWrapperPluginDestroyedCallback should not be called for this object because it was already removed from sNPObjectWrappers! Unless the GC which causes NPObjWrapper_Finalize happens *within* the hashtable enumeration (nsJSNPRuntime::OnPluginDestroy), which could cause sNPObjWrappers to become corrupted. The only JSAPI call within the enumeration is JS_SetPrivate, which I believe never triggers GC or finalizers, but I'd love for somebody to confirm that definitively. I think this needs more understanding.
The ASAN error callstack and output i'm seeing.
Attachment #666632 - Flags: review?(benjamin)
When I try this testcase in debug non-valgrind/ASAN builds, I always end up crashing but not in something directly related: > xul.dll!DoDeferredRelease<nsISupports *>(array={...}) Line 527 C++ xul.dll!XPCJSRuntime::GCCallback(rt=0x074dd368, status=JSGC_END) Line 733 C++ mozjs.dll!Collect(rt=0x074dd368, incremental=true, budget=0x0000000000000000, gckind=GC_NORMAL, reason=TRANSPLANT) Line 4654 C++ mozjs.dll!js::GCFinalSlice(rt=0x074dd368, gckind=GC_NORMAL, reason=TRANSPLANT) Line 4693 C++ mozjs.dll!js::FinishIncrementalGC(rt=0x074dd368, reason=TRANSPLANT) Line 178 C++ mozjs.dll!JS_TransplantObject(cx=0x0c42a390, origobjArg=0x11331040, targetArg=0x12737040) Line 1557 C++ xul.dll!xpc::TransplantObject(cx=0x0c42a390, origobj=0x11331040, target=0x12737040) Line 693 C++ xul.dll!nsGlobalWindow::SetNewDocument(aDocument=0x0ece9790, aState=0x00000000, aForceReuseInnerWindow=false) Line 1992 C++ xul.dll!DocumentViewerImpl::InitInternal(aParentWidget=0x00000000, aState=0x00000000, aBounds={...}, aDoCreation=true, aNeedMakeCX=true, aForceSetNewDocument=true) Line 928 C++ xul.dll!DocumentViewerImpl::Init(aParentWidget=0x00000000, aBounds={...}) Line 678 C++ xul.dll!nsDocShell::SetupNewViewer(aNewViewer=0x0e64f930) Line 8009 C++ xul.dll!nsDocShell::Embed(aContentViewer=0x0e64f930, aCommand=0x5af9890a, aExtraInfo=0x00000000) Line 6070 C++ xul.dll!nsDocShell::CreateContentViewer(aContentType=0x0e5d0020, request=0x0cb60f58, aContentHandler=0x0e5c3920) Line 7796 C++ xul.dll!nsDSURIContentListener::DoContent(aContentType=0x0e5d0020, aIsContentPreferred=false, request=0x0cb60f58, aContentHandler=0x0e5c3920, aAbortProcess=0x0035d24f) Line 122 C++ xul.dll!nsDocumentOpenInfo::TryContentListener(aListener=0x0eec4db8, aChannel=0x0cb60f58) Line 655 C++ xul.dll!nsDocumentOpenInfo::DispatchContent(request=0x0cb60f58, aCtxt=0x00000000) Line 356 C++ xul.dll!nsDocumentOpenInfo::OnStartRequest(request=0x0cb60f58, aCtxt=0x00000000) Line 248 C++ xul.dll!nsBaseChannel::OnStartRequest(request=0x0ec52400, ctxt=0x00000000) Line 731 C++ xul.dll!nsInputStreamPump::OnStateStart() Line 417 C++ xul.dll!nsInputStreamPump::OnInputStreamReady(stream=0x09cbf9b0) Line 368 C++ xul.dll!nsInputStreamReadyEvent::Run() Line 83 C++ xul.dll!nsThread::ProcessNextEvent(mayWait=false, result=0x0035d577) Line 612 C++ I also tried a debugging patch which would catch the "nested GC" error I mentioned, and it's not triggering anything.
A Linux crash from the same testcase ended up at #4 <signal handler called> #5 0x00007fecb3ee4cf1 in ReleaseSliceNow (slice=100, data=0xc5c910) at /home/bsmedberg/builds/mozilla-central/src/js/xpconnect/src/XPCJSRuntime.cpp:569 #6 0x00007fecb3ee50bf in XPCIncrementalReleaseRunnable::ReleaseNow (this= 0xc5c8f0, limited=true) at /home/bsmedberg/builds/mozilla-central/src/js/xpconnect/src/XPCJSRuntime.cpp:627 #7 0x00007fecb3ee52d2 in XPCIncrementalReleaseRunnable::Run (this=0xc5c8f0) at /home/bsmedberg/builds/mozilla-central/src/js/xpconnect/src/XPCJSRuntime.cpp:658 #8 0x00007fecb485ae42 in nsThread::ProcessNextEvent (this=0x894fa0, mayWait= false, result=0x7fffc9b754ef)
Valgrind output on Linux & trunk for the test-case: Invalid read of size 4 at 0x8960648: mozilla::plugins::parent::_releaseobject(NPObject*) (nsNPAPIPlugin.cpp:1408) by 0x897ACA0: DelayedReleaseGCCallback(JSRuntime*, JSGCStatus) (nsJSNPRuntime.cpp:213) by 0x874390D: XPCJSRuntime::GCCallback(JSRuntime*, JSGCStatus) (XPCJSRuntime.cpp:748) by 0x91F216E: Collect(JSRuntime*, bool, long, js::JSGCInvocationKind, js::gcreason::Reason) (jsgc.cpp:4654) by 0x91F21DC: js::GCFinalSlice(JSRuntime*, js::JSGCInvocationKind, js::gcreason::Reason) (jsgc.cpp:4696) by 0x91DFC76: js::FinishIncrementalGC(JSRuntime*, js::gcreason::Reason) (jsfriendapi.cpp:178) by 0x91A4966: JS_TransplantObject (jsapi.cpp:1560) by 0x8825D1D: xpc::TransplantObject(JSContext*, JSObject*, JSObject*) (WrapperFactory.cpp:693) by 0x83FFEE5: nsGlobalWindow::SetNewDocument(nsIDocument*, nsISupports*, bool) (nsGlobalWindow.cpp:1992) by 0x7F88C1C: DocumentViewerImpl::InitInternal(nsIWidget*, nsISupports*, nsIntRect const&, bool, bool, bool) (nsDocumentViewer.cpp:927) by 0x7F895BB: DocumentViewerImpl::Init(nsIWidget*, nsIntRect const&) (nsDocumentViewer.cpp:677) by 0x88343E3: nsDocShell::SetupNewViewer(nsIContentViewer*) (nsDocShell.cpp:8015) Address 0x233af5b8 is 8 bytes inside a block of size 32 free'd at 0x4C2A82E: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) by 0x7496F72: moz_free (mozalloc.cpp:51) by 0x8B27337: mozilla::plugins::PluginScriptableObjectParent::ScriptableDeallocate(NPObject*) (mozalloc.h:224) by 0x89773FD: NPObjWrapperPluginDestroyedCallback(PLDHashTable*, PLDHashEntryHdr*, unsigned int, void*) (nsJSNPRuntime.cpp:1945) by 0x8C9AB8E: PL_DHashTableEnumerate (pldhash.cpp:716) by 0x897A85C: nsJSNPRuntime::OnPluginDestroy(_NPP*) (nsJSNPRuntime.cpp:1992) by 0x8966A77: nsNPAPIPluginInstance::Stop() (nsNPAPIPluginInstance.cpp:315) by 0x8974BCC: nsPluginHost::StopPluginInstance(nsNPAPIPluginInstance*) (nsPluginHost.cpp:3288) by 0x81CE3F4: nsObjectLoadingContent::DoStopPlugin(nsPluginInstanceOwner*, bool, bool) (nsObjectLoadingContent.cpp:2360) by 0x81CE694: nsObjectLoadingContent::StopPluginInstance() (nsObjectLoadingContent.cpp:2404) by 0x81CBAD7: nsObjectLoadingContent::NotifyOwnerDocumentActivityChanged() (nsObjectLoadingContent.cpp:795) by 0x818A613: NotifyActivityChanged(nsIContent*, void*) (nsDocument.cpp:3907)
Attached file gdb session with adress traces (obsolete) —
I managed to get a non-garbled trace on this with a Linux x64 ASAN build. The affected region is 0x7fffcee5e380, which is: * added to sDelayedReleases in NPObjWrapper_Finalize() * deallocated from NPObjWrapperPluginDestroyedCallback() * then _releaseobject() is called on it from DelayedReleaseGCCallback()
NPObjWrapper_Finalize should have removed the object from the hashtable. So why is NPObjWrapperPluginDestroyedCallback being called for this object at all?
Attached file GDB trace session
Traced a bit more and apparently the object is added again after being torn down: * nsNPObjWrapper::GetNewOrUsed() -> adds to sNPObjWrappers * NPObjWrapper_Finalize() -> removes from sNPObjWrappers, schedules delayed release * nsNPObjWrapper::GetNewOrUsed() -> adds to sNPObjWrappers ... * NPObjWrapperPluginDestroyedCallback() -> free object * DelayedReleaseGCCallback() -> access free'd object
Attachment #669604 - Attachment is obsolete: true
So... is the NPObjWrapper_Finalize() fine call at that time fine? Assuming that, does it make sense to clean any remaining delayed releases from nsJSNPRuntime::OnPluginDestroy()?
In bug 729760 we started allowing JS/Gecko code to run in between finalization of JS objects and the GC_END callback. I don't understand why nsNPObjWrapper::GetNewOrUsed can return the NPObject after we've decided to release it. It looks like this is the call that's obtaining the NPObject: nsresult rv = GetValueFromPlugin(NPPVpluginScriptableNPObject, &npobj); Is there a rule that it's allowed to return the given NPObject until we call _releaseobject on it?
It's allowed to return the same NPObject for as long as the instance is alive, sure. It can keep an internal ref on the NPObject which implied nothing about the JS object lifetime.
In that case, I agree with Georg. nsJSNPRuntime::OnPluginDestroy should remove the npobj it's destroying from sDelayedReleases.
Comment on attachment 666632 [details] [diff] [review] Fix race issue on NPObjWrappers Bill, thanks for the info. Re-requesting review.
Attachment #666632 - Flags: review?(benjamin)
Attachment #666632 - Flags: review?(benjamin) → review+
Comment on attachment 666632 [details] [diff] [review] Fix race issue on NPObjWrappers How easily can the security issue be deduced from the patch? I suspect it would be quite difficult to figure out how to reproduce the problem and even more difficult to exploit. You' have to combine a heapsmash with a GC timing attack to weaponize this, and that's fiendishly difficult. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No Which older supported branches are affected by this flaw? If not all supported branches, which bug introduced the flaw? Bug 791798 seems to indicate that this is a regression in 17 from around 27-28 August, but we're not sure and nothing in the range seems obviously incorrect. Analysis says this could date all the way back to incremental GC, but the crashes aren't showing up on 16. Do you have backports for the affected branches? The patch here should work for 17. How likely is this patch to cause regressions; how much testing does it need? This patch feels fairly safe to me.
Attachment #666632 - Flags: sec-approval?
Attachment #666632 - Flags: sec-approval? → sec-approval+
I'm going to put this patch on bug 791798, a public topcrash bug, and land it from there so that it looks less enticing.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
This may be related to bug 785806 and bug 787885. Can you request approval for branches, Benjamin?
On Monday when I've verified it against crash-stats, yes.
Keywords: testcase, verifyme
Whiteboard: [adv-track-main17+]
Whiteboard: [adv-track-main17+] → [adv-track-main17+][asan]
Group: core-security
Flags: sec-bounty+
mass remove verifyme requests greater than 4 months old
Keywords: verifyme
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: