Crash [@ RefPtr<nsCycleCollector>::operator!] through [@ mozilla::dom::indexedDB::(anonymous namespace)::FactoryOp::PermissionRetry]
Categories
(Core :: Storage: IndexedDB, defect)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox93 | --- | wontfix |
People
(Reporter: decoder, Unassigned)
Details
(5 keywords)
Crash Data
Attachments
(1 file)
|
6.88 KB,
text/plain
|
Details |
In experimental IPC fuzzing, we found the following crash on mozilla-central revision 610ae1bbeff8+ (fuzzing build):
=================================================================
==1721==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fffdb25fcf1 bp 0x7fffc5932400 sp 0x7fffc59323d0 T26)
==1721==The signal is caused by a READ memory access.
==1721==Hint: address points to the zero page.
#0 0x7fffdb25fcf1 in RefPtr<nsCycleCollector>::operator!() const objdir-ff-asan/dist/include/mozilla/RefPtr.h:311
#1 0x7fffdb25fcf1 in NS_CycleCollectorSuspect3 xpcom/base/nsCycleCollector.cpp:3762
#2 0x7fffe5d59acb in unsigned long nsCycleCollectingAutoRefCnt::decr<&NS_CycleCollectorSuspect3>(void*, nsCycleCollectionParticipant*, bool*) xpcom/base/nsISupportsImpl.h:275
#3 0x7fffe5d59acb in unsigned long nsCycleCollectingAutoRefCnt::decr<&NS_CycleCollectorSuspect3>(nsISupports*, bool*) xpcom/base/nsISupportsImpl.h:262
#4 0x7fffe5d59acb in mozilla::dom::ContentParent::Release() dom/ipc/ContentParent.cpp:3441
#5 0x7fffe5a3e5cc in mozilla::RefPtrTraits<mozilla::dom::ContentParent>::Release(mozilla::dom::ContentParent*) objdir-ff-asan/dist/include/mozilla/RefPtr.h:50
#6 0x7fffe5a3e5cc in RefPtr<mozilla::dom::ContentParent>::ConstRemovingRefPtrTraits<mozilla::dom::ContentParent>::Release(mozilla::dom::ContentParent*) objdir-ff-asan/dist/include/mozilla/RefPtr.h:381
#7 0x7fffe5a3e5cc in RefPtr<mozilla::dom::ContentParent>::assign_assuming_AddRef(mozilla::dom::ContentParent*) objdir-ff-asan/dist/include/mozilla/RefPtr.h:69
#8 0x7fffe5a3e5cc in RefPtr<mozilla::dom::ContentParent>& RefPtr<mozilla::dom::ContentParent>::operator=<mozilla::dom::ContentParent>(already_AddRefed<mozilla::dom::ContentParent>&&) objdir-ff-asan/dist/include/mozilla/RefPtr.h:206
#9 0x7fffe5a3e5cc in mozilla::dom::indexedDB::(anonymous namespace)::FactoryOp::PermissionRetry() dom/indexedDB/ActorsParent.cpp:15401
#10 0x7fffe5973ebc in mozilla::dom::indexedDB::(anonymous namespace)::FactoryOp::RecvPermissionRetry() dom/indexedDB/ActorsParent.cpp:15971
#11 0x7fffe5973ebc in non-virtual thunk to mozilla::dom::indexedDB::(anonymous namespace)::FactoryOp::RecvPermissionRetry() dom/indexedDB/ActorsParent.cpp:?
#12 0x7fffde478f6d in mozilla::dom::indexedDB::PBackgroundIDBFactoryRequestParent::OnMessageReceived(IPC::Message const&) objdir-ff-asan/ipc/ipdl/PBackgroundIDBFactoryRequestParent.cpp:179
#13 0x7fffde5d4a15 in mozilla::ipc::PBackgroundParent::OnMessageReceived(IPC::Message const&) objdir-ff-asan/ipc/ipdl/PBackgroundParent.cpp:3478
#14 0x7fffdcc42d5c in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) ipc/glue/MessageChannel.cpp:2066
[...]
It might be that there is simply a null check missing in the IndexedDB code (particularly PermissionRetry) but I am not 100% sure about the cycle collector being on the stack here as well. Could be simply collecting a null pointer, or some kind of lifetime issue. Marking s-s until confirmed to be harmless.
If this turns out to be just a null deref, please leave the bug marked s-s to not draw attention to ongoing IPC fuzzing work.
| Reporter | ||
Comment 1•4 years ago
|
||
Comment 2•4 years ago
|
||
From NS_CycleCollectorSuspect3 it seems that sCollectorData.get() returned a nullptr here.
ContentParent uses a no-op flavor of cycle-collector, not sure if this is related here? We go through its release function because we probably set mContentParent here with some new value and this happens to be our last reference to the former one. So the release is not coming from a cycle collection, which might explain why we have no CollectorData.
It seems to me that the cycle collector code should not break in this case? But it also seems like a non-edge case, so I would wonder how this could slip through then?
Comment 3•4 years ago
|
||
I don't know what "no-op flavor of cycle-collector" means.
If that link was supposed to point to https://searchfox.org/mozilla-central/rev/d6188c9ce02efeea309e7177fc14c9eb2f09db37/dom/ipc/ContentParent.cpp#3442, that just means ContentParent itself doesn't tell about its outgoing edges.
Someone is trying to release ContentParent in a wrong thread. ContentParent is main thread only, and the stack hints about non-main-thread.
Comment 4•4 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #3)
I don't know what "no-op flavor of cycle-collector" means.
If that link was supposed to point to https://searchfox.org/mozilla-central/rev/d6188c9ce02efeea309e7177fc14c9eb2f09db37/dom/ipc/ContentParent.cpp#3442, that just means ContentParent itself doesn't tell about its outgoing edges.
Yes, exactly, sorry for the broken link - I clicked on the wrong icon in searchfox, it seems.
Someone is trying to release ContentParent in a wrong thread. ContentParent is main thread only, and the stack hints about non-main-thread.
OK, so the IndexedDB code does not expect to ever hold the last reference to it, IIUC. Thanks for the hint!
Comment 5•4 years ago
•
|
||
https://searchfox.org/mozilla-central/rev/d6188c9ce02efeea309e7177fc14c9eb2f09db37/dom/indexedDB/ActorsParent.cpp#15397,15401 overrides existing mContentParent value, thus causing the release, I think.
This isn't about holding last reference, but calling release on a wrong thread.
Comment 6•4 years ago
|
||
OK, sure, thanks for the clarification.
Interestingly we assert to be here on the background thread right before, which would mean that this was never meant to be run on the main thread. The whole scope of that assignment seems to be to just pass mContentParent to ourselves and dispatch us to the main thread, where we safely release mContentParent, so in general all seems fine here.
IIUC the fuzzer managed to dispatch another PermissionRetry event fast enough to the background thread and on the same instance of FactoryOp to find the mContentParent still set to something.
We could have some thread-safe state on FactoryOp about having dispatched already something to the main-thread and bail out in that case.
Comment 7•4 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #6)
We could have some thread-safe state on
FactoryOpabout having dispatched already something to the main-thread and bail out in that case.
Hm, we can just check the state in RecvPermissionRetry, if the state is not State::PermissionChallenge we shouldn't call PermissionRetry.
I guess the IPC fuzzing build is not a debug build, otherwise you would hit this assertion
Comment 8•4 years ago
|
||
Ah, so we just need to uplift this assert to be a real check. However, I see many of these MOZ_ASSERT(mState == State::*). At least for functions that are a direct entry point for IPC we might want to be more strict in general to avoid unexpected behavior?
Comment 9•4 years ago
|
||
I wouldn't change (uplift) existing assertions, it's just that we never assumed that the other side would send unexpected messages. So we just need to add more checks to the places where we handle messages received from the other side (based on the assertions which we already have deeper in the code).
IPDL used to have a state machine support, but it seems it was never working well and it was removed in bug 1316757 in the end.
Anyway, solving this at IPDL level would be much nicer long term, especially given the ongoing IPC fuzzing work.
You can find a state machine definition in IPDL here: https://wiki.mozilla.org/IPDL/Low_level_setup
Updated•4 years ago
|
Comment 10•4 years ago
|
||
sec-other may have been premature? we've got something running on a thread it shouldn't have been, and an assertion that seems like we're in an unexpected state (MOZ_ASSERT(mState == State::PermissionChallenge);) that might have different results than a null deref with different input. Clearing so we can re-triage
Comment 11•4 years ago
|
||
I guess mContentParent had an existing value, so we're calling release on it. I assume that the value must normally be null. We could release assert that mContentParent as null has a hacky fix for the immediate issue. I wonder if it would be nice to have some special RefPtr variant that is "safe" to use off the main thread. eg you can assign an already_AddRefed, or do a move, but overwriting an existing value crashes.
The actual call to suspect is "safe" because it tries to get a cycle collector object via TLS, and the IPC thread will never have one, but looking at the implementation of decr(), before that we're going to end up doing unsafe things to the refcount. In the worst case, maybe this could cause the ref count to not get incremented enough, which would lead to a UAF down the line. I'm not sure how easy it would be to exploit that.
Updated•4 years ago
|
Comment 12•3 years ago
|
||
We removed State::PermissionChallenge entirely as part of bug 1354500. :decoder, can you confirm this to be no issue anymore on the fuzzer side, too?
| Reporter | ||
Comment 13•3 years ago
|
||
I have not seen this bug anymore in any of the fuzzing runs, so I assume it was fixed by the removal.
Updated•2 years ago
|
Description
•