Potential cross-process UAF with shmem as used between RemoteDecoderChild/Parent
Categories
(Core :: Audio/Video: Playback, defect, P3)
People
(Reporter: jya, Unassigned)
Seen in https://treeherder.mozilla.org/logviewer.html#?job_id=317010820&repo=try
I can't really understand why it could happen but I can see the observed behaviour.
The RemoteDecoderChild maintains a shmem pool to allocate the RemoteMediaRawData and pass it to the parent. The RemoteDecoderChild (RDC) keeps ownership of those shmem and will re-use them as soon as the RemoteDecoderParent (RDP) returns the decoded data.
Similarly, the RDP uses a shmem pool to return the decoded image/audio back to the RDC and will re-use it when a new RemoteMediaRawData to be decoded is received.
Both RDC and RDP will clear the shmem pool and de-allocate all shmem when the actor is destroyed.
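To make the pool behaviour described above concrete, here is a minimal sketch in plain C++. It is a toy model, not the Gecko code: FakeShmem and FakeShmemPool are hypothetical names, and an ordinary heap buffer stands in for a real shared-memory segment.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a shared-memory segment; not the Gecko Shmem class.
struct FakeShmem {
  explicit FakeShmem(std::size_t aSize) : mBytes(aSize) {}
  std::vector<unsigned char> mBytes;
};

// Toy pool: the owner hands out segments, takes them back for re-use, and
// drops everything it holds when the actor goes away.
class FakeShmemPool {
 public:
  // Re-use an idle segment that is large enough, otherwise allocate a new one.
  std::shared_ptr<FakeShmem> Get(std::size_t aSize) {
    for (auto it = mFree.begin(); it != mFree.end(); ++it) {
      if ((*it)->mBytes.size() >= aSize) {
        std::shared_ptr<FakeShmem> found = *it;
        mFree.erase(it);
        return found;
      }
    }
    ++mAllocations;
    return std::make_shared<FakeShmem>(aSize);
  }

  // Called once the other side has answered and the segment is idle again.
  void Release(std::shared_ptr<FakeShmem> aShmem) {
    assert(aShmem && "only real segments go back into the pool");
    mFree.push_back(std::move(aShmem));
  }

  // Models ActorDestroy: every pooled segment is de-allocated.
  void Cleanup() { mFree.clear(); }

  std::size_t Allocations() const { return mAllocations; }
  std::size_t FreeCount() const { return mFree.size(); }

 private:
  std::vector<std::shared_ptr<FakeShmem>> mFree;
  std::size_t mAllocations = 0;
};
```

In this model a second Get() after a Release() returns the same segment without a new allocation, and Cleanup() empties the pool. The real pools deal with cross-process mappings, which this sketch deliberately ignores.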
What the log is showing:
[task 2020-09-29T08:32:46.768Z] 08:32:46 INFO - GECKO(1498) | [Child 1695, RemVidChild] WARNING: Shmem was deallocated: file /builds/worker/checkouts/gecko/ipc/glue/Shmem.cpp, line 343
[task 2020-09-29T08:32:46.768Z] 08:32:46 INFO - GECKO(1498) | ###!!! [Child][DispatchAsyncMessage] Error: SHMEM_CREATED_MESSAGE Payload error: message could not be deserialized
[task 2020-09-29T08:32:46.770Z] 08:32:46 INFO - GECKO(1498) | Assertion failure: mSegment (null segment), at /builds/worker/checkouts/gecko/ipc/glue/Shmem.cpp:255
[task 2020-09-29T08:32:46.790Z] 08:32:46 INFO - Initializing stack-fixing for the first stack frame, this may take a while...
[task 2020-09-29T08:32:49.791Z] 08:32:49 INFO - JavaScript error: /builds/worker/workspace/build/tests/bin/components/httpd.js, line 2963: NS_ERROR_UNEXPECTED: Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIBinaryOutputStream.writeByteArray]
[task 2020-09-29T08:32:56.557Z] 08:32:56 INFO - GECKO(1498) | #01: mozilla::RemoteArrayOfByteBuffer::Check(unsigned long, unsigned long) const [dom/media/ipc/RemoteMediaData.cpp:32]
[task 2020-09-29T08:32:56.558Z] 08:32:56 INFO - GECKO(1498) | #02: mozilla::AlignedBuffer<float, 32> mozilla::RemoteArrayOfByteBuffer::AlignedBufferAt<float>(unsigned long) const [dom/media/ipc/RemoteMediaData.h:157]
[task 2020-09-29T08:32:56.559Z] 08:32:56 INFO - GECKO(1498) | #03: mozilla::ArrayOfRemoteAudioData::ElementAt(unsigned long) const [dom/media/ipc/RemoteMediaData.cpp:269]
[task 2020-09-29T08:32:56.559Z] 08:32:56 INFO - GECKO(1498) | #04: mozilla::RemoteAudioDecoderChild::ProcessOutput(mozilla::DecodedOutputIPDL&&) [dom/media/ipc/RemoteAudioDecoder.cpp:27]
[task 2020-09-29T08:32:56.560Z] 08:32:56 INFO - GECKO(1498) | #05: mozilla::RemoteDecoderChild::Decode(nsTArray<RefPtr<mozilla::MediaRawData> > const&)::$_2::operator()(mozilla::MozPromise<mozilla::DecodeResultIPDL, mozilla::ipc::ResponseRejectReason, true>::ResolveOrRejectValue&&) const [dom/media/ipc/RemoteDecoderChild.cpp:138]
[task 2020-09-29T08:32:56.561Z] 08:32:56 INFO - GECKO(1498) | #06: mozilla::MozPromise<mozilla::DecodeResultIPDL, mozilla::ipc::ResponseRejectReason, true>::ThenValue<mozilla::RemoteDecoderChild::Decode(nsTArray<RefPtr<mozilla::MediaRawData> > const&)::$_2>::DoResolveOrRejectInternal(mozilla::MozPromise<mozilla::DecodeResultIPDL, mozilla::ipc::ResponseRejectReason, true>::ResolveOrRejectValue&) [xpcom/threads/MozPromise.h:845]
[task 2020-09-29T08:32:56.561Z] 08:32:56 INFO - GECKO(1498) | #07: mozilla::MozPromise<mozilla::DecodeResultIPDL, mozilla::ipc::ResponseRejectReason, true>::ThenValueBase::ResolveOrRejectRunnable::Run() [xpcom/threads/MozPromise.h:411]
[task 2020-09-29T08:32:56.562Z] 08:32:56 INFO - GECKO(1498) | #08: mozilla::SimpleTaskQueue::DrainTasks() [xpcom/threads/TaskDispatcher.h:44]
[task 2020-09-29T08:32:56.563Z] 08:32:56 INFO - GECKO(1498) | #09: nsThread::DrainDirectTasks() [xpcom/threads/nsThread.cpp:1452]
[task 2020-09-29T08:32:56.563Z] 08:32:56 INFO - GECKO(1498) | #10: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1258]
[task 2020-09-29T08:32:56.564Z] 08:32:56 INFO - GECKO(1498) | #11: NS_ProcessNextEvent(nsIThread*, bool) [xpcom/threads/nsThreadUtils.cpp:513]
[task 2020-09-29T08:32:56.565Z] 08:32:56 INFO - GECKO(1498) | #12: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:303]
[task 2020-09-29T08:32:56.565Z] 08:32:56 INFO - GECKO(1498) | #13: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:334]
[task 2020-09-29T08:32:56.566Z] 08:32:56 INFO - GECKO(1498) | #14: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:310]
[task 2020-09-29T08:32:56.567Z] 08:32:56 INFO - GECKO(1498) | #15: nsThread::ThreadFunc(void*) [xpcom/threads/nsThread.cpp:444]
[task 2020-09-29T08:32:56.575Z] 08:32:56 INFO - GECKO(1498) | #16: _pt_root [nsprpub/pr/src/pthreads/ptthread.c:204]
is that somehow the RDP's shmem sent to the RDC has been deallocated while the RDC is retrieving the content.
This indicates that the RDP somehow received the "ActorDestroy" message while the RDC was still processing an earlier result.
It is possible that ActorDestroy had already been dispatched, though I don't see why, given that the RDC is obviously alive.
In the logs where this condition happens, we can see that it's usually related to a decoding error in the parent. Though this shouldn't matter, as the lifetime of the parent is controlled by the child.
In any case, what we could do instead is pass ownership of the allocated shmem as it's sent over IPC and place it in the parent or child shmem pool.
We could even save on shmem allocations here as we will be able to use the same shmem for both sending the MediaRawData to be decoded and to carry the decoded result on its way back.
All this has been rendered possible with bug 1648309 as we only create a new shmem when the previous one has been de-allocated.
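The ownership-passing idea above could look roughly like the following sketch. It is only an illustration under stated assumptions: Side, OwnedShmem and FakeShmem are hypothetical names, std::unique_ptr stands in for "whoever holds the message owns the shmem", and no real IPC is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a shared-memory segment; not the Gecko Shmem class.
struct FakeShmem {
  std::vector<unsigned char> mBytes;
};
using OwnedShmem = std::unique_ptr<FakeShmem>;

// One end of the channel (child or parent) with its own pool.
class Side {
 public:
  // Take an idle segment from the pool, or allocate if the pool is empty.
  OwnedShmem Take(std::size_t aSize) {
    OwnedShmem shmem;
    if (!mPool.empty()) {
      shmem = std::move(mPool.back());
      mPool.pop_back();
    } else {
      ++mAllocations;
      shmem = std::make_unique<FakeShmem>();
    }
    shmem->mBytes.resize(aSize);
    return shmem;
  }

  // Receiving a message makes this side the owner of the segment it carried.
  void Receive(OwnedShmem aShmem) {
    assert(aShmem);
    mPool.push_back(std::move(aShmem));
  }

  // Models ActorDestroy: only segments this side currently owns are freed.
  void Destroy() { mPool.clear(); }

  std::size_t Allocations() const { return mAllocations; }
  std::size_t Pooled() const { return mPool.size(); }

 private:
  std::vector<OwnedShmem> mPool;
  std::size_t mAllocations = 0;
};
```

With this shape, a segment allocated by the child for the input can be picked up by the parent for the decoded output, so the parent allocates nothing. A segment that is in flight is in neither pool, so tearing down one side cannot de-allocate it under the other side's feet.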
Comment 1•4 years ago
I don't think this is ever a UAF issue, but it does look like shmem allocation/deallocation can race.
Deallocating a shmem will synchronously mutate the header and set the size to 0. If a 'ShmemCreated' message is still in transit over IPC, then deserialization of this message will fail.
Allocating an unsafe shmem, sending it async to another process and then immediately deallocating it will likely fail in this way.
I think in this case, RemoteDecoderChild has sent send__delete__ at the same time as RemoteDecoderParent has sent a decode response message. The parent side receives the message first and deallocs Shmems, and then the child side receives the message and fails to deserialize.
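The allocation/deallocation race described above can be modelled in a few lines. This is a toy sketch, not the ipc/glue implementation: FakeHeader, ShmemCreatedMsg and the three functions are hypothetical, a std::shared_ptr stands in for memory mapped in both processes, and the exact validation done on deserialization is an assumption.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical header stored at the start of a shared segment.
struct FakeHeader {
  std::size_t mSize;
};

// Both processes see the same header; shared_ptr stands in for that mapping.
using Mapping = std::shared_ptr<FakeHeader>;

// Hypothetical "shmem created" notification that is still in transit.
struct ShmemCreatedMsg {
  Mapping mMapping;
  std::size_t mExpectedSize;
};

Mapping Allocate(std::size_t aSize) {
  return std::make_shared<FakeHeader>(FakeHeader{aSize});
}

// Deallocation synchronously mutates the shared header and zeroes the size.
void Deallocate(const Mapping& aMapping) { aMapping->mSize = 0; }

// Assumed check: the receiver rejects a segment whose header no longer
// matches what the message announced.
bool Deserialize(const ShmemCreatedMsg& aMsg) {
  return aMsg.mExpectedSize != 0 &&
         aMsg.mMapping->mSize == aMsg.mExpectedSize;
}
```

If Deallocate() runs on the sending side before the receiver calls Deserialize(), the message that was valid when it was sent fails to deserialize, which matches the "message could not be deserialized" error in the log.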
Comment 2•4 years ago
(In reply to Matt Woodrow (:mattwoodrow) from comment #1)
> I think in this case, RemoteDecoderChild has sent send__delete__ at the same time as RemoteDecoderParent has sent a decode response message. The parent side receives the message first and deallocs Shmems, and then the child side receives the message and fails to deserialize.
I don't think this is what is happening.
Before calling Send__delete__() we have:
https://searchfox.org/mozilla-central/source/dom/media/ipc/RemoteDecoderChild.cpp#53-57
So we can't have a decode/drain/flush or init in progress, it would trigger this assert.
RemoteDecoderChild::DestroyIPDL is called from RemoteMediaDataDecoder::Shutdown once the RemoteDecoderChild::Shutdown has completed.
Yet we are in the middle of a decode in the child when the parent is in the middle of handling decode.
So it's something else.