Crash [@ mozilla::EventStateManager::PreHandleEvent]
Categories
(Core :: DOM: Events, defect)
Tracking
()
People
(Reporter: decoder, Assigned: smaug)
Details
(Keywords: crash, sec-other, testcase, Whiteboard: [post-critsmash-triage][adv-main101-])
Crash Data
Attachments
(2 files)
In experimental IPC fuzzing, we found the following crash on mozilla-central revision 4d80f4e1809a+:
=================================================================
==1698==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000038 (pc 0x7fffd7e0c7db bp 0x7fffffcaa580 sp 0x7fffffcaa100 T0)
#0 0x7fffd7e0c7db in mozilla::EventStateManager::PreHandleEvent(nsPresContext*, mozilla::WidgetEvent*, nsIFrame*, nsIContent*, nsEventStatus*, nsIContent*) dom/events/EventStateManager.cpp:784:29
#1 0x7fffe0145d42 in mozilla::PresShell::EventHandler::DispatchEvent(mozilla::EventStateManager*, mozilla::WidgetEvent*, bool, nsEventStatus*, nsIContent*) layout/base/PresShell.cpp:8163:39
#2 0x7fffe0132dfc in mozilla::PresShell::EventHandler::HandleEventWithCurrentEventInfo(mozilla::WidgetEvent*, nsEventStatus*, bool, nsIContent*) layout/base/PresShell.cpp:8132:17
#3 0x7fffe013111b in mozilla::PresShell::EventHandler::HandleEventUsingCoordinates(nsIFrame*, mozilla::WidgetGUIEvent*, nsEventStatus*, bool) layout/base/PresShell.cpp:7050:30
#4 0x7fffe012b4db in mozilla::PresShell::EventHandler::HandleEvent(nsIFrame*, mozilla::WidgetGUIEvent*, bool, nsEventStatus*) layout/base/PresShell.cpp:6853:12
#5 0x7fffe01277df in mozilla::PresShell::HandleEvent(nsIFrame*, mozilla::WidgetGUIEvent*, bool, nsEventStatus*) layout/base/PresShell.cpp:6796:23
#6 0x7fffded60bd3 in nsViewManager::DispatchEvent(mozilla::WidgetGUIEvent*, nsView*, nsEventStatus*) view/nsViewManager.cpp:685:18
#7 0x7fffded5ff1a in nsView::HandleEvent(mozilla::WidgetGUIEvent*, bool) view/nsView.cpp:1129:9
#8 0x7fffdf47d3f1 in mozilla::widget::HeadlessWidget::DispatchEvent(mozilla::WidgetGUIEvent*, nsEventStatus&) widget/headless/HeadlessWidget.cpp:0:0
#9 0x7fffdedac63d in nsBaseWidget::ProcessUntransformedAPZEvent(mozilla::WidgetInputEvent*, mozilla::layers::APZEventResult const&) widget/nsBaseWidget.cpp:908:3
#10 0x7fffdedb0af7 in nsBaseWidget::DispatchInputEvent(mozilla::WidgetInputEvent*) widget/nsBaseWidget.cpp:1087:31
#11 0x7fffdf47dd96 in mozilla::widget::HeadlessWidget::SynthesizeNativeMouseEvent(mozilla::gfx::IntPointTyped<mozilla::LayoutDevicePixel>, nsIWidget::NativeMouseMessage, mozilla::MouseButton, nsIWidget::Modifiers, nsIObserver*) widget/headless/HeadlessWidget.cpp:460:3
#12 0x7fffdca5a51c in mozilla::dom::BrowserParent::RecvSynthesizeNativeMouseEvent(mozilla::gfx::IntPointTyped<mozilla::LayoutDevicePixel> const&, unsigned int const&, short const&, unsigned int const&, unsigned long const&) dom/ipc/BrowserParent.cpp:1863:13
#13 0x7fffdcea86e7 in mozilla::dom::PBrowserParent::OnMessageReceived(IPC::Message const&) objdir-ff-asan/ipc/ipdl/PBrowserParent.cpp:5074:57
#14 0x7fffdd1ae863 in mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) objdir-ff-asan/ipc/ipdl/PContentParent.cpp:6989:32
#15 0x7fffcccdb960 in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) ipc/glue/MessageChannel.cpp:1830:25
#16 0x7fffcccd488a in mozilla::ipc::MessageChannel::DispatchMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message&&) ipc/glue/MessageChannel.cpp:1752:9
#17 0x7fffcccd66d8 in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::ipc::MessageChannel::MessageTask&) ipc/glue/MessageChannel.cpp:1474:3
#18 0x7fffcccd8d00 in mozilla::ipc::MessageChannel::MessageTask::Run() ipc/glue/MessageChannel.cpp:1588:14
#19 0x7fffc93f338b in mozilla::RunnableTask::Run() xpcom/threads/TaskController.cpp:467:16
#20 0x7fffc934b69b in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) xpcom/threads/TaskController.cpp:775:26
#21 0x7fffc9344869 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) xpcom/threads/TaskController.cpp:611:15
#22 0x7fffc9345ff1 in mozilla::TaskController::ProcessPendingMTTask(bool) xpcom/threads/TaskController.cpp:390:36
#23 0x7fffc93cd0ff in mozilla::TaskController::InitializeInternal()::$_0::operator()() const xpcom/threads/TaskController.cpp:124:37
#24 0x7fffc93cd0ff in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run() objdir-ff-asan/dist/include/nsThreadUtils.h:531:5
#25 0x7fffc93936c1 in nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1173:16
#26 0x7fffc93ae188 in NS_ProcessNextEvent(nsIThread*, bool) xpcom/threads/nsThreadUtils.cpp:467:10
#27 0x7fffcccea796 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:85:21
#28 0x7fffcc928f4f in MessageLoop::RunInternal() ipc/chromium/src/base/message_loop.cc:331:10
[...]
I haven't figured out yet what exactly is null here, but I'll investigate further and it keeps reproducing fairly frequently, so we should be able to validate a potential fix quickly. If you can spot earlier what might be wrong, let me know (this is in IPC fuzzing where we just send random messages to the parent directly. S-s only for the parent bug.
Reporter | ||
Comment 1•2 years ago
|
||
Comment 2•2 years ago
|
||
A general question on IPC fuzzing: Would it be possible to get the parameters of the IPC message that caused the crash?
Here dragevent
seems to be nullptr
. I see no attempt to validate the IPC parameters in BrowserParent::RecvSynthesizeNativeMouseEvent
, so this happens implicitly during processing, it seems. It is probably a design decision to make, if we want to replicate all possible combinations of flags for a-priori verification or if we should just better harden the processing.
Reporter | ||
Comment 3•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #2)
A general question on IPC fuzzing: Would it be possible to get the parameters of the IPC message that caused the crash?
Yes, it would even be possible to reproduce this at some point. Right now we don't have that ability yet, but I do have the original data that caused the crash and I could maybe write some code that deciphers it back into the IPC messages and their parameters.
Here
dragevent
seems to benullptr
. I see no attempt to validate the IPC parameters inBrowserParent::RecvSynthesizeNativeMouseEvent
, so this happens implicitly during processing, it seems. It is probably a design decision to make, if we want to replicate all possible combinations of flags for a-priori verification or if we should just better harden the processing.
I also looked into this and got confused because the event is created here apparently:
This code takes the address of a stack object, so that seems weird to me when dispatching an event (something I would assume that is async? but not sure).
Comment 4•2 years ago
|
||
(In reply to Christian Holler (:decoder) from comment #3)
I also looked into this and got confused because the event is created here apparently:
This code takes the address of a stack object, so that seems weird to me when dispatching an event (something I would assume that is async? but not sure).
I think the naming is just odd here. The nsIWidget::DispatchEvent
implementatiosn I looked at do just directly call some HandleEvent
, as seen here on the stack. So I do not think the reference to the stack object should ever bite us here.
Reporter | ||
Comment 5•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #4)
I think the naming is just odd here. The
nsIWidget::DispatchEvent
implementatiosn I looked at do just directly call someHandleEvent
, as seen here on the stack. So I do not think the reference to the stack object should ever bite us here.
It feels like a potential footgun at least, if all of the caller implementations have to be aware of this, but maybe this can be statically enforced somehow.
However, if this is all true, then I wonder how the event pointer could be null at all. Unless I mixed something up and it isn't actually this pointer at all.
Comment 6•2 years ago
|
||
(In reply to Christian Holler (:decoder) from comment #5)
(In reply to Jens Stutte [:jstutte] from comment #4)
It feels like a potential footgun at least, if all of the caller implementations have to be aware of this, but maybe this can be statically enforced somehow.
Well, I think it is a called function's responsibility to ensure to make copies or whatever in case they really want to dispatch a reference down the road? Passing a pointer to a stack object would seem worse to me than passing a reference?
However, if this is all true, then I wonder how the event pointer could be null at all. Unless I mixed something up and it isn't actually this pointer at all.
virtual DragEvent* AsDragEvent() { return nullptr; }
is probably the source of this. I assume we somehow managed to create a non-drag event that carries eDragOver
state such that aEvent->AsDragEvent();
returns nullptr
.
Comment 7•2 years ago
|
||
So the right hardening here would be to always check the result of AsXXXEvent()
before using it. In fact I see quite some asserts around this, but those do not help against IPC fuzzing on opt builds.
Updated•2 years ago
|
Reporter | ||
Comment 8•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #6)
virtual DragEvent* AsDragEvent() { return nullptr; }
is probably the source of this. I assume we somehow managed to create a non-drag event that carrieseDragOver
state such thataEvent->AsDragEvent();
returnsnullptr
.
Ah right, makes sense. Thanks!
So the right hardening here would be to always check the result of AsXXXEvent() before using it. In fact I see quite some asserts around this, but those do not help against IPC fuzzing on opt builds.
Yea, we should probably do some error handling instead (e.g. returning IPC errors). If that's strictly not possible, maybe they should be release asserts before just crashing.
Comment 9•2 years ago
•
|
||
Basically, each widget or internal event is created in the stack with a proper event class which is corresponding to the event message. And shouldn't do redundant validation at every point because it makes the code messy and it's worse for Quantum Flow. In normal case, even without the check of the combination, the mismatch should be detected in automated tests. In this case, SynthesizeNativeMouseEvent
should ban non-mouse event messages like eDragOver
. However, it looks like that it's never eDragOver
.
https://searchfox.org/mozilla-central/rev/711e1cea1cb584057c50aac0a26a3f7c969eda66/widget/headless/HeadlessWidget.cpp#436,438,441,444,451,460
And there is no assigner.
https://searchfox.org/mozilla-central/search?q=%3D+eDragOver&path=&case=true®exp=false
So I do not understand where the eDragOver
came from.
Assignee | ||
Comment 10•2 years ago
|
||
I assume IPC fuzzing code generates some bogus data.
Comment 11•2 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #10)
I assume IPC fuzzing code generates some bogus data.
Right, and the goal of this is to check how resilient our code is against bogus parameters sent by malicious children. As said in comment 2:
It is probably a design decision to make, if we want to replicate all possible combinations of flags for a-priori verification or if we should just better harden the processing.
but I'd assume it is easier and less error-prone to harden the processing.
Reporter | ||
Comment 12•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #2)
A general question on IPC fuzzing: Would it be possible to get the parameters of the IPC message that caused the crash?
I made a little bit of progress here and was able to reproduce the crash using the test file in a special build, but outside of the actual fuzzing environment (without VM, etc). And I was able to print the payload (IPC packet contents without the header). This is the one message that I think is causing this:
DEBUG: OnEventMessage iteration 2, EVS: 40 Payload: 156.
DEBUG: MakeTargetDecision: Protocol: PBrowser msgType: PBrowser::Msg_SynthesizeNativeMouseEvent
INFO: Replaying IPC packet with payload:
0x67 0x67 0x67 0x67 0x67 0x67 0x67 0x61 0x4A 0x00 0x01 0x00 0x1A 0x49 0x71 0x8A 0xF4 0x8C 0x52 0x64 0x3B 0xBE 0x7E 0x9A 0x41 0x00 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB
0xAB 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB 0x17 0x01 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB
0xAB 0xAB 0x64 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB
0xAB 0xAB 0xAB 0xAB 0xA0 0xAB 0xAB 0xAB 0xAB 0x4A 0x00 0x45 0x0D 0x0D 0x0D 0x0D
0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D
0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x0D 0x76
0x02 0x0D 0x35 0x31 0x98 0xA3 0x5E 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB
0xEF 0x02 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB 0xAB 0x73 0x01 0xAB 0xAB 0xAB 0xA0 0xAB
0xA0 0xAB 0xAB 0xAB 0xAB 0x4A 0x00 0x2D 0x80 0x80 0x20 0x00
DEBUG: OnEventMessage: Port 389826929292770431 880451802929655800. Actor 10
DEBUG: OnEventMessage: Flags: 2049 TxID: -1 Handles: 0
Of course I can add some logging to Msg_SynthesizeNativeMouseEvent
to see what the params actually end up to be, so we don't have to decipher it from this payload. Do you need this right now or do we already know how to fix this? It would be great to get this fixed quickly because it is by far the top-most crasher in my experiments.
Assignee | ||
Comment 13•2 years ago
|
||
"was able to reproduce the crash using the test file in a special build"
What is that special build doing?
I think it is clear what can cause this kind of issue - event being sent using certainly value for its struct type (mClass) , but unexpected value for its mMessage.
This particular case uses a testing only SynthesizeNativeMouseEvent, which creates a mouse event, but if the fuzzer passes an unexpected aNativeMessage, we may get the crash (safe such).
SynthesizeNativeMouseScrollEvent may have very similar issue too.
Why is this marked as regression? (perhaps some bot marked it?)
Reporter | ||
Comment 14•2 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #13)
"was able to reproduce the crash using the test file in a special build"
What is that special build doing?
This is a build injecting fuzzing data directly into the IPC layer.
I think it is clear what can cause this kind of issue - event being sent using certainly value for its struct type (mClass) , but unexpected value for its mMessage.
This particular case uses a testing only SynthesizeNativeMouseEvent, which creates a mouse event, but if the fuzzer passes an unexpected aNativeMessage, we may get the crash (safe such).
SynthesizeNativeMouseScrollEvent may have very similar issue too.
Thanks for the diagnosis. We preferably do not want to crash in such situations (handling the bug via IPC failure is preferable), but if we absolutely have to, it should be done through MOZ_RELEASE_ASSERT.
Why is this marked as regression? (perhaps some bot marked it?)
It is part of the automatic fuzzing template, I removed it now.
Assignee | ||
Comment 15•2 years ago
|
||
We could easily add checks to those two places at least.
One could ensure nsContentUtils::GetEventClassIDFromMessage returns the right class id.
That would be even fast.
Assignee | ||
Comment 16•2 years ago
|
||
Oh, wait, my comment is wrong. We aren't passing message, but NativeMouseMessage.
Assignee | ||
Comment 17•2 years ago
|
||
Assignee | ||
Comment 18•2 years ago
|
||
Is that patch enough for you to move forward with testing?
Comment 19•2 years ago
|
||
I'd really love to specify the range in IPDL and it'd be guaranteed that the value range is checked before calling Recv*
...
Reporter | ||
Comment 20•2 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #18)
Is that patch enough for you to move forward with testing?
Yes, this works nicely, thanks :)
Assignee | ||
Comment 21•2 years ago
|
||
(In reply to Masayuki Nakano [:masayuki] (he/him)(JST, +0900) from comment #19)
I'd really love to specify the range in IPDL and it'd be guaranteed that the value range is checked before calling
Recv*
...
Yeah, but I'm not aware of such mechanism. I think we could land the patch and I'll ask the IPC folks about more generic fix so that
enum values would be guaranteed to be reasonable.
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 22•2 years ago
|
||
Ah, I had totally forgotten ContiguousEnumSerializer and such. If wanted we could change the ipdl to pass enum and then have special serializer for it. If there are more cases using the enum, then yes. Otherwise perhaps not worth it.
Updated•2 years ago
|
Comment 23•2 years ago
|
||
don't even try to handle bogus messages in testing methods, r=masayuki
https://hg.mozilla.org/integration/autoland/rev/fac368205e5da0e545428c3336ff07b203baa795
https://hg.mozilla.org/mozilla-central/rev/fac368205e5d
Updated•2 years ago
|
Updated•2 years ago
|
Comment 24•2 years ago
|
||
The patch landed in nightly and beta is affected.
:smaug, is this bug important enough to require an uplift?
If not please set status_beta
to wontfix
.
For more information, please visit auto_nag documentation.
Updated•2 years ago
|
Updated•2 years ago
|
Updated•1 year ago
|
Description
•