Closed Bug 1433671 Opened 6 years ago Closed 6 years ago

Crash in mozilla::AccessibleCaretManager::UpdateCarets

Categories

(Core :: CSS Parsing and Computation, defect)

Unspecified
Android
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr52 --- wontfix
firefox58 --- wontfix
firefox59 --- fixed
firefox60 --- fixed

People

(Reporter: calixte, Assigned: emilio)

References

(Blocks 1 open bug)

Details

(Keywords: crash, csectype-uaf, sec-moderate, Whiteboard: [post-critsmash-triage][adv-main59-])

Crash Data

Attachments

(3 files, 2 obsolete files)

This bug was filed from the Socorro interface and is
report bp-6e641622-15ae-496e-b1b4-162250180127.
=============================================================

Top 10 frames of crashing thread:

0 libxul.so mozilla::AccessibleCaretManager::UpdateCarets layout/base/AccessibleCaretManager.cpp:221
1 libxul.so mozilla::AccessibleCaretManager::OnScrollEnd layout/base/AccessibleCaretManager.cpp:700
2 libxul.so mozilla::AccessibleCaretEventHub::PostScrollState::OnScrollEnd layout/base/AccessibleCaretEventHub.cpp:294
3 libxul.so nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:701
4 libxul.so nsTimerEvent::Run xpcom/threads/TimerThread.cpp:286
5 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040
6 libxul.so NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:517
7 libxul.so mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:97
8 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:326
9 libxul.so nsBaseAppShell::Run widget/nsBaseAppShell.cpp:157

=============================================================

There are 2 crashes (from are 2 installations) in nightly 60 with buildid 20180125104838. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1431474.

[1] https://hg.mozilla.org/mozilla-central/rev?node=00ae6912038d
Flags: needinfo?(emilio)
Ok, so this makes sense, but I can't know how it was sound before that patch.

So the accessible caret manager is owned by the event hub, that is owned by the shell... Flushing pending notifications can run arbitrary script, that can call Destroy() on the pres shell (and thus tear down the accessible caret event hub, and the manager with him).

So the crash here is that when we're checking IsTerminated we're no longer alive.

TYLin, do you know how was this supposed to be sound? I can fix this (it's a matter of holding a reference to the caret manager in FlushLayout, and return whether we're terminated instead of checking it back). But I would like to understand whether I'm right in thinking that the code before bug 1431471 was equally unsound (though maybe less often).
Flags: needinfo?(aethanyc)
The crash address for the 2 crashes is 0xe5e5e621.
Group: core-security
Keywords: csectype-uaf
Flags: needinfo?(emilio)
Yeah, I mean... This is clearly a UAF. I'm moderately sure the code wasn't sound before my patch either.

The accessible caret manager is owned by the event hub, that is owned by the
shell.
    
Flushing pending notifications can run arbitrary script, that can call Destroy()
on the pres shell (and thus tear down the accessible caret event hub, and the
manager with him).
    
I don't know why before my change this wasn't crashing badly, but the code as it
was just doesn't look sound to me at all either (maybe I'm misunderstanding
something and I should just revert that patch and give up on having nice
invariants during our flushes..., but I don't think it's the case).

This also adds some sanity-checking that we don't die under our flush.
Assignee: nobody → emilio
Status: NEW → ASSIGNED
Attachment #8946073 - Flags: review?(bzbarsky)
Comment on attachment 8946073 [details] [diff] [review]
Make AccessibleCaretEventManager flushes a bit more sound.

Review of attachment 8946073 [details] [diff] [review]:
-----------------------------------------------------------------

From the call stack, it seems that the crash happens in the timer callback AccessibleCaretEventHub::FireScrollEnd().

I think you're right. Nothing guarantees AccessibleCaretEventHub is alive in the scope of the timer call back if PresShell::Destroy() is called. In other cases, I had added asserts in all the entry points of AccessibleCaretEventHub to ensure it's refcount is > 1 (bug 1248847), i.e. the caller needs to hold a reference to AccessibleCaretEventHub before calling the hub's public APIs.

By my original design, AccessibleCaretManager shouldn't need to worry about the lifetime of AccessibleCaretEventHub if it needs to flush layout, so I'm thinking about changing [1] to 
  RefPtr<AccessibleCaretEventHub> self =
    static_cast<AccessibleCaretEventHub*>(aAccessibleCaretEventHub);

By doing this, we don't need to add a ref to the event hub in AccessibleCaretManager::FlushLayout(). What do you think?

Other parts of this patch look good to me.

[1] https://dxr.mozilla.org/mozilla-central/rev/e22e2c9eb81686e958a9448ea3d1e8cd766743e2/layout/base/AccessibleCaretEventHub.cpp#773

::: layout/base/AccessibleCaretManager.cpp
@@ +982,5 @@
>      selection->Modify(NS_LITERAL_STRING("extend"),
>                        aDirection,
>                        NS_LITERAL_STRING("character"));
> +    // FIXME(emilio): Is this sound? What guarantees that the event hub is
> +    // alive?

See my explanation above.

I think this will be called when selecting a word, and MOZ_ASSERT(mRefCnt.get() > 1) in 
AccessibleCaretEventHub::HandleEvent() guarantees this

@@ +1038,5 @@
>    }
>  
> +  // Make sure that the object that owns us doesn't go away during the flush.
> +  RefPtr<AccessibleCaretEventHub> hub =
> +    mPresShell->GetAccessibleCaretEventHub();

We don't need this if we add ref to event hub in FireScrollEnd(). 

(And it would be nice if the manager doesn't need to worry about the lifetime of the event hub)

@@ +1424,4 @@
>  {
>    if (!mPresShell) {
>      return;
>    }

Nit: The beginning of the FlushLayout() checks mPresShell, so this can be removed.

::: layout/base/AccessibleCaretManager.h
@@ +199,4 @@
>  
>    void ClearMaintainedSelection() const;
>  
> +  // Returns whether the presshell we're attached to isn't gone.

Nit: To be precise, AccessibleCaretManager is attached to AccessibleCaretEventHub, so perhaps change the wording to something like:

Returns whether mPresShell we're holding is still valid.
Flags: needinfo?(aethanyc)
Ok, that makes perfect sense. Thanks Ting-Yu!
Attached patch Updated patch. (obsolete) — Splinter Review
FireLongTap had the same issue.
Attachment #8946148 - Flags: review?(bzbarsky)
Attachment #8946073 - Attachment is obsolete: true
Attachment #8946073 - Flags: review?(bzbarsky)
Comment on attachment 8946148 [details] [diff] [review]
Updated patch.

Review of attachment 8946148 [details] [diff] [review]:
-----------------------------------------------------------------

I feel I could r+ this change, but maybe Boris will see something that I missed.
Attachment #8946148 - Flags: review?(bzbarsky) → review+
Yeah, your review is enough for me I think.
Even though the manifestation as crashes is recent, I think the issue is present everywhere.
Comment on attachment 8946148 [details] [diff] [review]
Updated patch.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: crashes on android devices and touch screens.
Fix Landed on Version: not landed yet
Risk to taking this patch (and alternatives if risky): not much risk I'd say, read below for a more detailed assessment.
String or UUID changes made by this patch: none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.

Approval Request Comment
[Feature/Bug causing the regression]: not a recent regression by any means.
[User impact if declined]: crashes / subtle security bugs.
[Is this code covered by automated tests?]: no
[Has the fix been verified in Nightly?]: no
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: not risky
[Why is the change risky/not risky?]: Ensures the object doesn't die during its own updates and adds assertion, no other behavior change.
[String changes made/needed]: none

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Not much, I think. Killing the presshell from a particular flush is not trivial (it'd take me a huge amount of time to get something to crash here I think).

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
Kind of, the patch adds release assertions to ensure the issue doesn't happen.

Which older supported branches are affected by this flaw?
All of them.

If not all supported branches, which bug introduced the flaw?

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
They should apply cleanly, these files haven't changed much. Thus easy, and low risk.

How likely is this patch to cause regressions; how much testing does it need?
Not likely, we just take a reference to the owner of the object in the appropriate places, shouldn't change behavior except in the path where we currently crash (avoiding it).
Attachment #8946148 - Flags: sec-approval?
Attachment #8946148 - Flags: approval-mozilla-release?
Attachment #8946148 - Flags: approval-mozilla-esr52?
Attachment #8946148 - Flags: approval-mozilla-beta?
(In reply to Emilio Cobos Álvarez [:emilio] from comment #8)
> Yeah, your review is enough for me I think.

Thanks a lot for it btw TYLin! :)
See comment 10 for the approval request.
Attachment #8946148 - Attachment is obsolete: true
Attachment #8946148 - Flags: sec-approval?
Attachment #8946148 - Flags: approval-mozilla-release?
Attachment #8946148 - Flags: approval-mozilla-esr52?
Attachment #8946148 - Flags: approval-mozilla-beta?
Attachment #8946212 - Flags: review+
Attachment #8946212 - Flags: approval-mozilla-release?
Attachment #8946212 - Flags: approval-mozilla-esr52?
Attachment #8946212 - Flags: approval-mozilla-beta?
Comment on attachment 8946212 [details] [diff] [review]
Slightly modified patch to make gtest happy.

See comment 10.
Attachment #8946212 - Flags: sec-approval?
Comment on attachment 8946212 [details] [diff] [review]
Slightly modified patch to make gtest happy.

This is OK, but really we should annotate FlushLayout as MOZ_CAN_RUN_SCRIPT and then annotate up the callstack to the first place we don't control (e.g. the timer callbacks, which we could then annotate MOZ_CAN_RUN_SCRIPT_BOUNDARY).
(In reply to Emilio Cobos Álvarez [:emilio] from comment #15)
> Created attachment 8946355 [details] [diff] [review]
> Followup (just for trunk): Annotate with MOZ_CAN_RUN_SCRIPT.

I know the formatting is weird, but I just did clang-format, so I guess it's fine.
Comment on attachment 8946355 [details] [diff] [review]
Followup (just for trunk): Annotate with MOZ_CAN_RUN_SCRIPT.

Please document the MOZ_CAN_RUN_SCRIPT_BOUNDARY as to why they're needed (or at least wanted) and how we know their args (including `this`) are safe.
Attachment #8946355 - Flags: review?(bzbarsky) → review+
Group: core-security → layout-core-security
Keywords: regressionsec-moderate
Comment on attachment 8946212 [details] [diff] [review]
Slightly modified patch to make gtest happy.

Given it's sec-moderate, it doesn't need approval for trunk it seems. I'd still want to land the fix on branches, but I guess this is pretty hard to trigger.
Attachment #8946212 - Flags: sec-approval?
https://hg.mozilla.org/mozilla-central/rev/8e42b66f4656
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
The MOZ_CAN_RUN stuff caused bustage (https://hg.mozilla.org/mozilla-central/rev/5ca278d8c1af), so I had to back it out in https://hg.mozilla.org/mozilla-central/rev/fc06e8853496. ni? me to get it to land.
Flags: needinfo?(emilio)
Group: layout-core-security → core-security-release
Comment on attachment 8946212 [details] [diff] [review]
Slightly modified patch to make gtest happy.

Doesn't meet the severity to warrant uplift to a dot-release or ESR52, but let's get this on beta for 59b7 at least.

Emilio, it looks like this'll need a rebased patch before it can land too.
Attachment #8946212 - Flags: approval-mozilla-release?
Attachment #8946212 - Flags: approval-mozilla-release-
Attachment #8946212 - Flags: approval-mozilla-esr52?
Attachment #8946212 - Flags: approval-mozilla-esr52-
Attachment #8946212 - Flags: approval-mozilla-beta?
Attachment #8946212 - Flags: approval-mozilla-beta+
Attached patch Beta patch.Splinter Review
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main59-]
Landing the annotations in bug 1444375.
Flags: needinfo?(emilio)
So which bug are those for?
"belong to bug 1433671" is this bug.

(I'm having trouble to rebase a patch for beta because of that stuff)
Flags: needinfo?(aryx.bugmail)
Or perhaps emilio remember, assuming the commit had right author
Flags: needinfo?(emilio)
Bug 1444375.
Flags: needinfo?(emilio)
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: