Closed Bug 1402218 Opened 7 years ago Closed 7 years ago

stylo: AddressSanitizer: use-after-poison on address @ GetStateBits nsAbsoluteContainingBlock::Reflow

Categories

(Core :: Layout, defect, P2)

57 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 + fixed
firefox58 + fixed

People

(Reporter: bc, Assigned: bzbarsky)

References

(Blocks 1 open bug, )

Details

(5 keywords, Whiteboard: [post-critsmash-triage])

Crash Data

Attachments

(5 files)

Attached file Linux Opt Asan Log
1. https://www.marcaria.com/ws/en/trademarks/nepal-trademark-registration-application

2. Linux opt-asan

==4842==ERROR: AddressSanitizer: use-after-poison on address 0x6250005839f0 at pc 0x7f52a5b73399 bp 0x7ffdd9435330 sp 0x7ffdd9435328
READ of size 8 at 0x6250005839f0 thread T0 (Web Content)
    #0 0x7f52a5b73398 in GetStateBits /builds/worker/workspace/build/src/layout/generic/nsIFrame.h:1991:46
    #1 0x7f52a5b73398 in nsAbsoluteContainingBlock::Reflow(nsContainerFrame*, nsPresContext*, mozilla::ReflowInput const&, nsReflowStatus&, nsRect const&, nsAbsoluteContainingBlock::AbsPosReflowFlags, nsOverflowAreas*) /builds/worker/workspace/build/src/layout/generic/nsAbsoluteContainingBlock.cpp:129
    #2 0x7f52a5b8c677 in nsBlockFrame::Reflow(nsPresContext*, mozilla::ReflowOutput&, mozilla::ReflowInput const&, nsReflowStatus&) /builds/worker/workspace/build/src/layout/generic/nsBlockFrame.cpp:1452:26
    #3 0x7f52a5d3cea9 in nsLineLayout::ReflowFrame(nsIFrame*, nsReflowStatus&, mozilla::ReflowOutput*, bool&) /builds/worker/workspace/build/src/layout/generic/nsLineLayout.cpp:921:13
    #4 0x7f52a5d3b024 in nsInlineFrame::ReflowInlineFrame(nsPresContext*, mozilla::ReflowInput const&, nsInlineFrame::InlineReflowInput&, nsIFrame*, nsReflowStatus&) /builds/worker/workspace/build/src/layout/generic/nsInlineFrame.cpp:800:15
Attached file Linux Debug Log
1. https://www.marcaria.com/ws/en/trademarks/nepal-trademark-registration-application
2. Linux Debug

Assertion failure: (GetOverflowOutOfFlows() && GetOverflowOutOfFlows()->ContainsFrame(aFloat)) || (GetPushedFloats() && GetPushedFloats()->ContainsFrame(aFloat)) (aFloat is not our child or on an unexpected frame list), at /builds/worker/workspace/build/src/layout/generic/nsBlockFrame.cpp:5540
#01: nsBlockFrame::DoRemoveOutOfFlowFrame [layout/generic/nsBlockFrame.cpp:5590]
#02: nsBlockFrame::RemoveFrame [layout/generic/nsBlockFrame.cpp:5363]
#03: nsFrameManager::RemoveFrame [layout/base/nsFrameManager.cpp:536]
#04: nsPlaceholderFrame::DestroyFrom [layout/generic/nsPlaceholderFrame.cpp:172]
#05: nsBlockFrame::DoRemoveFrame [layout/generic/nsBlockFrame.cpp:6028]
Attached file Windows 32bit Opt Log
1. https://www.marcaria.com/ws/en/trademarks/nepal-trademark-registration-application

2. Operating system: Windows NT
                  6.1.7601 Service Pack 1
CPU: x86
     GenuineIntel family 6 model 45 stepping 2
     2 CPUs

GPU: UNKNOWN

Crash reason:  EXCEPTION_ACCESS_VIOLATION_READ
Crash address: 0xfffffffff0de812f
Assertion: Unknown assertion type 0x00000000
Process uptime: 6 seconds

Thread 0 (crashed)
 0  xul.dll!mozilla::ReflowInput::ReflowInput(nsPresContext *,mozilla::ReflowInput const &,nsIFrame *,mozilla::LogicalSize const &,mozilla::LogicalSize const *,unsigned int) [ReflowInput.cpp:b14c75b83d02 : 223 + 0xf]
    eip = 0x6362ee92   esp = 0x002ba1e8   ebp = 0x002ba1f8   ebx = 0x002ba808
    esi = 0x3fffffff   edi = 0x002ba224   eax = 0xf0de7fff   ecx = 0x089b94c0
    edx = 0x002ba301   efl = 0x00010293
    Found by: given as instruction pointer in context
 1  xul.dll!nsAbsoluteContainingBlock::ReflowAbsoluteFrame(nsIFrame *,nsPresContext *,mozilla::ReflowInput const &,nsRect const &,nsAbsoluteContainingBlock::AbsPosReflowFlags,nsIFrame *,nsReflowStatus &,nsOverflowAreas *) [nsAbsoluteContainingBlock.cpp:b14c75b83d02 : 679 + 0x2c]
    eip = 0x6365cf8b   esp = 0x002ba200   ebp = 0x002ba3dc
    Found by: call frame info
 2  xul.dll!nsAbsoluteContainingBlock::Reflow(nsContainerFrame *,nsPresContext *,mozilla::ReflowInput const &,nsReflowStatus &,nsRect const &,nsAbsoluteContainingBlock::AbsPosReflowFlags,nsOverflowAreas *) [nsAbsoluteContainingBlock.cpp:b14c75b83d02 : 166 + 0x16]
    eip = 0x6365b6a0   esp = 0x002ba3e4   ebp = 0x002ba4b0
    Found by: call frame info
 3  xul.dll!nsBlockFrame::Reflow(nsPresContext *,mozilla::ReflowOutput &,mozilla::ReflowInput const &,nsReflowStatus &) [nsBlockFrame.cpp:b14c75b83d02 : 1452 + 0x3e]
    eip = 0x6365c0d1   esp = 0x002ba4b8   ebp = 0x002ba780
    Found by: call frame info
 4  xul.dll!nsLineLayout::ReflowFrame(nsIFrame *,nsReflowStatus &,mozilla::ReflowOutput *,bool &) [nsLineLayout.cpp:b14c75b83d02 : 921 + 0x15]
    eip = 0x63683d03   esp = 0x002ba788   ebp = 0x002ba91c
    Found by: call frame info
 5  xul.dll!nsInlineFrame::ReflowInlineFrame(nsPresContext *,mozilla::ReflowInput const &,nsInlineFrame::InlineReflowInput &,nsIFrame *,nsReflowStatus &) [nsInlineFrame.cpp:b14c75b83d02 : 800 + 0x5]
    eip = 0x63684f61   esp = 0x002ba924   ebp = 0x002ba940
    Found by: call frame info

exploitable rates the 32bit crash as "high"
Crash Signature: [@ mozilla::ReflowInput::ReflowInput]
Group: core-security → layout-core-security
Priority: -- → P3
This was an absolute bear to narrow down because of inconsistent reproduction. I finally got a reasonable consistent STR by loading the page in 5-6 tabs at a time and reloading a few times if needed. I eventually got down to the below Servo commit for the regressing commit.

https://hg.mozilla.org/integration/autoland/rev/49d106783eec

I've verified multiple times that rev the parent rev (08d59706aac1) doesn't reproduce the issue.
Has Regression Range: --- → yes
Flags: needinfo?(emilio)
Summary: AddressSanitizer: use-after-poison on address @ GetStateBits nsAbsoluteContainingBlock::Reflow → servo: AddressSanitizer: use-after-poison on address @ GetStateBits nsAbsoluteContainingBlock::Reflow
Priority: P3 → P2
Summary: servo: AddressSanitizer: use-after-poison on address @ GetStateBits nsAbsoluteContainingBlock::Reflow → stylo: AddressSanitizer: use-after-poison on address @ GetStateBits nsAbsoluteContainingBlock::Reflow
Emilio has a lot on his plate for tomorrow. TYLin, given that this is a UAF in block reflow, can you have a look today?
Flags: needinfo?(emilio) → needinfo?(tlin)
I just have a brief look at this issue. It seems the frame being removed from float list is not a float frame at all...
Inside nsBlockFrame::DoRemoveOutOfFlowFrame, aFrame is a frame inside its parent's absolute list, but its style display shows it is not absolute, so the method takes the other route to remove it as a float.

I suspect this is probably because we assign an incorrect style context to the frame somehow, or we incorrectly changes the content of the style struct assigned to the frame.
Note that this is reliably reproducible when STYLO_THREADS=1 or reload the page in a background tab (which is basically STYLO_THREADS=1 in that tab).
It is still not completely clear to me what happens there, but it seems there are first-line reparenting involves.

This is basically because someone is assigning a out-of-flow frame with a in-flow style context. I put an assertion in DidSetStyleContext to assert that aOldStyleContext->StyleDispay()->mPosition == StyleDisplay()->mPosition, and this assertion is violated when this happens. The stack for this assertion is:
> Assertion failure: aOldStyleContext->StyleDisplay()->mPosition == StyleDisplay()->mPosition, at /layout/generic/nsFrame.cpp:894
> #01: nsIFrame::SetStyleContext (\layout\generic\nsiframe.h:772)
> #02: mozilla::ServoRestyleManager::DoReparentStyleContext (\layout\base\servorestylemanager.cpp:1583)
> #03: mozilla::ServoRestyleManager::DoReparentStyleContext (\layout\base\servorestylemanager.cpp:1491)
> #04: mozilla::ServoRestyleManager::ReparentFrameDescendants (\layout\base\servorestylemanager.cpp:1625)
> #05: mozilla::ServoRestyleManager::DoReparentStyleContext (\layout\base\servorestylemanager.cpp:1616)
> #06: mozilla::ServoRestyleManager::ReparentStyleContext (\layout\base\servorestylemanager.cpp:1453)
> #07: mozilla::ServoRestyleManager::ProcessPostTraversal (\layout\base\servorestylemanager.cpp:958)
> #08: mozilla::ServoRestyleManager::ProcessPostTraversal (\layout\base\servorestylemanager.cpp:920)
> #09: mozilla::ServoRestyleManager::ProcessPostTraversal (\layout\base\servorestylemanager.cpp:920)
> #10: mozilla::ServoRestyleManager::ProcessPostTraversal (\layout\base\servorestylemanager.cpp:920)
> #11: mozilla::ServoRestyleManager::ProcessPostTraversal (\layout\base\servorestylemanager.cpp:920)
> #12: mozilla::ServoRestyleManager::DoProcessPendingRestyles (\layout\base\servorestylemanager.cpp:1121)
> #13: mozilla::PresShell::DoFlushPendingNotifications (\layout\base\presshell.cpp:4170)

The position is changed from "absolute" (2) to "static" (0).

The grandparent of the problematic frame above is a nsFirstLineFrame. The problematic frame itself is an nsBlockFrame whose content is a table element.

I searched its stylesheet, and there is only one :first-letter rule which is:
> div.content div.sidebar ul>li:first-line {}

There is pretty much just one element met all the conditions, which is the top-level <table> element inside <li class="attorney">.

It is not clear to me why its position is changed from absolute to static. When the page loads successfully, I don't see any rule matching this element contains position declaration.
Attached file testcase
OK I eventually figure out a simplified testcase based on my analysis in comment 8. This reliably crashes. Although the assertion and stack isn't completely identical to that site, but it also involves assigning an in-flow style context to an out-of-flow frame, so I believe this is caused by the same root cause as the given website.

bz, since first-line involves here, could you have a look?
Flags: needinfo?(tlin) → needinfo?(bzbarsky)
I'll look for sure tomorrow, but that last testcase looks like we should have a pending reframe of the id="test" thing.  Normally we leave the old style context in place when that happens.

Then when we unwind out to the parent with a first-line style we do style context reparenting.  And in theory that should work correctly, because we just compute a style for the same rulenode, with a different parent.

But I just realized that while this works correctly in Gecko, in stylo a rulenode does NOT actually capture all the styles for an element.  In particular, I bet when we reparent we end up using the new inline style, not the old one.  So we get a style context with "position:static".

Obvious options are to either not reparent things that are due to be reframed anyway (how to detect this?) or to make ServoStyleSet::ReparentStyleContext actually work correctly and not pick up changes to styles due to not all styles being represented in the ruletree.
Flags: needinfo?(bzbarsky)
No, looks like servo's ruletree stores the actual PropertyDeclarationBlock for inline style.  So I'm really not sure why we end up with the wrong style there; will debug in the morning.
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #11)
> No, looks like servo's ruletree stores the actual PropertyDeclarationBlock
> for inline style.  So I'm really not sure why we end up with the wrong style
> there; will debug in the morning.

We store the declaration block, but we mutate it instead of cloning it, I believe. So what you're describing in comment 10 seems plausible to me.
Flags: needinfo?(emilio)
(In reply to Emilio Cobos Álvarez [:emilio] from comment #12)
> We store the declaration block, but we mutate it instead of cloning it, I
> believe. So what you're describing in comment 10 seems plausible to me.

I suspect this may not be the case since bug 1361938... I need to dig into it.
Sorry for the delay getting back to this. There's nothing cloning the style attribute when it gets mutated, we mark it as dirty instead. I guess we may as well just clone it instead...
I guess fixing this that way we may get rid of the Locked<PropertyDeclarationBlock>s, which would be nice and would allow me to do a servo refactor that I've wanted to do for a long-ish time.

Boris, I can take this if you want.
Emilio and I just talked about this.  He's going to file a followup on the refactor he wants to make, but the proximate fix here is likely to make nsDOMCSSDeclaration::RemovePropertyInternal look more like nsDOMCSSDeclaration::ModifyDeclaration in terms of its handling of mutability/dirtiness.  That should make sure that the servo-side rulenodes don't mutate in place on inline style change and we get new rulenodes instead, etc.

In particular, this issue is specific to _removing_ styles via inline style (setting to empty string or calling removeProperty).
Flags: needinfo?(emilio)
MozReview-Commit-ID: 6LRjLU3QQok
Attachment #8913262 - Flags: review?(emilio)
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Flags: needinfo?(bzbarsky)
Comment on attachment 8913262 [details] [diff] [review]
Make sure to clone our CSSDeclaration as needed when removing properties

Review of attachment 8913262 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8913262 - Flags: review?(emilio) → review+
See Also: → 1404006
I opened bug 1404006 with the refactor I want to eventually do. That patch is safer to uplift of course.
Comment on attachment 8913262 [details] [diff] [review]
Make sure to clone our CSSDeclaration as needed when removing properties

[Security approval request comment]
How easily could an exploit be constructed based on the patch?  Probably fairly easily once someone tries the crashtest.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?  The crashtest does.  The patch does not, apart from it being clear that it's about removing properties from style declarations.  We could land the patch sans test for now...

Which older supported branches are affected by this flaw?  57.

If not all supported branches, which bug introduced the flaw?  stylo.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?  This should apply cleanly to 57.

How likely is this patch to cause regressions; how much testing does it need?  I think this is pretty safe, personally, but some web compat testing would be nice.  Ideally we would just simul-land this on trunk and beta 57.

Approval Request Comment
[Feature/Bug causing the regression]: stylo
[User impact if declined]: Crashes in some cases when browsing to websites.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: No.
[Needs manual test from QE? If yes, steps to reproduce]: Yes, comment 0.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: See above.
[Why is the change risky/not risky?]:
[String changes made/needed]: None.
Attachment #8913262 - Flags: sec-approval?
Attachment #8913262 - Flags: approval-mozilla-beta?
Please make sure to test the site and see if that ia fixed as well. The description in comment 16 sounds very specific to my simplified testcase, so I'm not confident enough it is really a fix for the site as well then.
I spent a while trying to reproduce on the site, fwiw, without my patch but with the assert from comment 8 added, with no success.  :(
make sure to use opt asan builds.
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #22)
> I spent a while trying to reproduce on the site, fwiw, without my patch but
> with the assert from comment 8 added, with no success.  :(

Did you set STYLO_THREADS=1? It seems to me that it is much more reproducible with that.
Or probably I can test that. It was very reproducible for me.
Comment on attachment 8913262 [details] [diff] [review]
Make sure to clone our CSSDeclaration as needed when removing properties

sec-approval+.
Attachment #8913262 - Flags: sec-approval? → sec-approval+
OK I checked the patch with the website in comment 0.

Without the patch applied, I can easily reproduce the assertion mentioned in comment 1 with the website with STYLO_THREADS=1 and a shift-refresh. With the patch applied, that is no longer reproducible.
And, this is a use-after-poison. With the assessment dveditz made in bug 1401420 comment 26, I think this is a sec-low, and we should be able to land the patch and crashtest without sec-approval.
https://hg.mozilla.org/mozilla-central/rev/76b6ae2fc6eb
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Please either request uplift or set 57 status to wontfix.
Flags: needinfo?(bzbarsky)
bz has requested uplift in comment 20.
Flags: needinfo?(bzbarsky)
(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #32)
> bz has requested uplift in comment 20.

Whoops, sorry.
See Also: → 1404324
Comment on attachment 8913262 [details] [diff] [review]
Make sure to clone our CSSDeclaration as needed when removing properties

Sec high, taking it.
Should be in 57b5
Attachment #8913262 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Group: layout-core-security → core-security-release
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Group: core-security-release
See Also: → 1841128
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: