Closed Bug 883514 (CVE-2013-1732) Opened 11 years ago Closed 10 years ago

Global buffer overflow (read 4) at nsFloatManager::GetFlowArea() with multicol, list, floats

Categories

(Core :: Layout: Floats, defect)

23 Branch
x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla26
Tracking Status
firefox22 --- wontfix
firefox23 --- wontfix
firefox24 + verified
firefox25 + verified
firefox26 + verified
firefox-esr17 24+ verified
b2g18 24+ fixed
b2g-v1.1hd --- fixed
b2g-v1.2 --- fixed

People

(Reporter: aki.helin, Assigned: dholbert)

Details

(Keywords: csectype-bounds, sec-critical, Whiteboard: [asan][adv-main24+][adv-esr1709+])

Attachments

(9 files, 1 obsolete file)

Opening several of the attached page at once causes ASan to spot a global buffer overflow in current aurora- and beta asan builds, most of the time. The read is aligned and pretty far away from adjecent objects, so filing as a potential security issue since it's not obvious we're always reading something fixed, like padding bytes, with fixed outcome.

The original even larger testcase needed just two of them to trigger this reliably. The attached somewhat reduced page still hits this most of the time. Let me know if you have trouble reproducing, because this could be related to machine speed etc.

$ opt/firefox-aurora-asan/firefox ff-gbof-getflowarea.html{,,,,,,,,,,,,,,} 2>&1 | symbolize | c++filt
=================================================================
==14198== ERROR: AddressSanitizer global-buffer-overflow on address 0x7ff38e6b69a0 at pc 0x7ff386ed8bb1 bp 0x7fff9afa6740 sp 0x7fff9afa6738
READ of size 4 at 0x7ff38e6b69a0 thread T0
    #0 0x7ff386ed8bb0 in nsFloatManager::GetFlowArea(int, nsFloatManager::BandInfoType, int, nsRect, nsFloatManager::SavedState*) const /home/aki/src/mozilla-aurora/layout/generic/nsFloatManager.cpp:140
0x7ff38e6b69a0 is located 32 bytes to the left of global variable 'mozilla::sse_private::sse3_enabled (/home/aki/src/mozilla-aurora/objdir-ff-asan/xpcom/build/SSE.cpp)' (0x7ff38e6b69c0) of size 1
0x7ff38e6b69a0 is located 24 bytes to the right of global variable 'nsTArrayHeader::sEmptyHdr (/home/aki/src/mozilla-aurora/objdir-ff-asan/xpcom/build/nsTArray.cpp)' (0x7ff38e6b6980) of size 8
  'nsTArrayHeader::sEmptyHdr (/home/aki/src/mozilla-aurora/objdir-ff-asan/xpcom/build/nsTArray.cpp)' is ascii string ''
Shadow byte and word:
  0x1ffe71cd6d34: f9
  0x1ffe71cd6d30: 00 f9 f9 f9 f9 f9 f9 f9
More shadow bytes:
  0x1ffe71cd6d10: 04 f9 f9 f9 f9 f9 f9 f9
  0x1ffe71cd6d18: 01 f9 f9 f9 f9 f9 f9 f9
  0x1ffe71cd6d20: 04 f9 f9 f9 f9 f9 f9 f9
  0x1ffe71cd6d28: 00 00 00 f9 f9 f9 f9 f9
=>0x1ffe71cd6d30: 00 f9 f9 f9 f9 f9 f9 f9
  0x1ffe71cd6d38: 01 f9 f9 f9 f9 f9 f9 f9
  0x1ffe71cd6d40: 01 f9 f9 f9 f9 f9 f9 f9
  0x1ffe71cd6d48: 01 f9 f9 f9 f9 f9 f9 f9
  0x1ffe71cd6d50: 01 f9 f9 f9 f9 f9 f9 f9
Stats: 150M malloced (212M for red zones) by 600810 calls
Stats: 12M realloced by 34962 calls
Stats: 102M freed by 265916 calls
Stats: 0M really freed by 0 calls
Stats: 400M (102447 full pages) mmaped in 100 calls
  mmaps   by size class: 8:524256; 9:57337; 10:24570; 11:16376; 12:4096; 13:4096; 14:1280; 15:512; 16:1152; 17:128; 18:32; 19:8; 20:4;
  mallocs by size class: 8:502183; 9:54311; 10:19963; 11:14517; 12:3446; 13:3628; 14:1128; 15:394; 16:1111; 17:100; 18:17; 19:8; 20:4;
  frees   by size class: 8:202047; 9:33734; 10:12283; 11:11046; 12:2220; 13:2726; 14:536; 15:272; 16:944; 17:90; 18:11; 19:4; 20:3;
  rfrees  by size class:
Stats: malloc large: 129 small slow: 2773
==14198== ABORTING
Attached file stack
In a local m-c debug build on Linux64 I get a fatal assertion:

###!!! ABORT: bad state: 'floatCount <= mFloats.Length()', file layout/generic/nsFloatManager.cpp, line 131

(gdb) list
126       // Determine the last float that we should consider.
127       uint32_t floatCount;
128       if (aState) {
129         // Use the provided state.
130         floatCount = aState->mFloatInfoCount;
131         NS_ABORT_IF_FALSE(floatCount <= mFloats.Length(), "bad state");
132       } else {
133         // Use our current state.
134         floatCount = mFloats.Length();
135       }
(gdb) p floatCount
$10 = 1
(gdb) p this
$11 = (const nsFloatManager *) 0x7fffb3adb1c0
(gdb) p this->mFloats.Length()
$12 = 0
The assertion also occurs in FF21 and ESR17 debug builds on Linux64.
I wonder it's a regression from bug 25888?
Jet, can you find an owner for this bug please?
Flags: needinfo?(bugs)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: sec-bounty?
Summary: Global buffer overflow at nsFloatManager::GetFlowArea() → Global buffer overflow (read 4) at nsFloatManager::GetFlowArea()
dbaron: do you agree w/Mats that this might be related to bug 25888?
Flags: needinfo?(dbaron)
So far I don't see the abort locally, though it's possible that's due to patches I have locally.
(In reply to Daniel Veditz [:dveditz] from comment #4)
> dbaron: do you agree w/Mats that this might be related to bug 25888?

Is it important to know that?  I'd rather spend the time trying to fix it than figuring that out.
I don't see the ABORT in a clean tree either.
dbaron, did you open the page in multiple tabs? This won't reproduce if you open it just once. 

This trace has come up multiple times over the last week on my test machine, so the bug should still be there. The repro might depend on on build and/or moon phase though.
dbaron: if you'd like to try an ASAN build (often makes this type of problem more deterministic) you can find them at
https://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux64-asan/

There are also debug asan builds for linux.
Flags: needinfo?(dbaron)
FWIW, this seems to still reproduce in aurora built today. A few more pages were needed to reproduce reliably on a slower Core 2 Duo laptop, so it could be useful to try this with thread sanitizer.
> (In reply to Daniel Veditz [:dveditz] from comment #4)
> > dbaron: do you agree w/Mats that this might be related to bug 25888?
> 
> Is it important to know that?  I'd rather spend the time trying to fix it
> than figuring that out.

Sorry, reflexive question trying to figure out which releases might need fixing. But Mats already answered that ESR and Release channels are affected (assuming the assert represents the same bad state, which seems likely).
Rather than an unowned sec-crit I'm going to interpret comment 6 as volunteering.
Assignee: nobody → dbaron
In turn, I'm going to give it to dholbert.
Assignee: dbaron → dholbert
Flags: needinfo?(bugs)
In my ASAN debug+opt build, I'm seeing the abort that mats mentioned in comment 1 (about 1/3 of the time that I start Firefox as-directed in comment 0, with the "{,,,,,,,,,,,,,,}" to get many copies of the testcase).

I haven't yet seen the global-buffer-overflow. I wonder if that only happens if we continue executing after the (debug-only) abort...?
(In reply to Daniel Holbert [:dholbert] from comment #14)
> I haven't yet seen the global-buffer-overflow. I wonder if that only happens
> if we continue executing after the (debug-only) abort...?

(Looks like that's correct -- the abort is checking that floatCount <= mFloats.Length(), and then the buffer-overflow is a few lines later, when we do mFloats[floatCount-1].  If floatCount is too large, it makes sense that mFloats[floatCount-1] would likely be an out-of-bounds read.)
If low risk fix is ready within the next week we could consider it for Beta uplift.
Per comment 1: when this fails, mFloatInfoCount == floatCount == 1, whereas mFloats.Length() is 0.

Here's approximately how we get in that situation:
 (a) Initially, nsFloatManager::mFloats has length 1. (It contains 1 frame.)

 (b) We call nsFloatManager::PushState, which (among other things) saves a snapshot of that float-count, in nsBlockReflowState::mFloatManagerStateBefore. So that state now has mFloatInfoCount = 1.

 (c) We call nsFloatManager::RemoveTrailingRegions, which *removes the frame* from nsFloatManager::mFloats.

 (d) Now, we call GetFloatAvailableSpaceWithState(), passing in mFloatManagerStateBefore. This assumes (per the ABORT_IF_FALSE) that our float manager still has at least as many floats as it had in part (b). **But it does not**, due to the RemoveTrailingRegions call.

So the RemoveTrailingRegions call is what's giving us trouble here, it would appear.
Also: We get lots of calls to RemoveTrailingRegions that leave mFloats the same length (i.e. keep it at 0, or keep it at 1), likely for other tabs, before we hit the problematic call.

In my debugging printf's, I only see one call to RemoveTrailingRegions that shrinks mFloats from 1 to 0, and it's immediately before we explode.
The RemoveTrailingRegions call is preceded by this comment:
> 6976     // [...] It is safe to do this here
> 6977     // because we know from here on the float manager will only be
> 6978     // used for its XMost and YMost, not to place new floats and
> 6979     // lines.

This comment is apparently incorrect; the float manager is *not* just  used for XMost and YMost. It's also used for the "GetFloatAvailableSpaceWithState" call, which touches the mFloats list. This means it is *not* safe.
MXR link for that comment and the RemoveTrailingRegions invocation:
 http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsBlockFrame.cpp#6976

I'm attaching a backtrace from the faulty RemoveTrailingRegions() invocation (note that in this case, it was triggered by a window-resize).

I typed "fin" a few times to finish execution of the deepest stackframes, and I hit the abort while finishing up a block reflow.
(cleaned up my gdb log slightly, to prune out a spammy warning that gdb prints along with every null pointer for some reason)
Attachment #778739 - Attachment is obsolete: true
Attached file frametree_at_abort.txt
(For reference, here's the frame tree from that same debugging session, when we hit the abort.)
Also: I can reliably trigger the abort by loading the testcase (just one copy) and then resizing the window horizontally.
Attachment #763086 - Attachment description: repro → original testcase (aborts debug builds, when resized or when many copies are loaded)
Attachment #763086 - Attachment mime type: text/plain → text/html
Attachment #763086 - Attachment description: original testcase (aborts debug builds, when resized or when many copies are loaded) → original testcase (aborts debug builds when resized or when many copies are loaded)
Here's a reduced testcase. This aborts my debug build when I load it.
Here's a slightly-more-reduced testcase.
Summary: Global buffer overflow (read 4) at nsFloatManager::GetFlowArea() → Global buffer overflow (read 4) at nsFloatManager::GetFlowArea() with multicol, list, floats
Here's a mostly-for-debugging purposes patch, which prevents the abort & out-of-bounds read. (It just changes the MOZ_ASSERT'ed condition into an "if" check, and averts disaster (and spams a NS_ERROR assertion) if we're about to explode.)
This patch also fixes the issue. It's probably a little better than the "band-aid", but it still seems hacky...

This patch fixes the inconsistency between phase (c) and phase (d) in comment 17 -- that is, when we (potentially) shrink we shrink mFloats, **immediately** update mFloatManagerStateBefore to be sure it's not too large.

I don't like this, because it seems like it could result in incorrect (albeit non-crashing) behavior in GetFloatAvailableSpaceWithState, in some circumstances.
Actually, even better: if we can just reverse the order of (c) and (d) from comment 17, then we'd be OK! (Then, the quote in comment 19 would actually be correct RE "It is safe")

In particular: if we delay nsBlockFrame::Reflow's CheckFloats() call (which makes the problematic  RemoveTrailingRegions call) to happen a bit later -- just before ComputeFinalSize(), *after* we reflow our bullets -- that would take care of this bug, I think.

(Note that the CheckFloats() call originally lived just before ComputeFinalSize() when we initially added it, in the patch on bug 282173.  At that point, there wasn't any bullet-reflow code there -- but the patch that later inserted that code (in bug 367332) seems to care more about being *before ComputeFinalSize* than about being *after CheckFloats*.  So I suspect the ordering of the bullet code vs. CheckFloats() was & is arbitrary.)

Try push (without any mention of this bug or any tempting security-related comments), to see if that change breaks anything:
 https://tbpl.mozilla.org/?tree=Try&rev=ed8a4d891a59
Attached patch fix v1Splinter Review
Here's that patch (now with a commit message included).

Optimistically requesting review. (If anything goes wrong with the try push, I'll cancel the review request.)
Attachment #779995 - Flags: review?(dbaron)
Flags: in-testsuite?
Try run is green (all unit tests, linux & windows, debug + opt), modulo two jobs that hit intermittent failures.
Aurora + fix v1 had no issues with my other repros for this.
So what seems odd to me is that the RemoveTrailingRegions call is removing a float that was present in the float manager before we started reflowing the block.  (Or at least that's what it seems like.)  Is that what's happening, and do you know why?  Did some previous block fail to remove it?

I'd have expected this RemoveTrailingRegions call to be removing only floats that weren't placed prior to starting reflow of this block.

(So it seems like the layout here would be incorrect both before and after the patch.)
Flags: needinfo?(dholbert)
(In reply to David Baron [:dbaron] (don't cc:, use needinfo? instead) from comment #32)
> So what seems odd to me is that the RemoveTrailingRegions call is removing a
> float that was present in the float manager before we started reflowing the
> block.  (Or at least that's what it seems like.)  Is that what's happening

That is indeed what's happening.

> and do you know why?  Did some previous block fail to remove it?

I don't yet know why, but I'll look into it.
A few frame dumps coming up, using "testcase 3 (reduced)".

In these dumps, the block frame that we're reflowing when everything falls apart is 0x7f1c5e745b88 (which I'll call "the b88 block"). The float that gets removed (and causes trouble) is the nsImageFrame.

When we enter the doomed nsBlockFrame::Reflow call, that nsImageFrame is the one float tracked by the reflow state's nsFloatManager::mFloats, and our frame tree looks like this:

> ColumnSet(ul)(1)@0x7f1c5e7292f0 {0,0,38400,2460} vis-overflow=0,-473,38400,2933 scr-overflow=0,-473,38400,2933 [state=0002024000400631] [content=0x7f1c579ae1a0] [sc=0x7f1c5e5bfb50]<
[...]
>   Block(ul)(1)@0x7f1c5e798738 next=0x7f1c5e798888 prev-in-flow=0x7f1c5e729220 next-in-flow=0x7f1c5e798888 {5243,0,1883,1140} [state=0002100000100004] [content=0x7f1c579ae1a0] [sc=0x7f1c5e7293f8:-moz-column-content]<
[...]
>     OverflowContainersList 0x7f1c5e798bc0 <
>       Block(li)(3)@0x7f1c5e798920 prev-in-flow=0x7f1c5e745b88 next-in-flow=0x7f1c5e798a90 {0,0,1883,0} vis-overflow=0,0,1883,1440 scr-overflow=0,0,1883,1440 [state=0000120000100284] [content=0x7f1c69568a10] [sc=0x7f1c5e729888]<
>       >
>     >
>   >
>   Block(ul)(1)@0x7f1c5e798888 prev-in-flow=0x7f1c5e798738 {8086,0,1883,1440} [state=0000100000100405] [content=0x7f1c579ae1a0] [sc=0x7f1c5e7293f8:-moz-column-content]<
>     line 0x7f1c5e7971c8: count=1 state=block,clean,prevmarginclean,impacted,not wrapped,before:nobr,after:nobr[0x18] {0,0,0,0} <
>       Block(li)(3)@0x7f1c5e745b88 next-in-flow=0x7f1c5e798920 {0,1140,1883,1140} vis-overflow=-840,0,2723,1140 scr-overflow=-840,0,2723,1140 [state=0000120042100601] [content=0x7f1c69568a10] [sc=0x7f1c5e729888]<
>         Overflow-lines 0x7f1c55517d00/0x7f1c55517d10 <
>           line 0x7f1c5e797078: count=1 state=block,dirty,prevmargindirty,not impacted,not wrapped,before:nobr,after:nobr[0x20b] {0,0,0,0} <
>             Block(div)(1)@0x7f1c5e745ec0 next=0x7f1c5e797020 {0,0,1883,0} [state=0000120000100600] [content=0x7f1c695c5470] [sc=0x7f1c5e744dc8]<
>             >
>           >
>           line 0x7f1c5e7970b8: count=1 state=inline,dirty,prevmargindirty,not impacted,not wrapped,before:nobr,after:nobr[0xd43] {0,0,0,0} vis-overflow=-840,427,360,473 scr-overflow=-840,427,360,473 <
>             Placeholder(img)(3)@0x7f1c5e797020 {0,900,0,0} [state=0000000000100000] [content=0x7f1c59ddbe40] [sc=0x7f1c5e729be8:-moz-non-element] outOfFlowFrame=ImageFrame(img)(3)@0x7f1c5e745f58
>           >
>         >
>         BulletList 0x7f1c5e797178 <
>           Bullet(li)(3)@0x7f1c5e7970f8 {-840,427,360,473} [state=0000004000000000] [content=0x7f1c69568a10] [sc=0x7f1c5e744948:-moz-list-bullet]
>         >
>       >
>     >
>     OverflowContainersList 0x7f1c5e798878 <
>       Block(li)(3)@0x7f1c5e798a90 next=0x7f1c5e7989f8 prev-in-flow=0x7f1c5e798920 {0,0,1883,0} vis-overflow=0,0,1883,1440 scr-overflow=0,0,1883,1440 [state=0000120000100284] [content=0x7f1c69568a10] [sc=0x7f1c5e729888]<
>         FloatList 0x7f1c5e798b18 <
>           ImageFrame(img)(3)@0x7f1c5e745f58 {0,0,1440,1440} [state=0000064100200300] [content=0x7f1c59ddbe40] [sc=0x7f1c5e745c68]
>         >
>       >

In particular, note that the ImageFrame is a child of the b88 block's next-in-flow's next-in-flow. (where both of those next-in-flows happen to be in OverflowContainersLists.)

When we hit CheckFloats() call in this reflow of b88, the frame tree has changed to look like this (starting with the last "Block(ul)" above, because nothing above that has changed):

>   Block(ul)(1)@0x7f1c5e798888 prev-in-flow=0x7f1c5e798738 {8086,0,1883,1440} [state=0000100000100405] [content=0x7f1c579ae1a0] [sc=0x7f1c5e7293f8:-moz-column-content]<
>     line 0x7f1c5e7971c8: count=1 state=block,clean,prevmarginclean,impacted,not wrapped,before:nobr,after:nobr[0x18] {0,0,0,0} <
>       Block(li)(3)@0x7f1c5e745b88 next-in-flow=0x7f1c5e798920 {0,1140,1883,1140} vis-overflow=-840,0,2723,1140 scr-overflow=-840,0,2723,1140 [state=0000120046300601] [content=0x7f1c69568a10] [sc=0x7f1c5e729888]<
>         line 0x7f1c5e797078: count=1 state=block,clean,prevmarginclean,impacted,not wrapped,before:nobr,after:nobr[0x318] {0,0,1883,0} <
>           Block(div)(1)@0x7f1c5e745ec0 {0,0,1883,0} [state=0000120000100200] [content=0x7f1c695c5470] [sc=0x7f1c5e744dc8]<
>           >
>         >
>         Overflow-lines 0x7f1c55517d00/0x7f1c55517d10 <
>           line 0x7f1c5e7970b8: count=1 state=inline,dirty,prevmargindirty,impacted,not wrapped,before:nobr,after:nobr[0xd53] {0,0,0,0} vis-overflow=600,427,360,473 scr-overflow=600,427,360,473 <
>             Placeholder(img)(3)@0x7f1c5e797020 {1440,900,0,0} [state=0000000000100000] [content=0x7f1c59ddbe40] [sc=0x7f1c5e729be8:-moz-non-element] outOfFlowFrame=ImageFrame(img)(3)@0x7f1c5e745f58
>           >
>         >
>         OverflowOutOfFlowList 0x7f1c5e798be0 <
>           ImageFrame(img)(3)@0x7f1c5e745f58 {0,0,1440,1440} [state=0000064100200300] [content=0x7f1c59ddbe40] [sc=0x7f1c5e745c68]
>         >
>         BulletList 0x7f1c5e797178 <
>           Bullet(li)(3)@0x7f1c5e7970f8 {600,427,360,473} [state=0000004000000000] [content=0x7f1c69568a10] [sc=0x7f1c5e744948:-moz-list-bullet]
>         >
>       >
>     >
>     OverflowContainersList 0x7f1c5e798878 <
>       Block(li)(3)@0x7f1c5e798a90 next=0x7f1c5e7989f8 prev-in-flow=0x7f1c5e798920 {0,0,1883,0} vis-overflow=0,0,1883,1440 scr-overflow=0,0,1883,1440 [state=0000120000100284] [content=0x7f1c69568a10] [sc=0x7f1c5e729888]<
>       >

Note that the nsImageFrame *is now a child* of our b88 block -- it's in its OverflowOutOfFlowList, which is why we end up removing it in CheckFloats()'s call to RemoveTrailingRegions().
Flags: needinfo?(dholbert)
(Note that, while the b88 block doesn't initially have the floated ImageFrame as a child, it *does* have the *Placeholder* for the floated ImageFrame as a child. Then, it takes the float from the other block when it reflows that placeholder, and that's how we end up with that float in our OverflowOutOfFlowList.)
So I'm fuzzy on the float-manager internals & what the invariants are, but one possible source of trouble here might be that we're reflowing a bit out of order here.  In particular, we reflow b88's two-hops-away next-continuation (a90) before we reflow b88 itself -- and that's troublemaking because b88 has a placeholder and a90 has the corresponding float.

IN MORE DETAIL: when we're reflowing b88's parent (the block "0x7f1c5e798888" in the frame dumps above):

 (a) we first call ReflowOverflowContainerChildren(), which reflows block a90, and its float-ImageFrame child.  (This is where the float gets added to the nsFloatManager's mFloats list.)

 (b) Then, we call ReflowDirtyLines() to reflow the "normal" child, block b88.  (When we reflow b88's PlaceholderFrame child, we reparent the float to be a child of b88, and it ends up in b88's OverflowOutOfFlowList and causes trouble when we hit CheckFloats.

So -- I'm calling that order "backwards" because we're reflowing b88 *after* a90, even though b88 is a *prev* continuation of a90. (two hops away, in the continuation chain)  And this also means we're reflowing the float before its placeholder.

I suspect that's bad.
Didn't get this in time for FF23, wontfixing.
Any updates here? This is pending review for almost a month now.
[toggling needinfo=dbaron -- I seem to recall that he was probably OK with the patch, but wanted to understand the situation better]

Hopefully Comment 36 answers the review-question from 1st paragraph comment 32, at least to a first approximation.

I'll paraphrase that comment, to sum up & possibly make it more understandable: basically, the problem arises when a frame (like b88 in the dumps above) contains a placeholder whose corresponding floated-frame lives in a future continuation, and the continuation happens to be sitting in an overflow list on the original frame's parent.  (So, the frame and its continuation have the same parent.)  This means the continuation will get reflowed first (which gets the float tracked in the float manager). Then, when we get to actually reflowing the original frame (b88), we reparent the float (as part of reflowing the placeholder), and then we can stick the float on our (b88's) OverflowOutOfFlowList, which means the float will get removed in RemoveTrailingRegions -- even though it's been tracked by the float manager since we started reflowing our frame (b88). And then we're hosed when we hit GetFloatAvailableSpaceWithState(), since that assumes that our float manager hasn't removed anything since we (b88) entered reflow.

But if we delay the RemoveTrailingRegions call to happen *after* GetFloatAvailableSpaceWithState (as the attached patch does), then the float manager will still be intact for the duration of GetFloatAvailableSpaceWithState, and we won't be hosed.
Flags: needinfo?(dbaron)
Yeah, my problem is that I no longer remember what I thought about it and need to start over.  Hopefully later this week, but not tomorrow.
Comment on attachment 779995 [details] [diff] [review]
fix v1

I think this fix is good and we should go ahead and do it, but I think 
there's still another problem here that could lead to other sec-critical
bugs, and we should get a followup filed on fixing that.

r=dbaron, and sorry for taking so long
Attachment #779995 - Flags: review?(dbaron) → review+
(In reply to Daniel Holbert [:dholbert] from comment #34)
> > ColumnSet(ul)(1)@0x7f1c5e7292f0 {0,0,38400,2460} vis-overflow=0,-473,38400,2933 scr-overflow=0,-473,38400,2933 [state=0002024000400631] [content=0x7f1c579ae1a0] [sc=0x7f1c5e5bfb50]<
> [...]
> >   Block(ul)(1)@0x7f1c5e798738 next=0x7f1c5e798888 prev-in-flow=0x7f1c5e729220 next-in-flow=0x7f1c5e798888 {5243,0,1883,1140} [state=0002100000100004] [content=0x7f1c579ae1a0] [sc=0x7f1c5e7293f8:-moz-column-content]<
> [...]
> >     OverflowContainersList 0x7f1c5e798bc0 <
> >       Block(li)(3)@0x7f1c5e798920 prev-in-flow=0x7f1c5e745b88 next-in-flow=0x7f1c5e798a90 {0,0,1883,0} vis-overflow=0,0,1883,1440 scr-overflow=0,0,1883,1440 [state=0000120000100284] [content=0x7f1c69568a10] [sc=0x7f1c5e729888]<
> >       >
> >     >
> >   >
> >   Block(ul)(1)@0x7f1c5e798888 prev-in-flow=0x7f1c5e798738 {8086,0,1883,1440} [state=0000100000100405] [content=0x7f1c579ae1a0] [sc=0x7f1c5e7293f8:-moz-column-content]<
> >     line 0x7f1c5e7971c8: count=1 state=block,clean,prevmarginclean,impacted,not wrapped,before:nobr,after:nobr[0x18] {0,0,0,0} <
> >       Block(li)(3)@0x7f1c5e745b88 next-in-flow=0x7f1c5e798920 {0,1140,1883,1140} vis-overflow=-840,0,2723,1140 scr-overflow=-840,0,2723,1140 [state=0000120042100601] [content=0x7f1c69568a10] [sc=0x7f1c5e729888]<

It also seems pretty broken here that we have a block whose next-in-flow is on the overflow containers list of it's parent's *prev-in-flow*.  (See the two "Block(li)".)

(In reply to Daniel Holbert [:dholbert] from comment #39)
> continuation have the same parent.)  This means the continuation will get
> reflowed first (which gets the float tracked in the float manager). Then,

Yeah, but we shouldn't be reflowing the next-continuation before the prev-continuation.  That seems to be where things go wrong.

I'd still like to see a second patch for this, maybe in a different bug.  (Perhaps hand off to Mats if he's willing and you have too many other things?  I suspect he has a reasonably good understanding of this code.)
Flags: needinfo?(dbaron)
(In reply to David Baron [:dbaron] (needinfo? me) from comment #42)
> Yeah, but we shouldn't be reflowing the next-continuation before the
> prev-continuation.  That seems to be where things go wrong.

OK -- that's sort of what I suspected, at the beginning/end of comment 36. Makes sense.

> I'd still like to see a second patch for this, maybe in a different bug. 
> (Perhaps hand off to Mats if he's willing and you have too many other
> things?  I suspect he has a reasonably good understanding of this code.)

OK. I'll file a followup bug, also security-sensitive, because it'll be using the testcase from this bug and we still have users affected by this bug (for the time being).

Thanks!
Landed on inbound: http://hg.mozilla.org/integration/mozilla-inbound/rev/11e825ffbe54

(Once its TBPL cycle completes successfully, I'll request backport approval.)
Followup filed, per ends of comment 42 / comment 43: bug 913162.
Please request sec-approval? before landing security patches -- we need to make sure we line up our ducks on branch landings. You should probably ask retroactively (along with branch landing approval) so that you can fill in the questions, which will help judge risk and necessity for the branches.
Meant to include this link with previous comment
https://wiki.mozilla.org/Security/Bug_Approval_Process
Flags: needinfo?(dholbert)
Oops! I totally knew that, and forgot. Apologies. I'll request retroactive approval to fill out the form.
Flags: needinfo?(dholbert)
Comment on attachment 779995 [details] [diff] [review]
fix v1

[Security approval request comment]
> How easily could an exploit be constructed based on the patch?

Not easily. The patch reveals that lists and floats are involved, but not columns/pagination. (and the testcase that's required to trigger this is very fragile/specially-crafted)

> Do comments in the patch, the check-in comment, or tests included
> in the patch paint a bulls-eye on the security problem?

No.

> Which older supported branches are affected by this flaw?

All branches. (At least as far back as ver 17, per comment 2)

> If not all supported branches, which bug introduced the flaw?

N/A

> Do you have backports for the affected branches? If not, how different,
> hard to create, and risky will they be?

Yes; the attached patch applies cleanly to ESR17, so no custom backport-patches are required.

> How likely is this patch to cause regressions; how much testing does it need?

Unlikely to cause regressions; this is the sort of patch where, if it passes tests, it's almost certainly fine.  As noted in comment 28, the position of this line with respect to the code I'm moving it past appears to be arbitrary, so it shouldn't affect behavior. (And it only fixes the abort in this case because we already have a fragile/broken frame tree when we get to that point in the code, with this testcase; that's tracked in the followup bug 913162.)
Attachment #779995 - Flags: sec-approval?
Comment on attachment 779995 [details] [diff] [review]
fix v1

Requesting approval for Aurora/Beta/ESR17.

[Approval Request Comment]
> If this is not a sec:{high,crit} bug, please state case for ESR consideration:
It's sec-critical.

> User impact if declined:
Security vulnerability

> Fix Landed on Version:
So far, just inbound, which is at version 26, though I'm requesting to be able to backport it to 25 and 24 (aurora and beta).

> Risk to taking this patch (and alternatives if risky):
Low, per last chunk of previous comment.

> String or UUID changes made by this patch:
None.
Attachment #779995 - Flags: approval-mozilla-esr17?
Attachment #779995 - Flags: approval-mozilla-beta?
Attachment #779995 - Flags: approval-mozilla-aurora?
Comment on attachment 779995 [details] [diff] [review]
fix v1

sec-approval+
a=dveditz for aurora/beta/esr-17
Attachment #779995 - Flags: sec-approval?
Attachment #779995 - Flags: sec-approval+
Attachment #779995 - Flags: approval-mozilla-esr17?
Attachment #779995 - Flags: approval-mozilla-esr17+
Attachment #779995 - Flags: approval-mozilla-beta?
Attachment #779995 - Flags: approval-mozilla-beta+
Attachment #779995 - Flags: approval-mozilla-aurora?
Attachment #779995 - Flags: approval-mozilla-aurora+
Thanks! I'll land the backports tomorrow, if RyanVM doesn't beat me to it.
Flags: needinfo?(dholbert)
https://hg.mozilla.org/mozilla-central/rev/11e825ffbe54
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
BTW: for sanity/verification's sake, I loaded "testcase 3" in aurora, beta, and esr17 debug builds with the patch & without the patch (current tip vs. tip-1).

They all behaved as I'd hope/expect. (They aborted without the patch, and they loaded with no issues with the patch.)
Flags: sec-bounty? → sec-bounty+
Confirmed crash on ASan m-c from 2013-06-06.
Verified fixed on ASan builds of FF17esr, FF24, FF25 and FF26, 2013-09-11.
Keywords: csec-bounds
Whiteboard: [asan][adv-main24+]
Whiteboard: [asan][adv-main24+] → [asan][adv-main24+][adv-esr1709+]
Alias: CVE-2013-1732
NI on :dholbert to help with a backport on mozilla-b2g18 as the status is affected.
Flags: needinfo?(dholbert)
Comment on attachment 779995 [details] [diff] [review]
fix v1

This trunk patch applies cleanly to b2g18. Requesting approval‑mozilla‑b2g18 so that it can land there.

[Approval Request Comment]
See comment 50 for approval request responses.
Attachment #779995 - Flags: approval-mozilla-b2g18?
Flags: needinfo?(dholbert)
Comment on attachment 779995 [details] [diff] [review]
fix v1

a=bajaj over email
Attachment #779995 - Flags: approval-mozilla-b2g18?
Group: core-security
Flags: in-testsuite? → in-testsuite+
You need to log in before you can comment on or make changes to this bug.