Closed Bug 468491 Opened 12 years ago Closed 11 years ago

Crash [@ gfxTextRun::`vector deleting destructor'][@ nsTextFrame::ClearTextRun] with first-letter, float: left and display: table-cell


(Core :: Layout, defect, P2)






(Reporter: martijn.martijn, Assigned: roc)



(6 keywords)

Crash Data


(4 files)

Attached file testcase
See testcase, which crashes current trunk build after 100ms. It doesn't crash in a 2008-12-02 build, it does crash in a 2008-12-03 build:

I think a regression from bug 466763.
0  	xul.dll  	gfxTextRun::`vector deleting destructor'  	
1 	xul.dll 	nsTextFrame::ClearTextRun 	layout/generic/nsTextFrameThebes.cpp:3593
2 	xul.dll 	nsContinuingTextFrame::Destroy 	layout/generic/nsTextFrameThebes.cpp:3392
Flags: blocking1.9.2?
Crashes dereferencing c0000007 for me :(
Flags: blocking1.9.1?
Summary: Crash [@ nsTextFrame::ClearTextRun] with first-letter, float: left and display: table-cell → Crash [@ gfxTextRun::`vector deleting destructor'][@ nsTextFrame::ClearTextRun] with first-letter, float: left and display: table-cell
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P2
Assignee: nobody → roc
Works for me on Mac trunk debug.
Still crashes here, using:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090204 Minefield/3.2a1pre (.NET CLR 3.5.30729)
It might possibly be fixed by the patch in bug 472776. If not, I guess I'll have to crank up a Windows build :-(.
Depends on: 472776
Can someone retest this for me in a nightly build that includes the fix for bug 472776?
The testcase still crashes using:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090224 Minefield/3.2a1pre (.NET CLR 3.5.30729)
this site also crashing in similiar manner I think:

Today's nightly - Vista HP SP1
Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2a1pre) Gecko/20090225 Minefield/3.2a1pre Firefox/3.0.4 ID:20090225033705
Attached file stack
I got the stack when re-loading
Just a guess, do we perhaps set wrong flags to textrun?

What happened here was that in the patch in bug 466763 we started calling RemoveFromFlow on the second frame in the continuation chain for a text node, and then destroying that second frame, repeatedly until there's no second frame in the chain. That can cause a problem when the second frame is at the start of a textrun. When we destroy the frame we also destroy the textrun, and we iterate through the frame's next-continuations to clear out their references to the textrun. Problem is we've already called RemoveFromFlow on the frame so its old next-continuations don't get traversed, and end up holding dangling references to the textrun.
This fixes the bug by just avoiding calling RemoveFromFlow here. Just removing the frames in reverse order, starting with the last continuation, is fine.

The only remaining callers to RemoveFromFlow are in nsContinuingTextFrame::Destroy, after we've cleaned out the textrun, and in nsSplittableFrame::Destroy. So those should be fine. I've made a note that you should only be calling RemoveFromFlow in a frame's Destroy method. Actually the only reason it's not inlined into nsSplittableFrame::Destroy is that nsContinuingTextFrame is not a subclass of nsSplittableFrame --- and it's the only splittable frame that isn't a subclass of nsSplittableFrame.

In this patch I'm also removing BreakFromPrevInFlow since there are no callers and we don't want to add any.
Attachment #368829 - Flags: review?(dbaron)
Whiteboard: [needs review]
(I tested this on latest-trunk mozilla-central nightly build on WinXP SP3)

0:000> !exploitable
Exploitability Classification: PROBABLY_EXPLOITABLE
Recommended Bug Title: Probably Exploitable - User Mode Write AV near NULL starting at xul!gfxTextRun::`vector deleting destructor'+0x1c (Hash=0x6f767310.0x7a710162)

User mode write access violations that are near NULL are probably exploitable.
Why do you think write AVs near NULL are probably exploitable?  I've been treating them like any other null deref.
I'd also like to know why that tool thinks "writes near NULL" are probably exploitable.

In this case, the bug is not actually limited to a NULL dereference, although I think it's non-exploitable for other reasons.
(In reply to comment #14)
> Why do you think write AVs near NULL are probably exploitable?  I've been
> treating them like any other null deref.

!exploitable thinks they're probably exploitable, I have no idea why.
Blocks: 484375
Comment on attachment 368829 [details] [diff] [review]
[Checkin: Comment 18]

Should we null out the next-continuation pointer on textFrame at the end of this exercise to avoid leaving a dangling pointer for a bit, or does something already here do that?

r=dbaron.  Sorry for the delay.
Attachment #368829 - Flags: review?(dbaron) → review+
+      aFrameManager->RemoveFrame(frameToDeleteParent, nsnull, frameToDelete);

This ends up call nsContinuingTextFrame::Destroy on frameToDelete, which calls nsSplittableFrame::RemoveFromFlow on it, which will set frameToDelete's prev-in-flow's next-continuation pointer to null (since frameToDelete is the last in flow and has no next-continuation of its own). So there won't be a dangling frame pointer at any time.

Closed: 12 years ago
Flags: blocking1.9.2? → in-testsuite+
Resolution: --- → FIXED
Whiteboard: [needs review] → [needs 191 landing]
Hmm, the Windows XP unittest box just failed with a similar stack. Caused by (or failed to be fixed by) this bug?

Full log:

Thread 0 (crashed)
 0  xul.dll!nsTArray<gfxTextRun::GlyphRun>::~nsTArray<gfxTextRun::GlyphRun>() [nsTArray.h:07599e2b0941 : 267 + 0x5]
    eip = 0x101c2cdb   esp = 0x0012e11c   ebp = 0x0012e174   ebx = 0x0b159501
    esi = 0x0b159540   edi = 0x0b6b2080   eax = 0x00000000   ecx = 0x0b159540
    edx = 0x8e05000c   efl = 0x00010202
 1  xul.dll!gfxTextRun::~gfxTextRun() [gfxFont.cpp:07599e2b0941 : 1412 + 0xf]
    eip = 0x1062e548   esp = 0x0012e124   ebp = 0x0012e174
 2  xul.dll!gfxTextRun::`vector deleting destructor'(unsigned int) + 0x35
    eip = 0x101bb866   esp = 0x0012e12c   ebp = 0x0012e174
 3  xul.dll!nsTextFrame::ClearTextRun() [nsTextFrameThebes.cpp:07599e2b0941 : 3772 + 0x7]
    eip = 0x1026fc28   esp = 0x0012e138   ebp = 0x0012e174   ebx = 0x0b159530
Verified fixed on trunk with Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2a1pre) Gecko/20090422 Minefield/3.6a1pre

Seems like we forgot to check it into 1.9.1. Adding keyword...
Keywords: checkin-needed
Target Milestone: --- → mozilla1.9.2a1
Attachment #368829 - Attachment description: fix → fix [Checkin: Comment 18 & 22]
Comment on attachment 368829 [details] [diff] [review]
[Checkin: Comment 18]

After fixing context for
patching file layout/base/crashtests/crashtests.list
Hunk #1 FAILED at 182
1 out of 1 hunks FAILED
patching file layout/generic/nsSplittableFrame.cpp
Hunk #1 FAILED at 223
1 out of 1 hunks FAILED
patching file layout/generic/nsSplittableFrame.h
Hunk #1 FAILED at 93
1 out of 1 hunks FAILED
Whiteboard: [needs 191 landing] → [fixed1.9.1b5]
Comment on attachment 368829 [details] [diff] [review]
[Checkin: Comment 18]


It looks like there are still other callers:
Attachment #368829 - Attachment description: fix [Checkin: Comment 18 & 22] → fix [Checkin: Comment 18]
Keywords: fixed1.9.1
Whiteboard: [fixed1.9.1b5] → [needs 1.9.1 landing: needs 1.9.1 patch!]
Is this still blocking the 1.9.1 / 3.5 release? I can't see it in the list of All Blockers or Layout blockers on
Reopening for branch fix.
Resolution: FIXED → ---
(Thanks for the heads-up Luke)
Blocks: 493287
Oh, this is easy, we don't need to land the nsSplittableFrame changes on branch, just the rest of the patch.
Whiteboard: [needs 1.9.1 landing: needs 1.9.1 patch!] → [needs 191 landing]
Closed: 12 years ago11 years ago
Keywords: fixed1.9.1
Resolution: --- → FIXED
Whiteboard: [needs 191 landing]
Verified fixed on trunk and 1.9.1 with:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090522 Minefield/3.6a1pre ID:20090522032716

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1pre) Gecko/20090521 Shiretoko/3.5pre ID:20090521135222
OS: Windows XP → All
Hardware: x86 → All
Here's this bug's fix, backported to 1.9.0.x.  I just applied the 1.9.1 changeset (0904c429bd5b), so this doesn't include the changes to nsSplittableFrame, but those aren't needed, per comment 28.  (they just remove an obsolete method, I think)  My only modification was to fix bitrot in crashtests.list.
Attachment #382854 - Flags: approval1.9.0.12?
Removing stale dependency on bug 472776 -- this dependency was tentatively added on 2009/02/05, as a result of comment 4 (where roc suggested that bug 472776's patch might fix this bug), but comment 7 shows that the dependency was incorrect.
No longer depends on: 472776
Attachment #382854 - Flags: approval1.9.0.12? → approval1.9.0.12+
Comment on attachment 382854 [details] [diff] [review]
fix, backported to 1.9.0.x (based on 1.9.1 checkin)

Approved for, a=dveditz for release-drivers
Patch landed on CVS trunk:

Checking in layout/base/nsCSSFrameConstructor.cpp;
/cvsroot/mozilla/layout/base/nsCSSFrameConstructor.cpp,v  <--  nsCSSFrameConstructor.cpp
new revision: 1.1485; previous revision: 1.1484
RCS file: /cvsroot/mozilla/layout/base/crashtests/468491-1.html,v
Checking in layout/base/crashtests/468491-1.html;
/cvsroot/mozilla/layout/base/crashtests/468491-1.html,v  <--  468491-1.html
initial revision: 1.1
Checking in layout/base/crashtests/crashtests.list;
/cvsroot/mozilla/layout/base/crashtests/crashtests.list,v  <--  crashtests.list
new revision: 1.101; previous revision: 1.100
Keywords: fixed1.9.0.12
Attached testcase does not crash (nor the post-fix builds).
(In reply to comment #37)
> Attached testcase does not crash (nor the post-fix
> builds).

That's expected. I landed the patch that causes this regression and the patch that fixes it back-to-back, so as not to expose our 1.9.0.x nightly users to this bug.

So, this was only broken on 1.9.0.x for a few minutes -- between revision 1.100 and 1.101 of layout/base/crashtests/crashtests.list:,1.100

If you'd like, you should be able to reproduce this bug by checking out a build from 2009-06-12 at 12:30 PDT.
Thanks, Dan. I'll mark this as verified for then.
Crash Signature: [@ gfxTextRun::`vector deleting destructor'] [@ nsTextFrame::ClearTextRun]
You need to log in before you can comment on or make changes to this bug.