The default bug view has changed. See this FAQ.

Crash [@ BuildTextRunsScanner::BreakSink::SetBreaks(unsigned int, unsigned int, unsigned char*)] with crazy -moz-column testcase

RESOLVED FIXED in mozilla2.0b7

Status

()

Core
Layout
--
critical
RESOLVED FIXED
7 years ago
4 years ago

People

(Reporter: Jesse Ruderman, Assigned: mats)

Tracking

(Blocks: 1 bug, {crash, testcase})

Trunk
mozilla2.0b7
crash, testcase
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(blocking2.0 final+, blocking1.9.2 .13+, status1.9.2 .13-fixed, blocking1.9.1 .16+, status1.9.1 .16-fixed)

Details

(Whiteboard: [sg:critical][critsmash:investigating], crash signature)

Attachments

(10 attachments, 6 obsolete attachments)

451 bytes, application/xhtml+xml
Details
39.08 KB, text/plain
Details
1.11 KB, patch
Details | Diff | Splinter Review
40.42 KB, text/html
Details
1.05 KB, patch
Details | Diff | Splinter Review
12.00 KB, text/html
Details
10.86 KB, text/html
Details
12.34 KB, patch
roc
: review+
Details | Diff | Splinter Review
22.02 KB, patch
Details | Diff | Splinter Review
27.27 KB, patch
roc
: review+
Details | Diff | Splinter Review
(Reporter)

Description

7 years ago
Created attachment 451142 [details]
testcase (crashes 1.9.1 and 1.9.2 Firefox when loaded)
(Reporter)

Comment 1

7 years ago
Created attachment 451143 [details]
crash report
(Assignee)

Comment 2

7 years ago
We're calling a virtual method on a deleted gfxTextRun.
Group: core-security
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: [sg:critical]
(Assignee)

Comment 3

7 years ago
Created attachment 451301 [details] [diff] [review]
to illustrate

This is not a suggested fix/wallpaper - just to illustrate where the problem
starts.  In BuildTextRunsScanner::AssignTextRun we call f->ClearTextRun()
which deletes mTextRun if it doesn't have the TEXT_IN_CACHE bit:
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsTextFrameThebes.cpp#3759
The patch illustrates the deleted text run is still in use by one of the
mBreakSinks.
(Assignee)

Comment 4

7 years ago
Same crash occurs on 1.9.1 and 1.9.2
status1.9.1: --- → ?
status1.9.2: --- → ?
Mats can probably fix this.
Assignee: nobody → matspal

Updated

7 years ago
Whiteboard: [sg:critical] → [sg:critical][critsmash:investigating]
status1.9.1: ? → wanted
status1.9.2: ? → wanted
Mats, do you know when you'll be able to work on this?
(Assignee)

Comment 7

7 years ago
Created attachment 455029 [details]
Frame dump + stacks

The first stack dump is from when the first "REASSIGNING MULTIFLOW ..."
warning occurs, note that aTextRun=0x163b680 (red) here.
The next stack is when we hit the warning again and this is the same
call where we have 0x163b680 in one of the BreakSinks.
Also note the "UnhookTextRunFromFrames 0x163b680 (red)" here,
I guess from "f->ClearTextRun();"
(Assignee)

Comment 8

7 years ago
roc, can you tell me what the MULTIFLOW warning means:
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsTextFrameThebes.cpp#1953
It means that we had to wipe out the textrun for a frame, deleting the entire textrun, and the textrun had text accumulated from multiple frames. This is not wrong, it can happen, but I added that warning for myself because if it happens too much we have serious performance problems.
(Assignee)

Comment 10

7 years ago
Created attachment 455365 [details] [diff] [review]
wip

Removing the BreakSink that refers to the old textrun and all BreakSinks
that follows it, then for each of those BreakSinks, remove it from
mLineBreaker and all TextItems that follows.

This fixes the crash for me locally (Linux) and pass reftests+crashtests.
I've submitted the patch to TryServer.
(Assignee)

Comment 11

7 years ago
Comment on attachment 455365 [details] [diff] [review]
wip

This doesn't fix the crash on 1.9.2/1.9.1 though... just moves it to some other place.  Removing *all* BreakSinks seems to fix it though... I'll investigate some more...
Attachment #455365 - Attachment is obsolete: true
(Assignee)

Comment 12

7 years ago
Created attachment 455617 [details] [diff] [review]
fix v2

This fixes the crash also on 1.9.2 and 1.9.1.
It passed TryServer (m-c) unit tests and reftests/crashtests
in local 1.9.2 debug builds on Linux and OSX.
Attachment #455617 - Flags: review?(roc)
Hmm, it doesn't seem right just to reset the linebreaker. We could be in the middle of finding line breaks.

Do you know why a textrun whose text was added to the linebreaker is being replaced?
(Assignee)

Comment 14

7 years ago
Created attachment 455900 [details]
Frame dumps + stack

This a detailed trace starting in BuildTextRuns (first frame dump),
to the error condition in AssignTextRun (last frame dump).

While looping in BuildTextRuns, calling ScanFrame for each frame,
we eventually reach 0x19243f0(pink) where we hit
1388       if (!ContinueTextRunAcrossFrames(mLastFrame, frame)) {
1389         FlushFrames(PR_FALSE, PR_FALSE);
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsTextFrameThebes.cpp#1388
(mLastFrame=0x1924158(lime) "\n",  frame=0x19243f0(pink) " ")
so we call FlushFrames, which creates a new text run (0x181be80)
and assign it to the frames in the mapped flow (0x14d71b8 0x1924158).

I don't see anything wrong in this per se, other than the old text run
still being referenced in the BreakSinks.
Attachment #455029 - Attachment is obsolete: true
(Assignee)

Comment 15

7 years ago
Created attachment 455905 [details] [diff] [review]
fix v3

Don't use ClearTextRun() since it might delete the text run while
it's used in BreakSinks/LineBreaker.  Delete it later when we
flush line breaks instead (in FlushLineBreaks).
Attachment #455617 - Attachment is obsolete: true
Attachment #455905 - Flags: review?(roc)
Attachment #455617 - Flags: review?(roc)
This patch is nice and safe, but I'm still worried that we could lose line breaks. The line breaker might record line break opportunities in this textrun that we are just going to throw away.

In your log, it looks like the frame 0x14d71b8 where we replace the old textrun with the new textrun is actually zero-length? Maybe that's an underlying problem we should fix.

The reason we wipe out the old textrun when a new textrun overlaps it is that we have to keep the invariant that the text ranges mapped by textruns don't overlap. But if the frame has no text then there is no overlap to worry about so there shouldn't be a problem just changing its textrun from one to another.

Comment 17

7 years ago
ETA is July 24th, since mats is out for a bit
Comment on attachment 455905 [details] [diff] [review]
fix v3

per comment 16 and offline, roc says we need a new patch.
Attachment #455905 - Flags: review?(roc) → review-
Blocking Final 2.0+
blocking2.0: --- → final+
#114 crash for thunderbird 3.1.1, approx same for v3.0.6

a firefox crash mentions "accidentally rsized the font larger with the mouse pad somehow, then tried to resize the window to full screen and crash"  bp-269e3e50-3730-42e3-b9e4-379322100709

summary sig change to match BuildTextRunsScanner::BreakSink::SetBreaks(unsigned int, unsigned int, unsigned char*) to crash-stats

unrelated? ... free | nsAutoPtr<BuildTextRunsScanner::BreakSink>::`scalar deleting destructor''(unsigned int)
Summary: Crash [@ BuildTextRunsScanner::BreakSink::SetBreaks] with crazy -moz-column testcase → Crash [@ BuildTextRunsScanner::BreakSink::SetBreaks(unsigned int, unsigned int, unsigned char*)] with crazy -moz-column testcase
(Assignee)

Comment 21

7 years ago
Comment on attachment 451142 [details]
testcase (crashes 1.9.1 and 1.9.2 Firefox when loaded)

This testcase does not trigger the bug on trunk anymore due to:
http://hg.mozilla.org/mozilla-central/rev/c2fa4377a799
It still crashes with that change reverted so replacing the
newlines in the testcase with some other character should
still crash on trunk...
Attachment #451142 - Attachment description: testcase (crashes Firefox when loaded) → testcase (crashes 1.9.1 and 1.9.2 Firefox when loaded)
(Assignee)

Comment 22

7 years ago
Created attachment 463024 [details] [diff] [review]
crashtest.diff  (for trunk and branches)
(Assignee)

Comment 23

7 years ago
(In reply to comment #16)
> ... it looks like the frame 0x14d71b8 where we replace the old textrun
> with the new textrun is actually zero-length? Maybe that's an underlying
> problem we should fix.

I don't think that's a bug in this case, it can happen when a text frame
grows to include text from its next continuation during reflow, leaving
the continuation temporarily empty until its reflowed.
SetLength mentions this case explicitly:
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsTextFrameThebes.cpp#6083
(Assignee)

Comment 24

7 years ago
Created attachment 463029 [details] [diff] [review]
fix v4

ClearTextRun + update break sinks to use the new text run --
but only if the old text run is NOT in use by the previous continuation.
Attachment #455905 - Attachment is obsolete: true
Attachment #463029 - Flags: review?(roc)
As discussed offline, Mats is going to make a new patch here.
blocking1.9.1: --- → needed
blocking1.9.2: --- → needed
Mats, got an update?
(Reporter)

Comment 27

7 years ago
dbaron is on a -moz-column crash-killing rampage.

Updated

7 years ago
Whiteboard: [sg:critical][critsmash:investigating] → [sg:critical][critsmash:patch]
Attachment #463029 - Attachment is obsolete: true
Attachment #463029 - Flags: review?(roc)
Whiteboard: [sg:critical][critsmash:patch] → [sg:critical]
Is that new patch coming soon?  (roc, do you remember what was wrong with the old one?)
Mats has been working on the swapDocShells frame-tree-swapping bug.

This patch was not the right approach. Basically, the situation in this bug is that we have a text continuation frame C that is empty (maps zero characters) but is associated with textrun A. Textruns get recreated and C is assigned a new textrun B. This causes us to delete textrun A (which is also being used by C's prev-in-flows), but it shouldn't need to; because C has no actual text associated with it at this point, we can just flip it to point to textrun B and no textruns need to be updated. (We do need to ensure that C does not appear in the userdata for textrun A.)
Any updates on a new approach here?
Mats has finally finished the presentation-swapping work and will work on this this week.

Updated

7 years ago
Whiteboard: [sg:critical] → [sg:critical][critsmash:investigating]
(Assignee)

Comment 32

7 years ago
Created attachment 480764 [details] [diff] [review]
fix v5
Attachment #480764 - Flags: review?(roc)
Attachment #480764 - Flags: review?(roc) → review+
(Assignee)

Comment 33

7 years ago
Intentionally didn't land the tests yet, until we fix branches.
http://hg.mozilla.org/mozilla-central/rev/cc8777e97c21
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b8

Comment 34

7 years ago
could this landing be connected to the crash volume regression described in
https://bugzilla.mozilla.org/show_bug.cgi?id=485003#c11 and below?
(Assignee)

Comment 35

7 years ago
I think so.  I'm looking at bug 603490 and bug 603510 which
shows that the assumptions we made in this patch are wrong.
So I'm backing out and reopening this bug instead.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 36

7 years ago
Backout changeset:
http://hg.mozilla.org/mozilla-central/rev/9af1a5813a7b
(Assignee)

Updated

7 years ago
Depends on: 603490
(Assignee)

Updated

7 years ago
Depends on: 603510
(Assignee)

Comment 37

7 years ago
Created attachment 482664 [details]
Trace for bug 603490
Attachment #480764 - Attachment is obsolete: true
(Assignee)

Comment 38

7 years ago
Created attachment 482665 [details]
Trace for bug 603510
(Assignee)

Comment 39

7 years ago
roc, how about if we always do a limited ClearTextRun() -- clearing only
the tail of continuations starting at the frame we're looking at (f).
The crash in this bug comes from UnhookTextRunFromFrames
starting at the "first-in-userdata" (or first-in-flow for SIMPLE_FLOW).
That would mean a full ClearTextRun() for bug 603490 and bug 603510
(which I think is right) and for this bug we would start at the
"first-in-userdata" which is the first frame using the oldTextRun
(0x1920b58 in attachment 455900 [details]), then loop until we find 
0x14d71b8 (f) and clear from there, the net result being a simple
SetTextRun(nsnull) in this case.  What do you think?
Sounds good.
(Assignee)

Comment 41

7 years ago
Created attachment 482907 [details] [diff] [review]
fix v6

Add 'aStartContinuation' parameter to the relevant methods;
in case it's null these methods should work as before,
in case it's non-null they should traverse frames as before but
not do anything until we get to 'aStartContinuation', then they should
start doing their thing.  The only non-null caller is AssignTextRun.
If 'aStartContinuation' is non-null it's an error if it's not found
in the user data.

Still waiting for Try results, but an earlier version with only
a minor difference worked fine...
Attachment #482907 - Flags: review?(roc)
(Assignee)

Comment 42

7 years ago
(TryServer unit tests were successful with a couple of known [orange] bugs)
Comment on attachment 482907 [details] [diff] [review]
fix v6

Probably best to not use default parameters here.
Attachment #482907 - Flags: review?(roc) → review+
(Assignee)

Comment 44

7 years ago
Created attachment 483383 [details] [diff] [review]
fix v7

Ok, I dropped the default param and added 'nsnull' to callers as needed.
I don't think this needs a new review so I'm pushing this in a bit....
(Assignee)

Comment 45

7 years ago
http://hg.mozilla.org/mozilla-central/rev/9a1c991eddb0
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago7 years ago
Resolution: --- → FIXED

Updated

7 years ago
Blocks: 485003
(Assignee)

Updated

7 years ago
Depends on: 604843
blocking1.9.1: needed → .15+
blocking1.9.2: needed → .12+
(Assignee)

Updated

7 years ago
Depends on: 605340
Target Milestone: mozilla2.0b8 → mozilla2.0b7
(Assignee)

Comment 46

7 years ago
Created attachment 489213 [details] [diff] [review]
Patch for branches

Accumulated patch for 1.9.2/1.9.1 branches.  It contains the fix for this
bug and its regressions, bug 605340 and bug 604843, plus crashtests for
bug 603510 and bug 603490. (I'm holding the crashtest on this bug)
Attachment #489213 - Flags: review?(roc)
Attachment #489213 - Flags: review?(roc) → review+
(Assignee)

Updated

7 years ago
Attachment #489213 - Flags: approval1.9.2.13?
Attachment #489213 - Flags: approval1.9.1.16?
Comment on attachment 489213 [details] [diff] [review]
Patch for branches

Approved for 1.9.2.13 and 1.9.1.16, a=dveditz for release-drivers
Attachment #489213 - Flags: approval1.9.2.13?
Attachment #489213 - Flags: approval1.9.2.13+
Attachment #489213 - Flags: approval1.9.1.16?
Attachment #489213 - Flags: approval1.9.1.16+
(Assignee)

Comment 48

6 years ago
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/650e7b0d6d34
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/bd03fbc2b2fc
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/19e3f097936a
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/91bf1ee0a8ac
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/5cb35072f373
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/6444dcb39b02

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/64e803656a1f
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/ee365bf30311
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0ac503e14b53
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/1d2fa8e53cc3
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/57ce1356af46
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/3956db8d0373
status1.9.1: wanted → .16-fixed
status1.9.2: wanted → .13-fixed
It looks like none of the crashtests make 1.9.0 crash.
Group: core-security
Crash Signature: [@ BuildTextRunsScanner::BreakSink::SetBreaks(unsigned int, unsigned int, unsigned char*)]
(Assignee)

Comment 50

4 years ago
Crash test:
https://hg.mozilla.org/integration/mozilla-inbound/rev/22e8618b6646
Flags: in-testsuite? → in-testsuite+
https://hg.mozilla.org/mozilla-central/rev/22e8618b6646
You need to log in before you can comment on or make changes to this bug.