Closed
Bug 373400
Opened 17 years ago
Closed 16 years ago
Crash when print previewing tinderbox page [@ nsTableRowGroupFrame::SplitSpanningCells]
Categories
(Core :: Layout: Tables, defect, P2)
Tracking
()
VERIFIED
FIXED
People
(Reporter: polidobj, Assigned: bernd_mozilla)
References
()
Details
(Keywords: crash, regression, testcase, Whiteboard: [dbaron-1.9:Rw])
Crash Data
Attachments
(10 files, 3 obsolete files)
1.07 KB,
text/html
|
Details | |
186.92 KB,
text/html
|
Details | |
7.28 KB,
patch
|
Details | Diff | Splinter Review | |
7.46 KB,
patch
|
Details | Diff | Splinter Review | |
7.86 KB,
text/plain
|
Details | |
2.37 KB,
text/html
|
Details | |
989 bytes,
patch
|
Details | Diff | Splinter Review | |
9.94 KB,
patch
|
sharparrow1
:
review+
|
Details | Diff | Splinter Review |
10.91 KB,
patch
|
dbaron
:
superreview-
|
Details | Diff | Splinter Review |
10.19 KB,
patch
|
dbaron
:
superreview+
|
Details | Diff | Splinter Review |
Doing a print preview on a tinderbox page crashes. A few times it wouldn't crash in the initial print preview but would crash when changing between portrait and landscape. The regression range I found is: 2006120804 crashes 2006120704 ok http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2006-12-07+04%3A00%3A00&maxdate=2006-12-08+04%3A00%3A00&cvsroot=%2Fcvsroot I'm suspecting bug 362708. Because that's a closed bug I'm setting the security flag on this bug. ( Just in case :) ) Incident ID: 30085199 Stack Signature nsTableRowGroupFrame::SplitSpanningCells 6be0833e Product ID FirefoxTrunk Build ID 2007030904 Trigger Time 2007-03-09 13:38:14.0 Platform Win32 Operating System Windows NT 5.1 build 2600 Module firefox.exe + (0031005d) URL visited User Comments Since Last Crash 23862 sec Total Uptime 23862 sec Trigger Reason Access violation Source File, Line No. e:\builds\tinderbox\fx-trunk\winnt_5.2_depend\mozilla\layout\tables\nstablerowgroupframe.cpp, line 943 Stack Trace nsTableRowGroupFrame::SplitSpanningCells [mozilla/layout/tables/nstablerowgroupframe.cpp, line 943] nsTableRowGroupFrame::SplitRowGroup [mozilla/layout/tables/nstablerowgroupframe.cpp, line 1203]
Comment 1•17 years ago
|
||
Confirm crash in winxp debug build. ###!!! ASSERTION: invalid continued row: 'PR_FALSE', file f:/work/mozilla/builds/1.9.0/mozilla/layout/tables/nsTableRowGroupFrame.cpp, line 1007 for (nsTableRowFrame* row = &aFirstRow; !wasLast; row = row->GetNextRow()) { wasLast = (row == &aLastRow); => PRInt32 rowIndex = row->GetRowIndex(); nsPoint rowPos = row->GetPosition(); null dereference on row On Linux I see lots of memory|cpu thrashing and a temporary hang of a minute or so...
Comment 2•17 years ago
|
||
This crashes for me when switching to landscape on print preview. This testcase and the url don't crash for me on the latest branch build on print preview.
Comment 3•17 years ago
|
||
I see this on linux as well (same crash that Bob saw, null pointer reference). Doesn't seem like an exploitable crash.
Comment 5•17 years ago
|
||
This is obviously a regression from the reflow branch landing and NOT bug 362708. I see no reason for the security flag to be set on it. All that does is make it hard for people to find it.
Blocks: reflow-refactor
Comment 6•17 years ago
|
||
(In reply to comment #5) > This is obviously a regression from the reflow branch landing and NOT bug > 362708. I see no reason for the security flag to be set on it. All that does > is make it hard for people to find it. > Hmm I take that back. I suppose it is possible that 362708 could be the culprit. I will try to narrow this down.
Updated•17 years ago
|
Flags: blocking1.9?
Comment 7•17 years ago
|
||
OK> I did a lot of testing on this. Firstly, I can not make the attached testcase crash with current builds at all. Secondly, the crash you are seeing with that regression range is a print preview crash that was already fixed in bug 362210 and not related to this bug at all. Thirdly, that crash was caused by bug 300010 (the reflow landing) and not by any security related fix. Therefore, this bug should be made public and I will try to narrow down a real regression window on it.
No longer blocks: reflow-refactor
Comment 8•17 years ago
|
||
Comment 9•17 years ago
|
||
New regression window: 2006020604 - no crash 2006020617 - Gran Paradiso 3.0a2 crashes http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-02-06+04%3A00&maxdate=2007-02-06+17%3A00&cvsroot=%2Fcvsroot I don't see anything interesting in that range.
Comment 10•17 years ago
|
||
Perhaps there is something odd with the Gran Paradiso 3.0a2 release build. It is not clear that is the same crash, and the 2007020704 nightly does NOT crash. Still looking.
Comment 11•17 years ago
|
||
OK. I found the real check-in that caused the regression. Oddly it turns out it is another security fix. Go figure. Anyway it was bug 370360. I verified it as the cause by backout.
Assignee: printing → nobody
Component: Print Preview → Layout: Tables
QA Contact: layout.tables
Assignee | ||
Comment 12•17 years ago
|
||
>Oddly it turns out it is another security fix. >Go figure. The patch in bug 370360 consists of three pieces: 1. fix the possible array boundary violation. This got checked in into branches 2. switch to TArray 3. make cell frames to report properly the truncation status, which will cause more attempts to break cells. We crash just in this function. 1. is security related 2. is to have a sane code base 3. while I was at it I got negative heights which might be bad (integer overflow...) and I fixed it. 940 // Iterate the rows between aFirstRow and aLastRow 941 for (nsTableRowFrame* row = &aFirstRow; !wasLast; row = row->GetNextRow()) { 942 wasLast = (row == &aLastRow); // we crash here, probably due to 0 deref. 943 PRInt32 rowIndex = row->GetRowIndex(); 944 nsPoint rowPos = row->GetPosition();
Comment 13•17 years ago
|
||
I was not trying to imply there was something specific incorrect in the code that was changed in that patch. We definitely do crash because of a 0 deref at that point. THe first thing i tried (which did appear to fix this was to add. if (!row) break; at that point. This avoids the crash, but I was trying to find a proper fix. I was thinking perhaps this is another case where a test for rows already pushed into the nextInFlow. I should also point out hat for at least 2 weeks before the patch for bug 370360, there was another flaky crash related to doing a print preview on this testcase. The crashes were internment and at not at the same point. Backing out the patch seems to revert you to the sometimes more flaky crashes. With the patch from 370360 it seems to always crash immediately uponn doing the print preview. Without the patch, it will sometimes crash upon just doing the print preview, sometimes crash when scrolling the preview page, sometimes crash upon closing the print preview, and sometimes upon closing the browser. Most of the time, however, it does not crash at all.
Comment 14•17 years ago
|
||
OK. Before the patch for bug 370360 landed, print preview on this page would crash intermittently, but not in SplitSpanningCells. After the patch landed, a repeatable crash in SplitSpanningCells began occurring. By backing out the patch for bug 370360, I was able to get back to the original condition. However, leaving the patch for bug 370360 intact, and adding the wallpaper patch from comment #13, I am unable to make it crash at all. It would appear that for some reason aLastRow is incorrect in the call to SplitSpanningCells.
Comment 15•17 years ago
|
||
Odder still, the call to SplitSpanningCells that is crashing is the attempt to go back to the first split, passing the same values for aFirstRow and aLastRow. As the first call did not crash it is hard to see why this one should. I guess I will have to leave trying to fix this to someone who understands this area of the code better.
Assignee | ||
Comment 16•17 years ago
|
||
Bill don't waste your time on it, this is a table crash. ==> mine
Assignee: nobody → bernd_mozilla
Updated•17 years ago
|
Flags: blocking1.9? → blocking1.9+
Assignee | ||
Comment 17•17 years ago
|
||
There is a mismatch between http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/tables/nsTableRowGroupFrame.cpp&rev=3.383&mark=953,1180#950 what splitspanning cell expects and what is done in splitrowgroup. This changed due to bug 231823 see https://bugzilla.mozilla.org/show_bug.cgi?id=231823#c15 specifically.
Assignee | ||
Comment 18•17 years ago
|
||
Comment 19•17 years ago
|
||
(In reply to comment #18) > Created an attachment (id=262663) [details] > sketch > Verifying that this patch resolves the crash issue for me.
Assignee | ||
Comment 20•17 years ago
|
||
but it regresses bug 231823 :-(
Comment 21•17 years ago
|
||
With this patch I don't get a crash and don't get the regression of bug 231823. However, it does not do as good a job of conserving paper on printing the tinderbox webpage.
Comment 22•17 years ago
|
||
This makes more sense to me, but does not appear to help in the print of tinderbox pages the first 2 pages still are mostly blank.
Attachment #262882 -
Attachment is obsolete: true
Comment 23•17 years ago
|
||
Actually, this one line change seems to be sufficient to avoid the crash. It removes a line of code that makes little sense to me in trying to follow the logic of the original fix for bug 231823. This does not cause any regression in the bug 231823 testcase.
Comment 24•17 years ago
|
||
Arrgghh! This is odd. It could be that none of those patches I posted in comments #21 #22 and #23 actually avoid the crash. This crash is not quite a solid as I thought. For some unexplained reason current trunk builds no longer crash on any of these testcases, nor do they seem to crash on doing a print preview of any of the current tinderbox pages. However, I don't see nay code that has been checked in recently that appears could possibly have fixed anything in this area.
Assignee | ||
Comment 25•17 years ago
|
||
Bill, the test case asserts more than enough in a debug build. The problem with bug 231823 is that the previous code used the availheight as page break hint see http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/tables/nsTableRowGroupFrame.cpp&rev=3.383&mark=952-957#945 The page page break hint is top down measured while https://bugzilla.mozilla.org/show_bug.cgi?id=231823#c15 the availableHeight is the remainder that is measured relative to the bottom. The original idea was to break not inside the table cells but at the row boundaries. David made the variable contain what it names in bug 231823 and I r+ed it. This efficiently turned off SplitSpanningCells. The task here is not to avoid the crash but to find way that gives right coordinates to ReflowCellFrame inside SplitSpanningCells. Negative coordinates for the availHeight are not acceptable. Neither is it acceptable to print dozens of pages for the test case in bug 231823.
Assignee | ||
Comment 26•17 years ago
|
||
Eli, do you have an idea what I should do here?
Comment 27•17 years ago
|
||
(In reply to comment #26) > Eli, do you have an idea what I should do here? > Oh, wow; this code is pretty confusing. The fix to bug 231823 was wrong. The issue is that availHeight is used to mean two different incompatible things. One is that it's used as the bottom coordinate of the row for SplitSpanningCells. The other is that it is used to mean the bottom coordinate of the rowgroup. The patch in bug 231823 screws up both of these meanings. I'll see if I can come up with a patch; that will be clearer than trying to explain it.
Comment 28•17 years ago
|
||
Since it now appears that the root cause of this bug appears to be bug 231823 and not any of the previously suspected currently non-public security bugs, should this bug be made public now?
Updated•17 years ago
|
Attachment #263051 -
Attachment is obsolete: true
Updated•17 years ago
|
Attachment #263073 -
Attachment is obsolete: true
Comment 29•17 years ago
|
||
Okay, so I tried basically backing out the patch for bug 231823. Somehow both rows are getting continued onto each page. This clearly shouldn't be happening, but I'm not familiar enough with the table code to understand why this is happening.
Comment 30•17 years ago
|
||
A bit more: SplitSpanningCells creates a continuation for the last row on the page. However, when a row gets pushed off to the next page, we end up creating a continuation for that last row, which makes the row which got pushed off not the first row on the page. Therefore, we end up creating a ton of continuations until we hit a continuation of the spanned cell that can't be split anymore.
Comment 31•17 years ago
|
||
This is basically backing out bug 231823; I fixed a minor bug in one edge case (the "We were better off with the 1st call to SplitSpanningCells, do it again" case was using the wrong value). The rest of this is simply variable renaming to make it clear what's going on. In case I didn't make it clear before, the real cause of bug 231823 is at http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/tables/nsTableRowGroupFrame.cpp&rev=3.383&mark=972#972 In the case where the last row on the page gets pushed off to the next page, this code still creates a continuation for the last row on the page, which shouldn't get one. I'm not entirely sure how to go about fixing the issue, though.
Assignee | ||
Comment 32•17 years ago
|
||
>In the case where the last row on the page gets pushed off to the next page,
>this code still creates a continuation for the last row on the page,
Isn't that the job of UndoContinuedRow?
The whole continuation row business seems questionable to me, why don't we push the continued cell into the next row? (Should have asked this Chris in 2002).
And if we undo we could just loop over the row and look for cells that have a previnflow.
Comment 33•17 years ago
|
||
(In reply to comment #32) > The whole continuation row business seems questionable to me, why don't we push > the continued cell into the next row? (Should have asked this Chris in 2002). The point of this code is to deal with the issue that if a row has a child that continues onto the next page, but the row itself doesn't continue onto the next page, pushing the cells into the overflow doesn't work. I actually have an idea for one way to fix this; I'll throw together a patch soon.
Comment 34•17 years ago
|
||
(In reply to comment #33) > I actually have an idea for one way to fix this; I'll throw together a patch > soon. Never mind; the whole thing is way too complicated for me to trust it.
Comment 35•17 years ago
|
||
Fantasai is doing a lot of work in bug 379349 about printing and page breaks. So far he has not looked at tables over there but seem to be getting ready to.
Assignee | ||
Comment 36•17 years ago
|
||
Comment 37•17 years ago
|
||
(In reply to comment #36) > Created an attachment (id=266188) [details] > patch that does not regress 231823 > Works great in avoiding the crash and wasting less space on tinderbox pages. Not only does it not regress bug 231823, but it fixes the issue remaining from bug 231823 that both testcases were previously printing 2 pages, the second one being blank. That no longer occurs with this patch.
Assignee | ||
Comment 38•17 years ago
|
||
Assignee | ||
Comment 39•17 years ago
|
||
Assignee | ||
Comment 40•17 years ago
|
||
Attachment #269569 -
Flags: review?(sharparrow1)
Comment 41•17 years ago
|
||
Comment on attachment 269569 [details] [diff] [review] accumulated patch Did you mean to include the nsTableFrame.cpp change? It's fine either way, but I don't think it affects this testcase. >- // We're not on top of the page, so put the row on the next page to give it more height >- rowIsOnPage = PR_FALSE; >+ if (rowMetrics.height > availSize.height) { >+ // We're not on top of the page, so put the row on the next page to give it more height >+ rowIsOnPage = PR_FALSE; >+ } Is there some reason why the containing if statement shouldn't be "if (rowMetrics.height > availSize.height)" rather than "if (rowMetrics.height >= availSize.height)"? Other than that, it looks good.
Comment 42•17 years ago
|
||
I don't crash on Windows, with an up to date 1.8 branch build. Is there a reason for this bug to stay security sensitive? If I understand things correctly it's a trunk-only null deref regression?
Comment 43•17 years ago
|
||
Comment on attachment 269569 [details] [diff] [review] accumulated patch Marking r- until I get an answer to my question.
Attachment #269569 -
Flags: review?(sharparrow1) → review-
Assignee | ||
Comment 44•17 years ago
|
||
>Did you mean to include the nsTableFrame.cpp change?
Yes, the code is wrong otherwise. I found it while I was auditing the code for this bug.
Is there some reason why the containing if statement shouldn't be "if
(rowMetrics.height > availSize.height)" rather than "if (rowMetrics.height >=
availSize.height)"?
Yes, if rowMetrics.height == availSize.height it still fits (perfectly) so why should it be pushed?
Assignee | ||
Comment 45•17 years ago
|
||
Comment on attachment 269569 [details] [diff] [review] accumulated patch asking again for review
Attachment #269569 -
Flags: review- → review?(sharparrow1)
Comment 46•17 years ago
|
||
I'm not sure if I was clear; I was asking about http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/tables/nsTableRowGroupFrame.cpp&rev=3.387&mark=1111#1111.
Updated•17 years ago
|
Attachment #269569 -
Flags: review?(sharparrow1) → review+
Assignee | ||
Comment 47•17 years ago
|
||
Attachment #276272 -
Flags: superreview?(dbaron)
Comment 48•17 years ago
|
||
Comment on attachment 276272 [details] [diff] [review] patch adapted to eli's comment >Index: nsTableFrame.cpp > // Special handling for incomplete children >- if (NS_FRAME_IS_NOT_COMPLETE(aStatus)) { >+ if (NS_FRAME_IS_NOT_COMPLETE(aStatus) && isPaginated) { And what if it's not paginated? >Index: nsTableRowFrame.cpp > aCellFrame->SetSize( > nsSize(cellSize.width, fullyComplete ? aAvailableHeight : desiredSize.height)); > > // XXX What happens if this cell has 'vertical-align: baseline' ? > // XXX Why is it assumed that the cell's ascent hasn't changed ? > if (fullyComplete) { >+ desiredSize.height = aAvailableHeight; > aCellFrame->VerticallyAlignChild(mMaxCellAscent); > } Why not move this addition earlier (into a different if() checking the same condition) so you can avoid the ?: inside the parameter to the SetSize call?
Comment 49•17 years ago
|
||
Comment on attachment 276272 [details] [diff] [review] patch adapted to eli's comment Finally got to reading the rest of this; I didn't review closely, but it seems ok. But marking sr- pending answer to the first question in comment 48.
Attachment #276272 -
Flags: superreview?(dbaron) → superreview-
Updated•16 years ago
|
Whiteboard: [dbaron-1.9:Rw]
Priority: -- → P2
Assignee | ||
Comment 50•16 years ago
|
||
I removed the first chunk. I added it initially, because I was concerned that we will try to split the table in non paginated mode. I am not sure thats a really good idea so its probably better to wait if this is really an issue.
Attachment #290038 -
Flags: superreview?(dbaron)
Comment 51•16 years ago
|
||
Comment on attachment 290038 [details] [diff] [review] patch to incorporate Davids comments sr=dbaron
Attachment #290038 -
Flags: superreview?(dbaron) → superreview+
Assignee | ||
Comment 52•16 years ago
|
||
fix checked in
Status: NEW → RESOLVED
Closed: 16 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Comment 53•16 years ago
|
||
Verified fixed, using: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b2pre) Gecko/2007120405 Minefield/3.0b2pre No crash anymore with any of the testcases or the tinderbox page on print preview.
Status: RESOLVED → VERIFIED
Updated•16 years ago
|
Updated•13 years ago
|
Crash Signature: [@ nsTableRowGroupFrame::SplitSpanningCells]
Comment 54•9 years ago
|
||
Landed the three crashtests: https://hg.mozilla.org/integration/mozilla-inbound/rev/328b65f2efa0
Group: core-security
Flags: in-testsuite? → in-testsuite+
You need to log in
before you can comment on or make changes to this bug.
Description
•