Last Comment Bug 344000 - dom cellmap crash
: dom cellmap crash
[sg:critical] heap buffer underflow. ...
: crash, testcase, verified1.8.0.7, verified1.8.1
Product: Core
Classification: Components
Component: Layout: Tables (show other bugs)
: Trunk
: x86 Windows XP
-- normal (vote)
: mozilla1.8.1
Assigned To: Bernd
Depends on:
Blocks: stirtable 346980
  Show dependency treegraph
Reported: 2006-07-08 23:02 PDT by Bernd
Modified: 2009-04-24 11:06 PDT (History)
5 users (show)
dveditz: blocking1.9a1+
darin.moz: blocking1.8.1+
dveditz: blocking1.8.0.7+
bob: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

testcase (961 bytes, text/html)
2006-07-08 23:03 PDT, Bernd
no flags Details
patch (5.79 KB, patch)
2006-07-09 09:27 PDT, Bernd
no flags Details | Diff | Splinter Review
patch (6.15 KB, patch)
2006-07-13 22:33 PDT, Bernd
no flags Details | Diff | Splinter Review
revised patch (8.75 KB, patch)
2006-08-01 07:50 PDT, Bernd
bzbarsky: review+
bzbarsky: superreview+
Details | Diff | Splinter Review

Description User image Bernd 2006-07-08 23:02:43 PDT
the testcase tries to access foo[-1] or foo[0xffffffff] marking sensitive.
Comment 1 User image Bernd 2006-07-08 23:03:24 PDT
Created attachment 228578 [details]
Comment 2 User image Bernd 2006-07-09 09:27:52 PDT
Created attachment 228602 [details] [diff] [review]
Comment 3 User image Daniel Veditz [:dveditz] 2006-07-13 13:29:15 PDT
Affects the 1.8 branches also, nominating.

Why do these routines use signed indexes? Could they be changed to use unsigned ints instead?
Comment 4 User image Daniel Veditz [:dveditz] 2006-07-13 13:50:49 PDT
Comment on attachment 228602 [details] [diff] [review]

>+  NS_ASSERTION(aFirstRowIndex >= 0, "nsCellMap::InsertRows called with negative rowIndex");

If you don't change the API's to use unsigned should we do more than assert? Add the assertion to other routines as well?

>   PRInt32 numRows = mRows.Count();
>   PRInt32 numCols = aMap.GetColCount();
>   if (aFirstRowIndex >= numRows) {
>+    mRowCount -= aNumRowsToRemove;
>     return;

seems dodgy here, can mRowCount be made negative this way? Or has something else already incremented mRowCount by exactly the amount aFirstRowIndex is larger than numRows.

What if aFirstRowIndex is ever negative here? We didn't expect it could happen in the other place either.

>     if (aEndRowIndex < mRowCount - 1) {
>       cellData = GetDataAt(aMap, aEndRowIndex + 1, colX, PR_TRUE);
>       if ((cellData) && (cellData->IsRowSpan())) {
>         return PR_TRUE; // there is a row span out of the region
>       }
>     }
>+    else {
>+      // this cell might be the cause of a dead row which should be removed
>+      return PR_TRUE;
>+    }

"might be"? what happens if it isn't?
Comment 5 User image Bernd 2006-07-13 22:26:30 PDT
>Why do these routines use signed indexes?
Because Chris wrote them. -1 is very common as a hidden boolean parameter to toggle function behaviour.

I haven't really analyzed the code what would happen with unsigned indices. I am currently more in panic mode trying to get the whole thing survive bug 339128. The upcoming patch does exactly this at least on my machine.

>seems dodgy here, can mRowCount be made negative this way? Or has something
>else already incremented mRowCount by exactly the amount aFirstRowIndex is
>larger than numRows.
Every row that we have increases mRowCount. Every row that we remove has to reduce it. We have a scenario where the latter does not happen without the patch.

>What if aFirstRowIndex is ever negative here? We didn't expect it could happen
>in the other place either.

If the routine returns PR_TRUE we will call a more expensive routine thats not a issue. 
Comment 6 User image Bernd 2006-07-13 22:33:50 PDT
Created attachment 229215 [details] [diff] [review]
Comment 7 User image Boris Zbarsky [:bz] (still a bit busy) 2006-07-16 09:38:11 PDT
I'd really like an explanation of what the various parts of the patch do to be able to review this...  For example, if aEndRowIndex >= mRowCount - 1, why is it ok to call GetDataAt(aMap, aEndRowIndex, colX, PR_TRUE)?  Why is the numOrigRows check needed?   That sort of thing.
Comment 8 User image Bernd 2006-08-01 07:50:28 PDT
Created attachment 231583 [details] [diff] [review]
revised patch
Comment 9 User image Boris Zbarsky [:bz] (still a bit busy) 2006-08-01 08:07:41 PDT
Comment on attachment 231583 [details] [diff] [review]
revised patch

>--- nsCellMap.cpp	2006-08-01 16:26:40.296875000 +0200
>+++ nsCellMap.local	2006-08-01 16:33:42.140625000 +0200
>@@ -1498,30 +1502,51 @@ PRBool nsCellMap::CellsSpanOut(nsVoidArr
>+                                   // current cellmap extension.

s/extension/extent/ ?

>@@ -2023,18 +2048,23 @@ nsCellMap::RebuildConsideringRows(nsTabl

>+  PRInt32 copyEndRowIndex = PR_MIN(numOrigRows, aStartRowIndex);
>   // put back the rows before the affected ones just as before
>-  for (rowX = 0; rowX < aStartRowIndex; rowX++) {
>+  // aStartRowIndex might be after all existing rows so we should limit the
>+  // copy to the amount of exisiting rows

Wouldn't this comment make more sense before you declare copyEndRowIndex and do the PR_MIN thing?

With that, looks ok.  Thanks for the comments!
Comment 10 User image Bernd 2006-08-02 05:33:51 PDT
fixed on trunk
Comment 11 User image Mike Schroepfer 2006-08-15 12:58:14 PDT
Is this patch suitable for 1.8 branch?
Comment 12 User image Bernd 2006-08-20 11:55:16 PDT
fixed on branch by the patch in bug 346980
Comment 13 User image Bernd 2006-08-21 12:01:53 PDT
the only bug that should block is bug 346980 which is blocked by the
regression 347725

asking for on this
Comment 14 User image Bob Clary [:bc:] 2006-08-22 10:38:02 PDT
ff2b2 debug windows/linux no crash, no assert, nightly windows/linux no crash
verified fixed 1.8
Comment 15 User image Daniel Veditz [:dveditz] 2006-08-23 15:15:28 PDT
Moving to at Bernd's request (regression worries)
Comment 16 User image Bernd 2006-08-29 12:40:36 PDT
fixed on branches by the cellmap branch patch
Comment 17 User image alice nodelman [:alice] [:anode] 2006-09-05 15:19:47 PDT
Can I get some clarification on this bug's status?  It seems to have been minus-ed for 1807, but it is marked 'fixed1807'.  Has the patch been checked in?  Has this issue been fixed by another patch?  Or are we planning on checking in this patch to the 1.8.0 branch later, even though this particular crash has been fixed?

Comment 18 User image Daniel Veditz [:dveditz] 2006-09-05 17:29:41 PDT
This did land in after all. It wasn't "blocking", but was still approved.
Comment 19 User image alice nodelman [:alice] [:anode] 2006-09-07 13:16:37 PDT should not crash

Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv: Gecko/20060906 Firefox/

Comment 20 User image Bob Clary [:bc:] 2009-04-24 11:06:55 PDT
crash test landed

Note You need to log in before you can comment on or make changes to this bug.