Closed Bug 351328 Opened 18 years ago Closed 18 years ago

Crash [@ nsCSSRendering::PaintBackgroundWithSC] with colspan=0

Categories

(Core :: Layout: Tables, defect)

PowerPC
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: jruderman, Assigned: bernd_mozilla)

References

Details

(5 keywords, Whiteboard: [sg:critical] uses freed memory)

Crash Data

Attachments

(3 files, 3 obsolete files)

This testcase causes a scary crash.  In a debug build, the memory address accessed looks random (NOT something like 0xdddddddd or 0xdadadada).  Examples: 0xff9c8020, 0x653b3261.

Stack trace:

0    nsCachedStyleData::GetStyleData(nsStyleStructID const&) + 92 (nsRuleNode.h:215)
1    nsStyleContext::GetStyleData(nsStyleStructID) + 52 (nsStyleContext.cpp:221)
2    nsIFrame::GetStyleData(nsStyleStructID) const + 108 (nsIFrame.h:591)
3    nsIFrame::GetStyleDisplay() const + 36 (nsStyleStructList.h:95)
4    nsCSSRendering::PaintBackgroundWithSC(nsPresContext*, nsIRenderingContext&, nsIFrame*, nsRect const&, nsRect const&, nsStyleBackground const&, nsStyleBorder const&, nsStylePadding const&, int, nsRect*) + 176 (nsCSSRendering.cpp:2902)
5    TableBackgroundPainter::PaintCell(nsTableCellFrame*, int) + 572 (nsTablePainter.cpp:599)
...
Attached file testcase
Just before the crash I see

1 * ###!!! ASSERTION: bad arg in nsCellMap::InsertCellAt: 'PR_FALSE', nsCellMap.cpp, line 1576
2 * ###!!! ASSERTION: nsCellMap::GetNumCellsOriginatingInCol - bad col index: 'PR_FALSE', nsCellMap.cpp, line 792

this does not crash Gecko/20060111 arghhh
Assignee: nobody → bernd_mozilla
Keywords: regression
Attached patch don't assert but fix it (obsolete) — Splinter Review
Attachment #236847 - Flags: superreview?(bzbarsky)
Attachment #236847 - Flags: review?(bzbarsky)
(In reply to comment #4)
> don't assert but fix it

Hmm, isn't this just wallpaper? Shouldn't we track down why the index is
out of bounds and fix the root cause?

IMHO, the table code needs more assertion checks, not less.
Attached patch Like so? (obsolete) — Splinter Review
Its not wallpaper what I proposed, but doing an append as your patch does. However my patch does cover all callers of InsertCells, yours does not. If we would like to avoid the low level fix then the logic that you propose should be incorporated into nsTableFrame::InsertCells. Regardless what option we choose: asserting and throwing the cellframe references is not option.
Or with a closer look I don't see where my patch is more or less wallpaper than yours. The issue is the 0 span handling. I will rewrite this as I am sick of it. But this is not branchable. So I need a patch that cuts all possible entry pathes towards the possibilities of throwing cell frames.
(In reply to comment #7)
> However my patch does cover all callers of InsertCells, yours does not.

Yes, that is intentional. I don't think we should silently "repair"
bad indicies like you did. I don't mind making the code robust against
errors, but please keep the assertion in that case so we can track down
the bad callers and fix them.

> the logic that you propose should be incorporated into
> nsTableFrame::InsertCells. 

Ok. If you want to extend InsertCells to mean Append when
the index is one beyond the last cell that's fine with me.
Please assert for other invalid indicies though, so we can
catch bugs in callers.

(In reply to comment #8)
> Or with a closer look I don't see where my patch is more or less
> wallpaper than yours.

Well, your code allows me to write buggy code without being notified
of my error.

My patch on the other hand fixes the root cause of this particular
bug which is that after AttributeChangedFor() have called RemoveCell()
the col index will be too large if this is the last cell in a row.
We should Append the cell in this case since InsertCells() doesn't
handle that case.

I agree that making the low level code more robust for future errors
is a good thing, but please keep the assertion in that case
(and extend it to assert when |aColIndexBefore| is negative too).
Bernd, don't get me wrong, I have the greatest respect and appreciation
for your hard work on the table code.

I just don't think "repairing" bad arguments (silently) in low level
functions is a good strategy to make quality code (in general).
And, as far as I know, we don't do that in other layout code.
Mats I will happily leave the assert where it is but I don't see any point to wallpaper one caller and to wait till the next caller will trip the condition at the low level routine. The goal to me seems to be write safe subfunctions that could not fail rather than to put the load on the callers. Especially given the numerous branches that we maintain.

>My patch on the other hand fixes the root cause of this particular
>bug which is that after AttributeChangedFor()

I would be so happy if this would be the root, but it is not.
The root is the wrong zero span handling. What happens here is:



  document.getElementById('A').setAttribute('colspan', 0);
  document.getElementById('B').setAttribute('rowspan', 3);

<table>
  <tr>
    <td id="A">TD</td>
    <td id="B" colspan="3">TD</td>
  </tr>
</table>

The table has initially 4 columns
then we call the first time AttributeChanged
it removes the first cell the number of cols goes down to 3
then we insert the cell this will we need to write it before the first cell, so we empty the row and then append the first cell into the row. Everything is OK up to here. Now we append the second cell it looks for the first free space (cellmap hole or a dead cell) but append has a secondary function it expands zero spans and this happens here. So the first cell is expanded till it covers all columns. That is equivalent to a colspan 3. Then the first place to place the second cell is at col 3 (starting with 0).
In the cellmap dump it looks like:

C0 C C C1 C C              <===== this is wrong

while it should ideally look like 

C0 C1 C C

or at least as (due to the fact that we insert a zero span as minimum colspan=2)

C0 C C1 C C

Cx stands for real cells, C for colspanned

So on the second round of calls we remove the second cell it causes a cellmap rebuild and we end with

C0 C 

Now when we insert the cell again the cellmap underneath has changed the position 2 is now after all cells. What we shall do now is an Append (your proposal) or as i would call it an insert after pos 1. But both of them are wallpapers.

The root is cellmap like this C0 C C C1 C C where C0 is a cell with colspan 0.
So the question now is where do we wallpaper at the level where the error occures or at the caller level. My proposal is the first one, yours the second. The disadvantage in the second approach is that we put the burden on the caller that it detects misbehaviour of the callee. "Nobody expects the spanish inquisition" (http://en.wikipedia.org/wiki/The_Spanish_Inquisition_(Monty_Python)) is the rule that describes such an approach.

Why do I would like to wallpaper at all:
I would like to have at least some fix on 1.8.1. The correct fix IMHO is to remove at
http://lxr.mozilla.org/seamonkey/source/layout/tables/nsCellMap.cpp#2433
 aUpdateZeroSpan argument and implement a new mechanism to expand zerospans.
This will not be done in 10 lines. So I am afraid that I will not get it into 1.8.1.
s/The root is cellmap like this C0 C C C1 C C where C0 is a cell with colspan 0./
The root problem is when we have a cellmap like this: C0 C C C1 C C , where C0 is a cell with colspan 0./
Comment on attachment 236847 [details] [diff] [review]
don't assert but fix it

I'm OK with this for branch if we keep the assert.  On trunk, let's work on doing the "real fix" in either this bug or a followup?
Attachment #236847 - Flags: superreview?(bzbarsky)
Attachment #236847 - Flags: superreview+
Attachment #236847 - Flags: review?(bzbarsky)
Attachment #236847 - Flags: review+
(In reply to comment #11)
> I would be so happy if this would be the root, but it is not.

Ok, you have convinced me that my proposed fix is also just wallpaper
and that yours is better since it covers more cases.

> The goal to me seems to be write safe subfunctions that could not fail
> rather than to put the load on the callers.

Yes and no.
A function should protest when given faulty input values and the root cause
should be tracked down and fixed. That doesn't conflict with the goal that
functions can also be robust.

I would love to see more assertion checks in the table code, not only does
it allow us to find bugs, but it also serves as documentation of
the assumptions the code was written under.

> I would like to have at least some fix on 1.8.1.

I agree that a minimal patch (like yours) is the way to go for branches.
You could also assert when "aColIndexBefore < -1".
Attachment #236967 - Attachment is obsolete: true
Attached patch patch with asserts (obsolete) — Splinter Review
Attached patch revised patchSplinter Review
The function might be called with -1 as a colindex. This is legal as the argument shows the cell after which the new cells shall be inserted. After -1 goes 0 so we insert at the beginning of a row. Requesting a new review.
Attachment #236847 - Attachment is obsolete: true
Attachment #237485 - Attachment is obsolete: true
Attachment #237496 - Flags: superreview?(bzbarsky)
Attachment #237496 - Flags: review?(bzbarsky)
The real fix is bug 351942
Comment on attachment 237496 [details] [diff] [review]
revised patch

You could still assert that aColIndexBefore >= -1....
Attachment #237496 - Flags: superreview?(bzbarsky)
Attachment #237496 - Flags: superreview+
Attachment #237496 - Flags: review?(bzbarsky)
Attachment #237496 - Flags: review+
fixed on trunk
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Does anybody see this on branch if so ask for 1.8.1.1 approval
Yes, it also crashes 1.8.1 builds.
Flags: blocking1.8.1.1?
Attachment #237496 - Flags: approval1.8.1?
Comment on attachment 237496 [details] [diff] [review]
revised patch

a=schrep - approved to land on 1.8.1 branch before rc1.
Attachment #237496 - Flags: approval1.8.1? → approval1.8.1+
fixed on branch
Keywords: fixed1.8.1
This is appropriate for 1.8.0.x right?
Flags: blocking1.8.0.8?
Whiteboard: [sg:critical] uses freed memory
exactly, the real patch is a a 60kb one - bug 351942. I made this patch as I anticipated that bug 351942 is not branchable.
Comment on attachment 237496 [details] [diff] [review]
revised patch

asking approval
Attachment #237496 - Flags: approval1.8.0.8?
Flags: blocking1.8.0.8? → blocking1.8.0.8+
Flags: blocking1.8.1.1?
Comment on attachment 237496 [details] [diff] [review]
revised patch

approved for 1.8.0 branch, a=dveditz for drivers
Attachment #237496 - Flags: approval1.8.0.8? → approval1.8.0.8+
Fix checked into MOZILLA_1_8_0_BRANCH
Keywords: fixed1.8.0.8
v.fixed on 1.8.0 branch with Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.0.8pre) Gecko/20061020 Firefox/1.5.0.8pre, no crash with testcase.
Group: security
After applying the patch provided in the bug report, noticed that Mozilla 1.7 still crashed with the testcase.

Following is the snippet of the stack trace

  [1] __lwp_kill(0x1, 0xb), at 0xd27811a5
  [2] pthread_kill(0x1, 0xb), at 0xd277e072
  [3] raise(0xb), at 0xd272cf43
  [4] nsProfileLock::FatalSignalHandler(0xb, 0x0, 0x8045bac), at 0xcdff998e
  [5] __sighndlr(0xb, 0x0, 0x8045bac, 0xcdff9860), at 0xd277fc9f
  [6] call_user_handler(0xb, 0x0, 0x8045bac), at 0xd277623d
  [7] sigacthandler(), at 0xd27763bd
  ---- called from signal handler with signal 11 (SIGSEGV) ------
=>[8] TableBackgroundPainter::TableBackgroundData::IsVisible(0x2), at 0xce66bc8c
  [9] TableBackgroundPainter::PaintCell(0x80460f8, 0x8b4a8f8, 0x0), at 0xce66b5a1
  [10] TableBackgroundPainter::PaintRow(0x80460f8, 0x8b4a5a0, 0x0), at 0xce66b44a
  [11] TableBackgroundPainter::PaintRowGroup(0x80460f8, 0x8b4a564, 0x0), at 0xce66b176
  [12] TableBackgroundPainter::PaintTable(0x80460f8, 0x8b4a294), at 0xce66ae66
  [13] nsTableFrame::Paint(0x8b4a294, 0x8b84b98, 0x88099a8, 0x804623c, 0x0, 0x0), at 0xce64045c
  [14] nsContainerFrame::PaintChild(0x8b4a140, 0x8b84b98, 0x88099a8, 0x80462f8, 0x8b4a294, 0x0, 0x0), at 0xce4b1ccf
  [15] nsTableOuterFrame::Paint(0x8b4a140, 0x8b84b98, 0x88099a8, 0x80462f8, 0x0, 0x0), at 0xce656848
  [16] nsContainerFrame::PaintChild(0x8b49fcc, 0x8b84b98, 0x88099a8, 0x8046520, 0x8b4a140, 0x0, 0x0), at 0xce4b1ccf
  [17] nsBlockFrame::PaintChild(0x8b49fcc, 0x8b84b98, 0x88099a8, 0x8046520, 0x8b4a140, 0x0, 0x0), at 0xce4a3de2
  [18] PaintLine(0x80463bc, 0x8046520, 0x80463cc, 0x0, 0x8046404, 0x8b84b98, 0x88099a8, 0x0, 0x8b49fcc), at 0xce49fee7
  [19] nsBlockFrame::PaintChildren(0x8b49fcc, 0x8b84b98, 0x88099a8, 0x8046520, 0x0, 0x0), at 0xce49c1b2
  [20] nsHTMLContainerFrame::PaintDecorationsAndChildren(0x8b49fcc, 0x8b84b98, 0x88099a8, 0x8046520, 0x0, 0x1, 0x0), at 0xce4dc8f3/

From the stack trace it appears that,

o TableBackgroundPainter::PaintCell (nsTablePainter.cpp) the   Cols[colIndex].mColGroup is corrupted.

o It is also seen that when the crash happens, the colIndex value is equal to the value of mNumCols.

Have attached a patch to fix this problem. Kindly review.
Attachment #286279 - Flags: review?(dbaron)
The patch for 1.7 is wrong/wall paper. The index should not be out of bound at this place. The place to fix this is inside the cellmap.
Crashtest checked in.
Flags: in-testsuite+
Comment on attachment 286279 [details] [diff] [review]
Patch for Mozilla 1.7

marking review- per Bernd's comment 31
Attachment #286279 - Flags: review?(dbaron) → review-
Crash Signature: [@ nsCSSRendering::PaintBackgroundWithSC]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: