Closed Bug 363791 Opened 13 years ago Closed 13 years ago

Crash [@ nsTreeBodyFrame::PrefillPropertyArray] involving removal of <treecols>

Categories

(Core :: XUL, defect, critical)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla1.9alpha2

People

(Reporter: jruderman, Assigned: asqueella)

References

(Blocks 1 open bug)

Details

(4 keywords, Whiteboard: [needs auto-test])

Crash Data

Attachments

(3 files, 2 obsolete files)

Loading this testcase crashes Firefox (Mac trunk).  It seems to crash opt but not debug :/
The
>  tree.appendChild(treechildren); // no real change 
line in the testcase destroys then recreates an nsTreeBodyFrame, which creates a new nsTreeColumns instance in its constructor.

nsTreeColFrame on the other hand caches its nsTreeColumns (see nsTreeColFrame::EnsureColumns) and does not get recreated.

When <treecols> element is removed, its nsTreeColFrame is destroyed, which clears the list in (cached, as noted above) nsTreeColFrame::mColumns. In the testcase nsTreeBodyFrame's and nsTreeColFrame's "columns" objects are different, so the one in nsTreeBodyFrame doesn't get cleared.

nsTreeBodyFrame::mColumns' only nsTreeColumn ends up with a stale pointer to the deleted column frame (mFrame) which leads to crash at aCol->GetContent()->... in PrefillPropertyArray.

The reason the crash doesn't happen in a debug build is that the deleted frame's memory gets filled with 0xdddddddd's and the logic at http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/xul/base/src/tree/src/nsTreeBodyFrame.cpp&rev=1.293&mark=2641-2647#2640 ends up not calling into PaintColumn.
OS: Mac OS X 10.4 → All
Hardware: Macintosh → All
This does hurt perf a little (nsTreeColFrame::SetBounds mainly, I suppose) - the columns object gets reacquired from the <tree> each time we need it. Keeping the treecolframe's mColumns in sync with treebodyframe's mColumns is hard. Especially since we obtain the columns object in nsTreeColFrame::EnsureColumns from the <tree> binding - an evil binding can hand us out a wrong object, causing a similar crash I believe.
Assignee: Jan.Varga → asqueella
Status: NEW → ASSIGNED
Attachment #249692 - Flags: superreview?(neil)
Attachment #249692 - Flags: review?(neil)
Firefox 2 crashes [@ nsTreeBodyFrame::PrefillPropertyArray] too.
Flags: blocking1.9?
Flags: blocking1.8.1.2?
(In reply to comment #3)
>an evil binding can hand us out a wrong object
We could improve that by getting the columns from the tree box object.
Comment on attachment 249692 [details] [diff] [review]
patch, don't cache mColumns in nsTreeColFrame

I don't build non-debug but I'm not sure I'm eligible to r+sr this anyway.

>+  if (columns)
>+    columns->InvalidateColumns();

>+  if (columns)
>+    columns->InvalidateColumns();

>+    if (columns)
>+      columns->InvalidateColumns();
Might be worth renaming GetColumns() to InvalidateColumns() etc ;-)

>+    nsCOMPtr<nsITreeColumns> columns = GetColumns();
>+    if (columns) {
>       nsCOMPtr<nsITreeBoxObject> tree;
>+      columns->GetTree(getter_AddRefs(tree));
>       if (tree)
>         tree->Invalidate();
Since you're no longer caching the columns the only way you can get the columns is from the tree box object which therefore always exists... if you follow my previous comment you would have the tree box object first anyway.
Attachment #249692 - Flags: review?(neil)
Attached patch patch (obsolete) — Splinter Review
Neil, could you take a look if not review this?

One problem with not caching columns I noticed is that when closing a window with <tree> in it, by the time nsTreeColFrame::Destroy is called, the document is half-destroyed and treeElement->GetBoxObject doesn't work (http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/content/xul/content/src/nsXULElement.cpp&rev=1.675&mark=1967#1961).

So from that moment and until nsTreeBodyFrame::Destroy for the same tree is called, the mFrame pointer in nsTreeColumn is still stale. I'm not sure how bad is that (i.e. what may happen during frame destruction). Any comments?
Attachment #249692 - Attachment is obsolete: true
Attachment #249763 - Flags: review?(neil)
Attachment #249692 - Flags: superreview?(neil)
nsTreeColumn is a refcounted object so it could easily outlive the frame, at which point the mFrame pointer has gone stale and GetX(&x) etc. just crashes.

I'm not sure what the solution for that might be (possibly make nsTreeColumn the box object for the <column> element) but I guess that's a separate bug.
Comment on attachment 249763 [details] [diff] [review]
patch

Well, as far as it goes, it certainly looks OK to me.
Attachment #249763 - Flags: review?(neil) → review+
(In reply to comment #8)
> nsTreeColumn is a refcounted object so it could easily outlive the frame, at
> which point the mFrame pointer has gone stale and GetX(&x) etc. just crashes.
> 
Well, the frame's Destroy() method attempts to clear the columns list, so unless it can't get the tree boxobject as in the document teardown case, mFrame shouldn't become stale.
Neil meant that a (say) JS caller could hold a reference to a specific column and crash us when mFrame becomes stale by accessing col.x. That's true, but a different bug than this one and could be fixed by force-clearing mFrame of all columns in InvalidateColumns instead of just releasing the reference to the first column (and adding the necessary null-checks in nsTreeColumn).
This patch doesn't seem to be the best solution due to the issue in comment 7 (and other cases that I didn't notice where we fail to invalidate columns when destroying the column frame). It does improve the situation somewhat (if the issue in comment 11 is fixed as well), do we want it to be reviewed and checked in?

Also, any ideas for a better solution? I haven't thought about it too hard yet.
Attachment #249763 - Flags: superreview?(roc)
Filed bug 365151 for the issue in comment 11.
Flags: blocking1.8.1.2? → blocking1.8.1.2+
Comment on attachment 249763 [details] [diff] [review]
patch

This is definitely the right thing to do.
Attachment #249763 - Flags: superreview?(roc) → superreview+
> This is definitely the right thing to do.

OK, should anything be done about the warning that comes from NS_ENSURE_TRUE in http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/content/xul/content/src/nsXULElement.cpp&rev=1.675&mark=1967#1961 ? This warning will now be shown each time a document with a <tree> in it is closed.
I think it should just return null if the box object couldn't be retrieved.

Attached patch patchSplinter Review
like so?
Attachment #249763 - Attachment is obsolete: true
Whiteboard: [checkin needed]
Checking in mozilla/content/xul/content/src/nsXULElement.cpp;
/cvsroot/mozilla/content/xul/content/src/nsXULElement.cpp,v  <--  nsXULElement.cpp
new revision: 1.678; previous revision: 1.677
done
Checking in mozilla/layout/xul/base/src/tree/src/nsTreeColFrame.cpp;
/cvsroot/mozilla/layout/xul/base/src/tree/src/nsTreeColFrame.cpp,v  <--  nsTreeColFrame.cpp
new revision: 1.49; previous revision: 1.48
done
Checking in mozilla/layout/xul/base/src/tree/src/nsTreeColFrame.h;
/cvsroot/mozilla/layout/xul/base/src/tree/src/nsTreeColFrame.h,v  <--  nsTreeColFrame.h
new revision: 1.29; previous revision: 1.28
done
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: [checkin needed]
Please create a 1.8 branch patch for this blocking bug, or ask for approval on this patch if it's already appropriate.
Attached patch branch patchSplinter Review
No changes from the trunk patch.
Attachment #250671 - Flags: approval1.8.1.2?
Comment on attachment 250671 [details] [diff] [review]
branch patch

Approved for 1.8 branch, a=jay for drivers.

Do we want this on the 1.8.0 branch?  Can someone test and confirm?
Attachment #250671 - Flags: approval1.8.1.2? → approval1.8.1.2+
Does this patch work for the 1.8.0 branch also? We crash on corrupted memory there as well.
Flags: blocking1.8.0.10+
mozilla/content/xul/content/src/nsXULElement.cpp 	1.578.2.22
mozilla/layout/xul/base/src/tree/src/nsTreeColFrame.cpp 	1.32.24.1
mozilla/layout/xul/base/src/tree/src/nsTreeColFrame.h 	1.19.24.1
Keywords: fixed1.8.1.2
VERIFIED FIXED for 1.8.1.2 with Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.2pre) Gecko/2007011004 BonEcho/2.0.0.2pre Fedora FC6 and on Windows XP x64
Status: RESOLVED → VERIFIED
Comment on attachment 250671 [details] [diff] [review]
branch patch

This should apply to 1.8.0 branch as well, but I don't have a tree to test it. Does anybody have the 1.8.0 tree and time to test? (Gavin?)
I haven't been able to get my 1.8.0 build to complete for a little while now (though I can't say that I've tried very hard to fix it), so I'm not sure that I'll be able to help test this patch on that branch.
Flags: in-testsuite?
Comment on attachment 250671 [details] [diff] [review]
branch patch

This applies to 1.8.0 with a little bit merging (due to comments), so it should be ok.

I'll test the patch in a few hours, on Linux (didn't have a 1.8.0 tree and it's hard to build one with vc8).
Attachment #250671 - Flags: approval1.8.0.10?
OK, tested with 1.8.0, the crash is gone and the trees I checked continue to work fine.
Comment on attachment 250671 [details] [diff] [review]
branch patch

Approved for 1.8.0 branch, a=jay for drivers.
Attachment #250671 - Flags: approval1.8.0.10? → approval1.8.0.10+
mozilla/content/xul/content/src/nsXULElement.cpp         1.578.2.1.2.9
mozilla/layout/xul/base/src/tree/src/nsTreeColFrame.cpp  1.32.30.1
mozilla/layout/xul/base/src/tree/src/nsTreeColFrame.h    1.19.30.1
Keywords: fixed1.8.0.10
verified fixed for 1.8.0.10 - tested with Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.0.10pre) Gecko/20070120 Firefox/1.5.0.10pre ID:2007012008
Flags: blocking1.9?
Whiteboard: [needs auto-test]
Target Milestone: --- → mozilla1.9alpha2
Crashtest checked in.
Flags: in-testsuite? → in-testsuite+
Component: XP Toolkit/Widgets: Trees → XUL
QA Contact: xptoolkit.trees → xptoolkit.widgets
Crash Signature: [@ nsTreeBodyFrame::PrefillPropertyArray]
You need to log in before you can comment on or make changes to this bug.