Closed Bug 522030 Opened 15 years ago Closed 15 years ago

Can still crash due to weak refs in the id table

Categories

(Core :: XUL, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1
Tracking Status
status1.9.2 --- beta1-fixed
blocking1.9.1 --- .4+
status1.9.1 --- .4-fixed

People

(Reporter: bzbarsky, Assigned: bzbarsky)

References

Details

(Keywords: crash, fixed1.9.0.15, fixed1.9.0.16, Whiteboard: [sg:investigate])

Attachments

(4 files)

I hate XUL.  Fix + test coming up, though I still need to try-server this.
Flags: blocking1.9.2?
Flags: blocking1.9.0.15?
Group: core-security
blocking1.9.1: --- → ?
Er, found one issue I'm not sure how to deal with.   There's an AddElementForID caller in the template code that seems to add an element to the id table.  Can this ever be an anonymous element?  Does that element ever actually get added to the DOM?  If so, where?  As far as I can tell, it's added to the id table before it's added to the DOM, which means we can't tell whether it's anonymous...
Assignee: benjamin → bzbarsky
Blocks: 521969
blocking1.9.1: ? → .4+
Flags: blocking1.9.0.15? → blocking1.9.0.15+
We're going to re-spin both Firefox 3.0.15 and 3.5.4 for this issue.
OS: Mac OS X → All
Hardware: x86 → All
Ugh.

Here's the deal: according to bz there are two fixes for this: a branch-safe one and a more-correct one.

The branch-safe fix will come with some small perf costs (~1% on various metrics, though not Ts) and potential for memory leaks (this worries bz the most, and consequently, me) but has the benefit of not absolutely shattering add-on compatibility.

The more-correct fix comes without perf or memory leak costs, but absolutely shatters add-on compatibility and parts of the Firefox UI (which bz said he can fix).

We've decided to start with the more pressing branch-safe fix, which will be checked into mozilla-central and mozilla-1.9.2 to get test coverage and measure the performance impact. This means holding the beta a little longer, so marking this as a P1 blocker. Based on the performance impact, we'll make a decision about what to do for mozilla-1.9.2; in order to preserve add-on compatibility, we might have to suck it up.
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P1
Blocks: 489925
Summary: Can still get at XBL-bound anonymous content via getElementById in XUL documents → Can still crash due to weak refs in the id table
Attached patch Proposed fixSplinter Review
This applies on top of a backout of the patch for bug 489925.
Attachment #406072 - Flags: review?(jst)
Comment on attachment 406072 [details] [diff] [review]
Proposed fix

Looks good! r=jst
Attachment #406072 - Flags: review?(jst) → review+
Pushed:

http://hg.mozilla.org/mozilla-central/rev/b40bf2ef14a7
http://hg.mozilla.org/mozilla-central/rev/b5f738553e38

(backout of the original fix for bug 489925 plus this fix) and 

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/9ef0bceaf6b2
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/96207da1b7c1

Will wait for those to cycle and give us perf and leak data, then request branch approvals.  Will also file followup bug for the real trunk (and maybe 1.9.2) fix.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Severity: normal → critical
Flags: in-testsuite+
Keywords: crash
Target Milestone: --- → mozilla1.9.3a1
This is most emphatically NOT in-testsuite+.  It can't be until we open up the bug (at which point a modified version of the testcase in bug 489925 should be checked in as a crashtest.
Flags: in-testsuite+ → in-testsuite?
Attached file Said crashtest
Attached patch 1.9.1 fixSplinter Review
Please pay particular attention to the ID_NOT_IN_DOCUMENT stuff.  I made changes to Traverse(), ~nsIdentifierMapEntry, and AddIdContent to handle it; any other obvious changes that need to happen?

Again, this applies on top of a backout of bug 489925 on 1.9.1 branch.
Attachment #406177 - Flags: review?(jst)
Attached patch 1.9.0 fixSplinter Review
Lots of merging here... Please look over carefully?
Attachment #406179 - Flags: review?(jst)
No discerned perf impact or leaking on mozilla-central or mozilla-1.9.2. Boris, how should we be looking for leaks? I'm copying in Leakcat who might be able to help there ...

In the meantime, I think we should consider this as fixed for mozilla-1.9.2 until we get evidence that it's caused bigger problems.
As far as leaks, I had two specific worries:

1)  Document leaks through shutdown.  Just browsing around a bunch with leak logging enabled will catch these.  I suspect there won't be many, if any, given the cycle collection hookups this patch ended up taking advantage of.

2)  Element leaks for document lifetime.  I don't know that we have a sane way to measure this; it would only bite situations where the elements are anonymous, so not likely to hit web pages.  Extensions that leak due to this category of leaks would have crashed before, so probably a net win...
(In reply to comment #13)
> As far as leaks, I had two specific worries:
> 

So for 1)  i will run a test over the Top 10000 Topsites to check for leaks and for 2) a run with the Top Extensions installed
That sounds great.  Thanks!

We probably want to do that with all of 1.9.2, 1.9.1, and 1.9.0, since the code is significantly different and the cycle collector behavior is significantly different between them.  :(
(In reply to comment #15)
> That sounds great.  Thanks!
> 
> We probably want to do that with all of 1.9.2, 1.9.1, and 1.9.0, since the code
> is significantly different and the cycle collector behavior is significantly
> different between them.  :(

yeah that's no problem, can cover all this branches with the vm's
Johnny's not actually CCed to this bug...
Comment on attachment 406177 [details] [diff] [review]
1.9.1 fix

Looks good.
Attachment #406177 - Flags: review?(jst) → review+
Attachment #406179 - Flags: review?(jst) → review+
Attachment #406177 - Flags: approval1.9.1.4?
Attachment #406179 - Flags: approval1.9.0.15?
Comment on attachment 406179 [details] [diff] [review]
1.9.0 fix

Approved for 1.9.0.15. a=ss for release-drivers
Attachment #406179 - Flags: approval1.9.0.15? → approval1.9.0.15+
Comment on attachment 406177 [details] [diff] [review]
1.9.1 fix

Approved for 1.9.1.4. a=ss for release-drivers
Attachment #406177 - Flags: approval1.9.1.4? → approval1.9.1.4+
Checked into CVS:

Checking in content/html/document/src/nsHTMLDocument.cpp;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.cpp,v  <--  nsHTMLDocument.cpp
new revision: 3.794; previous revision: 3.793
done
Checking in content/html/document/src/nsHTMLDocument.h;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.h,v  <--  nsHTMLDocument.h
new revision: 3.232; previous revision: 3.231
done
Checking in content/html/document/src/nsHTMLDocument.cpp;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.cpp,v  <--  nsHTMLDocument.cpp
new revision: 3.795; previous revision: 3.794

for 1.9.0.16 and:

Checking in content/html/document/src/nsHTMLDocument.cpp;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.cpp,v  <--  nsHTMLDocument.cpp
new revision: 3.793.2.1; previous revision: 3.793
done
Checking in content/html/document/src/nsHTMLDocument.cpp;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.cpp,v  <--  nsHTMLDocument.cpp
new revision: 3.793.2.1; previous revision: 3.793
done

for 1.9.0.15
Whiteboard: fixed1.9.0.15, fixed1.9.0.16
Er, the last part of the 1.9.0.15 cvs thing should have been:

Checking in content/html/document/src/nsHTMLDocument.h;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.h,v  <--  nsHTMLDocument.h
new revision: 3.231.12.1; previous revision: 3.231
done
Checking in content/html/document/src/nsHTMLDocument.cpp;
/cvsroot/mozilla/content/html/document/src/nsHTMLDocument.cpp,v  <--  nsHTMLDocument.cpp
new revision: 3.793.2.2; previous revision: 3.793.2.1
done
Flags: blocking1.9.0.16+
Attachment #406177 - Flags: approval1.9.1.5+
Attachment #406179 - Flags: approval1.9.0.16+
Boris, I wonder how I can verify this fix. I tried your attached crash test with builds from 10/13 but I cannot get those to crash.
The attached test doesn't crash in those builds due to the fix for bug 489925 being present in them.  Let me see if I can write a testcase which will crash them.
OK, I haven't succeeded at it yet, because XUL does weird things on subtree removals...  I _think_ it's still possible to trigger such crashes, but I'm not sure its worth spending a few days to figure out the exact right codepath to hit.
Whiteboard: fixed1.9.0.15, fixed1.9.0.16
Whiteboard: [sg:investigate]
This doesn't crash with build 3 of 1.9.0.15 (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.15) Gecko/2009101601 Firefox/3.0.15 (.NET CLR 3.5.30729)) or build 1 of it or 1.9.0.14. 

Is there a means to verify this bug for 1.9.0 and 1.9.1?
Al, I don't have a crash testcase for you, sorry.

One could try to verify buy fuzzing the testcase posted in this bug and especially a XUL (not XHTML) version of that testcase?
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: