Last Comment Bug 773533 - XBL elements (e.g. <marquee>, HTML5 video controls) make about:memory crash, possibly due to nsINode::SubtreeRoot() being defective?
: XBL elements (e.g. <marquee>, HTML5 video controls) make about:memory crash, ...
Status: VERIFIED FIXED
: assertion, crash, regression, testcase
Product: Core
Classification: Components
Component: DOM (show other bugs)
: 16 Branch
: All All
: -- critical with 2 votes (vote)
: mozilla17
Assigned To: Nobody; OK to take it and work on it
: Paul Silaghi, QA [:pauly]
Mentors:
Depends on:
Blocks: 343943 704623
  Show dependency treegraph
 
Reported: 2012-07-12 20:27 PDT by Jesse Ruderman
Modified: 2012-10-16 06:41 PDT (History)
22 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
verified
verified


Attachments
A page containing <marquee> (53 bytes, text/html)
2012-07-12 20:27 PDT, Jesse Ruderman
no flags Details
stack trace (30.69 KB, text/plain)
2012-07-12 20:27 PDT, Jesse Ruderman
no flags Details
work-around (1.34 KB, patch)
2012-07-16 18:01 PDT, Nicholas Nethercote [:njn]
no flags Details | Diff | Splinter Review
(part 1) - Skip XBL nodes when looking for orphan DOM nodes. (2.66 KB, patch)
2012-07-17 17:58 PDT, Nicholas Nethercote [:njn]
bugs: review+
continuation: feedback+
akeybl: approval‑mozilla‑aurora+
Details | Diff | Splinter Review
(part 2) - Add a <marquee> to test_memoryReporters.xul, to test for the crash in bug 773533. (1.74 KB, patch)
2012-07-17 18:02 PDT, Nicholas Nethercote [:njn]
ehsan: review+
Details | Diff | Splinter Review
part 3: fix the imports (1.88 KB, patch)
2012-07-23 18:12 PDT, Andrew McCreight [:mccr8]
Ms2ger: review+
Details | Diff | Splinter Review

Description Jesse Ruderman 2012-07-12 20:27:12 PDT
Created attachment 641712 [details]
A page containing <marquee>

1. Load a page containing <marquee>
2. Load about:memory

Result: 

###!!! ASSERTION: aRoot not an ancestor of |this|?: 'cur', file nsINode.h, line 1187

followed by a crash [@ xpc::OrphanReporter::sizeOfIncludingThis]
Comment 1 Jesse Ruderman 2012-07-12 20:27:58 PDT
Created attachment 641715 [details]
stack trace

Nightly: bp-16741166-df44-4161-8dc2-935fd2120713
Comment 2 Nicholas Nethercote [:njn] 2012-07-12 23:05:47 PDT
I'm gonna need help from DOM people with this one.

The relevant code in XPCJSRuntime.cpp:

  static size_t
  SizeOfTreeIncludingThis(nsINode *tree)
  {               
      size_t n = tree->SizeOfIncludingThis(OrphanSizeOf);
      for (nsIContent* child = tree->GetFirstChild(); child; child = child->GetNextNode(tree)) {
          n += child->SizeOfIncludingThis(OrphanSizeOf);
      }   
      return n;
  } 

and the parameter passed is gotten with SubtreeRoot():

  nsCOMPtr<nsINode> orphanTree = node->SubtreeRoot();

Looks unexpectional to me.  But we get the assertion failures, and somehow end up in nsINode::GetNextNodeImpl with this==NULL.

Is there something special about marquee nodes?
Comment 3 Jesse Ruderman 2012-07-13 00:03:58 PDT
<marquee> is implemented using XBL, which makes it "special".
Comment 4 Scoobidiver (away) 2012-07-13 00:59:53 PDT
On Windows: bp-2055f67d-b780-4900-8abe-d7b342120713
Comment 5 Andrew McCreight [:mccr8] 2012-07-13 05:58:30 PDT
Maybe this is a problem with SubtreeRoot not handling XBL properly?
Comment 6 Alex Keybl [:akeybl] 2012-07-13 17:05:20 PDT
It's not clear that this could happen in normal user scenarios (which loading about:memory is not one of). Do we suspect that this will affect normal use cases?
Comment 7 Andrew McCreight [:mccr8] 2012-07-13 17:36:47 PDT
At one point we were accidentally calling this kind of code once a minute, but I think that's been stopped.  So I think it shouldn't be called unless somebody loads about:memory.
Comment 8 Andrew McCreight [:mccr8] 2012-07-13 17:37:14 PDT
We should also check HTML5 video controls, with are also XBLy.
Comment 9 Nicholas Nethercote [:njn] 2012-07-13 20:57:18 PDT
> So I think it shouldn't be called unless somebody loads about:memory.

That's correct.
Comment 10 Nicholas Nethercote [:njn] 2012-07-15 21:37:56 PDT
(In reply to Andrew McCreight [:mccr8] from comment #8)
> We should also check HTML5 video controls, with are also XBLy.

It crashes.  Steps to reproduce:

- Visit www.youtube.com/html5 to enable WebM
- Visit http://www.youtube.com/watch?v=X3F3KmZDtyg
- Open about:memory in another tab
- Assert with "ASSERTION: aRoot not an ancestor of |this|?: 'cur', file ../../../dist/include/nsINode.h, line 1187", then crash immediately afterwards
Comment 11 Boris Zbarsky [:bz] 2012-07-15 22:17:56 PDT
So when we hit the assert, aRoot is the <xbl:content>.  |this| is an HTMLDivElement.

What you're running into is that XBL plays fast and loose with its anonymous content.  In particular, if you look at nsXBLBinding::InstallAnonymousContent what it's doing is taking all the kids and rebinding them to a new parent, while still keeping them in the child list of the old parent.  The comments say it all.

The result is that for the special case when "node" is the <xbl:content> clone, node->GetFirstChild()->GetParent() != node.  And then you lose.  This doesn't bite people normally because normally there's no way for anyone who might try walking a DOM to actually get their hands on the <xbl:content>.  I'm a little surprised, and really worried, that in this case it looks like the <xbl:content> is JS-wrapped...  I wonder when that happens.
Comment 12 Nicholas Nethercote [:njn] 2012-07-16 18:01:29 PDT
Created attachment 642813 [details] [diff] [review]
work-around

> - Visit http://www.youtube.com/watch?v=X3F3KmZDtyg

http://www.youtube.com/watch?v=X3F3KmZDtyg&webm=1 forces the WebM version directly.

bz: your diagnosis is spot on, the attached patch skips any node which violates |node->GetFirstChild()->GetParent() == node| and the crashes go away for both <marquee> and native video controls.  It would be nice to fix this in a less crappy way, though.
Comment 13 Andrew McCreight [:mccr8] 2012-07-16 19:55:22 PDT
XBL is a little weird.  It might be good to just skip it anyways.  IIUC we don't really deal with anonymous content otherwise in the memory reporter.
Comment 14 Boris Zbarsky [:bz] 2012-07-16 20:00:09 PDT
I would _really_ like to understand why that <xbl:content> has a JS wrapper at all.  Possibly as a followup....
Comment 15 Nicholas Nethercote [:njn] 2012-07-16 20:03:11 PDT
(In reply to Andrew McCreight [:mccr8] from comment #13)
> XBL is a little weird.  It might be good to just skip it anyways.  IIUC we
> don't really deal with anonymous content otherwise in the memory reporter.

I'm ok with that, but it would also be nicer if the condition was |!isXBL(node)| or somesuch, rather than playing that child/parent game.  Any suggestions on how to do that?

And I sympathize with bz's concerns, too.
Comment 16 Andrew McCreight [:mccr8] 2012-07-16 20:14:23 PDT
(In reply to Boris Zbarsky (:bz) from comment #14)
> I would _really_ like to understand why that <xbl:content> has a JS wrapper
> at all.  Possibly as a followup....

Just to be clear here, the root (<xbl:content>) isn't necessarily the wrapped node: we find wrapped nodes and then get their SubTreeRoots() to get the list of roots.  But maybe you noticed that it is in this case, in which case please ignore me...

(In reply to Nicholas Nethercote [:njn] from comment #15)
> I'm ok with that, but it would also be nicer if the condition was
> |!isXBL(node)| or somesuch, rather than playing that child/parent game.  Any
> suggestions on how to do that?

Yes, there are better ways, but I don't know how available they will be in XPConnect land.  The UnoptimizableCCNode() method could do in a pinch, and should be a conservative approximation.  It may be possible to just dig out some subpart of that, I guess.
Comment 17 Boris Zbarsky [:bz] 2012-07-16 20:25:47 PDT
Yeah, the wrapped node was in fact the xbl:content if I didn't mess up with gdb.  And there should be nothing whose parent chain terminates in that node, so it wouldn't be the subtree root of anything, I think.
Comment 18 John Schoenick [:johns] 2012-07-17 12:20:39 PDT
This also appears to be the reason areweslimyet tests have been failing since the 10th - The workaround patch seems to revive them
Comment 19 Nicholas Nethercote [:njn] 2012-07-17 17:58:31 PDT
Created attachment 643210 [details] [diff] [review]
(part 1) - Skip XBL nodes when looking for orphan DOM nodes.

This patch looks for XBL nodes in a nicer fashion, and still prevents the crashes.
Comment 20 Nicholas Nethercote [:njn] 2012-07-17 18:02:26 PDT
Created attachment 643211 [details] [diff] [review]
(part 2) - Add a <marquee> to test_memoryReporters.xul, to test for the crash in bug 773533.

Ehsan, what's happening in this bug is that the presence of certain HTML
elements (e.g. <marquee>) were causing the memory reporters (that it runs in
the <script> element) to crash.

This patch adds a <marquee> element to the <body> of the
test_memoryReporters.xul test, so that when the memory reporters are run,
this path is hit.  Is it ok to put content in the <body> of a mochitest like
this?  I confirmed that it triggers the crash.

Thanks!
Comment 21 Nicholas Nethercote [:njn] 2012-07-17 18:25:00 PDT
Once this fix lands, we'll need to back-port it to Aurora 16, or back out the original patch.
Comment 22 :Ehsan Akhgari 2012-07-17 18:32:38 PDT
Comment on attachment 643211 [details] [diff] [review]
(part 2) - Add a <marquee> to test_memoryReporters.xul, to test for the crash in bug 773533.

Review of attachment 643211 [details] [diff] [review]:
-----------------------------------------------------------------

marquee fun \o/
Comment 23 Andrew McCreight [:mccr8] 2012-07-17 19:00:19 PDT
Comment on attachment 643210 [details] [diff] [review]
(part 1) - Skip XBL nodes when looking for orphan DOM nodes.

Review of attachment 643210 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/XPCJSRuntime.cpp
@@ +13,5 @@
>  #include "XPCJSMemoryReporter.h"
>  #include "WrapperFactory.h"
>  #include "dom_quickstubs.h"
>  
> +#include "Element.h"

I don't know if Element is the right thing to include here, but maybe it is.

@@ +1709,5 @@
>      virtual size_t sizeOfIncludingThis(void *aSupports)
>      {
>          size_t n = 0;
>          nsCOMPtr<nsINode> node = do_QueryInterface(static_cast<nsISupports*>(aSupports));
> +        nsCOMPtr<nsINode> child;

It doesn't look like child is used.
Comment 24 Andrew McCreight [:mccr8] 2012-07-17 19:02:39 PDT
It looks okay to me, but the particulars of this stuff are hazy to me, so Olli should probably look at it.

(In reply to Nicholas Nethercote [:njn] from comment #21)
> Once this fix lands, we'll need to back-port it to Aurora 16, or back out
> the original patch.

I can shepherd this patch through to Aurora if that happens while you are on vacation.
Comment 25 Nicholas Nethercote [:njn] 2012-07-17 19:39:45 PDT
> I don't know if Element is the right thing to include here, but maybe it is.

Well, it won't compile without it :)
Comment 26 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-07-18 00:30:07 PDT
Comment on attachment 643210 [details] [diff] [review]
(part 1) - Skip XBL nodes when looking for orphan DOM nodes.

We can take this, but please file a followup for the issue
mentioned in comment 14.
Comment 27 Hugh Nougher [:Hughman] 2012-07-20 06:00:46 PDT
For the record if your view an XML document then view about:memory it will usually crash.
Comment 29 Nicholas Nethercote [:njn] 2012-07-20 10:54:34 PDT
(In reply to Hugh Nougher [:Hughman] from comment #27)
> For the record if your view an XML document then view about:memory it will
> usually crash.

AFAICT, the patches I just landed fixed this.  I tested with http://www.w3.org/2000/10/swap/test/plist/iTunesMusicLibrary.xml.
Comment 30 Nicholas Nethercote [:njn] 2012-07-20 10:58:24 PDT
Comment on attachment 643210 [details] [diff] [review]
(part 1) - Skip XBL nodes when looking for orphan DOM nodes.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
Addition of orphan DOM node measurement for about:memory (bug 704623).

User impact if declined:
Visiting about:memory crashes the browser if pages with certain content are loaded, e.g. <marquee>, XML, native video.
 
Testing completed (on m-c, etc.):
Green on try server, and I just landed on mozilla-inbound.

Risk to taking this patch (and alternatives if risky): 
Minimal;  it's a tiny change to the memory reporter that skips problematic DOM nodes.  If this patch isn't approved, then the patch from bug 704623 needs to be backed out.

String or UUID changes made by this patch: 
None.
Comment 32 :Ms2ger (⌚ UTC+1/+2) 2012-07-21 02:31:56 PDT
Comment on attachment 643210 [details] [diff] [review]
(part 1) - Skip XBL nodes when looking for orphan DOM nodes.

Review of attachment 643210 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/Makefile.in
@@ +64,5 @@
>  LOCAL_INCLUDES = \
>  		-I$(srcdir)/../wrappers \
>  		-I$(srcdir)/../loader \
>  		-I$(topsrcdir)/caps/include \
> +		-I$(topsrcdir)/content/base/public \

This is wrong.

::: js/xpconnect/src/XPCJSRuntime.cpp
@@ +13,5 @@
>  #include "XPCJSMemoryReporter.h"
>  #include "WrapperFactory.h"
>  #include "dom_quickstubs.h"
>  
> +#include "Element.h"

You should include "mozilla/dom/Element.h".
Comment 33 Nicholas Nethercote [:njn] 2012-07-21 14:38:03 PDT
> > +		-I$(topsrcdir)/content/base/public \
> 
> This is wrong.

What's the right way to do it?
Comment 34 Boris Zbarsky [:bz] 2012-07-21 16:23:16 PDT
The right way is to include the exported version of the header, which is "mozilla/dom/Element.h".
Comment 35 Alex Keybl [:akeybl] 2012-07-23 11:59:46 PDT
We don't need to track this regression for release specifically, as we'd likely ship with this in the product if necessary. But we will approve for FF16 given how early we are in the cycle, and the low risk evaluation.
Comment 36 Andrew McCreight [:mccr8] 2012-07-23 18:12:46 PDT
Created attachment 645160 [details] [diff] [review]
part 3: fix the imports

njn is on vacation, so he said I could fix it up and land it on branches.
Comment 37 :Ms2ger (⌚ UTC+1/+2) 2012-07-24 02:52:38 PDT
Comment on attachment 645160 [details] [diff] [review]
part 3: fix the imports

rs=me; I wouldn't worry about branches.
Comment 38 Andrew McCreight [:mccr8] 2012-07-24 08:30:49 PDT
(In reply to :Ms2ger from comment #37)
> rs=me; I wouldn't worry about branches.

Thanks!  I actually meant for the original patch, so I might as well include this.

https://hg.mozilla.org/integration/mozilla-inbound/rev/12c101f0ed10
Comment 40 Ed Morley [:emorley] 2012-07-25 08:13:11 PDT
https://hg.mozilla.org/mozilla-central/rev/12c101f0ed10
Comment 41 Nicholas Nethercote [:njn] 2012-08-07 17:39:04 PDT
Thanks for fixing the #include and landing the patches on Aurora, mccr8!
Comment 42 Paul Silaghi, QA [:pauly] 2012-09-11 07:49:01 PDT
Able to see the crash on Nightly 2012-07-12-mozilla-central-debug build.
Verified fixed on FF 16 2012-09-11-mozilla-beta-debug build on Win 7 x64, Ubuntu 12.04 and Mac OS X 10.8
Comment 43 Paul Silaghi, QA [:pauly] 2012-10-16 06:41:48 PDT
Verified fixed on FF 17 2012-10-15-mozilla-beta-debug build on Win 7 x64, Ubuntu 12.04 and Mac OS X 10.7.4.

Note You need to log in before you can comment on or make changes to this bug.