Closed Bug 635835 Opened 12 years ago Closed 12 years ago

Crash with nsTreeWalker::NextNode setting currentNode as JS object from chrome


(Core :: DOM: Core & HTML, defect)

Not set



Tracking Status
firefox5 --- fixed
blocking2.0 --- -


(Reporter: WeirdAl, Assigned: WeirdAl)


(Keywords: crash, testcase, Whiteboard: [sg:dos])


(2 files, 1 obsolete file)

Attached file maximal testcase as XPI (obsolete) —
Steps to reproduce:
(1) Take a debug build of FF4 trunk or beta 11
(2) mkdir crash-profile
(3) Download the attachment as crash-proxy-test.xpi
(4) dist/bin/firefox --no-remote -profile crash-profile/ crash-proxy-test.xpi 
(5) Install the XPI as directed, and after restarting, close the browser.
(6) dist/bin/firefox --no-remote -profile crash-profile/ --chrome chrome://editortest/content/test_DOM_bindings.xul

Expected results:  No crash.
Actual results:  Crash at nsINode::GetFirstChild, this == null.

Stack trace coming in a moment.  I'll spend the next few days reducing this testcase.

Basically, I used a bit of JS-based proxy code to wrap a DOM document.  Then I called this code, where the DOM document is called "doc":

  // Remove all white-space.
  function filter(aNode) {
    return (!(/\S/.test(aNode.nodeValue))) ? FILTER_ACCEPT : FILTER_SKIP;
  var walker = doc.createTreeWalker(doc, C_i.nsIDOMNodeFilter.SHOW_TEXT, filter, true);
  // Skip the document node, since they can't have text children.
  while (walker.nextNode()) {
    dump("walker.currentNode: " + walker.currentNode.nodeName + "\n");
    var parent = walker.currentNode.parentNode;
    dump("parent: " + parent.nodeName + "\n\n");
    walker.currentNode = parent;

I think it tried to set walker.currentNode to the proxied parent node, that somehow the JS-proxied parent node made it all the way into the native code.  I see one dump("parent: " ...) call succeed, and then the null dereference.
per reporter request
Group: core-security
Attached file stack trace
On the security-sensitive flag:  Both gal and Waldo asked me to file it as such; I simply forgot to check the box.
(In reply to comment #0)
> I think it tried to set walker.currentNode to the proxied parent node, that
> somehow the JS-proxied parent node made it all the way into the native code.

Stepping through on Windows, I now see my theory was correct:  at nsTreeWalker::SetCurrentNode(nsIDOMNode * aCurrentNode), aCurrentNode is a [nsXPTCStubBase].
I suspect this is neither proxy related or a security problem.

It is expected that we'll get to nsTreeWalker::SetCurrentNode with the proxy object as aCurrentNode. You can pass any (non-xpcom) js-object as node and we'll happily wrap it. However when we QI on line 150, the result of the QI should be null. An alternative way to get the same result is to just do

walker.currentNode = {};

Anyhow, it appears that some of the walking functions aren't able to deal with mCurrentNode being null, which makes sense as that normally shouldn't happen.

So what we should do is to nullcheck the result of the QI before assigning into mCurrentNode in nsTreeWalker::SetCurrentNode.
Jonas's guess (about null-checking the QI) is right on the money, according to a patch I tested locally.

The only thing I don't understand is why the QI failed.  The way I wrote my proxied QI, it calls on all the QI targets it can, including the original object's.  Unless the native do_QueryInterface call itself (or XPConnect code) decided, "This interface isn't scriptable, so I'm not going to call the scripted object's QueryInterface method," it should have worked.

(I can understand why we wouldn't call QI with a non-scriptable interface.  Why have a nsCOMPtr to a non-scriptable interface for a scriptable object, since you can't call those methods and expect them to work?  Still, it'd be nice to have that confirmed.)
Yes, the XPConnect QI never calls into the wrapped object for nonscripatable interfaces.
Please renominate if this hits sg:crit, otherwise we'll be happy to process your approval, but won't block on this.
blocking2.0: ? → -
In particular, non-scriptable interfaces are not exposed to script, period.  You can't QI an object to them from script, and you can't implement them in script.
Opening this up since it's just a null-dereference.
Group: core-security
I'm working on a patch for this - with a couple details answered here, I can have it tonight for review.

(1) Which exception code would people prefer?  I was thinking NS_ERROR_ILLEGAL_VALUE, but DOM code usually says NS_ERROR_DOM_NOT_SUPPORTED_ERR.
(2) Are there "chrome crashtests", and if so, how do I specify a HTML file is one?  (I tried writing it as an ordinary crashtest, and it didn't crash because nsContentUtils::CheckSameOrigin rejected it before we got to the buggy behavior.)
Component: XPConnect → DOM: Core & HTML
QA Contact: xpconnect → general
Component: DOM: Core & HTML → DOM: Traversal-Range
QA Contact: general → traversal-range
NOT_SUPPORTED_ERR: Raised if an attempt is made to set currentNode to null.
Attached patch patch, v1Splinter Review
I thought about putting this check before the CheckSameOrigin call, where we check that aCurrentNode is not null.  I don't have a strong opinion on that, though.
Assignee: nobody → ajvincent
Attachment #514143 - Attachment is obsolete: true
Attachment #514409 - Flags: review?
Attachment #514409 - Flags: review? → review?(jonas)
Summary: Crash combining proxied XML document with nsTreeWalker::NextNode → Crash with nsTreeWalker::NextNode setting currentNode as JS object from chrome
Comment on attachment 514409 [details] [diff] [review]
patch, v1

Would prefer to write a simple crash-test rather than a chrome test, as, well, that's what they're for :-)
Attachment #514409 - Flags: review?(jonas) → review+
Comment on attachment 514409 [details] [diff] [review]
patch, v1

Requesting approval 2.0:  This fixes a crash in nsTreeWalker, which I believe only chrome code (well, maybe proxy code, but I doubt it) can reach.
Attachment #514409 - Flags: approval2.0?
OS: Linux → All
Hardware: x86_64 → All
Comment on attachment 514409 [details] [diff] [review]
patch, v1

Seems to me that this can change behavior in some case from returning null to throwing an exception, which could break scripts in extensions etc. Given that, and the fact that this crash hasn't been seen much (if at all) before, and that it's a non-exploitable crash, I'd prefer this to land after 2.0. So approval2.0-
Attachment #514409 - Flags: approval2.0? → approval2.0-
Depends on: post2.0
Whiteboard: [sg:dos] → [sg:dos][fixed-in-cedar]
Closed: 12 years ago
No longer depends on: post2.0
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [sg:dos][fixed-in-cedar] → [sg:dos]
Target Milestone: --- → mozilla2.2
Component: DOM: Traversal-Range → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.