Closed Bug 1072980 Opened 10 years ago Closed 9 years ago

[e10s] "Force RTL" addon triggers crash in layout/inspector/inDeepTreeWalker.cpp call to GetChildren

Categories

(Core :: General, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
e10s m5+ ---

People

(Reporter: jimm, Assigned: billm)

References

Details

(Keywords: crash, Whiteboard: [caused by ForceRTL])

Crash Data

Attachments

(6 files)

new startup crash with e10s. I've hit this about ten times in today's nightly.

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0


This bug was filed from the Socorro interface and is 
report bp-08792c42-eccb-419e-8542-10df52140925.
=============================================================

0 	xul.dll 	GetChildren 	layout/inspector/inDeepTreeWalker.cpp
1 	xul.dll 	inDeepTreeWalker::EdgeChild(nsIDOMNode**, bool) 	layout/inspector/inDeepTreeWalker.cpp
2 	xul.dll 	inDeepTreeWalker::FirstChild(nsIDOMNode**) 	layout/inspector/inDeepTreeWalker.cpp
3 	xul.dll 	inDeepTreeWalker::NextNode(nsIDOMNode**) 	layout/inspector/inDeepTreeWalker.cpp
4 	xul.dll 	NS_InvokeByIndex 	xpcom/reflect/xptcall/md/win32/xptcinvoke.cpp
5 	xul.dll 	XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) 	js/xpconnect/src/XPCWrappedNativeJSOps.cpp
6 	mozjs.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
7 	mozjs.dll 	Interpret 	js/src/vm/Interpreter.cpp
8 	mozjs.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp
9 	mozjs.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
10 	mozjs.dll 	js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) 	js/src/vm/Interpreter.cpp
11 	mozjs.dll 	JS::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) 	js/src/jsapi.cpp
12 	xul.dll 	xpc::SandboxCallableProxyHandler::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/xpconnect/src/Sandbox.cpp
13 	mozjs.dll 	js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/Proxy.cpp
14 	mozjs.dll 	js::proxy_Call(JSContext*, unsigned int, JS::Value*) 	js/src/proxy/Proxy.cpp
15 	mozjs.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
16 	mozjs.dll 	Interpret 	js/src/vm/Interpreter.cpp
17 	mozjs.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp
18 	mozjs.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
19 	mozjs.dll 	js_fun_call(JSContext*, unsigned int, JS::Value*) 	js/src/jsfun.cpp
20 	mozjs.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
21 	mozjs.dll 	js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) 	js/src/vm/Interpreter.cpp
22 	mozjs.dll 	js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/CrossCompartmentWrapper.cpp
23 	mozjs.dll 	js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/Proxy.cpp
24 	mozjs.dll 	js::proxy_Call(JSContext*, unsigned int, JS::Value*) 	js/src/proxy/Proxy.cpp
25 	mozjs.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
26 	mozjs.dll 	js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) 	js/src/vm/Interpreter.cpp
27 	mozjs.dll 	js::jit::DoCallFallback 	js/src/jit/BaselineIC.cpp
28 		@0x1f5759b4 	
29 		@0x17381357 	
30 		@0x1f570983 	
31 	mozjs.dll 	EnterBaseline 	js/src/jit/BaselineJIT.cpp
32 	mozjs.dll 	js::jit::EnterBaselineMethod(JSContext*, js::RunState&) 	js/src/jit/BaselineJIT.cpp
33 	mozjs.dll 	Interpret 	js/src/vm/Interpreter.cpp
34 	mozjs.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp
35 	mozjs.dll 	js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
36 	mozjs.dll 	js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) 	js/src/vm/Interpreter.cpp
37 	mozjs.dll 	JS_CallFunctionValue(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) 	js/src/jsapi.cpp
38 	xul.dll 	nsFrameMessageManager::ReceiveMessage(nsISupports*, nsAString_internal const&, bool, mozilla::dom::StructuredCloneData const*, CpowHolder*, nsIPrincipal*, nsTArray<nsString>*) 	content/base/src/nsFrameMessageManager.cpp
39 	xul.dll 	nsFrameMessageManager::ReceiveMessage(nsISupports*, nsAString_internal const&, bool, mozilla::dom::StructuredCloneData const*, CpowHolder*, nsIPrincipal*, nsTArray<nsString>*) 	content/base/src/nsFrameMessageManager.cpp
40 	xul.dll 	nsFrameMessageManager::ReceiveMessage(nsISupports*, nsAString_internal const&, bool, mozilla::dom::StructuredCloneData const*, CpowHolder*, nsIPrincipal*, nsTArray<nsString>*) 	content/base/src/nsFrameMessageManager.cpp
41 	xul.dll 	nsFrameMessageManager::ReceiveMessage(nsISupports*, nsAString_internal const&, bool, mozilla::dom::StructuredCloneData const*, CpowHolder*, nsIPrincipal*, nsTArray<nsString>*) 	content/base/src/nsFrameMessageManager.cpp
42 	xul.dll 	mozilla::dom::TabParent::ReceiveMessage(nsString const&, bool, mozilla::dom::StructuredCloneData const*, CpowHolder*, nsIPrincipal*, nsTArray<nsString>*) 	dom/ipc/TabParent.cpp
43 	xul.dll 	mozilla::dom::TabParent::AnswerRpcMessage(nsString const&, mozilla::dom::ClonedMessageData const&, nsTArray<mozilla::jsipc::CpowEntry> const&, IPC::Principal const&, nsTArray<nsString>*) 	dom/ipc/TabParent.cpp
44 	xul.dll 	mozilla::dom::PBrowserParent::OnCallReceived(IPC::Message const&, IPC::Message*&) 	obj-firefox/ipc/ipdl/PBrowserParent.cpp
45 	xul.dll 	mozilla::dom::PContentParent::OnCallReceived(IPC::Message const&, IPC::Message*&) 	obj-firefox/ipc/ipdl/PContentParent.cpp
46 	xul.dll 	mozilla::ipc::MessageChannel::DispatchRPCMessage(IPC::Message const&) 	ipc/glue/MessageChannel.cpp
47 	xul.dll 	mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message const&) 	ipc/glue/MessageChannel.cpp
48 	xul.dll 	mozilla::ipc::MessageChannel::OnMaybeDequeueOne() 	ipc/glue/MessageChannel.cpp
49 	xul.dll 	MessageLoop::RunTask(Task*) 	ipc/chromium/src/base/message_loop.cc
50 	xul.dll 	MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) 	ipc/chromium/src/base/message_loop.cc
51 	xul.dll 	MessageLoop::DoWork()
tracking-e10s: --- → ?
Summary: crash in GetChildren → crash in layout/inspector/inDeepTreeWalker.cpp call to GetChildren
suspect landing.
Blocks: 777674
STR:

1) open a tab or two
2) flip the e10s pref on in Options
3) restart

result: crash on restart
back this work out locally, still seeing the crash.
No longer blocks: 777674
also, this appears to be related to loading bugzilla bug pages.
better str:

1) disable e10s, restart
2) open a tab to this bugzilla page
3) open Options in a new tab
4) flip e10s, restart
5) focus the bugzilla page tab on restart

result: crash
Assignee: nobody → gkrizsanits
This crash is also killing me right now - I had to manually edit my prefs.js to turn off e10s in my default profile. :/
So far I could not reproduce the crash on my machine (on linux) on a fresh build from m-c... It's hard to say anything without being able to debug the crash. I don't see any obvious bug in the code yet either.
As I was able to reproduce the crash reliably, had a debug build, and a few minutes to spare, I dug into this.

Here's the line that we were crashing on in inDeepTreeWalker.cpp:

static already_AddRefed<nsINodeList>
GetChildren(nsIDOMNode* aParent,
            bool aShowAnonymousContent,
            bool aShowSubDocuments)
{
  MOZ_ASSERT(aParent);

  nsCOMPtr<nsINodeList> ret;
  if (aShowSubDocuments) {
    nsCOMPtr<nsIDOMDocument> domdoc = inLayoutUtils::GetSubDocumentFor(aParent);
    if (domdoc) {
      aParent = domdoc;
    }
  }

  nsCOMPtr<nsIContent> parentAsContent = do_QueryInterface(aParent);
  if (parentAsContent && aShowAnonymousContent) {
      ret = parentAsContent->GetChildren(nsIContent::eAllChildren);
  } else {
    // If it's not a content, then it's a document (or an attribute but we can ignore that
    // case here). If aShowAnonymousContent is false we also want to fall back to ChildNodes
    // so we can skip any native anon content that GetChildren would return.
    nsCOMPtr<nsINode> parentNode = do_QueryInterface(aParent);
    MOZ_ASSERT(parentNode);  // <-- CRASHING RIGHT HERE
    ret = parentNode->ChildNodes();
  }
  return ret.forget();
}

parentNode is not null, it's an object that looks like this when I inspect it in XCode:

aParent nsXPTCStubBase *  0x12d2cb5e0 0x000000012d2cb5e0
  mOuter  nsXPCWrappedJS *  0x11d9f3a80 0x000000011d9f3a80
    nsAutoXPTCStub  nsAutoXPTCStub    
    nsSupportsWeakReference nsSupportsWeakReference   
    XPCRootSetElem  XPCRootSetElem    
    mRefCnt nsCycleCollectingAutoRefCnt   
    _mOwningThread  nsAutoOwningThread    
    mJSObj  JS::Heap<JSObject *>    
    mClass  nsRefPtr<nsXPCWrappedJSClass>   
    mRoot nsXPCWrappedJS *  0x11d7c7800 0x000000011d7c7800
    mNext nsXPCWrappedJS *  NULL  0x0000000000000000
    mOuter  nsCOMPtr<nsISupports>   
  mEntry  xptiInterfaceEntry *  0x1117d11a0 0x00000001117d11a0
    mIID  nsID    
    mDescriptor XPTInterfaceDescriptor *  0x1117cdda0 0x00000001117cdda0
    mMethodBaseIndex  uint16_t  3 3
    mConstantBaseIndex  uint16_t  0 0
    mTypelib  xptiTypelibGuts * 0x1117d1058 0x00000001117d1058
    mParent xptiInterfaceEntry *  0x11276b950 0x000000011276b950
    mInfo xptiInterfaceInfo * 0x12eb3ec10 0x000000012eb3ec10
    mFlags  xptiInfoFlags   
    mName char [1]  "nsIDOMNode"  

That... might be a CPOW? Needinfo'ing billm to see.

Anyhow, there some JS execution on the stack, so I dumped the JS stack and got this:

0 forcertl_change_chromedir_in_document(doc = [object CPOW [object HTMLDocument]], dir = "ltr") ["chrome://forcertl/content/forcertl.js":155]
    this = [object ChromeWindow]
1 forcertl_new_document_loaded(event = [object CPOW [object Event]]) ["chrome://forcertl/content/forcertl.js":206]
    this = [object CPOW [object HTMLDocument]]
2 anonymous(browser = [object XULElement], type = "DOMContentLoaded", capturing = true, isTrusted = true, event = [object CPOW [object Event]]) ["resource://gre/modules/RemoteAddonsParent.jsm":473]
    this = [object Object]
3 anonymous(msg = [object Object]) ["resource://gre/modules/RemoteAddonsParent.jsm":453]
    this = [object Object]

So that forcertl thing is from the Force RTL add-on, and I have a feeling this is why Ehsan is hitting it too. I'm willing to bet he's got that enabled in his profile because he's one of the add-on authors.

Disabling that add-on got rid of the crash for me.

Bill, is this us doing something wrong in our shims, or could this be related to nfroyd's recent work in inDeepTreeWalker.cpp from bug 1013475?
Flags: needinfo?(wmccloskey)
The problem is that we're not doing a good job ensuring that CPOWs are not passed to C++ code. We should never allow the CPOW to get into inDeepTreeWalker at all.
Assignee: gkrizsanits → wmccloskey
Flags: needinfo?(wmccloskey)
This is indeed caused by the Force RTL add-on!  The code of interest here is <hg.mozilla.org/users/ehsan.akhgari_gmail.com/extensions/file/6d210b7ca395/forcertl/content/forcertl.js#l155>, FWIW.

This seems to be a CPOW object that we get from e.originalTarget representing a document object (e is a DOMContentLoaded event.)
I can confirm having forceRTL installed causes the browser to enter a crash loop :(
Whiteboard: [caused by ForceRTL]
Whoa, crash loop. Been a while since I saw one of those. But yeah, also happens on OS X, also with the forceRTL add-on.
OS: Windows 7 → All
Hardware: x86 → All
Blocks: e10s-addons
Summary: crash in layout/inspector/inDeepTreeWalker.cpp call to GetChildren → [e10s] "Force RTL" addon triggers crash in layout/inspector/inDeepTreeWalker.cpp call to GetChildren
Attached patch fix-forcertlSplinter Review
This prevents us from passing CPOWs to native code. I tested it with a number of add-ons and it seems to work. It prevents the crash in Force RTL (although the add-on just throws).
Attachment #8519180 - Flags: review?(mrbkap)
Attachment #8519180 - Flags: review?(mrbkap) → review+
Note from Blake: add a test.
Flags: needinfo?(wmccloskey)
For the record: I asked billm about alternatively marking nsIDOMNode (the interface that the offending object comes in as) as builtinclass as a fix for this bug, but he noted that doing so would break the use-case of passing a CPOW through a doubly-wrapped JSObject interface (which his patch explicitly allows).

It's funny, because XPCOM should technically allow all of this to Just Work (TM), but every time we turn around, it shoots us in the foot.
Attached patch rtl-testSplinter Review
Here's the test.
Flags: needinfo?(wmccloskey)
Attachment #8521641 - Flags: review?(mrbkap)
Attachment #8521641 - Flags: review?(mrbkap) → review+
This is causing some test failures, which I guess is not too surprising.
I added Force RTL to the compatibility override list for 36 and above.
What's the status of this patch/test?
Flags: needinfo?(wmccloskey)
This breaks some tests. I have patches to fix the problems, but I haven't had time to post them this week.
Flags: needinfo?(wmccloskey)
This patch fixes a place where we're passing a CPOW to _getDocShell (which the main patch in this bug forbids). The comment kind of explains what is going on. I first tried using this code:
  this._getDocShell(window || content.window)
The problem is that, when window is undefined, we get a JS error (rather than just |undefined|) since window is a variable rather than a property reference. So I added the typeof business.

I still don't understand how |window| could be undefined, but I guess special powers can be loaded in non-window contexts.
Attachment #8533435 - Flags: review?(mconley)
The main patch here also exposed a problem where a lot of tests are using |event.currentTarget|. Our event shims just call the parent process's event handler and pass it a CPOW for the event. If the event handler asks for event.currentTarget, it will get a CPOW around the TabChildGlobal in the child process, since that's where we install the event handler in the child process.

To fix this, I created a scripted proxy around the event CPOW to simulate the right behavior for event.currentTarget. Generally this works well.

However, this code causes additional problems since the |event| object passed to the event handler is no longer a CPOW. In the session store code, we're passing an event back to the child process and expecting it to be a normal event. With the scripted proxy scheme, we now get a CPOW pointing to the parent proxy, which itself wraps a CPOW into the child. Yuck. There's an easy way to work around that, though.

I don't expect other tests will have this problem since they won't rely on the behavior of passing a CPOW back down and getting the original object. The session store tests only do this because I modified the tests in order to get them to run in e10s.
Attachment #8533438 - Flags: review?(mconley)
Attached patch fix-context-menuSplinter Review
This patch fixes an issue where we're passing a document CPOW to saveImageURL. I just changed this to a chrome document. The document is basically only used to get the private browsing status for the window. (Note that aShouldBypassCache is true so the getImgCacheForDocument call never happens.)
Attachment #8533444 - Flags: review?(mconley)
Attachment #8533435 - Flags: review?(mconley) → review?(ally)
Attachment #8533438 - Flags: review?(mconley) → review?(ally)
Attachment #8533444 - Flags: review?(mconley) → review?(ally)
Attached patch fix-about-shimSplinter Review
Mike added a test in bug 1102410 that found an additional place that this patch breaks. When we pass notificationCallbacks through the about protocol shim, we get an exception on this line:
  http://hg.mozilla.org/mozilla-central/annotate/5d6e0d038f95/toolkit/components/addoncompat/RemoteAddonsParent.jsm#l243
The problem is that we're calling a C++ setter on that line, and we're passing it a CPOW. That's forbidden by the main patch in this bug.

One approach here would be to allow this pattern, but it doesn't seem worth it. None of the add-ons we've looked at care about the notificationCallbacks for about channels, so we might as well just set it to null. That avoids the problem.
Attachment #8536777 - Flags: review?(ally)
Attachment #8533435 - Flags: review?(ally) → review+
Attachment #8533444 - Flags: review?(ally) → review+
Attachment #8536777 - Flags: review?(ally) → review+
Comment on attachment 8533438 [details] [diff] [review]
event-current-target

Review of attachment 8533438 [details] [diff] [review]:
-----------------------------------------------------------------

Unfortunate, but I don't see a better way either. :(
Attachment #8533438 - Flags: review?(ally) → review+
I think this may have broken saving images when e10s is enabled. See Bug 1114245.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Pushed the backout to central so it makes it to nightly.
https://hg.mozilla.org/mozilla-central/rev/b915a50bc6be
Depends on: 1114051
https://hg.mozilla.org/mozilla-central/rev/25ca03634cf5
Status: REOPENED → RESOLVED
Closed: 10 years ago9 years ago
Resolution: --- → FIXED
Target Milestone: mozilla37 → mozilla38
Depends on: 1127927
Depends on: 1159616
Depends on: 1160568
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: