This bug was filed from the Socorro interface and is report bp-a7f63872-53d8-467f-a7a1-2c6e32130610 . ============================================================= Frame Module Signature Source 0 xul.dll sipcc::PeerConnectionWrapper::PeerConnectionWrapper media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp:1398 1 xul.dll vcmSetIceMediaParams_m media/webrtc/signaling/src/media/VcmSIPCCBinding.cpp:833 2 xul.dll mozilla::dom::NativeInterface2JSObjectAndThrowIfFailed dom/bindings/BindingUtils.cpp:593 3 xul.dll mozilla::dom::WrapNativeISupportsParent<nsISupports> obj-firefox/dist/include/mozilla/dom/BindingUtils.h:971 4 xul.dll nsClientRect::WrapObject content/html/content/src/nsClientRect.cpp:45 5 xul.dll mozilla::dom::ElementBinding::getBoundingClientRect obj-firefox/dom/bindings/ElementBinding.cpp:1371
Problem might be with the jump from: 2. mozilla::dom::NativeInterface2JSObjectAndThrowIfFailed To 1. vcmSetIceMediaParams_m We might have invalid data within "NativeInterface2JSObjectAndThrowIfFailed" causing the jump to vcmSetIceMediaParams_m If the jump is correct, then the crash is likely due to "PeerConnectionCtx::GetInstance()" being NULL. i.e: - crash is read from address 0x8. - Inside "PeerConnectionCtx" class, "mPeerConnections" should have offset 0x8. Bigger callstack would have been useful here...
Found other similar bugs like: https://crash-stats.mozilla.com/report/index/a74f05d5-40f5-4450-9fa9-551ae2131109 https://crash-stats.mozilla.com/report/index/27725a5a-aebc-4672-8ff5-64c7c2131108 Is this a lone occurance? (stack trace ending up in PeerCOnnectionWrapper?) If so, issue is likely in "NativeInterface2JSObjectAndThrowIfFailed" (jump at random).
I don't know.
Created attachment 830101 [details] [diff] [review] corruption.diff Fixes for issues: https://crash-stats.mozilla.com/report/index/a74f05d5-40f5-4450-9fa9-551ae2131109 https://crash-stats.mozilla.com/report/index/27725a5a-aebc-4672-8ff5-64c7c2131108 Issue here is a side effect. Problem is with jump from: 2. mozilla::dom::NativeInterface2JSObjectAndThrowIfFailed To 1. vcmSetIceMediaParams_m This happens because "cache" may point to invalid data. In crash: https://crash-stats.mozilla.com/report/index/27725a5a-aebc-4672-8ff5-64c7c2131108 Frame 4: nsJSEventListener::HandleEvent(nsIDOMEvent *) Lines: JS::Value retval = handler->Call(mTarget, *(aEvent->InternalDOMEvent()), rv); May have "mTarget" invalid. That is because we just store a pointer and not a smart pointer. Quick & dirty fix - store a nsCOMPtr In crash: https://crash-stats.mozilla.com/report/index/a74f05d5-40f5-4450-9fa9-551ae2131109 We have a bigger problem: - nsGlobalWindow inherits multiple sources of "nsISupports". (which is used for reference count) - This inheritance is not virtual. (CORBA does not allow virtual inheritance, classes which we derive are generated from idl files) - Current code only protected one super class (and not all :) ) - - We only protected: "nsIScriptGlobalObject" superclass. - If ANY of the other superclasses are deleted, main class is destroyed as well via virtual destructor. Actual crash is likely due to: - "RunTimeoutHandler" destroying one of the other superclasses. - That caused main class to be destroyed as well (disregarding our protection) - Corrupted data trickled down to another "RunTimeoutHandler" - That second call crashed. Fix is relatively simple, since we can't have multiple inheritance, we protect all "nsISupports" derived superclasses. When protector class is destroyed, it needs to stop when a delete happens. Notes: - This is all in DOM code - Please ask a area owner to review. - This might have security implications... - I had no idea where to place the MultipleInheritance protector class, so it might need moving (if approved) - Wanted to keep explanation short and to the point, I can detail further if anyone wants.
Additional note: - I'm new to this, so please add details regarding next steps (ex: what I need to test, etc)
Created attachment 830703 [details] [diff] [review] corruptionfix.patch Request review, please add any other persons that need to review this. Details are in previous comments. (This seems like a DOM issue and I don't know who to add to foreview)
Comment on attachment 830703 [details] [diff] [review] corruptionfix.patch Review of attachment 830703 [details] [diff] [review]: ----------------------------------------------------------------- bz: not my area... You should check if the analysis seems valid. Seems like a quite fundamental change; if the analysis is correct, we should also consider alternatives. Adrian: Thanks for taking a run at such a confusing crash. Even if the analysis turns out to be wrong (and it can be tough to diagnose a blind crash report like this), it's great that you're doing this. I'll be more available on IRC (#media is best) in a few days, as I'm traveling for standards work at the moment.
Comment on attachment 830703 [details] [diff] [review] corruptionfix.patch > - nsGlobalWindow inherits multiple sources of "nsISupports". (which is used for reference > count) Not quite. nsISupports is a pure virtual class with no data members. The reference count lives in concrete classes as needed. Looking at nsGlobalWindow, it inherits from mozilla::dom::EventTarget, a bunch of interfaces (which have no members), PRCListStr, and nsSupportsWeakReference. nsSupportsWeakReference and PRCListStr have no refcount members. mozilla::dom::EventTarget has no refcount member, and inherits from nsIDOMEventTarget and nsWrapperCache. Neither of those has a refcount member either. nsGlobalWindow _does_ however declare a refcount member: NS_DECL_CYCLE_COLLECTING_ISUPPORTS and that's the one reference count window objects have. >- - We only protected: "nsIScriptGlobalObject" superclass. No, because its addref/release are virtual so calling them calls nsGlobalWindow::AddRef/Release. So the MultipleInheritanceCOMPtr bit should not be necessary. If it actually fixes something (do we have evidence that it does?) something is fishy. As for the mTarget of nsIJSEventListener.h, there is a comment right before the class declaration: 168 // Note, mTarget is a raw pointer and the owner of the nsIJSEventListener object 169 // is expected to call Disconnect()! Is that not happening?
Crash-stats seems to be down, so I can't look at the reports above, but I assume that we have no reliable way to reproduce testcase for any of these crashes?
OK, crash-stats is back up. The stacks in comment 4 are quite different from the original stack this bug was filed on. We should get a separate bug filed on those... That said, I agree that even the stack in comment 0 has nothing to do with PeerConnection per se. It seems to be corrupt memory in a place where that just shouldn't be happening.
bz - thanks a lot for the review and explanations, had some trouble following the refcount. As for this crash, the callstack seems too small to figure out what would be corrupted, That's why I was looking at other crash reports.
Overtaken by events. In any case, thanks Adrian for taking a run at such a confusing bug