Closed
Bug 1130061
Opened 9 years ago
Closed 9 years ago
Firefox crash in webrtc::videocapturemodule::DeviceInfoDS::CreateCapabilityMap(char const*)
Categories
(Core :: WebRTC: Audio/Video, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla40
People
(Reporter: marcia, Assigned: away)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
972 bytes,
patch
|
benjamin
:
review+
Sylvestre
:
approval-mozilla-aurora+
Sylvestre
:
approval-mozilla-release+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-b296e542-44fa-41bb-bcc4-f81b02150203. ============================================================= Seen while reviewing Beta 6 crash stats - almost all URLs are facebook.com. All URLs are here: https://crash-stats.mozilla.com/report/list?signature=webrtc::videocapturemodule::DeviceInfoDS::CreateCapabilityMap%28char%20const*%29 Frame Module Signature Source 0 xul.dll webrtc::videocapturemodule::DeviceInfoDS::CreateCapabilityMap(char const*) media/webrtc/trunk/webrtc/modules/video_capture/windows/device_info_ds.cc 1 xul.dll webrtc::videocapturemodule::DeviceInfoImpl::NumberOfCapabilities(char const*) media/webrtc/trunk/webrtc/modules/video_capture/device_info_impl.cc 2 xul.dll webrtc::ViEInputManager::NumberOfCaptureCapabilities(char const*) media/webrtc/trunk/webrtc/video_engine/vie_input_manager.cc 3 xul.dll webrtc::ViECaptureImpl::NumberOfCapabilities(char const*, unsigned int) media/webrtc/trunk/webrtc/video_engine/vie_capture_impl.cc 4 xul.dll mozilla::MediaEngineWebRTCVideoSource::SatisfiesConstraintSets(nsTArray<mozilla::dom::MediaTrackConstraintSet const*> const&) dom/media/webrtc/MediaEngineWebRTCVideo.cpp 5 xul.dll mozilla::VideoDevice::SatisfiesConstraintSets(nsTArray<mozilla::dom::MediaTrackConstraintSet const*> const&) dom/media/MediaManager.cpp 6 xul.dll mozilla::GetSources<mozilla::VideoDevice, mozilla::VideoTrackConstraintsN> dom/media/MediaManager.cpp 7 xul.dll mozilla::GetUserMediaTask::SelectDevice(mozilla::MediaEngine*) dom/media/MediaManager.cpp 8 xul.dll NS_SetThreadName<9>(nsIThread*, char const (&)[9]) xpcom/glue/nsThreadUtils.h 9 xul.dll mozilla::MediaEngineWebRTC::MediaEngineWebRTC(mozilla::MediaEnginePrefs&) dom/media/webrtc/MediaEngineWebRTC.cpp 10 xul.dll mozilla::GetUserMediaTask::Run() dom/media/MediaManager.cpp 11 xul.dll MessageLoop::DoWork() ipc/chromium/src/base/message_loop.cc 12 @0x5
[Tracking Requested - why for this release]: Somewhat high crash (#31, 0.42%) on 38 beta. Might not be worth tracking, your call.
status-firefox38:
--- → affected
status-firefox39:
--- → affected
status-firefox40:
--- → affected
tracking-firefox38:
--- → ?
For some reason |streamConfig| is 0xF0 in these crashes. Any idea?
Flags: needinfo?(rjesup)
Comment 3•9 years ago
|
||
Most of those are in https://www.facebook.com/videocall/incall/?... Almost 50% of those crashes are in the first 60 seconds of Firefox uptime, making it rank high in the importance-rated top crash score stats I'm experimenting with (at least for 38.0b5).
Updated•9 years ago
|
Flags: needinfo?(blassey.bugs)
Comment 6•9 years ago
|
||
I believe Bas is the person who wrote this code and is the best person to assess what might be going on and what our options are. This code hasn't changed in quite some time; so it may be due to a graphics driver update. I'll needinfo Milan too in case he has more info on that.
Flags: needinfo?(rjesup)
Flags: needinfo?(mreavy)
Flags: needinfo?(milan)
Flags: needinfo?(blassey.bugs)
Flags: needinfo?(bas)
Comment 7•9 years ago
|
||
So, looking at one of the crashes: RELEASE_AND_CLEAR(streamConfig); However, RELEASE_AND_CLEAR(p) -> if (p) { (p)->Release(); (p) = NULL; } So it shouldn't be a null streamConfig. and streamConfig is initialized to NULL before we fetch it. Plus nothing modifies streamConfig after it's fetched. I presume therefore that Release() was inlined somehow and is actually crashing there via a hidden null pointer
This is a COM interface so Release is a virtual call: 620df032 8b4c2410 mov ecx,dword ptr [esp+10h] ; get p 620df036 85c9 test ecx,ecx ; if (p == nullptr) 620df038 740b je ... ; skip the call 620df03a 8b01 mov eax,dword ptr [ecx] ; load vtable (*CRASH HERE*) 620df03c 51 push ecx ; push 'this' 620df03d ff5008 call dword ptr [eax+8] ; call Release The crash is while deref'ing streamConfig to get its vtable. And the address is always 0xf0: Address facet Rank Address Count % 1 0xf0 2707 100.00 %
I agree though that streamConfig must have been valid at some earlier point in the function. Maybe the stack contents (or the stack pointer) got busted sometime after that?
Comment 10•9 years ago
|
||
(In reply to David Major [:dmajor] from comment #9) > I agree though that streamConfig must have been valid at some earlier point > in the function. > > Maybe the stack contents (or the stack pointer) got busted sometime after > that? Seems unlikely (and a very predictable failure if so for stack trashing). Perhaps it was somehow released behind our back? Or some driver bug, though this doesn't seem linked to a specific OS version or driver/manufacturer. Perhaps we're not doing something to lock it down. just grabbing at straws here....
Assignee | ||
Comment 11•9 years ago
|
||
The trouble is it's a pointer on the stack. Releasing behind our back wouldn't change that pointer. That's why I too am grabbing at straws :)
Comment 12•9 years ago
|
||
(In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #6) > I believe Bas is the person who wrote this code and is the best person to > assess what might be going on and what our options are. This code hasn't > changed in quite some time; so it may be due to a graphics driver update. > I'll needinfo Milan too in case he has more info on that. I do not recall writing this code? Did I write this code? I don't think I did....
Flags: needinfo?(bas)
Comment 13•9 years ago
|
||
bas: bug 731407: Remove DShow BaseClass usage from webrtc drop r=cpearce 3 years ago....
Assignee | ||
Comment 14•9 years ago
|
||
Looking at the registers in several dumps, the value of |count| is always 99 (0x63), and |tmp| is some huge number. That could be more evidence for stack bustage. (I also noticed that the StringFromGUID2 block is missing arguments on the logging, and seems to leak |pmt|, but those shouldn't affect this issue)
Assignee | ||
Comment 15•9 years ago
|
||
So this magic 0xf0 is always right next to a 0x140, in other words 320x240. https://pastebin.mozilla.org/8831233
Assignee | ||
Comment 16•9 years ago
|
||
99/0x63 is apparently kVideoUnknown
Assignee | ||
Comment 17•9 years ago
|
||
At this point I'd recommend stepping through this code on a live build with an unknown media subtype, and seeing how the stack layout compares to that pastebin.
Assignee | ||
Comment 18•9 years ago
|
||
The stack where we expect streamConfig, actually looks like the params for a GetFrameRateList: 1df4f4cc 0aa60908 ; videoControlConfig this ptr 1df4f4d0 0aa60a04 ; outputCapturePin, in similar memory region 1df4f4d4 00000000 ; tmp 1df4f4d8 00000140 ; size.cx 1df4f4dc 000000f0 ; size.cy (supposed to be streamConfig) 1df4f4e0 1df4f548 ; &listSize, on stack 1df4f4e4 1df4f504 ; &frameDurationList, on stack Btw - frameDurationList leaks!
Comment 19•9 years ago
|
||
Note that sequence will be right below the current SP when we go into (or out of) that call. Stepping through a Nightly, I do see a sequence like that in the stack at the CALL for GetFrameRateList(); in my case, stepping over the CALL trashed the outputCapturePin stack arg (and popped them all off). I can't see any sequence where we get from here to RELEASE_AND_CLEAR(streamConfig) with this stack still intact, or even close. Feels like someone trashed the stack pointer... perhaps while in GetFrameRateList(), so when we pop out it's wrong (but that might hose the return address too) or it got trashed at some later point in the function. Neither makes a ton of sense. It's interesting that tmp is 0. I did try hosing the subtype so it went through the GUID trace; no trashing I could see. Is there any way we could contact a few of the reporters and find out what HW/drivers they're running (especially cameras)?
Flags: needinfo?(kairo)
Comment 20•9 years ago
|
||
> Btw - frameDurationList leaks!
It's not a pointer to memory you're supposed to free, so it doesn't.
We do leak pmt if the subtype is unknown (when we 'continue'), and in that case we also use stack garbage for the width/height in the TRACE debug, though that shouldn't matter to this bug.
Updated•9 years ago
|
Rank: 5
Priority: -- → P1
Assignee | ||
Comment 21•9 years ago
|
||
> > Btw - frameDurationList leaks!
> It's not a pointer to memory you're supposed to free, so it doesn't.
Hm, the doc for IAMVideoControl::GetFrameRateList says:
The caller is responsible for freeing the memory through a call to CoTaskMemFree.
Comment 22•9 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #19) > Is there any way we could contact a few of the reporters and find out what > HW/drivers they're running (especially cameras)? The graphics chips we have in reports, but stuff like cameras not. That said, I'm not sure how much end users even know about what they actually are running, and we rarely reach out to them at all, and with not too much success when we do, in my experience - but then, all outreach to users on crashes has been done via the User Advocacy team so far. All that said, could you add info on interesting hardware/drivers to crash reports with annotations?
Flags: needinfo?(kairo)
Comment 23•9 years ago
|
||
(In reply to David Major [:dmajor] from comment #21) > > > Btw - frameDurationList leaks! > > It's not a pointer to memory you're supposed to free, so it doesn't. > > Hm, the doc for IAMVideoControl::GetFrameRateList says: > The caller is responsible for freeing the memory through a call to > CoTaskMemFree. Huh. Missed the Remark. THanks! Not the cause of this problem, though
Comment 24•9 years ago
|
||
> All that said, could you add info on interesting hardware/drivers to crash
> reports with annotations?
Any pointers to how that's done?
Flags: needinfo?(kairo)
Assignee | ||
Comment 25•9 years ago
|
||
If you want to add a new field, that would be CrashReporter::AnnotateCrashReport. There's also a free-form notes area that you can access through CrashReporter::AppendAppNotesToCrashReport (mostly used by the gfx team). To see how they end up looking in crash-stats, you can go to bp-b296e542-44fa-41bb-bcc4-f81b02150203 and click the Metadata tab.
Flags: needinfo?(kairo)
Comment 26•9 years ago
|
||
(In reply to David Major [:dmajor] from comment #25) > There's also a free-form notes area that you can access through > CrashReporter::AppendAppNotesToCrashReport (mostly used by the gfx team). New fields are preferred to that, though (ideally, I'd like us to fade out the free-form notes field, it's a leftover from the times when adding fields and searching for those wasn't as easy as now).
Assignee | ||
Comment 27•9 years ago
|
||
Argh, I should have checked for correlations sooner: 100% (175/175) vs. 0% (175/64366) vwcsource.ax (1.5.0.0) Image path: C:\Program Files (x86)\WebcamMax\vwcsource.ax Timestamp: Tue Dec 29 22:01:53 2009 (4B3AC2A1) Looks like they're all with version 1.5 from 2009. Should we just block it?
Comment 28•9 years ago
|
||
(In reply to David Major [:dmajor] from comment #27) > Looks like they're all with version 1.5 from 2009. Should we just block it? Probably!
Assignee | ||
Comment 29•9 years ago
|
||
Flags: needinfo?(milan)
Attachment #8597297 -
Flags: review?(benjamin)
Updated•9 years ago
|
Attachment #8597297 -
Flags: review?(benjamin) → review+
Assignee: nobody → dmajor
Keywords: checkin-needed
Comment 30•9 years ago
|
||
David, could you fill the uplift request now so that it is ready? (I will start the build of beta 8 during the week end).
Comment 31•9 years ago
|
||
dmajor: thanks! https://hg.mozilla.org/integration/mozilla-inbound/rev/c09eaddb6740
Keywords: checkin-needed
Comment 32•9 years ago
|
||
FWIW, the next beta for 38 is going to build on Sunday to be released on Monday, if we can get this to m-r before that, would be awesome to see if it works to prevent the crash for the beta audience.
Assignee | ||
Comment 33•9 years ago
|
||
Approval Request Comment [Feature/regressing bug #]: external app [User impact if declined]: crashes in webrtc [Describe test coverage new/current, TreeHerder]: pending [Risks and why]: the usual blocklisting risks (app is broken, may not fail gracefully, etc) [String/UUID change made/needed]: none
Assignee | ||
Comment 34•9 years ago
|
||
Comment on attachment 8597297 [details] [diff] [review] Blocklist Bad paste; please see above for approval request
Attachment #8597297 -
Flags: approval-mozilla-beta?
Attachment #8597297 -
Flags: approval-mozilla-aurora?
Comment 35•9 years ago
|
||
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #32) > FWIW, the next beta for 38 is going to build on Sunday to be released on > Monday, if we can get this to m-r before that, would be awesome to see if it > works to prevent the crash for the beta audience. If a merge doesn't happen, I'll hand-merge the patch over to m-c to ensure it's in tomorrow's nightly.
Comment 36•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/c09eaddb6740
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Comment 37•9 years ago
|
||
NI to kairo to check crash stats since it should be in m-c today; next Beta build will be Sunday
Flags: needinfo?(kairo)
Comment 38•9 years ago
|
||
Comment on attachment 8597297 [details] [diff] [review] Blocklist [Triage Comment] we merged m-b into m-r
Attachment #8597297 -
Flags: approval-mozilla-release+
Attachment #8597297 -
Flags: approval-mozilla-beta?
Attachment #8597297 -
Flags: approval-mozilla-aurora?
Attachment #8597297 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 40•9 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #37) > NI to kairo to check crash stats since it should be in m-c today; next Beta > build will be Sunday It's going to take a while to verify this; it's super low volume on nightly (no surprise). Even on aurora it will need 2-3 builds before I'd feel confident in the numbers.
Comment 41•9 years ago
|
||
(In reply to David Major [:dmajor] from comment #40) > It's going to take a while to verify this; it's super low volume on nightly > (no surprise). Even on aurora it will need 2-3 builds before I'd feel > confident in the numbers. That said, if we want this fix in 38 release, we should uplift it until Wednesday this week (yes, only two days left) as the only build done after that is the hopefully-final release candidate.
Flags: needinfo?(kairo)
Comment 42•9 years ago
|
||
And it has the approval for that anyhow. I'll surely watch Beta 38 data as always, let's leave it as a reminder to check for this one specifically.
Flags: needinfo?(kairo)
Assignee | ||
Comment 43•9 years ago
|
||
Yeah, given the nature of the crash and the fix, I don't think we need to block on a verify to put this on 38 (and it's already a+ for that). Just setting expectations for verification later.
Updated•9 years ago
|
status-firefox38.0.5:
--- → fixed
Comment 46•9 years ago
|
||
I can confirm that this signature is not recorded for 38.0b9 any more.
Flags: needinfo?(kairo)
You need to log in
before you can comment on or make changes to this bug.
Description
•