837539 - WebRTC crash [@VDCVideoNewPlugIn]

Reporter

Description

12 years ago

Attached file testcase — Details

This seems to be a crash in the device driver of MacOS.

newton % sw_vers
ProductName: Mac OS X
ProductVersion: 10.8.2
BuildVersion: 12C3006

Before the crash happens we can see in the console:

Sun Feb 3 22:32:42 2013 CMIO_Unit_Output_Base.cpp:197:RenderBus something has gone wrong!
Sun Feb 3 22:32:42 2013 CMIO_Unit_Output_Base.cpp:199:RenderBus idx = 0, mNumInputs = 1, mNumInputsAllocatedFor = 0
Sun Feb 3 22:32:42 2013 CMIO_Unit_Output_Base.cpp:201:RenderBus theInput = 0x1211dec80, mPullInputOnNextRound = 0x0, mInputHasSeenEndOfData = 0x0
[...]

The testcase will not crash in a debug/non-opt build.

Tested with m-i changeset: 120689:eae4b34eb792 and -O2
Tested with m-c changeset: 120354:2cc710018b14 and -O2

Christoph Diehl [:posidron]

Reporter

Comment 1

12 years ago

Attached file callstack — Details

Randell Jesup [:jesup] (needinfo me)

Comment 2

12 years ago

Very fast start/stop of video here.... Perhaps we're triggering an OS/Mac Foundation bug or bug in the webrtc.org code by stopping capture before anything is captured, or some such. It's possible 3.20 will fix it. A workaround may be to enforce a minimum 'on' period (or number of frames captured) before stop() takes effect, especially as it's known that debug builds don't crash.
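
The proposed workaround could look roughly like the sketch below. It is only an illustration of the idea, assuming a hypothetical helper class (`MinOnGuard`, with hypothetical `OnStart`/`OnFrame`/`MayStop` hooks) that the capture code would consult before letting stop() proceed; nothing here is taken from the webrtc.org sources, and the thresholds are made up.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of the webrtc.org code): it only decides
// whether a stop() request may take effect yet.
class MinOnGuard {
 public:
  using Clock = std::chrono::steady_clock;

  MinOnGuard(std::chrono::milliseconds min_on, int min_frames)
      : min_on_(min_on), min_frames_(min_frames) {
    assert(min_on.count() >= 0 && min_frames >= 0);
  }

  // Call when capture starts; resets the frame counter.
  void OnStart(Clock::time_point now) {
    started_ = now;
    frames_ = 0;
    running_ = true;
  }

  // Call once for every frame the device delivers.
  void OnFrame() {
    if (running_) ++frames_;
  }

  // True once the minimum 'on' period has elapsed OR enough frames
  // were captured. Also true if capture is not running at all.
  bool MayStop(Clock::time_point now) const {
    if (!running_) return true;
    return (now - started_) >= min_on_ || frames_ >= min_frames_;
  }

  // Call when stop() has actually been carried out.
  void OnStop() { running_ = false; }

 private:
  std::chrono::milliseconds min_on_;
  int min_frames_;
  Clock::time_point started_{};
  int frames_ = 0;
  bool running_ = false;
};
```

A caller would defer the real stop until `MayStop(Clock::now())` returns true, for example by re-checking from a timer. Whether a delay like this actually avoids the driver crash is untested here.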

Assignee: nobody → rjesup

Priority: -- → P1

Whiteboard: [getUserMedia][blocking-gum+][webrtc-uplift]

Henrik Skupin [:whimboo][⌚️UTC+2] (away 10/03 - 10/13)

Updated

12 years ago

Flags: in-testsuite?

Jason Smith [:jsmith]

Comment 3

12 years ago

We're trying to ship this as part of FF 20. So noming for tracking.

tracking-firefox20: --- → ?

tracking-firefox21: --- → ?

bhavana bajaj [:bajaj]

Updated

12 years ago

status-firefox20: --- → affected

status-firefox21: --- → affected

tracking-firefox20: ? → +

tracking-firefox21: ? → +

Maire Reavy [:mreavy]

Comment 4

12 years ago

(In reply to Randell Jesup [:jesup] from comment #2)
> It's possible 3.20 will fix it.

Christoph -- 3.20 just landed yesterday. Can you retest and see if this is still happening? Thanks.

Flags: needinfo?(cdiehl)

Christoph Diehl [:posidron]

Reporter

Comment 5

12 years ago

Still works for me.

Flags: needinfo?(cdiehl)

Jason Smith [:jsmith]

Comment 6

12 years ago

(In reply to Christoph Diehl [:cdiehl] from comment #5)
> Still works for me.

Okay. Does it work for you on Aurora as well?

Christoph Diehl [:posidron]

Reporter

Comment 7

12 years ago

With current Aurora we see the same error messages but no crash. It does not look like a real ASan specific catch here, will test it later with an ASan build of Aurora though.

Christoph Diehl [:posidron]

Reporter

Comment 8

12 years ago

I have now tested this with an ASan debug build of Aurora, but it passes the testcase as well. The build I used is:
https://people.mozilla.com/~choller/firefox/asan/20130214-mozilla-aurora-macosx64-debug-d083267a188a+asan.html

I currently do not have enough disk space left to fetch the Aurora tree and make an optimized build out of it. All my previous builds are optimized, and the testcase does crash in those builds.

Jason Smith [:jsmith]

Comment 9

12 years ago

Let's close this as a works for me then. If we end up reproducing later with some other configuration, then let's reopen.

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → WORKSFORME

Randell Jesup [:jesup] (needinfo me)

Comment 10

12 years ago

My opt asan mac build just finished (inbound). The second click on "crash" crashed with the same signature.

Status: RESOLVED → REOPENED

Resolution: WORKSFORME → ---

Randell Jesup [:jesup] (needinfo me)

Comment 11

12 years ago

Jib is seeing

Thu Feb 14 17:44:37 2013 CMIO_Graph.cpp:8839:HandleRenderNotify CMIOGraph::HandleRenderNotify() called but graph is not initialized!
http://pastebin.mozilla.org/2139860

where I'm crashing with the RenderBus: something has gone wrong! error. He's running OSX 10.8.2, I'm on 10.7. What are others running where the crash is seen or not seen in an opt asan build on Mac?

Could be:
a) OS bug fixed in 10.8 (or a missing OS trap of bad inputs added in 10.8)
b) compiler bug that causes bad inputs (see a)
c) code in mac drivers in this edge case (fast open/close) has a bug that simply passed bad args to the OS
???

Christoph Diehl [:posidron]

Reporter

Comment 12

12 years ago

(In reply to Christoph Diehl [:cdiehl] from comment #0)
> newton % sw_vers
> ProductName: Mac OS X
> ProductVersion: 10.8.2
> BuildVersion: 12C3006

It's not a)

Randell Jesup [:jesup] (needinfo me)

Comment 13

12 years ago

cdiehl: I assume that means your crashes were under 10.8?

d) different Mac Camera drivers react differently

I have a MacBook Pro laptop, non-retina, Core i7. (how do I get driver/HW info from it?) I know jib has a newer mac

Christoph Diehl [:posidron]

Reporter

Comment 14

12 years ago

(In reply to Randell Jesup [:jesup] from comment #13)
> cdiehl: I assume that means your crashes were under 10.8?

Correct

> I have a MacBook Pro laptop, non-retina, Core i7. (how do I get driver/HW
> info from it?) I know jib has a newer mac

$ system_profiler | more

Randell Jesup [:jesup] (needinfo me)

Comment 15

12 years ago

I got the same error as jib in bug 815231 when running the testcase (constant errors like that). May be a timing issue too

Randell Jesup [:jesup] (needinfo me)

Comment 16

12 years ago

FaceTime HD Camera (Built-in):
  Product ID: 0x8509
  Vendor ID: 0x05ac (Apple Inc.)
  Version: 5.16
  Serial Number: CC2B7C06LLDGFKL0
  Speed: Up to 480 Mb/sec
  Manufacturer: Apple Inc.
  Location ID: 0xfa200000 / 3
  Current Available (mA): 500
  Current Required (mA): 500

System Version: Mac OS X 10.7.5 (11G56)
Kernel Version: Darwin 11.4.2

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Comment 17

12 years ago

Yes I have a Retina MacBook Pro laptop "Mid-2012" Core i7 and I'm NOT crashing.

FaceTime HD Camera (Built-in):
  Product ID: 0x8510
  Vendor ID: 0x05ac (Apple Inc.)
  Version: 80.25
  Serial Number: CC2C9Q0K4NDN9KE0
  Speed: Up to 480 Mb/sec
  Manufacturer: Apple Inc.
  Location ID: 0x1a110000 / 3
  Current Available (mA): 500
  Current Required (mA): 500

System Version: OS X 10.8.2 (12C3006)
Kernel Version: Darwin 12.2.1

Randell Jesup [:jesup] (needinfo me)

Comment 18

12 years ago

I'm failing on every second click of "Crash". Interestingly, that was the pattern I saw with bug 815231 - one would succeed, the second would fail with errors in the logs, and repeat for an hour.

I got a webrtc_trace:65535 log from it, and it closed the capture device as the last thing it did. Interestingly, this is the same point where we were having issues with shutdown hangs due to sending events to the mainthread in the QTKit code. I'm adding the "proxy-release-to-mainthread" patch to this build to see if it helps.

Randell Jesup [:jesup] (needinfo me)

Comment 19

12 years ago

To the newly cc'd people (smichaud, joedrew, bjacob): this seems to be deep in Mac-land, and the QTKit docs suck at this level. We could use some assistance in attacking this. Google searches on these error messages aren't finding much.

With the shutdown hang patch that remotes the shutdown to mainthread, the problem still happens, perhaps even more easily or with more error messages before the crash. And oddly, others like jib can't reproduce it at all. (I strongly suspect an ASAN build is not needed to hit this, though an opt build may be.)

Any help or pointers would be greatly appreciated. The code talking to the mac stuff is deep in media/webrtc/trunk, in .mm files. Thanks!

Joe Drew (not getting mail)

Comment 20

12 years ago

+Jeff, who has done a lot of very deep debugging in places like this before. Before you do anything else, though, I would send this testcase to Apple through our developer contacts (Steven should know how to use our developer resources) in order to ensure they know they've got a bug in their driver.

Randell Jesup [:jesup] (needinfo me)

Comment 21

12 years ago

Joe - thanks, that makes sense and matches what I told akeybl. It seems totally related to how fast you open and close the driver.

Randell Jesup [:jesup] (needinfo me)

Comment 22

12 years ago

FYI, I also tried waiting until isRunning changed after startRunning/stopRunning, with no effect
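
For reference, "waiting until isRunning changed" amounts to a bounded poll like the sketch below. This is an assumption-laden illustration, not the code that was tried: `WaitForRunningState` is a hypothetical name, and the `is_running` callback stands in for whatever the capture session reports after startRunning/stopRunning.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: polls is_running() until it reports `expected`
// or `timeout` elapses. Returns true if the expected state was seen.
bool WaitForRunningState(
    const std::function<bool()>& is_running, bool expected,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds poll = std::chrono::milliseconds(5)) {
  assert(is_running && poll.count() > 0);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (is_running() != expected) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // State never changed within the timeout.
    }
    std::this_thread::sleep_for(poll);
  }
  return true;
}
```

The timeout keeps the caller from blocking forever if the state never flips. As noted above, waiting this way had no effect on the crash.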

testcase, 12 years ago, Christoph Diehl [:posidron], 301 bytes, text/html
callstack, 12 years ago, Christoph Diehl [:posidron], 5.31 KB, text/plain
.mozconfig for asan, 12 years ago, Randell Jesup [:jesup] (needinfo me), 1.82 KB, text/plain
OS X 10.6.8 crash stack on 15" Early 2011 MBP, 12 years ago, Steven Michaud [:smichaud] (Retired), 23.44 KB, text/plain
OS X 10.7.5 crash stack on 15" Early 2011 MBP, 12 years ago, Steven Michaud [:smichaud] (Retired), 21.44 KB, text/plain
Wierd workaround, likely not a fix, 12 years ago, Steven Michaud [:smichaud] (Retired), 2.89 KB, patch
Another wierd workaround, definitely not a fix, 12 years ago, Steven Michaud [:smichaud] (Retired), 844 bytes, application/x-gzip
Fix (partial, not yet for autorelease pool errors), 12 years ago, Steven Michaud [:smichaud] (Retired), 1.62 KB, patch
Interpose library for logging, 12 years ago, Steven Michaud [:smichaud] (Retired), 2.26 KB, application/x-gzip
Full fix (including for autorelease pool errors), 12 years ago, Steven Michaud [:smichaud] (Retired), 10.75 KB, patch, jesup : review+
Full fix (get rid of _poolInfo instead of commenting it out), 12 years ago, Steven Michaud [:smichaud] (Retired), 10.64 KB, patch, smichaud : review+ BenWa : review+
Full fix (what I actually landed), 12 years ago, Steven Michaud [:smichaud] (Retired), 12.52 KB, patch, smichaud : review+ lsblakk : approval-mozilla-aurora+ lsblakk : approval-mozilla-beta+