See https://treeherder.mozilla.org/#/jobs?repo=try&revision=d8816ecd1b73&selectedJob=26589082
Full log at https://firstname.lastname@example.org/try-macosx64-debug/try_yosemite_r7-debug_test-mochitest-media-bm108-tests1-macosx-build1187.txt.gz

Tests just started timing out amid operations and ended with a shutdown hang.

Evidence includes:
* The last test that passed was test_getUserMedia_basicWindowShare.html.
* All remaining tests we attempted to run after that timed out.
* We end with a crash at shutdown because shutdown took too long.

Most interesting is the stack of one of the threads at shutdown:

> 06:57:55 INFO - Thread 35
> 06:57:55 INFO - 0 libsystem_kernel.dylib!__psynch_cvwait + 0xa
> 06:57:55 INFO - 1 libnss3.dylib!PR_WaitCondVar [ptsynch.c:d8816ecd1b73 : 396 + 0x8]
> 06:57:55 INFO - 2 XUL!mozilla::CondVar::Wait(unsigned int) [BlockingResourceBase.cpp:d8816ecd1b73 : 499 + 0x8]
> 06:57:55 INFO - 3 XUL!mozilla::camera::CamerasChild::DispatchToParent(nsIRunnable*, mozilla::MonitorAutoLock&) [Monitor.h:d8816ecd1b73 : 40 + 0xe]
> 06:57:55 INFO - 4 XUL!mozilla::camera::LockAndDispatch<int>::LockAndDispatch(mozilla::camera::CamerasChild*, char const*, nsIRunnable*, int const&, int const&) [CamerasChild.cpp:d8816ecd1b73 : 219 + 0x5]
> 06:57:55 INFO - 5 XUL!mozilla::camera::CamerasChild::ReleaseCaptureDevice(mozilla::camera::CaptureEngine, int) [CamerasChild.cpp:d8816ecd1b73 : 201 + 0x1b]
> 06:57:55 INFO - 6 XUL!mozilla::MediaEngineRemoteVideoSource::Deallocate(mozilla::MediaEngineSource::AllocationHandle*) [CamerasChild.h:d8816ecd1b73 : 137 + 0x8]
> 06:57:55 INFO - 7 XUL!mozilla::MediaOperationTask::Run() [MediaManager.cpp:d8816ecd1b73 : 929 + 0x6]

Since MediaEngineRemoteVideoSource is in use here, we can conclude that we're Deallocate()ing a window share or screen share, since mac on try always uses MediaEngineDefaultVideo for camera access.
The evidence is thus pretty clear that this is a leftover from test_getUserMedia_basicWindowShare.html, which was the last test to pass. I have seen and heard reports of gUM requests in the wild not resolving, and this could definitely be related.
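
To illustrate the failure mode the stack suggests: the thread is parked in a condition-variable wait inside DispatchToParent, so if the reply it is waiting for never arrives, the caller stays blocked. Below is a minimal standalone sketch of that pattern in standard C++. It is not the Gecko code: SyncDispatcher, DispatchAndWait and Reply are hypothetical names, and the timeout exists only so the sketch terminates instead of hanging.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a synchronous "dispatch to the other side and
// wait for its reply" helper. Not the actual CamerasChild implementation.
struct SyncDispatcher {
  std::mutex mutex;
  std::condition_variable replyArrived;
  bool gotReply = false;

  // Runs `request` on a worker thread, then blocks until Reply() is called
  // or `timeout` elapses. Returns true only if a reply arrived in time.
  bool DispatchAndWait(std::function<void(SyncDispatcher&)> request,
                       std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex);
    gotReply = false;
    std::thread worker([this, request] { request(*this); });
    // Without the timeout this would be a plain wait(), which never returns
    // if the other side drops the request.
    bool ok = replyArrived.wait_for(lock, timeout,
                                    [this] { return gotReply; });
    lock.unlock();  // let a late Reply() proceed before joining
    worker.join();
    return ok;
  }

  // Called by the other side to wake the waiting caller.
  void Reply() {
    {
      std::lock_guard<std::mutex> lock(mutex);
      gotReply = true;
    }
    replyArrived.notify_one();
  }
};
```

A request handler that calls Reply() makes DispatchAndWait return true; a handler that does nothing makes it return false once the timeout expires, which is the point where an untimed wait would hang.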
I found bug 1285707 which catches the same symptoms as here (windowsharing fails). And bug 1298274 is the same thing for screensharing.
Is this the reason we have a variety of webrtc intermittents that involve a cascading set of timeouts before we hit the 4 timeout and force-kill limit? Seems like a pretty common failure mode these days from what I can see (albeit only really looking at release branches).
Possibly. Or just part of the reason. If you have bug numbers I can take a look.
Assignee: nobody → pehrson
Flags: needinfo?(pehrson) → needinfo?(ryanvm)
Oh yeah. I just had another look in some logs. This is bug 1304270 but for mac. It landed in bug 1286429 (51) with the mac implementation. It's actually fixed in Nightly then but we'll need an aurora uplift.
50 should be unaffected though. If you have such failures I'd like to see them, Ryan.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1304270
No longer blocks: 1286429
No longer blocks: 1304270
Sorry, got a bit ahead of myself. Thanks for cleaning up the mess!