Closed
Bug 1340718
Opened 7 years ago
Closed 7 years ago
If a audio device (Citrix) disappears and fails on re-open attempts, output can be hung until restart
Categories
(Core :: Audio/Video: MediaStreamGraph, defect, P1)
Core
Audio/Video: MediaStreamGraph
Tracking
()
People
(Reporter: jesup, Assigned: padenot)
References
(Depends on 1 open bug)
Details
Attachments
(3 files, 1 obsolete file)
1.83 KB,
patch
|
jesup
:
review+
|
Details | Diff | Splinter Review |
2.72 KB,
patch
|
jesup
:
review+
Sylvestre
:
approval-mozilla-esr45+
Sylvestre
:
approval-mozilla-esr52+
|
Details | Diff | Splinter Review |
1.80 KB,
patch
|
jcristau
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
If using an audio device such as a Citrix pseudo-mic & speakers, if the Citrix connection disconnects, the audio device fails and returns error codes, and also we see devices being removed. When cubeb tries to reopen devices, it can end up in an ERROR state (3), and after that audio cannot be restarted since datacallbacks will not happen again.
Reporter | ||
Updated•7 years ago
|
Rank: 12
Comment hidden (mozreview-request) |
Comment 2•7 years ago
|
||
mozreview-review |
Comment on attachment 8839520 [details] Bug 1340718 - Retry getting the default device when audio devices are present and the first call failed. https://reviewboard.mozilla.org/r/114136/#review115750 ::: media/libcubeb/src/cubeb_wasapi.cpp:709 (Diff revision 1) > - if (FAILED(hr)) { > + if (FAILED(hr)) { > - LOG("Could not get default audio endpoint: %x\n", hr); > + LOG("Could not get default audio endpoint: %x\n", hr); > - SafeRelease(enumerator); > + Sleep(100); > - return hr; > - } > + } > + } while (FAILED(hr) && hr != E_NOTFOUND); It's hard to review this without information about how it was debugged with a description of what', specifically, is happening. What error is being returned from GetDefaultAudioEndpoint in this situation? This fix seems a bit hacky and my concern is there are other situations where we get an error here and instead of falling back to no sound we end up stuck in this loop. Which caller of get_default_endpoint does this matter for? If this kind of loop/sleep is a necessary workaround, it might be better handled by the caller. ::: media/libcubeb/src/cubeb_wasapi.cpp:713 (Diff revision 1) > - } > + } > + } while (FAILED(hr) && hr != E_NOTFOUND); > > SafeRelease(enumerator); > > return ERROR_SUCCESS; There's now a situation where GetDefaultAudioEndpoint fails (hr == E_NOTFOUND) where we used to return E_NOTFOUND and now return ERROR_SUCCESS.
Attachment #8839520 -
Flags: review?(kinetik) → review-
Comment 3•7 years ago
|
||
If this is happening after a device loss event where we try to reconfigure but the new target/no available device is ready to configure, I think a better approach would be to have a "settle" delay between the device loss event (which we don't control) and when we attempt to reconfigure. IIRC, Chromium has something like this (a single 200ms delay, last time I looked) to try to deal with this sort of thing - the basic problem being that we trigger reconfigure on the first device loss event but there can be a stream of device change events over time and there's no logical beginning or end of them at which to start the reconfigure process.
Assignee | ||
Comment 4•7 years ago
|
||
I forgot to send my message explaining the patch, sorry. Also, thinking again about it, I think this is not ideal (for the reason you found), I'll come up with something better. Also, yes, I've implemented some sort of debouncing for the reconfigure events: we receive a bunch of notification, and only reconfigure once, after things have settled. This is sitting on a branch on my windows machine for now.
Assignee | ||
Comment 5•7 years ago
|
||
Much better patch, only touching `GraphDriver.cpp`. As far as cubeb is concerned, the job is done: the error state is being signaled properly. The graph driver now decides to fallback to a `SystemClockDriver`. The graph then decides to re-try periodically to switch to a new `AudioCallbackDriver` (because at least one track needs to output audio). I removed the error case about returning when we can't get the preferred latency, this makes us go into the case where we fallback to a `SystemClockDriver`. STR: - On windows, disable all the audio devices, apart from a device that can be unplugged (I used a USB headset) - Run any demo that outputs sound and uses an MSG (web audio/webrtc, whatever) - Notice that there is sound - Unplug the USB device - At this point, the machine has no audio devices, falls back to a system thread, notice that with this patch, the graph still runs (can't output audio though) - Re-plug the USB device, notice that audio starts to flow again, with correct timing (no buffering or anything) I think this is similar with what the folks we're talking with experience with Citrix. This is easily back-portable, and is a mozilla-central patch for now.
Attachment #8840522 -
Flags: review?(rjesup)
Reporter | ||
Comment 6•7 years ago
|
||
Comment on attachment 8840522 [details] [diff] [review] patch Review of attachment 8840522 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/GraphDriver.cpp @@ +1048,5 @@ > AudioCallbackDriver::StateCallback(cubeb_state aState) > { > LOG(LogLevel::Debug, ("AudioCallbackDriver State: %d", aState)); > + if (aState == CUBEB_STATE_ERROR) { > + // Fall back to a driver using a normal thread. Add a comment that device changes (re-plug) will cause it to reopen and switch back to an AudioCallbackDriver @@ +1051,5 @@ > + if (aState == CUBEB_STATE_ERROR) { > + // Fall back to a driver using a normal thread. > + MonitorAutoLock lock(GraphImpl()->GetMonitor()); > + SystemClockDriver* nextDriver = new SystemClockDriver(GraphImpl()); > + SetNextDriver(nextDriver); Comment that we can't use SwitchOnNextIteration since there won't be one.
Attachment #8840522 -
Flags: review?(rjesup) → review+
Reporter | ||
Comment 7•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0e4f67858bb71c39351342f5e3fe9ab96ee0c789
Reporter | ||
Updated•7 years ago
|
Component: Audio/Video: cubeb → Audio/Video: MediaStreamGraph
Assignee | ||
Comment 8•7 years ago
|
||
Same fix, changed a few calls so that it compiles on ESR45, tested, it still works as intended.
Attachment #8840916 -
Flags: review?(rjesup)
Pushed by paul@paul.cx: https://hg.mozilla.org/integration/mozilla-inbound/rev/71bbaf9a0d0f When the cubeb stream errors out while driving a MediaStreamGraph, fall back to a SystemClockDriver. r=jesup
Reporter | ||
Comment 10•7 years ago
|
||
Comment on attachment 8840916 [details] [diff] [review] patch for esr45 Review of attachment 8840916 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/GraphDriver.cpp @@ +892,5 @@ > + mNextDriver = nextDriver; > + DebugOnly<bool> found = mGraphImpl->RemoveMixerCallback(this); > + nextDriver->SetGraphTime(this, mIterationStart, mIterationEnd); > + mGraphImpl->SetCurrentDriver(nextDriver); > + nextDriver->Start(); Clone the comment about SwitchAtNextIteration.
Attachment #8840916 -
Flags: review?(rjesup) → review+
Comment 11•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/71bbaf9a0d0f
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Reporter | ||
Comment 12•7 years ago
|
||
Comment on attachment 8840916 [details] [diff] [review] patch for esr45 [Approval Request Comment] If this is not a sec:{high,crit} bug, please state case for ESR consideration: This is critical to Intuit who is using Firefox for doing video CSR support for tax preparation (and also other Intuit products). They use ESR, and are a prime example of an ESR user. All updates to Firefox are done centrally on the Citrix servers, and they try to avoid CSRs being bugged by Firefox update requests they can't resolve. User impact if declined: When a CSR has a network glitch (or citrix glitch) and loses connectivity for a few seconds, they'll have to restart the browser on the server, and likely will lose contact with the customer and have to have them call in again. Fix Landed on Version: 54. Now that it's verified it will be uplifted to all versions. We're asking for ESR52 as well. Risk to taking this patch (and alternatives if risky): Very low risk patch; uses the normal sequence to restart the MediaStreamGraph on cubeb/driver failure callback. String or UUID changes made by this patch: none See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8840916 -
Flags: approval-mozilla-esr52?
Attachment #8840916 -
Flags: approval-mozilla-esr45?
Comment 13•7 years ago
|
||
Not sure I understand why we need to uplift it to 45? This isn't a new regression. Can Intuit workaround the issue?
Comment 14•7 years ago
|
||
We don’t usually uplift patch that isn’t a security fix this late in ESR. We can keep this open, and if we end up with a dot release it could ride along.
Comment 15•7 years ago
|
||
Comment on attachment 8840916 [details] [diff] [review] patch for esr45 This isn't something that we usually do but here, Randell explained me the rational (partner related) and it makes sense to have this exception.
Attachment #8840916 -
Flags: approval-mozilla-esr52?
Attachment #8840916 -
Flags: approval-mozilla-esr52+
Attachment #8840916 -
Flags: approval-mozilla-esr45?
Attachment #8840916 -
Flags: approval-mozilla-esr45+
Comment 16•7 years ago
|
||
Polo, I guess we want to uplift that to all other branches, right? Merci et bon week end!
Flags: qe-verify+
Flags: needinfo?(padenot)
Comment 17•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/3aae8d906162
Comment 18•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr45/rev/bfa75fc20c2b
Assignee | ||
Comment 19•7 years ago
|
||
(In reply to Sylvestre Ledru [:sylvestre] from comment #16) > Polo, I guess we want to uplift that to all other branches, right? > Merci et bon week end! Yes, thanks.
Flags: needinfo?(padenot)
Assignee | ||
Comment 20•7 years ago
|
||
Attachment #8839520 -
Attachment is obsolete: true
Attachment #8847656 -
Flags: approval-mozilla-beta?
Comment 21•7 years ago
|
||
Comment on attachment 8847656 [details] [diff] [review] Beta patch Thanks Paul, let's get this in 53.0b3
Attachment #8847656 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 22•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/f64241177060
Updated•7 years ago
|
Comment 23•7 years ago
|
||
[Tracking Requested - why for this release]:
tracking-firefox-esr45:
--- → ?
tracking-firefox-esr52:
--- → ?
Updated•7 years ago
|
Comment 24•7 years ago
|
||
Still an issue on Fx 53b10, Win 10 x64: 1. Open https://mozilla.github.io/webrtc-landing/gum_test.html 2. Click 'audio' 3. Unplug the USB headset Actual results: "Firefox has stopped working" error message displayed. Closing the error pop-up also closes Firefox. Browser console log after the browser restarts: "IndexedDB UnknownErr: ActorsParent.cpp:594 (unknown) UnknownError IndexedDBHelper.jsm:71:12 PushService:stateChangeProcessEnqueue: Error transitioning state UnknownError PushService.jsm:160" Also, there are no crashes in about:crashes. Thoughts?
Flags: needinfo?(padenot)
Assignee | ||
Comment 25•7 years ago
|
||
Paul, this is in contradiction with https://bugzilla.mozilla.org/show_bug.cgi?id=1354528#c10, are you sure you're testing with beta 10 ?
Flags: needinfo?(padenot) → needinfo?(paul.silaghi)
Comment 26•7 years ago
|
||
Yes, but I tested on https://archive.mozilla.org/pub/firefox/candidates/53.0b10-candidates/build2/win32/en-US/ which was build 07-Apr-2017, while Emil tried on a tinderbox build from 09-Apr-2017. So, I guess bug 1354528 is a follow-up for this, but only for the beta branch?
Flags: needinfo?(paul.silaghi) → needinfo?(padenot)
Comment 27•7 years ago
|
||
Anyway, 54.0a2 (2017-04-10) and 55.0a1 (2017-04-09) don't crash like beta, but after re-plug the headset, I can no longer hear myself on https://mozilla.github.io/webrtc-landing/gum_test.html / audio.
Assignee | ||
Comment 28•7 years ago
|
||
Paul, please open a new bug about this.
Flags: needinfo?(padenot) → needinfo?(paul.silaghi)
Comment 29•7 years ago
|
||
Filed bug 1355773. Meanwhile dropping qe-verify+, will verify later when bug 1355773 gets fixed.
Flags: qe-verify+
Flags: needinfo?(paul.silaghi)
You need to log in
before you can comment on or make changes to this bug.
Description
•