Feed output audio from non-default sinks (setSinkId) to AEC
Categories: Core :: Audio/Video: MediaStreamGraph, task, P1
People: Reporter: pehrsons, Assigned: karlt
References: Depends on 1 open bug, Blocks 1 open bug
Details: Keywords: dev-doc-needed
Attachments: 1 file
The webrtc::AudioProcessing that we use for echo cancellation needs to be fed one input stream and one output stream to do its processing. There is one webrtc::AudioProcessing per input stream, be it for a default or non-default device. However, the output stream feeding webrtc::AudioProcessing is always the audio played out in the default MediaTrackGraph, i.e. for the default output device, where the AudioProcessingTrack lives.

setSinkId with a MediaStreamTrack will create a new MediaTrackGraph running on the callbacks of the device with the given sink id, and connect the track corresponding to the played MediaStreamTrack in the default graph to the new graph. There is no path to feed the output of this new graph to the AudioProcessingTrack in the default graph.

We'll need to fix this, possibly through bug 1830247.
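
To make the scenario concrete, a minimal sketch of the kind of page this affects (the element lookup and device selection are illustrative, not taken from this bug):

async function demo() {
  // Capture a microphone track with echo cancellation requested.
  const mic = await navigator.mediaDevices.getUserMedia({
    audio: { echoCancellation: true },
  });

  // Route other playback to a non-default output device.
  const audioEl = document.querySelector("audio"); // hypothetical <audio> element
  const devices = await navigator.mediaDevices.enumerateDevices();
  const speaker = devices.find(
    (d) => d.kind === "audiooutput" && d.deviceId !== "default"
  );
  if (speaker) {
    // Creates a separate MediaTrackGraph driven by this device's callbacks.
    await audioEl.setSinkId(speaker.deviceId);
  }
  // Before this fix, the AEC feeding mic only received audio played on the
  // default graph, so output rendered via audioEl was not cancelled.
  return mic;
}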
Reporter
Comment 1 • 1 year ago
Karl, would you have cycles to look into this?
Assignee
Comment 2 • 1 year ago
Thank you for identifying this issue and linking to the parts involved. I'll look into it.
I assume multiple audio output devices should be treated similarly to having multiple channels in a single output device, except that clocks and rates can differ.
Perhaps adding more channels to a single AudioProcessing would be more efficient than having a series of AudioProcessing instances.
Reporter
Comment 3 • 1 year ago
Output->echo latencies can also differ. I'm not sure how well the AudioProcessing AEC would handle different scenarios there.
Assignee
Comment 4 • 1 year ago
Ah, yes, thanks. I wonder whether the AEC handles widely separated speakers.
Assignee
Comment 5 • 1 year ago
Local output from a MediaTrackGraph can come from either a media element or an AudioContext. Currently, non-default outputs can only be set from the media element, but recent spec changes will require this on AudioContext as well.

Our current echo cancellation mechanism is intended only for use when the input/output pair is on the same graph and (so) the same thread. The underlying AudioProcessing class does have the necessary locking to be used with input and output on different threads, so we could plausibly have an n * m matrix of AudioProcessing instances, but we don't need this in the common case.
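
For context, the AudioContext side referred to here is the setSinkId() method that the spec changes add to AudioContext itself; a minimal sketch (the oscillator is just illustrative):

async function playOnDevice(deviceId) {
  const ctx = new AudioContext();
  // Route this AudioContext's output to the given (possibly non-default) device.
  await ctx.setSinkId(deviceId);
  const osc = new OscillatorNode(ctx, { frequency: 440 });
  osc.connect(ctx.destination);
  osc.start();
  return ctx;
}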
Assignee
Comment 6 • 1 year ago
In the common case that the MediaTrackGraph does not have another active audio output device, instead of always creating a CrossGraphPort for non-default output devices, we should be able to run the MediaTrackGraph from the non-default device. This would be better than what we have now: Cubeb may then need to resample, but, unlike CrossGraphPort, it will not need to drift correct.
Assignee
Comment 7 • 1 year ago
With the non-default-output-run graph of comment 6, if a default-rate AudioContext is then connected to a MediaStreamTrack of the same graph, a CrossGraphPort will be required if output is to be delivered (from the AudioContext) to the default device. (Connections are not currently supported between graphs of different rates.)

- The CrossGraphPort could be either between a separate graph for the AudioContext and the first graph, or at the output of the AudioContext running on the first graph.
- Having the CrossGraphPort on the AudioContext output would be somewhat simpler as an initial step, as it would be consistent with the media element output design, though we would still want the CrossGraphPort between graphs for non-default-rate AudioContexts at some point.
- The benefit of having an AudioContext on the same graph as the microphone track is that its output can be integrated into our existing echo cancellation mechanism, if there is only one output device.

Adding some detection of non-silent output on the AudioContext will be useful to free up the graph to choose the output device that is actually in use. This would also have performance / energy-consumption benefits.
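
A rough sketch of the configuration this comment describes, an AudioContext tied to the microphone's graph by a MediaStreamTrack connection (the node choices are illustrative only):

function attachContextToMic(micStream) {
  // A default-rate AudioContext; connecting the microphone stream ties it to
  // the graph that carries the microphone track.
  const ctx = new AudioContext();
  const micSource = ctx.createMediaStreamSource(micStream);
  const analyser = new AnalyserNode(ctx);
  micSource.connect(analyser); // analysis only, not routed to output

  // The context also renders something audible to ctx.destination.
  const osc = new OscillatorNode(ctx, { frequency: 440 });
  osc.connect(ctx.destination);
  osc.start();

  // If the microphone's graph is running on a non-default output device
  // (comment 6), delivering ctx.destination to the default device is the
  // case that needs a CrossGraphPort, as described above.
  return ctx;
}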
Assignee
Comment 8 • 1 year ago
Microphones and PeerConnection remote tracks both currently get a graph with the default sample rate. Perhaps their graphs could be determined lazily so that rendering their signal through a non-default audio output device, via a media element, would not require resampling from the default graph rate. However, that would be considerable re-engineering, so I do not intend to do this here.
Comment 9 • 1 year ago
I am not adding anything to this, just documenting that Whereby is doing feature detection for setSinkId. dbaker and I were having echo issues in Whereby and got a regression range. Disabling setSinkId and reloading the site cleared up the issue.
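
The feature detection mentioned is typically something along these lines (a sketch, not Whereby's actual code):

const supportsSetSinkId =
  typeof HTMLMediaElement !== "undefined" &&
  "setSinkId" in HTMLMediaElement.prototype;

if (supportsSetSinkId) {
  // Offer an output-device picker in the UI.
} else {
  // Fall back to the default output device.
}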
Comment 10 • 1 year ago
Until this bug is fixed in Firefox, we're running with a monkey-patch for setSinkId:
const originalSetSinkId = HTMLAudioElement.prototype.setSinkId;
HTMLAudioElement.prototype.setSinkId = function (sinkId) {
  // Don't call it in Firefox
  if (navigator.userAgent.includes("Firefox"))
    return Promise.resolve(undefined);
  return originalSetSinkId.call(this, sinkId);
};
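
Not from the original comment, but a variant of the same workaround that only bypasses setSinkId on Firefox versions without the fix (assuming it ships in Firefox 123, per the release note below):

const originalSetSinkId = HTMLAudioElement.prototype.setSinkId;
HTMLAudioElement.prototype.setSinkId = function (sinkId) {
  const match = navigator.userAgent.match(/Firefox\/(\d+)/);
  // Skip only on Firefox versions assumed to predate the fix (Firefox 123).
  if (match && Number(match[1]) < 123) {
    return Promise.resolve(undefined);
  }
  return originalSetSinkId.call(this, sinkId);
};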
Assignee
Comment 11 • 1 year ago
Assignee
Comment 12 • 1 year ago
Depends on D198237
Assignee
Comment 13 • 1 year ago
(In reply to Karl Tomlinson (:karlt) from comment #6)
> In the common case that the MediaTrackGraph does not have another active
> audio output device, instead of always creating a CrossGraphPort for
> non-default output devices, we should be able to run the MediaTrackGraph
> from the non-default device. This would be better than what we have now:
> Cubeb may then need to resample, but, unlike CrossGraphPort, it will not
> need to drift correct.
I haven't done that in the patches here. The patches introduce data structures that would support this, but some additional work would be required to Apply() PendingResumeOperations when a graph's secondary output stream is running, so as not to miss the start of Web Audio rendering. There are some other cases to reconsider too, such as (currently) using an AudioCallbackDriver when there are any audio tracks, even when there are no device outputs or inputs. Perhaps that may similarly relate to having a device immediately available for a new audio output.
Comment 14 • 1 year ago
Comment 15 • 1 year ago
bugherder
Reporter
Comment 16 • 1 year ago
Please don't remove the link to bug 1498512. It's linked as a blocker because ideally setSinkId should have been hooked up to the AEC when it shipped.
Assignee
Comment 17 • 1 year ago
Release Note Request (optional, but appreciated)
[Why is this notable]:
New web platform feature affecting user experience in some cases.
[Affects Firefox for Android]: No
[Suggested wording]:
Audio echo cancellation can now be applied to microphone inputs when the audio output is redirected to another device with setSinkId().
[Links (documentation, blog post, etc)]:
https://developer.mozilla.org/en-US/docs/Web/API/MediaTrackConstraints/echoCancellation
https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaElement/setSinkId
dev-doc-needed for https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/123, but users may also notice this.
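
For the documentation, a sketch of the combination this change enables (device selection and stream wiring are illustrative): microphone capture with echoCancellation: true now also cancels audio that an element redirected with setSinkId() plays on another device.

async function startCall(remoteStream, speakerDeviceId) {
  // Microphone capture with echo cancellation requested.
  const mic = await navigator.mediaDevices.getUserMedia({
    audio: { echoCancellation: true },
  });

  // Remote audio played on a user-chosen, non-default output device.
  const remoteAudio = new Audio();
  remoteAudio.srcObject = remoteStream;
  await remoteAudio.setSinkId(speakerDeviceId);
  await remoteAudio.play();

  // With this fix, audio rendered on the non-default device is also fed to
  // the echo canceller for the microphone track.
  return { mic, remoteAudio };
}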
Comment 18 • 1 year ago
Note added to 123 nightly release notes in the Web Platform section, thanks.