Closed Bug 1849108 Opened 1 year ago Closed 1 year ago

Feed output audio from non-default sinks (setSinkId) to AEC

Categories

(Core :: Audio/Video: MediaStreamGraph, task, P1)

Tracking

RESOLVED FIXED
123 Branch
Tracking Status
relnote-firefox --- 123+
firefox123 --- fixed

People

(Reporter: pehrsons, Assigned: karlt)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: dev-doc-needed)

Attachments

(1 file)

The webrtc::AudioProcessing instance that we use for echo cancellation needs to be fed one input stream and one output stream to do its processing.

There is one webrtc::AudioProcessing per input stream, be it for a default or non-default device.

However, the output stream feeding webrtc::AudioProcessing is always the audio played out in the default MediaTrackGraph, i.e. for the default output device, where the AudioProcessingTrack lives.

setSinkId with a MediaStreamTrack will create a new MediaTrackGraph running on the callbacks of the device with the given sink id, and connect the track corresponding to the played MediaStreamTrack in the default graph to the new graph. There is no path to feed the output of this new graph to the AudioProcessingTrack in the default graph.
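The web-facing path being described can be sketched as follows; a minimal, hedged example where the function name is mine and deviceId is assumed to come from navigator.mediaDevices.enumerateDevices():

```javascript
// Illustrative sketch: playing a MediaStreamTrack on a non-default output
// device via an HTMLAudioElement. Calling setSinkId() is what creates the
// second MediaTrackGraph described above.
async function playTrackOnSink(track, deviceId) {
  const audioEl = new Audio();                  // element lives with the default graph
  audioEl.srcObject = new MediaStream([track]); // wrap the track for playback
  await audioEl.setSinkId(deviceId);           // routes output to the non-default device
  await audioEl.play();
  return audioEl;
}
```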

We'll need to fix this, possibly through bug 1830247.

Karl, would you have cycles to look into this?

Flags: needinfo?(karlt)

Thank you for identifying this issue and linking to the parts involved. I'll look into it.

I assume multiple audio output devices should be treated similarly to having multiple channels in a single output device, except that clocks and rates can differ.
Perhaps adding more channels to a single AudioProcessing would be more efficient than having a series of AudioProcessing instances.

Assignee: nobody → karlt
Flags: needinfo?(karlt)

Output->echo latencies can also differ. I'm not sure how well the AudioProcessing AEC would handle different scenarios there.

Ah, yes, thanks. I wonder whether the AEC handles widely separated speakers.

Blocks: 1848283

Local output from a MediaTrackGraph can come from either a media element or an AudioContext. Currently, non-default outputs can only be set from a media element, but recent spec changes will require this on AudioContext as well.
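The newer spec surface mentioned here is setSinkId() on AudioContext itself; a hedged sketch (the function name is mine, and the feature-detect reflects that the method is not in all releases):

```javascript
// Illustrative sketch: Web Audio output targeting a non-default device
// directly, without going through a media element.
async function renderOnSink(deviceId) {
  const ctx = new AudioContext();
  if (typeof ctx.setSinkId === "function") { // feature-detect the newer API
    await ctx.setSinkId(deviceId);
  }
  const osc = new OscillatorNode(ctx);
  osc.connect(ctx.destination);              // destination now feeds the chosen sink
  osc.start();
  return ctx;
}
```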

Our current echo cancellation mechanism is intended only for use when the input/output pair is on the same graph and (so) same thread.

The underlying AudioProcessing class does have the necessary locking to be used with input and output on different threads. So we could plausibly have an n * m matrix of AudioProcessing, but we don't need this in the common case.

In the common case that the MediaTrackGraph does not have another active audio output device, instead of always creating a CrossGraphPort for non-default output devices, we should be able to run the MediaTrackGraph from the non-default device. This would be better than what we have now: Cubeb may then need to resample, but, unlike a CrossGraphPort, it will not need to drift-correct.

With the non-default-output-run graph of comment 6, if a default-rate AudioContext is then connected to a MediaStreamTrack of the same graph, a CrossGraphPort will be required if output is to be delivered (from the AudioContext) to the default device. (Connections are not currently supported between different rate graphs.)

  • The CrossGraphPort could either be between a separate graph for the AudioContext and the first graph or at the output of the AudioContext running on the first graph.

  • Having the CrossGraphPort on the AudioContext output would be somewhat simpler as an initial step as it would be consistent with the media element output design, though we would still want the CrossGraphPort between graphs for non-default-rate AudioContexts at some point.

  • The benefit of having an AudioContext on the same graph as the microphone track is that its output can be integrated into our existing echo cancellation mechanism, if there is only one output device.

Adding some detection of non-silent output on the AudioContext will be useful to free up the graph to choose the output device that is actually in use. This would also have performance / energy consumption benefits.

Microphones and PeerConnection remote tracks both currently get a graph with default sample rate.

Perhaps their graphs could be determined lazily so that rendering their signal through a non-default audio output device, via media element, would not require resampling from the default graph rate. However, that would be considerable re-engineering, so I do not intend to do this here.

Severity: N/A → S2
Priority: P3 → P1

I am not adding anything to this, just documenting that Whereby is doing feature detection for setSinkId. dbaker and I were having echo issues in Whereby and got a regression range. Disabling setSinkId and reloading the site cleared up the issue.

See Also: → 1865996

Until this bug is fixed in Firefox, we're running with a monkey-patch for setSinkId:

const originalSetSinkId = HTMLAudioElement.prototype.setSinkId;
HTMLAudioElement.prototype.setSinkId = function (sinkId) {
  // Don't call it in Firefox
  if (navigator.userAgent.includes("Firefox"))
    return Promise.resolve(undefined);
  return originalSetSinkId.call(this, sinkId);
};
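For comparison, the workaround above could be factored so the skip decision is unit-testable rather than baked into the prototype patch; a hedged sketch (the factory name is mine):

```javascript
// Illustrative refactor of the monkey-patch above: build the patched function
// with a factory so the skip logic can be tested without a real
// HTMLAudioElement or userAgent string.
function makeSetSinkIdPatch(original, isAffectedBrowser) {
  return function (sinkId) {
    if (isAffectedBrowser) return Promise.resolve(undefined); // skip the call
    return original.call(this, sinkId);                       // otherwise pass through
  };
}
```

HTMLAudioElement.prototype.setSinkId could then be replaced with makeSetSinkIdPatch(originalSetSinkId, navigator.userAgent.includes("Firefox")).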
Depends on: 1869043
Depends on: 1872519

(In reply to Karl Tomlinson (:karlt) from comment #6)

> In the common case that the MediaTrackGraph does not have another active audio output device, instead of always creating a CrossGraphPort for non-default output devices, we should be able to run the MediaTrackGraph from the non-default device. This would be better than what we have now because Cubeb may then need to resample, but unlike CrossGraphPort it will not need to drift correct.

I haven't done that in the patches here. The patches introduce data structures that would support this, but some additional work would be required to Apply() PendingResumeOperations when a graph's secondary output stream is running, so as not to miss the start of Web Audio rendering. There are some other cases to reconsider too, such as (currently) using an AudioCallbackDriver when there are any audio tracks, even when there are no device outputs or inputs. Perhaps that may similarly relate to having a device immediately available for a new audio output.

Pushed by ktomlinson@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fde039d640cb
use a secondary output device for echo cancellation if the primary device has no output r=pehrsons
No longer blocks: 1498512
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 123 Branch

Please don't remove the link to bug 1498512. It's linked as a blocker because ideally setSinkId should have been hooked up to the AEC when it shipped.

Blocks: 1498512

Release Note Request (optional, but appreciated)
[Why is this notable]:
New web platform feature affecting user experience in some cases.
[Affects Firefox for Android]: No
[Suggested wording]:
Audio echo cancellation can now be applied to microphone inputs when the audio output is redirected to another device with setSinkId().
[Links (documentation, blog post, etc)]:
https://developer.mozilla.org/en-US/docs/Web/API/MediaTrackConstraints/echoCancellation
https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaElement/setSinkId
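The scenario the release note describes can be sketched as follows; a hedged example where the function name is mine and remoteStream is assumed to come from, e.g., a PeerConnection:

```javascript
// Illustrative sketch: echo-cancelled microphone capture while call audio
// plays on a user-chosen (non-default) output device.
async function startCallAudio(remoteStream, speakerDeviceId) {
  const mic = await navigator.mediaDevices.getUserMedia({
    audio: { echoCancellation: true },  // AEC applied to the mic input
  });
  const out = new Audio();
  out.srcObject = remoteStream;
  await out.setSinkId(speakerDeviceId); // output on a non-default device
  await out.play();                     // with this fix, the AEC still sees this output
  return mic;
}
```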

dev-doc-needed for https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/123, but users may also notice this.

relnote-firefox: --- → ?
Keywords: dev-doc-needed

Note added to 123 nightly release notes in the Web Platform section, thanks.

Duplicate of this bug: 1875745
See Also: → 1879180