1541425 - (audio-sharing) Implement audio capture for getDisplayMedia

wilhelm.wanecek

Reporter

Description

•

5 years ago

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0

Steps to reproduce:

Attempted requesting screen capture with audio constraints, e.g. { video: true, audio: true }.

Actual results:

No audio tracks are obtained (i.e. MediaStream#getAudioTracks() returns an empty list), since the feature is not implemented yet.

Expected results:

Firefox should implement the part of the Screen Capture specification[1] describing audio capture. AFAIU, Chromium will implement this as of Chrome 68 [2]. There's some discussion about this feature-set in bug #1321221 [3].
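
For reference, the steps to reproduce can be sketched roughly as below. This is only an illustration: `captureWithOptionalAudio` and the small `TrackLike`/`StreamLike`/`DisplayMediaSource` interfaces are hypothetical names introduced here so the snippet compiles outside a browser. Only `getDisplayMedia`, `getAudioTracks` and `getVideoTracks` come from the actual APIs.

```typescript
// Minimal structural types (hypothetical) mirroring the parts of
// MediaDevices / MediaStream that this sketch touches.
interface TrackLike {
  kind: string;
}

interface StreamLike {
  getAudioTracks(): TrackLike[];
  getVideoTracks(): TrackLike[];
}

interface DisplayMediaSource {
  getDisplayMedia(constraints: { video: boolean; audio: boolean }): Promise<StreamLike>;
}

// Request screen capture with audio and report whether any audio track
// was actually returned. A video-only stream is a valid outcome, so the
// caller has to check rather than assume.
async function captureWithOptionalAudio(
  source: DisplayMediaSource,
): Promise<{ stream: StreamLike; hasAudio: boolean }> {
  const stream = await source.getDisplayMedia({ video: true, audio: true });
  const hasAudio = stream.getAudioTracks().length > 0;
  return { stream, hasAudio };
}
```

In a page this would presumably be called as `captureWithOptionalAudio(navigator.mediaDevices)`. With the behaviour reported here, `hasAudio` would come back `false`.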

wilhelm.wanecek

Reporter

Updated

•

5 years ago

Type: defect → task

Component: Untriaged → WebRTC: Audio/Video

Product: Firefox → Core

Alex Chronopoulos [:achronop]

Comment 1

•

5 years ago

Can you help me triage this one?

Flags: needinfo?(jib)

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Comment 2

•

5 years ago

I think this is low priority for us at the moment. This is a MAY in the spec, i.e. it's not mandatory to implement, so we are spec compliant without this feature.

While we do have some old disabled audio-capture code in the tree that we might be able to revive from the Firefox Hello days, I believe that code specifically exposed tab audio, and we don't currently expose tab sharing, which is what it was designed to go along with.

Flags: needinfo?(jib)

Priority: -- → P3

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Updated

•

5 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Updated

•

5 years ago

Type: task → enhancement

wilhelm.wanecek

Reporter

Comment 3

•

5 years ago

Ok, thanks a lot for explaining the situation :) Chromium is in a different position since they already had audio capture in their previous, non-standard implementation (although only for tabs on Mac & Linux).

Paul Adenot (:padenot)

Comment 5

•

4 years ago

The spec is really hand-wavy:

In the case of audio, the user agent MAY present the end-user with audio sources to share. Which choices are available to choose from is up to the user agent, and the audio source(s) are not necessarily the same as the video source(s). An audio source may be a particular application, window, browser, the entire system audio or any combination thereof. Unlike mediadevices.getUserMedia() with regards to audio+video, the user agent is allowed not to return audio even if the audio constraint is present. If the user agent knows no audio will be shared for the lifetime of the stream it MUST NOT include an audio track in the resulting stream. The user agent MAY accept a request for audio and video by only returning a video track in the resulting stream, or it MAY accept the request by returning both an audio track and a video track in the resulting stream. The user agent MUST reject audio-only requests.

We don't know what to do, and authors and users don't have real guarantees, so I guess we can do the following to maximize usefulness:

On OSes where we support monitor device/loopback stream, we can offer device monitoring (we have the capability in cubeb for Pulse and WASAPI)
We can offer an input stream for the page's output (this is implemented already)
We might be able to offer the browser's output, but it's more complicated (we'd need to implement a browser-wide multi-process mixer)
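
The rules in the quoted spec paragraph can be modelled as a small decision function. This is a sketch under assumptions, not how any browser implements it: `tracksForRequest`, `DisplayRequest` and the `audioWillBeShared` flag are hypothetical names, and throwing a `TypeError` for requests without video is an assumption about the error type rather than something stated in the quote.

```typescript
// Hypothetical model of a getDisplayMedia request.
interface DisplayRequest {
  video: boolean;
  audio: boolean;
}

type TrackKind = "audio" | "video";

// Decide which track kinds the resulting stream would contain.
// `audioWillBeShared` stands in for "the user agent has an audio source
// and the user agreed to share it" (hypothetical flag).
function tracksForRequest(req: DisplayRequest, audioWillBeShared: boolean): TrackKind[] {
  if (!req.video) {
    // Audio-only requests must be rejected (error type assumed here).
    throw new TypeError("getDisplayMedia requires video");
  }
  const kinds: TrackKind[] = ["video"];
  // An audio track is included only when it was requested AND audio will
  // actually be shared; otherwise it must be left out of the stream.
  if (req.audio && audioWillBeShared) {
    kinds.push("audio");
  }
  return kinds;
}
```

The point of the sketch is that `{ video: true, audio: true }` has two valid results, which is why authors get no guarantee of an audio track.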

github

Comment 6

•

4 years ago

Just chiming in here: we're creating a product that wants to rely on this feature (the audio sharing in particular), so it's definitely a bummer to see Firefox behind Edge and Chrome here. Would love to see at least tab-audio sharing support.

github

Comment 7

•

4 years ago

(I'm assuming caniuse is correct when it shows Firefox not supporting this at all: https://caniuse.com/#feat=mdn-api_mediadevices_getdisplaymedia_audio-capture-support)

Comment hidden (advocacy)

Arash

Comment 10

•

3 years ago

Hi there. Is there a way to offer a bounty for this feature? The lack of this feature in Firefox is the only reason I ever open up Chrome anymore. Thanks.

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Updated

•

3 years ago

Depends on: 1685232

Alfredo

Comment 11

•

3 years ago

Any update?

sworddragon2

Comment 12

•

2 years ago

I was a bit puzzled when I streamed via the web version of Discord and others told me they couldn't hear any sound. I guess this ticket explains why it does not work.

(In reply to Paul Adenot (:padenot) from comment #5)

The spec is really hand-wavy:

In the case of audio, the user agent MAY present the end-user with audio sources to share. Which choices are available to choose from is up to the user agent, and the audio source(s) are not necessarily the same as the video source(s). An audio source may be a particular application, window, browser, the entire system audio or any combination thereof. Unlike mediadevices.getUserMedia() with regards to audio+video, the user agent is allowed not to return audio even if the audio constraint is present. If the user agent knows no audio will be shared for the lifetime of the stream it MUST NOT include an audio track in the resulting stream. The user agent MAY accept a request for audio and video by only returning a video track in the resulting stream, or it MAY accept the request by returning both an audio track and a video track in the resulting stream. The user agent MUST reject audio-only requests.

We don't know what to do, and authors and users don't have real guarantees, so I guess we can do the following to maximize usefulness:

On OSes where we support monitor device/loopback stream, we can offer device monitoring (we have the capability in cubeb for Pulse and WASAPI)

We can offer an input stream for the page's output (this is implemented already)

We might be able to offer the browser's output, but it's more complicated (we'd need to implement a browser-wide multi-process mixer)

This one seems actually pretty straightforward. When getDisplayMedia() causes the permission dialog to pop up and the user selects between a specific window or the entire system for screen capture, the same choice should be transparently applied to the sound as well if the user opts in. E.g. the only change in this permissions dialog would be a checkbox that adds sound capture to the video capture:

With capturing sound from the specific window if the user chooses a window.
Or capturing the entire system sound if the user chooses a system-wide capture.

That is pretty simple and probably what the user would expect most, and it would be simple to use and quite efficient. But the difficulty is in the details: tracking the correct sound source for the chosen window might be tricky if the recorded window uses a different process for its audio, or if complex sound systems (possibly PulseAudio and others) are involved.
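
The checkbox proposal above amounts to a simple mapping from the chosen video source to an audio scope. A minimal sketch, assuming hypothetical names (`audioScopeFor`, `AudioScope`, `shareAudioChecked`). The `"window"` and `"monitor"` values are chosen to resemble the `displaySurface` setting from the Screen Capture spec, but nothing here reflects actual Firefox code.

```typescript
// Video source picked in the permission dialog (only the two cases
// discussed in the comment above).
type DisplaySurface = "window" | "monitor";

// Audio that would accompany the capture (hypothetical type).
type AudioScope = "none" | "window" | "system";

// Map the dialog choice to an audio scope: audio follows the video
// source, but only if the user ticked the opt-in checkbox.
function audioScopeFor(surface: DisplaySurface, shareAudioChecked: boolean): AudioScope {
  if (!shareAudioChecked) {
    return "none";
  }
  return surface === "window" ? "window" : "system";
}
```

The mapping itself is trivial. The hard part noted above, finding which audio streams belong to a given window, is not covered by this sketch.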

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Updated

•

2 years ago

Blocks: tab-sharing

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Updated

•

2 years ago

No longer blocks: tab-sharing

Depends on: tab-sharing

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Comment 13

•

2 years ago

We can offer an input stream for the page's output (this is implemented already)

We should focus on this use case, as it is the most urgent. People want to share e.g. a YouTube tab with their audience and already expect audio.

The simplest and fastest would be to block on bug 1651145 and add ☑ share tab audio to our tab-sharing UX, for parity with Chrome (demo).

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Jan-Ivar Bruaroey [:jib] (needinfo? me)

Updated

•

1 year ago

Blocks: 1803665

Jim Mathies [:jimm]

Updated

•

1 year ago

Alias: audio-sharing

Jim Mathies [:jimm]

Updated

•

11 months ago

Duplicate of this bug: 1837017

Jim Mathies [:jimm]

Updated

•

9 months ago

Depends on: 1178751, 1264333, 1633428, 1633436, 1641585, 1685233, 1844181, 1156472

Karl Tomlinson (:karlt)

Updated

•

6 months ago

Depends on: 1864067