Open Bug 1388586 Opened 8 years ago Updated 1 year ago

Implement the sampleRate microphone constraint

Categories

(Core :: WebRTC: Audio/Video, enhancement, P3)

enhancement

Tracking


Tracking Status
firefox57 --- affected

People

(Reporter: jib, Unassigned)

References

Details

(Keywords: stale-bug)

Spec[1]: "The sample rate in samples per second for the audio data."

[1] https://w3c.github.io/mediacapture-main/getusermedia.html#def-constraint-sampleRate

(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #0)
> Spec[1]: "The sample rate in samples per second for the audio data."
>
> [1] https://w3c.github.io/mediacapture-main/getusermedia.html#def-constraint-sampleRate

Does this mean we need to resample to any arbitrary sample rate provided by the user? There are assumptions in the webrtc.org code that sample rate is one of 8000, 16000, 32000, 44100 or 48000. Or am I misunderstanding the intent of this constraint?
I suspect if they ask for an exact rate that we don't support internally we can just return overconstrained ;-)
For instance, the DTMF code assumes a fixed set of sample rates [1]. That's just the tone generation code which I'm sure could be redone in a more readable if perhaps less efficient way, but there might be other places in the code base that make similar assumptions.

[1] http://searchfox.org/mozilla-central/rev/e5b13e6224dbe3182050cf442608c4cb6a8c5c55/media/webrtc/trunk/webrtc/modules/audio_coding/neteq/dtmf_tone_generator.cc#38
As jesup touches on, the constraints model and the spec don't mandate any particular output at all; they just set up the mechanism to query or discover what a mic can output. I.e. it doesn't mean we have to support arbitrary values. E.g. we could limit to outputting 8000, 16000, 32000, 44100 and 48000 if we find internal limitations, and still be spec compliant.

That said, I believe the plan is to support whatever cubeb can give us. We're also adding the AudioContext({sampleRate}) JS option (bug 1387454), which AFAIK might mean that peer connection needs to handle inputs of arbitrary sample rates. I don't know. cc Paul. In either case, we should be sure to test what breaks. Thanks for the heads up!
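To illustrate the point above, here is a minimal sketch of how a constraint could be resolved against a limited set of rates. The names `SUPPORTED_RATES` and `resolveSampleRate` are hypothetical and not taken from Firefox; the rate list is simply the one mentioned in this thread, and the distance measure is a simplification of the spec's fitness distance.

```javascript
// Hypothetical supported set: the rates this thread says the webrtc.org code assumes.
const SUPPORTED_RATES = [8000, 16000, 32000, 44100, 48000];

// Build an error shaped like the OverconstrainedError that getUserMedia rejects with.
function makeOverconstrained(constraint) {
  const err = new Error("No supported value satisfies constraint: " + constraint);
  err.name = "OverconstrainedError";
  err.constraint = constraint;
  return err;
}

// Resolve a sampleRate constraint (bare number, or {exact, ideal, min, max})
// against the supported rates. A bare number is treated as "ideal".
function resolveSampleRate(constraint, supported = SUPPORTED_RATES) {
  const c = typeof constraint === "number" ? { ideal: constraint } : (constraint ?? {});

  // Required members (exact, min, max) filter the candidates.
  const candidates = supported.filter((rate) =>
    (c.exact === undefined || rate === c.exact) &&
    (c.min === undefined || rate >= c.min) &&
    (c.max === undefined || rate <= c.max));

  if (candidates.length === 0) {
    throw makeOverconstrained("sampleRate");
  }
  if (c.ideal === undefined) {
    // Arbitrary choice for this sketch: take the first remaining candidate.
    return candidates[0];
  }
  // Simplification: nearest by absolute difference, not the spec's relative distance.
  return candidates.reduce((best, rate) =>
    Math.abs(rate - c.ideal) < Math.abs(best - c.ideal) ? rate : best);
}
```

Under these assumptions, `resolveSampleRate({ ideal: 22050 })` settles on a nearby supported rate, while `resolveSampleRate({ exact: 22050 })` throws, which is the "just return overconstrained" behaviour suggested earlier.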
Flags: needinfo?(padenot)
Cubeb will get you anything in [8000, 192000] on any platform, for output or input; it just gives you what you ask for. We can resample anything to anything, at any point, so it's just a matter of making a decision. It's really unclear what this constraint is for, because the only way to look into a MediaStream is to use the Web Audio API, and the Web Audio API will resample all inputs to its own sample-rate. One way to observe this, I suppose, is to notice that the high partials will be cut off because of the lower sample-rate.
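As a rough illustration of "resample anything to anything" and of why high partials disappear at a lower rate, here is a naive sketch. It is not what Firefox or cubeb do internally: `resampleLinear` and `nyquistHz` are hypothetical helpers, and a production resampler would apply a low-pass filter before downsampling to avoid aliasing.

```javascript
// Highest frequency a given sample rate can represent (the Nyquist frequency).
// Partials above this are lost after resampling to that rate.
function nyquistHz(sampleRate) {
  return sampleRate / 2;
}

// Naive linear-interpolation resampler, for illustration only.
// It has no anti-aliasing filter, so it is not suitable for real audio work.
function resampleLinear(input, fromRate, toRate) {
  if (fromRate <= 0 || toRate <= 0) {
    throw new RangeError("Sample rates must be positive");
  }
  if (input.length === 0) {
    return new Float32Array(0);
  }
  const outLength = Math.max(1, Math.round((input.length * toRate) / fromRate));
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample

  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    // Clamp both neighbours to the last valid input index.
    const left = Math.min(Math.floor(pos), input.length - 1);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}
```

For example, audio resampled to 16000 Hz can carry content only up to `nyquistHz(16000)`, i.e. 8000 Hz, regardless of what the microphone originally captured.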
Flags: needinfo?(padenot)
Rank: 25
Priority: P1 → P2
Mass change P2->P3 to align with new Mozilla triage process.
Priority: P2 → P3

Reminder to check the privacy.resistFingerprinting pref as well as bug 1528042 when implementing this feature, since it might expose system sample rates through track.getSettings().
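For context, this is roughly how a page would read the value in question once the feature exists. `track.getSettings()` is the standard MediaStreamTrack method; `reportedSampleRate` is a hypothetical helper, and whether the member is present, absent, or replaced with a generic value under privacy.resistFingerprinting is an open implementation question, not something this sketch settles.

```javascript
// Return the sample rate a track reports via getSettings(), or null when the
// browser does not expose one (as is the case until this bug is implemented).
function reportedSampleRate(track) {
  const settings = track.getSettings();
  return typeof settings.sampleRate === "number" ? settings.sampleRate : null;
}
```

In a page this would be called as `reportedSampleRate(stream.getAudioTracks()[0])`, which is why the reported number can act as a fingerprinting signal if it mirrors the system rate.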

See Also: → 1674892
Severity: normal → S3

(In reply to Paul Adenot (:padenot) from comment #5)

> Cubeb will get you anything in [8000, 192000] on any platform, for output or
> input, it just gives you what you ask for.
>
> We can resample anything to anything, at any point, so it's just a matter of
> making a decision.
>
> It's really unclear what this constraint is for, because the only way to
> look into a MediaStream is to use the Web Audio API, and the Web Audio API
> will resample all inputs to its own sample-rate. One of the way to observe
> this is to notice that the high partials will be cut off because of the
> lower sample-rate I suppose.

Firefox does not support resampling a MediaStream to an AudioContext (DOMException: AudioContext.createMediaStreamSource: Connecting AudioNodes from AudioContexts with different sample-rate is currently not supported).
As far as I can tell, it is currently not possible to reliably read raw data from the microphone, since the sample rates must match but there is no way to know the microphone's sampleRate.
(This is actually seriously frustrating, MediaStream is a joke 😠)

(In reply to mat.reiner from comment #8)

> (In reply to Paul Adenot (:padenot) from comment #5)
>
> > Cubeb will get you anything in [8000, 192000] on any platform, for output or
> > input, it just gives you what you ask for.
> >
> > We can resample anything to anything, at any point, so it's just a matter of
> > making a decision.
> >
> > It's really unclear what this constraint is for, because the only way to
> > look into a MediaStream is to use the Web Audio API, and the Web Audio API
> > will resample all inputs to its own sample-rate. One of the way to observe
> > this is to notice that the high partials will be cut off because of the
> > lower sample-rate I suppose.
>
> Firefox does not support resampling MediaStream to an AudioContext (domexception: AudioContext.createMediaStreamSource: Connecting AudioNodes from AudioContexts with different sample-rate is currently not supported)
> That I could find, it is currently not possible to reliably read raw data from microphone, since the sampleRates must match but there is no way to know the microphone's sampleRate
> (This is actually seriously frustrating, MediaStream is a joke 😠)

I'm also trying to get raw PCM data from a MediaStream source. Short of Firefox implementing the sampleRate microphone constraint (which would of course be ideal), having the ability to determine the default sample rate in advance would be extremely helpful. It seems to usually be 44100, but I don't know how consistent that is across platforms, so I'm hesitant to rely on it.

Just take the AudioContext sample-rate into account for everything, because regardless of what will happen, the microphone data will be provided at that rate.

You cannot access the PCM from the microphone without an AudioContext, and when an AudioContext is created and a getUserMedia stream connected to it, everything underneath is reconfigured to have an input/output duplex stream that has the lowest latency possible and the least resampling possible. Depending on the OS, it's possible to open the audio interface at a native rate, skipping resampling altogether. At a system level, both input and output have the same rate in Firefox, like it is generally done in audio software.

The sampleRate constraint on the microphone is in general not useful.

> Just take the AudioContext sample-rate into account for everything, because regardless of what will happen, the microphone data will be provided at that rate.

Good to know, thanks. By this I assume you mean:

const ctx = new AudioContext()
const sampleRateToRelyOn = ctx.sampleRate

I agree we don't necessarily need the sampleRate constraint (although it is in the spec), but it would be nice if we could set the sampleRate of the AudioContext and get audio at that sample rate from the media stream. See this JS Fiddle which breaks on Firefox: https://jsfiddle.net/mx9gonuv/8/

Existing issue here: https://bugzilla.mozilla.org/show_bug.cgi?id=1725336
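Until that issue is fixed, one possible workaround is to try the requested rate and fall back to the context's default rate if the connection throws. This is only a sketch: `connectMicrophone` and its `deps` parameter are hypothetical (the dependencies are injected so the function can be exercised outside a browser), and it assumes the failure surfaces as an exception from `createMediaStreamSource`, as the error message quoted earlier in this thread suggests.

```javascript
// Try to connect a microphone stream to an AudioContext at requestedRate.
// If the browser refuses (e.g. mismatched sample rates), fall back to a
// context at the default rate and report the rate actually in use.
async function connectMicrophone(requestedRate, deps) {
  const { AudioContextCtor, getUserMedia } = deps;
  const stream = await getUserMedia({ audio: true });

  let ctx = new AudioContextCtor({ sampleRate: requestedRate });
  try {
    const source = ctx.createMediaStreamSource(stream);
    return { ctx, source, sampleRate: ctx.sampleRate };
  } catch (err) {
    // Release the unusable context before creating the fallback one.
    await ctx.close();
    ctx = new AudioContextCtor();
    const source = ctx.createMediaStreamSource(stream);
    return { ctx, source, sampleRate: ctx.sampleRate };
  }
}
```

In a page, the call would look like `connectMicrophone(16000, { AudioContextCtor: AudioContext, getUserMedia: (c) => navigator.mediaDevices.getUserMedia(c) })`. Callers should then use the returned `sampleRate` for all further processing rather than the rate they asked for.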

There are lots of things in specs that are useless, and I say this as a spec author :-).

Re. the fact that connections sometimes don't work: this is a known issue that we plan to fix.

There were a lot of things to change underneath to make all this work without having the latency values go up horribly, breaking quite a few use cases that rely on low-latency. As of late last year, those are done and working, and we can focus on bug 1725336. I don't have an estimate to give you, however.
