Closed Bug 953265 Opened 6 years ago Closed 4 years ago

getUserMedia captures at 16Khz

Categories

(Core :: WebRTC: Audio/Video, defect, P3)

26 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox44 --- fixed
Blocking Flags:

People

(Reporter: fluffy, Assigned: jesup)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 5 obsolete files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:26.0) Gecko/20100101 Firefox/26.0 (Beta/Release)
Build ID: 20131205075310

Steps to reproduce:

I have a web page the generates a sweep of the audio frequencies and outputs to the speaker. At the same time it records the level of sound on the microphone and plots the frequency responses. You can find the program / web page at 

https://gist.github.com/fluffy/8149170

All audio content over about 11 KHz gets filtered out. The speakers and microphones being used go well above this so that is not right. 


Actual results:

The output is shown in the image at 

http://www.flickr.com/photos/cullenfluffyjennings/11586861046/

The output more or less hits the noise floor at about 11 KHz. Recording SoundBoooth indicate it should go well beyond this.

This was on mac with external sound USB sound card that supports 24 bit 96KHz samples but was being used in 16 bit 48 KHz mode. Have tried on other macs with different sound cards and got same result. 

The problem appears to be on the microphone input side, not the output to the speakers. 


Expected results:

When I run the same program on same computer with Chrome, I get 

http://www.flickr.com/photos/cullenfluffyjennings/11586861076/

This gives me a relatively flat response out to 15KHz then drops to the noise floor somewhere out around 22KHz which is about what I would expect for the mic and speakers being used. (Speakers are Yamaha HS50M, Mic is AKG 420)
Component: Untriaged → Web Audio
Product: Firefox → Core
This is a limitation of the current getUserMedia implementation.
Status: UNCONFIRMED → NEW
Component: Web Audio → WebRTC: Audio/Video
Ever confirmed: true
OS: Mac OS X → All
Hardware: x86 → All
Summary: Web Audio filters out all high frequency content → getUserMedia captures at 16Khz
Example program no longer works in FF 32.0 or 34.0a2 . It only ever gets zero as the input for every sample from the microphone. Have not debugged to see what is wrong. Still works fine in Chrome 37.0.2062.94
I feel like the problem may be here:
http://dxr.mozilla.org/mozilla-central/source/content/media/webrtc/MediaEngineWebRTCAudio.cpp#23

Just out of curiousity, what happens if we change this to 48000?
(In reply to Eric Rescorla (:ekr) from comment #3)
> I feel like the problem may be here:
> http://dxr.mozilla.org/mozilla-central/source/content/media/webrtc/
> MediaEngineWebRTCAudio.cpp#23

We know very well we've used 16000 for getUserMedia so far (per Maire's comment in Feb).

> Just out of curiousity, what happens if we change this to 48000?

We've been meaning to look at doing so, especially now that we've converted the MediaStreamGraph to run at the "ideal" output frequency we find from the hardware, and landed (in 34) the MSG refactor to be driven by the output hardware.

There are some impacts though, especially to the amount of CPU the AEC takes.  (Roughly 3x for that step; some others (noise suppression) will also expand by 3x, and Opus encode will increase a bit.  We also want to look at the cost/delay imposed by the conversion to the MSG rate, especially if the MSG rate is 44100.  We save resampling time capturing at 44100 when the MSG rate is 44100, but if it goes to a peerconnection, I'm not sure if Opus resamples to 48000 first or not)   However, this is something we want to do, and we should put it into the priority queue.  The change is easy, or at least relatively so.  Checking the impact of the change will take some work.

We may want to keep it lower on mobile devices, or on B2G.
Assignee: nobody → rjesup
Hi Cullen -- We can put a patch up on this bug this week for testing.  I'm thinking we should target Desktop first and see what sort of CPU impact and quality impact it has.  Can you help with testing on Desktop?   

We'll also consider doing this for mobile, but I'm thinking we may want a separate bug for mobile (B2G and Android) since that's a separate testing effort.  What do you think?
Flags: needinfo?(fluffy)
Glad to test on osx desktop but a bit hard for me on other platforms right now. ( and this is not blocking any of our stuff in near term so glad to see this get fixed but not high priority ). One though on AEC load, is there a way to tell GUM to turn off AEC for testing this ?
Flags: needinfo?(fluffy)
(In reply to Cullen Jennings from comment #6)
> Glad to test on osx desktop but a bit hard for me on other platforms right
> now. ( and this is not blocking any of our stuff in near term so glad to see
> this get fixed but not high priority ). One though on AEC load, is there a
> way to tell GUM to turn off AEC for testing this ?

media.getusermedia.aec_enabled = false (in about:config)
Ahhg - I feel silly now :-) Thanks.
Note: we can't currently use PreferredSampleRate() since that will return 44100 or 48000, and L16 in webrtc.org only supports 8/16/32K right now.  We'd like to support the native rate and avoid resamples going into or out of the MediaStreamGraph.
https://tbpl.mozilla.org/?tree=Try&rev=bccb9bcc0769

It would be interesting to see a profile (not the in-gecko profiler - it doesn't catch all threads) of 16K vs 32 on a few platforms - and especially android.  Also feedback on the AEC and audio quality.

Cullen: You can download builds by clicking on the appropriate green 'B', selecting Go to build directory (lower left), then downloading and installing the try build.
Flags: needinfo?(fluffy)
Sorry, try again:  (forgot to re-check after cleanup)

Also bumped "fake audio" (fake:true) from 1K to 8K to help test.

 https://tbpl.mozilla.org/?tree=Try&rev=5d463cb73a50
updated patch that works
Attachment #8491724 - Attachment is obsolete: true
Duplicate of this bug: 1042725
From what I'm reading, is it correct 44k will not be added due to limitation specified in webRTC? If that is there case, would there be any other javascript way to record at 44k?
(In reply to kevin from comment #14)
> From what I'm reading, is it correct 44k will not be added due to limitation
> specified in webRTC? If that is there case, would there be any other
> javascript way to record at 44k?

No, nothing in the spec would limit it; it's more an issue of the way we're grabbing input samples and running the AEC on the input.  (Higher frequency + AEC = more CPU use typically)  Likely we'll go from 16K to 32K first, and look at 44100/48000 as part of a refactor/recoding of the input side.  Also, figuring out what to do about the AEC will be interesting (perhaps limit 44100/48000 to cases where the AEC was purposely disabled).
Cullen, any feedback?
I haven't heard about the change to 44k on any roadmap. Is this still waiting on input refactoring.
Unbitrotted, not tested
Attachment #8493836 - Attachment is obsolete: true
Rank: 35
Flags: firefox-backlog+
Priority: -- → P3
backlog: --- → webRTC+
Flags: firefox-backlog+
See Also: → 935617
updated patch, works, haven't extensively tested calls with it yet
Try at https://jsfiddle.net/tjmcq5gw/1/
Attachment #8576802 - Attachment is obsolete: true
Attachment #8662440 - Flags: feedback?(padenot)
The use cases we were looking at were around ultrasonic pairing and the actual audio signals are in the 15KHz to 20KHz zone. (Some of this is at http://www.cisco.com/c/en/us/products/collaboration-endpoints/intelligent-proximity.html). Over time we are tending to move away from the higher frequencies due to microphone limitations on some devices - I sort of doubt that 32KHz would be enough to get much bandwidth in the range above where people can not hear it. I'm a bit surprised that the AEC CPU use is so rate dependent. We tend to find not much echo of higher frequencies so that the AEC can be computed on a low pass version of the signal then subtracted from wideband version. Or use the AEC low pass version to get the bulk delay then just minor tweak within +/- 1 of the low pass sample rate to fine tune the bulk delay in the full bandwidth version. But ignoring our weird ultrasonic use case, I think 32KHz would be more than enough for nice human to human collaboration. (of course I play all my music at 196Khz and only use Monster cables for my digital links because they sounds better :-)
Flags: needinfo?(fluffy)
I agree 32KHz will work well for the majority of uses of webrtc. That being said, the use it seems Cullen and I are looking for still requires higher frequency.
AEC and audio processing in general is in fact low-passed to 32K, and Noise Reduction and other frequency-dependent bits are as well, except maybe voice activity which may be 16KHz max.  It then upsamples the AEC result to 48K... and it may get downsampled again later. (haven't verified that).  The AEC data is split into two bands: 0-16K, and 16-32K, and it appears to do minimal processing (gain, nlp, comfort noise) on 16-32K.
  
 return WebRtcAec_Init(static_cast<Handle*>(handle), apm_->proc_sample_rate_hz(), 48000);

In general, though, the audio processing code in webrtc.org wants 8K, 16K or 32K, and that's it.  (And for mobile AEC, it limits it to 16KHz samplerate.)

Have you verified that a signal >16KHz will pass through Chrome?

Certainly it's possible to support >16KHz signals (>32K sampling), but it would require more work.  The best way to do so would be to add a 3rd split of 32-48ish KHz, and pass that through as well, or make bigger mods to allow an uneven split with different sample buffer sizes for the low and high bands (0-16K, and 16-48K), which would at least be a 2x ratio.

For added fun, we expect the land the new AEC from Chrome45 shortly.
Comment on attachment 8662440 [details] [diff] [review]
Update webrtc capture rate to 32KHz

Review of attachment 8662440 [details] [diff] [review]:
-----------------------------------------------------------------

I'll have a proper look on Monday, I plan to profile a bunch of Web Audio things (a number of things are changing, including MSG blocking and things from karl), I can fold that in. That said the code looks like it would work, but I don't understand the reason for the 16kHz in MediaEngineWebRTC.h.

Also I think it would be worthwhile to do a complete step-by-step walk in the debugger of a WebRTC processing iteration to really understand what is happening, I'm sure we can discover interesting things and make more informed decisions.

::: dom/media/webrtc/MediaEngineWebRTC.h
@@ +133,5 @@
>      , mCapIndex(aIndex)
>      , mChannel(-1)
>      , mInitDone(false)
>      , mStarted(false)
> +    , mSampleFrequency(16000)

Why suddenly 16kHz here ? There is something I'm missing.
I've profile this carefully, using a Hello call between an OSX 10.10 machine (where the profiler was running) and a Linux machine.

Here are the perf numbers of the interesting threads, three runs with the patch, three runs without the patch. Each time, I open a Hello call, I start the profiler, and I wait for the numbers to stabilize (it takes around 30s):

                        16kHz               32kHz
Capture thread:    5.4%,5.2%,5.5%      6.1%,6.0%,5.9%
MSG thread:        5.6%,5.3%,5.2%      5.6%,6.0%,5.9%
Encoding thread:   6.1%,6.4%,6.6%      6.8%,6.4%,6.2%

In light of this small CPU increase, and because the sound quality is clearly better, I think we should land this.
Attachment #8664306 - Attachment description: Update webrtc capture rate to 32KHz (WIP) → Update webrtc capture rate to 32KHz
Attachment #8664306 - Flags: review?(padenot)
Attachment #8664305 - Flags: review?(docfaraday)
Attachment #8664307 - Flags: review?(jib)
>                         16kHz               32kHz
> Capture thread:    5.4%,5.2%,5.5%      6.1%,6.0%,5.9%
> MSG thread:        5.6%,5.3%,5.2%      5.6%,6.0%,5.9%
> Encoding thread:   6.1%,6.4%,6.6%      6.8%,6.4%,6.2%
> 
> In light of this small CPU increase, and because the sound quality is
> clearly better, I think we should land this.

Total for the 3 threads goes from ~17% to ~18.5%.  A fairly small increase.  Right now the patch does NOT apply to B2G (B2G stays at 16KHz).  (Though that reminds me; the Opus bitrate patch should ifdef the bitrate increase until we make it more adaptive)
Attachment #8664305 - Attachment is obsolete: true
Attachment #8664305 - Flags: review?(docfaraday)
Attachment #8662440 - Attachment is obsolete: true
Attachment #8662440 - Flags: feedback?(padenot)
Comment on attachment 8664307 [details] [diff] [review]
make getUserMedia fake audio tones configurable in frequency via pref

Review of attachment 8664307 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm.

::: dom/media/webrtc/MediaEngineDefault.cpp
@@ +309,4 @@
>      mReadLength(0) {
> +    // If we allow arbitrary frequencies, there's no guarantee we won't get rounded here
> +    // We could include an error term and adjust for it in generation; not worth the trouble
> +    //MOZ_ASSERT(mTotalLength * aFrequency == aSampleRate); 

trailing whitespace

::: modules/libpref/init/all.js
@@ +402,5 @@
>  pref("media.peerconnection.video.min_bitrate", 200);
>  pref("media.peerconnection.video.start_bitrate", 300);
>  pref("media.peerconnection.video.max_bitrate", 2000);
>  #endif
> +pref("media.navigator.audio.test_frequency", 1000);

Maybe "media.navigator.audio.fake_frequency" to match "media.navigator.streams.fake" and { fake: true }?
Attachment #8664307 - Flags: review?(jib) → review+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a7b9544e6901 is green, though I haven't run it against all platforms
Comment on attachment 8664306 [details] [diff] [review]
Update webrtc capture rate to 32KHz

Review of attachment 8664306 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with answers to the questions.

::: dom/media/webrtc/MediaEngineWebRTC.h
@@ +136,5 @@
>      , mCapIndex(aIndex)
>      , mChannel(-1)
>      , mInitDone(false)
>      , mStarted(false)
> +    , mSampleFrequency(16000)

Why is this 16kHz ?

::: dom/media/webrtc/MediaEngineWebRTCAudio.cpp
@@ +19,5 @@
>  #define ENCODING "L16"
>  #define DEFAULT_PORT 5555
>  
> +#define SAMPLE_RATE_16000 256000 // for 16000Hz
> +#define SAMPLE_RATE_32000 512000   // for 32000Hz

How did we determine those are good numbers ?
Attachment #8664306 - Flags: review?(padenot) → review+
Comment on attachment 8664389 [details] [diff] [review]
Adjust Opus bitrate in WebRTC to pass >8KHz audio, and comment

Review of attachment 8664389 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/webrtc/signaling/src/jsep/JsepSessionImpl.cpp
@@ +2009,5 @@
> +  // Per jmspeex on IRC:
> +  // For 32KHz sampling, 28 is ok, 32 is good, 40 should be really good quality.
> +  // Note that 1-2Kbps will be wasted on a stereo Opus channel with mono input
> +  // compared to configuring it for mono.
> +  // If we reduce bitrate enough Opus will low-pass us; 16000 will kill a 9KHz tone.

Wrap this to 80

@@ +2019,5 @@
>        48000,
>        2,
>        960,
> +#ifdef WEBRTC_GONK
> +      16000 // B2G uses lower capture sampling rate

It seems to me that we'd want this logic somewhere else that can determine the capture rate and adjust the bitrate accordingly, but if that is hard I'm ok with leaving it as a TODO with a bug #. My worry is that at some point in the future the capture rate for B2G will change, and it will be easy to miss updating this since it is tucked away far from the media stuff.

@@ +2021,5 @@
>        960,
> +#ifdef WEBRTC_GONK
> +      16000 // B2G uses lower capture sampling rate
> +#else
> +      40000

This is fine, but be vigilant for oranges on android.
Attachment #8664389 - Flags: review?(docfaraday) → review+
> ::: dom/media/webrtc/MediaEngineWebRTC.h
> @@ +136,5 @@
> >      , mCapIndex(aIndex)
> >      , mChannel(-1)
> >      , mInitDone(false)
> >      , mStarted(false)
> > +    , mSampleFrequency(16000)
> 
> Why is this 16kHz ?

It shouldn't be - it should be MediaEngine::DEFAULT_SAMPLE_RATE - but it's actually set in ::Init(), so this is pretty spurious anyways.

> 
> ::: dom/media/webrtc/MediaEngineWebRTCAudio.cpp
> @@ +19,5 @@
> >  #define ENCODING "L16"
> >  #define DEFAULT_PORT 5555
> >  
> > +#define SAMPLE_RATE_16000 256000 // for 16000Hz
> > +#define SAMPLE_RATE_32000 512000   // for 32000Hz
> 
> How did we determine those are good numbers ?

If you look at the old data there, you'll see the "SAMPLE_RATE" for 16000 was 256000.  CodecInst.rate is bps (uncompressed) - 16000*2*8 = 256000, 32000*2*8 = 512000.  I'll change the hard-coded value to that calculation.
Blocks: 1207925
If this bug is considered fixed, should a new one be made for the changes needed for the increase to 44kHz?
Kevin, this will happen as part of the full-duplex work (bug 1142613). We already have some patches there. They are Windows only and half-broken for now, but we've shown the approach is better, and the resulting audio is at whatever the preferred sample rate of the OS is (often 44.1kHz or 48kHz).

We're starting again working on it soon (in Q4 2015 I think).
Just to chime in, I'm also very interested in seeing the mic capture at 44.1 or 48kHz. I am attempting to do some SDR in the browser. My project doesn't quite work in Firefox yet, and I'm attempting to get it there one step at a time. Now that https://bugzilla.mozilla.org/show_bug.cgi?id=987186 is resolved, it looks like this would be the next step.
Brian, we've just landed the Linux part if you want to have a look. You'd have to enable the `full-duplex` pref on nightly. We're working on making the windows and osx part, now.
See Also: → 1249098
Depends on: 1234230
You need to log in before you can comment on or make changes to this bug.