Closed Bug 974537 Opened 10 years ago Closed 10 years ago

Echo cancellation is much worse than chrome on apprtc.appspot.com

Categories

(Core :: WebRTC: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30
Tracking Status
firefox28 --- affected
firefox29 - fixed
firefox30 --- fixed
firefox-esr24 --- unaffected
relnote-firefox --- 28+

People

(Reporter: jrmuizel, Assigned: jesup)

Details

(Keywords: relnote, Whiteboard: [webrtc-uplift])

Attachments

(1 file, 1 obsolete file)

Both parties are using OS X on laptops. apprtc.appspot.com is unusable in Firefox but works fine in Chrome.
Randell -- Can you reach out to Jeff and find out more about his environment, etc?  

Jeff -- Did you experience any audio delay on the call?
Assignee: nobody → rjesup
Flags: needinfo?(jmuizelaar)
(In reply to Maire Reavy [:mreavy] from comment #1)
> Randell -- Can you reach out to Jeff and find out more about his
> environment, etc?  
> 
> Jeff -- Did you experience any audio delay on the call?

I believe the video and audio were out of sync. Is that what you mean?
Flags: needinfo?(jmuizelaar)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #2)
> (In reply to Maire Reavy [:mreavy] from comment #1)
> > Randell -- Can you reach out to Jeff and find out more about his
> > environment, etc?  
> > 
> > Jeff -- Did you experience any audio delay on the call?
> 
> I believe the video and audio were out of sync. Is that what you mean?

If audio is delayed, the delay can throw off the echo canceller.  Was video ahead of audio, or was audio ahead of video?
I strongly suspect this maps to known issues with the AEC and interaction with MediaStreamGraph/cubeb (delay outside the AEC tail length (128ms) for cancelling) which are being worked on in other bugs (especially the MediaStreamGraph refactor to be output-pull-driven).  I'll note that mac may be especially vulnerable to this due to much lower cubeb delay than other platforms.

jeff: can you retest with a build with this uncommented in build/gyp.mozbuild:

    # Creates AEC internal sample dump files in current directory
    # 'aec_debug_dump': 1,

Don't make a long call - perhaps 15 or 30 seconds; long enough to clearly show the echo.  Then upload the aec files created in the current dir.  (You should compress them with tar/bz2/zip/etc first)  Or ship them directly to me.

Also note that I need the logs from the machine generating the echo, not the one hearing the echo (if both hear echo, fine).

If you can, also set NSPR_LOG_MODULES=webrtc_trace:65535 WEBRTC_TRACE_FILE=whatever.  This may be up to 10MB; you'll definitely want to compress that.  However, the AEC files are much more useful generally.  A separate test would be to compile with debug, and set latency:5 debugging to do detailed analysis of the frame-by0frame latencies.  That generates a LOT of output, don't run long - but first let's look at the aec dumps before we consider that.

If we find the delay is outside the tail (too late), we have a tweak that can offset it to verify that's the problem. 

Thanks!
I turned on the aec_debug_dump var in gyp.mozbuild, and looking at aec_far0.pcm and aec_near0.pcm, it's clear that the microphone (near) is seeing audio "before" it shows up in the stream going to the speakers (far).

This is only possible because we're telling the system that there's 150ms of delay between audio out (far) in the peerconnection and when the first possible echo will arrive at the peerconnection from the microphone (near).  This is controlled by the pref "media.peerconnection.capture_delay", and it's a stand-in for the fact that we're not using the code in webrtc.org for audio output which would try to determine from the OS what the output and input delays are.

Note that we had a lot of output delay, but padenot's landing of improved cubeb backend latency reduced it, especially on mac.  With this patch, on macbook pro 2011 running 10.7, I see the first echo ~42ms after the 'far' sound (our AEC tail length is 128ms, so this is ok).

We need to verify if the latencies are reasonable on windows and linux, but these rough numbers were derived from my latency measurements with and without padenot's patch.  Likely linux and probably windows can be reduced a bit, and mac might be possible to raise a bit - but it's always better to be too low than too high, unless real-world echoes extend past the AEC tail.
Attachment #8380835 - Flags: feedback?(paul)
Jeff - care to try this?  You can just set the pref (you'll need to add it as it wasn't in all.js before, though it has a default value in the code).  Note that whomever sets it will fix the echo for the *other* person.  I suggest for this test both should set it.  (Also, it must be set before the call and probably before going to apprtc/etc.)

Thanks.

If this works, once we check the linux/windows values, and a few other mac versions/HW, we should uplift this patch.
Flags: needinfo?(jmuizelaar)
I'll try to try this tomorrow.
Audio backend only latency, per platform (i.e., from the point the callback gets called and the data are sent to the OS, i.e. right after cubeb):
- Windows: less than 30ms (most of the time in the 20/30ms range), was 100+ms
- Linux (pulse, alsa): 40ms, was 100ms
- OSX: 12.5ms, unchanged

Then, we should add AudioStream (variable, I have tweaked it since last measurements, but it's less than 80ms, although I've seen weird things on some platforms where opening an audio stream is delayed for some reason). This is going to be removed by the MSG refactor.

Then, the MSG, 10ms or so (this latency cause is also going away). Then, the input side.
Comment on attachment 8380835 [details] [diff] [review]
Make output+input delay for AEC platform-dependent

Review of attachment 8380835 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/src/init/all.js
@@ +263,5 @@
> +pref("media.peerconnection.capture_delay", 50);
> +#elif defined(XP_WIN)
> +pref("media.peerconnection.capture_delay", 100);
> +#else
> +pref("media.peerconnection.capture_delay", 150);

Special case between the different flavour of Linux, please. Pulse/Alsa are around 40ms (backend only), Android is at least 100ms (also backend only). Then again, some (recent, high end) Android are below 20ms, so that's not easy.

Can we choose the value in C++ (since you can ask cubeb for a decent value, dynamically, without having to guess, because in complicated cases it goes and directly ask the OS), and still have the pref for manual override?
Attachment #8380835 - Flags: feedback?(paul) → feedback-
Jeff - can you try this about:config setting?
Now measured on all 3 desktop platforms via gy.mozbuild's aec_debug_dump.  These values leave considerable headroom (~40-50ms) on MacBook Pro 2011, Lenovo W520 (Win7), and Fedora 19 desktop with USB Logitech 615 camera/mic.  AEC tail length is 128ms, so effective tail is 128 - 40-to-50, which should be plenty and leaves room for system/driver variation without the headroom becoming negative (which will cause very loud echoes)
Attachment #8380835 - Attachment is obsolete: true
Sorry, build/gyp.mozbuild.

I propose to land this patch, then lean on QA and random people to try it in speakerphone mode, and if it doesn't work adjust the value in about:config to make it work and let me know so we can tune.

Once we land AEC-in-GUM, this should largely become moot.
Attachment #8383769 - Flags: review?(paul)
Comment on attachment 8383769 [details] [diff] [review]
Make output+input delay for AEC platform-dependent

Review of attachment 8383769 [details] [diff] [review]:
-----------------------------------------------------------------

rs=me
Attachment #8383769 - Flags: review?(paul) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/4e6266673302

Note that once this has been checked and any additional tuning done, this needs to be uplifted to 29 and 28.  Primary regressing bug was bug 907817, which reduced backend latencies, though speakerphone AEC wasn't that reliable before the change.  it landed in FF 27 (10/18).
Target Milestone: --- → mozilla30
https://hg.mozilla.org/mozilla-central/rev/4e6266673302
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment on attachment 8383769 [details] [diff] [review]
Make output+input delay for AEC platform-dependent

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 907817

User impact if declined: Total failure of echo cancellation on most platforms (esp mac & windows)

Testing completed (on m-c, etc.): On m-c for 2 weeks.  Clearly far better than before; further tuning may be done.

Risk to taking this patch (and alternatives if risky): nil - just a platform-specific pref change to a tuning parameter already in use

String or IDL/UUID changes made by this patch: none
Attachment #8383769 - Flags: approval-mozilla-beta?
Attachment #8383769 - Flags: approval-mozilla-aurora?
Whiteboard: [webrtc-uplift]
Note: I presume beta is not possible at this point, but it would be something to consider if there are respins (as a pref-only change)
relnote-firefox: --- → ?
Keywords: relnote
OS: Mac OS X → All
Hardware: x86 → All
Attachment #8383769 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Will add this to the known issues section in FF28 relnotes.
Attachment #8383769 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
release noted for 28, marked as fixed in 29.
Is this really fixed, we're trying on 33, 34 and 35 and Firefox is still unusable in a webrtc call without headphones (compared to chrome).
What OS are you using?  

Would you be willing to come to #media on irc and help us debug your problem?  Just look for me (mreavy on irc) or Randell Jesup (jesup on irc), and we'll walk you through a few simple tests to find out what's going on.
Flags: needinfo?(tim)
Flags: needinfo?(tim)
Flags: needinfo?(jmuizelaar)
You need to log in before you can comment on or make changes to this bug.