input-only AudioCallbackDriver overchurns MediaStreamGraph

VERIFIED FIXED in Firefox 57

Status

()

defect
P1
normal
Rank:
5
VERIFIED FIXED
2 years ago
2 years ago

People

(Reporter: pehrsons, Assigned: pehrsons, NeedInfo)

Tracking

({regression})

53 Branch
mozilla58
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox56 wontfix, firefox57 verified, firefox58 verified)

Details

Attachments

(1 attachment, 1 obsolete attachment)

STR
Load https://lab.silpaop.com/bug_audio/
Click Start/Stop
Speak
Click Start/Stop
Click Play

Expected: hear your own clear voice

Actualy: voice is distorted, and I think duration has doubled (i.e., for every recorded audio chunk we recorded another chunk of silence)
On this example , if we remove the "autoplay" attribute on the video node , the bug is present
I see -- so when the stream being recorded is not played in a media element at the same time we exhibit the bug. See https://jsfiddle.net/pehrsons/a9gugbut/10/
This is not in the recorder as such, but it appears to be MediaStreamGraph's resampling of input data.

I caught this in rr and in my case we are resampling from 32000 to 44100, and I have a case of 185 samples of sound and 71 samples of silence for a 256 sample iteration. Still digging for the root cause.
Assignee: nobody → apehrson
Status: NEW → ASSIGNED
Component: Audio/Video: Recording → Audio/Video: MediaStreamGraph
In a way bug 1296531 regressed this because it turned off direct listeners for audio with full duplex enabled.

But really it just exposes a latent issue in AudioCallbackDriver.

See this logs that are hopefully self-explanatory:

Input only:
> 2017-10-06 09:14:48.672980 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph Appending 320 frames
> 2017-10-06 09:14:48.673220 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph interval[135488; 135744] (frames: 132) (duration ticks: 256)
> 2017-10-06 09:14:48.674901 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph interval[135744; 136000] (frames: 177) (duration ticks: 256)
> 2017-10-06 09:14:48.675140 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph TrackUnionStream appending 71 null samples
> 2017-10-06 09:14:48.677960 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph interval[136000; 136256] (frames: 132) (duration ticks: 256)
> 2017-10-06 09:14:48.678054 UTC - [Unnamed thread 0x7f07f6dd0620]: E/MediaStreamGraph TrackUnionStream appending 256 null samples

The null samples here are causing the distortions.


Input+Output:
> 2017-10-06 09:16:59.944354 UTC - [Unnamed thread 0x7f01367194c0]: E/MediaStreamGraph Appending 320 frames
> 2017-10-06 09:16:59.944458 UTC - [Unnamed thread 0x7f01367194c0]: E/MediaStreamGraph interval[83167; 83295] (frames: 132) (duration ticks: 128)
> 2017-10-06 09:16:59.946813 UTC - [Unnamed thread 0x7f01367194c0]: E/MediaStreamGraph interval[83295; 83423] (frames: 176) (duration ticks: 256)
> 2017-10-06 09:16:59.950029 UTC - [Unnamed thread 0x7f01367194c0]: E/MediaStreamGraph interval[83423; 83654] (frames: 133) (duration ticks: 128)

With output present we're fine. From the log it's clear that we don't unconditionally round up the MSG iteration duration.

This is because there is no output mixer writing to the scratch buffer to keep mBuffer.Available() at bay, [1], [2]. The same input+output log with mBuffer.Available() printed:
> 2017-10-06 09:33:49.529814 UTC - [Unnamed thread 0x7f80265ee6e0]: E/MediaStreamGraph Appending 320 frames
> 2017-10-06 09:33:49.529963 UTC - [Unnamed thread 0x7f80265ee6e0]: E/MediaStreamGraph interval[107103; 107231] (frames: 132) (duration ticks: 128) (available: 79)
> 2017-10-06 09:33:49.531827 UTC - [Unnamed thread 0x7f80265ee6e0]: E/MediaStreamGraph interval[107231; 107359] (frames: 177) (duration ticks: 256) (available: 128)
> 2017-10-06 09:33:49.535139 UTC - [Unnamed thread 0x7f80265ee6e0]: E/MediaStreamGraph interval[107359; 107590] (frames: 132) (duration ticks: 128) (available: 4)
I'm keeping this a P2 since we can work around the issue with direct listeners for now.
Rank: 15
Summary: gum-audio only recording is distorted → input-only AudioCallbackDriver overchurns MediaStreamGraph
We just had a chat on irc about this. This needs to _not_ use an AudioContext in the mochitest, only gUM + MediaRecorder, and _then_ use an AudioContext to check the content (or use the duration, but I'd rather have the content checked).

An AudioContext always has an output (so an AudioMixer is always present), so it hides the bug.
So while I had expected the test without the fix to fail on linux, it doesn't because the module-sine-source we use there for input has much less frequent callbacks -- i.e., MediaStreamGraph iterations are much longer (ca 8k frames compared to my microphone where it was ~130 frames), and add distortions per iteration.

We have plans to swap this source module for a monitor in bug 1406350. With the monitor I get even more frequent callbacks than with a microphone so reproducing there should be easy.

For now we'll land the fix with manual testing and I'll file a followup for the test that depends on bug 1406350.
Attachment #8915938 - Attachment is obsolete: true
Attachment #8915938 - Flags: review?(jib)
I'm making this a P1 based on new findings. The issue fixed here seems to be the root cause of most our other sever audio distortion reports (see bug 1407088 comment 3 for instance).
Rank: 15 → 5
Priority: P2 → P1
See Also: 1330360, 1381638
Comment on attachment 8915939 [details]
Bug 1406027 - Always use the mixer so AudioCallbackDriver can track processed samples.

https://reviewboard.mozilla.org/r/187200/#review193144
Attachment #8915939 - Flags: review?(padenot) → review+
Pushed by pehrsons@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/02be66f5ff0c
Always use the mixer so AudioCallbackDriver can track processed samples. r=padenot
https://hg.mozilla.org/mozilla-central/rev/02be66f5ff0c
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Blocks: 1319445
Version: 58 Branch → 53 Branch
Comment on attachment 8915939 [details]
Bug 1406027 - Always use the mixer so AudioCallbackDriver can track processed samples.

Approval Request Comment
[Feature/Bug causing the regression]: bug 1319445
[User impact if declined]: A peer connection (webrtc) sending audio while same tab is *not* playing any MediaStream with audio, or WebAudio, to speakers will contain a lot of distortions, and build up a large delay, as observable by the receiver.
[Is this code covered by automated tests?]: No. Bug 1406910 will add a mochitest that tests this specific code path, as soon as we have a virtual input device that's more like a microphone (bug 1406350)
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: Yes, since there's no automated coverage yet. STR:
1 load https://jsfiddle.net/pehrsons/0cLfdk4u/
2 press "Start" and allow the capture
3 when the video appears we start recording audio and video for 3 seconds, say something
4 The recording is played automatically. Verify that you hear no audio distortions (Linux is very obvious, Mac not so much but still present, might depend on microphone so verify the broken state in 56 first)
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: The simplicity of the patch. It does make us take a new path for some cases inside a hot audio path, but there's very low computational overhead in this path. Just passing some integers around.
[String changes made/needed]: None
Attachment #8915939 - Flags: approval-mozilla-beta?
Hi Jan-Ivar, Paul, is this a must fix for 57? This is an old regression (from 53) and I would rather not add any more risk to 57 by uplifting this. Can it ride the 58 train?
Flags: needinfo?(padenot)
Flags: needinfo?(jib)
As comment 18 mentions this is the root cause of several audio distortion reports. Audio distortions are the worst, as users strongly and negatively associate it with the browser. The length of time (some subset of) users have suffered this regression doesn't seem mitigating to me.

The low risk mentioned in comment 22 makes uplift appealing to me. If Paul agrees about the low risk, I'd say it's worth taking.
Flags: needinfo?(jib)
I've looked over this patch as Module Owner of webrtc.  It's a very simple, low-risk patch.

The testcase above (doing any live recording) is seriously impacted (and that's the primary use of MediaRecorder, though in many cases if you're recording video (not just audio) a self-image display will be given, and that (even if muted) should bypass this issue.  That would not be the case in audio-only recordings - typically they won't be attached to an element and playing while recording.

It does apparently also help some other cases of reported distortions/delay.  Andreas: what other real-world usecases are fixed by this?  I.e. please expand on comment 18 with specifics (beyond "seems to be").
Flags: needinfo?(apehrson)
(In reply to Randell Jesup [:jesup] from comment #25)
> I've looked over this patch as Module Owner of webrtc.  It's a very simple,
> low-risk patch.
> 
> The testcase above (doing any live recording) is seriously impacted (and
> that's the primary use of MediaRecorder, though in many cases if you're
> recording video (not just audio) a self-image display will be given, and
> that (even if muted) should bypass this issue.  That would not be the case
> in audio-only recordings - typically they won't be attached to an element
> and playing while recording.

MediaRecorder uses a direct listener for audio in 57 so in most cases (like a getUserMedia stream straight to the recorder) it is not affected. It would however be affected with something as simple as `new MediaRecorder(new MediaStream(gUMStream.getTracks()));`.


> It does apparently also help some other cases of reported distortions/delay.
> Andreas: what other real-world usecases are fixed by this?  I.e. please
> expand on comment 18 with specifics (beyond "seems to be").

PeerConnections are the ones affected since 53 because in 53 we disabled direct listeners when full duplex is enabled (turned on in 49 though perhaps it was backed out for a couple of versions, but irrelevant).
It was made worse by e10s-multi because of the lower chance of having a background tab in the same process play stream audio.
It was further made worse by MSG-per-document because now the same tab needs to play stream audio for this not to happen (bug 1330360, in 56).

The concrete use case for this in 57 is then sending microphone audio over an RTCPeerConnection without playing anything out. I could see this happen in either
1 A broadcasting mode, bug 1407088 is an example of this
2 When you're the first one to join a video conferencing service backed by a media server (SFU or MCU) so you start publishing audio immediately. However, you don't play audio locally until someone else joins and you start receiving their stream. I haven't tested this case in particular but at this point we have passed a big buffer of null data to the peer connection so I imagine a latency has built up. Perhaps webrtc.org can handle this, perhaps not. Jesup, do you know?

The fiddle in the STR in comment 22 reproduces this issue in RTCPeerConnection by recording remotely: https://jsfiddle.net/pehrsons/0cLfdk4u/
Flags: needinfo?(apehrson) → needinfo?(rjesup)
I would be nice if we could take this one indeed.
Flags: needinfo?(padenot)
Comment on attachment 8915939 [details]
Bug 1406027 - Always use the mixer so AudioCallbackDriver can track processed samples.

Thanks Jan-Ivar, Andreas, Paul for the additional info. I agree that audio distortions, if widespread, is a severe problem and warrants an uplift, Beat57+
Attachment #8915939 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Duplicate of this bug: 1407088
I managed to reproduce the bug using the steps from comment 22 on an older version of Nightly (2017-10-05) on macOS 10.13 and Ubuntu 16.04 x64.
I retested everything using latest Nightly 58 and beta 57.0b12 on the same platforms and the bug is not reproducing anymore.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.