Closed Bug 1157932 Opened 9 years ago Closed 9 years ago

Second gUM + PC results in A/V drift

Categories

(Core :: WebRTC: Audio/Video, defect)

40 Branch
x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox40 --- affected

People

(Reporter: drno, Unassigned)

References

Details

On my MacBook Pro (Retina, 15-inch, Late 2013) with 10.10.3 (14D136) and Nightly 40.0a1 (2015-04-22 and at least also 2015-04-21) if I'm in a Firefox Hello call and open another Firefox Hello conversation the audio and video start to drift apart with ~0.25s/min.

If I close the second Firefox Hello conversation window the drift stops.
If I leave the call and start another call, the new call is perfectly fine in the beginning.

Instead of opening a second Firefox Hello conversation opening a tab with https://test.webrtc.org (and granting permission to the gUM request) causes the same problem, until I close the tab.
Juan could you please to try to reproduce this on your machine (either by making a call between two machines, or find a colleague for a test call)?
Flags: needinfo?(jbecerra)
Jesup any idea what we could do next (besides Juan trying to repro) to get an idea what is going wrong here?
Flags: needinfo?(rjesup)
I have been trying to reproduce this problem with a similar environment as in comment #0 but so far I can't notice the drift. Is there anything special about the call that I need to take into account?
Flags: needinfo?(jbecerra)
I can still reproduce it on my laptop with Nightly 40.0a1 (2015-05-06)
Looks like this might have something to do with my profile. Not able to reproduce with a fresh profile.
Jesup do you want to keep investigating if this is not reproducible with a fresh profile?
(In reply to Nils Ohlmeier [:drno] from comment #6)
> Jesup do you want to keep investigating if this is not reproducible with a
> fresh profile?

yes - since this was also seen with blassey and ekr and others.

Check what toggles are changed in your user prefs
Flags: needinfo?(rjesup)
(In reply to Nils Ohlmeier [:drno] from comment #6)
> Jesup do you want to keep investigating if this is not reproducible with a
> fresh profile?

Also, archive a copy of the profile to avoid losing whatever is causing it
I can now reproduce this reliably. I call using a debug nightly build to someone using 37.0.2, in Seattle. The drift is very noticeable for both of us after a half minute of so into the call.

I'm going to try getting a Wireshark session during this, and I'll attach it here. If there's anything else I could include, let me know.
Still reproducible with my profile on 41.0a1 (2015-05-12). Going to send a copy of the profile to some people to check if they can reproduce it with the profile.
(In reply to juan becerra [:juanb] from comment #9)
> I can now reproduce this reliably. I call using a debug nightly build to
> someone using 37.0.2, in Seattle. The drift is very noticeable for both of
> us after a half minute of so into the call.
> 
> I'm going to try getting a Wireshark session during this, and I'll attach it
> here. If there's anything else I could include, let me know.

Logs!  from your end: aec logs, webrtc_trace logs, wireshark capture to go along with them (with a capture filter of 'udp').

Also: try it with a fresh profile -- and make sure things like the Gecko profiler isn't running. Both seeing drift is unusual; no one has reported that.  Does the drift continue to build up?

Thanks.  You can share these privately with me to avoid huge uploads, and compress them.
Flags: needinfo?(jbecerra)
So in my case I have identified the problem:

If you ever clicked the "Start Debug mode" and/or "Start AEC Logging" buttons on the about:webrtc page that results in media.webrtc.debug.aec_log_dir and media.webrtc.debug.log_file getting set. In my case media.webrtc.debug.trace_mask was also set (probably at some point back in time manually).

Clicking on "Stop" on either of the two buttons, properly stops the log files from growing, but the user prefs never get reset to their original values.
If these two user prefs are set and I have multiple Hello conversation windows open I get the drift. Note: just having the two user prefs set does not result in any log file to be written (except if like in my case trace_mask is set as well), but somehow this result in A/V sync issues already without writing to a file.
(In reply to Nils Ohlmeier [:drno] from comment #12)
> So in my case I have identified the problem:
> 
> If you ever clicked the "Start Debug mode" and/or "Start AEC Logging"
> buttons on the about:webrtc page that results in
> media.webrtc.debug.aec_log_dir and media.webrtc.debug.log_file getting set.
> In my case media.webrtc.debug.trace_mask was also set (probably at some
> point back in time manually).
> 
> Clicking on "Stop" on either of the two buttons, properly stops the log
> files from growing, but the user prefs never get reset to their original
> values.
> If these two user prefs are set and I have multiple Hello conversation
> windows open I get the drift. Note: just having the two user prefs set does
> not result in any log file to be written (except if like in my case
> trace_mask is set as well), but somehow this result in A/V sync issues
> already without writing to a file.

Jesup: can you think of reasons that having those values set would affect sync even when we're not writing out any log entries?

Also, this points to what sound like very clear set of STR -- I'd be curious if you can get the same result on your mac. I imagine working with the live problem would be much easier than doing a postmortem on logs.
(In reply to Adam Roach [:abr] from comment #13)
> (In reply to Nils Ohlmeier [:drno] from comment #12)
> > So in my case I have identified the problem:
> > 
> > If you ever clicked the "Start Debug mode" and/or "Start AEC Logging"
> > buttons on the about:webrtc page that results in
> > media.webrtc.debug.aec_log_dir and media.webrtc.debug.log_file getting set.
> > In my case media.webrtc.debug.trace_mask was also set (probably at some
> > point back in time manually).
> > 
> > Clicking on "Stop" on either of the two buttons, properly stops the log
> > files from growing, but the user prefs never get reset to their original
> > values.
> > If these two user prefs are set and I have multiple Hello conversation
> > windows open I get the drift. Note: just having the two user prefs set does
> > not result in any log file to be written (except if like in my case
> > trace_mask is set as well), but somehow this result in A/V sync issues
> > already without writing to a file.
> 
> Jesup: can you think of reasons that having those values set would affect
> sync even when we're not writing out any log entries?
> 
> Also, this points to what sound like very clear set of STR -- I'd be curious
> if you can get the same result on your mac. I imagine working with the live
> problem would be much easier than doing a postmortem on logs.

The above is no longer the case. Fixed in bug 1100502 along with the e10s fix. Only the file location prefs are set; the log mask is left at the user set value, which can be blank.
(In reply to Paul Kerr [:pkerr] from comment #14)
> The above is no longer the case. Fixed in bug 1100502 along with the e10s
> fix. Only the file location prefs are set; the log mask is left at the user
> set value, which can be blank.

Just to document what I clarified on IRC yesterday: the log mask only made the problem worse, but it is not needed to reproduce the problem.

It should be enough to:
- turn on the debugging on about:webrtc (this set the location prefs, but the bit mask any more)
- turn off the debugging again
- this leaves the location prefs set
- restart Fx (just to be safe)
- start multiple Hello conversations and experience the drift in the active Hello call
(In reply to Nils Ohlmeier [:drno] from comment #15)
> (In reply to Paul Kerr [:pkerr] from comment #14)
> > The above is no longer the case. Fixed in bug 1100502 along with the e10s
> > fix. Only the file location prefs are set; the log mask is left at the user
> > set value, which can be blank.
> 
> Just to document what I clarified on IRC yesterday: the log mask only made
> the problem worse, but it is not needed to reproduce the problem.
> 
> It should be enough to:
> - turn on the debugging on about:webrtc (this set the location prefs, but
> the bit mask any more)
> - turn off the debugging again
> - this leaves the location prefs set
> - restart Fx (just to be safe)
> - start multiple Hello conversations and experience the drift in the active
> Hello call

Is it a necessary condition to use about:webrtc? One thing to check on is if a user uses the prefs to control logging and not env vars, and then sets the mask back to zero but leaves the file locations, does it cause the drift issue? This was the original use case for putting the logging control into prefs; simplifying log capture for those not familiar with environment variables. Even if I reverted the prefs usage back to the pre about:webrtc page capture upgrade a user can still end up in the same pref state. Does this occur only as an effect of using about:webrtc?
Hmm now I'm able to reproduce it with this:

- create a fresh profile with Nightly 41.0a1 (2015-05-15) on Mac with OSX 10.10
- start a Hello conversation
- this automatically sets the prefs "media.webrtc.debug.aec_log_dir" and "media.webrtc.debug.log_file" to default values, but no files get written
- join the shared link from my Windows laptop
- start 3 more Hello conversations
- let the call run for a few minutes

No idea why in an earlier attempt I was not able to reproduce this with a fresh profile.
The log_file and aec_log_dir prefs are set as soon as the Hello chrome pop-up is created and the audio/video starts up. It is happening via a call to EnableWebRtcLog() that is called from: MediaManager.cpp, AudioConduit.cpp and VideoConduit.cpp. These call points allow that logging system to pick up the user settings from the environment vars (or the prefs) and start logging if that is requested.

about:webrtc calls through WebrtcGlobalInformation.cpp to make a StartWebRtcLog(). Both StartWebRtcLog() and EnableWebRtcLog() both eventually call ConfigWebRtcLog, which is where the final choices for the file locations are set back to the prefs. (I have a test patch that moves the pref setting calls into StartWebRtcLog(), which is only used by WebrtcGlobalInformation.)

What I still do not understand is why simply setting the file locations but keeping the mask set to zero will cause the drift.
Depends on: 1166183
Bug 1166183 seems to fix this issue.

Juan, maybe you can confirm/double check once 1166183 is in Nightly?
I'll retest this once bug 1166183 lands in Nightly.
Bug 1166183 landed and I have been testing this on nightly and developer edition by establishing calls between me and my brother in Seattle where I used to see this problem more or less reliably and noticeably.

After several calls I haven't been able to notice any drift.

I will mark this resolved.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(jbecerra)
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.