Closed Bug 904617 Opened 6 years ago Closed 6 years ago

Latency reporting for MediaStreamGraph, cubeb, and other media components

Categories

(Core :: WebRTC: Audio/Video, defect)

Other Branch
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla27

People

(Reporter: jesup, Assigned: snandaku)

References

Details

(Whiteboard: [webrtc])

Attachments

(10 files, 16 obsolete files)

564.42 KB, application/pdf
Details
975.96 KB, application/pdf
Details
157.87 KB, application/pdf
Details
18.23 KB, patch
kinetik
: review+
Details | Diff | Splinter Review
1.63 KB, patch
padenot
: review+
Details | Diff | Splinter Review
23.85 KB, patch
padenot
: review+
Details | Diff | Splinter Review
136.22 KB, application/pdf
Details
1.05 KB, patch
kinetik
: review+
Details | Diff | Splinter Review
8.55 KB, patch
Details | Diff | Splinter Review
1.15 KB, patch
ehugg
: review+
Details | Diff | Splinter Review
Spun off from bug 785584, which is a meta bug.

This is to clean up and land code to report audio/video latencies (delay), based on padenot's patches.

Some of this may be reportable via the WebRTC Stats API, which would be useful.  Other parts may need to have other mechanisms/APIs.

It should not have any noticeable impact on performance or memory use when not enabled.  This might mean that it will require separate allocations when enabled, and just a pointer in the main structure to those.  It should be compiled in by default on Debug builds, and probably compiled off by default in optimized builds, but possible to turn on with a .mozconfig line (configure change)

The patches/etc from bug 785584 should be moved here
I was thinking the maybe we could do something like about:latency, with a set of chrome API to draw a canvas graph or the like. Some parts are already done, but everything needs more polish overall, there is too much hacky stuff going on.
Attached patch latency-content-api (obsolete) — Splinter Review
Attachment #789676 - Attachment filename: latency-content-api → Part 5 - Add content API to set the requested latency
I moved all the attachments from Bug 785584 to here. So what should be the next steps on this ..
An usage example of the content API can be found at [1], and needs the whole patch queue to work.

[1]: http://paul.cx/latency/index.html
Updated content api to be based on WebIDL to make it reflect in the JavaScript
Attachment #789676 - Attachment is obsolete: true
With the new patch, I was able to verify [1] on Osx and Win 8.1

[1]: http://paul.cx/latency/index.html
Attachment #789942 - Attachment description: Part 5 - Content API to set the requested latency → Part 5 - WebIDL based Content API to set the requested latency
Attached patch Part - 7 Fixes shutdown crashes (obsolete) — Splinter Review
Attachment #791156 - Attachment filename: shutdown_crash_fix.patch → Part 7 - shutdown_crash_fix.patch
Attachment #791156 - Attachment description: shutdown_crash_fix.patch → Part - 7 Fixes shutdown crashes
Attachment #791156 - Attachment filename: Part 7 - shutdown_crash_fix.patch → shutdown_crash_fix.patch
I will be cleaning up these patches to group them better tomorrow and clean up formatting issues during the process.
Attachment #789672 - Attachment is obsolete: true
Attachment #791560 - Flags: review?(rjesup)
Attachment #791560 - Flags: review?(paul)
Attachment #789673 - Attachment is obsolete: true
Attachment #791561 - Flags: review?(rjesup)
Attachment #791561 - Flags: review?(paul)
Attachment #789674 - Attachment is obsolete: true
Attachment #791563 - Attachment is obsolete: true
Attachment #791627 - Attachment is obsolete: true
Attachment #789675 - Attachment is obsolete: true
Attachment #791563 - Flags: review?(rjesup)
Attachment #791563 - Flags: review?(paul)
Attachment #791631 - Flags: review?(rjesup)
Attachment #791631 - Flags: review?(paul)
Attachment #791632 - Flags: review?(rjesup)
Attachment #791632 - Flags: review?(paul)
Attachment #789942 - Attachment is obsolete: true
Attachment #791633 - Flags: review?(rjesup)
Attachment #791633 - Flags: review?(paul)
Attachment #790020 - Attachment is obsolete: true
Attachment #791156 - Attachment is obsolete: true
These patches are now verified on Windows 8.1 - latest Mc and OSX from yesterdays MC version.
So, considering the amount of time in the past we spent on debugging latency-related issues, and because I'm sure we will continue to have regression (and because we want to be able to measure improvements as well), I propose that we invest a little bit of time on some tooling.

There is a couple way for us to get the tooling we need:
(1) Keep the code as it is, with a little python script to process the log.
(2) Add an about: page for that: about:latency, about:webrtc, or something else, I'm open to bikeshedding.

I'd prefer to have the second option, because it will make it very easy for people/qa to give us reports (and for bisecting, we could just grab builds using mozregression to make sure bisecting on platform slow to compile, namely Windows, will not waste developer time anymore like it did for jesup the other day).

The pieces we need to have for that are:
(1) The cubeb api to give us the latency (done, but I need to double check a couple platform-specific things)
(2) The instrumentation at the different location in the pipeline (MSG, BufferedAudioStream, cubeb, etc.) (pretty much done, but we could use another pair of eyeballs to check for the acquisition methods)
(3) A chrome API to get the latency data. I've got a proof of concept for the cubeb latency, we need to make it work for all the latency probes, and probably throw in some aggregation. I've got some code locally that do aggregation in a semi-intelligent manner, so we can catch spikes (basically using the first derivative of the latency graph to know if we should or not aggregate the value we just got).
(4) A graph widget in js/canvas, this is easy enough to do, I've got some code as well (half-done).

This would be enabled by default on release and debug. We would only grab the data if the about: page is opened, to avoid slowing things down. For that, we need to make sure the data collection is not showing in profiles, but this is quite doable.

I can devote some time to get all this working, maybe with the help of crypt if he has time.

Any thoughts?
Flags: needinfo?(rjesup)
I will be glad to help Paul.
Target Milestone: --- → mozilla26
Comment on attachment 791633 [details] [diff] [review]
Part 5 - Content API to set the requested latency

Review of attachment 791633 [details] [diff] [review]:
-----------------------------------------------------------------

Since we're touching Navigator.webidl, we should get a quick r? from smaug (or equivalent)

::: content/media/AudioStream.cpp
@@ +95,5 @@
>  static uint32_t GetCubebLatency()
>  {
> +  return static_cast<uint32_t>(AsyncLatencyLogger::Get()->GetRequestedLatency());
> +  //MutexAutoLock lock(*gAudioPrefsLock);
> +  //return gCubebLatency;

remove commented-out code unless it's important

::: dom/base/Navigator.h
@@ +170,5 @@
>    void RegisterContentHandler(const nsAString& aMIMEType, const nsAString& aURL,
>                                const nsAString& aTitle, ErrorResult& aRv);
>    nsMimeTypeArray* GetMimeTypes(ErrorResult& aRv);
>    nsPluginArray* GetPlugins(ErrorResult& aRv);
> +  

whitespace
Attachment #791633 - Flags: review?(rjesup)
Attachment #791633 - Flags: review?(bugs)
Attachment #791633 - Flags: review+
Comment on attachment 791633 [details] [diff] [review]
Part 5 - Content API to set the requested latency

Review of attachment 791633 [details] [diff] [review]:
-----------------------------------------------------------------

Bumping back to r? to me and padenot to think about where the interface should be exposed.
Attachment #791633 - Flags: review?(bugs)
Attachment #791633 - Flags: review?
Attachment #791633 - Flags: review+
We need chrome-only members to expose this. It cannot be navigator, because there are only one navigator, and multiple audio output.

We could put members that return objects on the MediaStream and PeerConnection, something like this:

RTCPeerConnection.latencies = {
  neteq: x,
  aec: y,
  (...)
}

MediaStream.latencies = {
  audio_output: x,
  msg: y,
  bufferedaudiostream: z,
}

(we should add something for input as well).

It would then be easier to show meaningful data for a given call.
Comment on attachment 791560 [details] [diff] [review]
Part 1- Add a way to get cubeb latency

Review of attachment 791560 [details] [diff] [review]:
-----------------------------------------------------------------

Holding at r? pending locking question and the hw_latency_frames/flag issue

::: media/libcubeb/src/cubeb_audiotrack.c
@@ +321,5 @@
>    if (audiotrack_get_min_frame_count(ctx, &stream_params, &min_frame_count)) {
>      return CUBEB_ERROR;
>    }
>  
> +

remove new blank line

@@ +433,5 @@
> +  assert(stream->instance && latency);
> +
> +  /* Android returns the latency in ms, we want it in frames. */
> +  *latency = stream->context->klass.latency(stream->instance);
> +  *latency = (uint32_t)(*latency / 1000. * stream->params.rate);

How about (*latency * stream->params.rate)/1000;  (avoid floating point!)  Slightly clearer, too.

::: media/libcubeb/src/cubeb_audiounit.c
@@ +49,5 @@
> +
> +  uint64_t pres = AudioConvertHostTimeToNanos(tstamp->mHostTime);
> +  uint64_t now = AudioConvertHostTimeToNanos(AudioGetCurrentHostTime());
> +
> +  return (pres - now) * S_PER_NS * stream->sample_spec.mSampleRate;

Unless there's an overflow concern (and I don't think there should be), I think we should avoid floating point:
((pres - now) * stream->sample_spec.mSampleRate) / PR_NSEC_PER_SEC
and remove S_PER_NS

@@ +275,5 @@
>  
>    stm->frames_played = 0;
>    stm->frames_queued = 0;
> +  stm->current_latency_frames = 0;
> +  stm->hw_latency_frames = -1;

This is a uint64_t... -1 isn't valid and may/will cause compilers to kick you.
Is there a better way to flag needing to set it, assuming it can't be set here?
If we do *need* a flag for it needing to be set, use a bool (or at least use the correct #define for MAX_UINT64)

@@ +450,5 @@
> +      + device_safety_offset;
> +  }
> +
> +  pthread_mutex_lock(&stm->mutex);
> +  *latency = stm->hw_latency_frames + stm->current_latency_frames;

If we're locking around this, do we not need a lock around testing the hw_latency?

::: media/libcubeb/src/cubeb_pulse.c
@@ +587,5 @@
> +  if (r) {
> +    return CUBEB_ERROR;
> +  }
> +
> +  *latency = r_usec * S_PER_USEC * stm->sample_spec.rate;

Same comment as for audiounit.c

::: media/libcubeb/src/cubeb_winmm.c
@@ +136,5 @@
>       during the callback to avoid deadlocks. */
>    LeaveCriticalSection(&stm->lock);
>    got = stm->data_callback(stm, stm->user_ptr, hdr->lpData, wanted);
>    EnterCriticalSection(&stm->lock);
> +  stm->written += got;

I assume this is safe to access outside of the lock?  I.e. nothing else on another thread will read or write it?  I'll note it's read under a lock

@@ +587,5 @@
> +  EnterCriticalSection(&stm->lock);
> +  time.wType = TIME_SAMPLES;
> +  r = waveOutGetPosition(stm->waveout, &time, sizeof(time));
> +  written = stm->written;
> +  LeaveCriticalSection(&stm->lock);

What exactly is being protected by stm->lock?
Attachment #791561 - Flags: review?(rjesup) → review+
Comment on attachment 791631 [details] [diff] [review]
Part 3 - Add latency API and python script to generate graph for logs

Review of attachment 791631 [details] [diff] [review]:
-----------------------------------------------------------------

r- until I feel comfortable about the perf impact of the amount of logging.  I would prefer to batch them if possible (and that may help reduce the "log-spam" impact and the amount of IO/disk traffic).  It's close to an r+ though.

::: content/media/AudioStream.cpp
@@ +906,5 @@
>    WriteDumpFile(mDumpFile, this, aFrames, aBuffer);
> +  mLatency -= servicedFrames;
> +  {
> +    uint32_t latency = -1;
> +    if (cubeb_stream_get_latency(mCubebStream, &latency)) {

I'd like to make sure we don't use any significant cycles if latency logging isn't on.  Perhaps we can wrap blocks like this one with a test of the logging level (and #ifdef PR_LOGGING or #ifdef AUDIO_LATENCY_LOGGING, if we're not having it be always-on)

@@ +910,5 @@
> +    if (cubeb_stream_get_latency(mCubebStream, &latency)) {
> +      NS_WARNING("Could not get latency from cubeb.");
> +    }
> +    LogLatency(AsyncLatencyLogger::AudioStream, static_cast<double>(mBuffer.Length()) / mOutRate * 1000.);
> +    LogLatency(AsyncLatencyLogger::Cubeb, (double) static_cast<double>(latency) / mOutRate * 1000.);

I'd like to avoid floating-point where possible (at least where it buys us nothing).
If we can guarantee latency is less than MAX_INT32/1000 (~2M), then there's no possible overflow from 
static_cast<double>((latency*1000)/mOutRate)
If we're concerned we could exceed that, then promote to int64.

::: content/media/Latency.cpp
@@ +106,5 @@
> +void AsyncLatencyLogger::Log(LatencyLogIndex aIndex, double aValue)
> +{
> +  nsCOMPtr<nsIRunnable> event = new LogEvent(aIndex, aValue);
> +  if (mThread) {
> +    mThread->Dispatch(event, NS_DISPATCH_NORMAL);

I'd like to log latency values without having to do an allocation and dispatch-to-mainthread for each one- this causes a LOT of events to mainthread, and could further warp the log values.

If the code is not timing-sensitive, we can PR_LOG directly.
If it is sensitive, we can test if logging is enabled, and if so we could batch up values, and then dispatch a large number at once (10, 100, more).

We don't want to allocate/send log messages to mainthread if we're not logging latency.

::: content/media/MediaSegment.h
@@ +180,5 @@
>      } else {
>        mChunks.InsertElementAt(0)->SetNull(aDuration);
>      }
> +#ifdef MOZILLA_INTERNAL_API
> +    mChunks[0].mTimeStamp = mozilla::TimeStamp::Now();

Eventually I'd like to take the timestamp from capture (or as close as we can get to it) and have it follow the chunks of data through the system (and if so, we could calculate relative deltas)

::: content/media/webrtc/MediaEngineWebRTC.h
@@ +356,5 @@
>      , mWindowId(aWindowId)
>    {
>  	mVideoSources.Init();
>  	mAudioSources.Init();
> +  AsyncLatencyLogger::Get(true)->AddRef();

indent

::: graph.py
@@ +1,1 @@
> +#!/usr/bin/python

MPL 2
Attachment #791631 - Flags: review?(rjesup) → review+
Comment on attachment 791631 [details] [diff] [review]
Part 3 - Add latency API and python script to generate graph for logs

Review of attachment 791631 [details] [diff] [review]:
-----------------------------------------------------------------

Oops, r-
Attachment #791631 - Flags: review+ → review-
Comment on attachment 791632 [details] [diff] [review]
Part 4 - Added the possibility to mark a location in the log

Review of attachment 791632 [details] [diff] [review]:
-----------------------------------------------------------------

Leaving r? for now until I understand the need for the "silence" logs

::: content/media/AudioStream.cpp
@@ +578,5 @@
>  
>    // Size mBuffer for one second of audio.  This value is arbitrary, and was
>    // selected based on the observed behaviour of the existing AudioStream
>    // implementations.
> +  uint32_t bufferLimit = FramesToBytes(aRate / 10);

This value changed, but the comment didn't.  Something needs to be adjusted (likely the comment)

@@ +894,5 @@
> +      }
> +    }
> +    if (isSilence) {
> +      LogLatency(AsyncLatencyLogger::AudioStream, "Buffer is silent");
> +    }

Do we need all this code, in that we have separate code that logs silence-insertion?  Since this is a floating-point != 0.0, so it won't trigger on "normal" silence, just on inserted silence.  So I don't see the worth on comparing every sample against 0.0f.  If there's a reason, is it needed always, or should it be an option?  A logging level?  Build-time option?  Without understanding the reason for this, it's hard to evaluate.

::: content/media/MediaStreamGraph.cpp
@@ +811,5 @@
>          // Play silence where the track has ended
>          output.AppendNullData(endTicks - sliceEnd);
>          NS_ASSERTION(endTicks == sliceEnd || track->IsEnded(),
>                       "Ran out of data but track not ended?");
> +        LogLatency(AsyncLatencyLogger::MediaStream, "Writing real data.");

Do we need this if we have "Writing Silence"?  Let's try to keep the log traffic down to something semi-sane.
Comment on attachment 791633 [details] [diff] [review]
Part 5 - Content API to set the requested latency

Review of attachment 791633 [details] [diff] [review]:
-----------------------------------------------------------------

We still need to decide where this will be exposed

::: content/media/AudioStream.cpp
@@ +921,5 @@
>      if (cubeb_stream_get_latency(mCubebStream, &latency)) {
>        NS_WARNING("Could not get latency from cubeb.");
>      }
>      LogLatency(AsyncLatencyLogger::AudioStream, static_cast<double>(mBuffer.Length()) / mOutRate * 1000.);
> +    double latency_ms = (double) static_cast<double>(latency) / mOutRate * 1000.;

See comments in other bug about using integer math as much as possible.  Also, we don't want to calculate a bunch of stuff (especially FP) and especially not throw messages to MainThread if we're not actually logging.
Comment on attachment 791633 [details] [diff] [review]
Part 5 - Content API to set the requested latency

Review of attachment 791633 [details] [diff] [review]:
-----------------------------------------------------------------

We still need to decide where this will be exposed

::: content/media/AudioStream.cpp
@@ +921,5 @@
>      if (cubeb_stream_get_latency(mCubebStream, &latency)) {
>        NS_WARNING("Could not get latency from cubeb.");
>      }
>      LogLatency(AsyncLatencyLogger::AudioStream, static_cast<double>(mBuffer.Length()) / mOutRate * 1000.);
> +    double latency_ms = (double) static_cast<double>(latency) / mOutRate * 1000.;

See comments in other bug about using integer math as much as possible.  Also, we don't want to calculate a bunch of stuff (especially FP) and especially not throw messages to MainThread if we're not actually logging.
(In reply to Randell Jesup [:jesup] from comment #31)
> Comment on attachment 791631 [details] [diff] [review]
> Part 3 - Add latency API and python script to generate graph for logs
> 
> Review of attachment 791631 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r- until I feel comfortable about the perf impact of the amount of logging. 
> I would prefer to batch them if possible (and that may help reduce the
> "log-spam" impact and the amount of IO/disk traffic).  It's close to an r+
> though.

I've got ideas for that, some sort of smart aggregation that uses the derivative of the latency value, and writes the value to the log iff the value of the derivative is higher than some finely tuned number (depending on what we are measuring). I could observe that I was not really losing information, but the cross-thread messaging and IO were reduced. 

> ::: content/media/AudioStream.cpp
> @@ +906,5 @@
> >    WriteDumpFile(mDumpFile, this, aFrames, aBuffer);
> > +  mLatency -= servicedFrames;
> > +  {
> > +    uint32_t latency = -1;
> > +    if (cubeb_stream_get_latency(mCubebStream, &latency)) {
> 
> I'd like to make sure we don't use any significant cycles if latency logging
> isn't on.  Perhaps we can wrap blocks like this one with a test of the
> logging level (and #ifdef PR_LOGGING or #ifdef AUDIO_LATENCY_LOGGING, if
> we're not having it be always-on)

Yes, keep in mind this patch queue was quickly written to debug some stuff, obviously not ready for landing. We need to make sure one can at least activate (if not deactivate) the pref without restarting the browser: I can imagine someone experiencing a weird behavior _during_ a call, flipping the pref, and giving us the log, which is always better than "the latency was terrible".

> If the code is not timing-sensitive, we can PR_LOG directly.
> If it is sensitive, we can test if logging is enabled, and if so we could
> batch up values, and then dispatch a large number at once (10, 100, more).

I did the AsyncLogger in the first place _because_ using PR_LOG directly was affecting the latency and results in the first place (especially on Android, but I could observe differences on desktop as well, although not as severe).

> ::: content/media/MediaSegment.h
> @@ +180,5 @@
> >      } else {
> >        mChunks.InsertElementAt(0)->SetNull(aDuration);
> >      }
> > +#ifdef MOZILLA_INTERNAL_API
> > +    mChunks[0].mTimeStamp = mozilla::TimeStamp::Now();
> 
> Eventually I'd like to take the timestamp from capture (or as close as we
> can get to it) and have it follow the chunks of data through the system (and
> if so, we could calculate relative deltas)

Well, this is measuring the amount of time the chunk stays in the graph, and as proven its usefulness already. I need that so I can do some sort of regression testing when I'll refactor the MSG to use cubeb's clock. Also, I believe it is good to have a high granularity, if needed, to rule out modules quickly when debugging an issue.

> ::: graph.py
> @@ +1,1 @@
> > +#!/usr/bin/python
> 
> MPL 2

Yes, but this program is a short term solution. It should be very easy to come up with something more robust using a <canvas> (I'm thinking about:webrtc, here).
(In reply to Randell Jesup [:jesup] from comment #33)
> Comment on attachment 791632 [details] [diff] [review]
> Part 4 - Added the possibility to mark a location in the log
> 
> Review of attachment 791632 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Leaving r? for now until I understand the need for the "silence" logs
> 
> ::: content/media/AudioStream.cpp
> @@ +578,5 @@
> >  
> >    // Size mBuffer for one second of audio.  This value is arbitrary, and was
> >    // selected based on the observed behaviour of the existing AudioStream
> >    // implementations.
> > +  uint32_t bufferLimit = FramesToBytes(aRate / 10);
> 
> This value changed, but the comment didn't.  Something needs to be adjusted
> (likely the comment)

This needs to go away. 

> @@ +894,5 @@
> > +      }
> > +    }
> > +    if (isSilence) {
> > +      LogLatency(AsyncLatencyLogger::AudioStream, "Buffer is silent");
> > +    }
> 
> Do we need all this code, in that we have separate code that logs
> silence-insertion?  Since this is a floating-point != 0.0, so it won't
> trigger on "normal" silence, just on inserted silence.  So I don't see the
> worth on comparing every sample against 0.0f.  If there's a reason, is it
> needed always, or should it be an option?  A logging level?  Build-time
> option?  Without understanding the reason for this, it's hard to evaluate.

This is something I used to debug another issue, that got folded somehow in this patch. I agree this can go, this is debugging code, really.
(In reply to Paul Adenot (:padenot) from comment #36)
> (In reply to Randell Jesup [:jesup] from comment #31)
> > Comment on attachment 791631 [details] [diff] [review]
> > Part 3 - Add latency API and python script to generate graph for logs
> > 
> > Review of attachment 791631 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > r- until I feel comfortable about the perf impact of the amount of logging. 
> > I would prefer to batch them if possible (and that may help reduce the
> > "log-spam" impact and the amount of IO/disk traffic).  It's close to an r+
> > though.
> 
> I've got ideas for that, some sort of smart aggregation that uses the
> derivative of the latency value, and writes the value to the log iff the
> value of the derivative is higher than some finely tuned number (depending
> on what we are measuring). I could observe that I was not really losing
> information, but the cross-thread messaging and IO were reduced. 

For log messages (for later human analysis), periodic state logs (1/s for example) and/or periodic histogram dumps are reasonable and generally more useful than a ton of per-frame debugs.  Logging out-of-spec events can be useful as well.  This could be enabled at a lower debugging level (for now) or by default in debug builds if the overhead at that level is reasonable.  If the overhead for this level of logging is low enough, this data could also be fed into telemetry (and available via the stats API).

Periodic lower-data logging is something I think we can do for 26.

> 
> > ::: content/media/AudioStream.cpp
> > @@ +906,5 @@
> > >    WriteDumpFile(mDumpFile, this, aFrames, aBuffer);
> > > +  mLatency -= servicedFrames;
> > > +  {
> > > +    uint32_t latency = -1;
> > > +    if (cubeb_stream_get_latency(mCubebStream, &latency)) {
> > 
> > I'd like to make sure we don't use any significant cycles if latency logging
> > isn't on.  Perhaps we can wrap blocks like this one with a test of the
> > logging level (and #ifdef PR_LOGGING or #ifdef AUDIO_LATENCY_LOGGING, if
> > we're not having it be always-on)
> 
> Yes, keep in mind this patch queue was quickly written to debug some stuff,
> obviously not ready for landing. We need to make sure one can at least
> activate (if not deactivate) the pref without restarting the browser: I can
> imagine someone experiencing a weird behavior _during_ a call, flipping the
> pref, and giving us the log, which is always better than "the latency was
> terrible".

Right - either we want to run the data all the time (but there's a perf impact to that in any case), or more likely turn it on when they do "something" (open about:webrtc).

However, I'm fine with that being "stage 2" of this patch, and NSPR_LOG_MODULES for enabling in stage 1.

> > If the code is not timing-sensitive, we can PR_LOG directly.
> > If it is sensitive, we can test if logging is enabled, and if so we could
> > batch up values, and then dispatch a large number at once (10, 100, more).
> 
> I did the AsyncLogger in the first place _because_ using PR_LOG directly was
> affecting the latency and results in the first place (especially on Android,
> but I could observe differences on desktop as well, although not as severe).

I guessed as much, but it was worth askign.

> 
> > ::: content/media/MediaSegment.h
> > @@ +180,5 @@
> > >      } else {
> > >        mChunks.InsertElementAt(0)->SetNull(aDuration);
> > >      }
> > > +#ifdef MOZILLA_INTERNAL_API
> > > +    mChunks[0].mTimeStamp = mozilla::TimeStamp::Now();
> > 
> > Eventually I'd like to take the timestamp from capture (or as close as we
> > can get to it) and have it follow the chunks of data through the system (and
> > if so, we could calculate relative deltas)
> 
> Well, this is measuring the amount of time the chunk stays in the graph, and
> as proven its usefulness already. I need that so I can do some sort of
> regression testing when I'll refactor the MSG to use cubeb's clock. Also, I
> believe it is good to have a high granularity, if needed, to rule out
> modules quickly when debugging an issue.

Sure; my comment was we're going to have a "sample-time" at some point (and "playout time" on the play side), and if so we probably only need to record deltas off of that.  It also then enables us to directly measure the in-to-out time of a sample, not try to infer that from the sum of the individual latencies.  In any case, it was just a comment on possible future direction.

> 
> > ::: graph.py
> > @@ +1,1 @@
> > > +#!/usr/bin/python
> > 
> > MPL 2
> 
> Yes, but this program is a short term solution. It should be very easy to
> come up with something more robust using a <canvas> (I'm thinking
> about:webrtc, here).

Sure, but again I'm thinking of a phase 1 here to get some support in the tree for 26.
Flags: needinfo?(rjesup)
Comment on attachment 791632 [details] [diff] [review]
Part 4 - Added the possibility to mark a location in the log

Given padenot's comment, we don't need this
Attachment #791632 - Flags: review?(rjesup)
Attachment #791632 - Flags: review?(paul)
Attachment #791632 - Flags: review-
Attachment #791560 - Attachment is obsolete: true
Attachment #791560 - Flags: review?(rjesup)
Attachment #791560 - Flags: review?(paul)
Attachment #791561 - Attachment is obsolete: true
Attachment #791561 - Flags: review?(paul)
Attachment #791631 - Attachment is obsolete: true
Attachment #791631 - Flags: review?(paul)
Attachment #791633 - Attachment is obsolete: true
Attachment #791633 - Flags: review?(paul)
Attachment #791633 - Flags: review?
Attachment #803312 - Flags: review?(paul)
Attachment #803313 - Flags: review?(paul)
Attachment #803314 - Flags: review?(paul)
Attachment #791632 - Attachment is obsolete: true
Attached file nspr.pdf
Shows a pc_test call with bug 901831 causing ever-increasing latency
FYI, I haven't added the MPL2 header to graph.py; I'll do that before checkin
Comment on attachment 803317 [details] [diff] [review]
Part 4 - Add a content API to set the requested latency

Is navigator a reasonable place to expose the cubeb latency?  I think my earlier needinfo was misunderstood....
Attachment #803317 - Flags: review?(paul)
Comment on attachment 803314 [details] [diff] [review]
Part 3 - Log latency, and adds a python script to understand the log

Review of attachment 803314 [details] [diff] [review]:
-----------------------------------------------------------------

(initial review comments, I'll continue when I'm home).

::: content/media/AudioStream.cpp
@@ +382,5 @@
>    nsAutoRef<cubeb_stream> mCubebStream;
>  
>    uint32_t mBytesPerFrame;
>  
> +  int32_t mLatency;

This is unused.

::: content/media/Latency.cpp
@@ +93,5 @@
> +{
> +  MutexAutoLock lock(mMutex);
> +  if (mStart.IsNull()) {
> +    mStart = TimeStamp::Now();
> +    nsresult rv = NS_NewNamedThread("Media Log", getter_AddRefs(mThread));

"Media Latency Log" maybe.

@@ +94,5 @@
> +  MutexAutoLock lock(mMutex);
> +  if (mStart.IsNull()) {
> +    mStart = TimeStamp::Now();
> +    nsresult rv = NS_NewNamedThread("Media Log", getter_AddRefs(mThread));
> +    NS_ENSURE_SUCCESS(rv,);

NS_ENSURE_SUCCESS_VOID

::: content/media/Latency.h
@@ +13,5 @@
> +
> +PRLogModuleInfo* GetLatencyLog();
> +
> +// This class is a singleton, and works only when there is only one active
> +// stream. It is refcounted.

This now works when there are multiple streams.

::: content/media/Makefile.in
@@ +16,5 @@
>  endif
>  
>  CFLAGS   += $(GSTREAMER_CFLAGS)
>  CXXFLAGS += $(GSTREAMER_CFLAGS)
> +DEFINES  += -DMOZILLA_INTERNAL_API

What is this?

::: content/media/webrtc/MediaEngineWebRTCAudio.cpp
@@ +2,5 @@
>   * License, v. 2.0. If a copy of the MPL was not distributed with this file,
>   * You can obtain one at http://mozilla.org/MPL/2.0/. */
>  
>  #include "MediaEngineWebRTC.h"
> +#include "Latency.h"

Is this really needed?
Comment on attachment 803312 [details] [diff] [review]
Part 1 - Add a way to get cube latency

Review of attachment 803312 [details] [diff] [review]:
-----------------------------------------------------------------

(initial review comments, I'll continue when I'm home).

This patch is lacking a cubeb_wasapi.cpp implementation assignment in the vtable, so this will likely crash on Windows Vista+. The implementation is already there, just uncomment the line at the bottom of the file and you should be good to go.

Maybe we want kinetik to look at this, it does not make sense for me to review my own code, even though you already had a look at it.

::: media/libcubeb/src/cubeb_audiotrack.c
@@ +433,5 @@
> +
> +  /* Android returns the latency in ms, we want it in frames. */
> +  *latency = stream->context->klass.latency(stream->instance);
> +  /* with rate <= 96000, we won't overflow until 44.739 seconds of latency */
> +  *latency = (*latency * stream->params.rate) / 1000;

This is probably wrong, I'm not sure I tested it properly. And I don't trust the Android audio stack at all.

::: media/libcubeb/src/cubeb_opensl.c
@@ +415,5 @@
>  
> +int
> +opensl_stream_get_latency(cubeb_stream * stm, uint32_t * latency)
> +{
> +  *latency = NBUFS * stm->queuebuf_len;

We should improve that somehow, à la cubeb_winmm.c if we can't find a API that gives us the latency (and last time I checked, I could not find one). This is somewhat correct, though.

::: media/libcubeb/src/cubeb_sndio.c
@@ +318,5 @@
> +sndio_stream_get_latency(cubeb_stream * stm, uint32_t * latency)
> +{
> +  // http://www.openbsd.org/cgi-bin/man.cgi?query=sio_open
> +  // in the "Measuring the latency and buffers usage" paragraph.
> +  return stm->wrpos - stm->rdpos;

> *latency = stm->wrpos - stm->rdpos;
> return CUBEB_OK;

This is what you get when you write code for a platform you can't test. :-)
Attachment #803312 - Flags: review?(kinetik)
Attachment #803313 - Flags: review?(paul) → review+
Attachment #791563 - Flags: review?(rjesup)
Attachment #791563 - Flags: review?(paul)
Comment on attachment 803312 [details] [diff] [review]
Part 1 - Add a way to get cube latency

I'll take kinetik or roc; so adding roc.  (If I need to land an update or uplift one, I'm ok doing that as this is for debug/monitoring).
Attachment #803312 - Flags: review?(roc)
Comment on attachment 803317 [details] [diff] [review]
Part 4 - Add a content API to set the requested latency

r? to roc on exposing the requested/actual latency on Navigator - the rest of the patch was reviewed by me, and any mods I made since then (minor) are being reviewed by paul
Attachment #803317 - Flags: review?(roc)
Comment on attachment 803312 [details] [diff] [review]
Part 1 - Add a way to get cube latency

Review of attachment 803312 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/libcubeb/include/cubeb.h
@@ +238,5 @@
> +    between the time cubeb acquires the data in the callback and the listener
> +    can hear the sound.
> +    @param stream
> +    @retval CUBEB_OK
> +    @retval CUBEBERROR */

s/CUBEBERROR/CUBEB_ERROR/
Attachment #803312 - Flags: review?(paul)
Comment on attachment 803317 [details] [diff] [review]
Part 4 - Add a content API to set the requested latency

Review of attachment 803317 [details] [diff] [review]:
-----------------------------------------------------------------

It is meaningless to have the latency exposed on navigator, because it is per stream. In fact, I think we should not take this patch (minus the cubeb_wasapi.cpp bit), as it does not serve the goal we have for this bug (latency logging).

Keep in mind we have "media.cubeb_latency_ms" if we have some user that want to experiment.

::: content/media/Latency.cpp
@@ +22,5 @@
> +      // other devices and b2g have terrible backends.
> +      #define DEFAULT_LATENCY_TARGET 100
> +    #endif
> +  #endif
> +#endif

We should not land this right now, I'm going to take care of it in bug 907817. Plus this is outdated and hacky.

@@ +153,5 @@
> +    return NS_ERROR_FAILURE;
> +  }
> +  mRequestedLatency = aRequestedLatency;
> +  return NS_OK;
> +}

We shouldn't land this. If someone wants to tinker with the latency setting, about:config works.

::: dom/webidl/Navigator.webidl
@@ +150,5 @@
>    readonly attribute DOMString buildID;
>    [Throws, Func="Navigator::HasPowerSupport"]
>    readonly attribute MozPowerManager mozPower;
> +  [Throws]
> +  attribute unsigned long mozRequestedLatency;

We should drop this for now.

::: media/libcubeb/src/cubeb_wasapi.cpp
@@ +878,5 @@
>    /*.stream_destroy =*/ wasapi_stream_destroy,
>    /*.stream_start =*/ wasapi_stream_start,
>    /*.stream_stop =*/ wasapi_stream_stop,
> +  /*.stream_get_position =*/ wasapi_stream_get_position,
> +  /*.stream_get_latency =*/ wasapi_stream_get_latency

OK, this is enabled in this patch.
Attachment #803317 - Flags: review?(paul) → review-
Comment on attachment 803314 [details] [diff] [review]
Part 3 - Log latency, and adds a python script to understand the log

Review of attachment 803314 [details] [diff] [review]:
-----------------------------------------------------------------

This seems okay-ish for an initial landing, and be useful as-is.

::: content/media/Latency.cpp
@@ +117,5 @@
> +{
> +  if (PR_LOG_TEST(GetLatencyLog(), PR_LOG_DEBUG)) {
> +    nsCOMPtr<nsIRunnable> event = new LogEvent(aIndex, aID, aValue);
> +    if (mThread) {
> +      mThread->Dispatch(event, NS_DISPATCH_NORMAL);

Didn't you say you had some throttling code?

::: content/media/Makefile.in
@@ +16,5 @@
>  endif
>  
>  CFLAGS   += $(GSTREAMER_CFLAGS)
>  CXXFLAGS += $(GSTREAMER_CFLAGS)
> +DEFINES  += -DMOZILLA_INTERNAL_API

I now know.

::: graph.py
@@ +1,1 @@
> +#!/usr/bin/python

This has to be moved somewhere nicer than mozilla-central/
Attachment #803314 - Flags: review?(paul) → review+
Comment on attachment 803312 [details] [diff] [review]
Part 1 - Add a way to get cube latency

Review of attachment 803312 [details] [diff] [review]:
-----------------------------------------------------------------

I feel I should not review this, not without Matthew's permission anyway.
Attachment #803312 - Flags: review?(roc) → review-
Comment on attachment 803317 [details] [diff] [review]
Part 4 - Add a content API to set the requested latency

Review of attachment 803317 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/webidl/Navigator.webidl
@@ +152,5 @@
>    readonly attribute MozPowerManager mozPower;
> +  [Throws]
> +  attribute unsigned long mozRequestedLatency;
> +  [Throws]
> +  readonly  attribute unsigned long mozActualLatency;

If this is for debugging (for now), it should be PrefControlled and not prefixed. I think we should just call it "audioLatency".

The other attribute should also be PrefControlled and should be requestedAudioLatency if we land it.
(In reply to Paul Adenot (:padenot) from comment #53)

> ::: content/media/Latency.cpp
> @@ +117,5 @@
> > +{
> > +  if (PR_LOG_TEST(GetLatencyLog(), PR_LOG_DEBUG)) {
> > +    nsCOMPtr<nsIRunnable> event = new LogEvent(aIndex, aID, aValue);
> > +    if (mThread) {
> > +      mThread->Dispatch(event, NS_DISPATCH_NORMAL);
> 
> Didn't you say you had some throttling code?

I measured overhead in an opt build with this forced on.  The logging thread used 0.7% CPU (in a loopback test - two peerconnections; 4 output streams (2 local, 2 remote)).  The overhead for LogLatency() was effectively 0 with logging off (<<0.01%), with logging on it was just under 0.1%.  So while chunking might be useful, it's not necessary given my adding the PR_LOG_TEST() to avoid overhead when it's off.

> ::: content/media/Makefile.in
> @@ +16,5 @@
> >  endif
> >  
> >  CFLAGS   += $(GSTREAMER_CFLAGS)
> >  CXXFLAGS += $(GSTREAMER_CFLAGS)
> > +DEFINES  += -DMOZILLA_INTERNAL_API
> 
> I now know.
> 
> ::: graph.py
> @@ +1,1 @@
> > +#!/usr/bin/python
> 
> This has to be moved somewhere nicer than mozilla-central/

yes (likely in tools).  And also the MPL headers are missing on that and Latency.*; I've added them locally
Attachment #803317 - Attachment is obsolete: true
Attachment #803317 - Flags: review?(roc)
Comment on attachment 804621 [details] [diff] [review]
Part 4 - Add windows wasapi latency entries for cubeb

Minimal portion of former patch 4
Attachment #804621 - Flags: review?(paul)
Attachment #804621 - Flags: review?(paul) → review?(kinetik)
Comment on attachment 803312 [details] [diff] [review]
Part 1 - Add a way to get cube latency

roc forwarded the review to kinetik
Attachment #803312 - Flags: review-
Comment on attachment 803312 [details] [diff] [review]
Part 1 - Add a way to get cube latency

Review of attachment 803312 [details] [diff] [review]:
-----------------------------------------------------------------

Paul caught all the good stuff, this is mostly nits.  r+ with all review comments addressed.

::: media/libcubeb/include/cubeb.h
@@ +236,5 @@
>  
> +/** Get the latency for this stream, in frames. This is the number of frames
> +    between the time cubeb acquires the data in the callback and the listener
> +    can hear the sound.
> +    @param stream

Document latency parameter.

::: media/libcubeb/src/cubeb_alsa.c
@@ +1012,5 @@
> +int
> +alsa_stream_get_latency(cubeb_stream * stm, uint32_t * latency)
> +{
> +  snd_pcm_sframes_t delay;
> +  // This function returns the delay in frames until a frame written using

Use standard C comment syntax.

::: media/libcubeb/src/cubeb_audiounit.c
@@ +374,5 @@
> +int
> +audiounit_stream_get_latency(cubeb_stream * stm, uint32_t * latency)
> +{
> +  pthread_mutex_lock(&stm->mutex);
> +  if (stm->hw_latency_frames == UINT64_MAX) {

Invert the logic here and return early to avoid nesting.

@@ +406,5 @@
> +                                   0,
> +                                   &size,
> +                                   &output_device_id);
> +    if (r == noErr) {
> +

No newline.  Early return on error to avoid nesting.

@@ +414,5 @@
> +                               kAudioUnitScope_Global,
> +                               0,
> +                               &unit_latency_sec,
> +                               &size);
> +      if (r == noErr) {

And here.

@@ +422,5 @@
> +                                       0,
> +                                       NULL,
> +                                       &size,
> +                                       &device_latency_frames);
> +        if (r == noErr) {

And here.  Also, no extra newline.

@@ +425,5 @@
> +                                       &device_latency_frames);
> +        if (r == noErr) {
> +
> +          size = sizeof(device_safety_offset);
> +          r = AudioObjectGetPropertyData(output_device_id,

Handle error here.

::: media/libcubeb/src/cubeb_pulse.c
@@ +580,5 @@
> +
> +  r = WRAP(pa_stream_get_latency)(stm->stream, &r_usec, &negative);
> +
> +  assert(!negative);
> +

Collapse the two newlines here.

::: media/libcubeb/src/cubeb_winmm.c
@@ +63,5 @@
>    int draining;
>    HANDLE event;
>    HWAVEOUT waveout;
>    CRITICAL_SECTION lock;
> +  int64_t written;

Unless this can ever be negative, make it uint64_t.
Attachment #803312 - Flags: review?(kinetik) → review+
Attachment #804621 - Flags: review?(kinetik) → review+
There were also crashes on other platforms:
https://tbpl.mozilla.org/?rev=a317a2d59ed5
Comment on attachment 805445 [details] [diff] [review]
interdiff to resolve Try oranges

Review of attachment 805445 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/Latency.cpp
@@ +23,5 @@
>    "AudioStream",
>    "NetStat"
>  };
>  
> +nsRefPtr<AsyncLatencyLogger> gAsyncLogger;

Should this also be declared static since it is local to this file?
Attachment #805445 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/4a75349b7ec6
https://hg.mozilla.org/mozilla-central/rev/d37fd7a9d31a
https://hg.mozilla.org/mozilla-central/rev/3456780c25fa
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: mozilla26 → mozilla27
Depends on: 919176
Blocks: 920325
Blocks: 920328
Blocks: 920329
You need to log in before you can comment on or make changes to this bug.