Closed Bug 1380786 Opened 7 years ago Closed 5 years ago

Can we now use standard telemetry for webrtc stats?

Categories

(Core :: WebRTC: Signaling, enhancement, P3)

49 Branch
enhancement

Tracking


RESOLVED FIXED
mozilla72
Tracking Status
firefox72 --- fixed

People

(Reporter: chutten, Assigned: dminor)

Details

(Whiteboard: [measurement:client:tracking])

Attachments

(2 files)

webrtc stats are one of the few remaining pieces of telemetry captured on childPayloads instead of being aggregated to the parent.

Unfortunately, webrtc stats are by their nature quite complicated.

We first introduced a pair of probes in bug 970690.
Then we introduced a custom struct in bug 1198883.

Luckily, no one's using that custom struct yet so we can still change it further if we're clever. If we can use standard telemetry primitives (Histograms, Scalars, whatever) we can remove the custom webrtc handling and get client child aggregation for free.

The trick is that we need to record two 2^11 (2048) bitstrings' worth of information[1].

Anyone have a brilliant idea?

[1]: http://searchfox.org/mozilla-central/source/media/webrtc/signaling/src/peerconnection/WebrtcGlobalInformation.cpp#1053-1080
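For illustration, the kind of bitstring encoding involved might look like the following sketch. The capability names and bit positions here are invented for the example; the real layout lives in WebrtcGlobalInformation.cpp linked above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch only: each bit flags one ICE candidate capability
// observed for one side of the call. Names are illustrative, not the
// actual in-tree bit assignments.
enum CandidateBit : uint32_t {
  kHostUdp  = 1u << 0,
  kHostTcp  = 1u << 1,
  kSrflxUdp = 1u << 2,
  kRelayUdp = 1u << 3,
  // ... 11 bits per side in total, hence 2^11 = 2048 possible bitstrings
};

// Combine the observed capabilities into one bitstring for a side.
uint32_t EncodeCandidates(bool hostUdp, bool hostTcp, bool srflxUdp) {
  uint32_t bits = 0;
  if (hostUdp)  bits |= kHostUdp;
  if (hostTcp)  bits |= kHostTcp;
  if (srflxUdp) bits |= kSrflxUdp;
  return bits;
}
```

One such bitstring is recorded per side (local/remote) and per outcome (success/failure), which is where the "two 2^11 bitstrings' worth" of state comes from.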
So we could have a 2048-bucket categorical histogram. 

Pros:
* Excellent tooling support. 
* telemetry.mozilla.org might display this exactly the way we want
* String buckets instead of bitstrings sounds easier to read to me

Cons:
* 2048 buckets... this would be the first histogram quite this wide. Might stress something.
* Have to call Accumulate multiple times per collection (once for each flipped bit). (might not be a big deal as it could replace the bitstring creation code)


If telemetry.mozilla.org won't satisfy the analysis needs, we're looking at custom analysis. At that point it doesn't really matter what format we use so long as we can munge it in python (or SQL) later. That opens things up like keyed boolean histograms (one histogram each for success/failure, keys are the bitstrings), keyed uint scalars (ditto), and probably other ideas.
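As a rough sketch of the keyed-counter flavour mentioned above (a plain map stands in for Telemetry's keyed-scalar storage here, and the key format is invented for the example):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Stand-in for a keyed uint scalar: one counter per bitstring key.
std::map<std::string, uint32_t> gIceOutcome;

// Record one connection attempt under a "success/N" or "failure/N" key,
// where N is the candidate bitstring. The key format is illustrative;
// it just needs to be something Python/SQL can split apart later.
void RecordOutcome(uint32_t bitstring, bool success) {
  std::string key = (success ? std::string("success/")
                             : std::string("failure/")) +
                    std::to_string(bitstring);
  ++gIceOutcome[key];
}
```

The appeal of this shape is that only the bitstrings actually observed in a session take up space, rather than all 2048 buckets.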


Questions
1) :drno - what analysis would you like to perform on this data when you get it?
2) :gfritzsche, :Dexter - have any brilliant ideas for storing this data? Any knowledge about internal bucket limits?
Flags: needinfo?(gfritzsche)
Flags: needinfo?(drno)
Flags: needinfo?(alessio.placitelli)
(In reply to Chris H-C :chutten from comment #1)
> So we could have a 2048-bucket categorical histogram. 

That sounds like a good idea to me, given that it's an exceptional measurement and that we won't be adding 2048-bucket histograms every other day.
I'm a bit concerned about the impact on the ping size: our serialization format would basically force all 2048 keys to be dumped in the "values" section of this histogram. Is that correct?

> 2) :gfritzsche, :Dexter - have any brilliant ideas for storing this data?
> Any knowledge about internal bucket limits?

As far as I can tell/remember, we only require the histogram to be in a whitelist if more than 100 buckets are needed.
We don't seem to enforce any other limits (other than the minimum/default of 50 buckets for categoricals).
Flags: needinfo?(alessio.placitelli)
To give a bit of history: the reason this is in custom code is that the default Histograms could not carry the 27 bits we are using. Or at least there were concerns about the size of the data to be transferred, as each of the 2^27 representations would get transferred (?).

I think the default analyses I would want to perform on this data would be something like:
- show me success vs failure percentages in case where both sides of the call had IPv6 UDP
- show me success vs failure percentages where only one side had TCP available
- show me how many Windows clients had TCP locally available
...

And obviously ;-) all of that per Firefox version, channel and OS :-)

Ideally we would have some kind of interface similar to the standard Telemetry interface where people can change OS, version, etc. in drop-downs. But obviously an initial version with hard-coded queries would be a good start as well.
Flags: needinfo?(drno)
(In reply to Nils Ohlmeier [:drno] from comment #3)
> To give a bit of history: the reason this is in custom code is that the
> default Histograms could not carry the 27 bits we are using. Or at least
> there were concerns about the size of the data to be transferred, as each
> of the 2^27 representations would get transferred (?).

The serialization/transfer of histogram data is sparse, but if you record many of those representations in a single session, that is a concern.
AFAIU, though, performance becomes a concern for the aggregator at high bucket counts.

> I think the default analyses I would want to perform on this data would be
> something like:
> - show me success vs failure percentages in case where both sides of the
> call had IPv6 UDP
> - show me success vs failure percentages where only one side had TCP
> available
> - show me how many Windows clients had TCP locally available
> ...

Can you enumerate the standard questions and add standard scalars or histograms for them?
(e.g. boolean scalar for "tcp available", boolean histogram for "success/failure with both sides having udp")
Then you would have them show up automatically in e.g. the TMO dashboard without further work.
Flags: needinfo?(gfritzsche) → needinfo?(drno)
Hey :frank, know of any perf/stability concerns for aggregating particularly wide (~2K buckets) histograms?


...but you know what, maybe we can be more clever than this. We could represent this as four (success/failure and local/remote) 11-bucket categorical histograms. Then if a bit would be flipped in the bitstring, we accumulate to that bit's bucket in the categorical histogram.

For example, a bitstring of 4 and a bitstring of 5 (both local, success) would result in values of [1, 0, 2]. So we'd know what proportion of all connections use which features over time, but not pairs of features (that information goes missing at the client).
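A minimal sketch of that per-bit accumulation (a plain array stands in for the Telemetry categorical histogram API here; the real call would be something along the lines of Telemetry::Accumulate with the bit index as the bucket):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kBits = 11;  // one histogram bucket per capability bit

// For each set bit in the connection's bitstring, bump that bit's bucket.
// In-tree there would be four such histograms: {success, failure} x
// {local, remote}.
void AccumulateBits(std::array<uint32_t, kBits>& histogram,
                    uint32_t bitstring) {
  for (int bit = 0; bit < kBits; ++bit) {
    if (bitstring & (1u << bit)) {
      ++histogram[bit];
    }
  }
}
```

Feeding it the bitstrings 4 (binary 100) and 5 (binary 101) from the example above yields counts of 1 for bit 0, 0 for bit 1, and 2 for bit 2, matching the [1, 0, 2] described.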
Flags: needinfo?(fbertsch)
(In reply to Chris H-C :chutten from comment #5)
> Hey :frank, know of any perf/stability concerns for aggregating particularly
> wide (~2K buckets) histograms?

Nope, we have a bunch that are 1K wide, and two that are 10K wide. See one here: https://mzl.la/2tc1MWf.
Flags: needinfo?(fbertsch)
Rank: 25
Component: WebRTC → WebRTC: Signaling
Priority: -- → P2
Whiteboard: [measurement:client] → [measurement:client:tracking]
Mass change P2->P3 to align with new Mozilla triage process.
Priority: P2 → P3

(In reply to Georg Fritzsche [:gfritzsche] from comment #4)
> Can you enumerate the standard questions and add standard scalars or
> histograms for them?
> (e.g. boolean scalar for "tcp available", boolean histogram for
> "success/failure with both sides having udp")
> Then you would have them show up automatically in e.g. the TMO dashboard
> without further work.

Looks like this fell through the cracks. The original intent for this telemetry was to answer the question "Why did our ICE success rate go down?". As such, we did not have a small number of things we wanted to monitor. We really did want every combination of capabilities (local and remote).

That said, I don't think anyone has looked at this telemetry in a really long time. I can't even figure out how to find this data anymore. We don't have telemetry for the overall ICE success rate, either.

Has anybody actually used this ICE candidate telemetry in the last year? We may just need to remove this.

Flags: needinfo?(na-g)
Flags: needinfo?(mfroman)
Flags: needinfo?(drno)
Flags: needinfo?(dminor)

I have not.

Flags: needinfo?(mfroman)

I have not.

Flags: needinfo?(na-g)

I'm not using it. I can take care of removing it.

Assignee: nobody → dminor
Flags: needinfo?(dminor)

Please let me know if I can be of any assistance in its removal.

This ICE candidate telemetry has not been used in a long time and in
addition requires special handling by the telemetry code. It is best
removed.

The ICE candidate telemetry recorded using this is no longer useful,
and so this code can be safely removed.

Depends on D50656

Pushed by dminor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e2536fbffa15
Remove ICE candidate telemetry; r=bwc
https://hg.mozilla.org/integration/autoland/rev/bbd49f460213
Remove WebrtcTelemetry and associated code; r=chutten
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72