Closed Bug 1569250 Opened 5 months ago Closed 3 months ago

Add telemetry for mDNS use in WebRTC

Categories

(Core :: WebRTC: Networking, enhancement, P2)

70 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox70 --- fixed
firefox71 --- fixed

People

(Reporter: dminor, Assigned: dminor)

References

(Blocks 1 open bug)

Attachments

(2 files)

I think what we want to be able to do is to distinguish legitimate WebRTC connections from fingerprinting attempts and to ensure that using mDNS doesn't make connectivity worse for legitimate connections.

We'll want to keep track of the number of times we generate an mDNS address, which I expect to be quite large. For example, visiting aliexpress.com generates two on page load, and one for each subsequent page after that.

For legitimate uses, we'll want to determine the number of times connectivity was established, and maybe how long it takes. For this data to be useful, we'd need non-mDNS data to compare it against; otherwise we can't really evaluate the effect of enabling mDNS.

We generate an mDNS address for pages where the user has not granted audio or video capture permissions. We could add telemetry that counts how often that situation occurs and data for the subsequent ICE connection, which we could use to evaluate the effect of enabling mDNS. For that to work, we'd have to land it a version before mDNS lands or set up some sort of study where mDNS is enabled for a portion of the population.
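
To make the proposal above a bit more concrete, here is a rough, standalone sketch of the bookkeeping it implies: a count of generated mDNS addresses, plus ICE outcomes and time-to-connect kept separately for the mDNS and non-mDNS cases so the two can be compared. The `MdnsTelemetry` struct and all of its members are hypothetical names made up for illustration; they are not the probes or the code from the actual patch.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the probes discussed above; none of these
// names come from the Firefox source tree.
struct MdnsTelemetry {
  std::uint64_t addressesGenerated = 0;  // every mDNS address we create
  std::uint64_t iceSuccess[2] = {0, 0};  // index 0: no mDNS, index 1: mDNS
  std::uint64_t iceFailure[2] = {0, 0};
  std::chrono::milliseconds iceSuccessTime[2] = {
      std::chrono::milliseconds{0}, std::chrono::milliseconds{0}};

  void OnMdnsAddressGenerated() { ++addressesGenerated; }

  // Called once per connection when ICE leaves the "checking" state.
  void OnIceFinished(bool usedMdns, bool connected,
                     std::chrono::milliseconds elapsed) {
    assert(elapsed.count() >= 0);
    const int i = usedMdns ? 1 : 0;
    if (connected) {
      ++iceSuccess[i];
      iceSuccessTime[i] += elapsed;
    } else {
      ++iceFailure[i];
    }
  }

  // Fraction of finished connections that succeeded, or -1 with no data.
  double SuccessRate(bool usedMdns) const {
    const int i = usedMdns ? 1 : 0;
    const std::uint64_t total = iceSuccess[i] + iceFailure[i];
    return total == 0 ? -1.0 : static_cast<double>(iceSuccess[i]) / total;
  }
};
```

Comparing `SuccessRate(true)` against `SuccessRate(false)` is the kind of baseline comparison described above; it only means something if the non-mDNS numbers are collected from a comparable population.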

As this is an ongoing task, mark it as P2 for now.

Priority: -- → P2
Assignee: nobody → dminor
Attached file bug-1569250-request.md
Attachment #9090815 - Flags: data-review?(chutten)
Comment on attachment 9090815 [details]
bug-1569250-request.md

DATA COLLECTION REVIEW RESPONSE:

    Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes. This collection is Telemetry so is documented in its definitions files [Histograms.json](https://hg.mozilla.org/mozilla-central/file/tip/toolkit/components/telemetry/Histograms.json) and [Scalars.yaml](https://hg.mozilla.org/mozilla-central/file/tip/toolkit/components/telemetry/Scalars.yaml), and the [Probe Dictionary](https://telemetry.mozilla.org/probe-dictionary/).

    Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

    If the request is for permanent data collection, is there someone who will monitor the data over time?

No. This collection will expire in Firefox 77.

    Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 2, Interaction.

    Is the data collection request for default-on or default-off?

Default on for pre-release channels only.

    Does the instrumentation include the addition of any new identifiers?

No.

    Is the data collection covered by the existing Firefox privacy notice?

Yes.

    Does there need to be a check-in in the future to determine whether to renew the data?

Yes. :dminor is responsible for renewing or removing the collection before it expires in Firefox 77.

---
Result: datareview+
Attachment #9090815 - Flags: data-review?(chutten) → data-review+
Pushed by dminor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ef18dbaff6c7
Add telemetry for mDNS use in WebRTC; r=bwc

Backed out changeset ef18dbaff6c7 for causing failures in AccumulateTimeDelta.

Backout link: https://hg.mozilla.org/integration/autoland/rev/93757a6230ebbd950e3bb393c8c844134efd4e6a

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=success%2Ctestfailed%2Cbusted%2Cexception&tochange=93757a6230ebbd950e3bb393c8c844134efd4e6a&fromchange=ef18dbaff6c7671eac7642f228d458fe15dcc6b9&selectedJob=265708891

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=265708891&repo=autoland&lineNumber=23289

[task 2019-09-09T13:54:54.410Z] 13:54:54 INFO - GECKO(1693) | Assertion failure: !IsNull() (Cannot compute with a null value), at /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/TimeStamp.h:556
[task 2019-09-09T13:55:14.307Z] 13:55:14 INFO - GECKO(1693) | #01: mozilla::PeerConnectionImpl::IceConnectionStateChange(mozilla::dom::RTCIceConnectionState) [media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp:2456]
[task 2019-09-09T13:55:14.307Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.307Z] 13:55:14 INFO - GECKO(1693) | #02: mozilla::PeerConnectionMedia::IceConnectionStateChange_m(mozilla::dom::RTCIceConnectionState) [media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp:0]
[task 2019-09-09T13:55:14.307Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.307Z] 13:55:14 INFO - GECKO(1693) | #03: mozilla::runnable_args_memfn<mozilla::PeerConnectionMedia*, void (mozilla::PeerConnectionMedia::*)(mozilla::dom::RTCIceConnectionState), mozilla::dom::RTCIceConnectionState>::Run() [media/mtransport/runnable_utils.h:150]
[task 2019-09-09T13:55:14.307Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.308Z] 13:55:14 INFO - GECKO(1693) | #04: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1214]
[task 2019-09-09T13:55:14.308Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.308Z] 13:55:14 INFO - GECKO(1693) | #05: NS_ProcessNextEvent(nsIThread*, bool) [xpcom/threads/nsThreadUtils.cpp:486]
[task 2019-09-09T13:55:14.308Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.308Z] 13:55:14 INFO - GECKO(1693) | #06: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:88]
[task 2019-09-09T13:55:14.308Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.308Z] 13:55:14 INFO - GECKO(1693) | #07: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:291]
[task 2019-09-09T13:55:14.308Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO - GECKO(1693) | #08: nsBaseAppShell::Run() [widget/nsBaseAppShell.cpp:139]
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO - GECKO(1693) | #09: nsAppShell::Run() [widget/cocoa/nsAppShell.mm:705]
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO - GECKO(1693) | #10: XRE_RunAppShell() [toolkit/xre/nsEmbedFunctions.cpp:934]
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO - GECKO(1693) | #11: mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:238]
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.309Z] 13:55:14 INFO - GECKO(1693) | #12: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:291]
[task 2019-09-09T13:55:14.310Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.310Z] 13:55:14 INFO - GECKO(1693) | #13: XRE_InitChildProcess(int, char**, XREChildData const*) [toolkit/xre/nsEmbedFunctions.cpp:773]
[task 2019-09-09T13:55:14.310Z] 13:55:14 INFO -
[task 2019-09-09T13:55:14.310Z] 13:55:14 INFO - GECKO(1693) | #14: main [ipc/app/MozillaRuntimeMain.cpp:23]
<...>
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - PROCESS-CRASH | Main app process exited normally | application crashed [@ mozilla::Telemetry::AccumulateTimeDelta(mozilla::Telemetry::HistogramID, mozilla::TimeStamp, mozilla::TimeStamp)]
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - Crash dump filename: /var/folders/m5/sbqrw2ms2m75ctb4zdqyxbt4000017/T/tmpvoylVn.mozrunner/minidumps/BC04EB75-8205-48E9-B54A-FA8DCC0CEBD2.dmp
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - Operating system: Mac OS X
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - 10.14.5 18F132
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - CPU: amd64
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - family 6 model 69 stepping 1
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - 4 CPUs
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO -
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - GPU: UNKNOWN
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO -
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - Crash reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - Crash address: 0x0
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - Process uptime: 198 seconds
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO -
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - Thread 0 (crashed)
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - 0 XUL!mozilla::Telemetry::AccumulateTimeDelta(mozilla::Telemetry::HistogramID, mozilla::TimeStamp, mozilla::TimeStamp) [Telemetry.cpp:ef18dbaff6c7671eac7642f228d458fe15dcc6b9 : 1975 + 0x29]
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - rax = 0x000000010b71ef88 rdx = 0x0000000000000000
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - rcx = 0x000000010cea8de8 rbx = 0x0000000000000000
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - rsi = 0x00000000000120a8 rdi = 0x00007fff9b15e028
[task 2019-09-09T13:55:19.871Z] 13:55:19 INFO - rbp = 0x00007ffeecd0fc30 rsp = 0x00007ffeecd0fc20
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - r8 = 0x00000000000130a8 r9 = 0x00007fff9b15e048
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - r10 = 0x0000000000000000 r11 = 0x00007fff9b15e040
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - r12 = 0x0000000000000000 r13 = 0x000000010d076110
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - r14 = 0x0000000114114000 r15 = 0x0000000000000004
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - rip = 0x00000001089d48fe
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - Found by: given as instruction pointer in context
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - 1 XUL!mozilla::PeerConnectionImpl::IceConnectionStateChange(mozilla::dom::RTCIceConnectionState) [PeerConnectionImpl.cpp:ef18dbaff6c7671eac7642f228d458fe15dcc6b9 : 2476 + 0x10]
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - rbp = 0x00007ffeecd0fc60 rsp = 0x00007ffeecd0fc40
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - rip = 0x000000010421c9e0
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - Found by: previous frame's frame pointer
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - 2 XUL!mozilla::PeerConnectionMedia::IceConnectionStateChange_m(mozilla::dom::RTCIceConnectionState) [PeerConnectionMedia.cpp:ef18dbaff6c7671eac7642f228d458fe15dcc6b9 : 665 + 0x9]
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - rbp = 0x00007ffeecd0fc90 rsp = 0x00007ffeecd0fc70
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - rip = 0x0000000104220e25
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - Found by: previous frame's frame pointer
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - 3 XUL!mozilla::runnable_args_memfn<mozilla::PeerConnectionMedia*, void (mozilla::PeerConnectionMedia::*)(mozilla::dom::RTCIceConnectionState), mozilla::dom::RTCIceConnectionState>::Run() [runnable_utils.h:ef18dbaff6c7671eac7642f228d458fe15dcc6b9 : 148 + 0x1b]
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - rbp = 0x00007ffeecd0fca0 rsp = 0x00007ffeecd0fca0
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - rip = 0x0000000104266df6
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - Found by: previous frame's frame pointer
[task 2019-09-09T13:55:19.872Z] 13:55:19 INFO - 4 XUL!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:ef18dbaff6c7671eac7642f228d458fe15dcc6b9 : 1225 + 0x6]
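
The assertion in the log (`!IsNull()`, hit from `IceConnectionStateChange` via `AccumulateTimeDelta`) suggests a time delta was computed from a timestamp that had never been set. The bug does not say how the relanded patch dealt with this, so the following is only a sketch of one possible guard, written against `std::chrono` and `std::optional` rather than `mozilla::TimeStamp`. `IceTimer` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Clock = std::chrono::steady_clock;

// Hypothetical model of the state the stack trace points at: a start time
// that only exists once ICE has entered "checking".
struct IceTimer {
  std::optional<Clock::time_point> checkingStart;  // empty plays the role of a null timestamp

  void OnChecking(Clock::time_point now) { checkingStart = now; }

  // Returns the elapsed time if a start was recorded. Otherwise returns
  // nothing, so the caller skips accumulation instead of computing a
  // delta from a null value.
  std::optional<std::chrono::milliseconds> OnConnectedOrFailed(
      Clock::time_point now) {
    if (!checkingStart) {
      return std::nullopt;
    }
    assert(now >= *checkingStart);
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::milliseconds>(
            now - *checkingStart);
    checkingStart.reset();  // at most one sample per "checking" period
    return elapsed;
  }
};
```

With a guard like this, a transition to "connected" or "failed" that was never preceded by "checking" records no sample at all rather than crashing.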

Flags: needinfo?(dminor)
Pushed by dminor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fd2934cca1ae
Add telemetry for mDNS use in WebRTC; r=bwc
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71

Changing the high value is not supported, as it changes the layout of the histogram's buckets. The way to handle this is to evolve the histogram into a _2 version which has the values you want. (why, yes, this has happened before, and sometimes more than once).
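
To illustrate why the high value cannot simply be edited in place, here is a small sketch that computes bucket lower bounds for an exponential histogram. It is modeled on the Chromium-style scheme that, as far as I know, Telemetry's exponential histograms derive from; treat it as an approximation for illustration, not the exact Telemetry code. `ExponentialBuckets` is a hypothetical helper, and the low/high/count values in the usage note are made-up examples.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Lower bounds of `count` buckets: bucket 0 starts at 0, bucket 1 at `low`,
// and the remaining bounds are spread logarithmically up to `high`.
std::vector<long> ExponentialBuckets(long low, long high, int count) {
  assert(low >= 1 && high > low && count >= 3);
  std::vector<long> lower(count);
  lower[0] = 0;
  lower[1] = low;
  long current = low;
  const double logMax = std::log(static_cast<double>(high));
  for (int i = 2; i < count; ++i) {
    const double logCurrent = std::log(static_cast<double>(current));
    // Divide the remaining log-range evenly over the remaining buckets.
    const double logRatio = (logMax - logCurrent) / (count - i);
    const long next = static_cast<long>(
        std::floor(std::exp(logCurrent + logRatio) + 0.5));
    // Bounds must strictly increase, even when rounding collapses them.
    current = next > current ? next : current + 1;
    lower[i] = current;
  }
  return lower;
}
```

Calling `ExponentialBuckets(1, 600, 10)` and `ExponentialBuckets(1, 600000, 10)` gives two different sets of boundaries, so samples already stored under the old layout would no longer line up with buckets of the new one. That is the reason for creating a fresh _2 histogram.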

Oh, and a pro tip: there's a "Get Shortlink" button at the top of the Measurement Dashboard for easier link sharing, if you'd like.

And from the Data Stewardship point-of-view, this is the same data collection, just in a different form. No extra Data Review needed, so long as you don't change the population or expiry.

...while I'm here... from the description, this is recording the state transition duration from "checking" to either "connected" or "failed". In seconds. Nearly 40% (more than 100k) of your samples are at least 600 seconds (ten whole minutes!). Is this expected? I wouldn't expect this to be a normal length of time for anything to do with webrtc except call lengths.

I'd be tempted to take a look into the code that performs the measurement and make sure there isn't something strange going on. Maybe it's only failures that take this long and splitting this into two histograms would be clearer? Maybe the units are wrong and this isn't seconds at all?

Flags: needinfo?(chutten)

Oops, it turns out Telemetry::AccumulateTimeDelta works in milliseconds, not seconds :/
https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/core/Telemetry.cpp#2012
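
A small sketch of how the unit mix-up would produce the pile-up at 600 noted above: if the probe was sized for seconds but the delta is accumulated in milliseconds, any transition longer than 0.6 s lands in the top bucket. The helper names are hypothetical, and the high value of 600 is an assumption inferred from the "at least 600 seconds" remark.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

using Duration = std::chrono::steady_clock::duration;

// Sample value when the delta is taken in milliseconds.
inline std::int64_t SampleInMilliseconds(Duration d) {
  return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
}

// Sample value when the delta is taken in whole seconds (truncating).
inline std::int64_t SampleInSeconds(Duration d) {
  return std::chrono::duration_cast<std::chrono::seconds>(d).count();
}

// Samples above the histogram's high value all end up in the last bucket.
inline std::int64_t ClampToHigh(std::int64_t sample, std::int64_t high) {
  assert(high > 0);
  return sample > high ? high : sample;
}
```

For a 2.5 second ICE check, `ClampToHigh(SampleInMilliseconds(d), 600)` yields 600 while `ClampToHigh(SampleInSeconds(d), 600)` yields 2, which would account for ordinary connection times showing up as "ten whole minutes".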

See Also: → 1581577

Comment on attachment 9090814 [details]
Bug 1569250 - Add telemetry for mDNS use in WebRTC; r=bwc!

Beta/Release Uplift Approval Request

  • User impact if declined: I would like to land this on 70 to give us a baseline to compare with when we enable hostname obfuscation on nightly. Without the baseline, it will be difficult to establish if enabling hostname obfuscation is having a negative impact on users.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: Bug 1581577
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Low risk, this just adds telemetry, and has been on Nightly for over a week without problems.
  • String changes made/needed: None
Attachment #9090814 - Flags: approval-mozilla-beta?

Comment on attachment 9090814 [details]
Bug 1569250 - Add telemetry for mDNS use in WebRTC; r=bwc!

Adds some new Telemetry probes, approved for 70.0b8.

Attachment #9090814 - Flags: approval-mozilla-beta? → approval-mozilla-beta+