Closed Bug 1654248 Opened 1 year ago Closed 1 year ago

The WEBRTC_CALL_DURATION histogram is not recorded after having a call on jitsi, google meet, or hangouts

Categories

(Core :: WebRTC, defect, P2)

Desktop
All
defect

Tracking

()

VERIFIED FIXED
81 Branch
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 - wontfix
firefox78 --- wontfix
firefox79 --- wontfix
firefox80 - wontfix
firefox81 --- verified

People

(Reporter: cmuresan, Assigned: jib)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(3 files, 1 obsolete file)

[Affected versions]:

  • Firefox Nightly 80.0a1, Build ID 20200720094507
  • Firefox Beta 79.0, Build ID 20200720193547

[Affected Platforms]:

  • Windows 10 x64
  • macOS 10.15.4
  • Ubuntu 16.04 x64

[Prerequisites:]

  • Have the following prefs set:
    • privacy.webrtc.allowSilencingNotifications to true
    • privacy.webrtc.legacyGlobalIndicator to false

[Steps to reproduce]:

  1. Open the browser with the profile from the prerequisites and navigate to meet.google.com, hangouts.google.com, or meet.jit.si.
  2. Host a call and have someone join from another device.
  3. End the call and navigate to the about:telemetry page.
  4. Search for the WEBRTC_CALL_DURATION histogram and observe the behavior.

[Expected result]:

  • The histogram is displayed.

[Actual result]:

  • The histogram is not recorded.

[Notes]:

  • The issue is also reproducible on join.me on Windows and macOS.
  • The issue is not reproducible with Zoom, Facebook, Whereby, Talky, webroom.net, jumpchat, and linkello.
  • Attached a screen recording with the issue.

The WEBRTC_CALL_DURATION measurement is done deep in the WebRTC stack, outside of the control of the front-end from what I can tell. Any idea what's happening here, jib?

Component: Site Permissions → WebRTC
Flags: needinfo?(jib)
Product: Firefox → Core

The measurement dashboard does show data. It's all crammed to the left, but even the higher buckets still have a few samples (and are thus not eliminated from the view).

Not sure about about:telemetry. So far I haven't been able to make it record, and without data I can't check whether it's a data display problem.

Tried a call on https://talky.io/ and indeed it's recorded.
about:telemetry displays that just fine.

That puts it back into Core::WebRTC as it seems to be a recording issue.

Component: Telemetry → WebRTC
Product: Toolkit → Core
Priority: -- → P3

Can you please confirm whether this is indeed an issue with the probe for certain domains? We were planning on shipping an experiment next Monday that uses the value from this probe to measure experiment success. It would be good to get confirmation of the issue so we can plan the experiment accordingly.

Flags: needinfo?(achronop)

Redirect to jib

Flags: needinfo?(achronop) → needinfo?(jib)

[Tracking Requested - why for this release]: WEBRTC_CALL_DURATION telemetry recording has been broken since 71.

WEBRTC_CALL_DURATION recording is effectively non-functional for any real site; it only works for the most trivial demos that don't do any renegotiation.

I've reproduced this with mozregression using the following steps:

STRs:

  1. Start a fresh instance of Firefox.
  2. Open https://jsfiddle.net/jib1/3x9andz7/ and share cam+mic
  3. Open a second tab
  4. Close the first tab
  5. Open about:telemetry#search=WEBRTC_CALL_DURATION

Expected result: A histogram for WEBRTC_CALL_DURATION is shown
Actual result: No histogram for WEBRTC_CALL_DURATION is shown

Regression range:

  • mozregression stopped at the range 2019-10-01 to 2019-10-02, which points squarely to bug 1571015.

In short, the regressing bug added a ref-counting mechanism delaying reporting of WEBRTC_CALL_DURATION until the number of active peer connections per URI drops to zero.

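The bug is that startCallTelem (which increments the count) is called on every pc.setRemoteDescription call, whereas RecordEndOfCallTelemetry (which decrements it) is called just once, on pc.close. After any renegotiation the count therefore never drops back to zero, so the duration is never reported (the STR fiddle renegotiates once to reproduce this).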
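
The imbalance can be illustrated with a toy model. This is only a sketch of the counting logic as described in this comment, not Firefox code: the class, method, and URI names are made up for illustration, and the real implementation lives in the C++ WebRTC stack.

```python
from collections import defaultdict


class CallTelemetryModel:
    """Toy model of per-URI call ref-counting (hypothetical, not Firefox code)."""

    def __init__(self):
        self.active = defaultdict(int)  # URI -> number of "started" calls
        self.recorded = []              # durations that would reach the histogram

    def start_call_telem(self, uri):
        # Models the described behavior: runs on every setRemoteDescription.
        self.active[uri] += 1

    def record_end_of_call(self, uri, duration_s):
        # Models the described behavior: runs once, on pc.close.
        self.active[uri] -= 1
        if self.active[uri] == 0:
            self.recorded.append(duration_s)


def simulate(renegotiations):
    """Run one connection with the given number of renegotiations, then close it."""
    model = CallTelemetryModel()
    uri = "https://example.invalid/call"  # placeholder URI
    # One initial setRemoteDescription plus one per renegotiation.
    for _ in range(1 + renegotiations):
        model.start_call_telem(uri)
    model.record_end_of_call(uri, duration_s=60)
    return model.recorded
```

In this model, `simulate(0)` records the duration, while `simulate(1)` records nothing because the count is still 1 after the single decrement.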

Has Regression Range: --- → yes
Has STR: --- → yes
Flags: needinfo?(jib) → needinfo?(dminor)
Keywords: regression
Priority: P3 → P2
Regressed by: 1571015

You've already identified the problem and found the regressing bug, so I'm not sure what additional information is needed here.

I don't know if we can salvage the approach from Bug 1571015 or if we should just revert those changes. If we revert it, we'll be back to measuring PeerConnection duration rather than call duration, but maybe that is ok. Since I made a mess of things the first time around, I think we'd be better off if someone with greater knowledge of PeerConnection had a look at this.

Flags: needinfo?(dminor) → needinfo?(jib)

I'll take a stab at it.

Assignee: nobody → jib
Flags: needinfo?(jib)

Thank you!

Attached file request.md (obsolete) —
Attachment #9166919 - Flags: data-review?(tdsmith)
Attached file request.md
Attachment #9166919 - Attachment is obsolete: true
Attachment #9166919 - Flags: data-review?(tdsmith)
Attachment #9166920 - Flags: data-review?(tdsmith)
Comment on attachment 9166920 [details]
request.md

Thanks!

1) Is there or will there be **documentation** that describes the schema for the ultimate data set in a public, complete, and accurate way? Click the documentation link provided in Q6 and ensure it is publicly accessible and does or will contain documentation for the data collection. (see [here](https://github.com/mozilla/activity-stream/blob/master/docs/v2-system-addon/data_dictionary.md), [here](https://github.com/mozilla-mobile/focus/wiki/Install-and-event-tracking-with-the-Adjust-SDK), and [here](https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/index.html) for examples).  Refer to the appendix for "documentation" if more detail about documentation standards is needed.

Yes, in Histograms.json and the probe dictionary.

2) Is there a control mechanism that allows the user to turn the data collection on and off? (Note, for data collection not needed for security purposes, Mozilla provides such a control mechanism) Provide details as to the control mechanism available.

Yes, the Firefox telemetry opt-out.

3) If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Jan-Ivar Bruaroey will monitor the collection.

4) Using the **[category system of data types](https://wiki.mozilla.org/Firefox/Data_Collection)** on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, technical data.

5) Is the data collection request for default-on or default-off?

Default-on.

6) Does the instrumentation include the addition of **any *new* identifiers** (whether anonymous or otherwise; e.g., username, random IDs, etc.  See the appendix for more details)?

No.

7) Is the data collection covered by the existing Firefox privacy notice?

Yes.

8) Does there need to be a check-in in the future to determine whether to renew the data? 

No, permanent collection.

9) Does the data collection use a third-party collection tool?

No.
Attachment #9166920 - Flags: data-review?(tdsmith) → data-review+
Pushed by jbruaroey@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e51e334b08cd
Fix missing telemetry reporting of WEBRTC_CALL_DURATION, and stop counting renegotiations in WEBRTC_CALL_COUNT_2. r=bwc
Pushed by jbruaroey@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c71f8ebaf601
Fix missing telemetry reporting of WEBRTC_CALL_DURATION, and stop counting renegotiations in WEBRTC_CALL_COUNT_2. r=bwc
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 81 Branch
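
The commit message says renegotiations are no longer counted. One way to get that effect is to have each connection remember whether it has already been counted, so that the increment and decrement stay paired. The sketch below is a toy model under that assumption; the `counted` flag and all class and function names are hypothetical, and the actual patch may work differently.

```python
from collections import defaultdict


class Connection:
    """Stand-in for a peer connection (hypothetical)."""

    def __init__(self, uri):
        self.uri = uri
        self.counted = False  # assumed flag: has this connection been counted?


class GuardedTelemetryModel:
    """Toy model where each connection is counted at most once."""

    def __init__(self):
        self.active = defaultdict(int)  # URI -> number of counted connections
        self.recorded = []              # durations that would reach the histogram

    def on_set_remote_description(self, conn):
        if conn.counted:
            return  # renegotiation: already counted, do nothing
        conn.counted = True
        self.active[conn.uri] += 1

    def on_close(self, conn, duration_s):
        if not conn.counted:
            return  # never negotiated, nothing to report
        conn.counted = False
        self.active[conn.uri] -= 1
        if self.active[conn.uri] == 0:
            self.recorded.append(duration_s)


def simulate_guarded(renegotiations):
    """Run one connection with the given number of renegotiations, then close it."""
    model = GuardedTelemetryModel()
    conn = Connection("https://example.invalid/call")  # placeholder URI
    for _ in range(1 + renegotiations):
        model.on_set_remote_description(conn)
    model.on_close(conn, duration_s=60)
    return model.recorded
```

With the guard in place, the duration is recorded regardless of how many times the connection renegotiates before it is closed.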

I have verified that the issue is no longer reproducible on the latest Nightly 81.0a1 (Build ID 20200810213634) using Windows 10 x64, macOS 10.15, and Ubuntu 20.04. The WEBRTC_CALL_DURATION histogram is collected when having a call on meet.google.com, hangouts.google.com, and meet.jit.si.

Status: RESOLVED → VERIFIED
Flags: needinfo?(jib)

Too late for Fx80, but I'd consider taking this on ESR78 still if we think it's important enough to do so.

(In reply to Ryan VanderMeulen [:RyanVM] from comment #21)

Too late for Fx80, but I'd consider taking this on ESR78 still if we think it's important enough to do so.

I don't think this is needed on ESR where telemetry is heavily biased by opt-out - I cannot think of any useful output from this change on ESR at the moment.
