Closed Bug 1626278 Opened 1 year ago Closed 7 months ago

[wfh] Session statistics disappear from about:webrtc when exiting video-call with close tab

Categories

(Core :: WebRTC: Networking, defect, P3)

Desktop
All
defect

Tracking

()

RESOLVED FIXED
85 Branch
Tracking Status
firefox75 --- wontfix
firefox76 --- wontfix
firefox77 --- wontfix
firefox79 --- wontfix
firefox85 --- fixed

People

(Reporter: peter.magyari, Assigned: bwc)

References

(Blocks 1 open bug, Regressed 3 open bugs)

Details

(Keywords: perf-alert, Whiteboard: [wfh])

Attachments

(15 files, 1 obsolete file)

732.18 KB, text/html
Details
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review

Affected Versions
Nightly 76.0a1 (2020-03-31)
Beta 75.0b11 (2020-03-28)

Affected Platforms
All

Steps to reproduce

  1. Launch Firefox
  2. Navigate to https://whereby.com and log in
  3. Open about:webrtc in a new tab
  4. Session Statistics section shows a log looking like:
    "[ 6442450952 ] https://meet.google.com/_meet/uhz-ivbg-apr?hs=49 Mon Mar 30 2020 17:38:17 GMT+0300 (Eastern European Summer Time)"
  5. Exit the call using "Close Tab"
  6. Refresh the tab with about:webrtc

Expected Result
The logs are kept in about:webrtc and the string "(closed)" appears in the log title after leaving the call.

Actual Result
The logs are lost.

Note
Happens only when you exit the call using close tab, the logs are kept if you exit the call by pressing "leave".

Hmm. Maybe a teardown race. Will have to look into it.

Blocks: 1624328
Priority: -- → P3

Apparently this also occurs on the Zoom Web app.
These are the logs of a session with 3 users, 2 active logs displayed during the call and one log displayed after the call was closed.

Attached file aboutWebrtc_lost_logs.html (obsolete) —

Description above. [Nevermind this]

Attachment #9137140 - Attachment is obsolete: true
Summary: Session statistics disappear from about:webrtc when exiting video-call with close tab → [wfh] Session statistics disappear from about:webrtc when exiting video-call with close tab
Whiteboard: [wfh]

Because this bug's Severity has not been changed from the default since it was filed, and it's Priority is P3 (Backlog,) indicating it has been triaged, the bug's Severity is being updated to S3 (normal.)

Severity: normal → S3

Session statistics logs are also lost when closing the call by the web-app's "Leave"/"Close call" buttons.

This issue was seen to reproduce intermittently with Jitsi, Facebook and Whereby on Mac OS 10.12.6. More often with Whereby then with the others.

Hardware: Unspecified → Desktop

In the case of Google Meet, Vonage, Amazon Chime on Ubuntu 20, the logs get deleted when closing the tab altogether.

Duplicate of this bug: 1650723
Duplicate of this bug: 1656230
Assignee: nobody → docfaraday

Ok, so the problem here is that the ICE ctx is being torn down before the final stats query completes sometimes. However, delaying the teardown of the ICE ctx until the stats query finishes sometimes results in STS shutting down before the ICE ctx is destroyed, leading to assertions.

Another problem is that sometimes closing the window causes the content process to be torn down before the final stats query completes. We might be able to compensate for this by getting rid of PeerConnectionCtx::mStatsForClosedPeerConnections, and instead relying on PeerConnectioonCtx::mLastReports, with a final update when a pc is closed.

It also seems that the only long-term storage we do of these stats is in PeerConnectionCtx, which will disappear when its content process disappears. We really need to be storing this stuff in the parent process, presumably in WebrtcGlobalInformation.

Looking at WebrtcGlobalInformation, I think things like SetDebugLevel and SetAecDebug aren't working at all. Those appear to only take effect on the process on which that API is invoked, but that API is only invoked on the parent process; there is no fan-out to the content processes like there is for GetAllStats/GetLogging.

Depends on D87180

Attachment #9170192 - Attachment description: Bug 1626278: (WIP) Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY to ensure that MediaTransportHandler stays around long enough to finish pending stats queries. → Bug 1626278: Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY to ensure that MediaTransportHandler stays around long enough to finish pending stats queries.
Attachment #9172525 - Attachment description: Bug 1626278: (WIP) Simplify/promisify PWebrtcGlobal. → Bug 1626278: Simplify/promisify PWebrtcGlobal.
Attachment #9172526 - Attachment description: Bug 1626278: Add a stats cache to WebrtcGlobalInformation, to avoid stats from being lost when a content process goes away. → Bug 1626278: Add a stats stash to WebrtcGlobalInformation, to avoid stats from being lost when a content process goes away.
Attachment #9174798 - Attachment description: Bug 1626278: Cache ICE logs. → Bug 1626278: Stash for ICE logs, and move log filtering into a single place.
Attachment #9170192 - Attachment description: Bug 1626278: Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY to ensure that MediaTransportHandler stays around long enough to finish pending stats queries. → Bug 1626278: Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY to ensure that MediaTransportHandler stays around long enough to finish pending stats queries. r?mjf
Attachment #9177944 - Attachment description: Bug 1626278: Implement MozPromise::AllSettled, based on JS Promise API. → Bug 1626278: Implement MozPromise::AllSettled, based on JS Promise API. r?jya
Attachment #9172525 - Attachment description: Bug 1626278: Simplify/promisify PWebrtcGlobal. → Bug 1626278: Simplify/promisify PWebrtcGlobal. r?ng
Attachment #9172526 - Attachment description: Bug 1626278: Add a stats stash to WebrtcGlobalInformation, to avoid stats from being lost when a content process goes away. → Bug 1626278: Add a stats stash to WebrtcGlobalInformation, to avoid stats from being lost when a content process goes away. r?ng
Attachment #9172527 - Attachment description: Bug 1626278: Filter stats for closed PeerConnections the same way we do for live ones. → Bug 1626278: Filter stats for closed PeerConnections the same way we do for live ones. r?ng
Attachment #9174798 - Attachment description: Bug 1626278: Stash for ICE logs, and move log filtering into a single place. → Bug 1626278: Stash for ICE logs, and move log filtering into a single place. r?ng
Attachment #9176398 - Attachment description: Bug 1626278: Move sigslot stuff to STS, and ensure we disconnect before releasing our ref to the MediaTransportHandler. → Bug 1626278: Move sigslot stuff to STS, and ensure we disconnect before releasing our ref to the MediaTransportHandler. r?mjf

Depends on D91467

This fixes a failure in browser_WebrtcGlobalInformation.js on windows opt.

Depends on D90626

Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/82ed327e71ed
Add a browser-chrome test for the WebrtcGlobalInformation interface. r=ng
https://hg.mozilla.org/integration/autoland/rev/3239d49be3ee
Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY to ensure that MediaTransportHandler stays around long enough to finish pending stats queries. r=mjf
https://hg.mozilla.org/integration/autoland/rev/25c03ac56306
Implement MozPromise::AllSettled, based on JS Promise API. r=jya
https://hg.mozilla.org/integration/autoland/rev/5870625073f6
Test case for MozPromise::AllSettled. r=jya
https://hg.mozilla.org/integration/autoland/rev/387a53fd83f3
Simplify/promisify PWebrtcGlobal. r=ng
https://hg.mozilla.org/integration/autoland/rev/9b612ea670d4
Add a stats stash to WebrtcGlobalInformation, to avoid stats from being lost when a content process goes away. r=ng
https://hg.mozilla.org/integration/autoland/rev/5f453c8df70b
Filter stats for closed PeerConnections the same way we do for live ones. r=ng
https://hg.mozilla.org/integration/autoland/rev/afdb75a17ac9
Stash for ICE logs, and move log filtering into a single place. r=ng
https://hg.mozilla.org/integration/autoland/rev/cd2e50c8af34
Move sigslot stuff to STS, and ensure we disconnect before releasing our ref to the MediaTransportHandler. r=mjf
https://hg.mozilla.org/integration/autoland/rev/59da0d11510e
Ensure that pcid is unique. r=mjf
https://hg.mozilla.org/integration/autoland/rev/59228ee9d9e0
Remove transports from main, just like we do under normal circumstances. r=mjf

(In reply to Pulsebot from comment #44)

Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/82ed327e71ed
Add a browser-chrome test for the WebrtcGlobalInformation interface. r=ng
...

== Change summary for alert #27214 (as of Wed, 14 Oct 2020 03:02:20 GMT) ==

Improvements:

2% compiler_metrics num_static_constructors windows2012-64 debug 163.00 -> 159.33

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=27214

(In reply to Bogdan Tara[:bogdan_tara | bogdant] from comment #45)

Backed out 11 changesets (bug 1626278) for leaks and WebrtcGlobalInformation related failures.
...
Backout link: https://hg.mozilla.org/integration/autoland/rev/aab39b45b9d9e9a71a0aa99bf32a04b7d5c2dacb

== Change summary for alert #27211 (as of Wed, 14 Oct 2020 00:30:14 GMT) ==

Regressions:

2% compiler_metrics num_static_constructors windows2012-32 debug 158.33 -> 162.00
2% compiler_metrics num_static_constructors windows2012-64 debug 159.33 -> 163.00

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=27211

Ah. Presentation tests. Looking into it.

Flags: needinfo?(docfaraday)
Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2f57fa13df7e
Add a browser-chrome test for the WebrtcGlobalInformation interface. r=ng
https://hg.mozilla.org/integration/autoland/rev/5c0c76832287
Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY to ensure that MediaTransportHandler stays around long enough to finish pending stats queries. r=mjf
https://hg.mozilla.org/integration/autoland/rev/8432bc08676e
Implement MozPromise::AllSettled, based on JS Promise API. r=jya
https://hg.mozilla.org/integration/autoland/rev/991bc13ae061
Test case for MozPromise::AllSettled. r=jya
https://hg.mozilla.org/integration/autoland/rev/90cf750aba51
Simplify/promisify PWebrtcGlobal. r=ng
https://hg.mozilla.org/integration/autoland/rev/1cf19408d522
Add a stats stash to WebrtcGlobalInformation, to avoid stats from being lost when a content process goes away. r=ng
https://hg.mozilla.org/integration/autoland/rev/5861d6340ee8
Filter stats for closed PeerConnections the same way we do for live ones. r=ng
https://hg.mozilla.org/integration/autoland/rev/d91d9eb46983
Stash for ICE logs, and move log filtering into a single place. r=ng
https://hg.mozilla.org/integration/autoland/rev/1227d9db743d
Move sigslot stuff to STS, and ensure we disconnect before releasing our ref to the MediaTransportHandler. r=mjf
https://hg.mozilla.org/integration/autoland/rev/15f89d2c5981
Ensure that pcid is unique. r=mjf
https://hg.mozilla.org/integration/autoland/rev/cc2023032191
Remove transports from main, just like we do under normal circumstances. r=mjf
https://hg.mozilla.org/integration/autoland/rev/eeed6ebfc444
Shutdown MediaTransportHandlerSTS on profile-change-net-teardown. Also add some logging. r=mjf

(In reply to Pulsebot from comment #50)

Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2f57fa13df7e
Add a browser-chrome test for the WebrtcGlobalInformation interface. r=ng

== Change summary for alert #27268 (as of Tue, 20 Oct 2020 06:18:43 GMT) ==

Improvements:

Ratio Suite Test Platform Options Absolute values (old vs new)
2% compiler_metrics num_static_constructors windows2012-64 161.00 -> 157.00

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=27268

(In reply to Narcis Beleuzu [:NarcisB] from comment #51)

Backed out for mochitest leakchecks

Backout link: https://hg.mozilla.org/integration/autoland/rev/41459f9dfbe958dfc4f5e33f0da22e900ab59312
Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=319049077&repo=autoland&lineNumber=4515

== Change summary for alert #27264 (as of Mon, 19 Oct 2020 19:44:38 GMT) ==

Regressions:

Ratio Suite Test Platform Options Absolute values (old vs new)
3% compiler_metrics num_static_constructors windows2012-32 157.00 -> 161.00
3% compiler_metrics num_static_constructors windows2012-64 157.00 -> 161.00

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=27264

Huh. Another case where STS is being shut down before XPCOM shutdown. I think at this point we need a way to register for all/any STS shutdown events, regardless of why they are happening, driven by nsSocketTransportService::Shutdown. I can implement this, but I'd like some feedback from the necko team. What do you think?

Flags: needinfo?(valentin.gosu)
Flags: needinfo?(docfaraday)
Flags: needinfo?(dd.mozilla)

(In reply to Byron Campen [:bwc] from comment #54)

Huh. Another case where STS is being shut down before XPCOM shutdown. I think at this point we need a way to register for all/any STS shutdown events, regardless of why they are happening, driven by nsSocketTransportService::Shutdown. I can implement this, but I'd like some feedback from the necko team. What do you think?

Yes, I think sounds OK.

Flags: needinfo?(valentin.gosu)
Attachment #9181793 - Attachment description: Bug 1626278: Shutdown MediaTransportHandlerSTS on profile-change-net-teardown. Also add some logging. r?mjf → Bug 1626278: Shutdown MediaTransportHandlerSTS on STS shutdown. Also add some logging. r?mjf, r?valentin

Also some improvements in discipline wrt bare pointers.

Depends on D93655

Attachment #9181793 - Attachment description: Bug 1626278: Shutdown MediaTransportHandlerSTS on STS shutdown. Also add some logging. r?mjf, r?valentin → Bug 1626278: Shutdown MediaTransportHandlerSTS on STS shutdown. Also add some logging. r?mjf,valentin
Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f2a432dd0bf0
Add a browser-chrome test for the WebrtcGlobalInformation interface. r=ng
https://hg.mozilla.org/integration/autoland/rev/8c7dd3e44dba
Use NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DESTROY to ensure that MediaTransportHandler stays around long enough to finish pending stats queries. r=mjf
https://hg.mozilla.org/integration/autoland/rev/340bc094a9fa
Implement MozPromise::AllSettled, based on JS Promise API. r=jya
https://hg.mozilla.org/integration/autoland/rev/6a691d46d210
Test case for MozPromise::AllSettled. r=jya
https://hg.mozilla.org/integration/autoland/rev/d7eaecb56f9d
Simplify/promisify PWebrtcGlobal. r=ng
https://hg.mozilla.org/integration/autoland/rev/8b4f6eae0d07
Add a stats stash to WebrtcGlobalInformation, to avoid stats from being lost when a content process goes away. r=ng
https://hg.mozilla.org/integration/autoland/rev/4b2382b08333
Filter stats for closed PeerConnections the same way we do for live ones. r=ng
https://hg.mozilla.org/integration/autoland/rev/2bb01ffd3a40
Stash for ICE logs, and move log filtering into a single place. r=ng
https://hg.mozilla.org/integration/autoland/rev/6b53a7cecd9b
Move sigslot stuff to STS, and ensure we disconnect before releasing our ref to the MediaTransportHandler. r=mjf
https://hg.mozilla.org/integration/autoland/rev/81cd310b0622
Ensure that pcid is unique. r=mjf
https://hg.mozilla.org/integration/autoland/rev/0316785c107e
Remove transports from main, just like we do under normal circumstances. r=mjf
https://hg.mozilla.org/integration/autoland/rev/3b87a745032e
Shutdown MediaTransportHandlerSTS on STS shutdown. Also add some logging. r=mjf,valentin
https://hg.mozilla.org/integration/autoland/rev/9903d462188b
Stop requiring additional dispatches to STS for ICE ctx teardown. r=mjf
https://hg.mozilla.org/integration/autoland/rev/575b67d9c9b6
Make this test-case function dispatch to STS like it is supposed to. r=mjf
Regressions: 1651716
Flags: needinfo?(dd.mozilla)
Regressions: 1684615
Regressed by: 1696593
No longer regressed by: 1696593
Regressions: 1696593
No longer regressions: 1696593
Regressions: 1700922
You need to log in before you can comment on or make changes to this bug.