Closed Bug 1687486 Opened 5 years ago Closed 5 years ago

Intermittent LeakSanitizer | leak at nsRefPtrDeque, mozilla::DataChannelConnection::DataChannelConnection, mozilla::DataChannelConnection::Create, mozilla::PeerConnectionImpl::EnsureDataConnection

Categories

(Core :: WebRTC, defect, P5)

defect

Tracking

()

RESOLVED FIXED
87 Branch
Tracking Status
firefox87 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: bwc)

References

Details

(Keywords: intermittent-failure, perf-alert)

Attachments

(1 file)

Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=327149295&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Knld-ZLDR0yYp6jt-BL5Vw/runs/0/artifacts/public/logs/live_backing.log


[task 2021-01-19T14:25:21.345Z] 14:25:21     INFO - PID 14951 | [Child 22043: Main Thread]: E/signaling [main|PeerConnectionImpl] PeerConnectionImpl.cpp:356: PeerConnectionCtx is already gone. Ignoring...
[task 2021-01-19T14:25:21.345Z] 14:25:21     INFO - PID 14951 | [Child 22043: Main Thread]: I/signaling [main|PeerConnectionImpl] PeerConnectionImpl.cpp:360: ~PeerConnectionImpl: PeerConnectionImpl destructor invoked for {f19c49a2-20a2-4424-9738-20ac1770233d}
[task 2021-01-19T14:25:22.027Z] 14:25:22     INFO - PID 14951 | ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2021-01-19T14:25:22.061Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.062Z] 14:25:22     INFO - PID 14951 | Suppressions used:
[task 2021-01-19T14:25:22.062Z] 14:25:22     INFO - PID 14951 |   count      bytes template
[task 2021-01-19T14:25:22.063Z] 14:25:22     INFO - PID 14951 |      14        448 nsComponentManagerImpl
[task 2021-01-19T14:25:22.063Z] 14:25:22     INFO - PID 14951 |       2        288 libfontconfig.so
[task 2021-01-19T14:25:22.063Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.247Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.247Z] 14:25:22     INFO - PID 14951 | Suppressions used:
[task 2021-01-19T14:25:22.247Z] 14:25:22     INFO - PID 14951 |   count      bytes template
[task 2021-01-19T14:25:22.247Z] 14:25:22     INFO - PID 14951 |      14        448 nsComponentManagerImpl
[task 2021-01-19T14:25:22.247Z] 14:25:22     INFO - PID 14951 |       2        288 libfontconfig.so
[task 2021-01-19T14:25:22.248Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.264Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.264Z] 14:25:22     INFO - PID 14951 | Suppressions used:
[task 2021-01-19T14:25:22.265Z] 14:25:22     INFO - PID 14951 |   count      bytes template
[task 2021-01-19T14:25:22.265Z] 14:25:22     INFO - PID 14951 |      14        448 nsComponentManagerImpl
[task 2021-01-19T14:25:22.265Z] 14:25:22     INFO - PID 14951 |       2        288 libfontconfig.so
[task 2021-01-19T14:25:22.265Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.488Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.488Z] 14:25:22     INFO - PID 14951 | Suppressions used:
[task 2021-01-19T14:25:22.488Z] 14:25:22     INFO - PID 14951 |   count      bytes template
[task 2021-01-19T14:25:22.488Z] 14:25:22     INFO - PID 14951 |      14        448 nsComponentManagerImpl
[task 2021-01-19T14:25:22.488Z] 14:25:22     INFO - PID 14951 |       2        288 libfontconfig.so
[task 2021-01-19T14:25:22.488Z] 14:25:22     INFO - PID 14951 |       2      65568 nr_reg_local_init
[task 2021-01-19T14:25:22.489Z] 14:25:22     INFO - PID 14951 |     195      18592 r_log_register
[task 2021-01-19T14:25:22.489Z] 14:25:22     INFO - PID 14951 |      63       2904 nr_reg_set
[task 2021-01-19T14:25:22.489Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.774Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:22.776Z] 14:25:22     INFO - PID 14951 | Suppressions used:
[task 2021-01-19T14:25:22.776Z] 14:25:22     INFO - PID 14951 |   count      bytes template
[task 2021-01-19T14:25:22.777Z] 14:25:22     INFO - PID 14951 |       1         32 nsComponentManagerImpl
[task 2021-01-19T14:25:22.777Z] 14:25:22     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:23.796Z] 14:25:23     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:23.797Z] 14:25:23     INFO - PID 14951 | Suppressions used:
[task 2021-01-19T14:25:23.797Z] 14:25:23     INFO - PID 14951 |   count      bytes template
[task 2021-01-19T14:25:23.797Z] 14:25:23     INFO - PID 14951 |      11        344 nsComponentManagerImpl
[task 2021-01-19T14:25:23.798Z] 14:25:23     INFO - PID 14951 |     633      18083 libfontconfig.so
[task 2021-01-19T14:25:23.798Z] 14:25:23     INFO - PID 14951 |       3        624 mozJSComponentLoader
[task 2021-01-19T14:25:23.799Z] 14:25:23     INFO - PID 14951 | -----------------------------------------------------
[task 2021-01-19T14:25:23.877Z] 14:25:23     INFO - Browser exited with return code 0
[task 2021-01-19T14:25:23.878Z] 14:25:23     INFO - PROCESS LEAKS None
[task 2021-01-19T14:25:23.878Z] 14:25:23     INFO - ERROR | LeakSanitizer | SUMMARY: AddressSanitizer: 1656 byte(s) leaked in 13 allocation(s).
[task 2021-01-19T14:25:23.879Z] 14:25:23     INFO - LeakSanitizer | To show the addresses of leaked objects add report_objects=1 to LSAN_OPTIONS
[task 2021-01-19T14:25:23.879Z] 14:25:23     INFO - This can be done in testing/mozbase/mozrunner/mozrunner/utils.py
[task 2021-01-19T14:25:23.879Z] 14:25:23     INFO - Allowed depth was 4
[task 2021-01-19T14:25:23.879Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at nsRefPtrDeque, mozilla::DataChannelConnection::DataChannelConnection, mozilla::DataChannelConnection::Create, mozilla::PeerConnectionImpl::EnsureDataConnection
[task 2021-01-19T14:25:23.880Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::dom::DocGroup::Create, mozilla::dom::BrowsingContextGroup::AddDocument, mozilla::dom::Document::SetScopeObject, mozilla::dom::Document::SetScriptGlobalObject
[task 2021-01-19T14:25:23.880Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at allocate, allocate, _M_allocate, std::vector
[task 2021-01-19T14:25:23.881Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::DataChannelConnection::Open, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::dom::PeerConnectionImpl_Binding::createDataChannel
[task 2021-01-19T14:25:23.881Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::WeakPtr, mozilla::DataChannelConnection::Destroy, mozilla::PeerConnectionImpl::CloseInt, mozilla::PeerConnectionImpl::Close
[task 2021-01-19T14:25:23.882Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at Alloc, nsTSubstring, Gecko_StartBulkWriteString, nsstring::nsAString::start_bulk_write_impl
[task 2021-01-19T14:25:23.882Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::DataChannelConnection::Create, mozilla::PeerConnectionImpl::EnsureDataConnection, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::PeerConnectionImpl::CreateDataChannel
[task 2021-01-19T14:25:23.883Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at Alloc, nsTSubstring, nsTSubstring, nsTSubstring
[task 2021-01-19T14:25:23.883Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::dom::DocGroup::DocGroup, mozilla::dom::DocGroup::Create, mozilla::dom::BrowsingContextGroup::AddDocument, mozilla::dom::Document::SetScopeObject
[task 2021-01-19T14:25:23.883Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at NS_NewDOMDataChannel, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::dom::PeerConnectionImpl_Binding::createDataChannel
[task 2021-01-19T14:25:23.885Z] 14:25:23     INFO - Closing logging queue
[task 2021-01-19T14:25:23.885Z] 14:25:23     INFO - queue closed
[task 2021-01-19T14:25:23.885Z] 14:25:23     INFO - PROCESS LEAKS None
[task 2021-01-19T14:25:23.886Z] 14:25:23     INFO - LeakSanitizer | To show the addresses of leaked objects add report_objects=1 to LSAN_OPTIONS
[task 2021-01-19T14:25:23.886Z] 14:25:23     INFO - This can be done in testing/mozbase/mozrunner/mozrunner/utils.py
[task 2021-01-19T14:25:23.886Z] 14:25:23     INFO - Allowed depth was 4
[task 2021-01-19T14:25:23.886Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at nsRefPtrDeque, mozilla::DataChannelConnection::DataChannelConnection, mozilla::DataChannelConnection::Create, mozilla::PeerConnectionImpl::EnsureDataConnection
[task 2021-01-19T14:25:23.887Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::dom::DocGroup::Create, mozilla::dom::BrowsingContextGroup::AddDocument, mozilla::dom::Document::SetScopeObject, mozilla::dom::Document::SetScriptGlobalObject
[task 2021-01-19T14:25:23.887Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at allocate, allocate, _M_allocate, std::vector
[task 2021-01-19T14:25:23.887Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::DataChannelConnection::Open, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::dom::PeerConnectionImpl_Binding::createDataChannel
[task 2021-01-19T14:25:23.888Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::WeakPtr, mozilla::DataChannelConnection::Destroy, mozilla::PeerConnectionImpl::CloseInt, mozilla::PeerConnectionImpl::Close
[task 2021-01-19T14:25:23.888Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at Alloc, nsTSubstring, Gecko_StartBulkWriteString, nsstring::nsAString::start_bulk_write_impl
[task 2021-01-19T14:25:23.888Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::DataChannelConnection::Create, mozilla::PeerConnectionImpl::EnsureDataConnection, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::PeerConnectionImpl::CreateDataChannel
[task 2021-01-19T14:25:23.888Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at Alloc, nsTSubstring, nsTSubstring, nsTSubstring
[task 2021-01-19T14:25:23.889Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::dom::DocGroup::DocGroup, mozilla::dom::DocGroup::Create, mozilla::dom::BrowsingContextGroup::AddDocument, mozilla::dom::Document::SetScopeObject
[task 2021-01-19T14:25:23.889Z] 14:25:23     INFO - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at NS_NewDOMDataChannel, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::PeerConnectionImpl::CreateDataChannel, mozilla::dom::PeerConnectionImpl_Binding::createDataChannel
[task 2021-01-19T14:25:23.900Z] 14:25:23     INFO - Got 0 unexpected results
[task 2021-01-19T14:25:23.900Z] 14:25:23     INFO - SUITE-END | took 795s
[task 2021-01-19T14:25:23.946Z] 14:25:23     INFO - Removed font: Ahem.ttf
[task 2021-01-19T14:25:24.005Z] 14:25:24     INFO - Closing logging queue
[task 2021-01-19T14:25:24.006Z] 14:25:24     INFO - queue closed
[task 2021-01-19T14:25:24.408Z] 14:25:24     INFO - Return code: 0
[task 2021-01-19T14:25:24.408Z] 14:25:24    ERROR - Got 21 unexpected statuses
[task 2021-01-19T14:25:24.408Z] 14:25:24  WARNING - # TBPL WARNING #
[task 2021-01-19T14:25:24.409Z] 14:25:24  WARNING - setting return code to 1
[task 2021-01-19T14:25:24.409Z] 14:25:24     INFO - Running post-action listener: _package_coverage_data
[task 2021-01-19T14:25:24.409Z] 14:25:24     INFO - Running post-action listener: _resource_record_post_action
[task 2021-01-19T14:25:24.409Z] 14:25:24     INFO - Running post-action listener: process_java_coverage_data
[task 2021-01-19T14:25:24.409Z] 14:25:24     INFO - Running post-action listener: stop_device
[task 2021-01-19T14:25:24.409Z] 14:25:24     INFO - [mozharness: 2021-01-19 14:25:24.409492Z] Finished run-tests step (success)
[task 2021-01-19T14:25:24.409Z] 14:25:24     INFO - Running post-run listener: _resource_record_post_run
[task 2021-01-19T14:25:24.484Z] 14:25:24     INFO - Total resource usage - Wall time: 856s; CPU: 60%; Read bytes: 69632; Write bytes: 4961796096; Read time: 8; Write time: 1135360
[task 2021-01-19T14:25:24.484Z] 14:25:24     INFO - TinderboxPrint: CPU usage<br/>60.2%
[task 2021-01-19T14:25:24.484Z] 14:25:24     INFO - TinderboxPrint: I/O read bytes / time<br/>69,632 / 8
[task 2021-01-19T14:25:24.484Z] 14:25:24     INFO - TinderboxPrint: I/O write bytes / time<br/>4,961,796,096 / 1,135,360
[task 2021-01-19T14:25:24.484Z] 14:25:24     INFO - TinderboxPrint: CPU idle<br/>1,341.5 (39.5%)
[task 2021-01-19T14:25:24.484Z] 14:25:24     INFO - TinderboxPrint: CPU system<br/>280.3 (8.2%)
[task 2021-01-19T14:25:24.484Z] 14:25:24     INFO - TinderboxPrint: CPU user<br/>1,761.6 (51.8%)
[task 2021-01-19T14:25:24.484Z] 14:25:24     INFO - TinderboxPrint: Swap in / out<br/>0 / 0
[task 2021-01-19T14:25:24.485Z] 14:25:24     INFO - pull - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2021-01-19T14:25:24.485Z] 14:25:24     INFO - start-emulator - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2021-01-19T14:25:24.486Z] 14:25:24     INFO - verify-device - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2021-01-19T14:25:24.486Z] 14:25:24     INFO - install - Wall time: 51s; CPU: 25%; Read bytes: 8192; Write bytes: 2340962304; Read time: 0; Write time: 1014680
[task 2021-01-19T14:25:24.491Z] 14:25:24     INFO - run-tests - Wall time: 806s; CPU: 62%; Read bytes: 61440; Write bytes: 2620624896; Read time: 8; Write time: 120656
[task 2021-01-19T14:25:24.615Z] 14:25:24  WARNING - returning nonzero exit status 1```

Th link with the failures: https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&revision=b0f61f0266c29ebc6b32fec4d69eabfccebf97f4&searchStr=linux%2C18.04%2Cx64%2Casan%2Copt%2Cweb%2Cplatform%2Ctests%2Ctest-linux1804-64-asan%2Fopt-web-platform-tests-e10s%2Cwpt4&selectedTaskRun=Knld-ZLDR0yYp6jt-BL5Vw.0

Andrew, could you please have a look over these wpt leaks? The See Also bugs look very similar, in the sense that we can find part of each other in the stack trace. Should we keep just one of them? Thank you.

Flags: needinfo?(continuation)
See Also: → 1683106, 1687058

WebRTC leaks, maybe Bryce can assist.

Component: Networking → WebRTC
Flags: needinfo?(bvandyk)

Forwarding to :mjf as he knows the domain better than I.

Flags: needinfo?(bvandyk) → needinfo?(mfroman)

(In reply to Cosmin Sabou [:CosminS] from comment #1)

Andrew, could you please have a look over these wpt leaks? The See Also bugs look very similar, in the sense that we can find part of each other in the stack trace. Should we keep just one of them? Thank you.

The basic problem is that once we start leaking a bunch of stuff, there's a lot of generic things that show up, so it can be hard to distinguish. I think we got a little lucky here in that the data channel stuff is fairly distinctive and it is near the start of the alphabet so it showed up. It looks like the failures in bug 1687486 also mention data channel, but the ones in bug 1687058 do not.

Ideally, LeakSanitizer failures would include information about what tests were running, and also it would be great if there was some kind of analyzer for large leaks that filtered out the generic stacks such as, for instance, "Alloc, nsTSubstring, nsTSubstring, nsTSubstring". That is also an issue with leakcheck failures, but it is a lot worse with LSan, because each individual leak has a bunch of stack frames in it.

Flags: needinfo?(continuation)

So this does not look like a leak inside the DataChannel code, it looks like we're leaking the DOM object that owns the DataChannel objects, possibly due to a cycle that isn't properly instrumented?

This looks a lot like a shutdown leak to me, lots of output like this:

[task 2021-01-19T14:20:49.474Z] 14:20:49 INFO - PID 14951 | [Child 15099: Main Thread]: E/signaling [main|PeerConnectionImpl] PeerConnectionImpl.cpp:356: PeerConnectionCtx is already gone. Ignoring...
[task 2021-01-19T14:20:49.474Z] 14:20:49 INFO - PID 14951 | [Child 15099: Main Thread]: I/signaling [main|PeerConnectionImpl] PeerConnectionImpl.cpp:360: ~PeerConnectionImpl: PeerConnectionImpl destructor invoked for {708bbf55-79d3-48bd-a4d5-7746bb0fdd3a}

I'm going to experiment with closing all PCs when xpcom shutdown begins.

Assignee: nobody → docfaraday

Thanks for taking a look!

Flags: needinfo?(mfroman)

Try looks fine.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87 Branch

== Change summary for alert #28562 (as of Sat, 30 Jan 2021 09:13:08 GMT) ==

Improvements:

Ratio Suite Test Platform Options Absolute values (old vs new)
12% reddit fcp linux64-shippable-qr cold nocondprof webrender 222.08 -> 195.00
6% yandex fcp macosx1014-64-shippable-qr cold nocondprof webrender 319.73 -> 300.83

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=28562

Keywords: perf-alert
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: