Closed Bug 1629529 Opened 4 years ago Closed 4 years ago

Crash in [@ mozilla::net::WebrtcTCPSocket::CloseWithReason]

Categories

(Core :: WebRTC, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla77
Tracking Status
firefox-esr68 --- unaffected
firefox75 --- unaffected
firefox76 --- unaffected
firefox77 --- fixed

People

(Reporter: gsvelto, Assigned: mjf)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-0c275aba-7cfd-48c5-b6c1-2febf0200412.

Top 10 frames of crashing thread:

0 XUL mozilla::net::WebrtcTCPSocket::CloseWithReason media/mtransport/ipc/WebrtcTCPSocket.cpp:109
1 XUL mozilla::net::WebrtcTCPSocketParent::OnClose media/mtransport/ipc/WebrtcTCPSocketParent.cpp:93
2 XUL mozilla::net::WebrtcTCPSocket::InvokeOnClose media/mtransport/ipc/WebrtcTCPSocket.cpp:469
3 XUL mozilla::detail::RunnableMethodImpl<mozilla::net::WebrtcTCPSocket*, void  xpcom/threads/nsThreadUtils.h:1220
4 XUL nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1200
5 XUL nsThread::Shutdown xpcom/threads/nsThread.cpp:889
6 XUL mozilla::net::nsSocketTransportService::ShutdownThread netwerk/base/nsSocketTransportService2.cpp:791
7 XUL mozilla::net::nsIOService::SetOffline netwerk/base/nsIOService.cpp:1256
8 XUL mozilla::net::nsIOService::Observe netwerk/base/nsIOService.cpp
9 XUL non-virtual thunk to mozilla::net::nsIOService::Observe netwerk/base/nsIOService.cpp

The crash reason is:

MOZ_DIAGNOSTIC_ASSERT(false) (NS_SUCCEEDED(mSocketThread->Dispatch(NewRunnableMethod<nsresult>( "WebrtcTCPSocket::CloseWithReason", this, &WebrtcTCPSocket::CloseWithReason, aReason))))

The oldest buildid with this crash is 20200409095500.

This looks like fallout from Bug 1620152

Priority: -- → P1
Regressed by: 1620152
Has Regression Range: --- → yes

Byron, any thoughts quickly come to mind on why mSocketThread->Dispatch would fail here:
https://searchfox.org/mozilla-central/rev/2cd2d511c0d94a34fb7fa3b746f54170ee759e35/media/mtransport/ipc/WebrtcTCPSocket.cpp#109

Or should we just convert this to MOZ_ASSERT(NS_SUCCEEDED(rv))?

Flags: needinfo?(docfaraday)

That kind of thing is usually caused by dispatching during shutdown, after STS has stopped taking new events. The stack on crash-stats verifies this. It seems that we are shutting STS down (on main), which entails running all of STS's queued events, on main. So the pattern of "If we aren't on STS, rerun this function on STS, otherwise run the following code." does not play well with the shutdown code.

I can think of a few ways we could deal with this:

  1. We could just make this a regular assertion.
  2. Modify code like this to fall through to the code below if the dispatch fails. (Scary)
  3. Get OnSocketThread() to return true if we're doing this run-STS-events-on-main-because-we're-shutting-down thing. (Not sure how, but probably the "best" option)
Flags: needinfo?(docfaraday)
Severity: -- → normal

After playing with this a bit I think we'll need to go with option 1 in comment 3 for a immediate fix. I think we'd need to know that mThread from gSocketTransportService is shutting down (and thus spinning the event loop on main to drain the event queue) in order to implement option 3. nsIThread doesn't currently support this.

Byron, are you OK with me changing this to the regular assert for now?

Flags: needinfo?(docfaraday)

That's fine with me.

Flags: needinfo?(docfaraday)
Assignee: nobody → mfroman

If we're shutting down and mSocketThread is no longer accepting work,
this dispatch can fail. As far as I can see, fully fixing this would
require changes to nsIThread in order to allow checking whether the
thread is shutting down.

I added Bug 1632179 to add support in nsIThread for checking whether the thread is shutting down in the hopes we can more fully fix this.

See Also: → 1632179
Pushed by mfroman@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fabb6fb27e8a
fix shutdown dispatch assert in WebrtcTCPSocket. r=bwc
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla77
Backout by nbeleuzu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/72894e6dc3a9
Backed out changeset fabb6fb27e8a for being an incorrect fix. CLOSED TREE
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Pushed by mfroman@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9d1756abffd2
fix shutdown dispatch assert in WebrtcTCPSocket. r=bwc
Backout by nbeleuzu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cedfc92b703e
Backed out changeset 9d1756abffd2 for bustages on WebrtcTCPSocket.cpp . CLOSED TREE
Pushed by mfroman@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fdb5017a2712
fix shutdown dispatch assert in WebrtcTCPSocket. r=bwc
Flags: needinfo?(mfroman)
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla77

Hi Michael, is there something that QA can help with here? And if so, are there any STR for this? Thanks!

Flags: needinfo?(mfroman)

(In reply to Catalin Sasca, QA [:csasca] from comment #17)

Hi Michael, is there something that QA can help with here? And if so, are there any STR for this? Thanks!

There were no STR filed in the bug and I haven't seen this happen locally so I don't have any STR to add. Since this is a shutdown race, there isn't really anything for QA to do here.

Flags: needinfo?(mfroman)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: