Open Bug 1649163 Opened 1 year ago Updated 8 months ago

TSan data race in sctp_timer_start

Categories

(Core :: WebRTC: Networking, defect, P2)

People

(Reporter: bwc, Unassigned)

References

(Blocks 1 open bug)

Details

1:40.41 GECKO(2978394) ==================
1:40.41 GECKO(2978394) WARNING: ThreadSanitizer: data race (pid=2978511)
1:40.41 GECKO(2978394) Write of size 4 at 0x7fb63baa2d00 by thread T35 (mutexes: write M202802403893457152):
1:40.41 GECKO(2978394) #0 sctp_handle_tick /home/bcampen/checkouts/mozilla-central/netwerk/sctp/src/netinet/sctp_callout.c:237:15 (libxul.so+0x65bb0b7)
1:40.41 GECKO(2978394) #1 user_sctp_timer_iterate /home/bcampen/checkouts/mozilla-central/netwerk/sctp/src/netinet/sctp_callout.c:296:3 (libxul.so+0x65bb0b7)
1:40.41 GECKO(2978394) Previous read of size 4 at 0x7fb63baa2d00 by thread T5 (mutexes: write M201676503986613120):
1:40.41 GECKO(2978394) #0 sctp_timer_start /home/bcampen/checkouts/mozilla-central/netwerk/sctp/src/netinet/sctputil.c:2317:6 (libxul.so+0x663488b)
1:40.41 GECKO(2978394) #1 sctp_del_addr_from_vrf /home/bcampen/checkouts/mozilla-central/netwerk/sctp/src/netinet/sctp_pcb.c:916:3 (libxul.so+0x66081f7)
1:40.41 GECKO(2978394) #2 usrsctp_deregister_address /home/bcampen/checkouts/mozilla-central/netwerk/sctp/src/user_socket.c:3351:2 (libxul.so+0x6648f74)
1:40.41 GECKO(2978394) #3 mozilla::DataChannelConnection::DestroyOnSTS(socket*, socket*) /home/bcampen/checkouts/mozilla-central/netwerk/sctp/datachannel/DataChannel.cpp:411:3 (libxul.so+0x664d608)
1:40.41 GECKO(2978394) #4 __invoke_impl<void, void (mozilla::DataChannelConnection::*const &)(socket *, socket *), RefPtr<mozilla::DataChannelConnection> &, socket *, socket *> /usr/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/bits/invoke.h:73:14 (libxul.so+0x66634d4)
1:40.41 GECKO(2978394) #5 __invoke<void (mozilla::DataChannelConnection::*const &)(socket *, socket *), RefPtr<mozilla::DataChannelConnection> &, socket *, socket *> /usr/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/bits/invoke.h:95:14 (libxul.so+0x66634d4)
1:40.41 GECKO(2978394) #6 operator()<RefPtr<mozilla::DataChannelConnection> &, socket *, socket *> /usr/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/functional:114:11 (libxul.so+0x66634d4)
1:40.41 GECKO(2978394) #7 __invoke_impl<void, std::_Mem_fn<void (mozilla::DataChannelConnection::*)(socket *, socket *)>, RefPtr<mozilla::DataChannelConnection> &, socket *, socket *> /usr/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/bits/invoke.h:60:14 (libxul.so+0x66634d4)
1:40.41 GECKO(2978394) #8 __invoke<std::_Mem_fn<void (mozilla::DataChannelConnection::*)(socket *, socket *)>, RefPtr<mozilla::DataChannelConnection> &, socket *, socket *> /usr/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/bits/invoke.h:95:14 (libxul.so+0x66634d4)
1:40.41 GECKO(2978394) #9 __apply_impl<std::_Mem_fn<void (mozilla::DataChannelConnection::*)(socket *, socket *)>, std::tuple<RefPtr<mozilla::DataChannelConnection> &, socket *, socket *>, 0, 1, 2> /usr/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/tuple:1684:14 (libxul.so+0x66634d4)
1:40.41 GECKO(2978394) #10 apply<std::_Mem_fn<void (mozilla::DataChannelConnection::*)(socket *, socket *)>, std::tuple<RefPtr<mozilla::DataChannelConnection> &, socket *, socket *> > /usr/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/tuple:1694:14 (libxul.so+0x66634d4)
1:40.41 GECKO(2978394) #11 mozilla::runnable_args_memfn<RefPtr<mozilla::DataChannelConnection>, void (mozilla::DataChannelConnection::*)(socket*, socket*), socket*, socket*>::RunInternal() /home/bcampen/checkouts/mozilla-central/objdir-ff-tsan/dist/include/mtransport/runnable_utils.h:121:5 (libxul.so+0x66634d4)
1:40.41 GECKO(2978394) #12 mozilla::detail::runnable_args_base<(mozilla::detail::RunnableResult)0>::Run() /home/bcampen/checkouts/mozilla-central/objdir-ff-tsan/dist/include/mtransport/runnable_utils.h:41:5 (libxul.so+0x6663162)
1:40.41 GECKO(2978394) #13 nsThread::ProcessNextEvent(bool, bool*) /home/bcampen/checkouts/mozilla-central/xpcom/threads/nsThread.cpp:1234:14 (libxul.so+0x5c54050)
1:40.41 GECKO(2978394) #14 NS_ProcessNextEvent(nsIThread*, bool) /home/bcampen/checkouts/mozilla-central/xpcom/threads/nsThreadUtils.cpp:504:10 (libxul.so+0x5c58c05)
1:40.41 GECKO(2978394) #15 mozilla::net::nsSocketTransportService::Run() /home/bcampen/checkouts/mozilla-central/netwerk/base/nsSocketTransportService2.cpp:1177:11 (libxul.so+0x5e17551)
1:40.42 GECKO(2978394) #16 non-virtual thunk to mozilla::net::nsSocketTransportService::Run() /home/bcampen/checkouts/mozilla-central/netwerk/base/nsSocketTransportService2.cpp (libxul.so+0x5e18a0d)
1:40.42 GECKO(2978394) #17 nsThread::ProcessNextEvent(bool, bool*) /home/bcampen/checkouts/mozilla-central/xpcom/threads/nsThread.cpp:1234:14 (libxul.so+0x5c54050)
1:40.42 GECKO(2978394) #18 NS_ProcessNextEvent(nsIThread*, bool) /home/bcampen/checkouts/mozilla-central/xpcom/threads/nsThreadUtils.cpp:504:10 (libxul.so+0x5c58c05)
1:40.42 GECKO(2978394) #19 mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /home/bcampen/checkouts/mozilla-central/ipc/glue/MessagePump.cpp:302:20 (libxul.so+0x674a0ee)
1:40.42 GECKO(2978394) #20 RunInternal /home/bcampen/checkouts/mozilla-central/ipc/chromium/src/base/message_loop.cc:316:10 (libxul.so+0x667c4bc)
1:40.42 GECKO(2978394) #21 RunHandler /home/bcampen/checkouts/mozilla-central/ipc/chromium/src/base/message_loop.cc:309:3 (libxul.so+0x667c4bc)
1:40.42 GECKO(2978394) #22 MessageLoop::Run() /home/bcampen/checkouts/mozilla-central/ipc/chromium/src/base/message_loop.cc:291:3 (libxul.so+0x667c4bc)
1:40.42 GECKO(2978394) #23 nsThread::ThreadFunc(void*) /home/bcampen/checkouts/mozilla-central/xpcom/threads/nsThread.cpp:447:10 (libxul.so+0x5c4fcc8)
1:40.42 GECKO(2978394) #24 _pt_root /home/bcampen/checkouts/mozilla-central/nsprpub/pr/src/pthreads/ptthread.c:201:5 (libnspr4.so+0x50be0)
1:40.42 GECKO(2978394) Location is global 'system_base_info' of size 1824 at 0x7fb63baa2a30 (libxul.so+0x00000fe94d00)

We're accessing the sctp_callout->c_flags field from multiple threads without any thread-safety protection.

Also happening from sctp_handle_sack.
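
For illustration only, here is a minimal sketch of one way to make accesses to a flags word like c_flags race-free, using C11 atomics. Everything named sketch_* or SKETCH_* is hypothetical; this is not the usrsctp struct definition and not the fix upstream chose. It only removes the raw data race on the word itself. Any check-then-act sequence (test a flag, then change timer state based on it) would still need a lock around the whole sequence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; the values are illustrative, not usrsctp's. */
#define SKETCH_CALLOUT_ACTIVE  0x0002
#define SKETCH_CALLOUT_PENDING 0x0004

/* Hypothetical stand-in for a callout with an atomically accessed flags word. */
struct sketch_callout {
    atomic_int c_flags;
};

/* Atomically set the given bits. */
static void sketch_flags_set(struct sketch_callout *c, int bits)
{
    atomic_fetch_or(&c->c_flags, bits);
}

/* Atomically clear the given bits. */
static void sketch_flags_clear(struct sketch_callout *c, int bits)
{
    atomic_fetch_and(&c->c_flags, ~bits);
}

/* Atomically read the flags and report whether any of the bits are set. */
static bool sketch_flags_test(struct sketch_callout *c, int bits)
{
    return (atomic_load(&c->c_flags) & bits) != 0;
}

/* Single-threaded walk-through of the accessors; returns the final flags. */
static int sketch_flags_demo(void)
{
    struct sketch_callout c;
    bool pending_before, pending_after;

    atomic_init(&c.c_flags, 0);
    sketch_flags_set(&c, SKETCH_CALLOUT_ACTIVE | SKETCH_CALLOUT_PENDING);
    pending_before = sketch_flags_test(&c, SKETCH_CALLOUT_PENDING);
    sketch_flags_clear(&c, SKETCH_CALLOUT_PENDING);
    pending_after = sketch_flags_test(&c, SKETCH_CALLOUT_PENDING);
    assert(pending_before && !pending_after);
    return atomic_load(&c.c_flags);
}
```

The alternative is to take one mutex around every read and write of the field, which also covers compound updates, but that runs into the callout locking constraints.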

The callout system is tricky, since you can't hold locks while doing the callout. I'm aware of a problem which is tracked in https://github.com/sctplab/usrsctp/pull/417. Originally Mozilla reported a use-after-free. Peter Lei, who took the callout system from FreeBSD and put it into userland years ago, fixed it. At least this is what Mozilla reported (no crashes anymore). However, Google reported that there are now deadlocks introduced by that fix. The PR should fix this. I can take a look at the race after the other issues are fixed.
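
To make the locking constraint concrete, here is a rough sketch of the general pattern: the tick handler takes the timer-queue lock, marks the timer as no longer pending, copies out the callback, and releases the lock before invoking it. All names (sketch_*) are hypothetical and the code is not taken from usrsctp. The callback in the sketch re-arms the timer, which needs the same lock; with the lock still held across the call, that would deadlock on a non-recursive mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical timer; not the usrsctp type. */
struct sketch_timer {
    void (*func)(void *);
    void *arg;
    int pending;
};

/* Hypothetical lock protecting all timer state. */
static pthread_mutex_t sketch_timerq_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Arm the timer. Takes the queue lock, so callers must not already hold it. */
static void sketch_timer_start(struct sketch_timer *t,
                               void (*func)(void *), void *arg)
{
    pthread_mutex_lock(&sketch_timerq_mtx);
    t->func = func;
    t->arg = arg;
    t->pending = 1;
    pthread_mutex_unlock(&sketch_timerq_mtx);
}

/* Fire the timer if it is pending. Returns 1 if the callback ran, else 0. */
static int sketch_handle_tick(struct sketch_timer *t)
{
    void (*func)(void *);
    void *arg;

    pthread_mutex_lock(&sketch_timerq_mtx);
    if (!t->pending) {
        pthread_mutex_unlock(&sketch_timerq_mtx);
        return 0;
    }
    t->pending = 0;
    func = t->func;   /* copy out under the lock */
    arg = t->arg;
    pthread_mutex_unlock(&sketch_timerq_mtx);

    func(arg);        /* lock not held: the callback may re-arm the timer */
    return 1;
}

struct sketch_state {
    struct sketch_timer timer;
    int fired;
};

/* Callback that re-arms the timer once, then lets it lapse. */
static void sketch_rearm_once(void *arg)
{
    struct sketch_state *s = arg;

    s->fired++;
    if (s->fired < 2)
        sketch_timer_start(&s->timer, sketch_rearm_once, s);
}

/* Drive three ticks by hand; returns how many times the callback fired. */
static int sketch_demo_rearm(void)
{
    struct sketch_state s = { { NULL, NULL, 0 }, 0 };
    int first, second, third;

    sketch_timer_start(&s.timer, sketch_rearm_once, &s);
    first = sketch_handle_tick(&s.timer);   /* fires, callback re-arms */
    second = sketch_handle_tick(&s.timer);  /* fires, no re-arm */
    third = sketch_handle_tick(&s.timer);   /* nothing pending */
    assert(first == 1 && second == 1 && third == 0);
    return s.fired;
}
```

The cost of this pattern is the unlocked window between releasing the lock and running the callback: another thread can stop or free the timer there, so the timer's lifetime has to be guaranteed by some other means.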

Blocks: tsan
Duplicate of this bug: 1219409

It looks like the upstream PR was landed (well, a different version of it was). Do we just need to update libusrsctp?

Flags: needinfo?(tuexen)

Updating makes sense... I'm currently working on a couple of other fixes as well, so you might want to keep an eye on the usrsctp commits.

Flags: needinfo?(tuexen)