Last Comment Bug 651059 - Data race on AsyncChannel::mChannelState
: Data race on AsyncChannel::mChannelState
Product: Core
Classification: Components
Component: IPC (show other bugs)
: Trunk
: x86_64 Linux
-- normal (vote)
: ---
Assigned To: Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
: Bill McCloskey (:billm)
Depends on:
  Show dependency treegraph
Reported: 2011-04-19 02:24 PDT by Julian Seward [:jseward]
Modified: 2011-04-28 17:16 PDT (History)
1 user (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

Fix (benign) read/write race on AsyncChannel.mChannelState (924 bytes, patch)
2011-04-20 18:22 PDT, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
bent.mozilla: review+
Details | Diff | Splinter Review

Description User image Julian Seward [:jseward] 2011-04-19 02:24:18 PDT
I think this is happening in a/the (hazy) plugin process:

 Command: \
   /space2/sewardj/MOZ/MC-18-04-2011-HG/ff-opt/dist/bin/plugin-container \
        /home/sewardj/.mozilla/plugins/ 6957 false plugin

with TEST_PATH=content/base/test/test_CSP.html

Summary: AsyncChannel::OnChannelOpened writes mChannelState whilst
holding no lock, vs AsyncChannel::Open reads mChannelState whilst
holding mMutex.

[23:51] <cjones> sewardj, that's a legitimate read/write race but is benign
[23:52] <cjones> definitely worth fixing for race-checker cleanliness

Helgrindage below.


 Lock at 0x100EA5F0 was first observed
    at 0x4C2C8C4: pthread_mutex_init (/home/sewardj/VgTRUNK/HGDEV2/helgrind/hg_intercepts.c:430)
    by 0x799CFFC: PR_NewLock (nsprpub/pr/src/pthreads/ptsynch.c:179)
    by 0x665AFED: mozilla::ipc::AsyncChannel::AsyncChannel(mozilla::ipc::AsyncChannel::AsyncListener*) (ff-opt/ipc/glue/../../dist/include/mozilla/Mutex.h:81)
    by 0x66655EA: mozilla::ipc::SyncChannel::SyncChannel(mozilla::ipc::SyncChannel::SyncListener*) (ipc/glue/SyncChannel.cpp:64)
    by 0x666100E: mozilla::ipc::RPCChannel::RPCChannel(mozilla::ipc::RPCChannel::RPCListener*) (ipc/glue/RPCChannel.cpp:100)
    by 0x66B348B: mozilla::plugins::PPluginModuleChild::PPluginModuleChild() (ff-opt/ipc/ipdl/PPluginModuleChild.cpp:93)
    by 0x664C226: mozilla::plugins::PluginModuleChild::PluginModuleChild() (dom/plugins/ipc/PluginModuleChild.cpp:117)
    by 0x5A35CAE: XRE_InitChildProcess (ff-opt/toolkit/xre/../../dist/include/mozilla/plugins/PluginProcessChild.h:55)
    by 0x400758: main (ipc/app/MozillaRuntimeMain.cpp:80)

 Possible data race during write of size 4 at 0x100DF8D8 by thread #2
 Locks held: none
    at 0x665AA64: mozilla::ipc::AsyncChannel::OnChannelOpened() (ipc/glue/AsyncChannel.cpp:472)
    by 0x665B571: RunnableMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)(), Tuple0>::Run() (ipc/chromium/src/base/tuple.h:383)
    by 0x67799DD: MessageLoop::RunTask(Task*) (ipc/chromium/src/base/
    by 0x677A092: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (ipc/chromium/src/base/
    by 0x677A64F: MessageLoop::DoWork() (ipc/chromium/src/base/
    by 0x67A559C: base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) (ipc/chromium/src/base/
    by 0x6779F3A: MessageLoop::RunInternal() (ipc/chromium/src/base/
    by 0x6779F46: MessageLoop::RunHandler() (ipc/chromium/src/base/
    by 0x6779FB5: MessageLoop::Run() (ipc/chromium/src/base/
    by 0x678B3AB: base::Thread::ThreadMain() (ipc/chromium/src/base/
    by 0x67A5ED9: ThreadFunc(void*) (ipc/chromium/src/base/
    by 0x4C2CBE7: mythread_wrapper (/home/sewardj/VgTRUNK/HGDEV2/helgrind/hg_intercepts.c:221)

 This conflicts with a previous read of size 4 by thread #1
 Locks held: 1, at address 0x100EA5F0
    at 0x665AE8F: mozilla::ipc::AsyncChannel::Open(IPC::Channel*, MessageLoop*) (ipc/glue/AsyncChannel.cpp:165)
    by 0x66B1A85: mozilla::plugins::PPluginModuleChild::Open(IPC::Channel*, int, MessageLoop*) (ff-opt/ipc/ipdl/PPluginModuleChild.cpp:110)
    by 0x664BC44: mozilla::plugins::PluginModuleChild::Init(std::string const&, int, MessageLoop*, IPC::Channel*) (dom/plugins/ipc/PluginModuleChild.cpp:206)
    by 0x6651338: mozilla::plugins::PluginProcessChild::Init() (dom/plugins/ipc/PluginProcessChild.cpp:127)
    by 0x5A35B42: XRE_InitChildProcess (toolkit/xre/nsEmbedFunctions.cpp:504)
    by 0x400758: main (ipc/app/MozillaRuntimeMain.cpp:80)

 Address 0x100DF8D8 is 88 bytes inside a block of size 1088 alloc'd
    at 0x4C27B08: malloc (/home/sewardj/VgTRUNK/HGDEV2/coregrind/m_replacemalloc/vg_replace_malloc.c:236)
    by 0x5253251: moz_xmalloc (memory/mozalloc/mozalloc.cpp:100)
    by 0x5A35C80: XRE_InitChildProcess (ff-opt/toolkit/xre/../../dist/include/mozilla/mozalloc.h:229)
    by 0x400758: main (ipc/app/MozillaRuntimeMain.cpp:80)
Comment 1 User image Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-04-19 02:40:58 PDT
This race is benign because
 (1) our code never checks for the ChannelOpening state
 (2) the "worker thread" in Open() only cares about mChannelState != ChannelConnected
 (3) the transitions from ChannelClosed->ChannelOpening and ChannelOpening->ChannelConnected are serial on the "IO thread" (so the former happens-before the latter), and the ChannelOpening->ChannelConnected transition is correctly synchronized by the AsyncChannel mutex/condvar
 (4) there's no way for the worker thread to spuriously observe mChannelState == ChannelConnected because of the racy write

Easiest fix is to synchronize the mChannelState = ChannelOpening write with AsyncChannel.mMutex.  There's no noticeable perf penalty from doing this: OnChannelOpened is called once per spawned process currently.
Comment 2 User image Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-04-20 18:22:47 PDT
Created attachment 527457 [details] [diff] [review]
Fix (benign) read/write race on AsyncChannel.mChannelState
Comment 3 User image Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-04-28 17:16:54 PDT

Note You need to log in before you can comment on or make changes to this bug.