Closed Bug 638807 Opened 9 years ago Closed 9 years ago

Data race on nsBuiltinDecoder::mFrameBufferLength

Categories

(Core :: Audio/Video, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla5

People

(Reporter: jseward, Assigned: yury)

References

Details

Attachments

(1 file, 4 obsolete files)

TEST_PATH=content/base/test/test_mozfiledataurl.html

using Helgrinding setup described in bug 551155 comment 13 and later
(ignore preceding junk)

Report below appears approx every second run. (Helgrind is scheduling
sensitive, so identical runs report overlapping, essentially random
subsets of the set of all races it can detect, alas.  Hence need
multiple runs to get full coverage.)

Looks like someone is writing nsBuiltinDecoder::mFrameBufferLength
without holding the relevant lock.  (see 2nd stack)

Am happy to try out suggested patches.

-----------------------------------------------

Possible data race during write of size 4 at 0x1aec0654 by thread #27 while holding 1 lock
   at 0x5CA9F89: nsMediaDecoder::RequestFrameBufferLength(unsigned int) (content/media/nsMediaDecoder.cpp:123)
   by 0x5CB5C54: nsBuiltinDecoderStateMachine::Run() (content/media/nsBuiltinDecoderStateMachine.cpp:1020)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)
   by 0x63618C9: NS_ProcessNextEvent_P(nsIThread*, int) (ff-opt/xpcom/build/nsThreadUtils.cpp:250)
   by 0x63A1B3C: nsThread::ThreadFunc(void*) (xpcom/threads/nsThread.cpp:281)
   by 0x7A44A1C: _pt_root (nsprpub/pr/src/pthreads/ptthread.c:189)
   by 0x4C2CBE7: mythread_wrapper (/home/sewardj/VgTRUNK/hgdev/helgrind/hg_intercepts.c:221)
   by 0x4E369C9: start_thread (/build/buildd/eglibc-2.11.1/nptl/pthread_create.c:300)
   by 0xB2DB70C: clone (/build/buildd/eglibc-2.11.1/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:112)

 This conflicts with a previous write of size 4 by thread #1 while holding 0 locks
   at 0x5CB1C45: nsBuiltinDecoder::MetadataLoaded(unsigned int, unsigned int, unsigned int) (content/media/nsBuiltinDecoder.cpp:339)
   by 0x5CB6491: nsAudioMetadataEventRunner::Run() (content/media/nsBuiltinDecoderStateMachine.cpp:136)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)
   by 0x63618C9: NS_ProcessNextEvent_P(nsIThread*, int) (ff-opt/xpcom/build/nsThreadUtils.cpp:250)
   by 0x625ACDF: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (ipc/glue/MessagePump.cpp:134)
   by 0x63E29AA: MessageLoop::RunInternal() (ipc/chromium/src/base/message_loop.cc:219)
   by 0x63E29B6: MessageLoop::RunHandler() (ipc/chromium/src/base/message_loop.cc:202)
   by 0x63E2A25: MessageLoop::Run() (ipc/chromium/src/base/message_loop.cc:176)

 Address 0x1aec0654 is 68 bytes inside a block of size 216 alloc'd
   at 0x4C27B08: malloc (/home/sewardj/VgTRUNK/hgdev/coregrind/m_replacemalloc/vg_replace_malloc.c:236)
   by 0x7410251: moz_xmalloc (memory/mozalloc/mozalloc.cpp:98)
   by 0x5ACF812: nsHTMLMediaElement::CreateDecoder(nsACString_internal const&) (ff-opt/content/html/content/src/../../../../dist/include/mozilla/mozalloc.h:229)
   by 0x5AD22FB: nsHTMLMediaElement::InitializeDecoderForChannel(nsIChannel*, nsIStreamListener**) (content/html/content/src/nsHTMLMediaElement.cpp:1850)
   by 0x5AD2453: nsHTMLMediaElement::MediaLoadListener::OnStartRequest(nsIRequest*, nsISupports*) (content/html/content/src/nsHTMLMediaElement.cpp:304)
   by 0x565C313: nsBaseChannel::OnStartRequest(nsIRequest*, nsISupports*) (netwerk/base/src/nsBaseChannel.cpp:712)
   by 0x5664513: nsInputStreamPump::OnStateStart() (netwerk/base/src/nsInputStreamPump.cpp:441)
   by 0x56649B4: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (netwerk/base/src/nsInputStreamPump.cpp:397)
   by 0x6389809: nsInputStreamReadyEvent::Run() (xpcom/io/nsStreamUtils.cpp:112)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)
   by 0x63618C9: NS_ProcessNextEvent_P(nsIThread*, int) (ff-opt/xpcom/build/nsThreadUtils.cpp:250)
   by 0x625AC52: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (ipc/glue/MessagePump.cpp:110)
I believe this comes from the following section of nsBuiltinDecoderStateMachine::Run():

        NS_DispatchToMainThread(metadataLoadedEvent, NS_DISPATCH_NORMAL);
        if (HasAudio()) {
          mEventManager.Init(info.mAudioChannels, info.mAudioRate);
          mDecoder->RequestFrameBufferLength(frameBufferLength);
        }

So in theory both writes should be the same value. However, the design still seems broken here, and if we ever grow another caller of RequestFrameBufferLength(), they're potentially going to be very surprised.
nsBuiltinDecoder::MetadataLoaded writes to mFrameBufferLength without acquiring the lock. I'm not sure why it's set there as well as by the RequestFrameBufferLength call in the nsBuiltinDecoderStateMachine::Run method. Added Dave Humphrey who worked on bug 490705 where this was added.
FTR (mostly so I don't lose this info), immediately prior to the above-discussed
race, two other races were reported.  They look closely related.  Comments/
analysis/"this is the same problem"/"this isn't the same problem" welcomed.

The "while holding N locks" feature is intended to be helpful, but has been
present in the tool for a whole 6 hours now :-) so don't take it as gospel.

--------------------

thread announcement (not an error)

Thread #27 was created
   at 0xB2DB6CE: clone (/build/buildd/eglibc-2.11.1/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:77)
   by 0x4E37172: pthread_create@@GLIBC_2.2.5 (/build/buildd/eglibc-2.11.1/nptl/../nptl/sysdeps/pthread/createthread.c:75)
   by 0x4C2CA4A: pthread_create_WRK (/home/sewardj/VgTRUNK/hgdev/helgrind/hg_intercepts.c:257)
   by 0x4C2CB5E: pthread_create@* (/home/sewardj/VgTRUNK/hgdev/helgrind/hg_intercepts.c:288)
   by 0x7A44D83: _PR_CreateThread (nsprpub/pr/src/pthreads/ptthread.c:432)
   by 0x7A44FD4: PR_CreateThread (nsprpub/pr/src/pthreads/ptthread.c:515)
   by 0x63A19EA: nsThread::Init() (xpcom/threads/nsThread.cpp:355)
   by 0x63A29AD: nsThreadManager::NewThread(unsigned int, nsIThread**) (xpcom/threads/nsThreadManager.cpp:249)
   by 0x6361A80: NS_NewThread_P(nsIThread**, nsIRunnable*) (ff-opt/xpcom/build/nsThreadUtils.cpp:74)
   by 0x5CB1E8D: nsBuiltinDecoder::StartStateMachineThread() (content/media/nsBuiltinDecoder.cpp:237)
   by 0x5CB2124: nsBuiltinDecoder::Load(nsMediaStream*, nsIStreamListener**, nsMediaDecoder*) (content/media/nsBuiltinDecoder.cpp:227)
   by 0x5AD2340: nsHTMLMediaElement::InitializeDecoderForChannel(nsIChannel*, nsIStreamListener**) (content/html/content/src/nsHTMLMediaElement.cpp:1863)

--------------------

Possible data race during write of size 1 at 0x152a8284 by thread #27 while holding 1 lock
   at 0x5CC91C6: nsOggReader::ReadMetadata() (content/media/ogg/nsOggReader.cpp:305)
   by 0x5CB382C: nsBuiltinDecoderStateMachine::LoadMetadata() (content/media/nsBuiltinDecoderStateMachine.cpp:1521)
   by 0x5CC5B15: nsOggDecoderStateMachine::LoadMetadata() (content/media/ogg/nsOggDecoderStateMachine.cpp:51)
   by 0x5CB581C: nsBuiltinDecoderStateMachine::Run() (content/media/nsBuiltinDecoderStateMachine.cpp:980)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)
   by 0x63618C9: NS_ProcessNextEvent_P(nsIThread*, int) (ff-opt/xpcom/build/nsThreadUtils.cpp:250)
   by 0x63A1B3C: nsThread::ThreadFunc(void*) (xpcom/threads/nsThread.cpp:281)
   by 0x7A44A1C: _pt_root (nsprpub/pr/src/pthreads/ptthread.c:189)
   by 0x4C2CBE7: mythread_wrapper (/home/sewardj/VgTRUNK/hgdev/helgrind/hg_intercepts.c:221)
   by 0x4E369C9: start_thread (/build/buildd/eglibc-2.11.1/nptl/pthread_create.c:300)
   by 0xB2DB70C: clone (/build/buildd/eglibc-2.11.1/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:112)

 This conflicts with a previous read of size 1 by thread #1 while holding 1 lock
   at 0x5CB44C8: nsBuiltinDecoderStateMachine::HaveNextFrameData() const (content/media/nsBuiltinDecoderStateMachine.cpp:192)
   by 0x5CB4561: nsBuiltinDecoderStateMachine::GetNextFrameStatus() (content/media/nsBuiltinDecoderStateMachine.cpp:750)
   by 0x5CB1514: nsBuiltinDecoder::UpdateReadyStateForData() (content/media/nsBuiltinDecoder.cpp:639)
   by 0x5CB1A10: nsBuiltinDecoder::NotifyDownloadEnded(unsigned int) (content/media/nsBuiltinDecoder.cpp:593)
   by 0x5CB0D6A: DataEnded::Run() (content/media/nsMediaStream.cpp:738)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)
   by 0x63618C9: NS_ProcessNextEvent_P(nsIThread*, int) (ff-opt/xpcom/build/nsThreadUtils.cpp:250)
   by 0x625AC52: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (ipc/glue/MessagePump.cpp:110)

 Address 0x152a8284 is 324 bytes inside a block of size 616 alloc'd
   at 0x4C27B08: malloc (/home/sewardj/VgTRUNK/hgdev/coregrind/m_replacemalloc/vg_replace_malloc.c:236)
   by 0x7410251: moz_xmalloc (memory/mozalloc/mozalloc.cpp:98)
   by 0x5CC5B90: nsOggDecoderStateMachine::nsOggDecoderStateMachine(nsBuiltinDecoder*) (ff-opt/content/media/ogg/../../../dist/include/mozilla/mozalloc.h:229)
   by 0x5CC5A5D: nsOggDecoder::CreateStateMachine() (content/media/ogg/nsOggDecoder.cpp:46)
   by 0x5CB206E: nsBuiltinDecoder::Load(nsMediaStream*, nsIStreamListener**, nsMediaDecoder*) (content/media/nsBuiltinDecoder.cpp:209)
   by 0x5AD2340: nsHTMLMediaElement::InitializeDecoderForChannel(nsIChannel*, nsIStreamListener**) (content/html/content/src/nsHTMLMediaElement.cpp:1863)
   by 0x5AD2453: nsHTMLMediaElement::MediaLoadListener::OnStartRequest(nsIRequest*, nsISupports*) (content/html/content/src/nsHTMLMediaElement.cpp:304)
   by 0x565C313: nsBaseChannel::OnStartRequest(nsIRequest*, nsISupports*) (netwerk/base/src/nsBaseChannel.cpp:712)
   by 0x5664513: nsInputStreamPump::OnStateStart() (netwerk/base/src/nsInputStreamPump.cpp:441)
   by 0x56649B4: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (netwerk/base/src/nsInputStreamPump.cpp:397)
   by 0x6389809: nsInputStreamReadyEvent::Run() (xpcom/io/nsStreamUtils.cpp:112)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)

--------------------

Possible data race during write of size 1 at 0x152a8285 by thread #27 while holding 1 lock
   at 0x5CC91D5: nsOggReader::ReadMetadata() (content/media/ogg/nsOggReader.cpp:306)
   by 0x5CB382C: nsBuiltinDecoderStateMachine::LoadMetadata() (content/media/nsBuiltinDecoderStateMachine.cpp:1521)
   by 0x5CC5B15: nsOggDecoderStateMachine::LoadMetadata() (content/media/ogg/nsOggDecoderStateMachine.cpp:51)
   by 0x5CB581C: nsBuiltinDecoderStateMachine::Run() (content/media/nsBuiltinDecoderStateMachine.cpp:980)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)
   by 0x63618C9: NS_ProcessNextEvent_P(nsIThread*, int) (ff-opt/xpcom/build/nsThreadUtils.cpp:250)
   by 0x63A1B3C: nsThread::ThreadFunc(void*) (xpcom/threads/nsThread.cpp:281)
   by 0x7A44A1C: _pt_root (nsprpub/pr/src/pthreads/ptthread.c:189)
   by 0x4C2CBE7: mythread_wrapper (/home/sewardj/VgTRUNK/hgdev/helgrind/hg_intercepts.c:221)
   by 0x4E369C9: start_thread (/build/buildd/eglibc-2.11.1/nptl/pthread_create.c:300)
   by 0xB2DB70C: clone (/build/buildd/eglibc-2.11.1/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:112)

 This conflicts with a previous read of size 1 by thread #1 while holding 1 lock
   at 0x5CB44DD: nsBuiltinDecoderStateMachine::HaveNextFrameData() const (content/media/nsBuiltinDecoderStateMachine.cpp:192)
   by 0x5CB4561: nsBuiltinDecoderStateMachine::GetNextFrameStatus() (content/media/nsBuiltinDecoderStateMachine.cpp:750)
   by 0x5CB1514: nsBuiltinDecoder::UpdateReadyStateForData() (content/media/nsBuiltinDecoder.cpp:639)
   by 0x5CB1A10: nsBuiltinDecoder::NotifyDownloadEnded(unsigned int) (content/media/nsBuiltinDecoder.cpp:593)
   by 0x5CB0D6A: DataEnded::Run() (content/media/nsMediaStream.cpp:738)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)
   by 0x63618C9: NS_ProcessNextEvent_P(nsIThread*, int) (ff-opt/xpcom/build/nsThreadUtils.cpp:250)
   by 0x625AC52: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (ipc/glue/MessagePump.cpp:110)

 Address 0x152a8285 is 325 bytes inside a block of size 616 alloc'd
   at 0x4C27B08: malloc (/home/sewardj/VgTRUNK/hgdev/coregrind/m_replacemalloc/vg_replace_malloc.c:236)
   by 0x7410251: moz_xmalloc (memory/mozalloc/mozalloc.cpp:98)
   by 0x5CC5B90: nsOggDecoderStateMachine::nsOggDecoderStateMachine(nsBuiltinDecoder*) (ff-opt/content/media/ogg/../../../dist/include/mozilla/mozalloc.h:229)
   by 0x5CC5A5D: nsOggDecoder::CreateStateMachine() (content/media/ogg/nsOggDecoder.cpp:46)
   by 0x5CB206E: nsBuiltinDecoder::Load(nsMediaStream*, nsIStreamListener**, nsMediaDecoder*) (content/media/nsBuiltinDecoder.cpp:209)
   by 0x5AD2340: nsHTMLMediaElement::InitializeDecoderForChannel(nsIChannel*, nsIStreamListener**) (content/html/content/src/nsHTMLMediaElement.cpp:1863)
   by 0x5AD2453: nsHTMLMediaElement::MediaLoadListener::OnStartRequest(nsIRequest*, nsISupports*) (content/html/content/src/nsHTMLMediaElement.cpp:304)
   by 0x565C313: nsBaseChannel::OnStartRequest(nsIRequest*, nsISupports*) (netwerk/base/src/nsBaseChannel.cpp:712)
   by 0x5664513: nsInputStreamPump::OnStateStart() (netwerk/base/src/nsInputStreamPump.cpp:441)
   by 0x56649B4: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (netwerk/base/src/nsInputStreamPump.cpp:397)
   by 0x6389809: nsInputStreamReadyEvent::Run() (xpcom/io/nsStreamUtils.cpp:112)
   by 0x63A142C: nsThread::ProcessNextEvent(int, int*) (xpcom/threads/nsThread.cpp:642)
The issue in comment 3 appears to be different. There is an mInfo structure owned by nsOggReader. This is written to in the state machine thread and guarded by a monitor owned by nsOggReader (through its base class nsBuiltinDecoderReader).

It's read from the main thread in nsOggDecoderStateMachine::HaveNextFrameData. This is guarded by a lock on the nsDecoder's monitor. This monitor and nsOggReader's monitor are different monitors.

The mInfo in nsBuiltinDecoderReader shouldn't be accessed without taking its lock.
Assignee: nobody → chris.double
Status: NEW → ASSIGNED
RE comment 1: the same exact FrameBufferLength value is set in two different threads at the same time. 

Looks like the statement at nsBuiltinDecoder::MetadataLoaded (http://mxr.mozilla.org/mozilla-central/source/content/media/nsBuiltinDecoder.cpp#339)

mFrameBufferLength = aFrameBufferLength;

does not perform anything useful. We can switch the dispatch event with the |if (HasAudio())| block at the http://mxr.mozilla.org/mozilla-central/source/content/media/nsBuiltinDecoderStateMachine.cpp#1018 .
Attachment #517617 - Flags: feedback?(david.humphrey)
Comment on attachment 517617 [details] [diff] [review]
Fixing mFrameBufferLength data race

Doesn't this still have a problem whereby the mFrameBufferLength variable in nsMediaDecoder is read and written on the main thread (via the DOM methods) and read on the audio thread (via mEventManager.QueueWrittenAudioData) without any locking?
Assignee: chris.double → async.processingjs
I've split out the issue in comment 3 and comment 4 into bug 639721.
In this patch the decoder pushes the frame buffer length to the event manager. Using the existing monitors to avoid any data race. I'll keep first patch for comparison.
Attachment #517953 - Flags: review?(chris.double)
Am trying to verify that the patch fixes the race, but currently I
can't reproduce the race even in an unpatched build.  Investigating.
(In reply to comment #8)
> Created attachment 517953 [details] [diff] [review]
> Fixing mFrameBufferLength data race (using monitors)

This gets rid of the reported race, as far as I can tell.
Comment on attachment 517953 [details] [diff] [review]
Fixing mFrameBufferLength data race (using monitors)

>diff -r 4e771e65764a content/media/nsAudioAvailableEventManager.cpp
>+  // Sets the size of the signal buffer.
>+  void SetSignalBufferLength(PRUint32 aLength);

Describe in comment if it can be called from any thread or is for calling from a specific thread only.

>+  // Sets the current size of the framebuffer used in MozAudioAvailable events.
>+  virtual void SetFrameBufferLength(PRUint32 aLength) = 0;

Same with this comment here.

>+void nsBuiltinDecoderStateMachine::SetFrameBufferLength(PRUint32 aLength)
>+{
>+  mDecoder->GetMonitor().AssertCurrentThreadIn();
>+  mEventManager.SetSignalBufferLength(aLength);
>+}

Should this assert that the length lies within the required range as mentioned in the comment in the header?

r+ with those changes.
Attachment #517953 - Flags: review?(chris.double) → review+
Attachment #517617 - Attachment is obsolete: true
Attachment #517953 - Attachment is obsolete: true
Attachment #517617 - Flags: feedback?(david.humphrey)
Keywords: checkin-needed
Fixing the assert.
Attachment #524900 - Attachment is obsolete: true
> Should this assert that the length lies within the required range as mentioned
in the comment in the header?

Okay, the assert helped to find something... Requesting the review again. The change from previous review:

@@ -215,23 +215,40 @@ nsresult nsBuiltinDecoder::Load(nsMediaS
+    if (mFrameBufferLength > 0) {
+      // The valid mFrameBufferLength value was specified earlier
       mDecoderStateMachine->SetFrameBufferLength(mFrameBufferLength);
+    }

and update to the current tree.
Attachment #524980 - Attachment is obsolete: true
Attachment #525029 - Flags: review?(chris.double)
Attachment #525029 - Flags: review?(chris.double) → review+
http://hg.mozilla.org/mozilla-central/rev/71fa806ffd26
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Whiteboard: fixed-in-cedar
Target Milestone: --- → mozilla2.2
Blocks: 686137
You need to log in before you can comment on or make changes to this bug.