Closed
Bug 757707
Opened 13 years ago
Closed 13 years ago
Intermittent crashtests/752784-1.html | application timed out after 330 seconds with no output
Categories
(Core :: Audio/Video, defect)
Tracking
()
RESOLVED
FIXED
mozilla15
People
(Reporter: philor, Assigned: kinetik)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file, 1 obsolete file)
|
3.17 KB,
patch
|
roc
:
review+
|
Details | Diff | Splinter Review |
https://tbpl.mozilla.org/php/getParsedLog.php?id=11968696&tree=Mozilla-Inbound
Rev3 Fedora 12 mozilla-inbound debug test crashtest on 2012-05-22 18:47:11 PDT for push 99878dcae029
slave: talos-r3-fed-008
REFTEST TEST-START | file:///home/cltbld/talos-slave/test/build/reftest/tests/content/media/test/crashtests/752784-1.html | 418 / 2059 (20%)
WARNING: No outer window available!: file ../../../dom/base/nsGlobalWindow.cpp, line 8391
WARNING: No outer window available!: file ../../../dom/base/nsGlobalWindow.cpp, line 8391
WARNING: No outer window available!: file ../../../dom/base/nsGlobalWindow.cpp, line 8391
WARNING: No outer window available!: file ../../../dom/base/nsGlobalWindow.cpp, line 8391
++DOMWINDOW == 62 (0xb9bfdd4) [serial = 941] [outer = 0x9fbb830]
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
nsBuiltinDecoder::PlaybackEnded mPlayState=3
nsBuiltinDecoder::PlaybackEnded mPlayState=3
WARNING: No outer window available!: file ../../../dom/base/nsGlobalWindow.cpp, line 8391
WARNING: No outer window available!: file ../../../dom/base/nsGlobalWindow.cpp, line 8391
WARNING: No outer window available!: file ../../../dom/base/nsGlobalWindow.cpp, line 8391
WARNING: No outer window available!: file ../../../dom/base/nsGlobalWindow.cpp, line 8391
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
nsBuiltinDecoder::PlaybackEnded mPlayState=3
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
nsBuiltinDecoder::PlaybackEnded mPlayState=3
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
nsBuiltinDecoder::PlaybackEnded mPlayState=3
TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/reftest/tests/content/media/test/crashtests/752784-1.html | application timed out after 330 seconds with no output
XScreenSaver state: Disabled
User input has been idle for 709 seconds
args: ['/home/cltbld/talos-slave/test/build/bin/screentopng']
(screenshot)
INFO | automation.py | Application ran for: 0:06:48.826801
INFO | automation.py | Reading PID log: /tmp/tmp_6xBhfpidlog
==> process 2126 launched child process 2159
INFO | automation.py | Checking for orphan process with PID: 2159
Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux-debug/1337736022/firefox-15.0a1.en-US.linux-i686.crashreporter-symbols.zip
PROCESS-CRASH | file:///home/cltbld/talos-slave/test/build/reftest/tests/content/media/test/crashtests/752784-1.html | application crashed (minidump found)
Crash dump filename: /tmp/tmpv4Xc0F/minidumps/16ab87d0-32ad-d9fb-3f9db17b-1c82ee4f.dmp
Operating system: Linux
0.0.0 Linux 2.6.31.5-127.fc12.i686.PAE #1 SMP Sat Nov 7 21:25:57 EST 2009 i686
CPU: x86
GenuineIntel family 6 model 23 stepping 10
2 CPUs
Crash reason: SIGABRT
Crash address: 0x84b
Thread 0 (crashed)
0 linux-gate.so + 0x424
eip = 0x007f8424 esp = 0xbfa181a4 ebp = 0xbfa181d8 ebx = 0x0b079b74
esi = 0x00000000 edi = 0x00000011 eax = 0xfffffffc ecx = 0x00000080
edx = 0x00000002 efl = 0x00200246
Found by: given as instruction pointer in context
1 libnspr4.so!PR_Lock [ptsynch.c : 174 + 0x7]
eip = 0x00134446 esp = 0xbfa181e0 ebp = 0xbfa181f8
Found by: previous frame's frame pointer
2 libnspr4.so!PR_EnterMonitor [ptsynch.c : 529 + 0xa]
eip = 0x0013519b esp = 0xbfa18200 ebp = 0xbfa18228 ebx = 0x00149530
esi = 0x0b079b70
Found by: call frame info
3 libxul.so!mozilla::ReentrantMonitor::Enter [BlockingResourceBase.cpp : 282 + 0x8]
eip = 0x020e5e53 esp = 0xbfa18230 ebp = 0xbfa18268 ebx = 0x02d27288
esi = 0x0960d8d4 edi = 0x00000000
Found by: call frame info
4 libxul.so!nsBuiltinDecoderStateMachine::Shutdown [nsBuiltinDecoderStateMachine.cpp : 1442 + 0x1c]
eip = 0x01aa9efa esp = 0xbfa18270 ebp = 0xbfa182a8 ebx = 0x02d27288
esi = 0x09a72900 edi = 0x09a7290c
Found by: call frame info
5 libxul.so!nsBuiltinDecoder::Shutdown [nsBuiltinDecoder.cpp : 152 + 0x16]
eip = 0x01aa5321 esp = 0xbfa182b0 ebp = 0xbfa182c8 ebx = 0x02d27288
esi = 0x0960d830 edi = 0x0a8a3730
Found by: call frame info
6 libxul.so!nsHTMLMediaElement::ShutdownDecoder [nsHTMLMediaElement.cpp : 553 + 0x10]
eip = 0x017e8d69 esp = 0xbfa182d0 ebp = 0xbfa182f8 ebx = 0x02d27288
esi = 0x0a8a3740 edi = 0x0a8a3730
Found by: call frame info
7 libxul.so!nsHTMLMediaElement::~nsHTMLMediaElement [nsHTMLMediaElement.cpp : 1681 + 0x8]
eip = 0x017eac75 esp = 0xbfa18300 ebp = 0xbfa18318 ebx = 0x02d27288
esi = 0x0a8a36e8 edi = 0x0a3537f0
Found by: call frame info
8 libxul.so!nsHTMLAudioElement::~nsHTMLAudioElement [nsHTMLAudioElement.cpp : 82 + 0x39]
eip = 0x017e167b esp = 0xbfa18320 ebp = 0xbfa18338 ebx = 0x02d27288
esi = 0x0a8a36e8 edi = 0x0a3537f0
Found by: call frame info
9 libxul.so!nsNodeUtils::LastRelease [nsNodeUtils.cpp : 251 + 0x7]
eip = 0x0169c9b6 esp = 0xbfa18340 ebp = 0xbfa18388 ebx = 0x02d27288
esi = 0x0a8a36e8 edi = 0x0a3537f0
The main thread is trying to take the decoder lock in nsBuiltinDecoderStateMachine::Shutdown due to the media element being GCed.
On the libcubeb thread, nsBufferedAudioStream::DataCallback is trying to take the lock on the stream.
On the nsBuiltinDecoderStateMachine thread (thread 18), we have a deep nest of nsBuiltinDecoderStateMachine::StopDecodeThread calls. We're exiting our monitor, trying to stop mDecodeThread, and while that waits, we get a nested nsBuiltinDecoderStateMachine::Run event which eventually triggers another nsBuiltinDecoderStateMachine::StopDecodeThread, etc. There's a nest of these 9-10 deep. At the bottom it's blocked trying to get a decoder lock. I guess at that point it shouldn't be holding any locks since StopDecodeThread drops the lock it's holding.
A very large number of nsBuiltinDecoderStateMachine::AudioLoop threads are trying to write to their nsBufferedAudioStream but waiting for space to open up. I guess they're not holding any locks.
One AudioLoop thread (thread 27) is calling nsBufferedAudioStream::Pause. I can't tell from the stack whether it's blocked or not.
Several DecodeLoop threads are doing a Wait, I think it must be the decoder lock. So they shouldn't be holding any locks.
So I can't tell who's holding the decoder lock that the main thread is waiting for. Nor can I tell who is holding the nsBufferedAudioStream lock.
Due to the large number of threads I'm pretty sure this problem is caused by 691096-1.html.
| Assignee | ||
Comment 2•13 years ago
|
||
nsAudioStream::Pause() is only called from one place, and it is holding the decoder lock when it does so. Presumably that AudioLoop thread is holding the decoder lock, the nsBufferedAudioStream lock (inside Pause()) and is off in space somewhere in cubeb_stream_stop. A decent stack would help; maybe we'll get lucky and this will fail on a debug build overnight.
Comment 3•13 years ago
|
||
I've had plans for a while to refactor the way we're shutting down media threads, so we don't call Shutdown() on threads until we know their thread runners have actually exited so we don't have to wait. I was waiting for the audio threads to go away before doing that though.
We've got a few other orange/crash bugs that occur with call stacks like this (older ones with stacks calling into libsydneyaudio).
| Assignee | ||
Comment 4•13 years ago
|
||
I can reproduce this by running the content/media/test/crashtests a few hundred times in a opt build.
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
| Assignee | ||
Comment 5•13 years ago
|
||
This is a fairly simple deadlock.
Thread 198 (AudioLoop)
stm in PROCESSING, blocked on condition, holding nsBufferedAudioStream monitor
Thread 212 (cubeb_run)
waiting on nsBufferedAudioStream monitor, responsible for transition stm from PROCESSING to another state on completion
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 8•13 years ago
|
||
I was sloppy with holding locks when I wrote this originally. Holding locks across calls to cubeb can result in deadlocks between the caller and the data or state callback running on another thread.
Pushed to try here: https://tbpl.mozilla.org/?tree=Try&rev=b627afd8ee5d
Attachment #628235 -
Flags: review?(roc)
Comment on attachment 628235 [details] [diff] [review]
patch v0
Review of attachment 628235 [details] [diff] [review]:
-----------------------------------------------------------------
So what happens if multiple nsBufferedAudioStream operations race, e.g. two Writes or a Write and a Pause? Now their execution can be interleaved.
Or are we always using nsBufferedAudioStreams from a single thread?
Comment 10•13 years ago
|
||
Consistently getting this when I seek opus audio, it is not intermittent for me. Occuring on http://media.kradradio.com/test/zan%20and%20shawn%20-%203-28-12_32kbit.opus to be specific.
It’s just sat here in the `bt`.
#0 0x00007f997c7e789c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f997c7e309b in _L_lock_1006 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f997c7e301c in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007f997b4e6f59 in PR_Lock () from /home/john/bin/firefox-nightly/firefox/libnspr4.so
#4 0x00007f997b4e75e5 in PR_EnterMonitor () from /home/john/bin/firefox-nightly/firefox/libnspr4.so
#5 0x00007f997887d0d0 in nsBuiltinDecoder::Play() () from /home/john/bin/firefox-nightly/firefox/libxul.so
#6 0x00007f997867df7c in nsHTMLMediaElement::Play() () from /home/john/bin/firefox-nightly/firefox/libxul.so
#7 0x00007f9978d725ca in NS_InvokeByIndex_P () from /home/john/bin/firefox-nightly/firefox/libxul.so
Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/15.0 Firefox/15.0a1
Comment 11•13 years ago
|
||
(In reply to Matthew Gregan [:kinetik] from comment #8)
> Created attachment 628235 [details] [diff] [review]
> patch v0
>
> I was sloppy with holding locks when I wrote this originally. Holding locks
> across calls to cubeb can result in deadlocks between the caller and the
> data or state callback running on another thread.
>
> Pushed to try here: https://tbpl.mozilla.org/?tree=Try&rev=b627afd8ee5d
This works for me.
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 13•13 years ago
|
||
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #9)
> Or are we always using nsBufferedAudioStreams from a single thread?
It's intended that callers perform their own synchronization, and I know this is true for AudioLoop and the Audio Data API (I didn't check the new MediaStreams code)--both use a single thread, with a small exception. The exception is that the GetPosition calls (and possibly SetVolume) are intended to be safe to call from a different thread. Also note that nsNativeAudioStream contains no synchronization (which may not matter given how little state it has now, but it used to have a shared buffer that would've been corrupted by interleaved writes).
The synchronization in nsBufferedAudioStream is intended to protect the state between the callbacks (which may run on different threads) and (already synchronized) calls to the nsBufferedAudioStream API. The last patch does have a bug where it could clobber mState == ERRORED set by the state callback while the lock is not held over cubeb_stream_start/stop, so I've fixed that and cleaned up a few of the other changes.
Try push: https://tbpl.mozilla.org/?tree=Try&rev=389fb496f098
Attachment #628235 -
Attachment is obsolete: true
Attachment #628235 -
Flags: review?(roc)
Attachment #628609 -
Flags: review?(roc)
Comment on attachment 628609 [details] [diff] [review]
patch v1
Review of attachment 628609 [details] [diff] [review]:
-----------------------------------------------------------------
Please document the threading requirements for nsBufferedAudioStream clients.
Attachment #628609 -
Flags: review?(roc) → review+
| Assignee | ||
Comment 15•13 years ago
|
||
Target Milestone: --- → mozilla15
Comment 16•13 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•13 years ago
|
Keywords: intermittent-failure
Updated•13 years ago
|
Whiteboard: [orange]
You need to log in
before you can comment on or make changes to this bug.
Description
•