Closed Bug 560784 Opened 15 years ago Closed 15 years ago

rev 3 Fedora 'pthread_mutex_destroy(&m->mutex) == 0' 'pa_atomic_load(&(c)->_ref) >= 1' fatal assertions during test_playback.html | bug461281.ogg

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: karlt, Assigned: kinetik)

References

Details

(Keywords: assertion, intermittent-failure)

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1271829036.1271830197.19660.gz 45422 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug461281.ogg time should run forwards: p=1.312000036239624 c=1.3519999980926514 Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting. Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting. TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run
This has only happened once. Can we close this bug?
Has anything changed that might have fixed this? If not it might happen again, so I think we should leave it open. That doesn't necessarily mean it needs to block turning on these machines.
A slightly different failure, but I'll lump it in here for lack of a better place: 46414 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking playback has ended Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting. Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting. NEXT ERROR TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273000321.1273002364.13350.gz Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/04 12:12:01
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273105924.1273107007.25328.gz Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/05 17:32:04 s: talos-r3-fed-005 46425 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking playback has ended Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting. Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting. NEXT ERROR TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273657090.1273659071.3005.gz Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/12 02:38:10 46467 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv time should run forwards: p=0.4659999907016754 c=0.5 46468 INFO TEST-PASS | /tests/content/media/test/test_playback.html | Name should match test.name #4 46469 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv time should run forwards: p=0.5 c=0.5329999923706055 46470 INFO TEST-PASS | /tests/content/media/test/test_playback.html | Name should match test.name #2 46471 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking readyState 46472 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv shouldn't report NETWORK_LOADED 46473 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking playback has ended Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting. Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting. NEXT ERROR TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run
Linux mozilla-central opt test mochitests-1/5 [testfailed] Started 09:58, finished 10:12, took 15mins http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274201880.1274202618.21075.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274466631.1274468165.7949.gz Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/21 11:30:31 s: talos-r3-fed-010 46628 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking playback has ended Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting. Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting. NEXT ERROR TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run
I happened to catch this one of these bugs in valgrind while looking at something else ==1988== Thread 22: ==1988== Invalid read of size 4 ==1988== at 0x307A25BC: pa_stream_get_state (in /usr/lib/libpulse.so.0.12.0) ==1988== by 0x320A59F0: ??? (in /usr/lib/alsa-lib/libasound_module_pcm_pulse.so) ==1988== by 0x320A608A: ??? (in /usr/lib/alsa-lib/libasound_module_pcm_pulse.so) ==1988== by 0xD6FD5DC: snd_pcm_recover (in /usr/lib/libasound.so.2.0.0) ==1988== by 0x615B6B3: sa_stream_get_position (sydney_audio_alsa.c:260) ==1988== by 0x61190BB: nsAudioStream::GetPosition() (nsAudioStream.cpp:263) ==1988== by 0x6114BE5: nsBuiltinDecoderStateMachine::GetAudioClock() (nsBuiltinDecoderStateMachine.cpp:1043) ==1988== by 0x6114D80: nsBuiltinDecoderStateMachine::AdvanceFrame() (nsBuiltinDecoderStateMachine.cpp:1072) ==1988== by 0x6113A3D: nsBuiltinDecoderStateMachine::Run() (nsBuiltinDecoderStateMachine.cpp:796) ==1988== by 0x6B988F4: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:547) ==1988== by 0x6B267D3: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250) ==1988== by 0x6B97901: nsThread::ThreadFunc(void*) (nsThread.cpp:263) ==1988== Address 0x1bd90d20 is 0 bytes inside a block of size 1,416 free'd ==1988== at 0x4C24D68: free (vg_replace_malloc.c:325) ==1988== by 0x307AFC3D: pa_xfree (in /usr/lib/libpulse.so.0.12.0) ==1988== by 0x320A5F8F: ??? (in /usr/lib/alsa-lib/libasound_module_pcm_pulse.so) ==1988== by 0xD73E703: ??? (in /usr/lib/libasound.so.2.0.0) ==1988== by 0xD701524: snd_pcm_close (in /usr/lib/libasound.so.2.0.0) ==1988== by 0x615B494: sa_stream_destroy (sydney_audio_alsa.c:167) ==1988== by 0x61189CA: nsAudioStream::Shutdown() (nsAudioStream.cpp:113) ==1988== by 0x6118E48: nsAudioStream::Write(void const*, unsigned int, int) (nsAudioStream.cpp:192) ==1988== by 0x6111EB2: nsBuiltinDecoderStateMachine::AudioLoop() (nsBuiltinDecoderStateMachine.cpp:387) ==1988== by 0x6116D57: nsRunnableMethodImpl<void (nsBuiltinDecoderStateMachine::*)(), true>::Run() (nsThreadUtils.h:347) ==1988== by 0x6B988F4: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:547) ==1988== by 0x6B267D3: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250) ==1988== Assertion 'pa_atomic_load(&(s)->_ref) >= 1' failed at pulse/stream.c:267, function pa_stream_get_state(). Aborting. Program ./firefox-bin (pid = 1988) received signal 6. Stack: UNKNOWN [/lib/libpthread.so.0 +0x0000F190] gsignal+0x00000035 [/lib/libc.so.6 +0x000334B5] abort+0x00000180 [/lib/libc.so.6 +0x00036F50] UNKNOWN [/usr/lib/libpulse.so.0 +0x0002365C] UNKNOWN [/usr/lib/alsa-lib/libasound_module_pcm_pulse.so +0x000029F1] UNKNOWN [/usr/lib/alsa-lib/libasound_module_pcm_pulse.so +0x0000308B] snd_pcm_recover+0x000000DD [/usr/lib/libasound.so.2 +0x0004D5DD] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x011156B4] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x010D30BC] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x010CEBE6] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x010CED81] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x010CDA3E] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x01B528F5] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x01AE07D4] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x01B51902] UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libnspr4.so +0x00038507] UNKNOWN [/lib/libpthread.so.0 +0x00006A04] clone+0x0000006D [/lib/libc.so.6 +0x000DFD4D]
Backtrace from abort, with the symbols available on my machine gsignal abort pa_stream_get_state /build/buildd/pulseaudio-0.9.19/src/pulse/stream.c:267 UNKNOWN [/usr/lib/alsa-lib/libasound_module_pcm_pulse.so +0x000029F1] UNKNOWN [/usr/lib/alsa-lib/libasound_module_pcm_pulse.so +0x0000308B] snd_pcm_recover sa_stream_get_position /home/cjones/mozilla/mozilla-central/media/libsydneyaudio/src/sydney_audio_alsa.c:260 nsAudioStream::GetPosition() /home/cjones/mozilla/mozilla-central/content/media/nsAudioStream.cpp:263 nsBuiltinDecoderStateMachine::GetAudioClock() /home/cjones/mozilla/mozilla-central/content/media/nsBuiltinDecoderStateMachine.cpp:1043 nsBuiltinDecoderStateMachine::AdvanceFrame() /home/cjones/mozilla/mozilla-central/content/media/nsBuiltinDecoderStateMachine.cpp:1072 nsBuiltinDecoderStateMachine::Run() /home/cjones/mozilla/mozilla-central/content/media/nsBuiltinDecoderStateMachine.cpp:798 nsThread::ProcessNextEvent(int, int*) /home/cjones/mozilla/mozilla-central/xpcom/threads/nsThread.cpp:547 [event loop gunk]
The above stack made a bit more readable by a new script of mine 0x0000F190: __restore_rt (sigaction.c:0) 0x000334B5: *__GI_raise (raise.c:64) 0x00036F50: *__GI_abort (abort.c:94) 0x0002365C: pa_stream_get_state (stream.c:267) 0x000029F1: UNKNOWN (libasound_module_pcm_pulse.so:0) 0x0000308B: UNKNOWN (libasound_module_pcm_pulse.so:0) 0x0004D5DD: snd_pcm_recover+0x000000DD (libasound.so.2:0) 0x011156B4: sa_stream_get_position (sydney_audio_alsa.c:260) 0x010D30BC: nsAudioStream::GetPosition() (nsAudioStream.cpp:263) 0x010CEBE6: nsBuiltinDecoderStateMachine::GetAudioClock() (nsBuiltinDecoderStateMachine.cpp:1043) 0x010CED81: nsBuiltinDecoderStateMachine::AdvanceFrame() (nsBuiltinDecoderStateMachine.cpp:1072) 0x010CDA3E: nsBuiltinDecoderStateMachine::Run() (nsBuiltinDecoderStateMachine.cpp:798) 0x01B528F5: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:547) 0x01AE07D4: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250) 0x01B51902: nsThread::ThreadFunc(void*) (nsThread.cpp:262) 0x00038507: _pt_root (ptthread.c:231) 0x00006A04: start_thread (pthread_create.c:300) 0x000DFD4D: __clone (clone.S:114)
That's awesome, thanks Chris! So, the nsAudioStream has failed during a write and promptly destroyed the audio backend state. Another thread is calling GetPosition at the same time. AudioLoop() holds the audio lock and does not hold the decoder lock during the write. Both callers of GetAudioClock() hold the decoder lock and do not hold the audio lock. The underlying call to sa_stream_get_write size is thread safe, but sa_stream_destroy is not thread safe and assumes the application is not (and will not) attempt to use the sa_stream state during and after destruction.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275007837.1275009485.18893.gz Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/05/27 17:50:37 s: talos-r3-fed64-018
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275038417.1275039948.19276.gz Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/28 02:20:17 s: talos-r3-fed-020
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275357988.1275359161.16152.gz Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/31 19:06:28 s: talos-r3-fed-005
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275563954.1275564368.10377.gz Rev3 Fedora 12x64 mozilla-central opt test mochitests-1/5 on 2010/06/03 04:19:14
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1276022690.1276024395.1726.gz Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/06/08 11:44:50 s: talos-r3-fed64-025
Keywords: assertion
OS: Linux → All
Hardware: x86_64 → All
Depends on: 571798
Depends on: 589071
Hasn't occurred for a couple of months now, so the fixes in the bugs this bug depends on must have worked. Marking fixed.
Assignee: nobody → kinetik
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.