Closed Bug 560784 Opened 10 years ago Closed 9 years ago

rev 3 Fedora 'pthread_mutex_destroy(&m->mutex) == 0' 'pa_atomic_load(&(c)->_ref) >= 1' fatal assertions during test_playback.html | bug461281.ogg

Categories

(Core :: Audio/Video, defect)

defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: karlt, Assigned: kinetik)

References

Details

(Keywords: assertion, intermittent-failure)

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1271829036.1271830197.19660.gz

45422 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug461281.ogg time should run forwards: p=1.312000036239624 c=1.3519999980926514
Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting.
Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting.

TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run
This has only happened once.
Can we close this bug?
Has anything changed that might have fixed this?
If not it might happen again, so I think we should leave it open.
That doesn't necessarily mean it needs to block turning on these machines.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1272805160.1272806266.10656.gz
Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/05/02 05:59:20
A slightly different failure, but I'll lump it in here for lack of a better place:

46414 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking playback has ended
Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting.
Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting.
NEXT ERROR TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run


http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273000321.1273002364.13350.gz
Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/04 12:12:01
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273105924.1273107007.25328.gz
Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/05 17:32:04
s: talos-r3-fed-005

46425 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking playback has ended
Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting.
Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting.
NEXT ERROR TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273657090.1273659071.3005.gz
Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/12 02:38:10

46467 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv time should run forwards: p=0.4659999907016754 c=0.5
46468 INFO TEST-PASS | /tests/content/media/test/test_playback.html | Name should match test.name #4
46469 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv time should run forwards: p=0.5 c=0.5329999923706055
46470 INFO TEST-PASS | /tests/content/media/test/test_playback.html | Name should match test.name #2
46471 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking readyState
46472 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv shouldn't report NETWORK_LOADED
46473 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking playback has ended
Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting.
Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting.
NEXT ERROR TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run
Linux mozilla-central opt test mochitests-1/5 [testfailed] Started 09:58, finished 10:12, took 15mins

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274201880.1274202618.21075.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274466631.1274468165.7949.gz
Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/21 11:30:31  
s: talos-r3-fed-010

46628 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug498380.ogv checking playback has ended
Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting.
Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1042, function pa_context_get_state(). Aborting.
NEXT ERROR TEST-UNEXPECTED-FAIL | automation.py | Exited with code 6 during test run
I happened to catch this one of these bugs in valgrind while looking at something else

==1988== Thread 22:
==1988== Invalid read of size 4
==1988==    at 0x307A25BC: pa_stream_get_state (in /usr/lib/libpulse.so.0.12.0)
==1988==    by 0x320A59F0: ??? (in /usr/lib/alsa-lib/libasound_module_pcm_pulse.so)
==1988==    by 0x320A608A: ??? (in /usr/lib/alsa-lib/libasound_module_pcm_pulse.so)
==1988==    by 0xD6FD5DC: snd_pcm_recover (in /usr/lib/libasound.so.2.0.0)
==1988==    by 0x615B6B3: sa_stream_get_position (sydney_audio_alsa.c:260)
==1988==    by 0x61190BB: nsAudioStream::GetPosition() (nsAudioStream.cpp:263)
==1988==    by 0x6114BE5: nsBuiltinDecoderStateMachine::GetAudioClock() (nsBuiltinDecoderStateMachine.cpp:1043)
==1988==    by 0x6114D80: nsBuiltinDecoderStateMachine::AdvanceFrame() (nsBuiltinDecoderStateMachine.cpp:1072)
==1988==    by 0x6113A3D: nsBuiltinDecoderStateMachine::Run() (nsBuiltinDecoderStateMachine.cpp:796)
==1988==    by 0x6B988F4: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:547)
==1988==    by 0x6B267D3: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250)
==1988==    by 0x6B97901: nsThread::ThreadFunc(void*) (nsThread.cpp:263)
==1988==  Address 0x1bd90d20 is 0 bytes inside a block of size 1,416 free'd
==1988==    at 0x4C24D68: free (vg_replace_malloc.c:325)
==1988==    by 0x307AFC3D: pa_xfree (in /usr/lib/libpulse.so.0.12.0)
==1988==    by 0x320A5F8F: ??? (in /usr/lib/alsa-lib/libasound_module_pcm_pulse.so)
==1988==    by 0xD73E703: ??? (in /usr/lib/libasound.so.2.0.0)
==1988==    by 0xD701524: snd_pcm_close (in /usr/lib/libasound.so.2.0.0)
==1988==    by 0x615B494: sa_stream_destroy (sydney_audio_alsa.c:167)
==1988==    by 0x61189CA: nsAudioStream::Shutdown() (nsAudioStream.cpp:113)
==1988==    by 0x6118E48: nsAudioStream::Write(void const*, unsigned int, int) (nsAudioStream.cpp:192)
==1988==    by 0x6111EB2: nsBuiltinDecoderStateMachine::AudioLoop() (nsBuiltinDecoderStateMachine.cpp:387)
==1988==    by 0x6116D57: nsRunnableMethodImpl<void (nsBuiltinDecoderStateMachine::*)(), true>::Run() (nsThreadUtils.h:347)
==1988==    by 0x6B988F4: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:547)
==1988==    by 0x6B267D3: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250)
==1988== 
Assertion 'pa_atomic_load(&(s)->_ref) >= 1' failed at pulse/stream.c:267, function pa_stream_get_state(). Aborting.

Program ./firefox-bin (pid = 1988) received signal 6.
Stack:
UNKNOWN [/lib/libpthread.so.0 +0x0000F190]
gsignal+0x00000035 [/lib/libc.so.6 +0x000334B5]
abort+0x00000180 [/lib/libc.so.6 +0x00036F50]
UNKNOWN [/usr/lib/libpulse.so.0 +0x0002365C]
UNKNOWN [/usr/lib/alsa-lib/libasound_module_pcm_pulse.so +0x000029F1]
UNKNOWN [/usr/lib/alsa-lib/libasound_module_pcm_pulse.so +0x0000308B]
snd_pcm_recover+0x000000DD [/usr/lib/libasound.so.2 +0x0004D5DD]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x011156B4]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x010D30BC]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x010CEBE6]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x010CED81]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x010CDA3E]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x01B528F5]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x01AE07D4]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libxul.so +0x01B51902]
UNKNOWN [/home/cjones/mozilla/ff-dbg/dist/lib/libnspr4.so +0x00038507]
UNKNOWN [/lib/libpthread.so.0 +0x00006A04]
clone+0x0000006D [/lib/libc.so.6 +0x000DFD4D]
Backtrace from abort, with the symbols available on my machine

gsignal
abort
pa_stream_get_state
  /build/buildd/pulseaudio-0.9.19/src/pulse/stream.c:267
UNKNOWN [/usr/lib/alsa-lib/libasound_module_pcm_pulse.so +0x000029F1]
UNKNOWN [/usr/lib/alsa-lib/libasound_module_pcm_pulse.so +0x0000308B]
snd_pcm_recover
sa_stream_get_position
  /home/cjones/mozilla/mozilla-central/media/libsydneyaudio/src/sydney_audio_alsa.c:260
nsAudioStream::GetPosition()
  /home/cjones/mozilla/mozilla-central/content/media/nsAudioStream.cpp:263
nsBuiltinDecoderStateMachine::GetAudioClock()
  /home/cjones/mozilla/mozilla-central/content/media/nsBuiltinDecoderStateMachine.cpp:1043
nsBuiltinDecoderStateMachine::AdvanceFrame()
  /home/cjones/mozilla/mozilla-central/content/media/nsBuiltinDecoderStateMachine.cpp:1072
nsBuiltinDecoderStateMachine::Run()
  /home/cjones/mozilla/mozilla-central/content/media/nsBuiltinDecoderStateMachine.cpp:798
nsThread::ProcessNextEvent(int, int*)
  /home/cjones/mozilla/mozilla-central/xpcom/threads/nsThread.cpp:547
[event loop gunk]
The above stack made a bit more readable by a new script of mine

  0x0000F190: __restore_rt (sigaction.c:0)
  0x000334B5: *__GI_raise (raise.c:64)
  0x00036F50: *__GI_abort (abort.c:94)
  0x0002365C: pa_stream_get_state (stream.c:267)
  0x000029F1: UNKNOWN (libasound_module_pcm_pulse.so:0)
  0x0000308B: UNKNOWN (libasound_module_pcm_pulse.so:0)
  0x0004D5DD: snd_pcm_recover+0x000000DD (libasound.so.2:0)
  0x011156B4: sa_stream_get_position (sydney_audio_alsa.c:260)
  0x010D30BC: nsAudioStream::GetPosition() (nsAudioStream.cpp:263)
  0x010CEBE6: nsBuiltinDecoderStateMachine::GetAudioClock() (nsBuiltinDecoderStateMachine.cpp:1043)
  0x010CED81: nsBuiltinDecoderStateMachine::AdvanceFrame() (nsBuiltinDecoderStateMachine.cpp:1072)
  0x010CDA3E: nsBuiltinDecoderStateMachine::Run() (nsBuiltinDecoderStateMachine.cpp:798)
  0x01B528F5: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:547)
  0x01AE07D4: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250)
  0x01B51902: nsThread::ThreadFunc(void*) (nsThread.cpp:262)
  0x00038507: _pt_root (ptthread.c:231)
  0x00006A04: start_thread (pthread_create.c:300)
  0x000DFD4D: __clone (clone.S:114)
That's awesome, thanks Chris!

So, the nsAudioStream has failed during a write and promptly destroyed the audio backend state.  Another thread is calling GetPosition at the same time.

AudioLoop() holds the audio lock and does not hold the decoder lock during the write.  Both callers of GetAudioClock() hold the decoder lock and do not hold the audio lock.

The underlying call to sa_stream_get_write size is thread safe, but sa_stream_destroy is not thread safe and assumes the application is not (and will not) attempt to use the sa_stream state during and after destruction.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275007837.1275009485.18893.gz
Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/05/27 17:50:37
s: talos-r3-fed64-018
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275038417.1275039948.19276.gz
Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/28 02:20:17
s: talos-r3-fed-020
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275357988.1275359161.16152.gz
Rev3 Fedora 12 mozilla-central debug test mochitests-1/5 on 2010/05/31 19:06:28
s: talos-r3-fed-005
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275563954.1275564368.10377.gz
Rev3 Fedora 12x64 mozilla-central opt test mochitests-1/5 on 2010/06/03 04:19:14
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1276022690.1276024395.1726.gz
Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/06/08 11:44:50
s: talos-r3-fed64-025
Duplicate of this bug: 579544
Keywords: assertion
OS: Linux → All
Hardware: x86_64 → All
Depends on: 571798
Depends on: 589071
Hasn't occurred for a couple of months now, so the fixes in the bugs this bug depends on must have worked.  Marking fixed.
Assignee: nobody → kinetik
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.