Note: There are a few cases of duplicates in user autocompletion which are being worked on.

Invalid read [@ snd_config_searcha_hooks] during shutdown

RESOLVED FIXED

Status

()

Core
Audio/Video
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: Jesse Ruderman, Assigned: kinetik)

Tracking

(Blocks: 1 bug, {valgrind})

Trunk
x86_64
Linux
valgrind
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Reporter)

Description

7 years ago
Created attachment 450955 [details]
valgrind output

This happens sometimes during shutdown.  I don't have steps to reproduce, but I'm pretty sure there's an ogg vorbis file involved.
(Assignee)

Comment 1

7 years ago
Looking at the implementation in libasound, snd_pcm_open/snd_pcm_close are not thread-safe.

I can't find any documentation that states what (if any) ALSA functionality *is* thread-safe, so sydneyaudio could be stepping on toes all over the place.
(Assignee)

Comment 2

7 years ago
This general problem is probably largely responsible for the random orange in bug 562004, bug 557432, and maybe others.
(Assignee)

Comment 3

7 years ago
Adding two locks to sydneyaudio may be sufficient: a library-level lock for snd_pcm_open/snd_pcm_close, and a per-stream lock for everything else.  This would require changing nsBuiltinDecoderStateMachine's logic back to the emulated-non-blocking approach we used to use, otherwise we'll be forced to hold locks around long-running calls to snd_pcm_writei.

Digging around more, mAudioMonitor should already protect most calls into sydneyaudio, so perhaps we only need a lock for the library and any other threading bugs can be attributed to incorrect use of mAudioMonitor (such as bug 560784).
(Reporter)

Updated

7 years ago
Blocks: 562004, 557432
(Assignee)

Comment 4

7 years ago
Created attachment 453634 [details] [diff] [review]
patch v0

(In reply to comment #3)
> Digging around more, mAudioMonitor should already protect most calls into
> sydneyaudio, so perhaps we only need a lock for the library and any other
> threading bugs can be attributed to incorrect use of mAudioMonitor (such as bug
> 560784).

The OS X and Win32 sydneyaudio implementations seem to already be thread-safe with respect to concurrent calls to write/get_position, so I think it actually does make sense to make the per-stream stuff thread safe.

This patch does that, but doesn't address the potential problem with blocking snd_pcm_writei calls vs calls to get_position.  The OS X and Win32 backends won't suffer from this problem, so I think I'll need to address it before asking for review.
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
(Assignee)

Comment 5

7 years ago
Created attachment 454312 [details] [diff] [review]
patch v1
Attachment #453634 - Attachment is obsolete: true
(Assignee)

Comment 6

7 years ago
Same as the last patch, but addresses the blocking snd_pcm_writei issue.  The key differences are:

Open the pcm device as non-blocking:

   if (snd_pcm_open(&s->output_unit, 
                    "default", 
                    SND_PCM_STREAM_PLAYBACK, 
-                   0) < 0) {
+                   SND_PCM_NONBLOCK) < 0) {

Handle non-blocking writes by dropping the stream lock, waiting for the pcm to become ready, and reacquiring the lock:

+      if (frames == -EAGAIN && snd_pcm_state(s->output_unit) == SND_PCM_STATE_RUNNING) {
+        pthread_mutex_unlock(&s->mutex);
+        snd_pcm_wait(s->output_unit, -1);
+        pthread_mutex_lock(&s->mutex);
+        continue;
+      }

I think this patch is good to go, but needs testing by more people on more machines.  I've tested Fedora 12 with PulseAudio and direct ALSA and it works fine there (under VMWare Fusion 3.1 and VirtualBox 3.2).
(Assignee)

Comment 7

7 years ago
Confirmation that it's not safe to call ALSA library functions for the same device concurrently: http://mailman.alsa-project.org/pipermail/alsa-devel/2010-June/028349.html

Unfortunately, I've found a problem with the patch: in my F12/VMWare Fusion 3.1 setup I'm seeing a livelock where snd_pcm_writei returns EAGAIN, snd_pcm_wait returns 1 (ready), and snd_pcm_writei still returns EAGAIN, so the write makes no progress.  I can usually reproduce this within 2-3 runs of 8 parallel sets of test_playback running.
(Assignee)

Comment 8

7 years ago
It looks like that hang is fixed in the current git head of alsa-plugins, most likely fixed by the patch discussed in bug 573924 comment 2.
(Assignee)

Updated

7 years ago
Blocks: 560784
Still seeing this on m-c of two days back
TEST_PATH=content/media/test/test_playback.html

I get the same valgrind errors as Jesse (comment #1).  Native
runs randomly either succeed, terminate with "out of memory" (!)
or terminate with a double free detected by glibc.
This is on Ubuntu 10.04 x86_64.
(Assignee)

Updated

7 years ago
Depends on: 574190
(Assignee)

Comment 10

7 years ago
(In reply to comment #5)
> Created attachment 454312 [details] [diff] [review]
> patch v1

This patch isn't right.  It looks like it's not safe to call snd_pcm_wait on one thread while another thread calls other snd_pcm_* functions.  I think the current patch also breaks drain, since in non-blocking mode the drain will return immediately.  The drain call needs to be modified to wait on the PCM device and check its status until it moves from DRAINING to whatever state it ends up in after draining.

Rather than try to fix everything at once and end up blocked on updating PulseAudio/ALSA on the test machines, I'll attach a patch that fixes this particular bug.
(Assignee)

Updated

7 years ago
Attachment #454312 - Attachment is obsolete: true
(Assignee)

Comment 11

7 years ago
Created attachment 468249 [details] [diff] [review]
patch v2
(Assignee)

Comment 12

7 years ago
Comment on attachment 468249 [details] [diff] [review]
patch v2

This adds a library-global lock that's used for creating and destroying streams.  It doesn't solve the general thread-safety problems we have with the current design, but it will prevent a bunch of crashes caused by asserts like the following:

Assertion 'pa_atomic_load(&(s)->_ref) >= 1' failed at pulse/stream.c:253, function pa_stream_unref(). Aborting.
Attachment #468249 - Flags: review?(chris.double)

Updated

7 years ago
Attachment #468249 - Flags: review?(chris.double) → review+
(Assignee)

Updated

7 years ago
Attachment #468249 - Flags: approval2.0?
Attachment #468249 - Flags: approval2.0? → approval2.0+
(Assignee)

Updated

7 years ago
Keywords: checkin-needed
http://hg.mozilla.org/mozilla-central/rev/2ba7ee65686d
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.