Closed Bug 622211 Opened 14 years ago Closed 14 years ago

Chrome process intermittently aborts in content/media/test/crashtests/459439-1.html and crashes in content/media/test/crashtests/466607-1.html and content/media/test/crashtests/466945-1.html

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
Tracking Status
fennec 2.0b4+ ---

People

(Reporter: cjones, Assigned: kinetik)

References

Details

Attachments

(1 file, 1 obsolete file)

This is in a debug desktop build, running crashtest-ipc (bug 615386). REFTEST INFO | START file:///home/cjones/mozilla/mozilla-central/content/media/test/crashtests/459439-1.html WARNING: Subdocument container has no content: file /home/cjones/mozilla/mozilla-central/layout/base/nsDocumentViewer.cpp, line 2383 ++DOMWINDOW == 112 (0x1243b308) [serial = 753] [outer = 0x256d1b0] ************************************************************ * Call to xpconnect wrapped JSObject produced this error: * [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: file:///home/cjones/mozilla/ff-dbg/dist/bin/components/nsHandlerService.js :: <TOP_LEVEL> :: line 909" data: no] ************************************************************ REFTEST INFO | [CONTENT] OnDocumentLoad triggering WaitForTestEnd REFTEST INFO | Initializing canvas snapshot REFTEST INFO | [CONTENT] MakeProgress: STATE_WAITING_TO_FIRE_INVALIDATE_EVENT REFTEST INFO | [CONTENT] MakeProgress: waiting for MozAfterPaint REFTEST INFO | [CONTENT] AttrModifiedListener fired ************************************************************ * Call to xpconnect wrapped JSObject produced this error: * [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: file:///home/cjones/mozilla/ff-dbg/dist/bin/components/nsHandlerService.js :: <TOP_LEVEL> :: line 909" data: no] ************************************************************ REFTEST INFO | [CONTENT] AfterPaintListener in file:///home/cjones/mozilla/mozilla-central/content/media/test/crashtests/459439-1.html REFTEST INFO | Updating canvas for invalidation REFTEST INFO | No canvas to update! REFTEST INFO | [CONTENT] MakeProgress: STATE_WAITING_TO_FIRE_INVALIDATE_EVENT REFTEST INFO | [CONTENT] MakeProgress: waiting for MozAfterPaint REFTEST INFO | [CONTENT] AttrModifiedListener fired REFTEST INFO | [CONTENT] MakeProgress: STATE_WAITING_TO_FIRE_INVALIDATE_EVENT REFTEST INFO | [CONTENT] MakeProgress: waiting for MozAfterPaint Assertion 'pa_atomic_load(&(c)->_ref) >= 1' failed at pulse/context.c:1032, function pa_context_get_state(). Aborting. Assertion 'pthread_mutex_destroy(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:83, function pa_mutex_free(). Aborting. ###!!! [Child][SyncChannel] Error: Channel error: cannot send/recv WARNING: content process _exit()ing: file /home/cjones/mozilla/mozilla-central/dom/ipc/ContentChild.cpp, line 471 TEST-UNEXPECTED-FAIL | file:///home/cjones/mozilla/mozilla-central/content/media/test/crashtests/459439-1.html | Exited with code 1 during test run INFO | automation.py | Application ran for: 0:01:11.345283 INFO | automation.py | Reading PID log: /tmp/tmpEX2xsUpidlog ==> process 30987 launched child process 30998 INFO | automation.py | Checking for orphan process with PID: 30998 PROCESS-CRASH | file:///home/cjones/mozilla/mozilla-central/content/media/test/crashtests/459439-1.html | application crashed (minidump found) Neither MINIDUMP_STACKWALK nor MINIDUMP_STACKWALK_CGI is set, can't process dump. Fairly easy to reproduce for me, ~1/5 or so. Will see what valgrind has to say in a bit.
This bug likely affects maemo and probably android too, if it's a race condition or something like that in the remote audio code.
tracking-fennec: --- → ?
==16501== Thread 11: ==16501== Invalid read of size 8 ==16501== at 0xE335913: snd_pcm_poll_descriptors_revents (in /usr/lib/libasound.so.2.0.0) ==16501== by 0xE3396B8: ??? (in /usr/lib/libasound.so.2.0.0) ==16501== by 0xE33998C: ??? (in /usr/lib/libasound.so.2.0.0) ==16501== by 0xE3784D9: ??? (in /usr/lib/libasound.so.2.0.0) ==16501== by 0x6457D78: sa_stream_write (sydney_audio_alsa.c:260) ==16501== by 0x63F98F2: nsAudioStreamLocal::Write(void const*, unsigned int, int) (nsAudioStream.cpp:455) ==16501== by 0x6D91564: mozilla::dom::AudioWriteEvent::Run() (AudioParent.cpp:59) ==16501== by 0x702C554: nsThread::ProcessNextEvent(int, int*) (nsThread.cpp:626) ==16501== by 0x6FB90E3: NS_ProcessNextEvent_P(nsIThread*, int) (nsThreadUtils.cpp:250) ==16501== by 0x702B3E0: nsThread::ThreadFunc(void*) (nsThread.cpp:278) ==16501== by 0x983057A: _pt_root (ptthread.c:187) ==16501== by 0x4E339C9: start_thread (pthread_create.c:300) ==16501== Address 0x1cbb1dd8 is 392 bytes inside a block of size 440 free'd ==16501== at 0x4C270BD: free (vg_replace_malloc.c:366) ==16501== by 0x706613B: free (nsTraceMalloc.c:1303) ==16501== by 0xE339D1E: ??? (in /usr/lib/libasound.so.2.0.0) ==16501== by 0xE33AB22: snd_pcm_close (in /usr/lib/libasound.so.2.0.0) ==16501== by 0x6457C66: sa_stream_destroy (sydney_audio_alsa.c:223) ==16501== by 0x63F9490: nsAudioStreamLocal::Shutdown() (nsAudioStream.cpp:380) ==16501== by 0x6D90F64: mozilla::dom::AudioParent::Recv__delete__() (AudioParent.cpp:196) ==16501== by 0x6E0CEFD: mozilla::dom::PAudioParent::OnMessageReceived(IPC::Message const&) (PAudioParent.cpp:128) ==16501== by 0x6E1E86F: mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) (PContentParent.cpp:628) ==16501== by 0x6DC9FFE: mozilla::ipc::AsyncChannel::OnDispatchMessage(IPC::Message const&) (AsyncChannel.cpp:262) ==16501== by 0x6DD30EB: mozilla::ipc::RPCChannel::OnMaybeDequeueOne() (RPCChannel.cpp:438) ==16501== by 0x6DD8FCF: void DispatchToMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, bool (mozilla::ipc::RPCChannel::*)(), Tuple0 const&) (tuple.h:383) ==16501==
It looks to me like we RecvWrite(), fire off a Write task to thread 11 here, then soon thereafter happily destroy the libsydney stuff on the main thread from Recv__delete__(), while the write task is still being processed. How are those supposed to synchronize?
one would have hoped that the stream shutdown (sa_stream_destroy()) would have prevented this. I suppose libsydney isn't as threadsafe as you'd would hope. Basically what we have is: Thread A calling sa_stream_destroy(i) Thread B calling sa_stream_write(i) without any synchronization.
Attached patch patch v0 (obsolete) — Splinter Review
Stream writes are requeued on the parent audio thread, but the shutdown calls are not. This makes it possible to have pending writes queued and racing with shutdown. The fix is to also shunt AudioStream::Shutdown to the audio thread to restore the natural synchronization expected by the callers.
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
Attachment #501215 - Flags: review?(doug.turner)
tracking-fennec: ? → 2.0b4+
Comment on attachment 501215 [details] [diff] [review] patch v0 >+class AudioShutdownEvent : public nsRunnable >+{ >+ public: >+ AudioShutdownEvent(nsAudioStream* owner) >+ { >+ mOwner = owner; >+ } >+ >+ NS_IMETHOD Run() >+ { >+ mOwner->Shutdown(); >+ return NS_OK; >+ } >+ >+ private: >+ nsRefPtr<nsAudioStream> mOwner; >+}; have you seen roc's patch? It adds a similar event for shutting down the audio thread. Would you mind changing the name of this class to something like |AudioStreamShutdownEvent|? see http://hg.mozilla.org/mozilla-central/rev/29867a7bccf2 >- if (mStream) >- mStream->SetVolume(aVolume); >+ mStream->SetVolume(aVolume); > return true; > } ... > mStream = nsAudioStream::AllocateStream(); > if (mStream) I am fine with not testing for mStream, however.... we do have methods called from various places. If any place calls after the shutdown event is called, but before AudioParent is destroyed, we will crash. Quickly looking, I don't think that will happen. However, I'd feel more comfortable with NS_ASSERTIONS() or NS_ENSURE so that we can catch any new case and remove it. Also, remove/assert here too: http://mxr.mozilla.org/mozilla-central/source/dom/ipc/AudioParent.cpp#211
Attachment #501215 - Flags: review?(doug.turner) → review-
Attached patch patch v1Splinter Review
Renamed the stream shutdown event. I've also restored the mStream tests (and added them in more places), and generally reviewed what happens when there's an initialization error. The old code, and my earlier patch, both could've had problems. The right way to deal with initialization problems is by fixing bug 612799, but this at least makes a failed stream safe.
Attachment #501215 - Attachment is obsolete: true
Attachment #501511 - Flags: review?(doug.turner)
Oh, and I changed modelines to match the style of the file, rather than reformatting the entire file.
Attachment #501511 - Flags: review?(doug.turner) → review+
I also saw this on windows tinderbox: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1294818191.1294818480.24265.gz s: try-w32-slave30 TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/test/build/reftest/tests/content/media/test/crashtests/466607-1.html | Exited with code -1073741819 during test run PROCESS-CRASH | file:///e:/builds/moz2_slave/test/build/reftest/tests/content/media/test/crashtests/466607-1.html | application crashed (minidump found) Thread 23 (crashed) TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | missing output line for total leaks! TEST-UNEXPECTED-FAIL | tab process 4064 | automationutils.processLeakLog() | missing output line for total leaks! http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1294823249.1294823532.15996.gz s: try-w32-slave30 TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/test/build/reftest/tests/content/media/test/crashtests/466607-1.html | Exited with code -1073741819 during test run PROCESS-CRASH | file:///e:/builds/moz2_slave/test/build/reftest/tests/content/media/test/crashtests/466607-1.html | application crashed (minidump found) Thread 23 (crashed) TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | missing output line for total leaks! TEST-UNEXPECTED-FAIL | tab process 4064 | automationutils.processLeakLog() | missing output line for total leaks! Looks like the same issue from a quick glance.
OS: Linux → All
Hardware: x86_64 → All
Summary: Chrome process intermittently aborts in content/media/test/crashtests/459439-1.html → Chrome process intermittently aborts in content/media/test/crashtests/459439-1.html and crashes in content/media/test/crashtests/466607-1.html and content/media/test/crashtests/466945-1.html
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: