Last Comment Bug 495794 - Crash in xul.dll when viewing video [@ closeAudio ]
: Crash in xul.dll when viewing video [@ closeAudio ]
Status: VERIFIED FIXED
: crash, regression, verified1.9.1.1
Product: Core
Classification: Components
Component: Audio/Video (show other bugs)
: Trunk
: x86 Windows XP
: -- critical with 2 votes (vote)
: mozilla1.9.2a1
Assigned To: Matthew Gregan [:kinetik]
:
Mentors:
http://mozilla.seanmartell.com/OGV/
: 501579 (view as bug list)
Depends on:
Blocks: 485433 498186
  Show dependency treegraph
 
Reported: 2009-06-01 09:33 PDT by Wladimir Palant
Modified: 2011-06-09 14:58 PDT (History)
16 users (show)
mbeltzner: blocking1.9.1-
samuel.sidler+old: blocking1.9.1.1+
mbeltzner: wanted1.9.1.x+
hskupin: in‑testsuite?
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
possible fix v0 (3.03 KB, patch)
2009-06-02 16:50 PDT, Matthew Gregan [:kinetik]
no flags Details | Diff | Review
patch v1 (14.33 KB, patch)
2009-07-01 21:39 PDT, Matthew Gregan [:kinetik]
cajbir.bugzilla: review+
Details | Diff | Review
patch v1 against 1.9.1 (9.81 KB, patch)
2009-07-10 15:50 PDT, Matthew Gregan [:kinetik]
no flags Details | Diff | Review
patch v1.1 against 1.9.1 (14.33 KB, patch)
2009-07-10 15:57 PDT, Matthew Gregan [:kinetik]
samuel.sidler+old: approval1.9.1.1+
Details | Diff | Review

Description Wladimir Palant 2009-06-01 09:33:57 PDT
I experienced several crashes when viewing http://mozilla.seanmartell.com/OGV/ in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090601 Minefield/3.6a1pre. Some of the crash reports:

http://crash-stats.mozilla.com/report/index/f421aa5d-3207-4b7e-a169-0de962090601
http://crash-stats.mozilla.com/report/index/19703643-1b99-4205-8f2f-04f742090601

I couldn't find a way to reproduce it reliably - one crash happened while the video was still buffering, another when I wanted to replay it. Other times it played several times without any problems.
Comment 2 Ria Klaassen (not reading all bugmail) 2009-06-01 14:57:26 PDT
Regression range seems to be: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9e33f1b0a4dc&tochange=989bd3bc6cdf

Very difficult to reproduce, and the only thing I'm sure of is that I could reproduce a crash with a 30 March 2009 build.
Comment 3 Ria Klaassen (not reading all bugmail) 2009-06-01 15:00:55 PDT
Scrolling down with the scroll bar and repeatedly clicking on the start/pause button was the way to reproduce the crash.
Comment 4 Matthew Gregan [:kinetik] 2009-06-01 15:15:09 PDT
Probably bug 485036?
Comment 5 Matthew Gregan [:kinetik] 2009-06-01 15:26:13 PDT
Wladimir's stacks don't make much sense, but Jo's is:

closeAudio
sa_stream_destroy
nsAudioStream::Init
nsOggDecodeStateMachine::OpenAudioStream
nsOggDecodeStateMachine::StartPlayback
nsOggDecodeStateMachine::ResumePlayback

Which suggests sa_stream_open is failing so we're calling sa_stream_destroy when trying to clean up.
Comment 6 Mike Beltzner [:beltzner, not reading bugmail] 2009-06-01 21:19:25 PDT
No way does this block, as per comment 3
Comment 7 Matthew Gregan [:kinetik] 2009-06-01 23:56:34 PDT
I haven't been able to reproduce the crash yet.  If anyone who can reproduce it can do so with a debug build and get a good stack, that'd be super useful.  Also, noting any messages printed to stdout/stderr might be useful.

The reason the sa_stream_open fails is because the media linked from the bug URL has a Vorbis track which is 6 channels at 48kHz.  The existing code in sydney_audio_waveapi.c only supports 1 or 2 channel audio (due to passing a WAVEFORMAT rather than WAVEFORMATEX to waveOutOpen), so we bail out of sa_stream_open after waveOutOpen(..., WAVE_FORMAT_QUERY), and that control path appears to clean up after itself correctly.

I did notice that if waveOutOpen(..., WAVE_FORMAT_QUERY) succeeds but the real waveOutOpen fails that freeBlocks() isn't called, which looks suspicious but I'm not sure it's possible for this to happen.  It this case is being hit, "opening audio device for playback" will be printed to stdout sometime before the crash.
Comment 8 Matthew Gregan [:kinetik] 2009-06-02 00:05:39 PDT
(In reply to comment #7)
> WAVEFORMAT rather than WAVEFORMATEX to waveOutOpen), so we bail out of

I mean "WAVEFORMATEX rather than WAVEFORMATEXTENSIBLE".
Comment 9 Matthew Gregan [:kinetik] 2009-06-02 16:50:49 PDT
Created attachment 381188 [details] [diff] [review]
possible fix v0

I still can't reproduce this, but this patch may fix the issue if I'm right about the cause.

I've pushed this to the try server, I'll post the link to the build when it's done so others can test the patch.
Comment 10 Matthew Gregan [:kinetik] 2009-06-02 18:38:33 PDT
Try server builds are now available at: https://build.mozilla.org/tryserver-builds/mgregan@mozilla.com-try-b7bca10c08f/

Wladimir, Jo, Ria -- if you could try this and let me know if the problem is solved, that'd be great.  Thanks!
Comment 11 Wladimir Palant 2009-06-02 23:03:47 PDT
I was unable to reproduce this crash with the tryserver build. However, for some reason I couldn't reproduce it with the 20090602 Minefield build either.
Comment 12 Wladimir Palant 2009-06-02 23:11:16 PDT
(In reply to comment #2)
> Regression range seems to be:
> http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9e33f1b0a4dc&tochange=989bd3bc6cdf

There is an identical crash report with a 20090519 Minefield build:

http://crash-stats.mozilla.com/report/index/027ecb83-a38d-49fd-8b12-43c992090527

So this issue is probably older.
Comment 13 Matthew Gregan [:kinetik] 2009-06-03 00:26:47 PDT
The earliest failure I could find is:

http://crash-stats.mozilla.com/report/index/7b60a420-5a4e-4586-815a-90af72090408

I couldn't see anything obvious in the mozilla-central checkin window around the time of that crash.

Looking through the crash reports, most of them have bogus looking stacks where the entire stack is PR_DestroyMonitor -> closeAudio, which isn't possible.
Comment 14 timeless 2009-06-03 01:05:05 PDT
matthew: if you don't see a source file name, please assume that there are no symbols for the library and the function name you see is just the closest exported symbol (in a certain fixed direction).
Comment 15 Ria Klaassen (not reading all bugmail) 2009-06-03 05:31:30 PDT
In any case I still can reproduce the crash with the latest hourly on Windows XP.
Comment 16 Ria Klaassen (not reading all bugmail) 2009-06-03 07:49:16 PDT
With the latest trunk hourly I saw 4 crashes (3 at the end of buffering), but not with the tryserver build, which can mean 3 things:

1. coincidence;
2. it is improved;
3. it is fixed.
Comment 17 Jo Hermans 2009-06-03 09:12:51 PDT
- the latest 3.5 still crashes
- the latest 3.6 also crashes
- the tryserver build doesn't crash

The way I can reliably reproduce, is by clearing the cache, restarting and then download http://mozilla.seanmartell.com/OGV/ directly. Reloading, even with the shift key held down, doesn't work (I'm behind a proxyserver, it's probably caching).
Comment 18 Matthew Gregan [:kinetik] 2009-06-04 21:13:05 PDT
Thanks Jo, following those instructions allowed me to reproduce this on XP pretty easily.  Having looked at this in a debugger, the cause of the failure is more obvious than what I guessed.  In the common failure cases, waveOutOpen fails (e.g. due to an unsupported format, as seen with this URL), which calls freeBlocks and then returns with an error.  This leaves the sa_stream_t with a waveBlocks pointer that points to freed memory.  When the caller then tries to clean up by calling sa_stream_destroy, closeAudio is called which attempts to access each of the blocks and call waveOutUnprepareHeader on them if they have the WHDR_PREPARED flag set.  Since, in this case, that memory has previously been freed by the failure path of openAudio, we either access freed memory or crash attempting to access unmapped memory.  I'm actually surprised this was so hard to reproduce, given this.

Bug 485433 updated libsydneyaudio, which pulled in a cleanup to closeAudio I had submitted upstream.  :-(  The old closeAudio code would usually have bailed out before attempting to access waveBlocks because waveOutReset would return MMSYSERR_INVALHANDLE.

The patch I attached here fixes the crash because closeAudio returns early when hWaveOut failed to initialize in openAudio.  It also changed the initialization order of openAudio so that waveBlocks is only allocated one waveOutOpen is successful.  I did notice an additional error path that's not handled correctly, so I'll fix that and post an updated patch shortly.
Comment 19 ari 2009-06-30 10:07:59 PDT
Downloaded the 3.5 RTM on windows 7 64bit this morning and get a 100% reproducible crash whenever trying to play the video on the "Welcome to firefox" page:

xul!closeAudio+0x42
xul!sa_stream_destroy+0x14
xul!nsAudioStream::Init+0x3f
xul!nsOggDecodeStateMachine::OpenAudioStream+0x43
vfbasics!AVrfpTlsGetValue+0x49
xul!nsOggDecodeStateMachine::ResumePlayback+0x1b
xul!nsOggDecodeStateMachine::PlayFrame+0x3c
xul!nsOggDecodeStateMachine::Run+0x7a0
xul!nsThread::ProcessNextEvent+0x253
xul!nsThread::ThreadFunc+0x6e
nspr4!_PR_NativeRunThread+0xdb
nspr4!pr_root+0xd
MOZCRT19!_callthreadstartex+0x48
MOZCRT19!_threadstartex+0x66
vfbasics!AVrfpStandardThreadFunction+0x2f
kernel32!BaseThreadInitThunk+0xe
ntdll!__RtlUserThreadStart+0x70
ntdll!_RtlUserThreadStart+0x1b

without app verifier, I get fatal heap exception for heap_failure_block_not_busy (which is a double free IIRC). I've had it crash enough times that the fault tolerant heap enabled on firefox.exe (but still doesn't solve the crashing).
Comment 20 timeless 2009-07-01 01:57:22 PDT
*** Bug 501579 has been marked as a duplicate of this bug. ***
Comment 21 Steffan Corley 2009-07-01 06:02:52 PDT
I also get an 100% reproducible crash from the Firefox 3.5 RTM when clicking on the video on the welcome page (crash is about 0.5 secs after clicking the video).

Also get this on the hacks.mozilla.org pop art demo (http://hacks.mozilla.org/2009/06/pop-art-video/).  I haven't yet found a video that does work, but I haven't searched extensively.

Stack trace in bug 501579.
Comment 22 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-07-01 17:07:15 PDT
This is bad, people are crashing on the video in the start page. What's holding this bug up?
Comment 23 Matthew Gregan [:kinetik] 2009-07-01 21:39:54 PDT
Created attachment 386441 [details] [diff] [review]
patch v1

NULL out waveBlocks whenever it is freed.  NULL check in closeAudio before unpreparing blocks.  Test the result of the final waveOutOpen and free memory and return an error on failure.  Also remove the printfs from the success path (resolves bug 498186).

Mochitests pass with audio enabled and disabled.  Includes a crashtest.
Comment 24 Matthew Gregan [:kinetik] 2009-07-01 21:40:59 PDT
This is currently the #41 topcrash for Firefox 3.5.
Comment 25 Matthew Gregan [:kinetik] 2009-07-01 21:50:53 PDT
Upstream bug: https://trac.annodex.net/ticket/491
Comment 26 Steffan Corley 2009-07-02 02:24:28 PDT
I suspect this is happening far more frequently than the topcrash stats are showing - I certainly don't get a dialogue allowing me to report the crash.
Comment 27 Matthew Gregan [:kinetik] 2009-07-02 03:02:01 PDT
Pushed http://hg.mozilla.org/mozilla-central/rev/e037975c9472
Comment 28 Jo Hermans 2009-07-02 16:05:34 PDT
seems to be fixed for me (using the testcase in comment 17)

it doesn't crash anymore in
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090702 Minefield/3.6a1pre
(http://hg.mozilla.org/mozilla-central/rev/1c6ec6110a0f)

while it did crash yesterday, and it still does in 3.5 or the current 1.9.1 branch
Comment 29 Christopher Blizzard (:blizzard) 2009-07-06 09:34:09 PDT
Marked as blocking? 1.9.1.1, don't know what the triage or schedule for that release is yet.
Comment 30 Samuel Sidler (old account; do not CC) 2009-07-10 14:10:10 PDT
Blocking on this because it's a topcrash and has happened on our start page, which is bad.

Matthew, does this patch apply to 1.9.1? Can you please request approval1.9.1.1 on it, if so? (Or attach a patch that does apply and request approval on it...)
Comment 31 Matthew Gregan [:kinetik] 2009-07-10 15:50:56 PDT
Created attachment 387963 [details] [diff] [review]
patch v1 against 1.9.1

It doesn't apply as is because the layout of content/media changed, and there were some minor conflicts in media/libsydneyaudio due to trunk-only patches.

Here's a version for 1.9.1.  I'll request approval on it once I've built and tested it.
Comment 32 Matthew Gregan [:kinetik] 2009-07-10 15:57:55 PDT
Created attachment 387966 [details] [diff] [review]
patch v1.1 against 1.9.1

With crashtest file included.
Comment 33 Matthew Gregan [:kinetik] 2009-07-10 18:09:10 PDT
Comment on attachment 387966 [details] [diff] [review]
patch v1.1 against 1.9.1

Fixes a common crash (it was a topcrash, but it has since fallen off of the top 100 list) for Windows users with no hardware or when trying to play audio that the sound hardware doesn't support.  High visibility because it can happen with the video embedded in the firstrun page.

Patch is simple, has been on trunk for a week, and includes a crashtest.
Comment 34 Samuel Sidler (old account; do not CC) 2009-07-13 13:07:58 PDT
Comment on attachment 387966 [details] [diff] [review]
patch v1.1 against 1.9.1

Approved for 1.9.1.1. a=ss for release-drivers (assuming you don't include the patch of the patch in this diff ;).
Comment 35 Matthew Gregan [:kinetik] 2009-07-13 14:56:50 PDT
Thanks!

The inclusion of the patch is intentional.   sydneyaudio is an upstream library, so any changes we apply to it are also included in the media/sydneyaudio directory as a patch so that it's easy to push patches upstream and reapply patches when syncing with upstream.
Comment 36 Matthew Gregan [:kinetik] 2009-07-13 19:25:02 PDT
Pushed http://hg.mozilla.org/releases/mozilla-1.9.1/rev/7f375d928283
Comment 37 Jo Hermans 2009-07-14 07:55:52 PDT
Now it seems to be fixed in 1.9.1.1 too (using the testcase in comment 17)

crashed in yesterday's build:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.1pre) Gecko/20090713 Shiretoko/3.5.1pre

fixed in today's build :
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.1pre) Gecko/20090714 Shiretoko/3.5.1pre

Thanks !
Comment 38 Henrik Skupin (:whimboo) 2009-07-15 06:16:17 PDT
Thanks Jo. I also tried the given testcases and cannot crash a recent 1.9.1.1 and trunk build. Marking verified fixed.

Note You need to log in before you can comment on or make changes to this bug.