Closed Bug 495794 Opened 11 years ago Closed 11 years ago

Crash in xul.dll when viewing video [@ closeAudio ]

Categories

(Core :: Audio/Video, defect, critical)

x86
Windows XP
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla1.9.2a1

People

(Reporter: ecfbugzilla, Assigned: kinetik)

References

()

Details

(Keywords: crash, regression, verified1.9.1.1)

Crash Data

Attachments

(2 files, 2 obsolete files)

I experienced several crashes when viewing http://mozilla.seanmartell.com/OGV/ in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090601 Minefield/3.6a1pre. Some of the crash reports:

http://crash-stats.mozilla.com/report/index/f421aa5d-3207-4b7e-a169-0de962090601
http://crash-stats.mozilla.com/report/index/19703643-1b99-4205-8f2f-04f742090601

I couldn't find a way to reproduce it reliably - one crash happened while the video was still buffering, another when I wanted to replay it. Other times it played several times without any problems.
Regression range seems to be: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9e33f1b0a4dc&tochange=989bd3bc6cdf

Very difficult to reproduce, and the only thing I'm sure of is that I could reproduce a crash with a 30 March 2009 build.
Scrolling down with the scroll bar and repeatedly clicking on the start/pause button was the way to reproduce the crash.
Probably bug 485036?
Wladimir's stacks don't make much sense, but Jo's is:

closeAudio
sa_stream_destroy
nsAudioStream::Init
nsOggDecodeStateMachine::OpenAudioStream
nsOggDecodeStateMachine::StartPlayback
nsOggDecodeStateMachine::ResumePlayback

Which suggests sa_stream_open is failing so we're calling sa_stream_destroy when trying to clean up.
Flags: blocking1.9.1?
No way does this block, as per comment 3
Flags: wanted1.9.1.x+
Flags: blocking1.9.1?
Flags: blocking1.9.1-
I haven't been able to reproduce the crash yet.  If anyone who can reproduce it can do so with a debug build and get a good stack, that'd be super useful.  Also, noting any messages printed to stdout/stderr might be useful.

The reason the sa_stream_open fails is because the media linked from the bug URL has a Vorbis track which is 6 channels at 48kHz.  The existing code in sydney_audio_waveapi.c only supports 1 or 2 channel audio (due to passing a WAVEFORMAT rather than WAVEFORMATEX to waveOutOpen), so we bail out of sa_stream_open after waveOutOpen(..., WAVE_FORMAT_QUERY), and that control path appears to clean up after itself correctly.

I did notice that if waveOutOpen(..., WAVE_FORMAT_QUERY) succeeds but the real waveOutOpen fails that freeBlocks() isn't called, which looks suspicious but I'm not sure it's possible for this to happen.  It this case is being hit, "opening audio device for playback" will be printed to stdout sometime before the crash.
Assignee: nobody → kinetik
(In reply to comment #7)
> WAVEFORMAT rather than WAVEFORMATEX to waveOutOpen), so we bail out of

I mean "WAVEFORMATEX rather than WAVEFORMATEXTENSIBLE".
Blocks: 494120
No longer blocks: 494120
Attached patch possible fix v0 (obsolete) — Splinter Review
I still can't reproduce this, but this patch may fix the issue if I'm right about the cause.

I've pushed this to the try server, I'll post the link to the build when it's done so others can test the patch.
Try server builds are now available at: https://build.mozilla.org/tryserver-builds/mgregan@mozilla.com-try-b7bca10c08f/

Wladimir, Jo, Ria -- if you could try this and let me know if the problem is solved, that'd be great.  Thanks!
I was unable to reproduce this crash with the tryserver build. However, for some reason I couldn't reproduce it with the 20090602 Minefield build either.
(In reply to comment #2)
> Regression range seems to be:
> http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9e33f1b0a4dc&tochange=989bd3bc6cdf

There is an identical crash report with a 20090519 Minefield build:

http://crash-stats.mozilla.com/report/index/027ecb83-a38d-49fd-8b12-43c992090527

So this issue is probably older.
The earliest failure I could find is:

http://crash-stats.mozilla.com/report/index/7b60a420-5a4e-4586-815a-90af72090408

I couldn't see anything obvious in the mozilla-central checkin window around the time of that crash.

Looking through the crash reports, most of them have bogus looking stacks where the entire stack is PR_DestroyMonitor -> closeAudio, which isn't possible.
matthew: if you don't see a source file name, please assume that there are no symbols for the library and the function name you see is just the closest exported symbol (in a certain fixed direction).
In any case I still can reproduce the crash with the latest hourly on Windows XP.
With the latest trunk hourly I saw 4 crashes (3 at the end of buffering), but not with the tryserver build, which can mean 3 things:

1. coincidence;
2. it is improved;
3. it is fixed.
- the latest 3.5 still crashes
- the latest 3.6 also crashes
- the tryserver build doesn't crash

The way I can reliably reproduce, is by clearing the cache, restarting and then download http://mozilla.seanmartell.com/OGV/ directly. Reloading, even with the shift key held down, doesn't work (I'm behind a proxyserver, it's probably caching).
Thanks Jo, following those instructions allowed me to reproduce this on XP pretty easily.  Having looked at this in a debugger, the cause of the failure is more obvious than what I guessed.  In the common failure cases, waveOutOpen fails (e.g. due to an unsupported format, as seen with this URL), which calls freeBlocks and then returns with an error.  This leaves the sa_stream_t with a waveBlocks pointer that points to freed memory.  When the caller then tries to clean up by calling sa_stream_destroy, closeAudio is called which attempts to access each of the blocks and call waveOutUnprepareHeader on them if they have the WHDR_PREPARED flag set.  Since, in this case, that memory has previously been freed by the failure path of openAudio, we either access freed memory or crash attempting to access unmapped memory.  I'm actually surprised this was so hard to reproduce, given this.

Bug 485433 updated libsydneyaudio, which pulled in a cleanup to closeAudio I had submitted upstream.  :-(  The old closeAudio code would usually have bailed out before attempting to access waveBlocks because waveOutReset would return MMSYSERR_INVALHANDLE.

The patch I attached here fixes the crash because closeAudio returns early when hWaveOut failed to initialize in openAudio.  It also changed the initialization order of openAudio so that waveBlocks is only allocated one waveOutOpen is successful.  I did notice an additional error path that's not handled correctly, so I'll fix that and post an updated patch shortly.
Blocks: 485433
Downloaded the 3.5 RTM on windows 7 64bit this morning and get a 100% reproducible crash whenever trying to play the video on the "Welcome to firefox" page:

xul!closeAudio+0x42
xul!sa_stream_destroy+0x14
xul!nsAudioStream::Init+0x3f
xul!nsOggDecodeStateMachine::OpenAudioStream+0x43
vfbasics!AVrfpTlsGetValue+0x49
xul!nsOggDecodeStateMachine::ResumePlayback+0x1b
xul!nsOggDecodeStateMachine::PlayFrame+0x3c
xul!nsOggDecodeStateMachine::Run+0x7a0
xul!nsThread::ProcessNextEvent+0x253
xul!nsThread::ThreadFunc+0x6e
nspr4!_PR_NativeRunThread+0xdb
nspr4!pr_root+0xd
MOZCRT19!_callthreadstartex+0x48
MOZCRT19!_threadstartex+0x66
vfbasics!AVrfpStandardThreadFunction+0x2f
kernel32!BaseThreadInitThunk+0xe
ntdll!__RtlUserThreadStart+0x70
ntdll!_RtlUserThreadStart+0x1b

without app verifier, I get fatal heap exception for heap_failure_block_not_busy (which is a double free IIRC). I've had it crash enough times that the fault tolerant heap enabled on firefox.exe (but still doesn't solve the crashing).
Duplicate of this bug: 501579
I also get an 100% reproducible crash from the Firefox 3.5 RTM when clicking on the video on the welcome page (crash is about 0.5 secs after clicking the video).

Also get this on the hacks.mozilla.org pop art demo (http://hacks.mozilla.org/2009/06/pop-art-video/).  I haven't yet found a video that does work, but I haven't searched extensively.

Stack trace in bug 501579.
Flags: blocking1.9.1.1?
This is bad, people are crashing on the video in the start page. What's holding this bug up?
Attached patch patch v1Splinter Review
NULL out waveBlocks whenever it is freed.  NULL check in closeAudio before unpreparing blocks.  Test the result of the final waveOutOpen and free memory and return an error on failure.  Also remove the printfs from the success path (resolves bug 498186).

Mochitests pass with audio enabled and disabled.  Includes a crashtest.
Attachment #381188 - Attachment is obsolete: true
Attachment #386441 - Flags: review?(chris.double)
This is currently the #41 topcrash for Firefox 3.5.
Blocks: 498186
Attachment #386441 - Flags: review?(chris.double) → review+
I suspect this is happening far more frequently than the topcrash stats are showing - I certainly don't get a dialogue allowing me to report the crash.
Pushed http://hg.mozilla.org/mozilla-central/rev/e037975c9472
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
seems to be fixed for me (using the testcase in comment 17)

it doesn't crash anymore in
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090702 Minefield/3.6a1pre
(http://hg.mozilla.org/mozilla-central/rev/1c6ec6110a0f)

while it did crash yesterday, and it still does in 3.5 or the current 1.9.1 branch
Keywords: regression
Marked as blocking? 1.9.1.1, don't know what the triage or schedule for that release is yet.
Blocking on this because it's a topcrash and has happened on our start page, which is bad.

Matthew, does this patch apply to 1.9.1? Can you please request approval1.9.1.1 on it, if so? (Or attach a patch that does apply and request approval on it...)
Flags: blocking1.9.1.1? → blocking1.9.1.1+
Attached patch patch v1 against 1.9.1 (obsolete) — Splinter Review
It doesn't apply as is because the layout of content/media changed, and there were some minor conflicts in media/libsydneyaudio due to trunk-only patches.

Here's a version for 1.9.1.  I'll request approval on it once I've built and tested it.
With crashtest file included.
Attachment #387963 - Attachment is obsolete: true
Attachment #387966 - Flags: approval1.9.1.1?
Comment on attachment 387966 [details] [diff] [review]
patch v1.1 against 1.9.1

Fixes a common crash (it was a topcrash, but it has since fallen off of the top 100 list) for Windows users with no hardware or when trying to play audio that the sound hardware doesn't support.  High visibility because it can happen with the video embedded in the firstrun page.

Patch is simple, has been on trunk for a week, and includes a crashtest.
Comment on attachment 387966 [details] [diff] [review]
patch v1.1 against 1.9.1

Approved for 1.9.1.1. a=ss for release-drivers (assuming you don't include the patch of the patch in this diff ;).
Attachment #387966 - Flags: approval1.9.1.1? → approval1.9.1.1+
Thanks!

The inclusion of the patch is intentional.   sydneyaudio is an upstream library, so any changes we apply to it are also included in the media/sydneyaudio directory as a patch so that it's easy to push patches upstream and reapply patches when syncing with upstream.
Now it seems to be fixed in 1.9.1.1 too (using the testcase in comment 17)

crashed in yesterday's build:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.1pre) Gecko/20090713 Shiretoko/3.5.1pre

fixed in today's build :
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.1pre) Gecko/20090714 Shiretoko/3.5.1pre

Thanks !
Thanks Jo. I also tried the given testcases and cannot crash a recent 1.9.1.1 and trunk build. Marking verified fixed.
Status: RESOLVED → VERIFIED
Flags: in-testsuite?
Target Milestone: --- → mozilla1.9.2a1
Crash Signature: [@ closeAudio ]
You need to log in before you can comment on or make changes to this bug.