Closed Bug 487519 Opened 11 years ago Closed 11 years ago

Crash in vorbis_synthesis [@ 0x11ae1c8a]

Categories

(Core :: Audio/Video, defect, critical)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla1.9.2a1

People

(Reporter: kinetik, Assigned: cajbir)

References

()

Details

(Keywords: crash, verified1.9.1, Whiteboard: [sg: ?])

Crash Data

Attachments

(7 files, 1 obsolete file)

The bug URL crashes in a current trunk build and 3.1 beta 3.  It works fine in VLC and QT Player with XiphQT installed.

(gdb) fr
#0  0x11ae1c8a in vorbis_synthesis (vb=0x1d4ac330, op=0xb04d4958) at /Users/kinetik/work/mozilla-central/mozilla/media/libvorbis/lib/vorbis_synthesis.c:49
49	  vb->W=ci->mode_param[mode]->blockflag;
(gdb) p mode
$4 = 0
(gdb) p ci->mode_param[mode]
$5 = (vorbis_info_mode *) 0x0

Full stack:
#0  0x11ae1c8a in vorbis_synthesis (vb=0x1d4ac330, op=0xb04d4958) at /Users/kinetik/work/mozilla-central/mozilla/media/libvorbis/lib/vorbis_synthesis.c:49
#1  0x11ab1e0e in fs_vorbis_decode (fsound=0x1d41cc40, buf=0x1c463170 "??oN?\021wG*?w??{??w\"???\a?,L\004`?L??1?e(kEU?^?2ZU?\nFQ?Yi?'YQEA]?\0322\n???C??bBK\016CG??\f\025?,\026\021", bytes=84) at /Users/kinetik/work/mozilla-central/mozilla/media/libfishsound/src/libfishsound/fishsound_vorbis.c:158
#2  0x11ab1154 in fish_sound_decode (fsound=0x1d41cc40, buf=0x1c463170 "??oN?\021wG*?w??{??w\"???\a?,L\004`?L??1?e(kEU?^?2ZU?\nFQ?Yi?'YQEA]?\0322\n???C??bBK\016CG??\f\025?,\026\021", bytes=84) at /Users/kinetik/work/mozilla-central/mozilla/media/libfishsound/src/libfishsound/fishsound_decode.c:117
#3  0x11ab6a09 in oggplay_callback_audio (oggz=0xd8f800, op=0x1c3c3550, serialno=2091143420, user_data=0x1c460220) at /Users/kinetik/work/mozilla-central/mozilla/media/liboggplay/src/liboggplay/oggplay_callback.c:368
#4  0x11ac1480 in oggz_read_deliver_packet (elem=0x1c3c3550) at /Users/kinetik/work/mozilla-central/mozilla/media/liboggz/src/liboggz/oggz_read.c:300
#5  0x11ac06b4 in oggz_dlist_deliter (dlist=0x1c45d070, func=0x11ac1381 <oggz_read_deliver_packet>) at /Users/kinetik/work/mozilla-central/mozilla/media/liboggz/src/liboggz/oggz_dlist.c:176
#6  0x11ac1955 in oggz_read_sync (oggz=0xd8f800) at /Users/kinetik/work/mozilla-central/mozilla/media/liboggz/src/liboggz/oggz_read.c:458
#7  0x11ac1e1b in oggz_read (oggz=0xd8f800, n=8192) at /Users/kinetik/work/mozilla-central/mozilla/media/liboggz/src/liboggz/oggz_read.c:597
#8  0x11ab5b4b in oggplay_step_decoding (me=0x1d481760) at /Users/kinetik/work/mozilla-central/mozilla/media/liboggplay/src/liboggplay/oggplay.c:662
#9  0x11aa51b2 in nsOggDecodeStateMachine::DecodeFrame (this=0x1d4fbea0) at /Users/kinetik/work/mozilla-central/mozilla/content/media/video/src/nsOggDecoder.cpp:535
#10 0x11aa931c in nsOggDecodeStateMachine::Run (this=0x1d4fbea0) at /Users/kinetik/work/mozilla-central/mozilla/content/media/video/src/nsOggDecoder.cpp:921
Another video on the same weblog crashes in the same way: http://podcasts.linux-foundation.org/ogg/LinuxTools/valgrind-demo-2.ogg

Another data point: if I use oggz-rip to split out the Theora and Vorbis content, both play fine individually.  The crash only occurs with the original file.
Looks like a duplicate of bug 481933 (fixed on trunk).
Severity: normal → critical
Keywords: crash
Ah, yes.  Our liboggplay doesn't contain the patch in bug 481933 comment 4, so that bug probably shouldn't have been closed until we got that part of the fix in place.  My mistake.
Blocks: 481933
It looks like liboggplay upstream doesn't contain that patch either, so CCing Viktor.
see bug 481933 comment 8 => do not integrate the patch in bug 481933 comment 4 !
Summary: Crash in vorbis_synthesis @ 0x11ae1c8a → Crash in vorbis_synthesis [@ 0x11ae1c8a]
OS: Mac OS X → All
Flags: blocking1.9.2?
Flags: blocking1.9.1?
This does crash on trunk but does not crash in oggplayer (command line liboggplay client). We need to work out what the difference is between what oggplayer is doing and what trunk is doing.
Flags: blocking1.9.1? → blocking1.9.1+
We should try to get an idea of what's going on here.
I put a list of the crash urls we have from firerox 3.5 beta4 data at http://people.mozilla.com/~chofmann/crash-data/vorbis-crashes.txt

I guess we need to figure out if other bugs need to be spun off from this list or if the are basically the same problem with variations in the stacksignatures and crash locations.

I reproduced this one with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b5pre) Gecko/20090515 Shiretoko/3.5b5pre

 vorbis_synthesis	0x146acb7	200905110110	17	22
	http://en.wikipedia.org/wiki/Come_as_You_Are_(Nirvana_song)
	http://crash-stats.mozilla.com/report/index/84e54207-4e55-4b36-835e-6ac082090511
also reproduced "smells like teen spirit" 
details at
http://crash-stats.mozilla.com/report/index/370f668b-ad12-4ef7-85c4-ad74d2090518
Bug 493331 is a similar crash, perhaps a dup. Two of the crashing files there use floor type 0 vorbis (or at least the url indicates such). I gather from http://www.hydrogenaudio.org/forums/lofiversion/index.php/t46805.html that floor type 0 vorbis is an older version of vorbis that is not supported by all decoders. If someone has a way to tell if a ogg file uses floor type 0 we could see if this was the common denominator of all these files.
Using the list chris hofmann attached, these are the direct urls to the ogg files that crash for me on Windows trunk. linux.xjtuns.cn was down, so I couldn't check that entry.
Attachment #378227 - Attachment mime type: application/octet-stream → text/plain
The files that have vorbis_synthesis in the stack trace seem to be crashing Firefox trunk, but they don't crash the other liboggplay based players. We're trying to track down what the issue is. 

The original problem from bug 481933 was fixed in liboggplay and liboggz. We don't have the patch for liboggplay yet (from  liboggplay git repository commit 3602bf) in our tree, but even with this applied, we still get the crash. The fix in liboggplay and liboggz did fix the crash in the other players.

We're still trying to track down what the difference is.
Progress! The main difference between oggplayer (command line player that uses liboggplay) and Firefox is that Firefox calls oggplay_get_duration after reading the Ogg metadata to get the duration of the stream. This causes a number of seeks to occur using liboggz routines.

if I remove the call to oggplay_get_duration and the oggplay_seek immediately after then oggplay_step_decode returns the error E_OGGPLAY_BAD_INPUT instead of crashing. This is how it is supposed to work. Something about doing the duration call is causing problems.
Trac ticket 476 raised for this issue in the liboggplay bug system:

https://trac.annodex.net/ticket/476

The error occurs when doing a seek call before the first call to oggplay_step_decode.
Are we going to take a local patch to make code freeze today? I'm a bit alarmed that we don't have an assignee here, but I may be over-reading that. :)
Assignee: nobody → chris.double
Chris and I have discussed possible workarounds in our code but we're not comfortable with them because they introduce new risks and quite possibly leave the underlying bug reachable in other ways. We have shown that it is an upstream bug, and those maintainers are aware and (I think) working on it. I only made this blocking since it appeared to be fixed upstream and we needed to investigate if this was our problem. Now we know it isn't, I'm happy to take it off the blocking list again.
Flags: blocking1.9.1+ → wanted1.9.1+
It has come to our attention that this bug may be the cause of a large number of potentially exploitable crashes.

Blocking until we're convinced that it shouldn't. Chris: you should probably consider this your highest priority.
Flags: blocking1.9.1+
Whiteboard: [sg: ?]
We're waiting for a fix from upstream for this. They're working on it.
Depends on: 495129
Transferring blocking to 495129.
Flags: wanted1.9.1+
Flags: blocking1.9.2?
Flags: blocking1.9.1+
Attached patch Fix (obsolete) — Splinter Review
This fixes the crash with the Beethoven file mentioned in comment 20. It also fixes the crash from the file in bug 481933.

The issue is that when a seek is done the information that the ogg libraries has about the error in the file is lost. In this fix I decode a frame before doing the seek so I can get the error code and handle it.

Note that the underlying problem whereby the ogg libraries lose the error information on the first seek before a frame is decoded is still there and upstream is working on that. This fix however ensures that the bug in the library is not exposed.
Attachment #380045 - Flags: superreview?(roc)
Attachment #380045 - Flags: review?(roc)
Attachment #380045 - Flags: superreview?(roc)
Attachment #380045 - Flags: superreview+
Attachment #380045 - Flags: review?(roc)
Attachment #380045 - Flags: review+
I'm sure Shaver would block for this fix.
Flags: blocking1.9.1+
Blocks: 493331
Whiteboard: [sg: ?] → [sg: ?][needs 191 landing]
Whiteboard: [sg: ?][needs 191 landing] → [sg: ?]
I'm getting a test failure on test_duration1 with this patch - investigating.
It looks like decoding the first frame before the call to get the duration results in the duration call returning a different result.

Chris, do the recent changes to how duration is computed require the oggplay_get_duration call to happen before any frames are decoded?
(In reply to comment #25)
> It looks like decoding the first frame before the call to get the duration
> results in the duration call returning a different result.
> 
> Chris, do the recent changes to how duration is computed require the
> oggplay_get_duration call to happen before any frames are decoded?

No, my changes to duration don't affect the underlying oggplay layer. oggplay_get_duration() should really be called "oggplay_get_end_time()" or something similar. Try my liboggz patches in bug 494305, they affect how the underlying "duration" is calculated. One of those patches changes test_duration1.html as I start to get (what I think is) the correct duration returned.
Chris, any luck hunting this down?
To be clear, this is one of the last few blockers for this release.  If at all possible, we need to stay on this until it's done, even over the weekend.
Unfortunately our workaround triggers a(nother) bug in liboggplay that causes test failures on a couple of our tests. I have a workaround for that too, but I don't know why it works. But it does and all tests pass and none of the .ogg files crash. I suggest taking that until we get a real fix for the underlying issues from the liboggplay maintainers.
Attached patch Fix 2Splinter Review
Workaround as mentioned in comment 29. This fixes crashes in remaining vorbis files that I've tested and fixes test failures from the previous patch. I suggest taking this until upstream maintainers fix the issues these workarounds are for.
Attachment #380045 - Attachment is obsolete: true
Attachment #380570 - Flags: superreview?(roc)
Attachment #380570 - Flags: review?(roc)
Attachment #380570 - Flags: superreview?(roc)
Attachment #380570 - Flags: superreview+
Attachment #380570 - Flags: review?(roc)
Attachment #380570 - Flags: review+
Whiteboard: [sg: ?] → [sg: ?][needs landing]
Version: unspecified → Trunk
Thanks guys!
Can we get this on trunk and 191? I'm fine with them landing in both places at the same time, really.
http://hg.mozilla.org/mozilla-central/rev/cff23a837bea
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [sg: ?][needs landing] → [sg: ?][needs 191 landing]
We should land some tests here.
Flags: in-testsuite?
This cased all video reftests to fail with "failed to load":
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1243766299.1243770335.25990.gz#err28

So backed out:
http://hg.mozilla.org/mozilla-central/rev/b9e74cdddde1
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [sg: ?][needs 191 landing] → [sg: ?]
Did those not fail locally? Do you need access to one of the tinderboxes to reproduce?
How many frames of video are there in the reftests? Possibly if there is less than two then it's failing during the two decode frames before the duration seek?
(In reply to comment #38)
> Did those not fail locally? Do you need access to one of the tinderboxes to
> reproduce?

They fail locally. I guess Chris didn't run them.

(In reply to comment #39)
> How many frames of video are there in the reftests? Possibly if there is less
> than two then it's failing during the two decode frames before the duration
> seek?

There's only one frame. I guess the second decode fails or something.
Attached patch fix v3Splinter Review
My attempt at a fix: if we receive E_OGGPLAY_OK when decoding the first frame, don't try to seek, just set the duration directly. Unfortunately this doesn't work, because NextFrame returns null. I can't see why, what we're doing here for single-frame videos seems completely straightforward.

It also seems to make test_duration1.html fail, and I'm not sure why that is either.

My enthusiasm for working around the Ogg bug(s) here is waning.
I've asked Viktor to post an update here on progress on fixing the underlying bug.
I've been working on a fix for this bug in liboggz in order to be able to detect the hole in the stream even if a seek was done; i.e. when calculating duration.

It still crashes on Beethoven_Sonata_op57_No23_Appassionata_Mvt1.ogg file. Although it crashes with the same stack, the problem is caused by something else, as if we disable the duration call in the beginning of decoding the file the HOLE event is not triggered... and the file plays w/o any troubles.
Attached patch Alternative fixSplinter Review
This is a different approach to the fix. I started with Paul Nickerson's fix in attachment 368028 [details] [diff] [review] of bug 481933. 

The problem with that fix is that the error information is not propogated down so liboggplay can handle it. libfishsound seems to not check return values of a lot of libvorbis calls.

The remaining changes are to pass the error information down through the various libraries.

This fixes the crash on the Beethoven file. Mochitests pass. I'm unable to run reftests on my machine for some reason. I've tested that the reftest video file loads though. Could someone test an actual reftest run with this patch?

I think this fix is an appropriate approach to prevent crashing until the ogg maintainers can come up with a fix that prevents getting into the state where vorbis_synthesis is reached with bad data in the first place.
(In reply to comment #44)
> This fixes the crash on the Beethoven file. Mochitests pass. I'm unable to run
> reftests on my machine for some reason. I've tested that the reftest video file
> loads though. Could someone test an actual reftest run with this patch?

Reftests WFM on latest trunk with this patch applied (on OS X).
Reftests and mochitests pass for me on Mac. Chris, can you get some kind of review from Ogg folks?
Sure. Viktor, any chance of a review of this code? Thanks.
Attachment #380780 - Flags: review?(wiking)
Chris, I've checked the last patch and it does the job - currently i'm trying to make it that until the headers are not processed in liboggplay you cannot do any seeking, as that is the main point here currently.

but again this will do the job; the only thing is that i would return -1 in 'oggplay_callback_audio' when the return value of 'fish_sound_decode' != 0 instead of OGGZ_ERR_HOLE_IN_DATA as it can be, that simply there's no library available for decoding that content....
IIRC, I tried returning -1 and it still caused problems due to -1 being handled differently by the liboggz code - it ended up ignoring the error or trying to recover in some manner which triggered a problem where the file would endlessly try loading.
Sounds like we have a patch that's ready to commit?  Can we get it on central ASAP so that we can bake a bit before hitting the branch and spinning RC1 builds?

(Yay!)
Yes, I think we should land it.
http://hg.mozilla.org/mozilla-central/rev/b0766717ce73
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
This patch basically enforces that every header packet is read for each track before doing anything with the content, e.g seeking.

I've tried to test with most of the files and it have not crashed at all. It as well fixes bug 481933. Basically it detects on the first place that it has a HOLE during initialization of OggPlay handle, thus the return value of the 'oggplay_open_with_reader' will be NULL.
I'll talk to Chris D tomorrow about which patch we want to take --- or maybe we want both.
Whiteboard: [sg: ?] → [sg: ?][needs 191 landing]
The patch in comment 53 causes mochitest failures. test_duration1.html fails due to the duration being reported as 1 frame shorter than it actually is. Also 320x240.ogv does not display correctly. It seems to not decode the initial keyframe.
I'll have to retest the patch - ignore comment 56. My update, patch application and build is suspect due to what looks like file system corruption.
Chris, you said over in bug 495159 that the end key will work again when this patch is checked in. I cannot see that this key works! Pressing it the video stops playing. How should we proceed?
re: comment 58, let's get it filed in a new bug please?
Verified fixed on trunk and 1.9.1 with builds on all platforms like:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090603 Minefield/3.6a1pre ID:20090603031250

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1pre) Gecko/20090603 Shiretoko/3.5pre ID:20090603031333
Status: RESOLVED → VERIFIED
Hardware: x86 → All
Target Milestone: --- → mozilla1.9.2a1
Ria please check bug 496326. It could be related to that one.
Attachment #380780 - Flags: review?(wiking)
Duplicate of this bug: 493331
liboggz part applied upstream in commit 8c2da14d88f5957bc98badfcc71f83aa7d05c0e1
Just reviewing this, the libfishound part was already applied upstream on 2009-09-13 in commit 2be07f5cec835afb917d891c7a465f9f0cece3e2
Flags: wanted1.9.0.x-
Crash Signature: [@ 0x11ae1c8a]
You need to log in before you can comment on or make changes to this bug.