Closed Bug 496581 Opened 16 years ago Closed 16 years ago

crash [@ oggplay_buffer_set_last_data ]

Categories

(Core :: Audio/Video, defect)

1.9.1 Branch
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla1.9.2a1
Tracking Status
blocking1.9.1 --- .4+
status1.9.1 --- .4-fixed

People

(Reporter: kbrosnan, Assigned: roc)

References

()

Details

(4 keywords, Whiteboard: [3.5Beta99testday])

Crash Data

Attachments

(4 files)

Load http://tinyvid.tv/show/4yy6letux3ct watch the video. After the video ends drag the playback marker to the right of the end. Firefox will crash after dragging for a couple seconds. 0 libxul.so oggplay_buffer_set_last_data media/liboggplay/src/liboggplay/oggplay_buffer.c:150 1 libxul.so oggplay_step_decoding media/liboggplay/src/liboggplay/oggplay.c:715 2 libxul.so nsOggStepDecodeEvent::Run() content/media/video/src/nsOggDecoder.cpp:664 3 libxul.so nsThread::ProcessNextEvent(int, int*) xpcom/threads/nsThread.cpp:510 4 libxul.so NS_ProcessNextEvent_P(nsIThread*, int) nsThreadUtils.cpp:230 5 libxul.so nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp:254 6 libnspr4.so _pt_root nsprpub/pr/src/pthreads/ptthread.c:228 7 libpthread-2.9.so libpthread-2.9.so@0x64fe http://crash-stats.mozilla.com/report/index/a48086c5-f4d4-4ece-af8f-01cdf2090605 http://crash-stats.mozilla.com/report/index/5dbb8902-c178-40d3-82a7-9926d2090605
Flags: blocking1.9.1?
Whiteboard: [3.5Beta99testday]
Can you please specify what do you mean with dragging a couple of seconds? For me the scrubber jumps to the beginning of the file when dragging after the end.
Linux only? I can't reproduce on OSX.
Confirmed on w32, can't reproduce on OSX.
OS: Linux → All
Hardware: x86 → All
This regressed between nightly builds of 2009053004 and 2009053104
sorry, I just got 0530 build to crash... let me go back further.
it's at least as old as 5/26 continually messing with the slider eventually causes a crash.
so examining this more closely.... There seems to be an older underlying crash that was exposed on 6/2 with the quickly wiggling timeline slider when it is dragged to the right momentarily 20090601 http://hg.mozilla.org/releases/mozilla-1.9.1/rev/0e772546d16b does not exhibit the problem, but 20090602 http://hg.mozilla.org/releases/mozilla-1.9.1/rev/18046f6a7cb1 does crash with wiggling slider The underlying crash manually messing with the slider as tmyoung posted in his screencast has been around much longer. at least 5/26 (I didn't check further back)
Attached video video of crash
By staying to the right of the end of the video I get Firefox to crash much quicker.
Given by Tracys changeset the following checkins happened in this timeframe: http://hg.mozilla.org/releases/mozilla-1.9.1/pushloghtml?fromchange=0e772546d16b&tochange=18046f6a7cb1 Eventually a regression from the fix on bug 495300 or bug 495319? If Roc cannot say it right away, I could start a bisect.
Ok, windbg doesn't help me much here. I only get memory access errors. I don't have an idea if that could be the root cause here? me 0x10655e82 struct _OggPlay * buffer 0x00000000 struct OggPlayBuffer * buffer_list <Memory access error> * <Memory access error> * buffer_mirror <Memory access error> buffer_size <Memory access error> last_filled <Memory access error> last_emptied <Memory access error> frame_sem <Memory access error> That would mean with the following assignment we will have a bad pointer and accessing a erroneous child will cause us to crash: http://mxr.mozilla.org/mozilla1.9.1/source/media/liboggplay/src/liboggplay/oggplay_buffer.c#147 I cannot reproduce the crash in my Windows Vista environment where the build system has been setup. I now have to setup everything on XP before I could give more information and bisect the range.
> buffer 0x00000000 struct OggPlayBuffer * that's a simple null pointer. a stack trace will show the caller, the callee (oggplay_buffer_set_last_data) clearly doesn't approve of null pointers. as to the line number, keep in mind that MSVC optimizes heavily, it can favor 147 p = (OggPlayCallbackInfo *)buffer->buffer_list over 143 if (buffer->last_filled == -1) { under the assumptions that: 1. it'll crash doing buffer-> in either case 2. buffer->buffer_list is more useful 3. it will need to check [buffer->last_filled]; 4. it wants the array index after it has buffer_list
I should have been added a bit more to my last comment. Sorry Timeless. When checking oggplay_step_decoding in oggplay.c I can see only one call to oggplay_buffer_set_last_data which happens here: http://mxr.mozilla.org/mozilla1.9.1/source/media/liboggplay/src/liboggplay/oggplay.c#709 Comparing with the locals output in oggplay_step_decoding 'me' is a null pointer there but not in oggplay_buffer_set_last_data. That's why I think something is corrupt. Am I wrong? me 0x00000000 struct _OggPlay * reader <Memory access error> oggz <Memory access error> decode_data <Memory access error> callback_info <Memory access error> num_tracks <Memory access error> all_tracks_initialised <Memory access error> callback_period <Memory access error> callback <Memory access error> callback_user_ptr <Memory access error> target <Memory access error> active_tracks <Memory access error> buffer <Memory access error> presentation_time <Memory access error> trash <Memory access error> shutdown <Memory access error> pt_update_valid <Memory access error> duration <Memory access error> chunk_count 1 i 1 info 0x00000000 * <Memory access error> r 4414770 remaining 1
Everything looks fine two frames higher in nsOggStepDecodeEvent::Run() http://mxr.mozilla.org/mozilla1.9.1/source/content/media/video/src/nsOggDecoder.cpp#664 initialDownloadPosition 18961292836974540 mon class nsAutoMonitor r 62913648 (No matching enumerant) this 0x06cdab50 class nsOggStepDecodeEvent * nsRunnable class nsRunnable mDecodeStateMachine 0x045ea200 class nsOggDecodeStateMachine * mPlayer 0x03b7d940 struct _OggPlay * reader 0x03b7bc70 struct _OggPlayReader * oggz 0x03b91400 decode_data 0x03b7e160 callback_info 0x03b0f5b0 struct _OggPlayCallbackInfo * num_tracks 3 all_tracks_initialised 1 callback_period 172018876 callback 0x103bc4b3 callback_user_ptr 0x00000000 target 272305880707 active_tracks 0 buffer 0x01eb7040 struct OggPlayBuffer * buffer_list 0x040229c0 buffer_mirror 0x04022ab0 buffer_size 20 last_filled 6 last_emptied 6 frame_sem 0x000005a4 presentation_time 247019105935 trash 0x00000000 struct OggPlaySeekTrash * shutdown 0 pt_update_valid 0 duration 64164
Attached file wingdb session
I spent some time together with timeless on debugging this problem with windbg. Attached you can find our findings. Looks like we wanna access an buffer element outside of its boundary (20).
Chris, could you have a look at it?
I don't think we should block, but we should still take a fix.
Flags: blocking1.9.1? → wanted1.9.1+
Attached file stack
With a debug build I get an assertion when dragging the scrubber to the right when it is near the end position: ASSERTION: Can only seek in range [0,duration]: 'mSeekTime >= 0 && mSeekTime <= duration' See the attached wingdb stack.
On trunk this regressed between the following two builds from 090531 and 090601. Checkins in this timeframe: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f014c39be5dc&tochange=ab395a1916be Due to there are only two check-ins which match the regression range on 1.9.1 and the patch on bug 487519 was backed out again, bug 495300 has to be the reason for this regression.
Attached patch fixSplinter Review
OK, this bug is because we seek to the end, decode one frame, then decode another frame to see if we're at the end. In this case we are, so DecodeFrame returns E_OGGPLAY_OK. Then we go back to the DECODING state and try to decode another frame. This triggers a crash in oggplay, but it's really our fault because we shouldn't try to do oggplay_step_decoding after E_OGGPLAY_OK was returned without seeking in between. Obviously we should go to the COMPLETED state when we know we've got the last frame. We also shouldn't be reading mDecodedFrames while not holding the lock, that was bad.
Assignee: nobody → roc
Attachment #382085 - Flags: review?(chris.double)
While seeking to the end a lot in this testcase, we hit a lot of ###!!! ASSERTION: Can only seek in range [0,duration]: 'mSeekTime >= 0 && mSeekTime <= duration', file /home/roc/shared/mozilla-checkin/content/media/video/src/nsOggDecoder.cpp, line 1236 But I'm pretty sure this is due to the first frame presentation time being reported incorrectly by liboggplay; see https://bugzilla.mozilla.org/show_bug.cgi?id=496051#c3.
Attachment #382085 - Flags: review?(chris.double) → review+
Comment on attachment 382085 [details] [diff] [review] fix Avoids a crash while seeking to the end of some videos.
Comment on attachment 382085 [details] [diff] [review] fix How well tested are we here? Let's definitely start by getting it on trunk.
Oh, I was not aware that we could land on trunk without 1.9.1 approval now
Whiteboard: [3.5Beta99testday] → [3.5Beta99testday][needs landing]
Sadly, looks like this will miss the 191 cutoff, though :(
Flags: wanted1.9.1.x+
Keywords: relnote
Status: NEW → RESOLVED
Closed: 16 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Whiteboard: [3.5Beta99testday][needs landing] → [3.5Beta99testday][needs 191 approval]
Target Milestone: --- → mozilla1.9.2a1
Verified fixed on trunk with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090613 Minefield/3.6a1pre ID:20090613042114
Status: RESOLVED → VERIFIED
Comment on attachment 382085 [details] [diff] [review] fix a191=beltzner
Attachment #382085 - Flags: approval1.9.1? → approval1.9.1+
This patch appears to have gotten 1.9.1 approval but didn't land? Or was it fixed as part of some other bug and forgotten to be marked here?
blocking1.9.1: --- → ?
Comment on attachment 382085 [details] [diff] [review] fix Removing the stale 1.9.1 approval -- we're code-frozen for 1.9.1.3 and don't want this landing right at the moment. If this has not been fixed on that branch we could land it next time, or if you no longer want to land this patch then just remove the approval request.
Attachment #382085 - Flags: approval1.9.1+ → approval1.9.1.4?
blocking1.9.1: ? → .4+
Comment on attachment 382085 [details] [diff] [review] fix Approved for 1.9.1.4, a=dveditz for release-drivers Please make sure this gets in before the code-freeze for the 1.9.1 branch.
Attachment #382085 - Flags: approval1.9.1.4? → approval1.9.1.4+
Keywords: checkin-needed
Whiteboard: [3.5Beta99testday][needs 191 approval] → [3.5Beta99testday][needs 191 landing]
Keywords: checkin-needed
Whiteboard: [3.5Beta99testday][needs 191 landing] → [3.5Beta99testday]
Verified fixed on 1.9.1 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.4pre) Gecko/20090922 Shiretoko/3.5.4pre (.NET CLR 3.5.30729) ID:20090922041550
Keywords: verified1.9.1
Removing relnote
Keywords: relnote
Crash Signature: [@ oggplay_buffer_set_last_data ]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: