Closed
Bug 791167
Opened 13 years ago
Closed 13 years ago
SIGSEGV when decoding H.264 videos using hardware codec
Categories
(Core :: Audio/Video, defect)
Tracking
()
People
(Reporter: cajbir, Assigned: cjones)
Details
Attachments
(1 file, 1 obsolete file)
1.73 KB,
patch
|
roc
:
review+
|
Details | Diff | Splinter Review |
Requires fix to Bug 791164 first to reproduce since without that the hardware decoder is not used.
Steps to reproduce:
1) Start browser
2) Visit http://cd.pn/b2
3) Press Play
Expected result:
Video plays
Actual result:
Browser tab crashes and shows sad face. logcat shows:
D/memalloc( 1225): /dev/pmem: Allocated buffer base:0x4ad00000 size:348160 offset:3944448 fd:178
D/memalloc( 1349): /dev/pmem: Mapped buffer base:0x46465000 size:4292608 offset:3944448 fd:57
F/libc ( 1349): Fatal signal 11 (SIGSEGV) at 0x4687d000 (code=1)
Commenting out the code in nsBuiltDecoderReader.cpp that is special cased for the YV12 format like so works around the issue:
ImageFormat format[2] = {PLANAR_YCBCR, GRALLOC_PLANAR_YCBCR};
#if 0
if (IsYV12Format(Y, Cb, Cr)) {
v->mImage = aContainer->CreateImage(format, 2);
} else {
#endif
v->mImage = aContainer->CreateImage(format, 1);
#if 0
}
#endif
The special case code seems to have been introduced in bug 767480.
Reporter | ||
Updated•13 years ago
|
blocking-basecamp: --- → ?
Updated•13 years ago
|
blocking-basecamp: ? → +
Reporter | ||
Comment 2•13 years ago
|
||
This is my temporary workaround to get H264 video to play in the video app using the hardware decoder. With this applied and the workaround in bug 791164 applied I get http://cd.pn/b2/story.mp4 playing from the video app when I copy it to the /sdcard/Movies directory.
Reporter | ||
Comment 3•13 years ago
|
||
Updating to the latest repo code and this crash doesn't occur with bug 759506 applied. It only occurs when using the omx media plugins backend.
Assignee | ||
Comment 4•13 years ago
|
||
That means that we'll regress SW-decoded playback with that patch, right?
Reporter | ||
Comment 5•13 years ago
|
||
That patch isn't a fix, it's just to show where the crash happens and commenting it out stops it from happening. I'm not sure what you mean by it regressing software decoder playback.
Kanru would be able to comment on what his fast path actually does.
Assignee | ||
Comment 6•13 years ago
|
||
That work enables using gralloc as our transport for decoded frames so we don't have to reupload them on the compositor. If the bug 759506 has a fork of that code from before bug 767480, then I suspect it might be missing the bug 767480 optimization.
Or, if you're saying that you guys took the latest omx plugin code and then re-applied edwin's transformations on it to internalize it, then maybe the HW-decoded path is skipping Kan-Ru's new code and we'll only use it for CPU-decoded videos. In which case, all is well :).
Reporter | ||
Comment 7•13 years ago
|
||
Bug 759506 is not hitting the crash with this bug. It's the media/omx-plugin code that is hitting the crash. So with edwin's code applied we don't have a problem. Which means all is well in b2g land I think.
Assignee | ||
Comment 8•13 years ago
|
||
We had a quick IRC powwow. It looks like the code in bug 767480 added repacking code to make buffers published by our internal decoders directly texturable through gralloc. What's probably happening is that the omx buffers don't adhere to the preconditions needed for the repacking, and we're probably reading out-of-bounds memory. Something like bug 785441.
I'm ok with temporarily disabling this code to get omx working+landed, but reenabling it is a blocker; vp8 and theora are borderline unplayable without bug 767480.
Assignee | ||
Comment 9•13 years ago
|
||
Forgot to add --- step one here is to run this code in a debug build. If an assertion doesn't loudly blow up, we need to add that assertion.
Assignee | ||
Comment 10•13 years ago
|
||
I forgot to attach gdb when repro'ing, but in logcat I see
F/libc (28965): Fatal signal 11 (SIGSEGV) at 0x46a15000 (code=1)
I/DEBUG ( 109): memory map around addr 46a15000:
I/DEBUG ( 109): 4677d000-46a15000 /dev/pmem
I/DEBUG ( 109): (no map for address)
I/DEBUG ( 109): 46ab5000-46ab6000
which means we're attempting to read or write past the end of a gralloc buffer.
Assignee | ||
Comment 11•13 years ago
|
||
(gdb) bt
#0 memcpy () at bionic/libc/arch-arm/bionic/memcpy.S:225
#1 0x41b8f66a in mozilla::layers::GrallocPlanarYCbCrImage::SetData (this=0x44d49fc0, aData=...) at /home/cjones/mozilla/inbound/gfx/layers/GrallocImages.cpp:82
#2 0x4110272e in VideoData::Create (aInfo=..., aContainer=0x44fea330, aOffset=163147, aTime=0, aEndTime=1, aBuffer=..., aKeyframe=true, aTimecode=-1, aPicture=...) at /home/cjones/mozilla/inbound/content/media/nsBuiltinDecoderReader.cpp:237
#3 0x41129ad8 in nsMediaPluginReader::DecodeVideoFrame (this=0x44df5500, aKeyframeSkip=@0x44838c4f, aTimeThreshold=0) at /home/cjones/mozilla/inbound/content/media/plugins/nsMediaPluginReader.cpp:204
#4 0x41101d1a in nsBuiltinDecoderReader::DecodeVideoFrame (this=0x44df5500) at /home/cjones/mozilla/inbound/content/media/nsBuiltinDecoderReader.h:507
#5 0x41103724 in nsBuiltinDecoderReader::DecodeToFirstData<VideoData> (this=0x44df5500, aDecodeFn=(bool (nsBuiltinDecoderReader::*)(nsBuiltinDecoderReader *)) 0x41101cf1 <nsBuiltinDecoderReader::DecodeVideoFrame()>, aQueue=...) at /home/cjones/mozilla/inbound/content/media/nsBuiltinDecoderReader.h:497
#6 0x41102a4e in nsBuiltinDecoderReader::FindStartTime (this=0x44df5500, aOutStartTime=@0x44838cf8) at /home/cjones/mozilla/inbound/content/media/nsBuiltinDecoderReader.cpp:289
#7 0x410ff3de in nsBuiltinDecoderStateMachine::FindStartTime (this=0x4377eef0) at /home/cjones/mozilla/inbound/content/media/nsBuiltinDecoderStateMachine.cpp:2345
#8 0x410fd040 in nsBuiltinDecoderStateMachine::DecodeMetadata (this=0x4377eef0) at /home/cjones/mozilla/inbound/content/media/nsBuiltinDecoderStateMachine.cpp:1742
#9 0x410f7cd2 in nsBuiltinDecoderStateMachine::DecodeThreadRun (this=0x4377eef0) at /home/cjones/mozilla/inbound/content/media/nsBuiltinDecoderStateMachine.cpp:479
#10 0x404d9eb6 in nsRunnableMethodImpl<void (nsUpdateProcessor::*)(), true>::Run (this=0x48fe8640) at ../../dist/include/nsThreadUtils.h:349
#11 0x41a4d61a in nsThread::ProcessNextEvent (this=0x452f3be0, mayWait=true, result=0x44838e87) at /home/cjones/mozilla/inbound/xpcom/threads/nsThread.cpp:624
#12 0x419efd52 in NS_ProcessNextEvent_P (thread=0x452f3be0, mayWait=true) at /home/cjones/mozilla/new-b2g/objdir-gecko/xpcom/build/nsThreadUtils.cpp:220
#13 0x41a4c78a in nsThread::ThreadFunc (arg=0x452f3be0) at /home/cjones/mozilla/inbound/xpcom/threads/nsThread.cpp:257
#14 0x401d3426 in _pt_root (arg=0x452f3c50) at /home/cjones/mozilla/inbound/nsprpub/pr/src/pthreads/ptthread.c:156
#15 0x400eb0ec in __thread_entry (func=0x401d3355 <_pt_root>, arg=0x452f3c50, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#16 0x400eac40 in pthread_create (thread_out=<value optimized out>, attr=0x493ffb18, start_routine=0x401d3355 <_pt_root>, arg=0x452f3c50) at bionic/libc/bionic/pthread.c:357
#17 0x00000000 in ?? ()
(gdb) f 1
#1 0x41b8f66a in mozilla::layers::GrallocPlanarYCbCrImage::SetData (this=0x44d49fc0, aData=...) at /home/cjones/mozilla/inbound/gfx/layers/GrallocImages.cpp:82
(gdb) p mData
$1 = {
mYChannel = 0x4a3f7000 <Address 0x4a3f7000 out of bounds>,
mYStride = 640,
mYSize = {
<mozilla::gfx::BaseSize<int, nsIntSize>> = {
width = 640,
height = 360
}, <No data fields>},
mYSkip = 0,
mCbChannel = 0x4a433001 <Address 0x4a433001 out of bounds>,
mCrChannel = 0x4a433000 <Address 0x4a433000 out of bounds>,
mCbCrStride = 640,
mCbCrSize = {
<mozilla::gfx::BaseSize<int, nsIntSize>> = {
width = 320,
height = 180
}, <No data fields>},
mCbSkip = 1,
mCrSkip = 1,
mPicX = 0,
mPicY = 0,
mPicSize = {
<mozilla::gfx::BaseSize<int, nsIntSize>> = {
width = 640,
height = 360
}, <No data fields>},
mStereoMode = mozilla::STEREO_MODE_MONO
}
/proc/[]/maps says that these addresses are
4a177000-4a8a2000 rw-s 0d300000 00:0b 906 /dev/pmem_adsp
Poking around to see what's out of sync here ...
Assignee | ||
Comment 12•13 years ago
|
||
I haven't been able to determine the underlying problem yet, but I suspect it will boil down to gecko having a different usage of "stride" than gralloc. gralloc means "pixel stride", but gecko means "byte stride".
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp#308
Assignee | ||
Comment 14•13 years ago
|
||
Attachment #661670 -
Attachment is obsolete: true
Attachment #662097 -
Flags: review?(roc)
Assignee | ||
Updated•13 years ago
|
Component: General → Video/Audio
Product: Boot2Gecko → Core
Comment on attachment 662097 [details] [diff] [review]
stride*height is the size of the region
Review of attachment 662097 [details] [diff] [review]:
-----------------------------------------------------------------
Ouch!
Attachment #662097 -
Flags: review?(roc) → review+
Assignee | ||
Comment 16•13 years ago
|
||
Comment 17•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
You need to log in
before you can comment on or make changes to this bug.
Description
•