Closed
Bug 970008
Opened 10 years ago
Closed 10 years ago
[tarako]monkey test crash at libxul.so!mozilla::dom::TabChild::RecvRealTouchEvent
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
People
(Reporter: james.zhang, Unassigned)
References
Details
(Keywords: crash, Whiteboard: [b2g-crash])
Attachments
(5 files, 2 obsolete files)
[FFOS minidump: mtlog-now-sp6821a_gonk-98-custom_hudson-jameszhangubtpc-1402080859/dump_parse (the top 10 stack info)]
0 libxul.so!mozilla::dom::TabChild::RecvRealTouchEvent(mozilla::WidgetTouchEvent const&, mozilla::layers::ScrollableLayerGuid const&) [nsCOMPtr.h : 554 + 0x2]
1 libxul.so!mozilla::dom::PBrowserChild::OnMessageReceived(IPC::Message const&) [PBrowserChild.cpp : 2080 + 0xd]
2 libxul.so!mozilla::dom::PContentChild::OnMessageReceived(IPC::Message const&) [PContentChild.cpp : 3170 + 0x7]
3 libxul.so!mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) [MessageChannel.cpp : 1126 + 0x5]
4 libxul.so!mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message const&) [MessageChannel.cpp : 1044 + 0x3]
5 libxul.so!mozilla::ipc::MessageChannel::OnMaybeDequeueOne() [MessageChannel.cpp : 1027 + 0x3]
6 libxul.so!RunnableMethod<WebCore::ReverbConvolver, void (WebCore::ReverbConvolver::*)(), Tuple0>::Run() [tuple.h : 383 + 0x5]
7 libxul.so!mozilla::ipc::MessageChannel::DequeueTask::Run() [MessageChannel.h : 371 + 0x9]
8 libxul.so!MessageLoop::RunTask(Task*) [message_loop.cc : 340 + 0x5]
9 libxul.so!MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) [message_loop.cc : 348 + 0x5]
Reporter
Updated•10 years ago
blocking-b2g: --- → 1.3T?
Comment 1•10 years ago
Hi James, is this a high-occurrence bug? Can it be easily reproduced with a script that you can share? Thanks
Flags: needinfo?(james.zhang)
Reporter
Comment 2•10 years ago
(In reply to Joe Cheng [:jcheng] from comment #1)
> Hi James, is it a high occurance bug ?
> and if this can be easily reproduced with any script that you can share?
> Thanks

Yes. We can catch this crash every two days. I have given the script to ttsai, but it is based on our Hudson daily build. You should modify the script or use run-6821-local.sh to catch the minidump and parse the backtrace.
Flags: needinfo?(james.zhang)
Reporter
Comment 3•10 years ago
cd test-config
./run-6821-local.sh

Otherwise your QA should configure the same Hudson build as ours.
Updated•10 years ago
Flags: needinfo?(ahuang)
Comment 4•10 years ago
(In reply to James Zhang from comment #0)
> Created attachment 8372966 [details]
> mtlog-now-sp6821a_gonk-98-custom_hudson-jameszhangubtpc-1402080859.tar.bz2
>
> [FFOS minidump: mtlog-now-sp6821a_gonk-98-custom_hudson-jameszhangubtpc-1402080859/dump_parse (the top 10 stack info)]
> 0 libxul.so!mozilla::dom::TabChild::RecvRealTouchEvent(mozilla::WidgetTouchEvent const&, mozilla::layers::ScrollableLayerGuid const&) [nsCOMPtr.h : 554 + 0x2]
> 1 libxul.so!mozilla::dom::PBrowserChild::OnMessageReceived(IPC::Message const&) [PBrowserChild.cpp : 2080 + 0xd]

Hi James,

The backtrace looks really weird. mozilla::dom::TabChild::RecvRealTouchEvent should be in TabChild.cpp. Are you using an optimized build? If you are not sure, can you attach the "config.status" file from the objdir-gecko folder? Thanks.
Flags: needinfo?(ahuang)
Reporter
Comment 6•10 years ago
(In reply to Alan Huang [:ahuang|away 2/27-3/3] from comment #4)
> (In reply to James Zhang from comment #0)
> > Created attachment 8372966 [details]
> > mtlog-now-sp6821a_gonk-98-custom_hudson-jameszhangubtpc-1402080859.tar.bz2
> >
> > [FFOS minidump: mtlog-now-sp6821a_gonk-98-custom_hudson-jameszhangubtpc-1402080859/dump_parse (the top 10 stack info)]
> > 0 libxul.so!mozilla::dom::TabChild::RecvRealTouchEvent(mozilla::WidgetTouchEvent const&, mozilla::layers::ScrollableLayerGuid const&) [nsCOMPtr.h : 554 + 0x2]
> > 1 libxul.so!mozilla::dom::PBrowserChild::OnMessageReceived(IPC::Message const&) [PBrowserChild.cpp : 2080 + 0xd]
>
> Hi James,
>
> The backtrace looks really weird. mozilla::dom::TabChild::RecvRealTouchEvent
> should be in TabChild.cpp. Are you using optimized build? If you are not
> sure, can you attach the "config.status" file under objdir-gecko folder?
> Thanks.

Please NeedInfo me if you need more information.
Comment 7•10 years ago
(''' MOZ_OPTIMIZE ''', r''' 1 '''),
(''' MOZ_FRAMEPTR_FLAGS ''', r''' -fomit-frame-pointer '''),
(''' MOZ_OPTIMIZE_FLAGS ''', r''' -Os -freorder-blocks -fno-reorder-functions '''),

Yeah, as I expected in comment 4.
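The flags quoted in this comment can also be checked mechanically. The following is a minimal sketch, assuming config.status stores each define as a `(''' NAME ''', r''' VALUE ''')` tuple exactly as quoted here. The helper name `parse_defines` is hypothetical and not part of the Gecko build system.

```python
import re

# Matches entries of the form (''' NAME ''', r''' VALUE ''') as quoted in the comment.
ENTRY = re.compile(r"\(\s*'''\s*(\w+)\s*''',\s*r?'''\s*(.*?)\s*'''\s*\)")


def parse_defines(text):
    """Return a dict of NAME -> VALUE for every tuple found in text."""
    return {name: value for name, value in ENTRY.findall(text)}


# The three entries from the comment, pasted verbatim.
sample = """(''' MOZ_OPTIMIZE ''', r''' 1 '''), (''' MOZ_FRAMEPTR_FLAGS ''', r''' -fomit-frame-pointer '''), (''' MOZ_OPTIMIZE_FLAGS ''', r''' -Os -freorder-blocks -fno-reorder-functions '''),"""

defines = parse_defines(sample)

# An optimized build is assumed here to be one where MOZ_OPTIMIZE is "1".
is_optimized = defines.get("MOZ_OPTIMIZE") == "1"
```

With optimization and -fomit-frame-pointer enabled, inlined helpers can make the top frame point at a header such as nsCOMPtr.h instead of TabChild.cpp, which is consistent with the question raised in comment 4.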
Flags: needinfo?(ahuang)
Updated•10 years ago
blocking-b2g: 1.3T? → 1.3T+
Whiteboard: [SI-testing-blocker]
Comment 8•10 years ago
Steven, can you help us understand why this was made a blocker? What are the next steps here?
Flags: needinfo?(styang)
Comment 9•10 years ago
(In reply to bhavana bajaj [:bajaj] from comment #8)
> Steven can you help understand why this was made a blocker ? What are the
> next steps here ?

We need to pass the stability test before shipping, and this happened often in the test.

James, we need you to use the optimized build for the testing again. Flagging it as 1.3T? for monitoring.
Flags: needinfo?(styang) → needinfo?(james.zhang)
Updated•10 years ago
blocking-b2g: 1.3T+ → 1.3T?
Reporter
Comment 10•10 years ago
(In reply to Steven Yang [:styang] from comment #9)
> (In reply to bhavana bajaj [:bajaj] from comment #8)
> > Steven can you help understand why this was made a blocker ? What are the
> > next steps here ?
>
> We need to pass the stability test before shipping, this happened often in
> the test.
>
> James, we need you to use the optimized build for the testing again. flag it
> to 1.3T? for monitoring.

We use an optimized build and you use an unoptimized build, right? We can still catch this crash in the 0305 daily build.
Flags: needinfo?(james.zhang)
Comment 11•10 years ago
Is there any chance to attach the .sc file that triggered this so we can re-run it?
Reporter
Comment 12•10 years ago
(In reply to Andreas Gal :gal from comment #11)
> Is there any chance to attach the .sc file that triggered this so we can
> re-run it?

No. The monkey test runs for 12 to 24 hours and the content on the SD card is different, so we can't reproduce this issue with the .sc file.

Andreas, can we add gnasp when a minidump is generated? Spreadtrum slog can capture a snapshot when a native crash occurs.
Reporter
Comment 13•10 years ago
For example, in this monkey test log, we can find the minidump log in the minidump folder, and we can also find the tombstone in slog_external/20230601042625/misc/tombstones. We can see the tombstone screenshot and confirm it was caused by a video thumbnail crash. But for the minidump we only have the backtrace and no screenshot, so we don't know how to analyze it.
Reporter
Updated•10 years ago
Flags: needinfo?(gal)
Comment 15•10 years ago
Comment 16•10 years ago
Comment 17•10 years ago
James, the 2nd screenshot seems to be video playback, not thumbnail preview.
Comment 18•10 years ago
James, is comment 13 the same crash cause? Because that looks like a mediaserver crash.
Reporter
Comment 19•10 years ago
(In reply to Andreas Gal :gal from comment #17)
> James, the 2nd screenshot seems to be video playback, not thumbnail preview.

Andreas, it's just an example. This bug was caused by my side and we have fixed it. I think Gecko should capture a screenshot when a minidump happens, so we can get more information.

commit 8f6662d002f2a7084f2c79af369965b51c26e017
Author: ming.li <ming.li@spreadtrum.com>
Date: Thu Mar 6 17:00:04 2014 +0800

Bug #286278 mediaserver crash
[bug number ] 286278
[root cause ] failed to malloc mem from ion heap
[changes ] protect when failed
[side effects]
[reviewers ]
Change-Id: I84f860ef4a570dfca32e03256c0e7bba5d9ae313
Reporter
Comment 20•10 years ago
(In reply to Andreas Gal :gal from comment #14)
> gnasp?

gnasp is a tool for capturing screenshots; you can also write code to capture a screenshot. We need to capture more information when a minidump happens, because the log is not enough.
Updated•10 years ago
Attachment #8388023 - Attachment is obsolete: true
Updated•10 years ago
Attachment #8388024 - Attachment is obsolete: true
Comment 21•10 years ago
Ok, thanks. Sorry for my confusion. How do you trigger the screenshot? Can you link me to the gnasp tool? I will check out our crash dump handler.
Reporter
Comment 22•10 years ago
(In reply to Andreas Gal :gal from comment #21)
> Ok, thanks. Sorry for my confusion. How do you trigger the screenshot? Can
> you link me to the gnasp tool? I will check out our crash dump handler.

If you have a tarako or fugu device, you can use "adb shell gsnap /data/1.jpg /dev/graphics/fb0" to capture the screenshot, and "adb pull /data/1.jpg" to get it from the phone. gsnap is one of the busybox tools.
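The two adb commands in this comment could be wrapped for use from a test harness. This is only a sketch, assuming `adb` is on the PATH and that the device provides `gsnap` as described in the comment. The function names `gsnap_commands` and `capture_screenshot` are hypothetical.

```python
import subprocess


def gsnap_commands(remote="/data/1.jpg", fb="/dev/graphics/fb0"):
    """Build the two adb invocations from the comment as argument lists."""
    return [
        ["adb", "shell", "gsnap", remote, fb],  # write a JPEG of the framebuffer on the device
        ["adb", "pull", remote],                # copy the JPEG to the current directory
    ]


def capture_screenshot(run=subprocess.run, **kwargs):
    """Run both commands in order; with the default runner, a failing
    command raises subprocess.CalledProcessError."""
    for cmd in gsnap_commands(**kwargs):
        run(cmd, check=True)
```

Passing the runner as a parameter keeps the command construction testable on a machine without a device attached.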
Comment 23•10 years ago
Ok thanks. We can hack up something to save fb0. It's probably not safe in production, but we can add an env variable.
Comment 24•10 years ago
This is completely untested and likely doesn't even compile, but it shows how we should be able to save a ppm during a crash. James, do you guys want to give this a try and see if you can apply this? I will try to find someone on our end Monday if you don't make progress.
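The attached patch is not reproduced in this thread. As an independent illustration of the encoding step only, the sketch below converts a raw framebuffer dump into a binary PPM offline. It assumes the dump is tightly packed little-endian RGB565 with a known width and height; real devices may use another pixel format or row stride. The function name `rgb565_to_ppm` is hypothetical.

```python
import struct


def rgb565_to_ppm(raw, width, height):
    """Encode a little-endian RGB565 buffer as a binary (P6) PPM image."""
    expected = width * height * 2
    if len(raw) < expected:
        raise ValueError("buffer shorter than width * height * 2 bytes")
    pixels = struct.unpack("<%dH" % (width * height), raw[:expected])
    body = bytearray()
    for p in pixels:
        r5, g6, b5 = (p >> 11) & 0x1F, (p >> 5) & 0x3F, p & 0x1F
        # Scale each channel to 0..255 by replicating its high bits.
        body += bytes((
            (r5 << 3) | (r5 >> 2),
            (g6 << 2) | (g6 >> 4),
            (b5 << 3) | (b5 >> 2),
        ))
    # PPM header: magic, dimensions, maximum channel value.
    header = b"P6\n%d %d\n255\n" % (width, height)
    return header + bytes(body)
```

PPM is convenient in a crash path because the format is just a short text header followed by raw RGB bytes, so no compression or library support is needed at write time.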
Comment 25•10 years ago
This might not work when the child process dies in case the child process has no access to the FB or when the child process isn't visible anyway.
Reporter
Comment 26•10 years ago
Very quick patch! Our monkey test engineer will integrate it into the monkey test version.
Reporter
Comment 27•10 years ago
Hi Andreas, how do we open the ppm file or convert it to jpg or png?
Reporter
Comment 28•10 years ago
(In reply to Andreas Gal :gal from comment #24)
> Created attachment 8388039 [details] [diff] [review]
> patch
>
> This is completely untested and likely doesn't even compile but this shows
> how we should be able to save a ppm during a crash. James, do you guys want
> to give this a try and see if you can apply this? I will try to find someone
> on our end Monday if you don't make progress.

Lianxiang will take this patch.
Flags: needinfo?(lianxiang.zhou)
Comment 29•10 years ago
http://stackoverflow.com/questions/4012889/converting-ppm-to-png

Keep in mind that the patch is totally untested and uncompiled. It's just a sketch. But something like it should work. On Mac and Linux, ImageMagick's convert should do the trick to read the ppm file.
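Where an image conversion tool is not installed, the conversion can also be done with the Python standard library alone. The following is a minimal sketch, assuming a binary P6 file with a maximum value of 255. The function name `ppm_to_png` and its helpers are hypothetical.

```python
import struct
import zlib


def _read_token(data, pos):
    """Return (token, next_pos), skipping whitespace and '#' comments."""
    while True:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
        else:
            break
    start = pos
    while pos < len(data) and not data[pos:pos + 1].isspace():
        pos += 1
    return data[start:pos], pos


def _chunk(kind, payload):
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def ppm_to_png(ppm):
    """Convert a binary P6 PPM with maxval 255 to PNG bytes."""
    magic, pos = _read_token(ppm, 0)
    if magic != b"P6":
        raise ValueError("only binary P6 PPM is supported")
    width, pos = _read_token(ppm, pos)
    height, pos = _read_token(ppm, pos)
    maxval, pos = _read_token(ppm, pos)
    width, height = int(width), int(height)
    if int(maxval) != 255:
        raise ValueError("only maxval 255 is supported")
    pos += 1  # exactly one whitespace byte separates the header from the pixels
    row = width * 3
    pixels = ppm[pos:pos + row * height]
    if len(pixels) != row * height:
        raise ValueError("truncated pixel data")
    # Each PNG scanline is prefixed with filter type 0 (none).
    raw = b"".join(b"\x00" + pixels[y * row:(y + 1) * row] for y in range(height))
    # IHDR: 8 bits per channel, color type 2 (truecolor), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(raw)) + _chunk(b"IEND", b""))
```

A typical use would read the saved file with `open(path, "rb").read()`, pass the bytes to `ppm_to_png`, and write the result to a `.png` file.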
Comment 30•10 years ago
Comment on attachment 8388039 [details] [diff] [review]
patch

bsmedberg, does something like this make sense? Will this run for the child as well? Can we somehow make sure this always runs in the parent? Even when a child crashes? (so we have access to dev/fb).
Attachment #8388039 -
Flags: feedback?(benjamin)
Comment 31•10 years ago
Steven/James, is this still a true blocker for partner stability testing? How easy is it to reproduce in your stability testing? Thanks
blocking-b2g: 1.3T? → 1.3T+
Flags: needinfo?(styang)
Comment 32•10 years ago
This looks reasonable as debugging code behind an envvar or pref or something. If we wanted it as production code we'd need to add additional code to clean up the screenshots along with the other crash data.

As written it will only take a screenshot if the main process crashes. If you wanted to take a screenshot of content process crashes, you'd need to add something similar to OnChildProcessDumpRequested here: http://hg.mozilla.org/mozilla-central/annotate/c8bea55437c1/toolkit/crashreporter/nsExceptionHandler.cpp#l2401

Presumably this should be #ifdef MOZ_B2G or something like that...
Updated•10 years ago
Attachment #8388039 -
Flags: feedback?(benjamin)
Reporter
Comment 33•10 years ago
(In reply to Joe Cheng [:jcheng] from comment #31)
> Steven/James, is this still a true blocker for partner stability testing?
> how easy is it to reproduce in your stability testing? thanks

Maybe this crash was caused by the media server crash; I can't see this crash after applying the media server crash fix. Lianxiang, please keep tracking.
Updated•10 years ago
Component: General → Graphics: Layers
Product: Firefox OS → Core
Comment 34•10 years ago
James, since you are no longer seeing this crash, we will minus for now
blocking-b2g: 1.3T+ → -
Whiteboard: [SI-testing-blocker]
Updated•10 years ago
Flags: needinfo?(styang)
James, if you are no longer seeing this crash, can we close this out?
Comment 37•10 years ago
Per comment 36, close this bug.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Updated•10 years ago
Flags: needinfo?(lianxiang.zhou)