Closed Bug 867025 Opened 12 years ago Closed 12 years ago

[unagi][tara][weekly build 13.04.17]monkey test crash in mozilla::dom::ContentChild::ProcessingError

Categories

(Firefox OS Graveyard :: General, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:leo+, b2g18 verified)

VERIFIED FIXED
1.1 QE2 (6jun)
blocking-b2g leo+
Tracking Status
b2g18 --- verified

People

(Reporter: james.zhang, Assigned: daleharvey)

References

Details

(Keywords: crash, Whiteboard: [caf priority: p1][b2g-crash][TD-8929][TD-23322][TD-24777][TD-25336][CR 486768])

Crash Data

Attachments

(4 files)

Hi Justin and bent, Please find 'ABORT' log in unagi/tara crash report, in the bugreport file. If exist 'ABORT' log, also exist 'camera close' log in context. Maybe camera close cause this abort? unagi log 04-25 14:51:35.850 117 5893 E CAM_FD : cam_conf: CAMERA_EXIT 04-25 14:51:35.850 117 117 I QualcommCameraHardware: release X: mCameraRunning = 0, mFrameThreadRunning = 0 04-25 14:51:35.850 117 117 I QualcommCameraHardware: mVideoThreadRunning = 0, mSnapshotThreadRunning = 0, mJpegThreadRunning = 0 04-25 14:51:35.850 117 117 I QualcommCameraHardware: camframe_timeout_flag = 0, mAutoFocusThreadRunning = 0 04-25 14:51:35.850 117 117 E QualcommCamera: Qint android::set_preview_window(camera_device*, preview_stream_ops*): E window = 0x0 04-25 14:51:35.850 117 117 E QualcommCameraHardware: : set_preview_window 04-25 14:51:35.850 117 117 V QualcommCameraHardware: setPreviewWindow: E 04-25 14:51:35.850 117 117 W QualcommCameraHardware: Setting NULL preview window 04-25 14:51:35.850 117 117 E QualcommCameraHardware: Set preview window:: 04-25 14:51:35.850 117 117 V QualcommCameraHardware: setPreviewWindow : X 04-25 14:51:35.850 117 117 E QualcommCamera: Qint android::close_camera_device(hw_device_t*): device =0x1cefc98 E 04-25 14:51:35.850 117 117 I QualcommCamera: void android::close_Hal_obj(camera_device*): E 04-25 14:51:35.850 117 117 I QualcommCamera: void android::close_Hal_obj(camera_device*): clear hw 04-25 14:51:35.850 117 117 I QualcommCameraHardware: ~QualcommCameraHardware E 04-25 14:51:35.850 117 117 V QualcommCameraHardware: ~MMCameraDL: E 04-25 14:51:35.860 117 5901 V QualcommCameraHardware: runFrameThread X 04-25 14:51:35.880 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x44000000 size:233472 fd:210 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x44151000 size:233472 fd:242 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x44500000 size:233472 fd:247 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x44665000 size:233472 fd:258 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x44a00000 size:233472 fd:261 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x44ba7000 size:233472 fd:264 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x45013000 size:233472 fd:267 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x451a1000 size:233472 fd:270 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem_adsp: Freeing buffer base:0x45700000 size:233472 fd:274 04-25 14:51:35.900 111 252 D memalloc: /dev/pmem: Freeing buffer base:0x4b40a000 size:614400 offset:4628480 fd:180 04-25 14:51:35.910 5877 5877 D memalloc: /dev/pmem: Unmapping buffer base:0x46882000 size:5242880 offset:4628480 04-25 14:51:35.910 5877 5877 D memalloc: /dev/pmem: Unmapping buffer base:0x44400000 size:5324800 offset:5242880 04-25 14:51:35.920 5877 5877 D memalloc: /dev/pmem: Unmapping buffer base:0x45128000 size:8343552 offset:8335360 04-25 14:51:35.930 111 252 D memalloc: /dev/pmem: Freeing buffer base:0x4b4a0000 size:81920 offset:5242880 fd:183 04-25 14:51:35.930 5877 5877 I Gecko : 04-25 14:51:35.930 5877 5877 I Gecko : ###!!! [Child][AsyncChannel] Error: Route error: message sent to unknown actor ID 04-25 14:51:35.930 5877 5877 I Gecko : 04-25 14:51:35.940 5877 5877 I Gecko : [Child 5877] ###!!! ABORT: aborting because of fatal error: file /home/apuser/unagi_weekly_build/B2G/gecko/dom/ipc/ContentChild.cpp, line 1020 04-25 14:51:35.940 5877 5877 E Gecko : mozalloc_abort: [Child 5877] ###!!! ABORT: aborting because of fatal error: file /home/apuser/unagi_weekly_build/B2G/gecko/dom/ipc/ContentChild.cpp, line 1020 04-25 14:51:35.960 111 252 D memalloc: /dev/pmem: Freeing buffer base:0x4b632000 size:614400 offset:6889472 fd:279 04-25 14:51:35.970 111 252 D memalloc: /dev/pmem: Freeing buffer base:0x4b4b4000 size:81920 offset:5324800 fd:197 04-25 14:51:36.100 117 117 V QualcommCameraHardware: closed MM Camera DL 04-25 14:51:36.100 117 117 V QualcommCameraHardware: ~MMCameraDL: X 04-25 14:51:36.100 117 117 I QualcommCameraHardware: ~QualcommCameraHardware X 04-25 14:51:36.100 117 117 I QualcommCamera: void android::close_Hal_obj(camera_device*): X tara log 04-25 23:13:46.327 81 25111 I CameraService: Destroying camera 0 04-25 23:13:46.327 81 25111 I SprdCameraHardware: int android::HAL_camera_device_close(hw_device_t*) 04-25 23:13:46.327 81 25111 V SprdCameraHardware: ~SprdCameraHardware E 04-25 23:13:46.327 81 25111 V SprdCameraHardware: ~SprdCameraHardware X 04-25 23:13:46.327 81 25111 D CameraService: CameraService::removeClient E (pid 10075) 04-25 23:13:46.327 81 25111 D CameraService: removeClient: clear camera 0 04-25 23:13:46.327 81 25111 D CameraService: CameraService::removeClient X (pid 10075) 04-25 23:13:46.327 81 25111 E CameraService: disconnect X (pid 10075) 04-25 23:13:46.337 81 3941 D CameraService: Client::~Client E (pid 81, this 0x1477648) 04-25 23:13:46.337 81 3941 E CameraService: disconnect E (pid 81) 04-25 23:13:46.337 81 3941 D CameraService: CameraService::releaseSound ref=1 04-25 23:13:46.337 81 3941 I MediaPlayerService: disconnect(47) from pid 81 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=4 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=4 a 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=4 b 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=4 c 04-25 23:13:46.337 81 10492 I OMXCodec: [OMX.google.vorbis.decoder] Now Idle. mState(0), mState 04-25 23:13:46.337 81 10492 I OMXCodec: [OMX.google.vorbis.decoder] Now Loaded. 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=1 d 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=1 e 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stopped in state 1 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] ~OMXCodec in state 04-25 23:13:46.337 81 3941 I AwesomePlayer: --mSystemTimeSourceForSync reset ok 04-25 23:13:46.337 81 3941 I AwesomePlayer: --mSystemTimeSourceForSync reset ok 04-25 23:13:46.337 81 3941 I AwesomePlayer: --mSystemTimeSourceForSync reset ok 04-25 23:13:46.337 81 3941 I MediaPlayerService: Client(47) destructor pid = 81 04-25 23:13:46.337 81 3941 I MediaPlayerService: disconnect(47) from pid 81 04-25 23:13:46.337 81 3941 V AudioFlinger: releasing 287 from 81 04-25 23:13:46.337 81 3941 W AudioFlinger: session id 287 not found for pid 81 04-25 23:13:46.337 81 3941 I MediaPlayerService: disconnect(48) from pid 81 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=4 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=4 a 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=4 b 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=4 c 04-25 23:13:46.337 81 10495 I OMXCodec: [OMX.google.vorbis.decoder] Now Idle. mState(0), mState 04-25 23:13:46.337 81 10495 I OMXCodec: [OMX.google.vorbis.decoder] Now Loaded. 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=1 d 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stop mState=1 e 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] stopped in state 1 04-25 23:13:46.337 81 3941 I OMXCodec: [OMX.google.vorbis.decoder] ~OMXCodec in state 04-25 23:13:46.337 81 3941 I AwesomePlayer: --mSystemTimeSourceForSync reset ok 04-25 23:13:46.337 81 3941 I AwesomePlayer: --mSystemTimeSourceForSync reset ok 04-25 23:13:46.337 81 3941 I AwesomePlayer: --mSystemTimeSourceForSync reset ok 04-25 23:13:46.337 81 3941 I MediaPlayerService: Client(48) destructor pid = 81 04-25 23:13:46.337 81 3941 I MediaPlayerService: disconnect(48) from pid 81 04-25 23:13:46.337 81 3941 V AudioFlinger: releasing 288 from 81 04-25 23:13:46.337 81 3941 W AudioFlinger: session id 288 not found for pid 81 04-25 23:13:46.337 81 3941 D CameraService: Client::~Client X (pid 81, this 0x1477648) 04-25 23:13:46.397 10075 10075 I Gecko : 04-25 23:13:46.397 10075 10075 I Gecko : ###!!! [Child][AsyncChannel] Error: Route error: message sent to unknown actor ID 04-25 23:13:46.397 10075 10075 I Gecko : 04-25 23:13:46.397 10075 10075 I Gecko : [Child 10075] ###!!! ABORT: aborting because of fatal error: file /home/apuser/MocorHTML5/b2g_latest/B2G/gecko/dom/ipc/ContentChild.cpp, line 1020 04-25 23:13:46.397 10075 10075 E Gecko : mozalloc_abort: [Child 10075] ###!!! ABORT: aborting because of fatal error: file /home/apuser/MocorHTML5/b2g_latest/B2G/gecko/dom/ipc/ContentChild.cpp, line 1020
Attached file tara crash report
Attached file unagi crash report
This one is probably better for bent than me.
Flags: needinfo?(bent.mozilla)
Whiteboard: [TD-8929][TD-23322]
leo has a same problem that is detected by Quality Testing.
Attached file leo's crash report
In general these things are hard to debug... You need to figure out which message is being sent and why it is being sent to an actor that has already gone away. The fix may take many forms, from redesigning the protocol to changing message ordering, etc.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(bent.mozilla)
Whiteboard: [TD-8929][TD-23322] → [TD-8929][TD-23322][TD-24777][TD-25336]
Target Milestone: --- → 1.1 QE2
Severity: critical → blocker
Priority: -- → P1
Crash Signature: [@ mozalloc_abort | NS_DebugBreak_P | mozilla::dom::ContentChild::ProcessingError]
Keywords: crash
Whiteboard: [TD-8929][TD-23322][TD-24777][TD-25336] → [b2g-crash][TD-8929][TD-23322][TD-24777][TD-25336]
bent, did you mean me? :) I think sotaro is the expert on the CameraService here.
Flags: needinfo?(sotaro.ikeda.g)
User impact versus level of effort here doesn't play out. Does anybody have an instance of this in a normal user scenario? Please nominate for blocking-b2g:leo? if you think this should be a critical blocking bug.
blocking-b2g: --- → -
tracking-b2g18: --- → +
leo looks this is problems. i think it looks also same in unagi / tara. STR> Settings -> Display -> Wallpaper -> Select from: Camera -> Camera runs as Attach Mode. at this case, User can take a picture and then Camera App will be automatically exits. in during attaching picture taken by camera after shot, user can close camera app through x button of bottom's area. when camera closed by user action, and more, attaching picture not yet done, PContentPermissionRequest:: Msg___delete____ID is raised somewhere. (i thinks that this message was raised to close camera after everything is done.) but, Actor(camera) was already closed by user action. so, PContentPermissionRequest:: Msg___delete____ID isn't processed by receiver.
to fix this problem, 1. when camera runs as attach mode, x button of bottom's area SHOULD disable after take a picture. do not close by user until everything is done. 2. in ipc message processing, it is processed by asynchronouse communication. so, i think b2g have a many chance like this kind of ipc error. if there are unhandled ipc message by actor. it always make a crash report by mozilla::dom::ContentChild::ProcessingError() { NS_RUNTIMEABORT(); } is it right abort case? how about just remove instead of NS_RUNTIMEABORT()?
daleharvey, see comment 11.
Flags: needinfo?(dale)
Got it, cheers
Assignee: nobody → dale
Flags: needinfo?(dale)
Whiteboard: [b2g-crash][TD-8929][TD-23322][TD-24777][TD-25336] → [b2g-crash][TD-8929][TD-23322][TD-24777][TD-25336][BTG-1524]
(leo+, as I think pretty much everybody on v1.1 is now seeing this crash during stability test)
blocking-b2g: - → leo+
tracking-b2g18: + → ---
Whiteboard: [b2g-crash][TD-8929][TD-23322][TD-24777][TD-25336][BTG-1524] → [b2g-crash][TD-8929][TD-23322][TD-24777][TD-25336][CR 486768]
Attached file Pointed to GH
Attachment #750779 - Flags: review?(dflanagan)
Flags: needinfo?(sotaro.ikeda.g)
Comment on attachment 750779 [details] Pointed to GH I can't reproduce the bug on Unagi. But the patch seems perfectly safe, so I'll go ahead and land it. We can do UX followup (at least graying out the X button) in bug 870726
Attachment #750779 - Flags: review?(dflanagan) → review+
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
I've landed and uplifted Dale's patch, but could not reproduce the crash, so I can't verify that the patch fixes it. James or Leo, would you verify that the bug is fixed?
Flags: needinfo?(james.zhang)
Flags: needinfo?(leo.bugzilla.gecko)
(In reply to leo.bugzilla.gecko from comment #11) > leo looks this is problems. i think it looks also same in unagi / tara. > > STR> > Settings -> Display -> Wallpaper -> > Select from: Camera -> Camera runs as Attach Mode. > at this case, User can take a picture and then Camera App will be > automatically exits. > > in during attaching picture taken by camera after shot, > user can close camera app through x button of bottom's area. > > when camera closed by user action, and more, attaching picture not yet done, > > PContentPermissionRequest:: Msg___delete____ID is raised somewhere. > (i thinks that this message was raised to close camera after everything is > done.) > but, Actor(camera) was already closed by user action. > > so, PContentPermissionRequest:: Msg___delete____ID isn't processed by > receiver. Yes, tara and unagi can reproduce this same issue.
Flags: needinfo?(james.zhang)
(In reply to David Flanagan [:djf] from comment #21) > I've landed and uplifted Dale's patch, but could not reproduce the crash, so > I can't verify that the patch fixes it. James or Leo, would you verify that > the bug is fixed? Thank you , David, it works fine after applying your patch. But I think it is not a good way to fix this issue in gaia, because you can't prevent people to design different UI. I suggest that Sotaro can fix it in gecko.
I don't quite know what's going on here, but it seems like a bug if usage of DOM APIs can cause a child process to crash. I don't mind us working around it, but can you please file a bug explaining what's going wrong?
(In reply to Justin Lebar [:jlebar] from comment #24) > I don't quite know what's going on here, but it seems like a bug if usage of > DOM APIs can cause a child process to crash. I don't mind us working around > it, but can you please file a bug explaining what's going wrong? Hi Justin, I think the comment 12 is the root cause.
All I get from comment 12 is "we're getting a processing error and killing the child process," which doesn't explain /why/ we're getting that error. Do you understand more about what's going wrong and how we should fix it in Gecko?
(In reply to Justin Lebar [:jlebar] from comment #26) > All I get from comment 12 is "we're getting a processing error and killing > the child process," which doesn't explain /why/ we're getting that error. > Do you understand more about what's going wrong and how we should fix it in > Gecko? The receiver can't handle the ipc message if camera process is closed. Maybe we can handle this AsyncChannel error properly, instead of 'ASSERT'. Gecko : ###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv Gecko : ###!!! [Child][AsyncChannel] Error: Route error: message sent to unknown actor ID mozalloc_abort: [Child 3713] ###!!! ABORT: aborting because of fatal error: file /home/apuser/B2G_MONKEY_TEST/gecko/dom/ipc/ContentChild.cpp, line 1020 Justin, you can ask leo.bugzilla.gecko@gmail.com for more information and advice.
> Justin, you can ask leo.bugzilla.gecko@gmail.com for more information and advice. That's what I was trying to ask in comment 24. Dale, can you help us design a minimal testcase here?
Flags: needinfo?(dale)
attachment 750779 [details] is verified on leo.
Status: RESOLVED → VERIFIED
Flags: needinfo?(leo.bugzilla.gecko)
sorry, my mistake. just verified on leo
Status: VERIFIED → RESOLVED
Closed: 12 years ago12 years ago
(In reply to Justin Lebar [:jlebar] from comment #24) > I don't quite know what's going on here, but it seems like a bug if usage of > DOM APIs can cause a child process to crash. I don't mind us working around > it, but can you please file a bug explaining what's going wrong? Ok. Comment 12 is just my suggestion. (i think it is not good if ContentChild.cpp did abort as NS_RUNTIMEABORT() to process unhandled ipc message) anyway, the point of problems was fixed. if there needs any improvement, i will fill a new bug, let's talk there. thanks
(In reply to leo.bugzilla.gecko from comment #31) > (In reply to Justin Lebar [:jlebar] from comment #24) > > I don't quite know what's going on here, but it seems like a bug if usage of > > DOM APIs can cause a child process to crash. I don't mind us working around > > it, but can you please file a bug explaining what's going wrong? > > Ok. Comment 12 is just my suggestion. > (i think it is not good if ContentChild.cpp did abort as NS_RUNTIMEABORT() > to process unhandled ipc message) > > anyway, the point of problems was fixed. > if there needs any improvement, i will fill a new bug, let's talk there. > thanks Hi Leo, you're right, I catch another 'mozilla::dom::ContentChild::ProcessingError' monkey test crash after applying this patch, maybe crash in gallery according the snapshot. Please file a new bug and I'll attach the new crash.
In reply to leo.bugzilla.gecko from comment #31) > anyway, the point of problems was fixed. > if there needs any improvement, i will fill a new bug, let's talk there. > thanks branch to Bug 874338
David Flanagan and I were both unable to reproduce this bug in manual testing, agreed that the DOM shouldnt be able to cause gecko crashes, but it looks like the same core issue has been raised again in #874338, we shouldnt work around in gaia in future though, I assumed it was to help testings, not an actual 'fix'
Flags: needinfo?(dale)
This crash occurs again. And gaia contains Dale's patch. git log camera.js commit 76a8a13a8eea7ff98b1b80992b307bddb8f50daa Author: David Flanagan <dflanagan@mozilla.com> Date: Fri May 17 14:52:19 2013 -0700 Merge pull request #9826 from daleharvey/camera-monkey Bug 867025 - Disable cancel activity button when picture taken r=@davidflanagan(cherry picked from commit 771ad40f7f7fc3b36aae2ea0c25b091e29d626c6)
(In reply to James Zhang from comment #35) > This crash occurs again. And gaia contains Dale's patch. I hit this crash too, but filed a new bug 878366.
Flags: in-moztrap-
Verified fixed on: Device: Leo phone Build Identifier: 20130611074722 Update channel: leo/1.1.0/nightly Gecko: http://hg.mozilla.org/releases/mozilla-b2g18/rev/8d0562d20324 Gaia: 8c64e19b82d4e0490a7780232d3d6bd07d0ba9ec1370937291 Git commit info: 2013-06-11 07:54:51 OS version: 1.1.0.0-prerelease and Device: Unagi phone Build Identifier: 20130611074722 Update channel: unagi/1.1.0/beta Gecko: http://hg.mozilla.org/releases/mozilla-b2g18/rev/8d0562d20324 Gaia: 8c64e19b82d4e0490a7780232d3d6bd07d0ba9ec1370937291 Git commit info: 2013-06-11 07:54:51 OS version: 1.1.0.0-prerelease Steps to Reproduce: 1. Go to Settings -> Display -> Wallpaper. 2. Tap on the Edit icon. On the overlay screen Select from, tap on Camera. 3. When the camera app launches, tap on the Camera icon to take a photo. Expected Result: The camera app exits automatically. The Settings -> Display -> Wallpaper screen is displayed again. The wallpaper preview shows the photo just taken. The actual wallpaper has also been set to the photo just taken. Actual Result: Matches the expected result.
Status: RESOLVED → VERIFIED
Whiteboard: [b2g-crash][TD-8929][TD-23322][TD-24777][TD-25336][CR 486768] → [caf priority: p1][b2g-crash][TD-8929][TD-23322][TD-24777][TD-25336][CR 486768]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: