Closed
Bug 1057220
Opened 10 years ago
Closed 10 years ago
Failed to map ion memory in the client on gonk
Categories
(Core :: Graphics: Layers, defect, P1)
Tracking
()
People
(Reporter: tkundu, Assigned: sotaro)
References
Details
(Keywords: crash, Whiteboard: [b2g-crash][caf-crash 217][caf priority: p1][CR 686674])
Attachments
(6 files, 1 obsolete file)
574.59 KB,
text/plain
|
Details | |
3.01 KB,
patch
|
Details | Diff | Splinter Review | |
3.04 KB,
patch
|
Details | Diff | Splinter Review | |
5.31 KB,
patch
|
Details | Diff | Splinter Review | |
1.31 KB,
patch
|
Details | Diff | Splinter Review | |
833.63 KB,
application/x-bzip
|
Details |
+++ This bug was initially created as a clone of Bug #1038461 +++
+++ This bug was initially created as a clone of Bug #1034294 +++
This bug is created from Bug 1034294 comment 32.
Logs :
---------------------------------------------------------
07-14 16:25:00.210 8506 8536 E qdmemalloc: ion: Failed to map memory in the client: Invalid argument
07-14 16:25:00.210 8506 8536 E qdgralloc: Could not mmap handle 0xb0c55c40, fd=45 (Invalid argument)
07-14 16:25:00.210 8506 8536 E qdgralloc: gralloc_register_buffer: gralloc_map failed
07-14 16:25:00.210 8506 8536 W GraphicBufferMapper: registerBuffer(0xb0c55c40) failed -22 (Invalid argument)
07-14 16:25:00.210 8506 8536 E GraphicBuffer: unflatten: registerBuffer failed: Invalid argument (-22)
07-14 16:25:00.210 8506 8536 I Gecko : ParamTraits<MagicGrallocBufferHandle>::Read() failed to get gralloc buffer
07-14 16:25:00.220 8506 8536 I Gecko : IPDL protocol error: Error deserializing 'MaybeMagicGrallocBufferHandle'
07-14 16:25:00.220 8506 8536 I Gecko : [Child 8506] ###!!! ABORT: IPDL error [PSharedBufferManagerChild]: "Error deserializing 'MaybeMagicGrallocBufferHandle'". abort()ing as a result.: file ../../../../../../../../gecko/ipc/glue/ProtocolUtils.cpp, line 198
07-14 16:25:00.230 232 910 W Adreno-GSL: <gsl_ldd_control:412>: ioctl fd 141 code 0xc02c093d (IOCTL_KGSL_SUBMIT_COMMANDS) failed: errno 22 Invalid argument
07-14 16:25:00.230 8506 8536 E Gecko : mozalloc_abort: [Child 8506] ###!!! ABORT: IPDL error [PSharedBufferManagerChild]: "Error deserializing 'MaybeMagicGrallocBufferHandle'". abort()ing as a result.: file ../../../../../../../../gecko/ipc/glue/ProtocolUtils.cpp, line 198
Crash logs: Please see attachment in bug 1034294 comment 32
Reporter | ||
Comment 1•10 years ago
|
||
Issue is reproduced again with fix from bug 1053204 on msm8926 1GB device
We attached logs in Comment 0 when this happened again.
We asked our internal test team to reproduce it again with additional logging patch from
bug 1038461 Comment 80 ?
Please let us know if you want us to reproduce it with any other logs. I have a big doubt that gecko is closing some FD by mistake and that is causing this issue.
Flags: needinfo?(sotaro.ikeda.g)
Comment 2•10 years ago
|
||
Observed on:
Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.066
Moz BuildID: 20140810160202
B2G Version: ---
Gecko Version: 32.0
Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=de28796a8956a48bb98ca67df6a33e0622d642d1
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2b27becae85092d46bfadcd4fb5605e82e1e1093
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)
Assignee | ||
Comment 3•10 years ago
|
||
(In reply to Tapas Kumar Kundu from comment #1)
>
> We asked our internal test team to reproduce it again with additional
> logging patch from
> bug 1038461 Comment 80 ?
Thanks for re-testing:-) Right now, the log paches are enough. If we got more information, we could add more logs.
Assignee | ||
Comment 4•10 years ago
|
||
From attachment 8477215 [details], the crash happened at Setting app.
Assignee | ||
Comment 5•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #4)
> From attachment 8477215 [details], the crash happened at Setting app.
But from Bug 1038461, the crash is caused by b2g process's close(). It is not clear if the Setting app is related to the crash.
Reporter | ||
Comment 7•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #4)
> From attachment 8477215 [details], the crash happened at Setting app.
Agreed. I also notice same :) .
Updated•10 years ago
|
blocking-b2g: 2.0? → 2.0+
No longer blocks: CAF-v2.1-FC-metabug
Reporter | ||
Comment 8•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #3)
> (In reply to Tapas Kumar Kundu from comment #1)
> >
> > We asked our internal test team to reproduce it again with additional
> > logging patch from
> > bug 1038461 Comment 80 ?
>
> Thanks for re-testing:-) Right now, the log paches are enough. If we got
> more information, we could add more logs.
bionic patch in bug 1038461 Comment 80 is not working and causing some boot hang issue on device. Could you please take a look ? Thanks
Flags: needinfo?(sotaro.ikeda.g)
Assignee | ||
Comment 9•10 years ago
|
||
(In reply to Tapas Kumar Kundu from comment #8)
> (In reply to Sotaro Ikeda [:sotaro] from comment #3)
> > (In reply to Tapas Kumar Kundu from comment #1)
> >
> > Thanks for re-testing:-) Right now, the log paches are enough. If we got
> > more information, we could add more logs.
>
> bionic patch in bug 1038461 Comment 80 is not working and causing some boot
> hang issue on device. Could you please take a look ? Thanks
Wasn't the patch worked in the past on your side, was it? On my side, the patch works without problem on v2.0 flame. I confirmed it just now.
Flags: needinfo?(sotaro.ikeda.g)
Assignee | ||
Comment 10•10 years ago
|
||
There seems 2 possibility why the patch makes problem.
- [1] patch to close() makes bad thing.
- [2] logout in eartly stage of booting makes bad thing.
Assignee | ||
Comment 11•10 years ago
|
||
Tapas, can you try an updated bionic patch? I updated the bionic patch. It is attachment 8477769 [details] [diff] [review] in bug 1038461.
Flags: needinfo?(tkundu)
Assignee | ||
Comment 12•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #10)
> - [2] logout in eartly stage of booting makes bad thing.
I faced this problem during creating the kernel log patch. I am not sure this could happen from user side logout.
Assignee | ||
Comment 13•10 years ago
|
||
Tapas, if attachment 8477769 [details] [diff] [review] does not work, can you test attachment 8477772 [details] [diff] [review] in bug 1038461? It does no logout, it just implement close() with c instead of assembler. I create it to make clear an cause of the boot problem between [1] and [2] in Comment 10.
Reporter | ||
Comment 14•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #11)
> Tapas, can you try an updated bionic patch? I updated the bionic patch. It
> is attachment 8477769 [details] [diff] [review] in bug 1038461.
New bionic patch works. But new bionic patch won't confirm us if FD is closed accidentally .
Reporter | ||
Updated•10 years ago
|
Flags: needinfo?(tkundu)
Reporter | ||
Updated•10 years ago
|
Flags: needinfo?(sotaro.ikeda.g)
Assignee | ||
Comment 15•10 years ago
|
||
(In reply to Tapas (always NI me) from comment #14)
> (In reply to Sotaro Ikeda [:sotaro] from comment #11)
> > Tapas, can you try an updated bionic patch? I updated the bionic patch. It
> > is attachment 8477769 [details] [diff] [review] in bug 1038461.
>
> New bionic patch works. But new bionic patch won't confirm us if FD is
> closed accidentally .
The bionic patch do not have a way to recognize b2g. There seems no other good way now.
Before the crash, close() failure should happen randomly. If we are lucky enough, an incorrect close could be in a log. Can you run test this weekend with the current patches?
Bug 1053277 seems to make close() to recognize b2g.
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(tkundu)
Assignee | ||
Comment 16•10 years ago
|
||
wrap a close() in gecko and add logout.
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(tkundu)
Assignee | ||
Comment 17•10 years ago
|
||
Tapas, can you try attachment 8477789 [details] [diff] [review]? It could add close() log that is called from gecko.
Flags: needinfo?(tkundu)
Reporter | ||
Comment 18•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #17)
> Tapas, can you try attachment 8477789 [details] [diff] [review]? It could
> add close() log that is called from gecko.
Hi Sotaro,
Did you test this patch in your device ? I can seeing close syscalls logs are coming from gecko in logcat but homescreen is not showing any icons in device after enabling these logs.
I think that logging all close syscall() in gecko has some overhead and this is causing this issue.
Could you please change your patch so that it prints a table to close syscall() and device timestamps every 10secs . Please test this patch on your device 1st and confirm that it does not cause additional issue for us.
Thanks a lot for your help
Flags: needinfo?(sotaro.ikeda.g)
Reporter | ||
Updated•10 years ago
|
Flags: needinfo?(tkundu)
Reporter | ||
Comment 19•10 years ago
|
||
Hi Sotaro, I also tried to log all close syscall using strace but b2g process gives segmentation fault.
Please give us a minimum logging patch for close syscall and test it in your build fully.
Reporter | ||
Comment 20•10 years ago
|
||
Sotaro, IMO, we should try to store timestamps of close syscall with FD in memory and print it in regular interval when CPU usage is low. WDYT ? Please give us a patch and test it in your build 1st.
Assignee | ||
Comment 21•10 years ago
|
||
(In reply to Tapas (always NI me) from comment #18)
> (In reply to Sotaro Ikeda [:sotaro] from comment #17)
> > Tapas, can you try attachment 8477789 [details] [diff] [review]? It could
> > add close() log that is called from gecko.
>
> Hi Sotaro,
>
> Did you test this patch in your device ? I can seeing close syscalls logs
> are coming from gecko in logcat but homescreen is not showing any icons in
> device after enabling these logs.
Yes, I tested on v2.0 flame. And it works as expected.
Flags: needinfo?(sotaro.ikeda.g)
Assignee | ||
Comment 22•10 years ago
|
||
When I put patches to bugzilla, basically all patches are tested locally if possible.
Assignee | ||
Comment 23•10 years ago
|
||
(In reply to Tapas (always NI me) from comment #20)
> Sotaro, IMO, we should try to store timestamps of close syscall with FD in
> memory and print it in regular interval when CPU usage is low. WDYT ? Please
> give us a patch and test it in your build 1st.
I think that it seems more efficient if we just find the code that is doing invalid fd close(). From the experience of Bug 1057220, before fd is closed incorrectly, a lot of invalid fd closes happened. If b2g process is intentionally crashed when invalid fd close happen, we could find out the place of the bug.
Assignee | ||
Comment 24•10 years ago
|
||
The patch cause the crash if gecko close invalid fd.
Assignee | ||
Comment 25•10 years ago
|
||
Reduce logout.
Attachment #8477860 -
Attachment is obsolete: true
Assignee | ||
Comment 26•10 years ago
|
||
I put attachment 8477860 [details] [diff] [review] to try server. Two tests were failed because of invalid fd close, but tryserver logs did not have correct decoded stack dump :-(
https://tbpl.mozilla.org/?tree=Try&rev=a0f0d7c3afc8
Assignee | ||
Comment 27•10 years ago
|
||
Tapas, how do you think about Comment 23?
Flags: needinfo?(tkundu)
Assignee | ||
Comment 28•10 years ago
|
||
(In reply to Tapas (always NI me) from comment #8)
> > Thanks for re-testing:-) Right now, the log paches are enough. If we got
> > more information, we could add more logs.
>
> bionic patch in bug 1038461 Comment 80 is not working and causing some boot
> hang issue on device. Could you please take a look ? Thanks
I confirmed it when the patch is applied to kk-flame. But the problem did not happen on JB flame. I suspect it is a KK gonk or kernal's problem.
A week ago, mozilla flameKK build ROM's full flash did not boot at all. Today I tried built flame KK ROM's full flash. It booted correctly, but some capabilities(ex bluetooth) still did not work correctly.
Assignee | ||
Comment 29•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #28)
> (In reply to Tapas (always NI me) from comment #8)
> > > Thanks for re-testing:-) Right now, the log paches are enough. If we got
> > > more information, we could add more logs.
> >
> > bionic patch in bug 1038461 Comment 80 is not working and causing some boot
> > hang issue on device. Could you please take a look ? Thanks
>
> I confirmed it when the patch is applied to kk-flame. But the problem did
> not happen on JB flame. I suspect it is a KK gonk or kernal's problem.
But mozilla's flame KK build did not cause the problem of Comment 18.
Reporter | ||
Comment 30•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #27)
> Tapas, how do you think about Comment 23?
Nice idea. I assumed that patch in Comment 25 is for this only .
Flags: needinfo?(tkundu) → needinfo?(sotaro.ikeda.g)
Assignee | ||
Comment 31•10 years ago
|
||
(In reply to Tapas (always NI me) from comment #30)
> (In reply to Sotaro Ikeda [:sotaro] from comment #27)
> > Tapas, how do you think about Comment 23?
>
> Nice idea. I assumed that patch in Comment 25 is for this only .
Yes.
But this patch track only gecko's fd close. If the problem happens on other gonk modules in b2g process, the problem can not be tracked. Therefore, bionic's invalid fd close check only in b2g process becomes necessary for further checks. I am going to create a patch of doing it.
Flags: needinfo?(sotaro.ikeda.g)
Assignee | ||
Comment 32•10 years ago
|
||
I confirmed that why the bionic patch does not work on KK. Since KK, __libc_write_log() calls openg() and close() in every function call. Therefor, since applying the patch, close() recursively call close() forever. On JB, fd is hold as static variable and close() is not called.
Assignee | ||
Comment 33•10 years ago
|
||
From comment 32, KK's __libc_write_log() calls open() and close() for the every function call. KK's __libc_write_log() takes longer time than JB's one.
Assignee | ||
Comment 34•10 years ago
|
||
The patch provides the following capability.
- logout close() log only on b2g and b2g content process.
- rollback __libc_write_log() implementation to JB's one.
- abort if close() close invalid fd
Assignee | ||
Comment 35•10 years ago
|
||
gecko part of attachment 8477910 [details] [diff] [review].
Assignee | ||
Comment 36•10 years ago
|
||
By applying both the following patches, I confirmed flame-KK works without problem with the logout.
- part 1: attachment 8477910 [details] [diff] [review]
- part 2: attachment 8477910 [details] [diff] [review]
Assignee | ||
Comment 37•10 years ago
|
||
Tapas, can you confirm that the patches in Comment 36 works. They can check more wide area than attachment 8477861 [details] [diff] [review].
Flags: needinfo?(tkundu)
Assignee | ||
Comment 38•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #36)
> By applying both the following patches, I confirmed flame-KK works without
> problem with the logout.
> - part 1: attachment 8477910 [details] [diff] [review]
> - part 2: attachment 8477910 [details] [diff] [review]
sorry, Correction:
- part 1: attachment 8477910 [details] [diff] [review]
- part 2: attachment 8477911 [details] [diff] [review]
Reporter | ||
Comment 39•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #38)
> (In reply to Sotaro Ikeda [:sotaro] from comment #36)
> > By applying both the following patches, I confirmed flame-KK works without
> > problem with the logout.
> > - part 1: attachment 8477910 [details] [diff] [review]
> > - part 2: attachment 8477910 [details] [diff] [review]
>
> sorry, Correction:
> - part 1: attachment 8477910 [details] [diff] [review]
> - part 2: attachment 8477911 [details] [diff] [review]
This works fine .I asked our test team to reproduce again with these patches.
Reporter | ||
Comment 40•10 years ago
|
||
This is the log collected from device which was running some stability test only to reproduce this bug .
Stack trace is different but I am seeing all stacktraces is coming from
libc.so!close [close.c : 50 + 0x3]
i.e.
following code with your bionic patches :
int close(int fd)
{
if (sIsB2GProcess) {
__libc_format_log(ANDROID_LOG_WARN, "libc", "close() fd %d", fd);
}
int ret;
ret = __close(fd);
if (sIsB2GProcess && (ret < 0) && (fd != -1)) {
__libc_format_log(ANDROID_LOG_WARN, "libc", "close() ret %d errno %d fd %d", ret, -errno, fd);
abort(); =================> This is line number 50 in stack trace.
}
return ret;
}
This confirm us that we are closing invalid FD in 3 more places (3 stacktraces) ?
Could you please fix it ?
Flags: needinfo?(tkundu) → needinfo?(sotaro.ikeda.g)
Blocks: 1058366
Depends on: 1058368
Two of the stacks are the same and in bluetooth. I filed bug 1058366 for that. The other one is different, and I filed bug 1058368 for that.
Assignee | ||
Comment 42•10 years ago
|
||
(In reply to Tapas (always NI me) from comment #40)
>
> This confirm us that we are closing invalid FD in 3 more places (3
> stacktraces) ?
>
> Could you please fix it ?
The related bugs are already created by khuey, thanks!
Flags: needinfo?(sotaro.ikeda.g)
Comment 43•10 years ago
|
||
log patches part 1 and part 2 are applied and built into AU 73.
Assignee | ||
Comment 45•10 years ago
|
||
Flags: needinfo?(tkundu)
Reporter | ||
Comment 46•10 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #45)
> Tapas, can you do re-testing by adding attachment 8479910 [details] [diff] [review]
> [review] in Bug 1054929?
we are landing this patch internally and we will update here soon :)
Comment 47•10 years ago
|
||
We suspect that there are still issues with duplicate fd close and so would like to keep this bug open to track as we plan to debug further with bionic patch.
Comment 48•10 years ago
|
||
Greg/Tapas -- discussed with Bhavana and lets close this bug and reopen if we find any future issues. Currently there is no pending bug here.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
status-b2g-v2.0:
--- → fixed
status-b2g-v2.1:
--- → fixed
status-b2g-v2.2:
--- → unaffected
status-firefox32:
--- → wontfix
status-firefox33:
--- → wontfix
status-firefox34:
--- → fixed
Target Milestone: --- → mozilla34
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(tkundu)
You need to log in
before you can comment on or make changes to this bug.
Description
•