Closed Bug 916626 Opened 11 years ago Closed 11 years ago

Running out of pmem fast on hamachi

Categories

(Core :: Graphics, defect, P2)

ARM
Gonk (Firefox OS)
defect

RESOLVED WORKSFORME

People

(Reporter: bjacob, Assigned: jrmuizel)

Details

(Keywords: perf, regression, Whiteboard: [MemShrink:P2] [c=memory p= s=2013.12.06 u=])

Attachments

(2 files, 2 obsolete files)

STR:
- use B2G master (m-c 9366ee039645)
- make sure that tiling is disabled (layers.force-tiles = false), which will soon be the default again (bug 916112)
- start the CubeVid app
- result: CubeVid is very slow, and in adb logcat we see a lot of errors like this:

E/memalloc( 144): /dev/pmem: No more pmem available
E/msm7627a.gralloc( 144): gralloc failed err=Out of memory
W/GraphicBufferAllocator( 144): alloc(960, 720, 1, 00000300, ...) failed -12 (Out of memory)
I/Gecko ( 431): Unexpected non-Gralloc SharedSurface in IPC path!
Depends on: 916627
Jason: I really don't know whether this is a recent regression, but if it is, that would be very useful to know --- even only a very vague idea of the regression date would help.
blocking-b2g: --- → koi?
Keywords: regression
Keywords: perf
With a recent mozilla-central on unagi I don't see these messages in the logcat. Can anyone reproduce it? The framerate is rather bad, though I don't know what framerate we should expect here.
The Unagi build does not have the change from Bug 905784. Without it, when Unagi runs out of pmem, gralloc falls back from pmem to system memory.
QA Contact: sarsenyev
The hamachi that I am testing on has an 8 M pmem limit, and it is perhaps not unreasonable that the CubeVid app exhausts that quickly. So I am not sure at the moment that this bug report is valid. Sotaro, do you think that we should close this bug as INVALID or WONTFIX? Should we instead discuss how we can make better use of the available pmem?
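For a rough sense of scale, here is a minimal sketch of the arithmetic behind that estimate. It assumes that "8 M" means 8 MiB, that the third argument in the `alloc(...)` log line is the Android pixel-format code (1 = RGBA_8888, 2 = RGBX_8888, both 4 bytes per pixel), and that row-stride padding can be ignored. The names `buffer_bytes` and `PMEM_LIMIT` are illustrative only.

```python
# Assumed bytes per pixel for the pixel-format codes seen in the logs
# (1 = RGBA_8888, 2 = RGBX_8888). Row-stride padding is ignored.
BYTES_PER_PIXEL = {1: 4, 2: 4}

# Assumption: the "8 M" pmem limit is 8 MiB.
PMEM_LIMIT = 8 * 1024 * 1024


def buffer_bytes(width, height, fmt):
    """Lower-bound size of one gralloc buffer, ignoring stride alignment."""
    return width * height * BYTES_PER_PIXEL[fmt]


size = buffer_bytes(960, 720, 1)
print(size)                # 2764800 bytes for one 960x720 buffer
print(PMEM_LIMIT // size)  # 3 such buffers fit in the assumed pool
```

Under these assumptions, three 960x720 buffers nearly fill the pool before anything else on the device has allocated pmem.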
Flags: needinfo?(sotaro.ikeda.g)
Before we flag it as invalid or wontfix, I think we need to get an explicit list of exactly where the pmem is going. What buffers are being allocated by what, and what they're being used for.
I got a similar "pmem" error without launching any apps on the latest Buri master build. After connecting to adb, I got these errors:

09-16 18:18:42.850: E/memalloc(140): /dev/pmem: No more pmem available
09-16 18:18:42.850: E/msm7627a.gralloc(140): gralloc failed err=Out of memory
09-16 18:18:42.850: W/GraphicBufferAllocator(140): alloc(256, 120, 1, 00000133, ...) failed -12 (Out of memory)
09-16 18:18:42.860: E/memalloc(140): /dev/pmem: No more pmem available
09-16 18:18:42.860: E/msm7627a.gralloc(140): gralloc failed err=Out of memory
09-16 18:18:42.860: W/GraphicBufferAllocator(140): alloc(320, 20, 2, 00000133, ...) failed -12 (Out of memory)
09-16 18:18:42.860: E/memalloc(140): /dev/pmem: No more pmem available
09-16 18:18:42.860: E/msm7627a.gralloc(140): gralloc failed err=Out of memory
09-16 18:18:42.860: W/GraphicBufferAllocator(140): alloc(320, 20, 2, 00000133, ...) failed -12 (Out of memory)
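As a starting point for the "where is the pmem going" question, the failed requests in a logcat capture can be tallied with a short script. This is only a sketch: the regular expression is inferred from the `GraphicBufferAllocator` lines quoted in this bug, not from a format specification, and `failed_allocs` is a hypothetical helper name. It shows which requests failed, not which buffers currently hold the pmem.

```python
import re
from collections import Counter

# Pattern inferred from the warning lines quoted in this bug:
#   alloc(<width>, <height>, <format>, <usage hex>, ...) failed <errno>
ALLOC_FAIL = re.compile(
    r"alloc\((\d+), (\d+), (\d+), ([0-9a-fA-F]{8}), \.\.\.\) failed (-?\d+)"
)


def failed_allocs(log_text):
    """Count failed allocations per (width, height, format, usage)."""
    counts = Counter()
    for match in ALLOC_FAIL.finditer(log_text):
        width, height, fmt, usage, _err = match.groups()
        counts[(int(width), int(height), int(fmt), usage)] += 1
    return counts


sample = """\
W/GraphicBufferAllocator(140): alloc(256, 120, 1, 00000133, ...) failed -12 (Out of memory)
W/GraphicBufferAllocator(140): alloc(320, 20, 2, 00000133, ...) failed -12 (Out of memory)
W/GraphicBufferAllocator(140): alloc(320, 20, 2, 00000133, ...) failed -12 (Out of memory)
"""

for key, count in failed_allocs(sample).most_common():
    print(key, count)
```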
(In reply to Benoit Jacob [:bjacob] from comment #4)
> Sotaro, do you think that we should close this bug as INVALID or WONTFIX?
> Should we instead discuss how we can make a better usage of the available
> pmem?

From Comment 6, it seems better to analyze the pmem usage. In the past, only MozBuild ROMs implicitly fell back from gralloc to pmem. Since August 28th, MozBuild ROMs no longer do the implicit fallback (Bug 905784).
Flags: needinfo?(sotaro.ikeda.g)
I couldn't reproduce this issue with the latest m-c on a leo device. Could you dump the info below to check for a pmem leak?

peter@peter-desktop:~/b2g_debug_leo$ bps
APPLICATION      USER     PID  PPID  VSIZE   RSS    WCHAN     PC          NAME
b2g              root     134  1     200240  66128  ffffffff  40109604 S  /system/b2g/b2g
Usage            app_379  379  134   68408   27724  ffffffff  4007f604 S  /system/b2g/plugin-container
Homescreen       app_384  384  134   79952   35848  ffffffff  40099604 S  /system/b2g/plugin-container
CubeVid          app_461  461  134   126828  42108  ffffffff  40b01d9a R  /system/b2g/plugin-container
plugin-containe  root     508  134   40404   4612   00000000  b0003958 R  /system/b2g/plugin-container

peter@peter-desktop:~/b2g_debug_leo$ adb shell lsof 461|grep -E "pmem|PID"
COMMAND    PID  USER     FD  TYPE  DEVICE  SIZE/OFF  NODE  NAME
plugin-co  461  app_461  32  ???   ???     ???       ???   /dev/pmem
plugin-co  461  app_461  42  ???   ???     ???       ???   /dev/pmem
plugin-co  461  app_461  44  ???   ???     ???       ???   /dev/pmem
bjacob, yesterday you asked how a layer's buffers are released when an application goes to the background. It happens as follows:

PresShell::SetIsActive(false);
->TabChild::MakeHidden()
->PuppetWidget::Show(false)
->ClientLayerManager::ClearCachedResources()
->ClientLayerManager::ClearLayer()
This is a bug if we should not be using that much pmem, but if we run out because we need it all, then it's a different issue. Jeff, can you help figure out which it is?
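To compare captures over time, the number of open /dev/pmem descriptors per process can be counted from the `lsof` text. The sketch below assumes the column layout shown in the output above (PID in the second column, path in the last); `count_pmem_fds` is a hypothetical helper. A descriptor count is only a hint of a leak, since it says nothing about the size of each mapping.

```python
def count_pmem_fds(lsof_output):
    """Count open /dev/pmem descriptors per PID in `lsof` text output."""
    counts = {}
    for line in lsof_output.splitlines():
        fields = line.split()
        # Skip the header row and any line that does not name /dev/pmem.
        if len(fields) < 2 or fields[-1] != "/dev/pmem":
            continue
        if not fields[1].isdigit():
            continue
        pid = int(fields[1])
        counts[pid] = counts.get(pid, 0) + 1
    return counts


sample = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
plugin-co 461 app_461 32 ??? ??? ??? ??? /dev/pmem
plugin-co 461 app_461 42 ??? ??? ??? ??? /dev/pmem
plugin-co 461 app_461 44 ??? ??? ??? ??? /dev/pmem
"""

print(count_pmem_fds(sample))  # {461: 3}
```

A count that keeps growing while the app sits idle would be worth a closer look.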
Assignee: nobody → jmuizelaar
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(jmuizelaar)
Blocks: 915120
Attached patch Improved version (obsolete) — Splinter Review
This is just an improved version that dumps automatically on out-of-pmem, as an HTML page also containing a layermanagers dump.
Oh, note that this applies to the mozilla-aurora tree. It includes BenWa's patch making layers dumping easier, which is already on central, so it will conflict there.
Improved again, doing the logging at the right time so we don't get locking failures.
Attachment #812237 - Attachment is obsolete: true
Improved some more (reports on gralloc buffers that are unreferenced from the layer tree... haven't found any so far).
Attachment #812268 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Whiteboard: [MemShrink] [c=memory p= s= u=]
I am able to see the issue on an earlier Buri engineering build, but I have no way to flash the device with other earlier builds to find the regression window. The earliest engineering build available is 9/14:

Build ID: 20130914040202
Gecko: http://hg.mozilla.org/mozilla-central/rev/53d5e43e23cc
Gaia: 3f51f302c3a0c57d8bad482ec7ee86b2819389fb
Platform Version: 26.0a1
Just ran into this at the Mozilla Summit: if I use a feature of the Summit app that requires Persona login, when I touch the email field, the keyboard appears for a moment and then the screen goes completely black. logcat as follows:

E/memalloc( 2238): /dev/pmem: No more pmem available
E/msm7627a.gralloc( 2238): gralloc failed err=Out of memory
W/GraphicBufferAllocator( 2238): alloc(320, 480, 1, 00000133, ...) failed -12 (Out of memory)

Interestingly, if I rotate the device and then return it to its original orientation, the screen returns to normal and the field can be used. Warning: I'm currently using mozilla-central gecko with 1.0.1 gonk, because of how difficult it is to flash these devices.
The bug that is "why are we running out quickly" is not a blocker for 1.2, but we will investigate again for 1.3. We will deal with the fact that we're running out in other blockers.
blocking-b2g: koi? → 1.3?
Whiteboard: [MemShrink] [c=memory p= s= u=] → [MemShrink:P2] [c=memory p= s= u=]
To get the regression window, you should be able to do this on a production build as follows:
1. Go to http://jds2501.github.io/webapi-permissions-tests/
2. Install "Packaged App Test Case 22"
3. Launch the app and execute the STR in comment 0
All 1.2 central builds are affected. When running the CubeVid app, logcat shows the errors "No more pmem available" and "Out of memory":

10-18 19:15:43.839: E/memalloc(137): /dev/pmem: No more pmem available
10-18 19:15:43.839: E/msm7627a.gralloc(137): gralloc failed err=Out of memory
10-18 19:15:43.839: W/GraphicBufferAllocator(137): alloc(960, 720, 1, 00000300, ...) failed -12 (Out of memory)

Device: Buri v1.2 Mozilla central
BuildID: 20130621031231
Gaia: e2f19420fa6a26c4287588701efaec09a750dba1
Gecko: 7ba8c86f1a56
Version: 24.0a1
Firmware Version: US_20131015
The issue doesn't reproduce on the latest 1.1 build on Leo. The logcat doesn't show any "out of memory" errors.

Device: Leo 1.1 Moz RIL
BuildID: 20131018041205
Gaia: 39b0203fa9809052c8c4d4332fef03bbaf0426fc
Gecko: 7fedb6a967ea
Version: 18.0
Firmware Version: US_20130912
I am seeing out-of-pmem errors when unlocking hamachi with a 1.2 build:

E/memalloc( 475): /dev/pmem: No more pmem available
E/msm7627a.gralloc( 475): gralloc failed err=Out of memory
W/GraphicBufferAllocator( 475): alloc(640, 960, 2, 00000133, ...) failed -12 (Out of memory)
We're still trying to get the OEM ROM with the proper fallback fix.
Our TAMs got me a new image for helix within 24h after I asked them. Did you talk to them about hamachi?
(In reply to Milan Sreckovic [:milan] from comment #18)
> The bug that is "why are we running out quickly" is not a blocker for 1.2,
> but we will investigate again for 1.3. We will deal with the fact that
> we're running out in other blockers.

Milan, can we commit to this for 1.3, i.e. 1.3+ this?
Priority: -- → P2
At this point, this is really a meta bug saying "try to reduce graphics memory usage". The fallback was handled in another bug, and that is fixed. I don't think this is a valid measurable bug anymore. Benoit, does the crash reproduce for you, or do you see the proper fallback?
Flags: needinfo?(bjacob)
The original STR are about the CubeVid application, but I don't seem to have it anymore in my current B2G master build. I checked a few WebGL pages in out-of-pmem conditions and the fallback to non-pmem seems to be working as intended, with WebGL working as expected. I get this in logcat:

E/memalloc( 139): /dev/pmem: No more pmem available
W/memalloc( 139): Falling back to ashmem
D/wpa_supplicant( 554): RTM_NEWLINK: operstate=1 ifi_flags=0x11043 ([UP][RUNNING][LOWER_UP])
D/wpa_supplicant( 554): RTM_NEWLINK, IFLA_IFNAME: Interface 'wlan0' added
D/wpa_supplicant( 554): nl80211: if_removed already cleared - ignore event
E/memalloc( 139): /dev/pmem: No more pmem available
W/memalloc( 139): Falling back to ashmem
E/msm7627a.hwcomposer( 139): hwc_set: Unable to render by hwc due to non-pmem memory
E/msm7627a.hwcomposer( 139): hwc_set: Unable to render by hwc due to non-pmem memory
I/Gonk ( 139): RIL[0]: OnConnectSuccess
I/Gonk ( 139): RIL[0]: OnDisconnect
I/Gecko ( 767): SharedSurface_Gralloc::Create -------
E/msm7627a.hwcomposer( 139): hwc_set: Unable to render by hwc due to non-pmem memory
E/memalloc( 139): /dev/pmem: No more pmem available
W/memalloc( 139): Falling back to ashmem
I/Gonk ( 139): RIL[0]: OnConnectSuccess
I/Gecko ( 767): SharedSurface_Gralloc::Create: success -- surface 0x4337c540, GraphicBuffer 0x43d78380.
I/Gonk ( 139): RIL[0]: OnDisconnect
E/msm7627a.hwcomposer( 139): hwc_set: Unable to render by hwc due to non-pmem memory
E/msm7627a.hwcomposer( 139): hwc_set: Unable to render by hwc due to non-pmem memory
E/msm7627a.hwcomposer( 139): hwc_set: Unable to render by hwc due to non-pmem memory
I/Gecko ( 767): SharedSurface_Gralloc::Create -------
I/Gecko ( 767): SharedSurface_Gralloc::Create: success -- surface 0x4337ca80, GraphicBuffer 0x40422500.
E/msm7627a.hwcomposer( 139): hwc_set: Unable to render by hwc due to non-pmem memory

And my spinning cube is spinning.
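To tell the "proper fallback" case apart from the original hard failure in a longer capture, the relevant messages can be tallied. This is a sketch that matches on the message substrings quoted in this bug; `summarize_pmem_events` is a hypothetical helper, and the strings may differ on other gralloc builds.

```python
def summarize_pmem_events(log_text):
    """Tally pmem exhaustion, ashmem fallback, and hard gralloc failures."""
    summary = {"out_of_pmem": 0, "ashmem_fallback": 0, "gralloc_failed": 0}
    for line in log_text.splitlines():
        if "No more pmem available" in line:
            summary["out_of_pmem"] += 1
        elif "Falling back to ashmem" in line:
            summary["ashmem_fallback"] += 1
        elif "gralloc failed" in line:
            summary["gralloc_failed"] += 1
    return summary


sample = """\
E/memalloc( 139): /dev/pmem: No more pmem available
W/memalloc( 139): Falling back to ashmem
E/msm7627a.hwcomposer( 139): hwc_set: Unable to render by hwc due to non-pmem memory
E/memalloc( 139): /dev/pmem: No more pmem available
W/memalloc( 139): Falling back to ashmem
"""

result = summarize_pmem_events(sample)
print(result)
# Every exhaustion was followed by a fallback and nothing failed outright.
print(result["gralloc_failed"] == 0
      and result["out_of_pmem"] == result["ashmem_fallback"])
```

A capture with the fix should show a fallback for each exhaustion and no "gralloc failed" lines; the logs from comment 0 would instead show "gralloc failed" with no fallback.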
Flags: needinfo?(bjacob)
So should this be closed as works for me then?
I think so, but I would rather have Milan and/or Sotaro make that call.
(In reply to Benoit Jacob [:bjacob] from comment #29)
> I think so, but I would rather have Milan and/or Sotaro make that call.

Milan - Can you confirm?
Flags: needinfo?(milan)
The way it's written, we'll close it as worksforme. We have an overarching goal to reduce memory usage, but we need to make that a bit more actionable before we open a bug for it.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: needinfo?(milan)
Resolution: --- → WORKSFORME
blocking-b2g: 1.3? → ---
Whiteboard: [MemShrink:P2] [c=memory p= s= u=] → [MemShrink:P2] [c=memory p= s=2013.12.06 u=]