Closed Bug 1029856 Opened 10 years ago Closed 10 years ago

Potential memory leak in b2g process.

Categories

(Core :: Graphics: Layers, defect)

2.0 Branch
ARM
Gonk (Firefox OS)
defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla33
blocking-b2g 1.4+
Tracking Status
firefox31 --- wontfix
firefox32 --- fixed
firefox33 --- fixed
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: sushilchauhan, Assigned: sotaro)

References

()

Details

(Keywords: memory-footprint, perf, regression, Whiteboard: [MemShrink])

Attachments

(2 files, 6 obsolete files)

When HWC is enabled, during Camcorder recording on device (similar to Flame), it seems there is a memory leak in b2g process.

These 2 values continuously increase:
cat /sys/class/kgsl/kgsl/page_alloc
cat /sys/class/kgsl/kgsl/pagetables/<<b2g-pid>>/mapped

When I exit Camera App:
cat /sys/class/kgsl/kgsl/page_alloc: It comes back to ~ 5 MB.
cat /sys/class/kgsl/kgsl/pagetables/<<b2g-pid>>/mapped : It remains at that higher value.

So in the b2g process, the kgsl/GPU memory which has been mapped for textures during camcorder recording is not getting freed/un-mapped, hence memory leak.
When HWC is disabled, it is not observed and "/sys/class/kgsl/kgsl/page_alloc" value always remains ~ 5 MB during Camcorder recording. This issue has been observed on b2g v2.0 and should be present on v1.4 as well. So, due to 256 MB restrictions, the Camera App gets killed after around 15 mins of recording.
Sotaro, do you know if any recent Gecko change related to buffer handling (acquire/release) could have introduced this memory leak ?
Flags: needinfo?(sotaro.ikeda.g)
blocking-b2g: --- → 2.0?
Present in 1.4 so we should fix it there.
blocking-b2g: 2.0? → 1.4+
I don't think we can be leaking actual images here. We would run out of memory much quicker. My guess is we are leaking _handles_ here instead. The kernel probably does some sort of malloc for handles and thats what eats up our ram eventually. Maybe image handles? or texture handles?
(In reply to Sushil from comment #1)
> Sotaro, do you know if any recent Gecko change related to buffer handling
> (acquire/release) could have introduced this memory leak ?

I do not know about it. It seems wired that the problem happens only when HwComposer enabled. Anyway, I did a lot of change around buffers and fences handling to fix WebGL problems. I am going to investigate about the problem. At first, I am going to check the problem could be reproducible on my side.
Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)
I confirmed the problem happens on v1.4 flame and the problem does not happen v1.3 flame base rom.
Sushil, can you try to get some diagnostics info from your driver to see what we are leaking exactly?
Flags: needinfo?(sushilchauhan)
See Also: → 1028253
Can we get a couple of memory reports for this? One at Camera startup and one at the point that leaks are suspected would be best. Run the following command from your B2G checkout:

    MOZ_IGNORE_NUWA_PROCESS=1 tools/get_about_memory.py
Whiteboard: [MemShrink]
(In reply to Sotaro Ikeda [:sotaro] from comment #5)
> I confirmed the problem happens on v1.4 flame and the problem does not
> happen v1.3 flame base rom.

Correction:
HwComposer is not used on camera preview on v1.3 flame base image. Therefore the above v1.3 is OpenGL compositon.

I saw memory increase just on camera preview on v1.4 flame with HwComposer composition.
 > cat /sys/class/kgsl/kgsl/page_alloc
Flags: needinfo?(sushilchauhan)
Sotaro,

If I avoid eglSwapBuffers() call at [1] after changing HWC HAL to handle the frame as HWC_OVERLAY Composition by using intermediate Copybit render buffers in HAL to compose the frame, instead of using FrameBuffer layer handle to compose (which is called HWC_BLIT composition). So in that case, we do not need to flip the FrameBuffer target at [1] hence eglSwapBuffers() does not get called and I do not see the kgsl memory leak in Camcorder recording. So it narrows down the leak to eglSwapBuffers() call. Can you check the buffer handling (acquire/release) in the eglSwapBuffers() call in this case.

[1]: http://mxr.mozilla.org/mozilla-central/source/widget/gonk/HwcComposer2D.cpp#587
Andreas,

Just a heads-up, I have observed that vertical home screen needs lot of kgsl memory (check "cat /sys/class/kgsl/kgsl/page_alloc" value) as compared to normal home screen. Difference is around 9 MB. So I don't think it is optimal to set vertical Home Screen as default on a 256 MB memory constrained device.
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Sushil from comment #9)
> Sotaro,
> 
> If I avoid eglSwapBuffers() call at [1] after changing HWC HAL to handle the
> frame as HWC_OVERLAY Composition by using intermediate Copybit render
> buffers in HAL to compose the frame, instead of using FrameBuffer layer
> handle to compose (which is called HWC_BLIT composition). 

Can flame device could use HWC_OVERLAY? IIRC, it has copybit hw composer.

>So in that case,
> we do not need to flip the FrameBuffer target at [1] hence eglSwapBuffers()
> does not get called and I do not see the kgsl memory leak in Camcorder
> recording. So it narrows down the leak to eglSwapBuffers() call. Can you
> check the buffer handling (acquire/release) in the eglSwapBuffers() call in
> this case.

Can you explain more about it? which part actually do I need to check? eglSwapBuffers() works almost all same way as android.
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(sushilchauhan)
The following is a diagram around HwcComposer2D. eglSwapBuffers() related code is almost same as android. Only different part is around FramebufferSurface. HwcComposer2D also call FramebufferSurface's function. 

https://github.com/sotaroikeda/firefox-diagrams/blob/master/widget/widget_HwcComposer2D__FirefoxOS_1_4.pdf?raw=true
eglSwapBuffers() related function call sequence in blit composition case is like the following.

LayerManagerComposite::Render()
->HwcComposer2D::TryRender()
 ->HwcComposer2D::TryHwComposition()
  ->if (blitComposite) // if blit composition case,
   ->GonkDisplayJB::UpdateFBSurface()
    ->eglSwapBuffers() // function internal is not exist as source code.

we do not have actual implementation of eglSwapBuffers(). I created a function call sequence from  eglSwapBuffers() from anroid source code log.

eglSwapBuffers()// actual implementation source is not exist.
 ->Surface::hook_dequeueBuffer_DEPRECATED()
  ->Surface::dequeueBuffer()
   ->BufferQueue::dequeueBuffer()
 ->android::Surface::hook_queueBuffer()
  ->Surface::queueBuffer()
   ->BufferQueue::queueBuffer()
    ->FramebufferSurface::onFrameAvailable()


eglSwapBuffers() function call sequence, only FramebufferSurface::onFrameAvailable() is b2g specific implementation.
(In reply to Sotaro Ikeda [:sotaro] from comment #11)
> 
> Can flame device could use HWC_OVERLAY? IIRC, it has copybit hw composer.
> 

Sotaro, can you revert this CAF patch and confirm Comment 9 on Flame device on v2.0 or v1.4 code:
https://www.codeaurora.org/cgit/quic/la/platform/hardware/qcom/display/commit/?h=kk_3.5&id=c67b03798c274884172c1f99222ba31ce33e1b6b

Yes, Flame will use HWC_OVERLAY Composition by reverting this patch. HWC HAL will use Copybit render buffers to compose the frame content, hence it will not use FrameBuffer layer for composition.
(In reply to Eric Rahm [:erahm] from comment #7)
> Can we get a couple of memory reports for this? One at Camera startup and
> one at the point that leaks are suspected would be best. Run the following
> command from your B2G checkout:
> 
>     MOZ_IGNORE_NUWA_PROCESS=1 tools/get_about_memory.py

I generated gecko memory report when b2g hits 92MB(USS) in b2g-procrank but gecko memory report does not explain anything for 90MB(USS) memory growth in b2g's USS. 

You can also see 90MB (b2g process's USS) in output of b2g-procrank log in : 
https://drive.google.com/file/d/0B1cSMS8_GuAEN24xcWhDYWUxb1U/edit?usp=sharing

I also tried to make FFOS 2.0 build with DMD enabled as mentioned in https://wiki.mozilla.org/Performance/MemShrink/DMD 
But device does not boot after enabling DMD in FFOS 2.0 .

Will it help if we can get memory report from DMD enabled build ? 
I am creating a new bugid for DMD enabled build issue in FFOS 2.0
Flags: needinfo?(erahm)
Okey, I am going to check the revert :-)
for DMD build boot failure issue,  new #bug 1030358
The DMD issue is known, I'm looking into now (bug 1019634).
Flags: needinfo?(erahm)
The problem is fixed by reverting the patch! I confirmed on v1.4 flame.
(In reply to Sotaro Ikeda [:sotaro] from comment #19)
> The problem is fixed by reverting the patch! I confirmed on v1.4 flame.

It is not fix. By reverting this patch, we introduce 3 Copybit render buffers in HAL, which is ~ 4.5 MB. So let's debug the memory leak in eglSwapBuffers. In worst case, we can think of reverting this patch.
Flags: needinfo?(sushilchauhan)
sushil, please always file new bugs for things like comment 10, we can definitely fix that
Depends on: 1030608
Keywords: regression
Sotaro,

During Camcorder recording, in b2g process, I checked the memory alloc and free calls in graphics code. The Gfx memory allocation requests which are coming from [1] are getting freed eventually. But I see intermittent Gfx memory allocation requests which are not coming from [1] and they are not getting freed. This memory alloc request comes within eglSwapBuffers() call at [2]. This is the cause of memory leak. With GPU Composition, I do not see these intermittent alloc requests, hence there is no leak. So do you know any place in eglSwapBuffers() path which request for Gfx memory allocation in b2g process?

[1]: http://mxr.mozilla.org/mozilla-central/source/widget/gonk/nativewindow/GonkConsumerBaseKK.cpp#176
[2]: http://mxr.mozilla.org/mozilla-central/source/widget/gonk/libdisplay/GonkDisplayJB.cpp#298
Flags: needinfo?(sotaro.ikeda.g)
I checked eglSwapBuffers() stack in b2g process, I did not find any Gfx memory request from Gecko. So if allocation request is initiated within the Graphics code, then it needs to be freed by Graphics module.
The patch adds glClear() before HwComposer rendering. It fixes the memory leak problem on v1.4 flame.
Flags: needinfo?(sotaro.ikeda.g)
This is an issue on 2.0 as well?
To create attachment 8447295 [details] [diff] [review], I assumes that flame's OpenGL/EGL driver/module might require some OpenGL calls before eglSwapBuffers(). I got this idea from the following SurfaceFlinger code.

http://androidxref.com/4.3_r2.1/xref/frameworks/native/services/surfaceflinger/SurfaceFlinger.cpp#1548
(In reply to Milan Sreckovic [:milan] from comment #25)
> This is an issue on 2.0 as well?

Yes, it does not care about b2g version.
(In reply to Sotaro Ikeda [:sotaro] from comment #27)
> (In reply to Milan Sreckovic [:milan] from comment #25)
> > This is an issue on 2.0 as well?
> 
> Yes, it does not care about b2g version.

But it require HwComposer composition and copybit HwComposer. I am not sure if v1.3 flame uses HwComposer. All flame base ROMs that I received does not use Hwcomposer on the ROMs.
Attachment #8447295 - Attachment description: patch - Call glClear() before HwComposer composition → patch for v1.4 - Call glClear() before HwComposer composition
Sushil, does attachment 8447295 [details] [diff] [review] works on your side?
Flags: needinfo?(sushilchauhan)
(In reply to Sotaro Ikeda [:sotaro] from comment #26)
> To create attachment 8447295 [details] [diff] [review], I assumes that
> flame's OpenGL/EGL driver/module might require some OpenGL calls before
> eglSwapBuffers(). I got this idea from the following SurfaceFlinger code.
> 
> http://androidxref.com/4.3_r2.1/xref/frameworks/native/services/
> surfaceflinger/SurfaceFlinger.cpp#1548

Firefox OS does not have worm hole regions like Android. On Firefox OS, the entire screen gets painted and rendered in every frame (layer tree) and the background layer has opaque content, so we do not need glClear() for H/W Composition. But as long as glClear() is efficient, there is no harm in calling it. I will test it and let you know.
Sotaro,

I do not see the memory leak with this patch. But if you see SF code, hasHwcComposition at [1] is FALSE when the frame uses HWC_BLIT Composition (see the SF code on CAF which has HWC_BLIT changes). So we never call glClear in HWC_BLIT Composition (Camcorder recording frame). Only hasGlesComposition is TRUE in HWC_BLIT Composition.

I think we are missing the eglMakeCurrent() call at: http://androidxref.com/4.3_r2.1/xref/frameworks/native/services/surfaceflinger/DisplayDevice.cpp#254 which is called from:
http://androidxref.com/4.3_r2.1/xref/frameworks/native/services/surfaceflinger/SurfaceFlinger.cpp#1540.

If this works, it would be better solution than introducing glClear().Can you try porting it on Gecko ?

[1]:http://androidxref.com/4.3_r2.1/xref/frameworks/native/services/surfaceflinger/SurfaceFlinger.cpp#1548
Flags: needinfo?(sushilchauhan) → needinfo?(sotaro.ikeda.g)
(In reply to Sushil from comment #31)
> 
> If this works, it would be better solution than introducing glClear().Can
> you try porting it on Gecko ?

Thanks for the confirmation and info! I am going to try it.
Flags: needinfo?(sotaro.ikeda.g)
The patch implement Comment 31. But it does not fix the leak problem.
Yes, I verified it too just now.
Add MakeCurrent() before gl function calls.
Attachment #8447295 - Attachment is obsolete: true
It seems like gl module internal problem. attachment 8447382 [details] [diff] [review] could be a workaround of it.
Sushil, is there concern to use attachment 8447382 [details] [diff] [review] as fix?
Flags: needinfo?(sushilchauhan)
Sotaro,

This patch touches all the compositions: GPU and full/partial HWC. So, it will unnecessary invoke GPU in full HWC Composition on a H/W Overlay device (ex: Video playback), which will impact power numbers. If the workaround is only needed for HWC_BLIT Composition, then I think glClear() should be called at: http://mxr.mozilla.org/mozilla-central/source/widget/gonk/HwcComposer2D.cpp#586 .
Flags: needinfo?(sushilchauhan)
A lot of mobile GPUs are tile-based. In case of tile based rendering glClear is not just cheap, its actually extremely expensive to _not_ use glClear. In contrast to desktop GPUs that immediately start rendering when draw commands come in, tile based renderers save the frame data and then render that scene tile by tile. glClear essentially flushes the entire graph. If you try to avoid the glClear and then draw over that scene the driver has to instead do some analysis and figure out what parts of the scene can be culled, which is much more expensive.

Adreno is tile-based. I would most definitely land the glClear patch. Its useful in several ways, in addition to working around this driver bug.
- GPU Composition: I do not know about tile rendering and GPU Composition, so I am not concerned about glClear in GPU Composition path. Isn't LayerManagerComposite already doing glClear() in GPU Composition path (prior to this patch)?

- Partial HWC Comp: It is fine to call glClear() in partial HWC Composition. SurfaceFlinger is also doing it. But this is also not optimal to clear the entire FrameBuffer. See the comment in SF code at: http://androidxref.com/4.3_r2.1/xref/frameworks/native/services/surfaceflinger/SurfaceFlinger.cpp#1548

- Full HWC Composition: My only concern is this. We do not need glClear() in full HWC Composition frame and SF is not doing it as well. See above link. It will impact power numbers on H/W Overlay devices by unnecessarily invoking GPU even when composition is entirely handled by HWC.

- HWC_BLIT Composition: We need glClear() to workaround this bug. It is for Copybit device (Flame).
We should get inputs from Graphics team on Tile-based rendering and glClear() in GPU Composition path.
Yes, we are doing glClear() in GPU Composition path at: http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/CompositorOGL.cpp#795
Whiteboard: [MemShrink] → [MemShrink] [CR 686758]
Keywords: footprint, perf
Attachment #8447380 - Attachment is obsolete: true
Attachment #8447382 - Attachment is obsolete: true
attachment 8448010 [details] [diff] [review] calls glClear() only blit composition and GPU is AdrenoTM305. The patch fixes the problem on master flame.
Attachment #8448010 - Flags: review?(sushilchauhan)
Attachment #8448011 - Flags: review?(jgilbert)
Attachment #8448010 - Flags: review?(sushilchauhan)
Attachment #8448011 - Flags: review?(jgilbert)
(In reply to Sotaro Ikeda [:sotaro] from comment #44)
> Created attachment 8448011 [details] [diff] [review]
> patch part 2 - Add AdrenoTM305 to GLRenderer

I received a comment from Sushil by offline that adreno check is not necessary, because the blit composition is used only on adreno. I am going to update the patch.
Remove adreno check based on the comment.
Attachment #8448010 - Attachment is obsolete: true
Attachment #8448011 - Attachment is obsolete: true
Attachment #8448120 - Flags: review?(sushilchauhan)
Comment on attachment 8448120 [details] [diff] [review]
patch - Call glClear() before blit composition

Review of attachment 8448120 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM with nits resolved.

::: widget/gonk/HwcComposer2D.cpp
@@ +589,5 @@
> +            // See bug 1029856.
> +            if (mGLContext) {
> +              mGLContext->MakeCurrent();
> +              mGLContext->fClearColor(0.0, 0.0, 0.0, 0.0);
> +              mGLContext->fClear(LOCAL_GL_COLOR_BUFFER_BIT);

4 space indentation in these 3 lines.
Attachment #8448120 - Flags: review?(sushilchauhan)
Apply the comment. Set Review+ based on Comment 48.
Attachment #8448120 - Attachment is obsolete: true
Attachment #8448172 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/78adf063a6be
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Michael, can you make sure the next MTBF run has this patch in it?
Flags: needinfo?(mvines)
Flags: needinfo?(mvines) → needinfo?(ikumar)
Thanks for the fix.
Flags: needinfo?(ikumar)
Whiteboard: [MemShrink] [CR 686758] → [MemShrink]
Not enough information with current STR to write test case to address bug.
QA Whiteboard: [QAnalyst-Triage?]
Flags: in-moztrap?(rmead)
Flags: needinfo?(ktucker)
Test case added in moztrap:

https://moztrap.mozilla.org/manage/case/14320/
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(rmead)
Flags: in-moztrap+
Michael Treese. This is suspected as increasing power by 30%, as it may be keeping GPU on. Can you involve the power experts in Mozilla to investigate.
Flags: needinfo?(mtreese)
Blocks: 1150518
This is a workaround of codeaurora's OpenGL module's problem. If the problem is already fixed by the module. It could be removed.
Dave,
Chould you take a look at this.
Flags: needinfo?(mtreese) → needinfo?(dhylands)
mtreese: not sure why you're asking me to look at this, as I've never been involved with graphics stuff up until now. I'm totally unfamiliar with all of the code involved.

I mean - sure - I can look at it, but it may take me several orders of magnitude longer to figure anything out since I'm totally unfamiliar with this code.
Flags: needinfo?(dhylands) → needinfo?(mtreese)
Flags: needinfo?(mtreese)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: