912134 - Scroll FPS is low in FFOS1.2 than FFOS 1.1

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Description

•

11 years ago

Hi,

Gecko ICS Master Branch's LayerManagerComposite::Render() takes 2 to 3 times more time than Gecko ICS 1.1 LayerManagerOGL:Render(). 

It can be verified by putting simple log inside that function .

It causes low FPS in ICS Master branch. Can anyone fix this regression in ICS Master branch ?

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Updated

•

11 years ago

Keywords: perf

Summary: Scroll FPS is low in Master than 1.1 → Scroll FPS is low in ICS Master than ICS 1.1

Michael Vines [:m1] [:evilmachines]

Comment 1

•

11 years ago

Tapas, can you please provides more detail on the type of visible UX degradation that this will regression will cause for existing v1.1 devices.

blocking-b2g: --- → koi?

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Comment 2

•

11 years ago

In ICS 1.1 Contact Scroll, we see top FPS as ~55fps . But we are seeing 34fps as top FPS in ICS master Contact Scroll. 

Average GPU Composition delay in LayerManagerComposite::Render() is ~18ms in "ICS master" 

Average GPU Composition delay in  LayerManagerOGL:Render() is ~9ms in "ICS 1.1".






(In reply to Michael Vines [:m1] [:evilmachines] from comment #1)
> Tapas, can you please provides more detail on the type of visible UX
> degradation that this will regression will cause for existing v1.1 devices.

Ben Kelly [:bkelly, not reviewing]

Comment 3

•

11 years ago

What device are you running on?  I've had issues with gralloc lock timeouts during contacts scroll on the buri/hamachi platform.

And does "ICS master" mean mozilla-central and gaia/master?

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Comment 4

•

11 years ago

(In reply to Ben Kelly [:bkelly] from comment #3)
> What device are you running on?  I've had issues with gralloc lock timeouts
> during contacts scroll on the buri/hamachi platform.
> 

Does LayerManagerComposite::Render() function can cause delay from gralloc ?? I guess "NO" . 

> And does "ICS master" mean mozilla-central and gaia/master?

Yes

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Comment 5

•

11 years ago

(In reply to Ben Kelly [:bkelly] from comment #3)
> What device are you running on?  I've had issues with gralloc lock timeouts
> during contacts scroll on the buri/hamachi platform.
> 
> And does "ICS master" mean mozilla-central and gaia/master?

I am running on QRD device. Can you please measure FPS or delay from above function in your device ?

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Comment 6

•

11 years ago

Both ICS Master and ICS 1.1 uses BindAndDrawQuadWithTextureRect() to draw texture using glDrawArrays(). 

BindAndDrawQuadWithTextureRect accepts one parameter aTexCoorRect 

which is defined as below in source code :

"|aTexCoordRect| is the rectangle from the texture that we want to draw using the given program" . 

It means if size of aTexCoordRect is larger than BindAndDrawQuadWithTextureRect can cause long delay during calling glDrawArrays().  If we print size of aTexCoordRect from BindAndDrawQuadWithTextureRect  to logcat then we can see that ICS Master branch is trying to draw much larger texture than 1.1 for same device. 

My guess is that this is causing main delay in gldrawArray() in master.

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Comment 7

•

11 years ago

(In reply to Tapas Kumar Kundu from comment #6)
> Both ICS Master and ICS 1.1 uses BindAndDrawQuadWithTextureRect() to draw
> texture using glDrawArrays(). 
> 
> BindAndDrawQuadWithTextureRect accepts one parameter aTexCoorRect 
> 
> which is defined as below in source code :
> 
> "|aTexCoordRect| is the rectangle from the texture that we want to draw
> using the given program" . 
> 
> It means if size of aTexCoordRect is larger than
> BindAndDrawQuadWithTextureRect can cause long delay during calling
> glDrawArrays().  If we print size of aTexCoordRect from
> BindAndDrawQuadWithTextureRect  to logcat then we can see that ICS Master
> branch is trying to draw much larger texture than 1.1 for same device. 
> 
> My guess is that this is causing main delay in gldrawArray() in master.


This observation is for Contact Scroll (with 500 contacts loaded) in both ICS master and ICS 1.1 running on same device. 

Please compare size of 2nd parameter passed to CompositorOGL::BindAndDrawQuadWithTextureRect (in ICS master branch FFOS 1.2) 
with LayerManagerOGL::BindAndDrawQuadWithTextureRect (in ICS FFOS 1.1).

This will show that we are drawing much bigger texture for each layer in FFOS 1.2  than  FFOS 1.1 during contact scroll. 

This may cause fDrawArrays() to take long time in FFOS 1.2 than FFOS 1.1 .  This seems to be regression.

Please confirm.

Ben Kelly [:bkelly, not reviewing]

Comment 8

•

11 years ago

From the app perspective I believe the DOM structure is fairly similar in both v1.1 and v1.2.

Milan, does the situation in comment 6 and comment 7 sound gfx related?  Could it be related to the layers refactor in m-c?

Flags: needinfo?(milan)

Dave Huseby [:huseby]

Updated

•

11 years ago

Keywords: regression

Whiteboard: [perf-reviewed]

Mike Lee [:mlee]

Updated

•

11 years ago

blocking-b2g: koi? → koi+

Whiteboard: [perf-reviewed] → [perf-reviewed][c=handeye p= s= u=]

Ben Kelly [:bkelly, not reviewing]

Updated

•

11 years ago

Priority: -- → P1

Milan Sreckovic [:milan] (needinfo for best results)

Comment 9

•

11 years ago

BenWa, can you take a look?  I don't know if it's pan/zoom or any of the other things we're tracking already, or something new.

Flags: needinfo?(milan) → needinfo?(bgirard)

Benoit Girard (:BenWa)

Comment 10

•

11 years ago

CompositorOGL::BindAndDrawQuadWithTextureRect is not getting hit for me on b2g master. Still looking.

Flags: needinfo?(bgirard)

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Comment 11

•

11 years ago

(In reply to Benoit Girard (:BenWa) from comment #10)
> CompositorOGL::BindAndDrawQuadWithTextureRect is not getting hit for me on
> b2g master. Still looking.

Which device are you using ? Can you please force GPU Composition ?

Flags: needinfo?(bgirard)

Benoit Girard (:BenWa)

Comment 12

•

11 years ago

I'm using the arm emulator. I am using OGL Composition.

Flags: needinfo?(bgirard)

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Comment 13

•

11 years ago

Can you please use actual device ? This may help to reproduce this issue.

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Updated

•

11 years ago

Flags: needinfo?(bgirard)

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Updated

•

11 years ago

Summary: Scroll FPS is low in ICS Master than ICS 1.1 → Scroll FPS is low in FFOS1.2 than FFOS 1.1

Andreas Gal :gal

Comment 14

•

11 years ago

We have likely at least two problems here, and they might be related.

Our chipset vendor friends found that we are compositing (via the GPU) larger textures since the refactoring, which is slower. They will post the differences between 1.1 and 1.2 they observed in a follow-up comment.

Vivien found that in settings but also contacts we are no longer using buffer rotation. Instead, we render with multiple constantly changing layers as we scroll. Repainting all those layers is likely causing part of the slowdown. Details to follow from Vivien and me.

Sotaro Ikeda [:sotaro]

Assignee

Comment 15

•

11 years ago

About gralloc buffer handling, there might be 2 gralloc usage conflicts.
[1] conflict between OpenGL and HwComposer
[2] conflict between application and LayerManagerComposite.

It become clear that [2] is happening. By disabling HWComposer, scroll becomes smooth and gen lock failure becomes smooth. [1] still could happen, need to check about it.

Sotaro Ikeda [:sotaro]

Assignee

Comment 16

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #15)
> About gralloc buffer handling, there might be 2 gralloc usage conflicts.
> [1] conflict between OpenGL and HwComposer
> [2] conflict between application and LayerManagerComposite.
> 
> It become clear that [2] is happening. By disabling HWComposer, scroll
> becomes smooth and gen lock failure becomes smooth. [1] still could happen,
> need to check about it.

Sorry, correction. [1] is happening. Need to check about [2].

Andreas Gal :gal

Comment 17

•

11 years ago

Filed bug 914143 for the bad layer creation.

Andreas Gal :gal

Updated

•

11 years ago

Depends on: 914143

Andreas Gal :gal

Updated

•

11 years ago

No longer depends on: 914143

Andreas Gal :gal

Updated

•

11 years ago

Depends on: 914143

Sotaro Ikeda [:sotaro]

Assignee

Comment 18

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #15)
> About gralloc buffer handling, there might be 2 gralloc usage conflicts.
> [1] conflict between OpenGL and HwComposer
> [2] conflict between application and LayerManagerComposite.

When using gralloc buffers, the user of the buffers need to grab genlock. OpenGL rendering and HwCompser rendering could switch dynamically in each rendering. If HwComposer or OpenGL continue to keep gralloc buffer's genlock, another one can not get genlock and wait until got the genlock is freed. If the wait time is longer than genlock timeout, "genlock failure" log printed to logcat. I wrote this genlock owner ship competiion as "gralloc usage conflicts".

This "gralloc usage conflicts" has 2 patterns.
-(1) multiple user try to use gralloc buffers simultaneously.
-(2) genlock is not released after user.

In b2g18, Bug 879297, Bug 898919 are related to genlock failure. Both cases are related camera/video frame rendering and both are categorized to (2).

The following page explain details about gralloc buffers.
 https://wiki.mozilla.org/Platform/GFX/Gralloc

Sotaro Ikeda [:sotaro]

Assignee

Comment 19

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #15)
> About gralloc buffer handling, there might be 2 gralloc usage conflicts.
> [1] conflict between OpenGL and HwComposer
> [2] conflict between application and LayerManagerComposite.

At first, we think about [1]'s possibility, because during web page scrolling, genlock failure had happened. But [1] seems not related to low fps on OpenGL rendering. I had a chat with codeaurora's engineer and their recognition was also same. Low fps happens even when HwComposer is disabled. I am going to think more about [2].

Mandyam Vikram

Comment 20

•

11 years ago

Yes, disabling HWC does not address the low FPS issue

Sotaro Ikeda [:sotaro]

Assignee

Comment 21

•

11 years ago

I found a one candidate of the problem. In b2g18, OpenGL's texture binding happened on the following timing.
- (1) video/camera frame(gralloc yuv) rendering: Bind to texture just before the rendering.
- (2) other layers rendering: bind to texture during buffer swap

In master, all layers rendering were change to (1) in b2g. In b2g18, (1) made a lot of troubles about genlock failure.

In b2g18, (2) is done like following.
ShadowThebesLayerOGL::Swap()
 ->ShadowLayerManager::OpenDescriptorForDirectTexturing()
   ->GLContextEGL::CreateDirectTextureImage()
     ->DirectTextureImageEGL::BindTexture(LOCAL_GL_TEXTURE0);

Sotaro Ikeda [:sotaro]

Assignee

Comment 22

•

11 years ago

Bug 908276 is a bug to change a timing of Texture binding from before rendering to swap. I am going to check it.

Mandyam Vikram

Comment 23

•

11 years ago

One observation: 
On 1.1, the calls to BindAndDrawQuadWithTextureRect()are pretty much back-back and the function takes about 3-6 ms on the average on each call.
On 1.2, the function itself seems to be executing much faster, but, there seems to be some processing in-between that is delaying the calls by 4-6 ms in between. Will try to track down what's happening in-between.

Sotaro Ikeda [:sotaro]

Assignee

Comment 24

•

11 years ago

Stacktrace of GraphicBufferMapper::lock() failure.

(gdb) info stack
#0  0x4003d68a in android::GraphicBufferMapper::lock (this=0xffffffea, handle=<value optimized out>, usage=<value optimized out>, 
    bounds=<value optimized out>, vaddr=0xbed2c8b4) at frameworks/base/libs/ui/GraphicBufferMapper.cpp:80
#1  0x4003cbb2 in android::GraphicBuffer::lock (this=0x45c83600, usage=51, rect=..., vaddr=<value optimized out>)
    at frameworks/base/libs/ui/GraphicBuffer.cpp:175
#2  0x4003cbda in android::GraphicBuffer::lock (this=0x40143300, usage=320, vaddr=0xbed2c8b4) at frameworks/base/libs/ui/GraphicBuffer.cpp:162
#3  0x417822a2 in mozilla::layers::ShadowLayerForwarder::PlatformOpenDescriptor (aMode=<value optimized out>, aSurface=<value optimized out>)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp:483
#4  0x417aa124 in mozilla::layers::ShadowLayerForwarder::OpenDescriptor (aMode=1075065600, aSurface=...)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/ipc/ShadowLayers.cpp:579
#5  0x417ace34 in mozilla::layers::DeprecatedTextureClientShmem::GetSurface (this=0x478dd200)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/TextureClient.cpp:400
#6  0x417ace4a in mozilla::layers::DeprecatedTextureClientShmem::LockSurface (this=0x40143300)
    at ../../dist/include/mozilla/layers/TextureClient.h:469
#7  0x417afc14 in mozilla::layers::ThebesLayerBuffer::EnsureBuffer (this=<value optimized out>)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/ThebesLayerBuffer.cpp:438
#8  0x417afc76 in mozilla::layers::ThebesLayerBuffer::GetContextForQuadrantUpdate (this=0x40143300, aBounds=..., aSource=480, 
    aTopLeft=0xbed2c8b4) at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/ThebesLayerBuffer.cpp:330
#9  0x41798680 in mozilla::layers::ContentClientDoubleBuffered::UpdateDestinationFrom (this=0x472f62c0, aSource=..., aUpdateRegion=...)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/ContentClient.cpp:530
#10 0x417988f4 in mozilla::layers::ContentClientDoubleBuffered::SyncFrontBufferToBackBuffer (this=0x472f62c0)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/ContentClient.cpp:518
#11 0x4178f172 in mozilla::layers::ClientThebesLayer::PaintThebes (this=0x47e8e800)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/ClientThebesLayer.cpp:73
#12 0x4178f56e in mozilla::layers::ClientThebesLayer::RenderLayer (this=0x47e8e800)
---Type <return> to continue, or q <return> to quit---
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/ClientThebesLayer.cpp:139
#13 0x4178e04c in mozilla::layers::ClientContainerLayer::RenderLayer (this=<value optimized out>)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/ClientContainerLayer.h:81
#14 0x4178e04c in mozilla::layers::ClientContainerLayer::RenderLayer (this=<value optimized out>)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/ClientContainerLayer.h:81
#15 0x4178ebe6 in mozilla::layers::ClientLayerManager::EndTransactionInternal (this=0x49175cc0, aCallback=<value optimized out>, 
    aCallbackData=<value optimized out>) at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/ClientLayerManager.cpp:185
#16 0x4178ec82 in mozilla::layers::ClientLayerManager::EndTransaction (this=0x49175cc0, 
    aCallback=0x40ea3c31 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*)>, aCallbackData=0xbed2cf40, aFlags=mozilla::layers::LayerManager::END_NO_COMPOSITE)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/gfx/layers/client/ClientLayerManager.cpp:208
#17 0x40ec45de in nsDisplayList::PaintForFrame (this=<value optimized out>, aBuilder=0xbed2cf40, aCtx=<value optimized out>, 
    aForFrame=<value optimized out>, aFlags=13) at /home/sotaro/b2g_master_hamachi/B2G/gecko/layout/base/nsDisplayList.cpp:1223
#18 0x40ec4738 in nsDisplayList::PaintRoot (this=0xbed2d2d8, aBuilder=0xbed2cf40, aCtx=0x0, aFlags=13)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/layout/base/nsDisplayList.cpp:1070
#19 0x40ed4cc4 in nsLayoutUtils::PaintFrame (aRenderingContext=<value optimized out>, aFrame=0x45a55298, aDirtyRegion=<value optimized out>, 
    aBackstop=<value optimized out>, aFlags=772) at /home/sotaro/b2g_master_hamachi/B2G/gecko/layout/base/nsLayoutUtils.cpp:2150
#20 0x40ee1fd6 in PresShell::Paint (this=0x40349d00, aViewToPaint=<value optimized out>, aDirtyRegion=<value optimized out>, aFlags=1)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/layout/base/nsPresShell.cpp:5616
#21 0x4114eb86 in nsViewManager::ProcessPendingUpdatesForView (this=0x45cfe4c0, aView=0x45b289c0, aFlushDirtyRegion=<value optimized out>)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/view/src/nsViewManager.cpp:411
#22 0x4114ec00 in nsViewManager::ProcessPendingUpdates (this=<value optimized out>)
    at /home/sotaro/b2g_master_hamachi/B2G/gecko/view/src/nsViewManager.cpp:1040
#23 0x40ee6c16 in nsRefreshDriver::Tick (this=<value optimized out>, aNowEpoch=1378820469939457, aNowTime=...)
---Type <return> to continue, or q <return> to quit---

Andreas Gal :gal

Comment 25

•

11 years ago

Opening the back buffer for writing fails. Someone must be holding a lock on the back buffer still.

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

11 years ago

Blocks: GFXB2G1.2

Nicolas Silva [:nical]

Comment 26

•

11 years ago

(In reply to Andreas Gal :gal from comment #25)
> Opening the back buffer for writing fails. Someone must be holding a lock on
> the back buffer still.

Have we already tried to go back to using b2g18's hack for unlocking gralloc buffers?

What I recently understood from the semantics of gralloc locking for compositing is that binding the gralloc buffer to a GL texture will lock it, and that it causes the previous gralloc buffer that was bound to this GL texture will be unlocked at the next buffer swap of the screen (I mean like EGLSwapBuffers or whatever it is called).

"The next buffer swap" is the important part here because we used to think it would be unlocked right away. So if we are trying to prepare the next frame in the back buffer while the current frame (the front buffer) hasn't hit the screen yet, opening the back buffer could fail the way you said (if my understanding is right).

I am not sure about what exactly happens in the case of unlocking with a null EGLImage, apart from that we used to do that on b2g18 (and that some drivers don't like it), but it could behave differently as far as this locking problem is concerned.

Andreas Gal :gal

Comment 27

•

11 years ago

Sotaro is telling me we never did the null EGLImage hack for regular content, only for video images. The explanation above sounds sane though. We will try the EGLImage hack. I think we should be able to use one shared null image that we let GL ref count on, so we don't really care when that is released.

Sotaro Ikeda [:sotaro]

Assignee

Comment 28

•

11 years ago

I seems to find why the patch in Bug 908276 made fps worse than normal. Current b2g master seems to consume more pmem than b2g18. When "out of mem" happens during layer buffer allocation, layer buffer fallback to Shmem(for content process) or MemoryImage(for b2g process).
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ISurfaceAllocator.cpp#69

Current gfx layers implementation expect, layer's both front and back buffer are gralloc buffers. If only one part of the buffer is gralloc buffer. GrallocDeprecatedTextureHostOGL continue to bind the gralloc buffer to texture. This might block from an application to get the buffer's genlock. Further, CreateDeprecatedTextureHostOGL() does not handle the fallback from gralloc to shmem/MemoryImage. The function create TextureImageDeprecatedTextureHostOGL in this case. It seems incorrect class be created.

http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/TextureHostOGL.cpp#41

Benoit Girard (:BenWa)

Comment 29

•

11 years ago

I'm making some progress profiling this:
http://people.mozilla.org/~bgirard/cleopatra/#report=6334f7bd68d228e59af5761570168d4074d9b2b8

There's a surprising amount of time that we're not spending in compositing. It could be idle time or not, I'm trying to confirm that. We also spend over 60% of our compositing time binding the Gralloc texture.

Flags: needinfo?(bgirard)

Sotaro Ikeda [:sotaro]

Assignee

Comment 30

•

11 years ago

If comment #28 is correct, the following seems necessary
[1] Bind dummy buffer after end of rendering.
[2] Reduce pmem usage by layers(much layering than b2g18?)
[3] Correctly implement fallback.

Andreas Gal :gal

Comment 31

•

11 years ago

Why are we using more pmem?

Benoit Girard (:BenWa)

Updated

•

11 years ago

Depends on: 914847

Benoit Girard (:BenWa)

Comment 32

•

11 years ago

Here's a profile of the contact app:
http://people.mozilla.org/~bgirard/cleopatra/#report=70f79696ad4b58afe9d1ed57db3ac1b93f1deb08

We're bottlenecked on the content process

Andreas Gal :gal

Comment 33

•

11 years ago

We are blocking 50% time in read?

Sotaro Ikeda [:sotaro]

Assignee

Comment 34

•

11 years ago

(In reply to Andreas Gal :gal from comment #31)
> Why are we using more pmem?

It is just my feeling received from b2g master. Bug 913821 might affected to it some. nical, how do you think about it?

Flags: needinfo?(nical.bugzilla)

Sotaro Ikeda [:sotaro]

Assignee

Comment 35

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #30)
> If comment #28 is correct, the following seems necessary
> [1] Bind dummy buffer after end of rendering.

[1] can be implemented by similar way as Bug 909851. It is not clear that it can be uses as a long term solution.

Benoit Girard (:BenWa)

Updated

•

11 years ago

Depends on: 914854

Benoit Girard (:BenWa)

Comment 36

•

11 years ago

Attached patch Disable active layering — Details — Splinter Review

This patch disables creating active layer. Everything gets merged with the root thebes layer. This causes a LOT of extra painting. This patch is to measure the benefit of our layer decisions vs. a naive don't layer anything.

From this on the hamachi with HWC disabled I can scroll Settings at 50 FPS and Contacts with med-workload at ~30FPS. This is with the extra rasterization pressure of not layering scrolling content.

Something is very wrong. I'm working on tracking down what in our layer decisions causes this.

Sotaro Ikeda [:sotaro]

Assignee

Comment 37

•

11 years ago

I made basic misunderstanding to attachment 794124 [details] [diff] [review] in Bug 908276. The patch allocate Texture for each GrallocDeprecatedTextureHostOGL. It should be for each CompositableHost. And the patch do not delete OpenGL texture correctly.

Benoit Girard (:BenWa)

Updated

•

11 years ago

Attachment #802644 - Attachment is patch: true

Sotaro Ikeda [:sotaro]

Assignee

Comment 38

•

11 years ago

Summarize investigation infos.
- Using same temporary gl textures in CompositorOGL for different texture target
   for same texture unit(Bug 906671)
- Using same texture for different texture target might remain gralloc texture's genlock to be held.
- If same texture is reused among the layers, it might be affect to the performance.
- Layer's gralloc buffers are fallback to Shmem(for content process) or MemoryImage(for b2g process)
  In fallbacked cases, TextureImageDeprecatedTextureHostOGL() is created.
  It is not clear that using  TextureImageDeprecatedTextureHostOGL() in the fallback case is correct.

Sotaro Ikeda [:sotaro]

Assignee

Comment 39

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #38)
> - If same texture is reused among the layers, it might be affect to the
> performance.

Same texture seems shared by among layers, because temporary texture is decided only by texture unit.

Mandyam Vikram

Comment 40

•

11 years ago

Could there be some issue with the way the following API is being used: 
GL_APICALL void GL_APIENTRY glEGLImageTargetTexture2DOES (GLenum target, GLeglImageOES image);
in that the EGL Image cannot be freed until the GPU h/w has finished with it?

Sotaro Ikeda [:sotaro]

Assignee

Comment 41

•

11 years ago

In b2g18, thebes layer create a texture for each gralloc buffer swap.

ShadowThebesLayerOGL::Swap()
 ->ShadowLayerManager::OpenDescriptorForDirectTexturing()
   ->GLContextEGL::CreateDirectTextureImage()
     ->fGenTextures(1, &texture); //passed to DirectTextureImageEGL 
     ->new DirectTextureImageEGL
     ->DirectTextureImageEGL::BindTexture(LOCAL_GL_TEXTURE0);

Sotaro Ikeda [:sotaro]

Assignee

Comment 42

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #39)
> (In reply to Sotaro Ikeda [:sotaro] from comment #38)
> > - If same texture is reused among the layers, it might be affect to the
> > performance.
> 
> Same texture seems shared by among layers, because temporary texture is
> decided only by texture unit.

Only LOCAL_GL_TEXTURE0(0x84C0) is used for rendering for all layers rendering if there is no camera preview nor video playback. Therefore only one Texture is used for all layers rendering.

Nicolas Silva [:nical]

Comment 43

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #34)
> (In reply to Andreas Gal :gal from comment #31)
> > Why are we using more pmem?
> 
> It is just my feeling received from b2g master. Bug 913821 might affected to
> it some. nical, how do you think about it?

Bug 913821 and the other bugs related to new textures only affect video and canvas, and in both these cases they don't change the allocation strategy. I don't think it changes anything for general scrolling performances.
If we are using more pmem it could be the differences in how we layerize. The buffer rotation stuff (which is the most exercised path when visiting regular pages, scrolling, etc.) is a bit out of my comfort zone so I don't think it has changed its allocation strategy but I am not 100% sure.

Flags: needinfo?(nical.bugzilla)

Nicolas Silva [:nical]

Comment 44

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #38)
> - Layer's gralloc buffers are fallback to Shmem(for content process) or
> MemoryImage(for b2g process)
>   In fallbacked cases, TextureImageDeprecatedTextureHostOGL() is created.
>   It is not clear that using  TextureImageDeprecatedTextureHostOGL() in the
> fallback case is correct.

If you fallback to shmem or ram, the only code path that lets you upload that to a GL texture on the layers side is TextureImageDeprecatedTextureHostOGL (or with new textures, the equivalent which is TextureImageTextureSourceOGL).
I don't see what is not correct about this. We could manually upload to a GLTexture but that would not buy us anything over using a TextureImage since TextureImage::DirectUpdate just does that.

Peter Chang[:pchang]

Comment 45

•

11 years ago

In my leo device with latest m-c, I found the compositor took about 10~15 ms for LayerManagerComposite::Render() during home scrolling, it took about 0~4 ms on b2g18.

After digging more, I found it spent lots of time inside GrallocDeprecatedTextureHostOGL::BindTexture() calls, and I didn't see long delay on glDrawArray calls.
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/TextureHostOGL.cpp#1091


BTW, I didn't see the pmem not enough in my leo.

#0  mozilla::layers::GrallocDeprecatedTextureHostOGL::BindTexture (this=0x46beb670, aTextureUnit=33984) at /media/ramdisk/b2g_central_new/gfx/layers/opengl/TextureHostOGL.cpp:1114
#1  0x415b6ed0 in mozilla::layers::AutoBindTexture::Bind (this=<value optimized out>, aTexture=0x46beb670, aTextureUnit=1) at /media/ramdisk/b2g_central_new/gfx/layers/opengl/CompositorOGL.cpp:993
#2  0x415b900a in AutoBindTexture (this=0x46dd0100, aRect=<value optimized out>, aClipRect=<value optimized out>, aEffectChain=<value optimized out>, aOpacity=1, aTransform=..., aOffset=...)
    at /media/ramdisk/b2g_central_new/gfx/layers/opengl/CompositorOGL.cpp:981
#3  mozilla::layers::CompositorOGL::DrawQuad (this=0x46dd0100, aRect=<value optimized out>, aClipRect=<value optimized out>, aEffectChain=<value optimized out>, aOpacity=1, aTransform=..., aOffset=...)
    at /media/ramdisk/b2g_central_new/gfx/layers/opengl/CompositorOGL.cpp:1116
#4  0x415bfc2e in mozilla::layers::ContentHostBase::Composite (this=0x4370b8b0, aEffectChain=<value optimized out>, aOpacity=<value optimized out>, aTransform=<value optimized out>, aOffset=..., 
    aFilter=@0x453ff760, aClipRect=..., aVisibleRegion=0x453ff710, aLayerProperties=0x0) at /media/ramdisk/b2g_central_new/gfx/layers/composite/ContentHost.cpp:191
#5  0x415d93ee in mozilla::layers::ThebesLayerComposite::RenderLayer (this=0x474a7800, aOffset=<value optimized out>, aClipRect=<value optimized out>)
    at /media/ramdisk/b2g_central_new/gfx/layers/composite/ThebesLayerComposite.cpp:146
#6  0x415bcdf8 in mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite> (aContainer=0x474a7400, aOffset=..., aManager=<value optimized out>, aClipRect=...)
    at /media/ramdisk/b2g_central_new/gfx/layers/composite/ContainerLayerComposite.cpp:240
#7  0x415bd0a4 in mozilla::layers::ContainerLayerComposite::RenderLayer (this=<value optimized out>, aOffset=..., aClipRect=...)
    at /media/ramdisk/b2g_central_new/gfx/layers/composite/ContainerLayerComposite.cpp:336
#8  0x415bcdf8 in mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite> (aContainer=0x474a6400, aOffset=..., aManager=<value optimized out>, aClipRect=...)
    at /media/ramdisk/b2g_central_new/gfx/layers/composite/ContainerLayerComposite.cpp:240
#9  0x415bd0a4 in mozilla::layers::ContainerLayerComposite::RenderLayer (this=<value optimized out>, aOffset=..., aClipRect=...)

Andreas Gal :gal

Comment 46

•

11 years ago

Peter,  I think you are observing the same genlock timeout issue we are seeing. Sotaro has a plan to change the way we cycle front and back buffers to avoid this.

Mandyam Vikram

Comment 47

•

11 years ago

Home Scrolling uses HWC. Most other native app scroll use cases GPU Composition (Rotated buffers for scrolling).
BTW, can we make rotated buffers be usable for HWC?

Mandyam Vikram

Comment 48

•

11 years ago

From the driver side, on 1.2, I'm seeing the following observations:
- Within a frame, there are multiple GL_TEXTURE_2D texture calls being made (across layers as well)
- Every call is using the same texture unit number and the same texture as well

Sotaro Ikeda [:sotaro]

Assignee

Comment 49

•

11 years ago

(In reply to peter chang[:pchang] from comment #45)
> In my leo device with latest m-c, I found the compositor took about 10~15 ms
> for LayerManagerComposite::Render() during home scrolling, it took about 0~4
> ms on b2g18.
> 
> After digging more, I found it spent lots of time inside
> GrallocDeprecatedTextureHostOGL::BindTexture() calls, and I didn't see long
> delay on glDrawArray calls.
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/
> TextureHostOGL.cpp#1091

Currently, master's gecko layer uses only one Texture. From Tapas analysis, using only one Texture seems to cause a perfomance issue. I am creating a patch using Texture for each TextureHost or CompositableHost. In the morning, it seems easy. But until now, the patch does not work correctly...

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

11 years ago

Assignee: nobody → bgirard

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

11 years ago

Assignee: bgirard → sotaro.ikeda.g

Andreas Gal :gal

Comment 50

•

11 years ago

Mandyam, please look at bug 915098. That patch should round-robin texture units during composition, more evenly utilizing the hardware. Can you check whether that gains us anything?

Mike Lee [:mlee]

Updated

•

11 years ago

Status: NEW → ASSIGNED

Whiteboard: [perf-reviewed][c=handeye p= s= u=] → [c=handeye p= s= u=1.2]

Benoit Jacob [:bjacob] (mostly away)

Updated

•

11 years ago

No longer blocks: GFXB2G1.2

Benoit Jacob [:bjacob] (mostly away)

Updated

•

11 years ago

Blocks: GFXB2G1.2

Benoit Jacob [:bjacob] (mostly away)

Comment 51

•

11 years ago

(In reply to Mandyam Vikram from comment #48)
> From the driver side, on 1.2, I'm seeing the following observations:
> - Within a frame, there are multiple GL_TEXTURE_2D texture calls being made
> (across layers as well)
> - Every call is using the same texture unit number and the same texture as
> well

Mandyam, could you please elaborate on how the choice of OpenGL texture unit (as in glActiveTexture) has any incidence on performance or has otherwise any bearing on what happens at the hardware level?

Flags: needinfo?(mvikram)

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Updated

•

11 years ago

Blocks: 915241

Sotaro Ikeda [:sotaro]

Assignee

Comment 52

•

11 years ago

Attached patch WIP patch - allocate texture per CompositableHost (obsolete) — Details — Splinter Review

This is a wip patch. The patch allocate texture per CompositableHost. Current master share one textutre among all TextureHosts.
The performance seems to become better than default b2g master.

Sotaro Ikeda [:sotaro]

Assignee

Comment 53

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #52)
> Created attachment 803199 [details] [diff] [review]
> WIP patch - allocate texture per CompositableHost
> 
> This is a wip patch. The patch allocate texture per CompositableHost.
> Current master share one textutre among all TextureHosts.
> The performance seems to become better than default b2g master.

genlock failure in client side seems different problem than slow OpenGL rendering.

Andreas Gal :gal

Comment 54

•

11 years ago

Sotaro can you expand on comment 53?

Sotaro Ikeda [:sotaro]

Assignee

Comment 55

•

11 years ago

(In reply to Andreas Gal :gal from comment #54)
> Sotaro can you expand on comment 53?

By applying the patch, FPS of browser scrolling becomes better. But genlock failure are still on the logcat. FPS during scrolling seems good but sometimes UI of the scrolling seems blocked for a short time. It seems caused by the genlock failure.

attachment 803199 [details] [diff] [review] just change the way of sharing texture from all CompositableHosts to each CompositableHost. It does not change the way of gralloc buffers locking. If the genlock failure happens by the logical failure, attachment 803199 [details] [diff] [review] should not fix the genlock failure. And actually, the patch does not fix the genlock failure. Then I am tend to think the cause of the genlock failure is somewhere in the logic around gfx layers. I do not have clear evidence about it. Just my assumption. I am going to investigate about the genlock failure more.

Sotaro Ikeda [:sotaro]

Assignee

Comment 56

•

11 years ago

Attached patch patch - allocate texture per CompositableHost (obsolete) — Details — Splinter Review

From previous patch, this patch adds support of camera preview and video playback. camera preview and video playback needs Bug 913821 and Bug 901224 fixed.

Attachment #803199 - Attachment is obsolete: true

Sotaro Ikeda [:sotaro]

Assignee

Comment 57

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #56)
> Created attachment 803362 [details] [diff] [review]
> patch - allocate texture per CompositableHost

As a bonus, the patch revives unagi phones' camera preview. This might be affected from lower drivers task or bypass OpenGL library's defect by using multiple texture.

Andreas Gal :gal

Updated

•

11 years ago

Depends on: 913821, 901224

Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me)

Reporter

Comment 58

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #55)
> (In reply to Andreas Gal :gal from comment #54)
> > Sotaro can you expand on comment 53?
> 
> By applying the patch, FPS of browser scrolling becomes better. But genlock
> failure are still on the logcat. FPS during scrolling seems good but
> sometimes UI of the scrolling seems blocked for a short time. It seems
> caused by the genlock failure.
> 
> attachment 803199 [details] [diff] [review] just change the way of sharing
> texture from all CompositableHosts to each CompositableHost. It does not
> change the way of gralloc buffers locking. If the genlock failure happens by
> the logical failure, attachment 803199 [details] [diff] [review] should not
> fix the genlock failure. And actually, the patch does not fix the genlock
> failure. Then I am tend to think the cause of the genlock failure is
> somewhere in the logic around gfx layers. I do not have clear evidence about
> it. Just my assumption. I am going to investigate about the genlock failure
> more.

Can you please look at https://bugzilla.mozilla.org/show_bug.cgi?id=915078#c1 . I am also seeing that genlock is creating problem both during scroll and app launch. It causes very big stutter for PresShell:Paint() (~1sec durantion Paint !!) .

Depends on: 906715

Dietrich Ayala (:dietrich)

Updated

•

11 years ago

Blocks: 915064

ying.xu

Comment 59

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #18)
> (In reply to Sotaro Ikeda [:sotaro] from comment #15)
> > About gralloc buffer handling, there might be 2 gralloc usage conflicts.
> > [1] conflict between OpenGL and HwComposer
> > [2] conflict between application and LayerManagerComposite.
> 
> When using gralloc buffers, the user of the buffers need to grab genlock.
> OpenGL rendering and HwCompser rendering could switch dynamically in each
> rendering. If HwComposer or OpenGL continue to keep gralloc buffer's
> genlock, another one can not get genlock and wait until got the genlock is
> freed. If the wait time is longer than genlock timeout, "genlock failure"
> log printed to logcat. I wrote this genlock owner ship competiion as
> "gralloc usage conflicts".
> 
> This "gralloc usage conflicts" has 2 patterns.
> -(1) multiple user try to use gralloc buffers simultaneously.
> -(2) genlock is not released after user.
> 
> In b2g18, Bug 879297, Bug 898919 are related to genlock failure. Both cases
> are related camera/video frame rendering and both are categorized to (2).
> 
> The following page explain details about gralloc buffers.
>  https://wiki.mozilla.org/Platform/GFX/Gralloc

Hi, sotaro.
Did you remember bug 905304, you added a compositionComplete function call in function GonkDisplayJB::SwapBuffers.
The function compositionComplete calls glFinish in gralloc HAL module eventually.
The glFinish function will wait until all GPU commands issued befored glFinish be executed by GPU, and make eglSwapBuffers function be called synchronously. 
This means GraphicBuffers will be unlocked by GPU driver after eglSwapBuffers being called. 
As a result, LayerManagerComposite::Render will not return until all the GPU operations are finished.
So we will see calling LayerManagerComposite::Render taking more time than before.
This also explans that "By disabling HWComposer, scroll becomes smooth and gen lock failure becomes smooth". Because GPU rendering is synchronous while hwcomposer compositing is synchronous or not(I dont know that on qcom platform).
If hwcomposer compositing is not synchronous , we should see GraphicBuffer locking by the app layer before hwcomposer unlocking them.

Sotaro Ikeda [:sotaro]

Assignee

Comment 60

•

11 years ago

attachment 803362 [details] [diff] [review] changes the following 2. Both could improve the performance of the rendering.
- Allocate and use a texture for each CompositableHost
    + current master uses only one texture for all CompositableHosts(Layers)
- Create EGLImage during buffer swap
    + current master re-create EGLImage for every rendering

Nicolas Silva [:nical]

Comment 61

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #57)
> (In reply to Sotaro Ikeda [:sotaro] from comment #56)
> > Created attachment 803362 [details] [diff] [review]
> > patch - allocate texture per CompositableHost
> 
> As a bonus, the patch revives unagi phones' camera preview. This might be
> affected from lower drivers task or bypass OpenGL library's defect by using
> multiple texture.

That is a pretty cool bonus

Sotaro Ikeda [:sotaro]

Assignee

Comment 62

•

11 years ago

(In reply to Sotaro Ikeda [:sotaro] from comment #42)
> 
> Only LOCAL_GL_TEXTURE0(0x84C0) is used for rendering for all layers
> rendering if there is no camera preview nor video playback. Therefore only
> one Texture is used for all layers rendering.

It becomes clear that "one Texture is used for all layers" is not correct. Actually one Texture is used for all layers that using gralloc buffers". When out of pmem happens, buffer usage fallbacks to Shmem(for content process) or MemoryImage(for b2g process). And  TextureImageDeprecatedTextureHostOGL is used as TextureHost. The TextureHost allocate TiledTextureImage for rendering. The TiledTextureImage allocate and use own texture. It does not use CompositorOGL's shared temporary Texture.

Then, there could be a case that when out of pmem happens, a previous bounded gralloc buffer does not unbounded by new buffer binding. Next TextureHost under same Compositor might use different texture, when out of pmem happens.

Sotaro Ikeda [:sotaro]

Assignee

Comment 63

•

11 years ago

(In reply to peter chang[:pchang] from comment #45)
> In my leo device with latest m-c, I found the compositor took about 10~15 ms
> for LayerManagerComposite::Render() during home scrolling, it took about 0~4
> ms on b2g18.

I always using hamachi for debugging. The hw difference might make the difference to the remaining pmem amount.

Sotaro Ikeda [:sotaro]

Assignee

Comment 64

•

11 years ago

When I see a genlock failure on my device with HWComposer enabled, previous HWCompositor's rendering always output "ThebesLayerComposite Layer doesn't have a gralloc buffer". 

---------------------------------------------------
09-12 11:03:07.649   144   265 D HWComposer: ThebesLayerComposite Layer doesn't have a gralloc buffer
09-12 11:03:07.649   144   265 D HWComposer: Render aborted. Nothing was drawn to the screen
09-12 11:03:07.949   539   539 E libgenlock: perform_lock_unlock_operation: GENLOCK_IOC_DREADLOCK failed (lockType0x1, err=Connection timed out fd=44)
09-12 11:03:07.949   539   539 E msm7627a.gralloc: gralloc_lock: genlock_lock_buffer (lockType=0x2) failed
09-12 11:03:07.949   539   539 W GraphicBufferMapper: lock(...) failed -22 (Invalid argument)
09-12 11:03:07.949   539   539 E GraphicBufferMapper: lock(...) failed

Mandyam Vikram

Comment 65

•

11 years ago

(In reply to Benoit Jacob [:bjacob] from comment #51)
> (In reply to Mandyam Vikram from comment #48)
> Mandyam, could you please elaborate on how the choice of OpenGL texture unit
> (as in glActiveTexture) has any incidence on performance or has otherwise
> any bearing on what happens at the hardware level?

Please refer to  Comment 62  (Sotaro Ikeda [:sotaro] 2013-09-12 03:22:53 PDT) also. When the same texture is re-used multiple times within the same frame the driver needs to wait until the HW is done with it.

Flags: needinfo?(mvikram)

Benoit Jacob [:bjacob] (mostly away)

Comment 66

•

11 years ago

(In reply to Mandyam Vikram from comment #65)
> (In reply to Benoit Jacob [:bjacob] from comment #51)
> > (In reply to Mandyam Vikram from comment #48)
> > Mandyam, could you please elaborate on how the choice of OpenGL texture unit
> > (as in glActiveTexture) has any incidence on performance or has otherwise
> > any bearing on what happens at the hardware level?
> 
> Please refer to  Comment 62  (Sotaro Ikeda [:sotaro] 2013-09-12 03:22:53
> PDT) also. When the same texture is re-used multiple times within the same
> frame the driver needs to wait until the HW is done with it.

Sure. I understand that there can be lock contention when multiple things try to lock the same texture. But comment 48 used the phrase "texture unit" so it sounded like it was about OpenGL texture units in the sense of glActiveTexture. Can you confirm that by "texture unit" you meant "texture object" i.e. the thing passed to glBindTexture, not the thing passed to glActiveTexture? Sorry if that sounds obvious, but it seems to have confused people into thinkinf that we should do more glActiveTexture calls to use more units...

Benoit Jacob [:bjacob] (mostly away)

Updated

•

11 years ago

Flags: needinfo?(mvikram)

Sotaro Ikeda [:sotaro]

Assignee

Comment 67

•

11 years ago

From andreas's comment, I am going to land the only FPS improvement part in this bug. This performance improvement is very important for v1.2. But genlock failure bug is still present. The genlock failure problem is going to be handled in a different bug.

Sotaro Ikeda [:sotaro]

Assignee

Comment 68

•

11 years ago

Attached patch patch v2 - allocate texture per CompositableHost (obsolete) — Details — Splinter Review

Fix nits and build problem in other platform.

Attachment #803362 - Attachment is obsolete: true

Sotaro Ikeda [:sotaro]

Assignee

Comment 69

•

11 years ago

Attached patch patch v3 - allocate texture per CompositableHost (obsolete) — Details — Splinter Review

Previous patch was old patch. Replace it.

Attachment #803850 - Attachment is obsolete: true

Sotaro Ikeda [:sotaro]

Assignee

Comment 70

•

11 years ago

Comment on attachment 803855 [details] [diff] [review]
patch v3 - allocate texture per CompositableHost

Jeff, can you review the patch soon? The patch is very hackey and created very quickly. For v1.2, OpenGL rendering performance is very low. To address this, we need to commit the fix as soon as possible.

Attachment #803855 - Flags: review?(jmuizelaar)

Benoit Jacob [:bjacob] (mostly away)

Updated

•

11 years ago

Attachment #803855 - Flags: review?(nical.bugzilla)

Benoit Jacob [:bjacob] (mostly away)

Comment 71

•

11 years ago

Comment on attachment 803855 [details] [diff] [review]
patch v3 - allocate texture per CompositableHost

Review of attachment 803855 [details] [diff] [review]:
-----------------------------------------------------------------

This patch is very important to review well, also flagging :nical and myself.

Attachment #803855 - Flags: review?(bjacob)

Sotaro Ikeda [:sotaro]

Assignee

Comment 72

•

11 years ago

I updated the patch to fix the build failure on other platform. But the latest patch failed to build on more platforms :-(
- before
  https://tbpl.mozilla.org/?tree=Try&rev=41eb4599fefc
- after
  https://tbpl.mozilla.org/?tree=Try&rev=d08f677c843a

Jeff Muizelaar [:jrmuizel]

Comment 73

•

11 years ago

Comment on attachment 803855 [details] [diff] [review]
patch v3 - allocate texture per CompositableHost

Review of attachment 803855 [details] [diff] [review]:
-----------------------------------------------------------------

::: b2g/app/b2g.js
@@ +72,5 @@
>  pref("mozilla.widget.force-24bpp", true);
>  pref("mozilla.widget.use-buffer-pixmap", true);
>  pref("mozilla.widget.disable-native-theme", true);
>  pref("layout.reflow.synthMouseMove", false);
> +pref("layers.force-tiles", false);

This was probably unintentional.

::: gfx/layers/opengl/TextureHostOGL.cpp
@@ +191,5 @@
> +  if (!mTexture) {
> +    gl()->MakeCurrent();
> +    gl()->fGenTextures(1, &mTexture);
> +  }
> +  return mTexture;

I think this mTexture can just live in TextureHostOGL and then we won't need the CompositableQuirksGonkOGL. If it's easier we can avoid using the temporary texture on all platforms and not just GonkOGL.

Jeff Muizelaar [:jrmuizel]

Comment 74

•

11 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #73)
> I think this mTexture can just live in TextureHostOGL and then we won't need
> the CompositableQuirksGonkOGL. If it's easier we can avoid using the
> temporary texture on all platforms and not just GonkOGL.

This comment was based on a misunderstanding and can be ignored.

Jeff Muizelaar [:jrmuizel]

Comment 75

•

11 years ago

Comment on attachment 803855 [details] [diff] [review]
patch v3 - allocate texture per CompositableHost

Review of attachment 803855 [details] [diff] [review]:
-----------------------------------------------------------------

::: b2g/app/b2g.js
@@ +72,5 @@
>  pref("mozilla.widget.force-24bpp", true);
>  pref("mozilla.widget.use-buffer-pixmap", true);
>  pref("mozilla.widget.disable-native-theme", true);
>  pref("layout.reflow.synthMouseMove", false);
> +pref("layers.force-tiles", false);

This probably was unintentional.

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +181,5 @@
> +  gl()->fBindTexture(textureTarget, tex);
> +  if (!mEGLImage) {
> +    mEGLImage = gl()->CreateEGLImageForNativeBuffer(mGraphicBuffer->getNativeBuffer());
> +  }
> +  gl()->fEGLImageTargetTexture2D(textureTarget, mEGLImage);

It's worth adding a comment about why we need to do this early. i.e. include what Codeaurora said about how the EGL stuff works.

Attachment #803855 - Flags: review?(jmuizelaar) → review+

Jeff Muizelaar [:jrmuizel]

Comment 76

•

11 years ago

Nical's review can happen after this lands if you're in a hurry.

Benoit Jacob [:bjacob] (mostly away)

Comment 77

•

11 years ago

Comment on attachment 803855 [details] [diff] [review]
patch v3 - allocate texture per CompositableHost

Review of attachment 803855 [details] [diff] [review]:
-----------------------------------------------------------------

A comment to maybe help other people reading this code, from talking with Sotaro and reading this code with Jeff:

The fundamental change of approach in this patch is that now the temporary texture is per-compositable.

Originally, the temporary texture was per-TextureHost. That was too many temporary textures.
With Nical's work in bug 875211, that switched to having a temporary texture per compositor only. That's what turned out to be too few.
Now we have one per compositable which is fewer than one per TextureHost, because e.g. a ContentHost, which is a single Compositable, may have 2 TextureHosts to implement double-buffering.

Attachment #803855 - Flags: review?(bjacob)

Attachment #803855 - Flags: review?

Attachment #803855 - Flags: review+

Jeff Muizelaar [:jrmuizel]

Updated

•

11 years ago

Attachment #803855 - Flags: review? → review+

Jeff Muizelaar [:jrmuizel]

Comment 78

•

11 years ago

Attached patch Fix the mac-build by moving things out of headers (obsolete) — Details — Splinter Review

Attachment #803930 - Flags: review?(sotaro.ikeda.g)

Sotaro Ikeda [:sotaro]

Assignee

Comment 79

•

11 years ago

Comment on attachment 803930 [details] [diff] [review]
Fix the mac-build by moving things out of headers

Thanks!

Attachment #803930 - Flags: review?(sotaro.ikeda.g) → review+

Sotaro Ikeda [:sotaro]

Assignee

Comment 80

•

11 years ago

Attached patch patch v4 - allocate texture per CompositableHost — Details — Splinter Review

Apply comments from jrmuizel and bjacob. Committable patch.

Attachment #803855 - Attachment is obsolete: true

Attachment #803930 - Attachment is obsolete: true

Attachment #803855 - Flags: review?(nical.bugzilla)

Attachment #804019 - Flags: review+

Sotaro Ikeda [:sotaro]

Assignee

Comment 81

•

11 years ago

https://tbpl.mozilla.org/?tree=Try&rev=c5f3b34537cf

Sotaro Ikeda [:sotaro]

Assignee

Comment 82

•

11 years ago

Comment on attachment 804019 [details] [diff] [review]
patch v4 - allocate texture per CompositableHost

:nical, can you review the patch as a post review?

Attachment #804019 - Flags: review?(nical.bugzilla)

Benoit Girard (:BenWa)

Comment 83

•

11 years ago

I profiled with your patch Sotaro:
http://people.mozilla.org/~bgirard/cleopatra/#report=78c4868fb762827f587f9ae35db6e2ccb47ce9be

We're no longer blocking in:
Gralloc::BindTexture
CompositorOGL::DrawQuad
ThebesLayerComposite::RenderLayer
LayerManagerComposite::Render
CompositorParent::Composite

Compositor time is moved to 'LayerManagerComposite::EndFrame' which is much better. I'm running at 53 FPS solid with this patch, HWC off and tiling off.

Jeff Muizelaar [:jrmuizel]

Comment 84

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/732bb76aad5a

Sotaro Ikeda [:sotaro]

Assignee

Comment 85

•

11 years ago

By applying the patch, EGLImage creations are reduced from each rendering to each buffer swap. It is still redundant. In android, EGLImage is created only once during GraphicBuffer's lifetime. b2g should be same as android does, but it needs more simplification and code changes around gfx layers.

Nicolas Silva [:nical]

Comment 86

•

11 years ago

Comment on attachment 804019 [details] [diff] [review]
patch v4 - allocate texture per CompositableHost

Review of attachment 804019 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/composite/CompositableHost.h
@@ +59,5 @@
>  
>  /**
> + * A base class for doing CompositableHost and platform dependent task on TextureHost.
> + */
> +class CompositableQuirks : public RefCounted<CompositableQuirks>

This class really needs a more explicit name but we can rename it later. Let's not spend time debating names now, this looks like an important patch.

::: gfx/layers/ipc/CompositableTransactionParent.cpp
@@ +242,5 @@
> +      // set CompositableQuirks
> +      // on gonk, create EGLImage if possible.
> +      // create EGLImage during buffer swap could reduce the graphic driver's task
> +      // during rendering.
> +      tex->SetCompositableQuirks(compositable->GetCompositableQuirks());

It looks like this call should be in AddTextureHost

::: gfx/layers/opengl/TextureHostOGL.cpp
@@ +1122,2 @@
>    DeleteTextures();
> +#if 1

Remove the useless #if 1 before landing

@@ +1156,5 @@
>    MOZ_ASSERT(gl());
>    gl()->MakeCurrent();
>  
> +  mQuirks->SetCompositor(mCompositor);
> +  GLuint tex = static_cast<CompositableQuirksGonkOGL*>(mQuirks.get())->GetTexture();

I see a pattern of doing the two lines above in many places. You should wrap this in a GetGLTexture() method in the Gralloc TextureHost classes.

Attachment #804019 - Flags: review?(nical.bugzilla) → review+

Ed Morley [:emorley]

Comment 87

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/732bb76aad5a

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Sotaro Ikeda [:sotaro]

Assignee

Comment 88

•

11 years ago

nical, thanks for the post review. I created Bug 916116 as a follow up.

Sotaro Ikeda [:sotaro]

Assignee

Comment 89

•

11 years ago

Bug 916264 is created for the genlock failure bug.

Ryan VanderMeulen [:RyanVM]

Updated

•

11 years ago

status-firefox26: --- → fixed

Ryan VanderMeulen [:RyanVM]

Updated

•

11 years ago

Blocks: 880780

Mandyam Vikram

Updated

•

11 years ago

Flags: needinfo?(mvikram)

Disable active layering 11 years ago Benoit Girard (:BenWa) 406 bytes, patch		Details \| Diff \| Splinter Review
WIP patch - allocate texture per CompositableHost 11 years ago Sotaro Ikeda [:sotaro] 21.73 KB, patch		Details \| Diff \| Splinter Review
patch - allocate texture per CompositableHost 11 years ago Sotaro Ikeda [:sotaro] 31.29 KB, patch		Details \| Diff \| Splinter Review
patch v2 - allocate texture per CompositableHost 11 years ago Sotaro Ikeda [:sotaro] 21.73 KB, patch		Details \| Diff \| Splinter Review
patch v3 - allocate texture per CompositableHost 11 years ago Sotaro Ikeda [:sotaro] 31.67 KB, patch	jrmuizel : review+ bjacob : review+	Details \| Diff \| Splinter Review
Fix the mac-build by moving things out of headers 11 years ago Jeff Muizelaar [:jrmuizel] 5.57 KB, patch	sotaro : review+	Details \| Diff \| Splinter Review
patch v4 - allocate texture per CompositableHost 11 years ago Sotaro Ikeda [:sotaro] 34.45 KB, patch	sotaro : review+ nical : review+	Details \| Diff \| Splinter Review