Closed Bug 720467 Opened 12 years ago Closed 12 years ago

Use a FBO instead of a PBuffer in GLContextProviderEGL

Categories

(Core :: Graphics, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla13

People

(Reporter: bjacob, Assigned: joe)

References

Details

Attachments

(6 files, 3 obsolete files)

This should fix lots of issues, especially on mobile GPUs that don't support NPOT PBuffers, like what we're seeing in bug 716770.
Attached patch use FBOs en EGL (obsolete) — Splinter Review
We shouldn't need to create a 16x16 dummy PBuffer, unless the extant GLContextProviderEGL::CreateForWindow() doesn't work. This is how we do FBOs vs PBuffers in WGL, at least.
Attached patch switch from PBuffers to FBOs (obsolete) — Splinter Review
This should work, but b2g crashes on startup for me.
Attachment #590832 - Attachment is obsolete: true
mwu, can you give this a spin?
(In reply to Andreas Gal :gal from comment #4)
> mwu, can you give this a spin?

I applied this patch but it seems to make things worse - now the apps don't show up at all.
I would like to see the logcat output with MOZ_GL_DEBUG_VERBOSE=1 MOZ_GL_DEBUG_ABORT_ON_ERROR=1 in a debug build (and bt if it hits a GL error).
This affecting all EGL-using platforms, could also be tested on Windows, where this patch breaks WebGL rendering.

Jeff G has found out why this doesn't work: GLContextProviderEGL is hardcoding use of PBuffers in several other places.

This code is pretty much un-owned which has hurt us several times in the past, especially for GL layers on Android, but Jeff is much more knowledgeable than me about that, so, assigning to him.
Assignee: nobody → jgilbert
Seems to work in EGL with both accel and non-accel layers on windows.
Attachment #591288 - Attachment is obsolete: true
Blocks: 716770
(In reply to Jeff Gilbert [:jgilbert] from comment #8)
> Created attachment 591529 [details] [diff] [review]
> Use dummy PBuffer context to use for FBO backing in EGL

This doesn't work on Android -- nothing gets rendered.
(In reply to Jeff Gilbert [:jgilbert] from comment #8)
> Created attachment 591529 [details] [diff] [review]
> Use dummy PBuffer context to use for FBO backing in EGL
> 
> Seems to work in EGL with both accel and non-accel layers on windows.

Tested on B2G and doesn't seem to help. The app icons don't show up like before.
(In reply to Ali Juma [:ajuma] from comment #9)
> (In reply to Jeff Gilbert [:jgilbert] from comment #8)
> > Created attachment 591529 [details] [diff] [review]
> > Use dummy PBuffer context to use for FBO backing in EGL
> 
> This doesn't work on Android -- nothing gets rendered.

This is actually good news, as it means that one can debug this on Android as well as on B2G. Please have a look!
Depends on: 721205
(In reply to Benoit Jacob [:bjacob] from comment #11)
> This is actually good news, as it means that one can debug this on Android
> as well as on B2G. Please have a look!

The problem on Android seems unrelated to the problem on B2G, so I've filed Bug 721205 for the Android issue.
Jeff G just landed the fix for bug 721205 on inbound. Please retry with that.
(In reply to Benoit Jacob [:bjacob] from comment #13)
> Jeff G just landed the fix for bug 721205 on inbound. Please retry with that.

As before, nothing gets rendered. However, when I modify that fix so that mSupport_ES_ReadPixels_BGRA_Ubyte is false unconditionally, this works.
@ Michael Wu: we think that comment 14 is a driver bug on Ali's phone and still hope that Jeff's patch here, on top of the one in bug 721205, may make things work on the Maguro. Could you try?
(In reply to Benoit Jacob [:bjacob] from comment #15)
> @ Michael Wu: we think that comment 14 is a driver bug on Ali's phone and
> still hope that Jeff's patch here, on top of the one in bug 721205, may make
> things work on the Maguro. Could you try?

I applied both patches and no icons show up.
Michael, if you have a debug build of B2G, can you run with the env vars in comment 6? Do you get a crash?  (these env vars will in particular convert any GL error into an assert failure)
(In reply to Benoit Jacob [:bjacob] from comment #17)
> Michael, if you have a debug build of B2G, can you run with the env vars in
> comment 6? Do you get a crash?  (these env vars will in particular convert
> any GL error into an assert failure)

I tried a debug build but

- It takes enough space that I had to delete other things in the image to get it to fit
- It crashes

So I replaced instances of #ifdef DEBUG in gfx/gl/* with #ifdef DEBUGG and defined DEBUGG in the makefile. I'm not getting any aborts.
(In reply to Michael Wu [:mwu] from comment #18)
> (In reply to Benoit Jacob [:bjacob] from comment #17)
> > Michael, if you have a debug build of B2G, can you run with the env vars in
> > comment 6? Do you get a crash?  (these env vars will in particular convert
> > any GL error into an assert failure)
> 
> I tried a debug build but
> 
> - It takes enough space that I had to delete other things in the image to
> get it to fit

Which things?  Can they be permanently deleted? :)
This modifies bjacob/jgilbert's previous patch to use context sharing, so we can address the WebGL context's backing texture from the layer manager's context.
Attachment #591529 - Attachment is obsolete: true
Attachment #593589 - Flags: review?(jgilbert)
Attachment #593589 - Flags: review?(bjacob)
Attachment #593589 - Flags: review?(jgilbert) → review+
Comment on attachment 593589 [details] [diff] [review]
Share FBO textures

Just listen to Jeff.
Attachment #593589 - Flags: review?(bjacob)
Try run for af67d5743098 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=af67d5743098
Results (out of 206 total builds):
    success: 177
    warnings: 29
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jdrew@mozilla.com-af67d5743098
 Timed out after 06 hours without completing.
This fails mochitests on Windows because we only shut down the WGL ContextProvider, since that's the default context provider on Windows. Shutting down the EGL context provider explicitly (in addition to the default context provider) fixes this problem.

If we didn't have two context providers on Windows, we wouldn't need this. :(
Assignee: jgilbert → joe
Attachment #593953 - Flags: review?(jgilbert)
Attachment #593953 - Flags: review?(bjacob)
If we can get a real opengl context from EGL on windows, that would be ideal.

I would prefer using EGL everywhere, really.
Try run for e7e39f6b03a9 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e7e39f6b03a9
Results (out of 30 total builds):
    success: 27
    warnings: 3
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jdrew@mozilla.com-e7e39f6b03a9
I don't think that is a possibility. EGL, after all, is implemented by ANGLE, and it hands out GLES contexts implemented on top of Direct3D 9.
Blocks: 724143
Comment on attachment 593953 [details] [diff] [review]
Shutdown EGL explicitly on Windows

Jeff gave me r+ on IRC for a version of this that also shuts down OSMesa. As he points out, that context provider is also available on all platforms.
Attachment #593953 - Flags: review?(jgilbert)
Attachment #593953 - Flags: review?(bjacob)
Attachment #593953 - Flags: review+
You rock!

By the way, OSMesa isn't terrificly useful anymore. Maybe we should remove it?
I believe some of the fuzzing still goes on on OSMesa, but I could be wrong. I think we should keep it, anyways.
https://hg.mozilla.org/mozilla-central/rev/7f6db301b290
https://hg.mozilla.org/mozilla-central/rev/fa92500f4ad2
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla13
Attachment #595349 - Attachment mime type: text/x-log → text/plain
Attachment #595348 - Attachment mime type: text/x-log → text/plain
Not much difference in profiling result before or after patch.

The profiling result shows heavy loading in memcpy(), it's not related to this patch.
But about one month ago, there were no such heavy loaded memcpy().
It's another issue need to notice.  Unfortunately, the callgraph doesn't show who is calling
memcpy(), maybe it's called by some library without symbols or been optimized.
Unlike high CPU usage of b2g process at "Bug 708524" which happens all the time.
The high memcpy() only happens in screen on situation.  This is a separate issue.

Below is the perf top result.

   PerfTop:     799 irqs/sec  kernel: 2.8%  exact:  0.0% [1000Hz cycles],  (target_pid: 4000)
-------------------------------------------------------------------------------

             samples  pcnt function                    DSO
             _______ _____ ___________________________ _________________

             1838.00 86.9% memcpy                      libc.so          
               99.00  4.7% _gles_fb_tex_sub_image_2d   libGLESv2_mali.so
               24.00  1.1% floor                       libm.so          
               14.00  0.7% _raw_spin_unlock_irqrestore [kernel.kallsyms]
               13.00  0.6% __divsi3                    libmozglue.so    
                9.00  0.4% __do_softirq                [kernel.kallsyms]
                8.00  0.4% getnstimeofday              [kernel.kallsyms]
                6.00  0.3% pthread_mutex_lock          libc.so          
                5.00  0.2% memset                      libc.so
If I keep scrolling the screen, the heavy loaded module becomes to vmlinux.

1. oprofile summary: Phone with this patch and tested by scrolling.

CPU: ARM Cortex-A9, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No unit mask) count 125000
CPU_CYCLES:125000|
  samples|      %|
------------------
    14556 77.1056 vmlinux
     1111  5.8852 libxul.so
     1107  5.8640 oprofiled
      872  4.6191 libMali.so
      594  3.1465 libc.so
      257  1.3614 libGLESv2_mali.so
      187  0.9906 libmozglue.so
       87  0.4609 b2g

2. oprofile summary: Phone without this patch and tested by scrolling.

CPU: ARM Cortex-A9, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No unit mask) count 125000
CPU_CYCLES:125000|
  samples|      %|
------------------
    38756 37.2550 libxul.so
    21172 20.3520 vmlinux
    13190 12.6792 libc.so
     9496  9.1282 oprofiled
     7166  6.8885 libGLESv2_mali.so
     5705  5.4840 libmozglue.so
     3708  3.5644 libMali.so
     2218  2.1321 b2g
>              1838.00 86.9% memcpy                      libc.so          
>                99.00  4.7% _gles_fb_tex_sub_image_2d   libGLESv2_mali.so

This suggests that glTexSubImage2D is broken on this phone and we should use glTexImage2D instead. It can then make sense that the present patch would regress performance, if we are using glTexSubImage2D for the texture uploads here.

See also bug 724094.
Why does this suggest that glTexSubImage2D is broken? The data has to get there somehow.
I think bjacob is right. The driver on adreno devices (at least some versions on it) is known to implement TexSubImage2D by downloading the whole texture in CPU memory, then doing a memcpy, and then re-uploading the entire texture. Even my radeon driver on my mac does this. I don't know why driver writers hate TexSubImage2D so much.
We should probably use pixel buffer objects, in that case. Have to check whether that's supported on our devices.
When I disable the animation for the battery icon, glTexSubImage2D is not called at all during panning (only a couple of times after I lift my finger). Going to profile now.
I should note that panning is still plenty slow with the battery animation disabled.
I'm having trouble getting callgraph results with oprofile, unfortunately. How does one set up perf to work on a B2G Galaxy S 2?

Shian-Yu, it would be very helpful if you could attach a full perf report of panning after applying the following gaia patch:

diff --git a/apps/homescreen/style/homescreen.css b/apps/homescreen/style/homescreen.css
index 9143032..79f55e7 100644
--- a/apps/homescreen/style/homescreen.css
+++ b/apps/homescreen/style/homescreen.css
@@ -263,7 +263,7 @@ iframe.appWindow.noTransition {
 
 .battery > #fuel.charging {
   background-image: -moz-linear-gradient(top, #b9fd00, #8fc400);
-  -moz-animation-duration: 3s;
+/*  -moz-animation-duration: 3s; */
   -moz-animation-name: batteryCharging;
   -moz-animation-iteration-count: infinite;
   -moz-animation-timing-function: linear;
Target Milestone: mozilla13 → ---
No more heavy memcpy() after removing CSS animation.

   PerfTop:     101 irqs/sec  kernel:72.3%  exact:  0.0% [1000Hz cycles],  (all, 1 CPU)
-------------------------------------------------------------------------------

             samples  pcnt function                     DSO
             _______ _____ ____________________________ _____________

              150.00 21.6% s5pv310_enter_idle           [kernel]     
               82.00 11.8% _raw_spin_unlock_irqrestore  [kernel]     
               68.00  9.8% _raw_spin_unlock_irq         [kernel]     
               49.00  7.1% tick_nohz_stop_sched_tick    [kernel]     
               33.00  4.7% tick_nohz_restart_sched_tick [kernel]     
               20.00  2.9% add_preempt_count            [kernel]     
               19.00  2.7% schedule                     [kernel]     
               19.00  2.7% getnstimeofday               [kernel]     
               17.00  2.4% __do_softirq                 [kernel]     
               16.00  2.3% sub_preempt_count            [kernel]     
               15.00  2.2% vector_swi                   [kernel]     
               12.00  1.7% __divsi3                     [kernel]     
               10.00  1.4% __udivsi3                    libmozglue.so
               10.00  1.4% __mutex_unlock_slowpath      [kernel]     
                9.00  1.3% __mutex_lock_slowpath        [kernel]     
                7.00  1.0% memcpy                       [kernel]     

I tried perf with callgraph for having some problem on my system at this moment.
It works if you don't add a -g (callgraph) flag.

Here's a prebuilt perf for android, FYI.

http://code.google.com/p/android-group-korea/downloads/detail?name=perf-binary-kwangwoo-for-android.zip
This is the perf callgraph on b2g process with panning.
Nice!  Did you have to modify b2g at all to get the callgraph from perf?
No, I don't have to modify anything on b2g to get callgraph from perf.

Here is the perf command I used.  (2599 is b2g's pid)

# ./perf record -g -p 2599
# ./perf report > perf_callgraph.txt

There are still some C++ demanging issue to be improved in the callgraph.
Is perf installed by default now?
Is perf installed by default now?
(In reply to Shian-Yow Wu from comment #45)
> Created attachment 595622 [details]
> perf callgraph with panning
> 
> This is the perf callgraph on b2g process with panning.

Spending > 50% time around IRQs??
There is some 'mali200' stuff in the callers for these 2 IRQ-related symbols, so, does that mean that some GPU IO is very inefficient (blocking) ??
Depends on: 725747
I've filed bug 725747 on the performance regression.
I forgot to commit some improved comments when originally pushing this patch: https://hg.mozilla.org/integration/mozilla-inbound/rev/965a86374b2e
(In reply to Shian-Yow Wu from comment #47)
> No, I don't have to modify anything on b2g to get callgraph from perf.
> 
More accurately, I build b2g image as usual, but replaced libxul.so with symbols on sgs2.
Depends on: 724476
Depends on: 726396
No longer depends on: 724476
Depends on: 728361
(In reply to Jeff Gilbert [:jgilbert] from comment #24)
> If we can get a real opengl context from EGL on windows, that would be ideal.
> 
> I would prefer using EGL everywhere, really.

This would be possibly.  You have to lobby the Desktop GL vendors to provide it though. Currently they claim no one is interested in it.
Depends on: 732143
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: