Closed Bug 619491 Opened 14 years ago Closed 12 years ago

Decode shadowimagelayer YUV frames on the GPU when possible

Categories

(Core :: Graphics, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: cjones, Unassigned)

Stuart reports that even with the NEON optimizations, YUV->RGB conversion eats too much CPU.  If we're using OGL compositing, we should have a special path in ShadowableImageLayer to post YUV frames directly to the compositor, bypassing the YUV->RGB conversion on the CPU.
Worth re-checking perf after bug 615870 lands; might be good enough for release.
> Stuart reports that even with the NEON optimizations, YUV->RGB conversion eats
> too much CPU.  If we're using OGL compositing, we should have a special path in

That is not quite true... I profiled this case and saw 2 major problems:
1) Heavy CPU usage in layout (bug 615870)
2) Memory copies in the software rendering pipeline (lots of them)
yuv_rgb_neon is now less of a problem compared to 1) and 2)

yuv_neon + software rendering:
  1) Video is fast on a simple page with Theora video
  2) Video lags a bit on a simple page with WebM video
  3) Video is slow on a page with anything else going on (YouTube, scaled page)

For complex pages without scaling, bug 615870 will help a lot.
The scaling problem can be partially solved by applying the patches from http://cgit.freedesktop.org/~siamashka/pixman/ (Siarhei is working on upstreaming them right now).

It can also be solved by using OpenGL rendering, and here we have 2 options:
1) A limited option using the EGL lock extension, where we do the software conversion (yuv_neon) straight into the locked texture's RGB data
2) Send YUV frames directly to the GPU and use a YUV shader.
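
For reference, the per-pixel arithmetic a YUV shader in option 2) would have to perform can be sketched on the CPU as below. This is only an illustration: it assumes BT.601 limited-range ("video range") coefficients, which this bug does not specify, and the names `Rgb`, `ClampToByte` and `YuvToRgbBt601` are made up for the example rather than taken from the Gecko tree.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type for this sketch only.
struct Rgb {
  uint8_t r, g, b;
};

// Clamp to [0, 255] and round to the nearest integer.
inline uint8_t ClampToByte(float v) {
  return static_cast<uint8_t>(std::lround(std::min(255.0f, std::max(0.0f, v))));
}

// Convert one pixel from Y'CbCr to RGB.
// ASSUMPTION: BT.601 limited-range coefficients (Y in 16..235, Cb/Cr in 16..240).
// A fragment shader would evaluate the same expressions per fragment after
// sampling the Y, Cb and Cr textures.
inline Rgb YuvToRgbBt601(uint8_t y, uint8_t u, uint8_t v) {
  const float yf = 1.164f * (static_cast<float>(y) - 16.0f);
  const float uf = static_cast<float>(u) - 128.0f;
  const float vf = static_cast<float>(v) - 128.0f;
  return Rgb{ClampToByte(yf + 1.596f * vf),
             ClampToByte(yf - 0.391f * uf - 0.813f * vf),
             ClampToByte(yf + 2.018f * uf)};
}
```

With these coefficients, video black (16, 128, 128) maps to RGB (0, 0, 0) and video white (235, 128, 128) maps to RGB (255, 255, 255).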

The second option will help a bit, but not that much... the main benefit only comes from removing the long and expensive software rendering pipeline and the expensive layout calculations (bug 615870).
GL is mostly what we care about here.  We can't rely on texture-lock extensions, we need a fast fallback.

(In reply to comment #2)
> 2) Send YUV frames to directly to GPU and use yuv shader.
> 
> 2-nd will help a bit but not so much...

Why is that?  Are you saying the YUV->RGB conversion isn't showing up in profiles?  (I guess with your NEON patches applied?)

> main benefit we only get by removing
> log and expensive software rendering pipeline, and removing layout expensive
> calculations 615870

Yeah, this is great stuff, thanks.
(In reply to comment #3)
> GL is mostly what we care about here.  We can't rely on texture-lock
> extensions, we need a fast fallback.
> 
> (In reply to comment #2)
> > 2) Send YUV frames to directly to GPU and use yuv shader.
> > 
> > 2-nd will help a bit but not so much...
> 
> Why is that?  Are you saying the YUV->RGB conversion isn't showing up in
> profiles?  (I guess with your NEON patches applied?)
> 

Note too that keeping frames in YUV all the way to the GPU should reduce memory bandwidth by 2x, even if YUV->RGB conversion on the GPU doesn't significantly reduce CPU usage.
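
To make the bandwidth estimate concrete, here is a small sketch of the frame-size arithmetic. It assumes planar 4:2:0 YUV (chroma subsampled by 2 in both directions) and tightly packed RGB; neither format is stated in this bug, and the function names are hypothetical. Under those assumptions the "2x" figure corresponds to 24-bit RGB, while 32-bit RGBA would be closer to 2.7x and 16-bit RGB565 closer to 1.3x.

```cpp
#include <cassert>
#include <cstddef>

// Bytes in one planar YUV 4:2:0 frame: a full-resolution Y plane plus two
// chroma planes at half resolution in each dimension (rounded up).
// ASSUMPTION: 8 bits per sample, no row padding.
inline std::size_t Yuv420FrameBytes(std::size_t width, std::size_t height) {
  assert(width > 0 && height > 0);
  const std::size_t chromaWidth = (width + 1) / 2;
  const std::size_t chromaHeight = (height + 1) / 2;
  return width * height + 2 * chromaWidth * chromaHeight;
}

// Bytes in one packed RGB frame with the given bytes per pixel
// (2 for RGB565, 3 for RGB24, 4 for RGBA/RGBX). ASSUMPTION: no row padding.
inline std::size_t RgbFrameBytes(std::size_t width, std::size_t height,
                                 std::size_t bytesPerPixel) {
  assert(width > 0 && height > 0 && bytesPerPixel > 0);
  return width * height * bytesPerPixel;
}
```

For a 640x480 frame this gives 460800 bytes for YUV 4:2:0 versus 921600 bytes for 24-bit RGB, which is exactly the factor of two mentioned above.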
> Note too that keeping frames in YUV all the way to the GPU should reduce memory
> bandwidth by 2x, even if YUV->RGB conversion on the GPU doesn't significantly
> reduce CPU usage.

Yep, you're right... then I think we just need to send the YUV frames over IPC and upload them to the GPU directly...

The only thing we need to do here is make sure that the decoder writes YUV data directly into system shared memory... so we can send it to another process without a memcpy and only spend CPU on the glTexImage2D upload...
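
One way to picture this: if the decoder writes all three planes into a single shared-memory segment, the receiving process only needs the plane offsets and dimensions to issue one glTexImage2D call per plane. The sketch below computes such a layout. It is a guess at a workable arrangement, not the layout Gecko uses; it assumes planar 4:2:0 data with tightly packed rows, and `Plane`, `Yuv420Layout` and `ComputeYuv420Layout` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of one 8-bit plane inside a shared buffer.
struct Plane {
  std::size_t offset;  // byte offset from the start of the shared segment
  std::size_t width;   // plane width in samples (== bytes per row here)
  std::size_t height;  // plane height in rows
};

struct Yuv420Layout {
  Plane y, cb, cr;
  std::size_t totalBytes;  // size the shared segment must have
};

// ASSUMPTION: planes are stored back to back as Y, Cb, Cr with stride == width.
inline Yuv420Layout ComputeYuv420Layout(std::size_t width, std::size_t height) {
  assert(width > 0 && height > 0);
  const std::size_t chromaWidth = (width + 1) / 2;
  const std::size_t chromaHeight = (height + 1) / 2;
  const std::size_t lumaBytes = width * height;
  const std::size_t chromaBytes = chromaWidth * chromaHeight;

  Yuv420Layout layout;
  layout.y = Plane{0, width, height};
  layout.cb = Plane{lumaBytes, chromaWidth, chromaHeight};
  layout.cr = Plane{lumaBytes + chromaBytes, chromaWidth, chromaHeight};
  layout.totalBytes = lumaBytes + 2 * chromaBytes;
  return layout;
}

// The compositor side could then upload each plane as a one-channel texture,
// for example (not executed here, needs a GL context):
//   glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, plane.width, plane.height, 0,
//                GL_LUMINANCE, GL_UNSIGNED_BYTE, base + plane.offset);
// Because rows are tightly packed, GL_UNPACK_ALIGNMENT would need to be 1
// whenever a plane width is not a multiple of 4.
```

With this arrangement the only per-frame copy left is the texture upload itself; the decoder output never has to be copied between processes.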
No longer blocks: 598864
Depends on: 598864
QA Contact: thebes → bgirard
I already see this code in 'gfx/layers/opengl/ImageLayerOGL.cpp'; it looks like it landed in bug 649417. Closing; reopen if I misunderstood.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME