Closed Bug 534215 Opened 16 years ago Closed 13 years ago

Optimize WebGL premultiply

Categories

(Core :: Graphics: CanvasWebGL, defect)

1.9.2 Branch
x86
Linux
defect
Not set
normal

RESOLVED DUPLICATE of bug 738343

People

(Reporter: romaxa, Unassigned)

Details

Attachments

(1 file, 1 obsolete file)

For the MicroB browser, the only way to get GL buffer data is the glReadPixels API... I think it would be nice to have a preference for enabling image mode. Pixman on Maemo is not really happy about vertical flipping via translate+scale (it goes through the composite_general path). The easiest way to do it is to give it a source surface with a negative stride.
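The negative-stride idea can be sketched in plain C without any graphics library: point at the last row of the buffer and step backwards, so a consumer that walks rows "top to bottom" actually reads them bottom-up. The `row_view` type and both helpers below are hypothetical names invented for this illustration; they are not part of pixman, cairo, or the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: an image described by the address of the row
 * to read first and a signed byte stride between consecutive rows. */
typedef struct {
    const uint8_t *first_row;
    ptrdiff_t      stride;   /* bytes from one row to the next; may be negative */
    int            height;
} row_view;

/* Turn a top-down view into a bottom-up view of the same memory:
 * start at the last row and negate the stride. No pixels are moved. */
static row_view flip_vertically(row_view v)
{
    assert(v.height > 0);
    v.first_row += (ptrdiff_t)(v.height - 1) * v.stride;
    v.stride = -v.stride;
    return v;
}

/* Copy rows, in the order the view presents them, into a top-down
 * destination buffer. */
static void copy_rows(row_view src, uint8_t *dst, size_t dst_stride,
                      size_t row_bytes)
{
    for (int y = 0; y < src.height; y++) {
        const uint8_t *s = src.first_row + (ptrdiff_t)y * src.stride;
        memcpy(dst + (size_t)y * dst_stride, s, row_bytes);
    }
}
```

The flip itself costs nothing here; only the row order seen by the copy loop changes, which is why it can be folded into a conversion pass that has to touch every row anyway.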
Attachment #417110 - Flags: review?(vladimir)
Using this operation via the pixman API involves something like this (and it is ARM NEON optimized):

    src_img = pixman_image_create_bits (PIXMAN_x8b8g8r8, width, height, src, stride);
    msk_img = pixman_image_create_bits (PIXMAN_a8b8g8r8, width, height, src, stride);
    dst_img = pixman_image_create_bits (PIXMAN_a8r8g8b8, width, height, dst, stride);
    pixman_image_composite (PIXMAN_OP_SRC, src_img, msk_img, dst_img,
                            0, 0, 0, 0, 0, 0, width, height);

It may make sense to also add a NEON optimization for the similar OVER operation to pixman.
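For readers unfamiliar with masked compositing, here is a scalar sketch of what that call is expected to compute per pixel, assuming the usual semantics that a SRC composite with a mask multiplies each source channel by the mask's alpha, and that an x8b8g8r8 source is treated as fully opaque. The function names are hypothetical, and the rounding formula is a common exact divide-by-255 idiom rather than something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, with rounding. */
static uint8_t mul_un8(uint8_t c, uint8_t a)
{
    uint32_t t = (uint32_t)c * a + 0x80;
    return (uint8_t)(((t >> 8) + t) >> 8);
}

/* Convert one non-premultiplied a8b8g8r8 word (alpha in the top byte,
 * red in the bottom byte) into a premultiplied a8r8g8b8 word.
 * The channel swap and the premultiply happen in the same step. */
static uint32_t abgr_to_premultiplied_argb(uint32_t p)
{
    uint8_t a = (uint8_t)(p >> 24);
    uint8_t b = (uint8_t)(p >> 16);
    uint8_t g = (uint8_t)(p >> 8);
    uint8_t r = (uint8_t)p;

    return ((uint32_t)a << 24) |
           ((uint32_t)mul_un8(r, a) << 16) |
           ((uint32_t)mul_un8(g, a) << 8) |
            (uint32_t)mul_un8(b, a);
}

/* Convert one row of pixels; src and dst may not overlap. */
static void convert_row(const uint32_t *src, uint32_t *dst, int width)
{
    assert(width >= 0);
    for (int x = 0; x < width; x++)
        dst[x] = abgr_to_premultiplied_argb(src[x]);
}
```

A SIMD version does the same arithmetic on several pixels at once; the scalar form is only meant to make the data flow explicit.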
It is using pixman neon premultiply implementation... not sure if we can use it for all other platforms...
Attachment #417110 - Attachment is obsolete: true
Attachment #497432 - Flags: feedback?(vladimir)
Attachment #417110 - Flags: review?(vladimir)
Component: Canvas: WebGL → Graphics
QA Contact: canvas.webgl → thebes
Summary: Provide official option for GL image rendering on maemo, and optimize it a bit. → Optimize WebGL premultiply
(In reply to comment #2)
> It is using pixman neon premultiply implementation... not sure if we can use it
> for all other platforms...

It's a valid use of the pixman API, which should be portable across all platforms. But pixman might just be missing SSE2 optimizations for this operation at the moment, in which case it would not provide the expected performance improvement.
Oleg, given that the WebGL implementation has to handle ~200 different combinations of source format, destination format, and premultiplication/unpremultiplication operations; and given that it also has to handle different strides and may have to flip the y axis in the same pass; do you think that all that could be handled by pixman? If yes, do you think that it could be faster than what we currently have? See http://mxr.mozilla.org/mozilla-central/source/content/canvas/src/WebGLContextGL.cpp#3313 and http://mxr.mozilla.org/mozilla-central/source/content/canvas/src/WebGLTexelConversions.h
Component: Graphics → Canvas: WebGL
QA Contact: thebes → canvas.webgl
(In reply to comment #4)
> Oleg, given that the WebGL implementation has to handle ~200 different
> combinations of source format, destination format, and
> premultiplication/unpremultiplication operations;

Where does this number come from? AFAIK cairo only uses the premultiplied a8r8g8b8 format, which means that only "any->premultiplied a8r8g8b8" and "premultiplied a8r8g8b8->any" conversions are relevant, unless I'm missing something. Moreover, these "any" formats should have an alpha channel themselves, otherwise there is nothing to premultiply.

> and given that it also has to handle different strides and may
> have to flip the y axis in the same pass;

This is supported, and explicitly mentioned in comment 0.

> do you think that all that could be handled by pixman?

Unpremultiplication is not supported by pixman right now.

> If yes, do you think that it could be faster than what we
> currently have? see
> http://mxr.mozilla.org/mozilla-central/source/content/canvas/src/WebGLContextGL.cpp#3313
> and
> http://mxr.mozilla.org/mozilla-central/source/content/canvas/src/WebGLTexelConversions.h

If this code does not use SSE2/NEON, then pixman is surely going to run circles around it for the operations where SIMD optimizations are available.
(In reply to comment #5)
> (In reply to comment #4)
> > Oleg, given that the WebGL implementation has to handle ~200 different
> > combinations of source format, destination format, and
> > premultiplication/unpremultiplication operations;
>
> Where does this number come from?

`nm libxul.so` gave me 155 compiled paths before we landed WebGL float textures; now there will be more. To give a rough order of magnitude, there are 12 source formats, 8 destination formats, 4 source float formats, 4 destination float formats; and whenever alpha is involved, the number of paths is tripled (default, premultiply, unpremultiply). See http://mxr.mozilla.org/mozilla-central/source/content/canvas/src/WebGLContextGL.cpp#3313

> AFAIK cairo only uses premultiplied

I don't see how cairo is relevant to WebGL.

> If this code does not use SSE2/NEON, then pixman is surely going to run
> circles around it for the operations where SIMD optimizations are available.

Only if pixman can do the job in one pass, I suppose. Otherwise, the redundant memory accesses are going to make us lose the benefit of SIMD.
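To illustrate why alpha handling triples the number of conversion paths, here is a minimal scalar sketch of the three per-channel operations mentioned above (default, premultiply, unpremultiply). The enum and function names are hypothetical and the rounding choices are assumptions made for this example; the actual code in WebGLTexelConversions.h may differ.

```c
#include <assert.h>
#include <stdint.h>

/* The three alpha treatments a conversion path may need (names are
 * illustrative only). */
typedef enum {
    ALPHA_DEFAULT,        /* copy the color channel unchanged */
    ALPHA_PREMULTIPLY,    /* c * a / 255 */
    ALPHA_UNPREMULTIPLY   /* c * 255 / a, clamped to 255 */
} alpha_op;

/* Apply one alpha treatment to a single 8-bit color channel. */
static uint8_t apply_alpha_op(uint8_t c, uint8_t a, alpha_op op)
{
    switch (op) {
    case ALPHA_PREMULTIPLY:
        return (uint8_t)(((uint32_t)c * a + 127) / 255);
    case ALPHA_UNPREMULTIPLY: {
        if (a == 0)
            return 0;  /* color is undefined at zero alpha; choose 0 */
        uint32_t v = ((uint32_t)c * 255 + a / 2) / a;
        return (uint8_t)(v > 255 ? 255 : v);
    }
    default:
        assert(op == ALPHA_DEFAULT);
        return c;
    }
}

/* One-pass conversion of an RGBA8 pixel: each input byte is read once and
 * each output byte is written once, whatever the alpha treatment. */
static void convert_rgba8(const uint8_t in[4], uint8_t out[4], alpha_op op)
{
    uint8_t a = in[3];
    out[0] = apply_alpha_op(in[0], a, op);
    out[1] = apply_alpha_op(in[1], a, op);
    out[2] = apply_alpha_op(in[2], a, op);
    out[3] = a;
}
```

Doing the format conversion in one function and the alpha step in another would mean a second trip through memory, which is the redundant access the comment above is concerned about.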
(In reply to comment #6)
> (In reply to comment #5)
> > AFAIK cairo only uses premultiplied
>
> I don't see how cairo is relevant to WebGL

The attached patch modifies the code, which is responsible for generating data for the cairo surface (wrapped into thebes). So I don't see how it is not relevant to this bug.
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #5)
> > > AFAIK cairo only uses premultiplied
> >
> > I don't see how cairo is relevant to WebGL
>
> The attached patch modifies the code, which is responsible for generating
> data for the cairo surface (wrapped into thebes). So I don't see how it is
> not relevant to this bug.

Oh, I see. It's really obsolete: WebGL can take textures from non-Cairo-surfaces now. Specifically, WebGL can use any JS array or Typed Array as a texture, with a choice of a dozen texture formats (see above link).
Duping forward to our plan for a better pixel format conversion system/library.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE