Closed Bug 738343 Opened 12 years ago Closed 7 years ago

Use a central library for pixel format conversion and alpha-(un)premultiplication

Categories

(Core :: Graphics, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: jgilbert, Assigned: lsalzman)

References

(Blocks 1 open bug)

Details

Attachments

(7 files, 8 obsolete files)

13.37 KB, patch
jrmuizel
: review+
Details | Diff | Splinter Review
13.27 KB, patch
jrmuizel
: review+
Details | Diff | Splinter Review
13.67 KB, patch
jrmuizel
: review+
Details | Diff | Splinter Review
25.90 KB, patch
jrmuizel
: review+
Details | Diff | Splinter Review
19.45 KB, patch
jrmuizel
: review+
Details | Diff | Splinter Review
19.60 KB, patch
jrmuizel
: review+
Details | Diff | Splinter Review
7.94 KB, patch
jrmuizel
: review+
Details | Diff | Splinter Review
Currently, we have scattered code for doing pixel format conversions and alpha-premuliplication. We should gather these uses together in a central location for speed, consistency, and maintainability.
OS: Windows 7 → All
Hardware: x86_64 → All
Severity: normal → enhancement
This adds the core PremultiplyData/UnpremultiplyData API to Moz2d with generic CPU fallbacks that are faster than our existing thebes table-based premultiplication.

The code here is necessarily a bit grubby, but outside of the fallback routines, mainly what it is doing is trying to implement a dispatch system based on source and destination formats getting mapped to template function dispatchers that ultimately call SIMD optimized routines. So the complexity is warranted...
Assignee: nobody → lsalzman
Status: NEW → ASSIGNED
Attachment #8834916 - Flags: review?(jmuizelaar)
Attachment #8834931 - Flags: review?(jmuizelaar)
Attachment #8834939 - Flags: review?(jmuizelaar)
Attachment #8834941 - Flags: review?(jmuizelaar)
Attached patch Use Moz2d premultiply in thebes (obsolete) — Splinter Review
Attachment #8834942 - Flags: review?(jmuizelaar)
Attachment #8834942 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8834916 [details] [diff] [review]
Add Moz2d API for optimized premultiply/unpremultiply

Review of attachment 8834916 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/Premultiply.cpp
@@ +71,5 @@
> +// multiplications used. That is, the R and B components are isolated from the G and A
> +// components, which then can be multiplied as if they were two 2-component vectors.
> +// Otherwise, an approximation if divide-by-255 is used which is faster than an actual
> +// division. These optimizations are also used for the SSE2 and NEON implementations.
> +template<bool aSwapRB, bool aOpaqueAlpha,

I think it would be better off writing the aOpaqueAlpha function separately instead of interleaved. That should make reading either one more obvious.

::: gfx/2d/Premultiply.h
@@ +11,5 @@
> +namespace gfx {
> +
> +GFX2D_API void PremultiplyData(const uint8_t* aSrc, int32_t aSrcStride, SurfaceFormat aSrcFormat,
> +                               uint8_t* aDst, int32_t aDstStride, SurfaceFormat aDstFormat,
> +                               const IntSize& aSize);

This should have documentation about whether aSrc and aDst can be the same. (i.e. in place conversion)
Attachment #8834916 - Flags: review?(jmuizelaar) → review+
Can you also add a test for this code?
Comment on attachment 8834941 [details] [diff] [review]
Use Moz2d premultiply in Canvas2D

Review of attachment 8834941 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/canvas/CanvasRenderingContext2D.cpp
@@ +5543,5 @@
>  
>      uint8_t* dst = data + dstWriteRect.y * (aWidth * 4) + dstWriteRect.x * 4;
>  
> +    UnpremultiplyData(src, srcStride,
> +                      mOpaque ? SurfaceFormat::X8R8G8B8_UINT32 : SurfaceFormat::A8R8G8B8_UINT32,

I feel like it might be better to have a separate function for converting opaque data to ARGB to make things more explicit.
Attachment #8834941 - Flags: review?(jmuizelaar) → review+
This separates out the non-premul case into the beast that is now otherwise known as SwizzleData. In the service of that, I also had to make the template dispatching system stronger (i.e. supporting partial specialization), which required wrapping the dispatch functions inside templated structs instead of templating the dispatch functions directly (since template functions can't be partially specialized). But this should make it overall more future-proof for adding more complicated cases as needed.

Otherwise, many incidentally found bugs have been fixed, and I poked around with GCC's output to get the assembly looking better too.
Attachment #8834916 - Attachment is obsolete: true
Attachment #8835848 - Flags: review?(jmuizelaar)
Attached patch Optimize Moz2d swizzles for SSE2 (obsolete) — Splinter Review
Updated with swizzle optimizations
Attachment #8834931 - Attachment is obsolete: true
Attachment #8834931 - Flags: review?(jmuizelaar)
Attachment #8835850 - Flags: review?(jmuizelaar)
Updated with swizzle optimizations
Attachment #8834939 - Attachment is obsolete: true
Attachment #8834939 - Flags: review?(jmuizelaar)
Attachment #8835851 - Flags: review?(jmuizelaar)
Now also uses SwizzleData where appropriate.
Attachment #8834941 - Attachment is obsolete: true
Attachment #8835853 - Flags: review?(jmuizelaar)
Comment on attachment 8835850 [details] [diff] [review]
Optimize Moz2d swizzles for SSE2

Review of attachment 8835850 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/SwizzleSSE2.cpp
@@ +265,5 @@
> +  // Combine everything back together.
> +  return _mm_or_si128(rb, ga);
> +}
> +
> +#if 0

Why are these #if 0?
Updated to remove some libyuv usage in favor of SwizzleData
Attachment #8834942 - Attachment is obsolete: true
Attachment #8835854 - Flags: review?(jmuizelaar)
This removes random adhoc implementations of conversions in various places in Moz2d which have now been consolidated into SwizzleData.
Attachment #8835855 - Flags: review?(jmuizelaar)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #8)
> Can you also add a test for this code?

It is conceivable, but Canvas2D is already sort of the ultimate test of this. dom/canvas/test/test_canvas.html has a considerable amount of tests solely devoted to ensuring data is properly conditioned going through putImageData/getImageData/createImageData. PremultiplyData/UnpremultiplyData are architected almost solely just to pass those tests, otherwise they could be a lot looser in terms of the transforms. So a distinct test for those paths would I think be redundant and not necessarily as thorough.

SwizzleData, OTOH, is kinda going to get used everywhere, but not probably concentrated in one place. When SwizzleData breaks, a lot of random reftests definitely scream in agony, though maybe not in a way that would directly peg it. So thus a question would be, if SwizzleData were to be tested, what kinds of tests would we want to help peg SwizzleData as the cause of breakage when a ton of reftests throughout the tree are already breaking as a result of it? That would, I imagine, be the main goal of such a test. Just to make the smoking gun a bit smokier.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #14)
> Comment on attachment 8835850 [details] [diff] [review]
> Optimize Moz2d swizzles for SSE2
> 
> Review of attachment 8835850 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/2d/SwizzleSSE2.cpp
> @@ +265,5 @@
> > +  // Combine everything back together.
> > +  return _mm_or_si128(rb, ga);
> > +}
> > +
> > +#if 0
> 
> Why are these #if 0?

At present they only profile to be of equivalent performance to the generic C fallbacks, but that could change in the future depending. So they are there but commented out just in case one day we decide to put them back, and also as just a form of self-documentation as to what said swizzles would look like in their SSE2 forms if we ever did want to put them back (for testing or whatever).
Simplified the dispatching to just hash the formats to a dense key that is directly switched on to to the handlers rather than going through template specialization.
Attachment #8835848 - Attachment is obsolete: true
Attachment #8835848 - Flags: review?(jmuizelaar)
Attachment #8836389 - Flags: review?(jmuizelaar)
Simplified dispatching
Attachment #8835850 - Attachment is obsolete: true
Attachment #8835850 - Flags: review?(jmuizelaar)
Attachment #8836391 - Flags: review?(jmuizelaar)
Simplified dispatching
Attachment #8835851 - Attachment is obsolete: true
Attachment #8835851 - Flags: review?(jmuizelaar)
Attachment #8836392 - Flags: review?(jmuizelaar)
Adds gtests for PremultiplyData/UnpremultiplyData/SwizzleData that verify all common format conversions work and do at least some minimum of expected things right.
Attachment #8836393 - Flags: review?(jmuizelaar)
Attachment #8835853 - Flags: review?(jmuizelaar) → review+
Attachment #8835854 - Flags: review?(jmuizelaar) → review+
Attachment #8835855 - Flags: review?(jmuizelaar) → review+
Attachment #8836393 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8836389 [details] [diff] [review]
Add Moz2d API for optimized swizzle/premultiply/unpremultiply

Review of attachment 8836389 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't review this super closely as I'm just eager to have something in the tree and any badness should be easy to catch and will not escape this file.

::: gfx/2d/Swizzle.cpp
@@ +16,5 @@
> + */
> +
> +// Hash the formats to a relatively dense value to optimize jump table generation.
> +#define FORMAT_KEY(aSrcFormat, aDstFormat) \
> +  (int(aSrcFormat) * 6 + int(aDstFormat) + (int(int(aDstFormat) >= 6) << 6))

It's worth have a comment that gives an overview of what's happening here. e.g. how are hash collisions dealt with.
Attachment #8836389 - Flags: review?(jmuizelaar) → review+
Attachment #8836391 - Flags: review?(jmuizelaar) → review+
Attachment #8836392 - Flags: review?(jmuizelaar) → review+
Pushed by lsalzman@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5831b623b4ee
part 1 - Add Moz2d API for optimized swizzle/premultiply/unpremultiply. r=jrmuizel
https://hg.mozilla.org/integration/mozilla-inbound/rev/93d02309c315
part 2 - Optimize Moz2d swizzles for SSE2. r=jrmuizel
https://hg.mozilla.org/integration/mozilla-inbound/rev/982acc0d3b89
part 3 - Optimize Moz2d swizzles for ARM NEON. r=jrmuizel
https://hg.mozilla.org/integration/mozilla-inbound/rev/915e36e7ee7e
part 4 - Use Moz2d swizzles in Canvas2D. r=jrmuizel
https://hg.mozilla.org/integration/mozilla-inbound/rev/5917ab01b18e
part 5 - Use Moz2d swizzles in thebes. r=jrmuizel
https://hg.mozilla.org/integration/mozilla-inbound/rev/b30c0559cf77
part 6 - Use Moz2d swizzles in Moz2d. r=jrmuizel
https://hg.mozilla.org/integration/mozilla-inbound/rev/982b64c2bf76
part 7 - add gtest for Moz2d swizzles. r=jrmuizel
Depends on: 1339058
Depends on: 1340297
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: