Closed Bug 683243 Opened 13 years ago Closed 12 years ago

Dithered gradients for 16bpp surfaces

Categories

(Core :: Graphics, defect)

ARM
All
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla16
blocking-kilimanjaro +
blocking-basecamp +

People

(Reporter: stechz, Assigned: jrmuizel)

References

Details

(Keywords: mobile)

Attachments

(3 files, 9 obsolete files)

To eliminate banding on fennec.
This is the basic idea. The gradients that this fixes look pretty awesome to me. On the minus side, this doesn't get any of our fast-paths so we either need to add dithering to our fast paths or find a more general place to put this.
Assignee: nobody → doug.turner
Attachment #556941 - Attachment is obsolete: true
From what I understand gradients are slow on mobile already. Using a dithering matrix like this will probably make it a harder exercise to make them fast? Isn't it a better idea to make gradients fast first and then look into a dithering method that is the right performance/quality trade-off?
Gradients are slow already, though that matters a lot less when we are painting in the content process. Using the Bayer matrix is a pretty fast pass to tack on.

My gut says we won't notice the extra checkerboard time but we have been noticing the terrible banding we see on our phones. We could verify my gut by writing something that paints a lot of gradients a second using mozRequestAnimationFrame, and compare them.
(In reply to Benjamin Stover (:stechz) from comment #5)
> Gradients are slow already, though that matters a lot less when we are
> painting in the content process. Using the Bayer matrix is a pretty fast
> pass to tack on.
> 
> My gut says we won't notice the extra checkerboard time but we have been
> noticing the terrible banding we see on our phones. We could verify my gut
> by writing something that paints a lot of gradients a second using
> mozRequestAnimationFrame, and compare them.

What I meant was that if gradients were made to be -not- slow (i.e. new gradient code). I believe the SIMD instruction sets make linear interpolation very easy to do. The lookups required for the Bayer matrix will be a lot harder to optimize I think. But I might be completely wrong :). Derf might have a better idea.
Keep in mind, since hardware acceleration is near for Android, it might make sense to start putting background gradients in their own layers (but only when we're hw acceled).
(In reply to Benjamin Stover (:stechz) from comment #7)
> Keep in mind, since hardware acceleration is near for Android, it might make
> sense to start putting background gradients in their own layers (but only
> when we're hw acceled).

We've discussed this idea before, obviously the fixed function interpolators are great for gradients! However I doubt on mobile we'll have the shader power to have dithering there.
Keywords: mobile
Hardware: All → ARM
Blocks: 687835
Blocks: 694828
Assignee: doug.turner → nobody
Keywords: helpwanted
This is something we critically need for b2g as well (obviously).
Does skia already dither gradients for 16bpp surfaces?  That would be a nice solution to this problem.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #10)
> Does skia already dither gradients for 16bpp surfaces?  That would be a nice
> solution to this problem.

Yes.
Skia is not going to be ready for general content any time soon. I think we can ship B2G with this, given that Gaia is going to ship gradient images; further, we've already shipped multiple Fennec releases with this bug.

A shorter-term fix would be to implement dithered gradients in pixman, but we have nobody allocated to do that.

For that reason, I'm removing it from the b2g-product-phone blocker list.
No longer blocks: b2g-product-phone
This is a product decision.  We can find other people to work on this.  Now is not the time to make the call.  Manually imaging and dithering every single gradient in the gaia UI is just creating a massive burden for another team on the critical path to b2g ship.  We have nobody allocated to do that.
Comment 14 means this still blocks. Decision needed by platform & graphics on solution.
Whiteboard: [b2g:blocking+]
blocking-basecamp: --- → +
blocking-kilimanjaro: --- → +
Whiteboard: [b2g:blocking+]
Here's some information on the situation:

Android switched from a default of 16-bit to 32-bit surfaces in 2.3. This largely solves the gradient looking bad problem.

Skia draws gradients by building up a fixed size cache of the gradient. (Imagine drawing the gradient to a 1x256 image) it then does a NEAREST scale of this cache to the destination. This is how it manages to be much faster than pixman, which is doing a linear interpolation between the nearest two stops.

To dither the gradients Skia doubles the cache to 2x256 and reorders some of the components of the 2nd line. Then, when it's scanning across a line it alternately picks the value from 1st and 2nd line, producing the dithering effect.

It should be possible to implement this kind alternating dithering in the existing pixman gradient code, but it will slow the already slow gradients down further. Alternatively, we could add support to pixman to use a cache like Skia which would give us faster dithered gradients but will be more work.
No longer blocks: 687835
I don't understand how 32-bit surfaces help ameliorate this problem on 16bpp displays, but it's probably best for me to enlighten myself privately.

What's the feasibility of just importing the skia code into pixman as an alternate gradient drawing backend?
(In reply to Chris Jones [:cjones] [:warhammer] from comment #17)
> I don't understand how 32-bit surfaces help ameliorate this problem on 16bpp
> displays, but it's probably best for me to enlighten myself privately.

I believe, at least in some cases, the screens would be dithered from 32bit by the lcd controller.

> 
> What's the feasibility of just importing the skia code into pixman as an
> alternate gradient drawing backend?

It feels like that would be more work than just porting the technique, but I'd need to look at the feasibility more to make an accurate judgement.
We discussed this a bit offline.  Thanks for enlightening me.

Jeff suggests that porting the dumb dithering technique to pixman (slowing it down more) would be relatively easy.  It would be interesting to compare the performance/quality of (i) skia vs. (ii) current pixman vs. (iii) pixman-with-dumb-dithering.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #19)
> We discussed this a bit offline.  Thanks for enlightening me.
> 
> Jeff suggests that porting the dumb dithering technique to pixman (slowing
> it down more) would be relatively easy.  It would be interesting to compare
> the performance/quality of (i) skia vs. (ii) current pixman vs. (iii)
> pixman-with-dumb-dithering.

The thing I forgot about when suggesting this, is that pixman's pipeline is currently all 32 bit for gradients. It will take a bit of surgery to make it so we can produce gradients in 16bits.
So pixman just chops off the low bits when filling a 16 bit surface with a gradient?
(In reply to Chris Jones [:cjones] [:warhammer] from comment #21)
> So pixman just chops off the low bits when filling a 16 bit surface with a
> gradient?

Correct.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #20)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #19)
> > We discussed this a bit offline.  Thanks for enlightening me.
> > 
> > Jeff suggests that porting the dumb dithering technique to pixman (slowing
> > it down more) would be relatively easy.  It would be interesting to compare
> > the performance/quality of (i) skia vs. (ii) current pixman vs. (iii)
> > pixman-with-dumb-dithering.
> 
> The thing I forgot about when suggesting this, is that pixman's pipeline is
> currently all 32 bit for gradients. It will take a bit of surgery to make it
> so we can produce gradients in 16bits.

I had a look at what it would take to add a 16bit pipeline to pixman and it doesn't seem _too_ bad. We may be able to get something that works well enough without too much trouble.
Attached patch Sketch out a 16bit pipeline (obsolete) — Splinter Review
Attachment #633032 - Attachment is obsolete: true
Attached patch 16 bit pipeline (obsolete) — Splinter Review
Attachment #633320 - Attachment is obsolete: true
Attached patch dithering of linear gradients (obsolete) — Splinter Review
This should mostly work. However, it only works on opaque gradients that end up being painted with SOURCE. This means, a rounded rectangle with a gradient will not be dithered because it goes through the regular 32 bit path. This should be fixable.
This version supports OVER of opaque gradients with a mask. It should fix a bunch of the banding that was still happening.
Attachment #634502 - Attachment is obsolete: true
Bad banding is gone on examples I provided before except

http://people.mozilla.com/~cjones/email-gradient-banding.html

It appears that the gradient is being dithered though, but maybe the dithering algorithm is just falling over.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #29)
> Bad banding is gone on examples I provided before except
> 
> http://people.mozilla.com/~cjones/email-gradient-banding.html
> 
> It appears that the gradient is being dithered though, but maybe the
> dithering algorithm is just falling over.

Yes, the dithering we do is a very simple 2x2 dither. In some cases it won't be sufficient to eliminate all of the banding. While, it's possible to use a higher quality dither, it will be slower and it will make it more difficult to use the fast table based approach that Skia uses.

In the case of http://people.mozilla.com/~cjones/email-gradient-banding.html it probably makes more sense to modify the design so that the banding is not as noticeable.
Keywords: helpwanted
Seeing nullptr crashes below

    compose = _pixman_implementation_lookup_combiner (
	imp->toplevel, op, component_alpha, narrow, !!rgb16);

at the table lookup this calls into.  Sorry lost backtrace, don't have op.

My build from 

https://github.com/cgjones/platform-demo-mc/commit/38a3ff0cb44011a30a7d18b26c4ef91bdff7a838

crashes on desktop and device.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #31)
> Seeing nullptr crashes below
> 
>     compose = _pixman_implementation_lookup_combiner (
> 	imp->toplevel, op, component_alpha, narrow, !!rgb16);
> 
> at the table lookup this calls into.  Sorry lost backtrace, don't have op.
> 
> My build from 
> 
> https://github.com/cgjones/platform-demo-mc/commit/
> 38a3ff0cb44011a30a7d18b26c4ef91bdff7a838
> 
> crashes on desktop and device.

Can you reproduce them reliably?
Attached patch 16 bit pipeline v3 (obsolete) — Splinter Review
This adds a 16 bit fetch which will help avoid a bunch format conversion in the common case.
Attachment #636456 - Attachment is obsolete: true
Attached patch dithering of linear gradients v2 (obsolete) — Splinter Review
This speeds up dithering of vertical gradients
Attachment #634503 - Attachment is obsolete: true
Attached patch dithering of linear gradients v2 (obsolete) — Splinter Review
Attachment #637636 - Attachment is obsolete: true
pixman_combine_32_func_t
_pixman_implementation_lookup_combiner (pixman_implementation_t *imp,
					pixman_op_t		 op,
					pixman_bool_t		 component_alpha,
					pixman_bool_t		 narrow,
					pixman_bool_t		 rgb16)
{
    pixman_combine_32_func_t f;

    do
    {
	pixman_combine_32_func_t (*combiners[]) =
	{
	    (pixman_combine_32_func_t *)imp->combine_64,
	    (pixman_combine_32_func_t *)imp->combine_64_ca,
	    imp->combine_32,
	    imp->combine_32_ca,
	    (pixman_combine_32_func_t *)imp->combine_16,
	    NULL,
	};

	f = combiners[component_alpha + (rgb16 ? 4 : (narrow << 1))][op];

component_alpha == 1, which it never is on "mobile", and for some reason rgb16 == 1, so we end up looking up null[op].  Maybe you fixed this in latest patches.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #32)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #31)
> > Seeing nullptr crashes below
> > My build from 
> > 
> > https://github.com/cgjones/platform-demo-mc/commit/
> > 38a3ff0cb44011a30a7d18b26c4ef91bdff7a838
> > 
> > crashes on desktop and device.
> 
> Can you reproduce them reliably?

The on-device crash was a red herring, caused by unrelated bustage.  Desktop crash reproduces 100%.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #36)

> component_alpha == 1, which it never is on "mobile", and for some reason
> rgb16 == 1, so we end up looking up null[op].  Maybe you fixed this in
> latest patches.

Yes this has been fixed in my latest patches.
Jeff, what's left to get these patches reviewed and landed?
This adds support for radial gradients and should fix all of the crashes
Attachment #637634 - Attachment is obsolete: true
Attachment #641093 - Flags: review?(bgirard)
This adds support for dithering radial gradients as well.
Attachment #637643 - Attachment is obsolete: true
Attachment #641094 - Flags: review?(bgirard)
Comment on attachment 641094 [details] [diff] [review]
dithering of gradients v3

Review of attachment 641094 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/cairo/libpixman/src/pixman-linear-gradient.c
@@ +240,4 @@
>      int             y      = iter->y;
>      int             width  = iter->width;
>      uint16_t *      buffer = (uint16_t*)iter->buffer;
> +    int toggle             = ((x ^ y) & 1);

I would rather see this be a bool and use ! operations rather then ^

@@ +313,5 @@
> +	    color16 = dither_8888_to_0565(color, toggle);
> +	    color16b = dither_8888_to_0565(color, toggle^1);
> +	    // compute the difference
> +	    dither_diff =  color16 ^ color16b;
> +	    while (buffer < end) {

This deserve a comment explain how your using this to toggle from color16/color16b

::: gfx/cairo/libpixman/src/pixman-radial-gradient.c
@@ +496,4 @@
>      int y = iter->y;
>      int width = iter->width;
>      uint16_t *buffer = iter->buffer;
> +    int toggle             = ((x ^ y) & 1);

same
Comment on attachment 641094 [details] [diff] [review]
dithering of gradients v3

Review of attachment 641094 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/cairo/libpixman/src/pixman-linear-gradient.c
@@ +240,4 @@
>      int             y      = iter->y;
>      int             width  = iter->width;
>      uint16_t *      buffer = (uint16_t*)iter->buffer;
> +    int toggle             = ((x ^ y) & 1);

As discussed change to a pixman bool but we will keep the XOR because it's faster.
Attachment #641094 - Flags: review?(bgirard) → review+
Comment on attachment 641093 [details] [diff] [review]
16 bit pipeline v4

Review of attachment 641093 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/cairo/libpixman/src/pixman-bits-image.c
@@ +1419,5 @@
> +        }
> +        else
> +        {
> +	    iter->get_scanline = dest_get_scanline_16;
> +	    iter->write_back = dest_write_back_16;

You could pull out write_back

::: gfx/cairo/libpixman/src/pixman-combine16.c
@@ +39,5 @@
> +combine_src_u (pixman_implementation_t *imp,
> +               pixman_op_t              op,
> +               uint32_t *                dest,
> +               const uint32_t *          src,
> +               const uint32_t *          mask,

indent nit.

@@ +83,5 @@
> +    if (!mask)
> +        memcpy (dest, src, width * sizeof (uint16_t));
> +    else
> +    {
> +        uint16_t *d = (uint32_t*)dest;

uint16_t

@@ +98,5 @@
> +            } else {
> +                // the mask is still 32bits
> +                uint32_t s = combine_mask (convert_0565_to_8888(*src16), *mask);
> +                uint32_t ia = ALPHA_8 (~s);
> +                uint32_t d32 = convert_0565_to_8888(*(d + i));

We should add a comment about having a 32 bit mask in a 16 bit pipeline.

::: gfx/cairo/libpixman/src/pixman-general.c
@@ +128,4 @@
>  	Bpp = 8;
>      }
>  
> +    // XXX: deal with when we have a mask

Can be removed.

::: gfx/cairo/libpixman/src/pixman-linear-gradient.c
@@ +298,5 @@
> +	next_inc = 0;
> +
> +	if (((pixman_fixed_32_32_t )(inc * width)) == 0)
> +	{
> +	    register uint32_t color;

uint16_t, maybe remove register?

@@ +377,5 @@
>      if (linear_gradient_is_horizontal (
>  	    iter->image, iter->x, iter->y, iter->width, iter->height))
>      {
> +	if (iter->flags & ITER_16)
> +	    linear_get_scanline_16 (iter, NULL);

Would be nice to support a fast path for dithered 16 hoz gradients

::: gfx/cairo/libpixman/src/pixman-private.h
@@ +211,5 @@
>       */
>      ITER_LOCALIZED_ALPHA =	(1 << 1),
>      ITER_IGNORE_ALPHA =		(1 << 2),
> +    ITER_IGNORE_RGB =		(1 << 3),
> +    ITER_16 =			(1 << 4)

Put a comment here that we now have 2 flags for the 3 pipeline (16,32,64).
Attachment #641093 - Flags: review?(bgirard) → review+
Assignee: nobody → jmuizelaar
Target Milestone: --- → mozilla16
Sorry, backed out for the invisible differences in the six reftests in https://tbpl.mozilla.org/php/getParsedLog.php?id=13479764&tree=Mozilla-Inbound
Target Milestone: mozilla16 → ---
https://hg.mozilla.org/mozilla-central/rev/369c2e5757ea
https://hg.mozilla.org/mozilla-central/rev/ab3a6f89cb7f
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
Verified in a Nighly 17.0a1 and Aurora 16.0a2 build from today (2012-07-31)
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: