Closed
Bug 921495
Opened 11 years ago
Closed 11 years ago
AlphaBoxBlur reduces opacity due to bad rounding
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
FIXED
mozilla28
People
(Reporter: mstange, Assigned: mstange)
Details
Attachments
(3 files, 2 obsolete files)
236 bytes,
text/html
|
Details | |
4.45 KB,
patch
|
bas.schouten
:
review+
|
Details | Diff | Splinter Review |
5.76 KB,
patch
|
bas.schouten
:
review+
|
Details | Diff | Splinter Review |
In this testcase, the color values in the opaque white part of the shadow should all be #FFFFFF, but instead they are #FCFCFC. Blurring uses three passes and reduces the color values by 1 per pass due to a rounding problem. I have patches locally that convert the SVG gaussian blur filter to use AlphaBoxBlur internally, and due to this bug, feGaussianBlur-1.svg and feGaussianBlur-2.svg currently fail.
Assignee | ||
Comment 1•11 years ago
|
||
The color values are calculated as ((uint64_t(1) << 32) / boxSize * valueSum) >> 32. For a boxSize of 1521, for example, blurring only pixels with value 255 gives: ((uint64_t(1) << 32) / 1521 * (255 * 1521) >> 32 == (2823778 * 387855 ) >> 32 == 1095216416190 >> 32 == 254 <-- !! and 1095216416190 / 2.0e32 == 254.999943. I think the right way to fix this is to use a rounding shift, or, equivalently, to add 2^31 before the shift. This patch makes the DrawShadow200x200LargeRadius perftest slower by 4.3% on my machine, using the SSE2 code.
Attachment #811151 -
Flags: review?(bas)
Assignee | ||
Comment 2•11 years ago
|
||
This patch improves performance on the DrawShadow200x200LargeRadius perftest for me by 14.8% after the part1 patch. Compared to the original code, part1 + part2 are 11.2% faster than the original code. (Only taking part2 would make us 14.5% faster than the original code, but I'm not advocating that because I want the rounding fix.)
Attachment #811154 -
Flags: review?(bas)
Assignee | ||
Comment 3•11 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #1) > and 1095216416190 / 2.0e32 == 254.999943. ^ This was supposed to be a "2.0^32", of course. Perf numbers for the Blur500x500x50 perftest (which is only on the Moz2d filters branch at the moment): original: 544.396ms +/- 3.00381 only part1: 573.329ms +/- 9.1663 -- 5.3% slower than original part1+part2: 473.078ms +/- 6.34367 -- 13.1% faster than original only part2: 455.779ms +/- 5.02724 -- 16.3% faster than original
Assignee | ||
Comment 4•11 years ago
|
||
If a 5% regression is the cutoff for taking the rounding fix, we can reorder the numbers from comment 3: original: 544.396ms +/- 3.00381 only part2: 455.779ms +/- 5.02724 -- 16.3% faster than original part1+part2: 473.078ms +/- 6.34367 -- 3.7% slower than with speed fix :-)
Assignee | ||
Comment 5•11 years ago
|
||
Comment on attachment 811154 [details] [diff] [review] part 2: regain performance Review of attachment 811154 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/2d/BlurSSE2.cpp @@ +33,4 @@ > > +MOZ_ALWAYS_INLINE > +__m128i BlurFourPixels(__m128i aTopLeft, __m128i aTopRight, > + __m128i aBottomRight, __m128i aBottomLeft, This actually does not compile on Windows because VC does not allow passing more than three __m128i arguments on the stack. I'll change it to const __m128i&, it will get inlined anyway.
Assignee | ||
Comment 6•11 years ago
|
||
This reorders the union so that this compiles on Windows. Note to self: If we don't take this fix, I need to make feGaussianBlur-1.svg and feGaussianBlur-2.svg fuzzy (1,12800 and 3,1600) when landing filters.
Attachment #811151 -
Attachment is obsolete: true
Attachment #811151 -
Flags: review?(bas)
Attachment #8334154 -
Flags: review?(bas)
Assignee | ||
Comment 7•11 years ago
|
||
Attachment #811154 -
Attachment is obsolete: true
Attachment #811154 -
Flags: review?(bas)
Attachment #8334155 -
Flags: review?(bas)
Comment 8•11 years ago
|
||
Comment on attachment 8334154 [details] [diff] [review] part 1: fix rounding Review of attachment 8334154 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/2d/BlurSSE2.cpp @@ +17,5 @@ > + const __m128i mask = _mm_setr_epi32(0x0, 0xffffffff, 0x0, 0xffffffff); > + const union { > + int64_t i64[2]; > + __m128i m; > + } roundingAddition = { { int64_t(1) << 31, int64_t(1) << 31 } }; I wonder if all compilers do this well. If you perftested this on all platforms I'm happy.
Attachment #8334154 -
Flags: review?(bas) → review+
Updated•11 years ago
|
Attachment #8334155 -
Flags: review?(bas) → review+
Assignee | ||
Comment 9•11 years ago
|
||
Good thing you made me check, it was slow on Windows. Making roundingAddition static fixes it, luckily.
Updated•11 years ago
|
Attachment #811151 -
Flags: review?(bas)
Updated•11 years ago
|
Attachment #811154 -
Flags: review?(bas)
Assignee | ||
Comment 10•11 years ago
|
||
http://hg.mozilla.org/integration/mozilla-inbound/rev/bfb947e70395 http://hg.mozilla.org/integration/mozilla-inbound/rev/54825f09185b http://hg.mozilla.org/integration/mozilla-inbound/rev/aadb3dfc9fe9
Flags: in-testsuite+
Comment 11•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/bfb947e70395 https://hg.mozilla.org/mozilla-central/rev/54825f09185b https://hg.mozilla.org/mozilla-central/rev/aadb3dfc9fe9
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
You need to log in
before you can comment on or make changes to this bug.
Description
•