Closed
Bug 416305
Opened 17 years ago
Closed 17 years ago
Speed up Gaussian blur filter considerably by optimizing for alpha-only surfaces
Categories
(Core :: SVG, defect)
Core
SVG
Tracking
()
RESOLVED
FIXED
People
(Reporter: roc, Assigned: roc)
References
Details
(Keywords: perf)
Attachments
(3 files)
151.83 KB,
patch
|
longsonr
:
review+
MatsPalmgren_bugz
:
superreview+
|
Details | Diff | Splinter Review |
12.33 KB,
patch
|
longsonr
:
review+
MatsPalmgren_bugz
:
superreview+
|
Details | Diff | Splinter Review |
75.73 KB,
image/jpeg
|
Details |
It's pretty common to use Gaussian blur with a SourceAlpha source to blur just the alpha channel, to get a shadow. Examples do this often (e.g. the infamous http://ejohn.org/files/ecma-cloud.svg ) and the spec even suggests optimizing for it. It's a really easy optimization to detect a SourceAlpha source here and should speed up such blurs by nearly a factor of 4. Probably worth doing for 1.9 since it's easy, and shadow blurs are common and one of our slowest filter operations.
Flags: blocking1.9?
Updated•17 years ago
|
Flags: tracking1.9? → blocking1.9?
Comment 1•17 years ago
|
||
roc, do you want to block on this?
Assignee | ||
Comment 2•17 years ago
|
||
No, but we could make it wanted1.9.
Assignee | ||
Comment 4•17 years ago
|
||
Not for 1.9, but I'm working on it.
Flags: wanted1.9-
Flags: wanted1.9+
Flags: wanted-next+
Assignee | ||
Comment 5•17 years ago
|
||
I have a fix for this in my tree, works great. I'll break it out for submission later.
Assignee | ||
Comment 6•17 years ago
|
||
Okay, here's a big patch that restructures filters to lay the groundwork for the optimizations. There are a few things going on here:
-- nsSVGFilterResource is eliminated. Everything the filter primitives need is provided to them directly.
-- Instead of messing around with an "image dictionary", the filter primitive Filter methods are passed Image objects representing their inputs and write to an Image output object. The Image object contains a gfxImageSurface, a copy of the filter primitive subregion that generate the object, and the colormodel.
-- The Filter method takes a new rect parameter indicating which pixels need to be computed. This rectangle is relative to the temporary image surfaces.
-- Most of the filter logic is moved from nsSVGFilterFrame to nsSVGFilterInstance. An nsSVGFilterInstance contains the FilterAnalysis graph that we were building before, and uses that graph for various analyses as well as (now) the actual drawing. The image dictionary no longer exists because all the image dataflow is encoded in the graph now.
I apologise for the big patch, but it actually does simplify the code quite a bit and makes it easy to develop additional analyses of the filter structure for big wins.
This patch was not intended to change behaviour but it actually improved my gaussmark test from 84s to 81s.
Attachment #326837 -
Flags: superreview?(mats.palmgren)
Attachment #326837 -
Flags: review?(longsonr)
Assignee | ||
Comment 7•17 years ago
|
||
That patch also has a potential big memory usage saving, because it frees image surfaces once we know there are no more uses in the filter pipeline. So a linear pipeline of N primitives goes from N+1 temporary image surfaces to 2.
Assignee | ||
Comment 8•17 years ago
|
||
This fixes this bug. It's a tiny patch on top of the last one.
This reduced the time for my gaussmark test from 81s to 65s. I experimented with hoisting the aAlphaOnly check out --- making a special alpha-only inner loop --- but that did not produce a significant performance improvement. Probably because the branch is extremely well predicted.
Attachment #326838 -
Flags: superreview?(mats.palmgren)
Attachment #326838 -
Flags: review?(longsonr)
Comment 9•17 years ago
|
||
Comment on attachment 326837 [details] [diff] [review]
part 1: restructure filters
>+ result.mSource = new gfxImageSurface(scaledSize,
>+ gfxASurface::ImageFormatARGB32);
>+ result.mTarget = new gfxImageSurface(scaledSize,
>+ gfxASurface::ImageFormatARGB32);
>+ if (!result.mSource || result.mSource->CairoStatus() ||
>+ !result.mTarget || result.mTarget->CairoStatus())
>+ {
{ should be on the previous line
>+nsSVGFETurbulenceElement::Filter(nsSVGFilterInstance *instance,
>+ const nsTArray<const Image*>& aSources,
>+ const Image* aTarget,
>+ const nsIntRect& rect)
>+{
>+ PRUint8* targetData = aTarget->mImage->Data();
>+ PRUint32 stride = aTarget->mImage->Stride();
>+
>+ nsIntRect filterSubregion(PRInt32(aTarget->mFilterPrimitiveSubregion.X()),
>+ PRInt32(aTarget->mFilterPrimitiveSubregion.Y()),
>+ PRInt32(aTarget->mFilterPrimitiveSubregion.Width()),
>+ PRInt32(aTarget->mFilterPrimitiveSubregion.Height()));
Why not do this via the new GfxRectToIntRect function?
>- if (x == rect.x) {
>+ if (x == 0) {
> xflag = 0;
>- } else if (x == rect.XMost() - 1) {
>+ } else if (x == rect.width - 1) {
> xflag = 2;
> } else {
> xflag = 1;
> }
>- if (y == rect.y) {
>+ if (y == 0) {
> yflag = 0;
>- } else if (y == rect.YMost() - 1) {
>+ } else if (y == rect.height - 1) {
> yflag = 2;
> } else {
> yflag = 1;
> }
This looks right but please ensure the testcase for bug 429774 is checked in as a crashtest to prove it.
> protected:
> virtual PRBool OperatesOnSRGB(nsSVGFilterInstance* aInstance,
>- nsSVGString* aString) {
>- if (aString == &mStringAttributes[IN1]) {
>- return aInstance->LookupImageColorModel(aString->GetAnimValue()).mColorSpace ==
>- nsSVGFilterInstance::ColorModel::SRGB;
>- }
>+ PRUint32 aInput, Image* aImage) {
>+ if (aInput == 0 && aImage)
>+ return aImage->mColorModel.mColorSpace == ColorModel::SRGB;
>
just check aImage and ignore aInput to match comment in nsSVGFilters.h
>+nsresult
>+nsSVGFilterInstance::BuildSourceImages()
>+{
...
>+ // Clear the color channel
>+ PRUint32* src = reinterpret_cast<PRUint32*>(sourceColorAlpha->Data());
Nit: prefer const pointer for src.
>+ PRUint32* dest = reinterpret_cast<PRUint32*>(mSourceAlpha.mImage.mImage->Data());
>+ for (PRInt32 yy = 0; yy < mSurfaceRect.height; yy++) {
>+ PRUint32 rowOffset = (mSourceAlpha.mImage.mImage->Stride()*yy) >> 2;
>+ for (PRInt32 xx = 0; xx < mSurfaceRect.width; xx++) {
>+ dest[rowOffset + xx] = src[rowOffset + xx] & 0xFF000000U;
>+ }
>+ }
Nit: Could just use x and y as the variable names now you've split this into separate functions.
Attachment #326837 -
Flags: review?(longsonr) → review+
Updated•17 years ago
|
Attachment #326838 -
Flags: review?(longsonr) → review+
Assignee | ||
Comment 10•17 years ago
|
||
(In reply to comment #9)
> { should be on the previous line
OK
> Why not do this via the new GfxRectToIntRect function?
Yeah OK
> This looks right but please ensure the testcase for bug 429774 is checked in
> as a crashtest to prove it.
OK
>
> > protected:
> > virtual PRBool OperatesOnSRGB(nsSVGFilterInstance* aInstance,
> >- nsSVGString* aString) {
> >- if (aString == &mStringAttributes[IN1]) {
> >- return aInstance->LookupImageColorModel(aString->GetAnimValue()).mColorSpace ==
> >- nsSVGFilterInstance::ColorModel::SRGB;
> >- }
> >+ PRUint32 aInput, Image* aImage) {
> >+ if (aInput == 0 && aImage)
> >+ return aImage->mColorModel.mColorSpace == ColorModel::SRGB;
> >
>
> just check aImage and ignore aInput to match comment in nsSVGFilters.h
Not sure what you mean here. The code wants to leave the first input in its current format, and do the default for the other input and the output. So we need to test both aInput and aImage.
> Nit: prefer const pointer for src.
OK.
> >+ PRUint32* dest = reinterpret_cast<PRUint32*>(mSourceAlpha.mImage.mImage->Data());
> >+ for (PRInt32 yy = 0; yy < mSurfaceRect.height; yy++) {
> >+ PRUint32 rowOffset = (mSourceAlpha.mImage.mImage->Stride()*yy) >> 2;
> >+ for (PRInt32 xx = 0; xx < mSurfaceRect.width; xx++) {
> >+ dest[rowOffset + xx] = src[rowOffset + xx] & 0xFF000000U;
> >+ }
> >+ }
>
> Nit: Could just use x and y as the variable names now you've split this into
> separate functions.
OK.
Mats, could you look at this? I've got a pretty long queue of stuff that follows this :-).
Comment 11•17 years ago
|
||
(In reply to comment #10)
> Not sure what you mean here. The code wants to leave the first input in its
> current format, and do the default for the other input and the output. So we
> need to test both aInput and aImage.
>
My mistake, I read it wrong. Ignore that comment.
Comment 12•17 years ago
|
||
Comment on attachment 326837 [details] [diff] [review]
part 1: restructure filters
sr=mats
> content/svg/content/src/nsSVGFilters.cpp
>+CopyRect(...)
> {
>+ NS_ASSERTION(aDest->mImage->Stride() == aSrc->mImage->Stride(), "stride mismatch");
>+ CopyDataRect(aDest->mImage->Data(), aSrc->mImage->Data(),
>+ aSrc->mImage->Stride(), aDataRect);
Can we assert aDataRect is within bounds?
(if so, in GaussianBlur() too)
Attachment #326837 -
Flags: superreview?(mats.palmgren) → superreview+
Comment 13•17 years ago
|
||
Comment on attachment 326838 [details] [diff] [review]
part 2: fix
sr=mats
- SUM(0); SUM(1); SUM(2); SUM(3);
+ if (!aAlphaOnly) { SUM(GFX_ARGB32_OFFSET_B);
+ SUM(GFX_ARGB32_OFFSET_G);
+ SUM(GFX_ARGB32_OFFSET_R); }
+ SUM(GFX_ARGB32_OFFSET_A);
Why not use SUM_PIXEL() here too?
Attachment #326838 -
Flags: superreview?(mats.palmgren) → superreview+
Assignee | ||
Comment 14•17 years ago
|
||
(In reply to comment #10)
> > Why not do this via the new GfxRectToIntRect function?
>
> Yeah OK
Actually we can't because GfxRectToIntRect fails if the input rectangle is not integer-aligned, so I left that as-is.
(In reply to comment #12)
> Can we assert aDataRect is within bounds?
> (if so, in GaussianBlur() too)
Yes, OK.
Assignee | ||
Comment 15•17 years ago
|
||
Pushed part 1 with those comments addressed: a36e8ebbe3f8.
Assignee | ||
Comment 16•17 years ago
|
||
(In reply to comment #13)
> (From update of attachment 326838 [details] [diff] [review])
> sr=mats
>
> - SUM(0); SUM(1); SUM(2); SUM(3);
> + if (!aAlphaOnly) { SUM(GFX_ARGB32_OFFSET_B);
> + SUM(GFX_ARGB32_OFFSET_G);
> + SUM(GFX_ARGB32_OFFSET_R); }
> + SUM(GFX_ARGB32_OFFSET_A);
>
> Why not use SUM_PIXEL() here too?
I could, but that would rely on cpp macros not using lexical scope (i.e. that SUM_PIXEL sees the redefinition of SUM, not the SUM in scope when SUM_PIXEL was declared), which seems a little bit too tricky to me.
Assignee | ||
Comment 17•17 years ago
|
||
Pushed part 2: baa5a51b7f90
Assignee | ||
Updated•17 years ago
|
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment 18•17 years ago
|
||
(In reply to comment #16)
FWIW, it looks like C and C++ standards requires it, 6.10.3.4/6.10.3.5:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
In the examples in 6.10.3.5 they have:
#define x 3
#define f(a) f(x * (a))
#undef x
#define x 2
...
f(y+1)
results in f(2 * (y+1))
(I found a C++ draft that have the same text and example)
Assignee | ||
Comment 19•17 years ago
|
||
I agree it's correct, I just prefer to avoid writing code that depends on obscure stuff like that.
Comment 20•17 years ago
|
||
I think this broke the shadow of the favicons in my Ctrl-Tab extension (<http://en.design-noir.de/mozilla/ctrl-tab/>):
svg|*.ctrlTab-icon {
filter: url(chrome://browser/content/browser.xul#ctrlTab-iconShadow);
}
...
<svg:filter id="ctrlTab-iconShadow">
<svg:feOffset result="shadow" in="SourceAlpha" dx="2" dy="-1"/>
<svg:feColorMatrix result="transparent-shadow" in="shadow"
values="1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0.5 0"/>
<svg:feBlend in="SourceGraphic" in2="transparent-shadow"/>
</svg:filter>
Comment 21•17 years ago
|
||
I also don't see the shadows on <http://ejohn.org/files/ecma-cloud.svg>. I'm on Windows.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 22•17 years ago
|
||
I still see shadows on http://ejohn.org/files/ecma-cloud.svg on Windows.
Comment 23•17 years ago
|
||
Here's what I see. Works as expected on the 1.9.0 branch.
Assignee | ||
Comment 24•17 years ago
|
||
Dão, you know as well as I do that we don't reopen bugs for regressions, unless we end up backing the patch out.
Status: REOPENED → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 25•17 years ago
|
||
I see the regression in the Windows nightly build.
Assignee | ||
Comment 26•17 years ago
|
||
I found what I'm pretty sure is the regression, filed it as bug 445268 with a patch. It only shows up on Windows nightly builds though, so I'll need to land that patch to verify.
You need to log in
before you can comment on or make changes to this bug.
Description
•