Closed Bug 750172 Opened 12 years ago Closed 12 years ago

Do background image scaling on the GPU

Categories

(Core :: Graphics, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla15
Tracking Status
fennec + ---

People

(Reporter: gal, Assigned: gal)

References

Details

(Keywords: mobile, perf)

Attachments

(2 files, 5 obsolete files)

Bug 650988 introduced scaling images on the GPU. We should do this for background images too. This requires adding a separate layer for the background image.
Blocks: 650988
No longer blocks: 651060
No longer depends on: 650988
Assignee: nobody → gal
Comment on attachment 619504 [details] [diff] [review]
Part 1: refactor nsDisplayImage so we can reuse GetLayerState and ConfigureLayer

>diff --git a/layout/base/nsDisplayList.h b/layout/base/nsDisplayList.h
>--- a/layout/base/nsDisplayList.h
>+++ b/layout/base/nsDisplayList.h
>@@ -1600,17 +1600,22 @@ public:
>   virtual nsRegion GetOpaqueRegion(nsDisplayListBuilder* aBuilder,
>                                    bool* aSnap,
>                                    bool* aForceTransparentSurface);
>   virtual bool IsVaryingRelativeToMovingFrame(nsDisplayListBuilder* aBuilder,
>                                                 nsIFrame* aFrame);
>   virtual bool IsUniform(nsDisplayListBuilder* aBuilder, nscolor* aColor);
>   virtual bool ShouldFixToViewport(nsDisplayListBuilder* aBuilder);
>   virtual nsRect GetBounds(nsDisplayListBuilder* aBuilder, bool* aSnap);
>+  virtual LayerState GetLayerState(nsDisplayListBuilder* aBuilder,
>+                                   LayerManager* aManager);
>   virtual void Paint(nsDisplayListBuilder* aBuilder, nsRenderingContext* aCtx);
>+  virtual already_AddRefed<Layer> BuildLayer(nsDisplayListBuilder* aBuilder,
>+                                             LayerManager* aManager,
>+                                             const ContainerParameters& aContainerParameters);
>   NS_DISPLAY_DECL_NAME("Background", TYPE_BACKGROUND)
>   // Returns the value of GetUnderlyingFrame()->IsThemed(), but cached
>   bool IsThemed() { return mIsThemed; }
> 
> protected:
>   nsRegion GetInsideClipRegion(nsPresContext* aPresContext, PRUint8 aClip,
>                                const nsRect& aRect, bool* aSnap);
> 
>diff --git a/layout/generic/nsImageFrame.cpp b/layout/generic/nsImageFrame.cpp
>--- a/layout/generic/nsImageFrame.cpp
>+++ b/layout/generic/nsImageFrame.cpp
>@@ -1243,36 +1243,36 @@ nsDisplayImage::GetDestRect()
>   nsRect dest = imageFrame->GetInnerArea() + ToReferenceFrame();
>   gfxRect destRect(dest.x, dest.y, dest.width, dest.height);
>   destRect.ScaleInverse(factor); 
> 
>   return destRect;
> }
> 
> LayerState
>-nsDisplayImage::GetLayerState(nsDisplayListBuilder* aBuilder,
>-                              LayerManager *aManager)
>+nsDisplayImage::GetLayerState(imgIContainer *aImage,
>+                              const gfxRect aDestRect,
>+                              nsPresContext *aPresContext)
> {
>-  if (mImage->GetType() != imgIContainer::TYPE_RASTER)
>+  if (aImage->GetType() != imgIContainer::TYPE_RASTER)
>     return LAYER_NONE;
> 
>   PRInt32 imageWidth;
>   PRInt32 imageHeight;
>-  mImage->GetWidth(&imageWidth);
>-  mImage->GetHeight(&imageHeight);
>+  aImage->GetWidth(&imageWidth);
>+  aImage->GetHeight(&imageHeight);
> 
>   NS_ASSERTION(imageWidth != 0 && imageHeight != 0, "Invalid image size!");
> 
>-  gfxRect destRect = GetDestRect();
>-
>   // Calculate the scaling factor for the frame.
>-  gfxSize scale = gfxSize(destRect.width / imageWidth, destRect.height / imageHeight);
>+  gfxSize scale = gfxSize(aDestRect.width / imageWidth,
>+                          aDestRect.height / imageHeight);
> 
>   // Adjust it by the zoom factor.
>-  nsIPresShell* presShell = mFrame->PresContext()->GetPresShell();
>+  nsIPresShell* presShell = aPresContext->GetPresShell();
>   scale.width *= presShell->GetXResolution();
>   scale.height *= presShell->GetYResolution();
> 
>   // If we are not scaling at all, no point in separating this into a layer.
>   if (scale.width == 1.0f && scale.height == 1.0f) {
>     return LAYER_INACTIVE;
>   }
> 
>@@ -1280,16 +1280,49 @@ nsDisplayImage::GetLayerState(nsDisplayL
>   // reduced memory usage of redrawing at the lower resolution.
>   if (scale.width * scale.height < 0.5 * 0.5) {
>     return LAYER_INACTIVE;
>   }
> 
>   return LAYER_ACTIVE;
> }
> 
>+void
>+nsDisplayImage::ConfigureLayer(ImageLayer *aLayer,
>+                               GraphicsFilter aFilter,
>+                               imgIContainer *aImage,
>+                               const gfxRect &aDestRect)
>+{
>+  aLayer->SetFilter(aFilter);
>+
>+  PRInt32 imageWidth;
>+  PRInt32 imageHeight;
>+  mImage->GetWidth(&imageWidth);
>+  mImage->GetHeight(&imageHeight);
>+
>+  NS_ASSERTION(imageWidth != 0 && imageHeight != 0, "Invalid image size!");
>+
>+  gfxMatrix transform;
>+  transform.Translate(aDestRect.TopLeft());
>+  transform.Scale(aDestRect.Width()/imageWidth,
>+                  aDestRect.Height()/imageHeight);
>+  aLayer->SetTransform(gfx3DMatrix::From2D(transform));
>+
>+  aLayer->SetVisibleRegion(nsIntRect(0, 0, imageWidth, imageHeight));
>+}
>+
>+LayerState
>+nsDisplayImage::GetLayerState(nsDisplayListBuilder* aBuilder,
>+                              LayerManager *aManager)
>+{
>+  return GetLayerState(mImage,
>+                       DestRect(),
>+                       mFrame->PresContext());
>+}
>+
> already_AddRefed<Layer>
> nsDisplayImage::BuildLayer(nsDisplayListBuilder* aBuilder,
>                            LayerManager* aManager,
>                            const ContainerParameters& aParameters)
> {
>   nsRefPtr<ImageContainer> container;
>   nsresult rv = mImage->GetImageContainer(getter_AddRefs(container));
>   NS_ENSURE_SUCCESS(rv, nsnull);
>@@ -1298,34 +1331,20 @@ nsDisplayImage::BuildLayer(nsDisplayList
>   layer->SetContainer(container);
>   ConfigureLayer(layer);
>   return layer.forget();
> }
> 
> void
> nsDisplayImage::ConfigureLayer(ImageLayer *aLayer)
> {
>-  aLayer->SetFilter(nsLayoutUtils::GetGraphicsFilterForFrame(mFrame));
>-
>-  PRInt32 imageWidth;
>-  PRInt32 imageHeight;
>-  mImage->GetWidth(&imageWidth);
>-  mImage->GetHeight(&imageHeight);
>-
>-  NS_ASSERTION(imageWidth != 0 && imageHeight != 0, "Invalid image size!");
>-
>-  const gfxRect destRect = GetDestRect();
>-
>-  gfxMatrix transform;
>-  transform.Translate(destRect.TopLeft());
>-  transform.Scale(destRect.Width()/imageWidth,
>-                  destRect.Height()/imageHeight);
>-  aLayer->SetTransform(gfx3DMatrix::From2D(transform));
>-
>-  aLayer->SetVisibleRegion(nsIntRect(0, 0, imageWidth, imageHeight));
>+  ConfigurLayer(aLayer,
>+                nsLayoutUtils::GetGraphicsFilterForFrame(mFrame),
>+                mImage,
>+                GetDestRect());
> }
> 
> void
> nsImageFrame::PaintImage(nsRenderingContext& aRenderingContext, nsPoint aPt,
>                          const nsRect& aDirtyRect, imgIContainer* aImage,
>                          PRUint32 aFlags)
> {
>   // Render the image into our content area (the area inside
>diff --git a/layout/generic/nsImageFrame.h b/layout/generic/nsImageFrame.h
>--- a/layout/generic/nsImageFrame.h
>+++ b/layout/generic/nsImageFrame.h
>@@ -46,16 +46,18 @@
> #include "nsIObserver.h"
> 
> #include "nsStubImageDecoderObserver.h"
> #include "imgIDecoderObserver.h"
> 
> #include "nsDisplayList.h"
> #include "imgIContainer.h"
> 
>+#include "gfxPattern.h"
>+
> class nsIFrame;
> class nsImageMap;
> class nsIURI;
> class nsILoadGroup;
> struct nsHTMLReflowState;
> struct nsHTMLReflowMetrics;
> struct nsSize;
> class nsDisplayImage;
>@@ -414,27 +416,41 @@ public:
>   /**
>    * Returns an ImageContainer for this image if the image type
>    * supports it (TYPE_RASTER only).
>    */
>   already_AddRefed<ImageContainer> GetContainer();
> 
>   gfxRect GetDestRect();
> 
>+  /**
>+   * Determine the layer state for the given image and presentation. This
>+   * is a static function so nsDisplayBackground can share this code
>+   * with us.
>+   */
>+  static LayerState GetLayerState(imgIContainer *aImage,
>+                                  nsPresContext *aPresContext);
>+
>+  /**
>+   * Configure an ImageLayer for this display item. Set the required filter
>+   * and scaling transform. This is static for the benefit of
>+   * nsDisplayBackground as well.
>+   */
>+  static void ConfigureLayer(ImageLayer* aLayer,
>+                             gfxPattern::GraphicsFilter aFilter,
>+                             imgIContainer *aImage,
>+                             const gfxRect &aDestRect);
>+
>   virtual LayerState GetLayerState(nsDisplayListBuilder* aBuilder,
>                                    LayerManager* aManager);
> 
>   virtual already_AddRefed<Layer> BuildLayer(nsDisplayListBuilder* aBuilder,
>                                              LayerManager* aManager,
>                                              const ContainerParameters& aContainerParameters);
> 
>-  /**
>-   * Configure an ImageLayer for this display item.
>-   * Set the required filter and scaling transform.
>-   */
>   void ConfigureLayer(ImageLayer* aLayer);
> 
>   NS_DISPLAY_DECL_NAME("Image", TYPE_IMAGE)
> private:
>   nsCOMPtr<imgIContainer> mImage;
> };
> 
> #endif /* nsImageFrame_h___ */
Attachment #619504 - Attachment is patch: true
I'm interested to see how this will turn out. Background drawing can be quite expensive however in some case it will require us to turn a 565 thebes layer into a RGBA thebes layer which is also slower to paint and upload.
BenWA, why must we render as 565?
Rasterizing the thebes layer + uploading in 565 format is significantly faster then the same in RGBA format. If we're rendering the background using an image layer then the content above it, such as text must be transparent/RGBA and thus will be slower to draw.

I'm hoping that drawing the background on the GPU instead of rasterizing it into the thebes layer will save enough to offset the increase cost of needing an RGBA thebes layer.

When we measure we will want to make sure that we disable the drawing with nearest shortcut.
Ok, so here is the plan:

ContainerState::ThebesLayerData::Accumulate checks whether the first thing we render is an image, and if so it keeps track of that and makes it eligible to be transformed into an image layer. We add support to nsDisplayBackground to indicate whether its one uniform image (similar to what we do with colors), and flag that image in Accumulate as well. The actual conversion we do in PopThebesLayerData similar to color layers. Roc, does this sound right?

This won't work for tiled cases and it won't work for multiple layers of background. For that we have to split nsDisplayBackground into separate display items.
(In reply to Andreas Gal :gal from comment #7)
> ContainerState::ThebesLayerData::Accumulate checks whether the first thing
> we render is an image, and if so it keeps track of that and makes it
> eligible to be transformed into an image layer. We add support to
> nsDisplayBackground to indicate whether its one uniform image (similar to
> what we do with colors), and flag that image in Accumulate as well. The
> actual conversion we do in PopThebesLayerData similar to color layers. Roc,
> does this sound right?

No, because then if you have a CSS background image with some other stuff (text etc), they'll get put together into a single ThebesLayer. The code you're looking at in Accumulate will only convert a ThebesLayer to an ImageLayer when it contains a single image. Generalizing that to handle CSS background images would be doable but wouldn't help the testcases you care about.

Here you need to do to nsDisplayBackground what you're doing to nsDisplayImage in the other bug. You won't be able to handle multiple-background cases without splitting the display item (which is doable) but multiple-backgrounds are not so common.
Attached patch patch (obsolete) — Splinter Review
Attachment #619506 - Attachment is obsolete: true
Comment on attachment 620147 [details] [diff] [review]
patch

Review of attachment 620147 [details] [diff] [review]:
-----------------------------------------------------------------

The nsCSSRendering refactoring part should go into its own patch for easier regression-finding.

::: layout/base/nsCSSRendering.cpp
@@ +1670,5 @@
>  
>  nscolor
>  nsCSSRendering::DetermineBackgroundColor(nsPresContext* aPresContext,
>                                           nsStyleContext* aStyleContext,
>                                           nsIFrame* aFrame)

Just remove this overload of DetermineBackgroundColor I think, and have its callers provide the dummy parameters.

::: layout/base/nsDisplayList.cpp
@@ +1090,5 @@
>    return rgn.Contains(aContainedRect);
>  }
>  
> +bool
> +nsDisplayBackground::CanOptimizeToImageLayer(nsDisplayListBuilder* aBuilder)

This has side effects, so better verb the name. Say "TryOptimizeToImageLayer"?

@@ +1118,5 @@
> +      bg->BottomLayer().mRepeat.mXRepeat == NS_STYLE_BG_REPEAT_REPEAT &&
> +      bg->BottomLayer().mRepeat.mYRepeat == NS_STYLE_BG_REPEAT_REPEAT &&
> +      bg->BottomLayer().mImage.IsOpaque()) {
> +    drawBackgroundColor = false;
> +  }

Seems to me that this can move into DetermineBackgroundColor so the callers don't need to duplicate code.

@@ +1151,5 @@
> +
> +  // We currently can't handle tiled backgrounds.
> +  if (state.mFillArea.width > state.mDestArea.width ||
> +      state.mFillArea.height > state.mDestArea.height) {
> +    return false;

Actually here you need to check that mDestArea.Contains(mFillArea).

::: layout/base/nsDisplayList.h
@@ +1633,5 @@
>    /* Used to cache mFrame->IsThemed() since it isn't a cheap call */
>    bool mIsThemed;
>    nsITheme::Transparency mThemeTransparency;
> +
> +  /* If this background can be a simple image layer, we store the format here. */

Better document that CanOptimizeToImageLayer sets these.
Comment on attachment 620147 [details] [diff] [review]
patch

Review of attachment 620147 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsDisplayList.cpp
@@ +1156,5 @@
> +  }
> +
> +  mDestArea = state.mDestArea;
> +  mAnchor = state.mAnchor;
> +  mImageContainer = state.mImageRenderer.GetContainer();

One thing you're not handling here is background-clip. You need to store mFillArea and then use it to set the cliprect on the ImageLayer.

Probably a good idea to store mDestArea/mFillArea as gfxRects in pixels, and scale them here. Use presContext->AppUnitsPerDevPixel().

mAnchor is a bit of a problem. With layers we can't do what mAnchor needs (we need to snap to ensure that mAnchor lies on a pixel corner). I think we should just bail out of this optimization if mAnchor isn't (0,0). (It usually is 0,0.) Then just get rid of mAnchor.

@@ +1175,5 @@
> +
> +  gfxSize imageSize = mImageContainer->GetCurrentSize();
> +  NS_ASSERTION(imageSize.width != 0 && imageSize.height != 0, "Invalid image size!");
> +
> +  gfxRect destRect = gfxRect(mDestArea.x, mDestArea.y, mDestArea.width, mDestArea.height);

Now you can just use mDestArea directly.

@@ +1220,5 @@
> +  nsPoint origin = mAnchor + mDestArea.TopLeft();
> +  transform.Translate(gfxPoint(origin.x, origin.y));
> +  transform.Scale(mDestArea.Width()/imageSize.width,
> +                  mDestArea.Height()/imageSize.height);
> +  aLayer->SetTransform(gfx3DMatrix::From2D(transform));

This is buggy right now, but should come right once mDestArea is a gfxRect in pixels.
Actually I just thought of another restriction. If mDestArea is not exactly equal to mFillArea, we can't do this optimization safely. The problem is that some Web content does spriting by using a single large image as the background for small elements with different background-positions. In these cases, mFillArea is much smaller than mDestArea, and when filling mFillArea with resampling scaling, pixels outside mFillArea are sampled. These pixels are from other sprites which leads to corrupt pixels being drawn around the edges.

Let's limit this to mDestArea == mFillArea for now (and only store one of them). Longer term we may need to extend ImageLayer with support for tiling and "sampling restriction" so we can handle those cases in the layer system.
These are the images and background images on cnn.com the two patches convert into layers:

background layer 990.000000 5.000000 to 1320.000000 6.666667
image layer 54 47 to 72.000000 62.666667
image layer 119 82 to 158.666667 109.333333
background layer 301.000000 21.000000 to 401.333333 28.000000
image layer 250 250 to 333.333333 333.333333
image layer 416 234 to 554.666667 312.000000
background layer 136.000000 42.000000 to 181.333333 56.000000
image layer 1 1 to 181.333333 56.000000
image layer 120 68 to 160.000000 90.666667
image layer 120 68 to 160.000000 90.666667
image layer 120 68 to 160.000000 90.666667
image layer 120 68 to 160.000000 90.666667
image layer 120 68 to 160.000000 90.666667
image layer 120 68 to 160.000000 90.666667
image layer 300 250 to 400.000000 333.333333
image layer 1 1 to 160.000000 90.666667
image layer 1 1 to 160.000000 90.666667

Zooming in I get this:

background layer 990.000000 5.000000 to 1980.000000 10.000000
image layer 119 82 to 238.000000 164.000000
background layer 301.000000 21.000000 to 602.000000 42.000000
image layer 250 250 to 500.000000 500.000000
image layer 416 234 to 832.000000 468.000000
background layer 136.000000 42.000000 to 272.000000 84.000000
image layer 1 1 to 272.000000 84.000000
image layer 300 250 to 600.000000 500.000000

No wonder that the mobile CPU **** out on this.
Fix applied as requested. Doesn't change the results for cnn.com
Attachment #620209 - Attachment is obsolete: true
Attachment #620206 - Flags: review?(roc)
Attachment #620211 - Flags: review?(roc)
This is my best shot at measuring the performance difference on real-world content. Both samples are for cnn.com for moving across a zoomed-in version of the cnn homepage. The measurement is very noisy, but CPU load clearly is down significantly.

Running (Self)          Symbol Name
1351.0ms   23.7%               argb32_sample_argb32
241.0ms    4.2%                         argb32_image_mark
111.0ms    1.9%                                 PR_GetThreadPrivate
92.0ms    1.6%                                          malloc_rtree_get
74.0ms    1.3%                                                  malloc_rtree_get_locked
66.0ms    1.1%                                                          argb32_mark_pixelshape
64.0ms    1.1%                                                                  memset_pattern4

Running (Self)                                                                          Symbol Name
937.0ms   17.4%                                                                                argb32_sample_argb32
163.0ms    3.0%                                                                                         argb32_image_mark
118.0ms    2.2%                                                                                                 PR_GetThreadPrivate
76.0ms    1.4%                                                                                                          PR_GetCurrentThread
75.0ms    1.3%                                                                                                            malloc_rtree_get
74.0ms    1.3%                                                                                                              void glgConvertTo_32<GLGC\
onverter_ABGR8_ARGB8, (GLGMemory)1>(GLGOperationRec const*, GLDPixelModeRec const*)
Comment #18 died the formatting death. 27% CPU time spent scaling images vs 20.4% CPU time spent scaling images with the patch (we aren't offloading everything to the GPU yet).

The pure paint time speedup is around 30%, but this number is _extremely_ noisy with my current lame non-deterministic setup, so don't rely on it too much.
Comment on attachment 620211 [details] [diff] [review]
Part 2: support turning simple background images into image layers

Review of attachment 620211 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsDisplayList.cpp
@@ +1117,5 @@
> +      bg->BottomLayer().mRepeat.mXRepeat == NS_STYLE_BG_REPEAT_REPEAT &&
> +      bg->BottomLayer().mRepeat.mYRepeat == NS_STYLE_BG_REPEAT_REPEAT &&
> +      bg->BottomLayer().mImage.IsOpaque()) {
> +    drawBackgroundColor = false;
> +  }

This should move into nsCSSRendering::DetermineBackgroundColor.

@@ +1153,5 @@
> +    return false;
> +  }
> +
> +  // Sub-pixel alignment is hard, lets punt on that.
> +  if (state.mAnchor.x != 0.0f || state.mAnchor.y != 0.0f) {

if (state.mAnchor != gfxPoint(0.0f, 0.0f))

@@ +1157,5 @@
> +  if (state.mAnchor.x != 0.0f || state.mAnchor.y != 0.0f) {
> +    return false;
> +  }
> +
> +  PRInt32 appUnitsPerDevPixel = mFrame->PresContext()->AppUnitsPerDevPixel();

Use 'presContext'

@@ +1230,5 @@
> +  nsIntRect rect = nsIntRect(PRInt32(mDestRect.X()),
> +                             PRInt32(mDestRect.Y()),
> +                             PRInt32(mDestRect.Width()),
> +                             PRInt32(mDestRect.Height()));
> +  aLayer->SetClipRect(&rect);

You don't need to set the clip rect now, since we already verified that mDestRect equals the image bounds.
Attachment #620211 - Attachment is obsolete: true
Attachment #620211 - Flags: review?(roc)
Attachment #620545 - Flags: review?(roc)
Target Milestone: --- → mozilla15
Depends on: 753282
No longer depends on: 753282
Depends on: 955712
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: