Speed up image drawing/decoding

VERIFIED FIXED

Status

()

Core
Graphics
VERIFIED FIXED
12 years ago
11 years ago

People

(Reporter: Stuart Parmenter, Assigned: Stuart Parmenter)

Tracking

({perf})

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Assignee)

Description

12 years ago
Now that we can really make use of 32bpp bitmaps, we should decode to 32bit ARGB we should do so.  I'm seeing a huge hot spot in the code we currently use to combibne RGB and A1/A8 in to ARGB so fixing this should give us a big perf win.  Doing this also cleans up lots of code in the image decoders and will eventually lead to a single path for decoding in each decoder instead of the mix of mac/windows/bgr/rgb/toptobottom/bottomtotop/endianness/etc mess that is there now.
(Assignee)

Comment 1

12 years ago
Created attachment 215852 [details] [diff] [review]
first pass

this is just a checkpoint along the way.  still need to fix animated gifs, bmps, xbms and platform icons.

this patch also ends up holding on to 2 copies of the image when it optimizes.  need to fix that too.

also need to remove the aptr stuff in the rgb_1 case in the png decoder since it is unused
Assignee: nobody → pavlov
Status: NEW → ASSIGNED
(Assignee)

Comment 2

12 years ago
Created attachment 216084 [details] [diff] [review]
second pass

if anyone wants to do bmp, ico, xbm, i wouldn't be too upset.  I think I got most of the gif stuff, but possibily missed a couple spots.
Attachment #215852 - Attachment is obsolete: true
(Assignee)

Comment 3

12 years ago
Created attachment 216170 [details] [diff] [review]
third pass (checked in)

I think this solves everything short of transparency in ICOs.  I'll fix that as a seperate bug.
Attachment #216084 - Attachment is obsolete: true
Attachment #216170 - Flags: review?(vladimir)
Comment on attachment 216170 [details] [diff] [review]
third pass (checked in)

>--- gfx/src/thebes/nsThebesImage.cpp	8 Mar 2006 00:28:19 -0000	1.7
>+++ gfx/src/thebes/nsThebesImage.cpp	25 Mar 2006 00:02:49 -0000

> nsresult
> nsThebesImage::Init(PRInt32 aWidth, PRInt32 aHeight, PRInt32 aDepth, nsMaskRequirements aMaskRequirements)
> {

>-    return NS_OK;
>-}
>+    mImageSurface = new gfxImageSurface(format, mWidth, mHeight);
>+    memset(mImageSurface->Data(), 0xF, mHeight * mImageSurface->Stride());

0xFF, not 0xF.  Actually, make it 0x00, I dunno what I was thinking.  Why do we always create the image


>Index: gfx/thebes/src/gfxWindowsFonts.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/gfx/thebes/src/gfxWindowsFonts.cpp,v
>retrieving revision 1.29
>diff -u -6 -p -r1.29 gfxWindowsFonts.cpp
>--- gfx/thebes/src/gfxWindowsFonts.cpp	14 Mar 2006 23:17:55 -0000	1.29
>+++ gfx/thebes/src/gfxWindowsFonts.cpp	25 Mar 2006 00:02:49 -0000

This stuff shouldn't be part of this patch


>--- modules/libpr0n/decoders/gif/imgContainerGIF.cpp	21 Feb 2006 23:19:19 -0000	1.23
>+++ modules/libpr0n/decoders/gif/imgContainerGIF.cpp	25 Mar 2006 00:02:50 -0000
>@@ -45,12 +45,16 @@
> //******************************************************************************
> // Fill aFrame with black. Does not change the mask.
> void imgContainerGIF::BlackenFrame(gfxIImageFrame *aFrame)
> {
>   if (!aFrame)
>     return;
>-
>+#ifdef MOZ_CAIRO_GFX
>+  nsCOMPtr<nsIImage> img(do_GetInterface(aFrame));
>+  if (!img)
>+    return;
>+  PRInt32 w, h;
>+  aFrame->GetWidth(&w);
>+  aFrame->GetHeight(&h);
>+
>+  nsRefPtr<gfxASurface> surf;
>+  img->GetSurface(getter_AddRefs(surf));
>+  nsRefPtr<gfxContext> ctx = new gfxContext(surf);
>+  ctx->SetColor(gfxRGBA(0.0, 0.0, 0.0, 1.0));
>+  ctx->Rectangle(gfxRect(0, 0, w, h));
>+  ctx->Fill();
>+#else

No need for the w/h business; just ctx->SetColor(gfxRGB(0,0,0)); ctx->SetOperator(SOURCE); ctx->Paint();


r= me with those things fixed.. I /may/ be too tired to review the patch, but I think it looks ok.
Attachment #216170 - Flags: review?(vladimir) → review+
(Assignee)

Comment 5

12 years ago
i always create the image because we 99.99% of the time will write stuff to it almost immediately.  fixed the other stuff
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED

Comment 6

12 years ago
Additional speedup is available:  Change line 548 of
nsPNGDecoder.cpp (after applying "third pass" patch) from
      } else {
to
      } else if (a != 255) {

and line 562 from
      }
to
      } else {
        cptr += 4;
      }

This will avoid calculations in the common case of fully opaque pixels.



Comment on attachment 216170 [details] [diff] [review]
third pass (checked in)

+      const PRUint8 a = (format == gfxIFormat::RGB_A1) ? abpr[i>>3] : abpr[i];

why is this right in the A1 case? don't you need to find the right bit and use 0/255?

XBM:
+                *ar++ = (val << 24) | 0;

shouldn't this be:
  *ar++ |= (val << 24);
?

Comment 8

12 years ago
                const PRUint8 val = ((pixel & (1 << i)) >> i) ? 255 : 0;
                *ar++ = (val << 24) | 0;

We don't even need "val".  Replace the two statements with

                *ar++ |= ((pixel & (1 << i)) >> i) ? 0xFF000000L : 0;
(Assignee)

Updated

12 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 9

12 years ago
this didn't really give me the speed up i was looking for on Tp.  not sure why.  Lets toss some more performance gains at the problem...
Status: REOPENED → ASSIGNED
(Assignee)

Comment 10

12 years ago
Created attachment 216289 [details] [diff] [review]
write directly (checked in)

this patch builds on top of the previous stuff checked in.  it causes the PNG and JPEG decoders to write directly to the buffer provided from the gfxImageSurface.  This way we don't have to allocate temp row buffers to write to.  I've also changed the JPEG, PNG and GIF decoders to write out as 32bit values using shifts so that a) I can get rid of the endian ifdefs and b) hopefully make the writes a bit faster.
Attachment #216289 - Flags: review?(vladimir)
(Assignee)

Updated

12 years ago
Attachment #216170 - Attachment description: third pass → third pass (checked in)
Comment on attachment 216289 [details] [diff] [review]
write directly (checked in)

>+
>+        PRUint32 colorIndex = *rowBufIndex * 3;
>+        *rgbRowIndex++ = (0xFF << 24) |
>+          (cmap[colorIndex] << 16) |
>+          (cmap[colorIndex+1] << 8) |
>+          (cmap[colorIndex+2]);

Make all this stuff use a macro for readability at some point.



>+  // Note! row_stride here must match the row_stride in
>+  // nsJPEGDecoder::WriteFrom
>+#if defined(MOZ_CAIRO_GFX) || defined(XP_MAC) || defined(XP_MACOSX)
>+  const int row_stride = mInfo.output_width << 2; // * 4
>+  // we're thebes. we can write stuff directly to the data

We're not always Thebes, since XP_MAC; change that comment to be "we're writing XRGB data out" or something?

>+  PRUint8 *imageData;
>+  PRUint32 imageDataLength;
>+  mFrame->GetImageData(&imageData, &imageDataLength);
>+  nsCOMPtr<nsIImage> img(do_GetInterface(mFrame));
>+  nsIntRect r(0, 0, mInfo.output_width, 1);
>+#else
>+  const int row_stride = mInfo.output_width * 3;
>+#endif




>+        PRUint8 r = *j1++;
>+        PRUint8 g = *j1++;
>+        PRUint8 b = *j1++;
>+        *ptrOutputBuf++ = (0xFF << 24) | (r << 16) | (g << 8) | b;

I'm pretty sure the compiler will do the right thing here and get rid of the temporaries, but you can get rid of them with a macro; ((*j1++) << 16) | ((*j1++) << 8) ... woudl be too ugly, I think.

This looks fine, but I don't think you'll see any speedup from it -- we had way more copies before and things didn't get faster.  I think there's some format strangeness...
Attachment #216289 - Flags: review?(vladimir) → review+
Comment on attachment 216289 [details] [diff] [review]
write directly (checked in)

why do you GetImageData also for non-cairo mac, in jpeg?
( nsJPEGDecoder::OutputScanlines())

couldn't you use NS_RGBA instead of (0xFF << 24) | (r << 16) | (g << 8) | b;?

Why do you call ImageUpdated in the jpeg decoder but not the png one?
Attachment #216289 - Flags: review+ → review?(vladimir)
> couldn't you use NS_RGBA instead of (0xFF << 24) | (r << 16) | (g << 8) | b;?

ah, no, you can't, unfortunately...

Comment 14

12 years ago
In the GIF decoder one can also directly write to the thebes image buffer (instead of allocating RGBRow...)
(Assignee)

Updated

12 years ago
Attachment #216289 - Attachment description: write directly → write directly (checked in)
Attachment #216289 - Flags: review?(vladimir) → review+
(Assignee)

Comment 15

12 years ago
Alfred Kayser: yeah, there is still a lot of optimizations that can be had.  32bit reads from png data, etc.   I'm planning on leaving this bug open until we clean it up more.

gif is in a bad place right now because of the way it deals with previous frames and compositing.  i'll post more in 324707 soon.

Comment 16

12 years ago
*** Bug 280529 has been marked as a duplicate of this bug. ***

Updated

12 years ago
Depends on: 332611
(In reply to comment #7)
> shouldn't this be:
>   *ar++ |= (val << 24);

OK, thinking some more about this, it shouldn't. but the | 0 does look odd.
No longer depends on: 332611

Updated

12 years ago
Depends on: 332611
Could this have caused bug 332713?

Updated

12 years ago
Depends on: 332713
and Bug 332386 ?
Depends on: 332386
Depends on: 333135
Blocks: 334719
(Assignee)

Updated

12 years ago
OS: Windows XP → All
Hardware: PC → All

Comment 20

11 years ago
Add bug 143046 to the dependency list, as the patch for that one (using 8bit storage for animation frames of GIF's) provides a huge memory saving and a performance speed up of 100% (i.e. one specific large animation now takes < 10% cpu, instead of > 20% cpu load).
Depends on: 143046

Comment 21

11 years ago
Bug 360000 is about addressing the ICO and BMP transparancy decoding in Cairo
Depends on: 360000

Updated

11 years ago
Depends on: 366465

Comment 22

11 years ago
Note bug 324707 mentioned in comment #15 is fixed (by bug 336532)

Opened bug 366465 for further optimization of buffer usage in GIF decoding in Cairo.

Comment 23

11 years ago
Why not add a 'NS_COLOR_ARGB' for Cairo pixels or something like that?
Also alpha premultiplication is needed in multiple locations.
(See bug 360000 for an example of a SetPixel(r,g,b,a) doing the premultiplication

Comment 24

11 years ago
Bug 366727 adds a NS_CAIRO_PIXEL macro to nsColor.h
Depends on: 366727

Updated

11 years ago
Depends on: 361299

Updated

11 years ago
Depends on: 334144

Updated

11 years ago
Depends on: 370942
(Assignee)

Comment 25

11 years ago
this is fixed enough.
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago11 years ago
Resolution: --- → FIXED

Updated

11 years ago
Status: RESOLVED → VERIFIED
Depends on: 242145
You need to log in before you can comment on or make changes to this bug.