Closed Bug 331298 Opened 18 years ago Closed 17 years ago

Speed up image drawing/decoding

Categories

(Core :: Graphics, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: pavlov, Assigned: pavlov)

References

Details

(Keywords: perf)

Attachments

(2 files, 2 obsolete files)

Now that we can really make use of 32bpp bitmaps, we should decode to 32bit ARGB we should do so.  I'm seeing a huge hot spot in the code we currently use to combibne RGB and A1/A8 in to ARGB so fixing this should give us a big perf win.  Doing this also cleans up lots of code in the image decoders and will eventually lead to a single path for decoding in each decoder instead of the mix of mac/windows/bgr/rgb/toptobottom/bottomtotop/endianness/etc mess that is there now.
Attached patch first pass (obsolete) — Splinter Review
this is just a checkpoint along the way.  still need to fix animated gifs, bmps, xbms and platform icons.

this patch also ends up holding on to 2 copies of the image when it optimizes.  need to fix that too.

also need to remove the aptr stuff in the rgb_1 case in the png decoder since it is unused
Assignee: nobody → pavlov
Status: NEW → ASSIGNED
Attached patch second pass (obsolete) — Splinter Review
if anyone wants to do bmp, ico, xbm, i wouldn't be too upset.  I think I got most of the gif stuff, but possibily missed a couple spots.
Attachment #215852 - Attachment is obsolete: true
I think this solves everything short of transparency in ICOs.  I'll fix that as a seperate bug.
Attachment #216084 - Attachment is obsolete: true
Attachment #216170 - Flags: review?(vladimir)
Comment on attachment 216170 [details] [diff] [review]
third pass (checked in)

>--- gfx/src/thebes/nsThebesImage.cpp	8 Mar 2006 00:28:19 -0000	1.7
>+++ gfx/src/thebes/nsThebesImage.cpp	25 Mar 2006 00:02:49 -0000

> nsresult
> nsThebesImage::Init(PRInt32 aWidth, PRInt32 aHeight, PRInt32 aDepth, nsMaskRequirements aMaskRequirements)
> {

>-    return NS_OK;
>-}
>+    mImageSurface = new gfxImageSurface(format, mWidth, mHeight);
>+    memset(mImageSurface->Data(), 0xF, mHeight * mImageSurface->Stride());

0xFF, not 0xF.  Actually, make it 0x00, I dunno what I was thinking.  Why do we always create the image


>Index: gfx/thebes/src/gfxWindowsFonts.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/gfx/thebes/src/gfxWindowsFonts.cpp,v
>retrieving revision 1.29
>diff -u -6 -p -r1.29 gfxWindowsFonts.cpp
>--- gfx/thebes/src/gfxWindowsFonts.cpp	14 Mar 2006 23:17:55 -0000	1.29
>+++ gfx/thebes/src/gfxWindowsFonts.cpp	25 Mar 2006 00:02:49 -0000

This stuff shouldn't be part of this patch


>--- modules/libpr0n/decoders/gif/imgContainerGIF.cpp	21 Feb 2006 23:19:19 -0000	1.23
>+++ modules/libpr0n/decoders/gif/imgContainerGIF.cpp	25 Mar 2006 00:02:50 -0000
>@@ -45,12 +45,16 @@
> //******************************************************************************
> // Fill aFrame with black. Does not change the mask.
> void imgContainerGIF::BlackenFrame(gfxIImageFrame *aFrame)
> {
>   if (!aFrame)
>     return;
>-
>+#ifdef MOZ_CAIRO_GFX
>+  nsCOMPtr<nsIImage> img(do_GetInterface(aFrame));
>+  if (!img)
>+    return;
>+  PRInt32 w, h;
>+  aFrame->GetWidth(&w);
>+  aFrame->GetHeight(&h);
>+
>+  nsRefPtr<gfxASurface> surf;
>+  img->GetSurface(getter_AddRefs(surf));
>+  nsRefPtr<gfxContext> ctx = new gfxContext(surf);
>+  ctx->SetColor(gfxRGBA(0.0, 0.0, 0.0, 1.0));
>+  ctx->Rectangle(gfxRect(0, 0, w, h));
>+  ctx->Fill();
>+#else

No need for the w/h business; just ctx->SetColor(gfxRGB(0,0,0)); ctx->SetOperator(SOURCE); ctx->Paint();


r= me with those things fixed.. I /may/ be too tired to review the patch, but I think it looks ok.
Attachment #216170 - Flags: review?(vladimir) → review+
i always create the image because we 99.99% of the time will write stuff to it almost immediately.  fixed the other stuff
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Additional speedup is available:  Change line 548 of
nsPNGDecoder.cpp (after applying "third pass" patch) from
      } else {
to
      } else if (a != 255) {

and line 562 from
      }
to
      } else {
        cptr += 4;
      }

This will avoid calculations in the common case of fully opaque pixels.



Comment on attachment 216170 [details] [diff] [review]
third pass (checked in)

+      const PRUint8 a = (format == gfxIFormat::RGB_A1) ? abpr[i>>3] : abpr[i];

why is this right in the A1 case? don't you need to find the right bit and use 0/255?

XBM:
+                *ar++ = (val << 24) | 0;

shouldn't this be:
  *ar++ |= (val << 24);
?
                const PRUint8 val = ((pixel & (1 << i)) >> i) ? 255 : 0;
                *ar++ = (val << 24) | 0;

We don't even need "val".  Replace the two statements with

                *ar++ |= ((pixel & (1 << i)) >> i) ? 0xFF000000L : 0;
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
this didn't really give me the speed up i was looking for on Tp.  not sure why.  Lets toss some more performance gains at the problem...
Status: REOPENED → ASSIGNED
this patch builds on top of the previous stuff checked in.  it causes the PNG and JPEG decoders to write directly to the buffer provided from the gfxImageSurface.  This way we don't have to allocate temp row buffers to write to.  I've also changed the JPEG, PNG and GIF decoders to write out as 32bit values using shifts so that a) I can get rid of the endian ifdefs and b) hopefully make the writes a bit faster.
Attachment #216289 - Flags: review?(vladimir)
Attachment #216170 - Attachment description: third pass → third pass (checked in)
Comment on attachment 216289 [details] [diff] [review]
write directly (checked in)

>+
>+        PRUint32 colorIndex = *rowBufIndex * 3;
>+        *rgbRowIndex++ = (0xFF << 24) |
>+          (cmap[colorIndex] << 16) |
>+          (cmap[colorIndex+1] << 8) |
>+          (cmap[colorIndex+2]);

Make all this stuff use a macro for readability at some point.



>+  // Note! row_stride here must match the row_stride in
>+  // nsJPEGDecoder::WriteFrom
>+#if defined(MOZ_CAIRO_GFX) || defined(XP_MAC) || defined(XP_MACOSX)
>+  const int row_stride = mInfo.output_width << 2; // * 4
>+  // we're thebes. we can write stuff directly to the data

We're not always Thebes, since XP_MAC; change that comment to be "we're writing XRGB data out" or something?

>+  PRUint8 *imageData;
>+  PRUint32 imageDataLength;
>+  mFrame->GetImageData(&imageData, &imageDataLength);
>+  nsCOMPtr<nsIImage> img(do_GetInterface(mFrame));
>+  nsIntRect r(0, 0, mInfo.output_width, 1);
>+#else
>+  const int row_stride = mInfo.output_width * 3;
>+#endif




>+        PRUint8 r = *j1++;
>+        PRUint8 g = *j1++;
>+        PRUint8 b = *j1++;
>+        *ptrOutputBuf++ = (0xFF << 24) | (r << 16) | (g << 8) | b;

I'm pretty sure the compiler will do the right thing here and get rid of the temporaries, but you can get rid of them with a macro; ((*j1++) << 16) | ((*j1++) << 8) ... woudl be too ugly, I think.

This looks fine, but I don't think you'll see any speedup from it -- we had way more copies before and things didn't get faster.  I think there's some format strangeness...
Attachment #216289 - Flags: review?(vladimir) → review+
Comment on attachment 216289 [details] [diff] [review]
write directly (checked in)

why do you GetImageData also for non-cairo mac, in jpeg?
( nsJPEGDecoder::OutputScanlines())

couldn't you use NS_RGBA instead of (0xFF << 24) | (r << 16) | (g << 8) | b;?

Why do you call ImageUpdated in the jpeg decoder but not the png one?
Attachment #216289 - Flags: review+ → review?(vladimir)
> couldn't you use NS_RGBA instead of (0xFF << 24) | (r << 16) | (g << 8) | b;?

ah, no, you can't, unfortunately...
In the GIF decoder one can also directly write to the thebes image buffer (instead of allocating RGBRow...)
Attachment #216289 - Attachment description: write directly → write directly (checked in)
Attachment #216289 - Flags: review?(vladimir) → review+
Alfred Kayser: yeah, there is still a lot of optimizations that can be had.  32bit reads from png data, etc.   I'm planning on leaving this bug open until we clean it up more.

gif is in a bad place right now because of the way it deals with previous frames and compositing.  i'll post more in 324707 soon.
*** Bug 280529 has been marked as a duplicate of this bug. ***
Depends on: 332611
(In reply to comment #7)
> shouldn't this be:
>   *ar++ |= (val << 24);

OK, thinking some more about this, it shouldn't. but the | 0 does look odd.
No longer depends on: 332611
Depends on: 332611
Could this have caused bug 332713?
Depends on: 332713
Blocks: 334719
OS: Windows XP → All
Hardware: PC → All
Add bug 143046 to the dependency list, as the patch for that one (using 8bit storage for animation frames of GIF's) provides a huge memory saving and a performance speed up of 100% (i.e. one specific large animation now takes < 10% cpu, instead of > 20% cpu load).
Depends on: slowGIF
Bug 360000 is about addressing the ICO and BMP transparancy decoding in Cairo
Depends on: 360000
Depends on: 366465
Note bug 324707 mentioned in comment #15 is fixed (by bug 336532)

Opened bug 366465 for further optimization of buffer usage in GIF decoding in Cairo.
Why not add a 'NS_COLOR_ARGB' for Cairo pixels or something like that?
Also alpha premultiplication is needed in multiple locations.
(See bug 360000 for an example of a SetPixel(r,g,b,a) doing the premultiplication
Bug 366727 adds a NS_CAIRO_PIXEL macro to nsColor.h
Depends on: 366727
Depends on: 361299
Depends on: 334144
Depends on: 370942
this is fixed enough.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago17 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Depends on: 242145
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: