Closed
Bug 364221
Opened 18 years ago
Closed 17 years ago
[cairo] bad page scrolling performance with large background images
Categories
(Core :: Graphics, defect, P2)
Core
Graphics
Tracking
()
RESOLVED
FIXED
mozilla1.9beta2
People
(Reporter: phiw2, Assigned: vlad)
References
()
Details
(Keywords: perf, regression)
Attachments
(1 file, 2 obsolete files)
10.49 KB,
patch
|
pavlov
:
review+
tor
:
superreview+
|
Details | Diff | Splinter Review |
test case: https://bugzilla.mozilla.org/attachment.cgi?id=212976
(testcase from bug 328380, which was about the same issue on Windows)
Some numbers:
Minefield (cairo build 20061217): 31609ms
Minefield (non-cairo build 20061112): 11218ms
Camino (cairo 2006121502 1.2+): 30827ms
Camino (non-cairo build 1.2+): 8769ms
Firefox 2.0 (build 2006101022): 7613ms
Opera 9.10: 5563ms
Safari (Webkit build r18270): 10789ms
All tests performed on 10.4.8, ppc (Powerbook G4 1.5Ghz. 1.5Gb Ram)
testcase loaded from localhost.
Comment 1•18 years ago
|
||
Some more numbers:
Camino (Home-grown Intel-only, 20070102): 29666-29692ms.
WebKit (rev. 18516): 9879-9889ms.
Range of 5 page loads from localhost. MBP 2.0GHz CoreDuo, 2GB RAM.
Reporter | ||
Updated•18 years ago
|
Flags: blocking1.9?
Assignee | ||
Updated•18 years ago
|
Assignee: nobody → vladimir
Flags: blocking1.9? → blocking1.9+
Comment 2•17 years ago
|
||
Is this what's causing the extremly slow scrolling at the top of digg.com ?
Reporter | ||
Comment 3•17 years ago
|
||
(In reply to comment #2)
> Is this what's causing the extremly slow scrolling at the top of digg.com ?
>
Almost certainly.
That site use a 2400X600px background-image, just below the header
Assignee | ||
Updated•17 years ago
|
Target Milestone: --- → mozilla1.9beta1
Comment 4•17 years ago
|
||
same here i guess: www.homegate.ch
the site has a 2400px × 3000px background image
Reporter | ||
Comment 5•17 years ago
|
||
bug 383960 (Cairo upgrade) should improve performance massively - the 20070610-18 Minefield build (when Cairo 1.4.8 was first checked in) is more than twice as fast on the testcase.
from an average of 31000ms to an average of 13500ms here.
Assignee | ||
Updated•17 years ago
|
Priority: -- → P2
Reporter | ||
Comment 6•17 years ago
|
||
After the landing of bug 383960 - using 10.4.10, ppc:
Minefield (build 2007080222): 17787ms
Camino (build 2007080217): 18499 ms
Average from 5 runs of the testcase in comment 0.
---
The improvement is slightly less significant than what I had mentioned earlier in comment 5.
The one difference to note is that comment 5 was done with builds made with GCC3.3/OSX10.3SDK, whereas the latest test is done with GCC4/OSX10.4uSDK (the default now for PPC builds).
I suspect there is something in the OSX10.4uSDK that makes Cairo perform slower on PowerPC Macs under that build configuration.
Updated•17 years ago
|
Target Milestone: mozilla1.9 M8 → ---
Assignee | ||
Comment 7•17 years ago
|
||
Some notes:
In fx2, this falls into argb32_image/argb32_image_mark_RGB32 -- in fx3, we end up in argb32_image/sseCGSBlendXXXX8888. I can't figure out how to get us to take the faster argb32_image_mark_RGB32 path, but I'd guess that it has something to do with fx2 using QD contexts and the like for drawing everywhere; perhaps the backing buffer for the window is RGB24 and not RGB32 like it would be under Cocoa.
However, when the testcase is run without chrome, in an 800x900 window (to avoid tiling), I get 5900ms in fx2, and 6800ms in minefield. With chrome, again in an 800x900 window, I get 6400ms in fx2, and *9400*ms in minefield. That's a huge hit from having the chrome enabled in minefield; 5900->6800ms is explainable with widget changes, and possibly due to using the (potentially more expensive) SSE blend function instead of mark_RGB32, but 6400->9400 is ridiculous. I have no idea where those 3000ms are going.
Comment 8•17 years ago
|
||
The gif image in the test case has a 'tpixel' set (index=255), but the image itself doesn't have transparancy. So nsGIFDecoder2.cpp will set format to RGB_A1 because of the tpixel, which causes gfxImageFrame/nsThebesImage to uses ImageFormatARGB32 for this image.
So, fixing the gif decoder to detect for such cases (a defined but not used tpixel) may help some.
A quick (and ugly) test showed improvement from ±10 seconds to ±4 seconds!!
The patch for the GIF image decoder is simple, but just doing SetHasNoAlpha() after the image has been created is not enough, as the format change is not passed to the Cairo level, so Cairo will still assume ARGB32...
So, we really need to a good way change the format from ARGB32 to RGB24 of an existing gfxImageSurface, including telling Cairo that the format has changed.
For this cairo-image-surface.c needs to provide a function for this effect.
And note, this also means that SetHasNoAlpha also doesn't have the desired effect for other images (PNG's particulary).
Comment 9•17 years ago
|
||
This is only the detection of the no alpha case in GIF images.
We still need to make SetHasNoAlpha more effective (and rename to more sensible?)
Assignee | ||
Comment 10•17 years ago
|
||
Hmm, you're right -- I must've broken things when I "cleaned up" the patch, ugh. Ok, I'll fix things up and get it working correctly again.
Assignee | ||
Comment 11•17 years ago
|
||
Okay, here's a fix. It adds the missing transparency checks in the GIF decoder, and makes the right things happen on Optimize when SetHasNoAlpha is called.
Attachment #287725 -
Flags: review?(pavlov)
Comment 12•17 years ago
|
||
Some more digging results: For windows, the Optimize() in nsThebesImage only
works for images < 1024*1024, and the image in the test case is 1280x960.
So, that image is not optimized, even if SetHasNoAlpa is set. So, even with
this patch, scrolling with large (>1000x1000) background images will still be
slow, but for other platforms it is probably faster.
As far as I can see in bug 355548 the limit of 4*1024*1024 per image has been
selected rather empirical, so bigger values could be acceptable?
Reporter | ||
Comment 13•17 years ago
|
||
(In reply to comment #11)
> Created an attachment (id=287725) [details]
> potential fix
>
I tried to apply this patch (OS X, PowerPC), but failed:
> patching file modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp
> Hunk #2 succeeded at 393 (offset 9 lines).
> Hunk #3 FAILED at 454.
> 1 out of 3 hunks FAILED -- saving rejects to file modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp.rej
---
A related slowdown when running the testcase here:
hiding the scrollbar (by moving the window partly offscreen) improves the speed quite a bit: on my (aging...) PowerBook G4/OS X 10.4.10, that makes for a 3 seconds difference, going from ~17,000ms to ~14,000ms (with scrollbar offscreen).
(there is bug 398137 on similar issues)
Assignee | ||
Comment 14•17 years ago
|
||
Yeah, patch broke after the 8-bit GIF decoder stuff landed.
Comment 15•17 years ago
|
||
With this patch the result is 5.2sec in Windows as opposed to 10s (on my Thinkpad T41).
Some notes:
1. The 'mSawTransparency' is only needed for the first frame, as that frame is the only frame that is optimized, and the following frames are handled differently anyway.
2. The mSawTransparency detection loop can just check for pixel values equal to zero, as all transparent pixels in GIF frames are always set to 0. And countdown is faster and countup in the loop, as the loop then doesn't need to check for a class member variable for the end check.
Attachment #287659 -
Attachment is obsolete: true
Attachment #287725 -
Attachment is obsolete: true
Attachment #287810 -
Flags: review?(pavlov)
Attachment #287725 -
Flags: review?(pavlov)
Comment 16•17 years ago
|
||
With current MicroB engine (based on GTK2 backend) on N800 this testcase URL takes about 6344ms...
On Trunk mozilla based MicroB on N800 it takes ~44816ms....
Reporter | ||
Comment 17•17 years ago
|
||
(In reply to comment #15)
> Created an attachment (id=287810) [details]
> V2: Fix bitrot
Hmm. That didn't have the desired or expected results on my PowerBook (OS X 10.4.10 PowerPC). A patched build was actually *slower* than an unpatched build, going from an average of 17000ms to an average of 20000ms, over 5 runs.
Both builds are otherwise identical.
Comment 18•17 years ago
|
||
Oh, after fixing testcase s/200/300/ it shows on N800 60641ms(Cairo) vs 6433(Gtk)...
For me too it looks like with patch it little bit slower...
Assignee | ||
Comment 19•17 years ago
|
||
If it's a win on x86 we'll probably still take it, we can always comment out the format-change optimization on PPC or others (just the but in nsThebesImage::SetHasNoAlpha). I'm guessing that there might be something in there with byte order, depending on how 10.4 stores images in memory on PPC; for the maemo issues, we'll have to look at that, but I guess there are/used to be known issues for doing RGB24->ARGB32 operations due to a bug in X and/or pixman.
Comment 20•17 years ago
|
||
ups, previous measurement was wrong...
N800
16340ms(Cairo) vs 6433(Gtk)...
Assignee | ||
Comment 21•17 years ago
|
||
so it went from 44816ms (without patch) to 16340ms (with patch)?
Comment 22•17 years ago
|
||
No, I was experimenting with xlib/image surfaces and did some broken build which was working very slow and shows 44816ms....
it went from 44816ms to 16340ms after removing that experimental hacks, sorry...
Comment 23•17 years ago
|
||
Romaxa,
Can you provide timings before/after patch on the current trunk, with out any other changes? I am interesting whether this patch by itself does improve anything.
Updated•17 years ago
|
Attachment #287810 -
Flags: review?(pavlov) → review+
Reporter | ||
Comment 24•17 years ago
|
||
FWIW, this patch maybe is no as bad as I thought based on my first tests above in PowerPC Macs.
I made two identical Camino trunk builds (Checkout 20071112~15:30PDT) and ran the test above with
G4 - PowerBook 1.5Ghz, 1.5Gb ram
G5 - iMac 1.9Ghz, 1.5Gb ram
nopatch patch
G5 ~ 13,790 ~ 13,503.6
G4 ~ 17,112.2 ~ 17,096.6
average over 5 runs, empty cache, default profile
Updated•17 years ago
|
Attachment #287810 -
Flags: superreview?(tor)
Comment 25•17 years ago
|
||
So, for Windows this is a big improvement (from 10 to 5.2 seconds).
For Mac, this is not much an improvement (but, not worse either).
For the N800, we need more tests as it only compares FF2 with FF3+patch, and not the FF3-nopatch versus FF3+patch...
So, what can people report from the Linux front?
Attachment #287810 -
Flags: superreview?(tor) → superreview+
Comment 27•17 years ago
|
||
vlad has commit access, so he can take care of this himself. If he wants somebody else to land it for him, he can personally add the checkin-needed keyword.
Keywords: checkin-needed
Comment 28•17 years ago
|
||
Oh, I thought vlad wrote the final patch. Ignore my last comment. Sure, I'll check this in for you.
Comment 29•17 years ago
|
||
Reassigning to Alfred since two of the three patches are his (majority rules!). Feel free to correct this.
Landing the patch now.
Assignee: vladimir → alfredkayser
OS: Mac OS X → All
Hardware: Macintosh → All
Target Milestone: --- → mozilla1.9 M10
Comment 30•17 years ago
|
||
Checking in gfx/src/thebes/nsThebesImage.cpp;
/cvsroot/mozilla/gfx/src/thebes/nsThebesImage.cpp,v <-- nsThebesImage.cpp
new revision: 1.52; previous revision: 1.51
done
Checking in gfx/src/thebes/nsThebesImage.h;
/cvsroot/mozilla/gfx/src/thebes/nsThebesImage.h,v <-- nsThebesImage.h
new revision: 1.17; previous revision: 1.16
done
Checking in gfx/thebes/public/gfxPlatform.h;
/cvsroot/mozilla/gfx/thebes/public/gfxPlatform.h,v <-- gfxPlatform.h
new revision: 1.26; previous revision: 1.25
done
Checking in gfx/thebes/src/gfxPlatform.cpp;
/cvsroot/mozilla/gfx/thebes/src/gfxPlatform.cpp,v <-- gfxPlatform.cpp
new revision: 1.28; previous revision: 1.27
done
Checking in modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp;
/cvsroot/mozilla/modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp,v <-- nsGIFDecoder2.cpp
new revision: 1.87; previous revision: 1.86
done
Checking in modules/libpr0n/decoders/gif/nsGIFDecoder2.h;
/cvsroot/mozilla/modules/libpr0n/decoders/gif/nsGIFDecoder2.h,v <-- nsGIFDecoder2.h
new revision: 1.32; previous revision: 1.31
done
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 31•17 years ago
|
||
I wouldn't be so fast in closing this bug. I originally opened it for Mac OS X issues.
Apparently the patch fixes issues on Windows, maybe on Intel Mac (can't test those), but definitely *not* on Mac PowerPC.
A post patch Minefield build just runs as slow as ever on my side (see also my comments above (comment 17, comment 24).
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O 10.4; en-US; rv:1.9b2pre) Gecko/2007112702 Minefield/3.0b2pre
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 32•17 years ago
|
||
At least the patch above makes it so that non-transparent images are really detected as such.
This needs further investigation whether the following line does result in the desired effect (that the most optimal imageSurface is created for the image):
http://lxr.mozilla.org/seamonkey/source/gfx/src/thebes/nsThebesImage.cpp#302:
302 mOptSurface = gfxPlatform::GetPlatform()->OptimizeImage(mImageSurface, mFormat);
I don't have access to a Mac, so won't able to able with Mac specific issues here.
Assignee: alfredkayser → nobody
Status: REOPENED → NEW
Assignee | ||
Comment 33•17 years ago
|
||
I'm going to close this bug as the bulk of the work here is fixed -- please file a new bug for Mac PPC issues, however.
Reporter | ||
Comment 34•17 years ago
|
||
(In reply to comment #33)
> I'm going to close this bug as the bulk of the work here is fixed -- please
> file a new bug for Mac PPC issues, however.
>
(sigh). OK, I filed bug 405894 for the Mac PowerPC issues.
Please transfer blocking flags, thanks.
I'll mark this one as fixed then.
Status: NEW → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
Comment 35•17 years ago
|
||
On intel mac:
Build 2007112904: 18783ms
20010: 5754ms
So showing a little over 3x worse for Trunk vs branch - re-opening bug...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Comment 36•17 years ago
|
||
(In reply to comment #35)
> On intel mac:
>
> Build 2007112904: 18783ms
> 20010: 5754ms
Mike,
Your Intel Mac is even slower than my Powerbook G4, ppc ??
For reference, I repost the latest numbers I have (posted in bug 405894#c0)
> Gecko 1.9 Gecko 1.8 Safari 3.04
> 17220 7417 10089
> 17280 7347 10003
> 17070 7475 9906
> 17056 7311 9907
> 17236 7533 9895
> ----- ----- -----
> 17,172.4 7,424.6 9,960 (average over 5 runs)
>
> Gecko 1.9: 2007112816 Minefield/3.0b2pre
> Gecko 1.8: 20071128 BonEcho/2.0.0.11pre
Comment 37•17 years ago
|
||
(In reply to comment #36)
> (In reply to comment #35)
> > On intel mac:
> >
> > Build 2007112904: 18783ms
> > 20010: 5754ms
>
> Mike,
> Your Intel Mac is even slower than my Powerbook G4, ppc ??
>
Looks that way - is an Intel Mac Mini.
Vlad/Pav you want this in this bug or in bug 405894
Assignee: nobody → vladimir
Status: REOPENED → NEW
Comment 38•17 years ago
|
||
(In reply to comment #37)
> (In reply to comment #36)
> > (In reply to comment #35)
> > > On intel mac:
> > >
> > > Build 2007112904: 18783ms
> > > 20010: 5754ms
> >
> > Mike,
> > Your Intel Mac is even slower than my Powerbook G4, ppc ??
> >
>
> Looks that way - is an Intel Mac Mini.
>
> Vlad/Pav you want this in this bug or in bug 405894
>
Vlad swung by - looks like I had the window size large enough to cause the image to tile. If I re-size the window my 1.8/1.9 numbers are much much closer.
Assignee | ||
Comment 39•17 years ago
|
||
I have a patch that uses a 10.5-only API if it's available to improve the tiled case significantly as well; gonna get it into upstream cairo and try to get it into our code before b2.
Assignee | ||
Comment 40•17 years ago
|
||
Pretty sure this is fixed one way or another at this point, especially on the Mac.
Status: NEW → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•