Cairo image scaling/compositing is significantly slower on Linux.
For example, when a page contains a lot of scaled images, or when Mozilla is started on an X server with a high DPI (this can be checked with the layout.css.dpi option).
On the same laptop (T43, 1.8 GHz, 1024 MB RAM, ATI X300):
Open the URL and scroll quickly many times (30-40) with the mouse wheel...
Linux Debian SID, fglrx: you can make some coffee while it finishes scrolling.
Windows XP: scrolling is very fast, without any delays...
On the same PC under Linux, Opera runs at the same speed as SM2.0 or FF3 on Windows (maybe a bit faster...).
Could this be a driver problem? I'm using the proprietary NVidia driver and the URL works fine for me.
email@example.com: from our perspective, it's a cairo problem. the video system we're using doesn't have a fancy driver at all. which means cairo falls back to something else. the fallback is terrible.
the t43 should have a reasonable graphics driver.
now, it could be a bug in how we use cairo, or it could be a bug in the cairo implementation. but it is not acceptable to blame the video driver. this stuff works fine on the same hardware.
fglrx (possibly an acronym for FireGL and Radeon Linux driver) is the name of the Linux display driver used for ATI Radeon and ATI FireGL family video adapters. It contains open source and closed source parts. For proper Direct Rendering Infrastructure (DRI) support, the kernel source code for the currently running kernel must be installed and compiled. The driver can work without the kernel module, but DRI will not be available.
romaxa: are you using the kernel module?
(In reply to comment #2)
> Could this be a driver problem? I'm using the proprietary NVidia driver and the
> URL works fine for me.
Yeah, almost certainly a driver problem. The ATI proprietary drivers have never been all that high quality, and we're going down a different path in the X server now than previous versions of fx, or from what opera, etc. is using. So ATI's driver is probably hitting some case where everything needs to be copied back into system memory, maybe even back into application memory and then back to the card to be displayed.
Not pretty, but I doubt we'll do anything about it, sorry -- ATI keeps promising they'll fix their linux drivers, but they've been saying the same thing for the past 5 years. You might want to try the open-source ATI driver and see if that helps.
I'm trying to understand how Opera does that compositing/scaling/... without any hardware drivers and without a large amount of memory...
They probably have some very secret and perfect algorithms...
(In reply to comment #5)
> I'm trying to understand how Opera does that compositing/scaling/... without
> any hardware drivers and without a large amount of memory...
> They probably have some very secret and perfect algorithms...
Well, it's probably that we're buying into the Linux Dream and they're being pragmatic and doing everything in software...
Tested with OQO Model 2, 384
WinXP, Mozilla Firefox Trunk.
layout.css.dpi = 200
Open the URL or http://browser.garage.maemo.org/docs/browser_paper.html
Try to scroll up/down with a finger ~10 times... ;) It finishes only after ~1-2 minutes... :(
What do you think about pre-scaling the image surface in the ::Optimize function?
If the image has a size set by style or attribute, and/or DPI != 96, we can create an optimized surface that holds the already scaled image...
That would avoid the scale/composite operations on each repaint...
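A minimal sketch of the pre-scaling idea, not the attached patch: the image is scaled once when it is optimized, and every repaint reuses the result. The names (Image, optimize, paint, scale_calls) are hypothetical, nearest-neighbour scaling stands in for whatever filter cairo would apply, and 96 is taken as the unscaled DPI, as in the comment above.

```python
BASE_DPI = 96  # DPI at which no scaling is needed (taken from the comment above)


def scaled_size(width, height, dpi):
    """Device-pixel size of an image when laid out at the given DPI."""
    factor = dpi / BASE_DPI
    return max(1, round(width * factor)), max(1, round(height * factor))


def scale_nearest(pixels, width, height, new_width, new_height):
    """Nearest-neighbour scale of a row-major list of pixels."""
    out = []
    for y in range(new_height):
        src_y = y * height // new_height
        for x in range(new_width):
            src_x = x * width // new_width
            out.append(pixels[src_y * width + src_x])
    return out


class Image:
    """Hypothetical image object with an Optimize-like step."""

    def __init__(self, pixels, width, height):
        self.pixels, self.width, self.height = pixels, width, height
        self.optimized = None   # (pixels, width, height) once pre-scaled
        self.scale_calls = 0    # instrumentation: how often we scaled

    def optimize(self, dpi):
        """Pre-scale once if the target DPI differs from the base DPI."""
        if dpi == BASE_DPI:
            return
        w, h = scaled_size(self.width, self.height, dpi)
        scaled = scale_nearest(self.pixels, self.width, self.height, w, h)
        self.optimized = (scaled, w, h)
        self.scale_calls += 1

    def paint(self):
        """Return what would be blitted; no scaling happens on repaint."""
        if self.optimized is not None:
            return self.optimized
        return (self.pixels, self.width, self.height)


img = Image([1, 2, 3, 4], 2, 2)
img.optimize(192)            # 192 / 96 = 2x
for _ in range(5):           # five repaints, still one scale operation
    img.paint()
```

The trade-off is memory: the scaled copy is larger than the source whenever the factor is above 1.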
Created attachment 273099 [details] [diff] [review]
Simple scaling in Optimize function
With this patch, image scaling happens only once...
and when the image is used multiple times on a page...?
Oops... I did not think about that :(
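To make the objection concrete: a single pre-scaled surface per image only works if the image is always drawn at one size. One possible way around that, sketched here as an assumption rather than anything proposed in the thread, is to cache scaled copies keyed by target size. ScaledImageCache and its methods are hypothetical names.

```python
class ScaledImageCache:
    """Hypothetical per-image cache of scaled copies, keyed by target size."""

    def __init__(self, pixels, width, height):
        self.pixels, self.width, self.height = pixels, width, height
        self._cache = {}        # (new_width, new_height) -> scaled pixel list
        self.scale_calls = 0    # instrumentation: number of real scale passes

    def _scale(self, new_width, new_height):
        """Nearest-neighbour scale of the row-major source pixels."""
        return [
            self.pixels[(y * self.height // new_height) * self.width
                        + (x * self.width // new_width)]
            for y in range(new_height)
            for x in range(new_width)
        ]

    def get(self, new_width, new_height):
        """Return the copy for this size, scaling only on the first request."""
        key = (new_width, new_height)
        if key not in self._cache:
            self.scale_calls += 1
            self._cache[key] = self._scale(new_width, new_height)
        return self._cache[key]

    def cached_pixel_count(self):
        """Total pixels held by the cache, a rough proxy for memory cost."""
        return sum(len(p) for p in self._cache.values())


cache = ScaledImageCache([1, 2, 3, 4], 2, 2)
cache.get(4, 4)   # first use on the page: scaled
cache.get(4, 4)   # same size again: served from the cache
cache.get(2, 4)   # same image at another size: scaled again
```

Each distinct size costs one more scaled copy, so the memory use grows with the number of sizes at which the image appears.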
Click on "Mozilla Engine" button
Click on "Documentation" button
Linux,ATI,No HW acceleration: ~ 10 sec
Linux,ATI,fglrx HW acceleration: ~ 3 sec
Windows,Nvidia, Nvidia Driver: ~ 1 sec
Linux,ATI,No HW acceleration: ~ 2 sec
Minusing this for now; it's largely a driver issue, and there are a bunch of other bugs that cover basically the same case. The one thing we're thinking of doing is creating a pref, and possibly doing some timings on startup, and doing pure software rendering (that is, in cairo, using only in-memory image surfaces and never using the x server for anything other than the final display).
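A rough sketch of how a pref combined with startup timings could pick the rendering path. Everything here is an assumption for illustration: the pref values ("auto", "software", "xserver"), the function names, and the idea of timing one small benchmark callable per path.

```python
import time


def time_call(fn, repeats=3):
    """Best-of-N wall-clock time, in seconds, for calling fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def choose_renderer(pref, benchmarks, timer=time_call):
    """Pick a rendering path.

    pref       -- hypothetical pref value; a key of `benchmarks` forces that
                  path, anything else (e.g. "auto") means "measure and decide"
    benchmarks -- dict mapping path name to a small benchmark callable
    timer      -- function that times one callable (injectable for testing)
    """
    if pref in benchmarks:
        return pref
    timings = {name: timer(fn) for name, fn in benchmarks.items()}
    return min(timings, key=timings.get)


def bench_software():
    """Placeholder for compositing into an in-memory image surface."""


def bench_xserver():
    """Placeholder for compositing through the X server."""


benchmarks = {"software": bench_software, "xserver": bench_xserver}

# Fake timings standing in for a machine where the X path is slow.
fake_times = {bench_software: 0.002, bench_xserver: 0.010}
picked = choose_renderer("auto", benchmarks, timer=lambda fn: fake_times[fn])
forced = choose_renderer("xserver", benchmarks, timer=lambda fn: fake_times[fn])
```

Injecting the timer keeps the decision logic testable without depending on real hardware timings.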
Would that really be faster? As I recall, doing it all client side is slower than using Render, which is slower than using Xlib....
It really would be faster when Render is not accelerated... (the GFX-GTK2 backend works much faster)
But for the testcase
Render and HW acceleration will not help, because cairo has no implementation of cairo_xlib_surface_fill and always goes through _cairo_surface_fallback_fill
Created attachment 288178 [details]
Profile data for browser_paper.html
Profile data for http://browser.garage.maemo.org/docs/browser_paper.html testcase
I haven't looked at the profile data, but it doesn't matter -- fallback_fill is correct in this case, because that's where tessellation happens. After that composite or composite_trapezoids is used, which goes to the surface backend. It's only if the surface doesn't implement composite or composite_traps that pixman gets involved.
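The dispatch order described above can be modelled roughly as follows. This is a simplified illustration, not cairo's code: the classes, the fill function, and the step labels are hypothetical, and a rectangle is treated as a single trapezoid for brevity.

```python
def tessellate(rect):
    """Turn a shape into trapezoids; a rectangle is already one trapezoid."""
    return [rect]


def fill(surface, rect):
    """Model of the fill path; returns the steps that were taken."""
    steps = []
    backend_fill = getattr(surface, "fill", None)
    if backend_fill is not None:
        backend_fill(rect)
        steps.append("backend fill")
        return steps

    # No backend fill: the generic fallback tessellates the shape...
    steps.append("fallback fill: tessellate")
    traps = tessellate(rect)

    # ...and then hands the trapezoids back to the surface backend if it can
    # composite them. Only otherwise is the software rasterizer involved.
    composite = getattr(surface, "composite_trapezoids", None)
    if composite is not None:
        composite(traps)
        steps.append("backend composite_trapezoids")
    else:
        steps.append("software composite")
    return steps


class TrapsBackend:
    """Hypothetical backend without fill but with composite_trapezoids."""

    def composite_trapezoids(self, traps):
        return len(traps)


class BareBackend:
    """Hypothetical backend that implements neither operation."""


with_traps = fill(TrapsBackend(), (0, 0, 10, 10))
bare = fill(BareBackend(), (0, 0, 10, 10))
```

In this model, seeing the fallback fill in a profile does not by itself mean the work was done in software; that depends on the second branch.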
I'm told that mobile is supposed to be treated as tier 1. as such, i think vlad's minus is no longer valid.
Taking off the blocking 1.9 list again. Timeless, Mobile is Tier 1 post Gecko 1.9; we would still really like to get this fixed, but it will not block the 1.9 release.
Created attachment 330850 [details] [diff] [review]
Enable 16bpp (or native visual xrender fmt)
With this patch we get a significant performance improvement on N8XX with the 16bpp visual format.
Without this patch, scrolling on browser.garage.maemo.org from "Microb Engine" -> "Documentation" takes ~ 11 sec.
With this patch it takes ~ 2 sec.
Panning and scrolling on http://browser.garage.maemo.org/docs/browser_paper.html are also much faster now...
The Fennec browser also feels much faster with this fix.
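The thread does not spell out why matching the visual format helps. One plausible contribution, stated here as an assumption, is that drawing into 24bpp surfaces on a 16bpp display forces a per-pixel format conversion whenever pixels go to the screen. The sketch below shows the usual RGB888 to RGB565 packing; the function names are hypothetical.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit R, G, B into one 16-bit RGB565 value (5/6/5 bits)."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)


def convert_row_to_rgb565(row):
    """Convert one row of (r, g, b) tuples; this work repeats for every row."""
    return [pack_rgb565(r, g, b) for (r, g, b) in row]


def conversions_per_repaint(width, height):
    """Number of per-pixel conversions needed for one full-surface update."""
    return width * height


row = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
packed = convert_row_to_rgb565(row)
```

If the surfaces already use the display's native 16bpp format, this conversion step is not needed at all.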
Created attachment 348343 [details] [diff] [review]
No conflict with force24 bit option
Disable 16-bit usage if the force 24-bit option is enabled
Created attachment 351034 [details] [diff] [review]
Enable using system visual format by preference
As I understand it, always using the system format is not a very good idea (it could affect some 16-bit desktop systems...).
Here is a preference to enable it where it is required.
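How the two options described in these patches might interact, as a sketch only: the parameter names are hypothetical, and the default of 24bpp when neither option is set is an assumption, not something read from the patches.

```python
def choose_surface_depth(system_depth, use_system_format=False, force_24bpp=False):
    """Pick the bit depth for offscreen surfaces.

    system_depth      -- depth of the display's native visual (e.g. 16)
    use_system_format -- hypothetical pref: match the native visual format
    force_24bpp       -- hypothetical pref: always use 24bpp; wins over the
                         system-format pref, so the two never conflict
    """
    if force_24bpp:
        return 24
    if use_system_format:
        return system_depth
    return 24  # assumed default when no pref is set


device = choose_surface_depth(16, use_system_format=True)
both = choose_surface_depth(16, use_system_format=True, force_24bpp=True)
default = choose_surface_depth(16)
```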
Any updates for this patch review?
I've been investigating important Fennec pageload slowdowns associated with Chinese fonts, and the problem turns out to be closely related to this bug. I've applied the patches and noticed significant speed-ups, so if we could just refocus our interest on this bug (and on bug 459078), that'd be great!
We should probably take the latest pixman, with its improved scaling functionality, and also merge this with bug 545632.
I don't really like adding new prefs. What's the reason we can't just do the right thing?
Is this still a latent win for Fennec, or did the work in bug 545632 capture the valuable part?
Yes, it looks like the patch in bug #545632 makes this the default behaviour now. Marking this as fixed, please re-open if this is an error.