Closed Bug 650968 Opened 13 years ago Closed 12 years ago

Enabling a lightweight theme (Persona) causes significant startup slowness

Categories

(Toolkit :: Add-ons Manager, defect)

defect
Not set
major

Tracking

()

VERIFIED FIXED
mozilla18
Tracking Status
firefox6 - ---

People

(Reporter: jorgev, Assigned: ttaubert)

References

(Depends on 1 open bug)

Details

(Whiteboard: [ts][Snappy:P1])

Attachments

(1 file, 9 obsolete files)

While investigating a performance issue with Personas Plus in bug 650709, Wladimir discovered that only having a Persona enabled is enough to cause a significant slowdown in Firefox startup (see bug 650709 comment #3).

I tested this on my system and the startup overhead of having only the Groovy Blue persona enabled - without Personas Plus or any other add-on installed - was close to 18% (549.40ms baseline vs. 647.80ms with persona, warm startup on Mac OS). The slowdown is so large that I suspected the persona was always downloaded from the server, but that's incorrect (it's stored on disk). Starting Firefox with a proxy pointing to localhost yielded similar results.

It looks like the code that loads the selected persona at startup needs some looking into. Given the large amount of Firefox users that have personas installed, this is probably one of our largest startup performance bottlenecks, if not the largest.
I'm going to guess this is more to do with painting the image than the actual selection code, testing with tiny header/footer images would verify that.
Given the huge number of users with a Persona, nominating for some investigation in the Firefox 6 timeframe.
Assignee: nobody → mh+mozilla
This seems to be what is taking that much time:
http://mxr.mozilla.org/mozilla-central/source/toolkit/content/LightweightThemeConsumer.jsm#110

which suggests that setting the background is synchronous.
So, this is not the aElement.style.backgroundImage = ... line taking time, but it triggering something that does (not really surprising). So it very much looks like this is related to the UI not being displayed early, as it was before 4.0. (I think there was a bug for that).

Delaying the lightweight theme initialization might help, at the risk of first displaying the UI without the theme during startup.
Attached patch Hack (obsolete) — Splinter Review
This somehow solves the problem but is really not a proper solution.
What can be seen with this is interesting, though, as it makes it even more obvious that the UI display is delayed. Without the patch applied, you get the UI with the personas applied directly. With the patch applied, you get the default UI without the personas, then the personas background color, then the personas background image. It's even more obvious when all three have radically different colours, and when the background image is even bigger than the groovy blue persona (I've been playing with a 6400x4800 entirely red jpeg which adds 400ms of startup time)
Attached patch Partial fix (obsolete) — Splinter Review
This works on Windows and Mac, for some value of work: the UI shows up earlier, but tabs don't load anything until the persona is loaded.
On Linux, however, while the window is opened earlier, its content is not displayed until the persona is loaded. I'll file a separate bug for that.
Attachment #527497 - Attachment is obsolete: true
Depends on: 651884
Boris, any thoughts on this?
Hmm.  So what this is doing is setting the background style on the documentElement of browser.xul, right?  That will trigger a style reresolve on everything in the DOM (more precisely, everything in the subtree rooted at that element, but it's the root).  Then it will trigger a repaint when the image loads, etc.

And this is being done via the root-element binding constructor, so ipso facto this code is running _after_ we have already resolved styles for the root and constructed frames for all the chrome stuff.

I wonder how much of the cost here is just the restyle.  If you take out the background set and instead just set .style.top or some such, how do the timings look?  If that's most of it, then the right solution is to just move the style set earlier in the window loading process; ideally before the StartLayout call.  In other words, move the style munging out of XBL.
AFAICT, most of the time is spent actually decoding the jpeg. Loading it earlier may save a style reresolve, but won't save the jpeg decoding. I'll however try to check how much time is spent on the restyle.
Ah, yes.  The JPEG decode won't start until we resolve style for the root the first time after the style is set...
BTW, if setting the background color results in the same style reresolve as setting the background image, then I already know the overhead we get by setting the background color only is negligible, that's the first thing I isolated (comment 3)
Changing background color is identical to changing background image from a style reresolution perspective, yes.  I assume in comment 3 you just commented out the backgroundImage set but kept the backgroundColor set?  If so, then yes, sounds like the reresolve time is not what's hurting here.

Is the jpeg decode really 100ms here?  How big are these jpegs?  Can we convert them to a faster-to-decode format when we cache them, if that would help?
(In reply to comment #12)
> Changing background color is identical to changing background image from a
> style reresolution perspective, yes.  I assume in comment 3 you just commented
> out the backgroundImage set but kept the backgroundColor set?  If so, then yes,
> sounds like the reresolve time is not what's hurting here.
> 
> Is the jpeg decode really 100ms here?

I think it is, but I will do some profiling.

> How big are these jpegs?

3000x200.

> Can we convert them to a faster-to-decode format when we cache them, if that would help?

I'm afraid there's not much we can do for such sizes :( We would still need some kind of compression, because that's 2.4MB of uncompressed data and reading that off disk is costly (and would certainly take these 100ms on old or crappy disks) ; I don't think pngs would be much faster, though I will check.

I however think we have 3 different problems for this bug:
- The UI not showing up until the images are decoded, which is addressed by the patch I attached, modulo bug 651884 on Linux
- The image decoding taking too much time/the images being too big.
- The tabs not loading until the personas images are decoded, even when the UI is shown before they are

I think for the second, we could try to cheat and cache a downscaled version of the image that we could display early and update with the fullsize version when it's decoded. I'm not sure how fast such temporary upscaling would be, though.

For the latter, I'll need to dive deeper.
Two thoughts:
a) Should we call these heavyweight themes?
b)We could do some speculative decoding on a thread. Record all of the image loads that happen before .firstPaint in startup cache and then decode them on a separate thread during startup. This has 2 benefits 1) it gives us a chance of decoding the images before they are needed, b) it keeps the cpu busier which increases the speed that we can do io at =D
(In reply to comment #14)
> Two thoughts:
> a) Should we call these heavyweight themes?

I guess people are creating personas this big because of modern screen resolutions.

> b)We could do some speculative decoding on a thread. Record all of the image
> loads that happen before .firstPaint in startup cache and then decode them on a
> separate thread during startup. This has 2 benefits 1) it gives us a chance of
> decoding the images before they are needed, b) it keeps the cpu busier which
> increases the speed that we can do io at =D

Note that the 15~20% speed impact the jpeg decoding represents is on warm startup.
(In reply to comment #15)

> Note that the 15~20% speed impact the jpeg decoding represents is on warm
> startup.

It wont hurt to do this speculatively on warm startup
(In reply to comment #16)
> (In reply to comment #15)
> 
> > Note that the 15~20% speed impact the jpeg decoding represents is on warm
> > startup.
> 
> It wont hurt to do this speculatively on warm startup

Sure but we can't rely on that alone to make things faster.
(In reply to comment #15)

> I guess people are creating personas this big because of modern screen
> resolutions.

We've required Personas be 2500-3000px wide and 200px tall since the beginning (100px for footer).  Max filesize of 300kb.
FWIW, a good part of the overhead is spent in:
- ycc_rgb_convert_argb
- jpeg_idct_islow
- h2v2_fancy_upsample
- imgFrame::Optimize

(this was with a libjpeg build because I don't have a recent enough yasm for libjpeg-turbo)

An important part of that overhead is also spent doing clear_page_c in the kernel, most of it coming from gfxImageSurface::gfxImageSurface.
The previous comment was with my test image. With the actual groovy blue image this is:
- ycc_rgb_convert
- jpeg_idct_islow
- qcms_transform_data_rgb_out_lut_sse2

In png cases, a good part is spent in:
- MOZ_PNG_read_filt_row
- MOZ_Z_inflate_fast
- qcms_transform_data_rgb_out_lut_sse2

PNGs do seem to be slightly faster to decode (for the same groovy blue background).
Another related problem: decoding the footer image is altering the startup time even when nothing displays it. (BTW, during inspection, I've seen a lot of unused images being loaded during startup)
Hmm....  Unused in what sense?  For chrome I believe we may do decode-on-load, not decode-on-draw, and load will happen if the rule with the image matches some element (because at that point we don't know whether the element will be visible).  Or are these images from rules that don't actually match anything?
(In reply to comment #22)
> Hmm....  Unused in what sense?  For chrome I believe we may do decode-on-load,
> not decode-on-draw, and load will happen if the rule with the image matches
> some element (because at that point we don't know whether the element will be
> visible).  Or are these images from rules that don't actually match anything?

My guess is that they are images that are actually invisible from the UI, but appear somewhere. I'll get a list of them.
Mike, I highly suggest you upgrade your yasm and reprofile; libjpeg isn't what we're going to be shipping in 5 or 6, after all.

Chrome images are never discarded, so we always decode everything necko gives us synchronously (rather than storing it and then decoding it asynchronously in 200k chunks later, as we do for other images). We define chrome images as having a scheme of chrome:// or resource://, fwiw.

This could be changed, probably without a huge amount of fuss, but it's not on my radar for the time being.
(In reply to comment #23)
> My guess is that they are images that are actually invisible from the UI, but
> appear somewhere. I'll get a list of them.

Filed bug 652808 for further investigation.

(In reply to comment #24)
> Mike, I highly suggest you upgrade your yasm and reprofile; libjpeg isn't what
> we're going to be shipping in 5 or 6, after all.

I don't think it matters much, because with nightly builds, I don't see much difference in the overhead provoked by the personas images.

> Chrome images are never discarded, so we always decode everything necko gives
> us synchronously (rather than storing it and then decoding it asynchronously in
> 200k chunks later, as we do for other images). We define chrome images as
> having a scheme of chrome:// or resource://, fwiw.

Note the personas images are loaded through a file:// uri.
Hm, I think maybe personas are also special-cased to not discard; I don't remember how, but I remember Bobby doing some work regarding that.

You can validate this by breaking on mozilla::imagelib::RasterImage::AddSourceData and making sure StoringSourceData() returns false.
(In reply to comment #26)
> Hm, I think maybe personas are also special-cased to not discard; I don't
> remember how, but I remember Bobby doing some work regarding that.
> 
> You can validate this by breaking on
> mozilla::imagelib::RasterImage::AddSourceData and making sure
> StoringSourceData() returns false.

It returns true for personas images.
FWIW, mozilla::imagelib::RasterImage::AddSourceData receives the personas images by 32K chunks
OK. Therefore, personas images are treated exactly the same way as content images.
Blocks: 622908
tracking-firefox flags aren't for suggesting something is important. If you think it's important, work with the leads of that module to get it prioritized.
Mossop, can you prioritize this bug accordingly, or should I add it to the Features Inbox?
Mike, when can you find time to work on this?
I'll try to gather my notes on the subject and file different bugs for the different aspects of the problem tomorrow and/or talk to the most relevant people first.
I have to check on non linux environments but it does seem like bug 590260 had a negative impact here. Using the new prefs to use the old values makes things go a bit like the patch I attached here (even hitting bug 651884), though the timings don't seem as good as with the patch here (but I was taking timings with my huge red jpeg).
Wondering if this one got lost. Seems like a pretty big issue impacting startup.
Whiteboard: [ts]
This is likely to have been "fixed" (for some value of fixed) by bug 666352.
Depends on: 666352
I tested this on Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0) Gecko/20100101 Firefox/8.0 and got pretty much the same result as in Comment 0, about 18% overhead just for having the Groovy Blue Persona installed.
Tested this with Fx12 (Aurora) just now and the issue is as bad as ever (actually slightly worse than in Comment 0, but that's probably just differences in setup).  I tested on Windows7 with same Groovy Blue Persona.
Whiteboard: [ts] → [ts][Snappy]
Mike can you revive this? Marking it as P2 because I hope that a relatively small percentage of our users uses personas.
Whiteboard: [ts][Snappy] → [ts][Snappy:P2]
(In reply to Taras Glek (:taras) from comment #39)
> Mike can you revive this? Marking it as P2 because I hope that a relatively
> small percentage of our users uses personas.

I think we have tens of millions of Personas users. In aggregate, I'll bet Personas are bigger than any single Add-on out there. Jorge or Fligtar can correct me if I'm lying.
Last I heard Personas usage was relatively high (~20% of users).

I'd be curious to see a breakdown of cost due do image loading, decoding, and other side effects.

Quick experiment: try replacing the actual Personas image with a tiny 1x1 image, to see how that changes things? ISTR this being pretty trivial to do, but I don't remember how. :)
(In reply to Justin Dolske [:Dolske] from comment #41)
> Last I heard Personas usage was relatively high (~20% of users).
> 
> I'd be curious to see a breakdown of cost due do image loading, decoding,
> and other side effects.

We determined that most of the overhead is due to synchronous image loading.
Whiteboard: [ts][Snappy:P2] → [ts][Snappy:P1]
I thought bug 666352 would have made this much better.
Maybe tweaking image.mem.max_bytes_for_sync_decode would help?
(In reply to Mike Hommey [:glandium] from comment #43)
> I thought bug 666352 would have made this much better.
> Maybe tweaking image.mem.max_bytes_for_sync_decode would help?

From my (non-scientific) test, it does!  I reduced it to 15000 from the default 150000 and the times went to close to the persona-less time.  (increasing it made the time worse).  I'd be interested in this being replicated in more rigorous tests though.
How are you measuring startup time? We absolutely must decode the full image before Firefox is fully "loaded", despite whatever our measurements tell us. Reducing the pref will make our startup time strictly worse, because it's more work to return to the event loop and then re-start image loading.

If you're looking at how long it takes for onload to fire on about:blank, for example, that isn't accurate, because reducing the pref lets onload fire on about:blank before the persona image decodes.
(In reply to Joe Drew (:JOEDREW!) from comment #45)
> How are you measuring startup time? We absolutely must decode the full image
> before Firefox is fully "loaded", despite whatever our measurements tell us.
> Reducing the pref will make our startup time strictly worse, because it's
> more work to return to the event loop and then re-start image loading.

I used the about:startup addon [https://developer.mozilla.org/en/Measuring_Add-on_Startup_Performance#Using_about:startup] and was quoting firstPaint times.  I'm happy to accept these numbers are wrong (I did say unscientific!) if someone wants to run some more accurate tests.
This needs off-main-thread image decoding, imo.
That'll only help if the computer has multiple cores, or the main thread is otherwise blocked (for example, on I/O), but either of those things may be the case.
Or if it really is just decoding we could convert the Personas into a bitmap or something and then read them off the disk.
That will be a lot more I/O.

Crazy idea: store bitmap copy in the startupcache. Should result in some reduction in I/O, compared to storing in it's own file.
Jorge, it would be *really* helpful to get the following stats on personas:
* distribution of image size (in pixels)
* distribution of png/jpg files

This will help us decide how aggressive we need be in optimizing this.
(In reply to Taras Glek (:taras) from comment #51)
> Jorge, it would be *really* helpful to get the following stats on personas:
> * distribution of image size (in pixels)

According to this https://www.getpersonas.com/en-US/demo_create, all images are 3000px wide x 200px high for the header, and 3000px wide x 100px high for the footer. Maximum file size is 300kb.

> * distribution of png/jpg files

Wil, do you know how we can find out the distribution of png and jpg files for personas?
(In reply to Jorge Villalobos [:jorgev] from comment #52)
> Wil, do you know how we can find out the distribution of png and jpg files
> for personas?

70168 pngs, 387424 jpgs, around 3000 other assorted types (gif, bmp(!), etc.)
(In reply to Wil Clouser [:clouserw] from comment #53)
> (In reply to Jorge Villalobos [:jorgev] from comment #52)
> > Wil, do you know how we can find out the distribution of png and jpg files
> > for personas?
> 
> 70168 pngs, 387424 jpgs, around 3000 other assorted types (gif, bmp(!), etc.)

Thanks. So looks like the best option is to provide multiple background jpgs/pngs/etc so the browser can choose one based on resolution. Can we do this on the serverside?
(In reply to Taras Glek (:taras) from comment #54)
> Thanks. So looks like the best option is to provide multiple background
> jpgs/pngs/etc so the browser can choose one based on resolution. Can we do
> this on the serverside?

I feel this would be better client side - Firefox can crop the image to the exact size needed then.  If its done on shutdown no-one would notice and the full size could be swapped back in shortly after startup or on the first resize event.

Otherwise Fx might end up either having to download all the sizes or end up with a network lag the first time the window is resized (e.g. the window is resized from 799px wide to 801px wide)
(In reply to Andrew Williamson [:eviljeff] from comment #55)
> (In reply to Taras Glek (:taras) from comment #54)
> > Thanks. So looks like the best option is to provide multiple background
> > jpgs/pngs/etc so the browser can choose one based on resolution. Can we
> > do this on the serverside?
> 
> I feel this would be better client side - Firefox can crop the image to the
> exact size needed then.  If its done on shutdown no-one would notice and the
> full size could be swapped back in shortly after startup or on the first
> resize event.
> 
> Otherwise Fx might end up either having to download all the sizes or end up
> with a network lag the first time the window is resized (e.g. the window is
> resized from 799px wide to 801px wide)

Addenum idea:
As the images are stored locally, would it be possible to store a
pre-cropped to max. screen-width (landscape) image in the startupCache?

That way the for most screens ridiculous width of 3000px of the images would not
have a that heavy impact on the startup.

Also it could be part of the solution to test what image format is the fastest to
load and decode in summary, load here means load from slow media (e.g. 10MB/s),
in summary means the times of load and decode added together, as it makes no sense
to have a superfast decode with a immense image-size on disk, and the other extreme
is just as bad for the end-user.
(In reply to Michael Foerster from comment #56)
> Addenum idea:
> As the images are stored locally, would it be possible to store a
> pre-cropped to max. screen-width (landscape) image in the startupCache?
> 
> That way the for most screens ridiculous width of 3000px of the images would
> not have a that heavy impact on the startup.

It would help but the optimum width would still be the exact size of the Firefox window when it closed.
Or an arbitrarily smaller size. And CSS3 background properties could be used to merge them all back together properly. I wonder if our code base is smart enough not to load CSS background images that aren't on screen.
(In reply to Mike Hommey [:glandium] from comment #58)
> Or an arbitrarily smaller size. And CSS3 background properties could
> be used to merge them all back together properly. I wonder if our
> code base is smart enough not to load CSS background images that
> aren't on screen.

The problem here is in multiple points:
- Why load the footer bg image when no footer (add-on ber) is shown?
- To load a image is has to be decoded in full size, no matter how
   much of it is shown. (here 3000px width)

Possible solution could include:
- Code that decode the images once on installing the theme, and
   storing a smaller version in startCache
- Code that makes the renderer smart enough to be able to decide
   not to load / render images not shown (e.g. display:none)

The first would help with "lightweight themes", the second would
improve the renderer for all uses (should become it's own bug).
(In reply to Andrew Williamson [:eviljeff] from comment #57)
> (In reply to Michael Foerster from comment #56)
> > Addendum idea:
> > As the images are stored locally, would it be possible to store
> > a pre-cropped to max. screen-width (landscape) image in the 
> > startupCache?
> > 
> > That way the for most screens ridiculous width of 3000px of the
> > images would not have a that heavy impact on the start-up.
> 
> It would help but the optimum width would still be the exact size
> of the Firefox window when it closed.

Not exactly. Let's show the cases:
Mobile: can change orientation -> let's just use landscape width,
 no additional decoding of the original 3000px width image needed.
Tablet: Ditto.
Desktop: user sometime use Full-screen, and sometimes not. Otherwise
 same as Mobile.

So, decoding the images once on install of the theme to
landscape-width would safe the most of the unneeded decoding,
while still grasping the most efficiency in terms of memory and
cpu usage.

A question that comes to the mind here is:
Do we keep the unneeded areas of the images still in memory,
or just the area we show?
In reply to comment #60:
Desktop: when (not if) the user installs a wider screen and/or a better-resolution graphics card, Toolkit should notice it (at least the first time) and reprocess the full-width image. This means we don't dare avoid checking the window width every time; this, however, ought not to be onerous.
Blocks: 729656
(In reply to Michael Foerster from comment #59)
> - Why load the footer bg image when no footer (add-on ber) is shown?

The trouble with that is that when it is shown, there may be a lag in displaying the image.

Sadly, a lot of possible startup optimizations for this just shift the issue to when Firefox is already running, and make it more obvious :\
Mike, if you're still working on this give a holler, otherwise it sounds like the plan here is to try to create optimized images client-side.

As people have mentioned, using the screen width as a max is a good starting point. I'm guessing that should cover 80+% of people. We can store that just like we store the other images into the profile with the width appended (lightweighttheme-header-1920). nsIScreenManager doesn't let us enumerate the available widths, so we start with primaryScreen (or maybe even screen for current window). In the consumer, we look for optimized values and see if the screen that window is on is <= the optimized value & uses that resized header/footer if so otherwise fallback to. Optionally, we can send that screen size back the the manager which could write additional sizes out as needed (so when I'm not using my external, we'd have even smaller images to use instead). If we wanted to get fancy, we could start looking at window moves and rechoose the right image appropriately & swap out (though we'd hit the decoding sluggishness and we wouldn't have won too much). I figure we give something along those lines a try and see where it gets us, then explore using the startup cache as needed.

(In reply to Blair McBride (:Unfocused) from comment #62)
> Sadly, a lot of possible startup optimizations for this just shift the issue
> to when Firefox is already running, and make it more obvious :\

I think the files are written async, so I don't think it will be too bad. (I assume) There's already some expectation of "not instantaneous" theme installs, so if most of the work is done then it wouldn't be very noticeable.
I finally sat down and wrote a document which describes what I think needs to be done in order to attack this problem from multiple angles:

https://etherpad.mozilla.org/SnappyLightweightThemes

Please comment if I've missed something.  Not sure what the appropriate next steps are going to be.
I weekly follow the Taras Glek blog and on this bug i can make a contribution. 

Link: http://pastebin.com/FphVGy2n

The link points to an KDE project script updated by me and already submited to KDE.  This script parallely optimizes PNG,MNG,SVGZ and JPG in a given directory. My tests on ArchLinux system files using a Core 2 Duo T7250 2GHz saved about ~700MB and took 36hours.

I'm a compressionist and in my spare time will look into some of the problems described by Ehsan document.
Also would arithmetic decoding be enabled in firefox ?
Attached patch Patch v0.1 (WIP) (obsolete) — Splinter Review
Here's a WIP patch - it will crop header/footer images to the width of the first screen (primary) and save those to the profile, then use those on next window/startup...

Right now it only creates 1 set of cropped images at theme install and then it's done. We could do more (send feedback from Consumer into Provider), but we can't enumerate screens at install time unless we go fix nsIScreenManager.

A problem with having multiple images to choose from though is that we may end up loading different images - window 1 could have image A, window 2 image B, and then we may have just made the problem worse, not better.

A big problem with this right now is that it's always saving as jpg, so sometimes "optimized" images end up larger than the originals.
(In reply to Paul O'Shannessy [:zpao] from comment #67)
> A big problem with this right now is that it's always saving as jpg, so
> sometimes "optimized" images end up larger than the originals.

Hmm, you mean, as opposed to PNGs?  Note that saving PNG images to JPG might also incur some quality loss.  Why can't we save in the image's original format?
(In reply to Ehsan Akhgari [:ehsan] from comment #68)
> Why can't we save in the image's original format?

We should. I just didn't get the mimetype detection figured out to do it yet, so I just fell back to saving everything as image/jpeg (5x more jpgs)
Note that saving jpegs as jpegs is also likely to incur some quality loss. It is theoretically possible to cut jpegs without recoding them, but i doubt you can do that from javascript.

As an aside, wouldn't it be simpler (for some value of simpler) and less error prone to cut at a fixed size (say, 300px), and define the background as a tile of all the cuts?
Attached patch Patch v0.2 (WIP) (obsolete) — Splinter Review
Now with mimetype detection. I'm probably not going to have time to work on this so if somebody wants to take this and run or go with Mike's cutting idea, feel free.
Attachment #627038 - Attachment is obsolete: true
Assignee: mh+mozilla → nobody
I wrote a little script to measure firstPaint times on my Linux machine. Here are the average numbers for 100 runs:

[Hint: those are all JPG images from the "Groovy Blue" persona.]

1) no persona = 461ms

2) with persona = 536ms (16% hit)
   header size = 180kb
   footer size = 80kb

3) with persona, cropped to screen width (1366px) = 484ms (5% hit)
   header size = 51kb
   footer size = 22kb

4) with persona, tiled images (300px width) = 482ms (5% hit)


Looks like we'll be better off with (4) as that would be a one-time operation compared to (3) that would need to adjust to changing screen sizes.
Assignee: nobody → ttaubert
Status: NEW → ASSIGNED
FWIW, (3) and (4) would also fix bug 729656. We'd effectively take the first frame and crop that. No more animations.
Depends on: 769634
Instead of using the canvas to crop images like attachment 629357 [details] [diff] [review] does, I filed bug 769634 to add cropping functionality to imgITools. This should be cleaner and has less overhead.

We'll need that for (3) and/or (4).
Assuming the "footer" isn't visible for most people as they might not have any add-ons installed or any that added buttons to it or it's disabled, we can save some decoding time (almost half of it) here as well.

This will be fixed by bug 683286 - with its patch applied the firstPaint times go down to 507ms (from 551ms).
Depends on: 683286
I tested with putting persona header/footer images in the startup cache:

[BTW, my linux machine has a SSD, not sure if I should interpret my results differently because of that?]


1) I stored the raw image data (decoded) in the scache for both images. They were of course quite big which resulted in ~504ms firstPaint time which is even worse than reading and decoding JPGs from disk.

2) I stored decoded image tiles in the scache (with 300px width) which resulted in firstPaint times ~494ms. A little better but still not good.

3) Reading decoded and cropped images (1366px width) from the scache is a little faster with ~489ms.

4) JPG encoded image tiles in the scache result in ~486ms firstPaint times.

5) The best results so far (even a little better than [4] from comment #72) yielded cropped JPGs stored in the scache - 471ms.


PNGs were a bit slower, probably because the trade-off for JPG decoding time and reading them from disk is a little better than for PNGs. I'm not entirely sure which of all those approaches is the way to go. The startup cache improvements might be better on machines with slower disks, but storing them decoded seems not an option, that's just too much IO.

Seems like [5] would be a good option with its only ~2% hit. We'd crop images and store them in the startup cache. Ideally they would be stored as JPGs but maybe we should keep them in their original format for quality reasons?
(In reply to Tim Taubert [:ttaubert] from comment #76)
> Seems like [5] would be a good option with its only ~2% hit. We'd crop
> images and store them in the startup cache. Ideally they would be stored as
> JPGs but maybe we should keep them in their original format for quality
> reasons?

It sounds like any of those would be a huge win over the ~20% hit we're taking today. Can we pick which ever one is easiest and get it shipped to our users ASAP?
Depends on: 771077
Attached patch patch v3 (obsolete) — Splinter Review
This patch does basically the same as Paul's patch v2 but I moved the code to a JSM and use ImageTools for cropping.

Every time we set new background images we check if there's a cropped version available for the current screen size. If so, we'll just exchange the urls in the given data object.

If there's no cropped version yet then we'll just return the original image and start cropping (as) asynchronously (as possible) in the background.

When persisting images for a newly installed lightweight theme we throw away all cropped images so that we're not re-using wrong images from other themes.

I did some new measurements because lots of platform stuff has changed in the meantime:

1) no persona = 440ms
2) with persona = 508ms (+15.5%)
3) with persona, cropped = 460ms (+4.5%)
Attachment #527555 - Attachment is obsolete: true
Attachment #629357 - Attachment is obsolete: true
Attachment #641027 - Flags: review?(dtownsend+bugmail)
Comment on attachment 641027 [details] [diff] [review]
patch v3

What's the behavior when previewing a lightweight theme?
(In reply to Dão Gottwald [:dao] from comment #79)
> What's the behavior when previewing a lightweight theme?

We don't do anything in this case. We're only optimizing file:// URLs.
Originally there was a preference to turn this off. I don't see it in the current patch?
To turn what off?
> To turn what off?

All of the cropping code.

I have an add-on that does significant modifications to Personas. I'd like to not invoke this code.


On a separate note, if we are optimizing the image to a screen size, what happens when the user moves his browser to a second screen on their machine that is larger? 

In a multi monitor situation, which screen is used?
(In reply to Michael Kaply (mkaply) from comment #83)
> > To turn what off?
> 
> All of the cropping code.
> 
> I have an add-on that does significant modifications to Personas. I'd like
> to not invoke this code.

(sarcasm)
so, you are heavily interested in hindering 99% of the personas users for -what-
real win?
(/sarcasm)

Every application of such dual-path code is much more error prone than a single
path code. And add to that kind of stress, that every test for Personas would have to be done twice. NOT nice.

So far this code is Nightly(16), it may be elevated to Aurora(15), but for Beta(14) it's simply to late.

You have at least 8 weeks, more realistic 14 weeks do find a solution in the addon to handle the cropping Fx does with this code.

Since the very start of Personas, the lag introduced by the image-loading was a 
very big turn-off for me, especially with the directive of 3000 Pixel width.
 
 
> On a separate note, if we are optimizing the image to a screen size, what
> happens when the user moves his browser to a second screen on their machine
> that is larger? 
> In a multi monitor situation, which screen is used?

Now THAT is a valid concern, but not the snag, as we still have the original
images and can 'reload' and re-crop for the new situation.

Resizing a window to bigger than before causes a full redraw of the content in
most cases anyway a reload+re-crop for the new, bigger width adds not that much
overhead.

The move / resize situation is much different from startup, esp cold startup.

Any of the Gurus, please correct me, if I'm wrong here, and @mkaply: this is not
meant personal against you, it's meant as a matter of perspective.

None the less, maybe a function/api that can be manually called to start the
reload + re-crop process with a given path-to-image could be worth to be
investigated.
(In reply to Michael Kaply (mkaply) from comment #83)
> On a separate note, if we are optimizing the image to a screen size, what
> happens when the user moves his browser to a second screen on their machine
> that is larger?

Need to handle this, will update the patch.

> In a multi monitor situation, which screen is used?

The screen that contains most of the window. If the user stretches the Window via both screens this should be handled accordingly by the screen manager - at least GTK/X11/Xinerama seems to work that way and I expect it to be handled the same for other OSs.
(In reply to Michael Kaply (mkaply) from comment #83)
> On a separate note, if we are optimizing the image to a screen size, what
> happens when the user moves his browser to a second screen on their machine
> that is larger? 

At the moment, the image isn't re-cropped.

> In a multi monitor situation, which screen is used?

Based on some quick testing, it seems to always use the primary monitor, even if on startup the window appears on another monitor. I'm guessing window.screen gets updated sometime after startup to reflect the current monitor.

Also, it appears that optimized images are generated on the next startup after a lightweight theme is selected. I think optimize() needs to be called somewhere in _persistImages in LightweightThemeManager.

(Its 2am, so I haven't looked at anything too deeply.)
(In reply to Blair McBride (:Unfocused) from comment #86)
> (In reply to Michael Kaply (mkaply) from comment #83)
> > On a separate note, if we are optimizing the image to a screen size, what
> > happens when the user moves his browser to a second screen on their machine
> > that is larger? 
> 
> At the moment, the image isn't re-cropped.

I'm working on it right now.

> > In a multi monitor situation, which screen is used?
> 
> Based on some quick testing, it seems to always use the primary monitor,
> even if on startup the window appears on another monitor. I'm guessing
> window.screen gets updated sometime after startup to reflect the current
> monitor.

Every time window.screen is accessed it actually gets the corresponding screen and may yield a different value based on where your windows is positioned.

> Also, it appears that optimized images are generated on the next startup
> after a lightweight theme is selected. I think optimize() needs to be called
> somewhere in _persistImages in LightweightThemeManager.

No, they're generated as soon as you select your new lightweight theme but they're not instantly applied because they're cropped asynchronously. But they're ready to be used at the next startup.

I chose LightweightThemeConsumer._update() to optimize the images because that's the only way to migrate existing themes. We can't tell users to re-install their themes to reduce their startup times :)
Attached patch patch v4 (obsolete) — Splinter Review
The patch does now listen for resize events on the given window. If it detects that the screen size has changed it will crop new images in the background (if necessary).

I chose the resize event because there is no "screensizechanged" event or the like. Also we don't really need to adapt the theme's background images if the user drags the browser window to a different screen but doesn't actually resize the window.
Attachment #641027 - Attachment is obsolete: true
Attachment #641027 - Flags: review?(dtownsend+bugmail)
Attachment #641469 - Flags: review?(felipc)
Comment on attachment 641469 [details] [diff] [review]
patch v4

>-  _update: function (aData) {
>+  update: function (aData) {

This method isn't for external use.

What's the point of the "resize timer"?
(In reply to Dão Gottwald [:dao] from comment #89)
> >-  _update: function (aData) {
> >+  update: function (aData) {
> 
> This method isn't for external use.

You think it'd be fine to call it from the ResizeWatcher still with the underscore? Properties beginning with an underscore are usually only accessed by the object itself.

> What's the point of the "resize timer"?

AFAIK resize fires multiple times while resizing and I didn't want to slow it down. If you say otherwise I'd be happy to remove it.

Now that I think about it comparing the new screen size should not be more costly than cancelling and initiating the timer again...
Attached patch patch v5 (obsolete) — Splinter Review
Reverted the method name to "_update". Removed the resize timer.

Dão: feel free to steal the review if you'd like to do it. Didn't want to just re-assign it to you.
Attachment #641469 - Attachment is obsolete: true
Attachment #641469 - Flags: review?(felipc)
Attachment #641482 - Flags: review?(felipc)
(In reply to Tim Taubert [:ttaubert] from comment #90)
> (In reply to Dão Gottwald [:dao] from comment #89)
> > >-  _update: function (aData) {
> > >+  update: function (aData) {
> > 
> > This method isn't for external use.
> 
> You think it'd be fine to call it from the ResizeWatcher still with the
> underscore?

Yes, but I'd rather get rid of the "ResizeWatcher" entirely and let the LightweightThemeConsumer instance listen for the resize event directly.
(In reply to Dão Gottwald [:dao] from comment #92)
> Yes, but I'd rather get rid of the "ResizeWatcher" entirely and let the
> LightweightThemeConsumer instance listen for the resize event directly.

We can do this, of course. I thought it might be better to have those responsibilities separated.
Comment on attachment 641482 [details] [diff] [review]
patch v5

>   _update: function (aData) {
>     if (!aData)
>       aData = { headerURL: "", footerURL: "", textcolor: "", accentcolor: "" };
> 
>+    this._resizeWatcher.update(aData);
>+    LightweightThemeImages.optimize(aData, this._win.screen);

>+  update: function RW_update(aData) {
>+    this._data = this.copy(aData);
>+  },
>+
>+  resize: function RW_resize() {
>+    let lastScreenSize = this._screenSize;
>+    let {width, height} = this._window.screen;
>+
>+    if (lastScreenSize.width != width || lastScreenSize.height != height) {
>+      this._consumer._update(this.copy(this._data));
>+      this._screenSize = {width: width, height: height};
>+    }
>+  },

The resize -> copy -> _update -> update -> copy flow is pretty confusing...

>+++ b/toolkit/mozapps/extensions/LightweightThemeImages.jsm

>+let ImagesInternal = {

This object doesn't seem to be of value, its methods can be functions in the global scope.
(In reply to Dão Gottwald [:dao] from comment #94)
> >+++ b/toolkit/mozapps/extensions/LightweightThemeImages.jsm
> 
> >+let ImagesInternal = {
> 
> This object doesn't seem to be of value, its methods can be functions in the
> global scope.

Really? You mean I should put all those functions and the 'inProgress' object just in the global scope of the JSM? That's very unlike we usually write modules, isn't it? I'd rather put it all in 'LightweightThemeImages' then and prefix those methods with underscores but this seems all weird to me.
(In reply to Tim Taubert [:ttaubert] from comment #95)
> (In reply to Dão Gottwald [:dao] from comment #94)
> > >+++ b/toolkit/mozapps/extensions/LightweightThemeImages.jsm
> > 
> > >+let ImagesInternal = {
> > 
> > This object doesn't seem to be of value, its methods can be functions in the
> > global scope.
> 
> Really? You mean I should put all those functions and the 'inProgress'
> object just in the global scope of the JSM? That's very unlike we usually
> write modules, isn't it?

LightweightThemeManager.jsm has various global helper functions. We can group functions in objects where it makes sense, but I don't see the point here.
Attached patch patch v6 (obsolete) — Splinter Review
Removed the 'ResizeWatcher' and incorporated everything into the 'LightweightThemeConsumer' object.
Attachment #641482 - Attachment is obsolete: true
Attachment #641482 - Flags: review?(felipc)
Attachment #641502 - Flags: review?(felipc)
Comment on attachment 641502 [details] [diff] [review]
patch v6

>+    let lastScreenSize = this._lastScreenSize;
>+    let {width, height} = this._win.screen;
>+
>+    if (lastScreenSize.width != width || lastScreenSize.height != height) {
>+      this._update(this._lastData);
>+      this._lastScreenSize = {width: width, height: height};
>+    }

nit: use this._lastScreenWidth and this._lastScreenHeight instead of the this._lastScreenSize object

>+  optimizeImage: function LWTII_optimizeImage(aImage, aScreen, aOrigin) {

s/aImage/aImageURL/

I still don't see the point of ImagesInternal. 'optimize' and 'purge' can be implemented directly on LightweightThemeImages (which might better be called LightweightThemeImagesOptimizer for clarity?).
(In reply to Dão Gottwald [:dao] from comment #98)
> I still don't see the point of ImagesInternal. 'optimize' and 'purge' can be
> implemented directly on LightweightThemeImages (which might better be called
> LightweightThemeImagesOptimizer for clarity?).

Sounds good to me. Will do.
Comment on attachment 641502 [details] [diff] [review]
patch v6

Patch needs to be updated.
Attachment #641502 - Flags: review?(felipc)
(In reply to Tim Taubert [:ttaubert] from comment #99)
> (In reply to Dão Gottwald [:dao] from comment #98)
> > I still don't see the point of ImagesInternal. 'optimize' and 'purge' can be
> > implemented directly on LightweightThemeImages (which might better be called
> > LightweightThemeImagesOptimizer for clarity?).
> 
> Sounds good to me. Will do.

Correction: "LightweightThemeImageOptimizer" would be proper English, I think...
(In reply to Dão Gottwald [:dao] from comment #101)
> Correction: "LightweightThemeImageOptimizer" would be proper English, I
> think...

Sure, I named it that way already :)
(In reply to Tim Taubert [:ttaubert] from comment #87)
> No, they're generated as soon as you select your new lightweight theme but
> they're not instantly applied because they're cropped asynchronously. But
> they're ready to be used at the next startup.

But you're only doing that for local files. In most cases when a lightweight theme is enabled, its headerURL and footerURL will be remote http:// URLs. The only time they'll be local is if that lightweight theme was the previous one to be enabled.

So the usual process at the moment is:
* enable theme (remote URL, full images)
* cache to local file (full images)
* restart
* theme re-applied (local URL, full images)
* optimize
* restart
* theme re-applied (local URL, optimized images)
(In reply to Dão Gottwald [:dao] from comment #96)
> (In reply to Tim Taubert [:ttaubert] from comment #95)
> > Really? You mean I should put all those functions and the 'inProgress'
> > object just in the global scope of the JSM? That's very unlike we usually
> > write modules, isn't it?
> 
> LightweightThemeManager.jsm has various global helper functions. We can
> group functions in objects where it makes sense, but I don't see the point
> here.

Please no. I really dislike the way LightweightThemeManager has functions and variables scattered everywhere (I'd love to refactor that whole module, to clean that up).
As I said, meaningful grouping is fine. But there's just no point in wrapping everything in an "FooInternal" container. Everything not explicitly exported is internal.
I.e. I don't see how this:

var FooInternal = {
  x: function () {
    ...
  },

  y: function () {
    ...
  },

  z: function () {
    ...
  }
}

... is better structured or less scattered than this:

function x() {
  ...
}

function y() {
  ...
}

function z() {
  ...
}
Attached patch patch v7 (obsolete) — Splinter Review
(In reply to Dão Gottwald [:dao] from comment #105)
> As I said, meaningful grouping is fine. But there's just no point in
> wrapping everything in an "FooInternal" container. Everything not explicitly
> exported is internal.

Agreed, putting all the code in one big "internal" object doesn't make sense. I separated the code into smaller objects with meaningful names. How does the patch look to you?

Still need to address Blair's feedback from comment #103.
Attachment #641502 - Attachment is obsolete: true
Attachment #641918 - Flags: feedback?(dao)
Attached patch patch v8Splinter Review
Addressed comment #103. After installing a new lwtheme images are now cropped and prepared for the next start.
Attachment #641918 - Attachment is obsolete: true
Attachment #641918 - Flags: feedback?(dao)
Attachment #653978 - Flags: review?(dao)
Attachment #653978 - Flags: review?(felipc)
Comment on attachment 653978 [details] [diff] [review]
patch v8

Review of attachment 653978 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/mozapps/extensions/LightweightThemeImageOptimizer.jsm
@@ +39,5 @@
> +  },
> +
> +  purge: function LWTIO_purge() {
> +    try {
> +      FileUtils.getDir("ProfD", ["lwtheme"]).remove(true);

set followLinks = false in this nsIFile object before calling remove()
Attachment #653978 - Flags: review?(felipc) → review+
https://hg.mozilla.org/integration/fx-team/rev/87a7a1c60a3a
Whiteboard: [ts][Snappy:P1] → [ts][Snappy:P1][fixed-in-fx-team]
Attachment #653978 - Flags: review?(dao)
https://hg.mozilla.org/mozilla-central/rev/87a7a1c60a3a
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [ts][Snappy:P1][fixed-in-fx-team] → [ts][Snappy:P1]
Target Milestone: --- → mozilla18
Can someone verify the persona/persona-less delta and that it is small?
Depends on: 819190
(In reply to Taras Glek (:taras away until Jan 1st, ask vladan instead) from comment #112)
> Can someone verify the persona/persona-less delta and that it is small?

I ran the same tests I ran before, using the latest Beta on Mac OS X, and this time the overhead with this Persona was only %1.
Status: RESOLVED → VERIFIED
Depends on: 843969
You need to log in before you can comment on or make changes to this bug.