Bug 593614 - [Meta] Site rendering is very HDD intensive, especially after cold start (due to font enumeration)
Status: RESOLVED FIXED
Whiteboard: [Snappy:p1]
Keywords: main-thread-io
Product: Core
Classification: Components
Component: General
Version: Trunk
Platform: x86 Windows 7
Importance: -- normal with 10 votes
Target Milestone: mozilla13
Assigned To: John Daggett (:jtd)
Mentors:
Depends on: 600713 602792 705594
Blocks: slowui
Reported: 2010-09-04 10:02 PDT by Peter Henkel [:Terepin]
Modified: 2012-05-06 23:16 PDT

Description Peter Henkel [:Terepin] 2010-09-04 10:02:59 PDT
User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b6pre) Gecko/20100904 Firefox/4.0b6pre
Build Identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b6pre) Gecko/20100904 Firefox/4.0b6pre

JavaScript processing is (probably) using my hard drive like crazy, especially after a cold start. Plus, while it is processing, the entire UI is choppy and unresponsive. I say "probably" because I'm not 100% sure, but the more JavaScript there is on a page (e.g. GMail), the worse it gets, so I assume that is the cause. The SunSpider benchmark is a very easy way to reproduce it: the HDD is constantly working while the benchmark is running, and if I run something else HDD intensive, the benchmark's result gets a lot worse, by around 100ms or more. I can record a video of the issue if required.

Reproducible: Always
Comment 1 Harsh86 2010-09-04 10:44:17 PDT
Since you're using Windows 7, you'll have access to Resource Monitor (accessible via Task Manager > Performance tab).

While running a JavaScript intensive site that causes a lot of disk I/O (like SunSpider), open Resource Monitor > Disk tab > Disk Activity panel. Sort the list by "Total (B/sec)". You should be able to see which file(s) firefox.exe is trying to read from or write to. Tell us which files show up.

Hopefully the files you post will at the very least give us a clue as to what disk intensive activity Firefox is performing while the JavaScript intensive site is running.
Comment 2 bthaxor 2010-09-05 02:29:17 PDT
I just tried this, was getting between 10 - 30KB being written per second during SunSpider... This doesn't seem very significant?
Comment 3 Peter Henkel [:Terepin] 2010-09-05 05:10:04 PDT
Hmm, so it seems it isn't speed-related, but IO-related, because I'm seeing 20-30 files being written/read at the same time, with a total of 500 kBps.
Comment 4 Peter Henkel [:Terepin] 2010-09-05 05:42:11 PDT
I recorded two videos:
1. Cold start time - 1:13 (you're reading it right, one minute and thirteen seconds)
http://www.sendspace.com/file/aebmq2
The strange thing is that I saw cold start improvements land, but they didn't help me at all.

2. Cold start I/O (peak reading 5 MBps, around 40 files reading/writing at the same time)
http://www.sendspace.com/file/n4jdgm

I ran HDD benchmark, here are the results:
http://img715.imageshack.us/i/hddbenchmark.png/
Comment 5 Harsh86 2010-09-06 06:34:35 PDT
I'm thinking screenshots would have been better. I can hardly read the screen in the video.

Post some screenshots of Resource Monitor while running SunSpider, as that's a "javascript processing" test case. Cold start is normally disk intensive, but JS processing isn't supposed to be. (In the screenshots, make sure you sort the list by Total (B/sec) so we can see the disk intensive files at the top.)

Also how fragmented is your hard drive?
Comment 6 Mike Beltzner [:beltzner, not reading bugmail] 2010-09-06 13:40:21 PDT
--> Core::JS, although that would surprise me to be the cause. It looks like this is being caused more by poor I/O at startup. However, the fact that it happens more on JS intensive pages and the SS benchmark is interesting.
Comment 7 Brendan Eich [:brendan] 2010-09-06 14:56:10 PDT
I/O is when the process makes system calls to read and write files, either explicitly or via implicit memory mappings. JS doesn't do explicit I/O. Is it possible that on Peter's system it's doing implicit I/O somehow? If not, then this is not a JS engine bug.

/be
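To illustrate the explicit versus implicit distinction made in comment 7, here is a minimal Python sketch (it is not Firefox or SpiderMonkey code). `read_explicit` issues an ordinary read call, while `read_mapped` maps the file and lets the OS page the data in when the mapped bytes are touched. The function names and the throwaway sample file are invented for this example.

```python
import mmap
import os
import tempfile


def read_explicit(path):
    """Explicit I/O: the process asks the OS for the bytes with a read call."""
    with open(path, "rb") as f:
        return f.read()


def read_mapped(path):
    """Implicit I/O: the file is mapped into memory and the OS pages the
    data in from disk when the mapped bytes are first touched."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]  # touching the mapping is what causes the disk reads


def make_sample_file(size=64 * 1024):
    """Create a throwaway, non-empty file to read back (hypothetical data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size)
    return path
```

Both functions return the same bytes; the difference is only in how the disk access is requested, which is why implicit I/O can be easy to miss when reading the code.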
Comment 8 Scott A. 2010-09-21 11:26:00 PDT
I have been having the same problems with the nightlies for a long time now. On a cold start after bootup it takes about 10 seconds for the browser to open; after the browser opens and goes to the (default) homepage, the HDD activity light goes crazy and the browser runs really slowly for about the next 5-10 seconds. I thought perhaps it might be the laptop hard drive (even though 3.6 doesn't do this), so I installed it on my desktop (i7 @ 3.5GHz, 6GB DDR3, GTX480, much faster HDD) and noticed the same thing (slightly faster only due to system specs, SIGNIFICANTLY worse than 3.6).

Not sure if this is actually JS related, but this bug should be renamed appropriately and definitely set as blocking2.0
Comment 9 (dormant account) 2010-09-21 11:27:43 PDT
This sounds really weird and bad. Scott, any chance of you capturing an xperf profile? https://developer.mozilla.org/En/Profiling_with_Xperf
Comment 10 Peter Henkel [:Terepin] 2010-09-21 11:35:24 PDT
I will update my results after my laptop returns from repair; however, Scott suffers from exactly the same behaviour as I do.
Comment 11 (dormant account) 2010-09-21 11:51:11 PDT
While we wait for an xperf profile, some other basic info could help.

Given the specs on Scott's machine this *should* not happen. Are there antivirus programs or other invasive programs running? Is this the result of a massive memory leak? What is the memory usage like in Windows Task Manager while this heavy disk activity is going on? With 6GB of RAM, I'm assuming Scott is running a 32-bit build on Win64, so this *should not* be a problem.
Comment 12 Scott A. 2010-09-21 14:22:38 PDT
Yes, it is a 32-bit build on Win7 64-bit. No active AV or any other programs running for that matter; relatively new OS install with the bare minimum installed and running, all the latest Windows updates and drivers, fresh profile. I don't notice anything unusual about memory usage during this time: the browser fully loads with usage around 65-70MB, and no other processes are showing unusually high usage either. It just takes a significantly longer time to cold start than 3.6 did and continues to load the HDD for a while after the browser does load.
Comment 13 Csaba Kozák [:WonderCsabo] 2010-09-25 07:37:40 PDT
Absolutely confirmed.

Other symptoms of this bug: tab animations are very sluggish when complex sites are open. :S

Please someone set it to NEW, this bug really kills Fx's UX!
Comment 14 Peter Henkel [:Terepin] 2010-09-25 07:50:48 PDT
The entire chrome gets sluggish. I'm not an FX developer, but to me it seems that chrome is connected to rendering, which I believe it shouldn't be. Otherwise I can't think of any reason why they should affect each other.
Comment 15 Csaba Kozák [:WonderCsabo] 2010-09-25 09:02:59 PDT
You're correct, but I said "symptom". The sluggishness can be seen principally in animations.

I hope a dev will look into this.

Mike: Can you mark this NEW and get work started on it?
Comment 16 Boris Zbarsky [:bz] 2010-09-25 14:15:28 PDT
What do comments 13-15 have to do with this bug?  This bug is about high hard drive usage after startup with things like _sunspider_, which is about as far from "complex sites" as you can get.  Please stop hijacking bugs and file a separate bug, with clear steps to reproduce, on whatever issues you're seeing?
Comment 17 Thomas 2010-09-25 17:26:38 PDT
I've investigated this issue and found some interesting things on my computer (using the latest nightly, 20100925). Firefox loads huge amounts of font data (e.g. > 100MB of Chinese(?) fonts) directly after the UI/tabs are visible, and it is during this font loading time that the UI is slow. This is with D2D enabled; without D2D it seems like Firefox is just loading small parts of those fonts (I only made one test without D2D, though).

I have the same symptoms as Scott in comment 8, i.e. slow UI responsiveness (and high HDD activity) for the first 10-15 seconds after start-up of Firefox following a cold start/restart of the computer. I believe this is what this bug is about (according to the bug title at least).

I also have a similar system (modern hardware and well working W7 64-bit). 
I profiled the start-up with ProcessMonitor from sysinternals using a filter for read and write operations (great utility). I have logs available if it is useful.

Test case
1. Cold start of computer.
2. I waited 10 minutes so all pre-fetch etc after OS start-up had finished
3. Starting Firefox, there were only three tabs saved from the last Firefox session (this bug page, mozilla.org and an empty tab). D2D enabled.
4. It takes 5(?) seconds to see any UI/tabs, ok I guess.
5. Then UI is very slow for maybe 10 seconds, for example it is very slow to switch tabs.

What happens is that during the time the UI is slow and unresponsive, Firefox is loading 120MB of Chinese fonts. There are no other significant file operations during this time besides the font loading. The following fonts are loaded on my computer in this specific case (they seem to be loaded completely):

mingliub.ttc (30MB)
AdobeKaitiStd-Regular.otf (16MB)
mingliu.ttc (31MB)
msjh.ttf (21MB)
msyh.ttf (21MB)

If I start Firefox almost directly after W7 startup, even more big fonts seem to be loaded, and this seems to cause an even longer period of slow UI.

Unfortunately I haven't managed to remove these fonts from my system to verify that the font loading is the cause, W7 complains that they are system fonts that cannot be removed ( I have a UK version of W7).
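For anyone who wants to repeat the measurement in comment 17, a rough sketch of summing read sizes per file from a Process Monitor log follows. It assumes the log was exported as CSV with "Process Name", "Operation", "Path" and "Detail" columns, and that the Detail field of a ReadFile row contains a "Length: 4,096" style entry; check your own export before relying on this. The function name `bytes_read_per_file` is made up for the example.

```python
import csv
import io
import re
from collections import Counter

# Assumed Detail format for ReadFile rows: "Offset: 0, Length: 4,096, ..."
LENGTH_RE = re.compile(r"Length:\s*([\d,]+)")


def bytes_read_per_file(csv_text, process="firefox.exe"):
    """Return [(path, total_bytes_read), ...] for one process, largest first."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Process Name"].lower() != process.lower():
            continue  # some other process
        if row["Operation"] != "ReadFile":
            continue  # only count reads
        match = LENGTH_RE.search(row["Detail"])
        if match:
            # Strip thousands separators before converting to an integer.
            totals[row["Path"]] += int(match.group(1).replace(",", ""))
    return totals.most_common()
```

Feeding it the text of an exported log should put the large font files at the top of the list if they dominate the reads, as they did in the test case above.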
Comment 18 Csaba Kozák [:WonderCsabo] 2010-09-26 02:04:46 PDT
I tried to remove / rename these fonts. I have administrative privileges, but I cannot remove them even with those. :(
Comment 19 Peter Henkel [:Terepin] 2010-09-26 05:47:57 PDT
(In reply to comment #16)
> What do comments 13-15 have to do with this bug?  This bug is about high hard
> drive usage after startup with things like _sunspider_, which is about as far
> from "complex sites" as you can get.  Please stop hijacking bugs and file a
> separate bug, with clear steps to reproduce, on whatever issues you're seeing?

Stop accusing us of hijacking, it's insulting! The symptoms are similar; they are caused by the SAME problem. There is a word for creating several bugs caused by the same problem, and that word is duplication. This bug isn't about those symptoms; this bug is about discovering what is causing them.
Comment 20 Csaba Kozák [:WonderCsabo] 2010-09-26 08:01:14 PDT
(In reply to comment #17)

I removed these fonts from my system. The symptoms are still here. (UI responsiveness and huge HDD usage)
Comment 21 Boris Zbarsky [:bz] 2010-09-26 11:41:27 PDT
Peter, does disabling d2d make your problem better as it does for Thomas?  Or does comment 17 need to be spun off into a separate bug?

And my apologies for not realizing that comment 14 is from the reporter of this bug.  ;)
Comment 22 Peter Henkel [:Terepin] 2010-09-26 12:16:46 PDT
Nope, not at all. It's still slow and, btw, when I searched my history via the Location Bar, loading was so intensive that FX froze for a couple of seconds.
Comment 23 Boris Zbarsky [:bz] 2010-09-26 12:27:26 PDT
OK.  Thomas, could you please spin the D2D issue you see into a different bug, if it goes away with D2D disabled?
Comment 24 Emanuel Hoogeveen [:ehoogeveen] 2010-09-27 07:28:55 PDT
This is a huge problem for running Minefield from a pendrive on my university's computers. When opening a new tab, but also seemingly at random, Minefield will slow to a crawl for up to a minute as my pendrive blinks away doing some sort of I/O.

I could probably help with testing if required.
Comment 25 Peter Henkel [:Terepin] 2010-09-27 08:01:52 PDT
Could someone set it to NEW, please? This was confirmed by several users now.

Thanks.
Comment 26 Mike Beltzner [:beltzner, not reading bugmail] 2010-09-27 08:05:53 PDT
Peter, I'd set this to NEW if we had confirmed STR. There are several users reporting HDD usage, but not specific to "heavy" sites as per your original report, and not related to JavaScript.

I actually think this is a bunch of bugs. I'd like to evaluate them more, but perhaps individually.
Comment 27 Peter Henkel [:Terepin] 2010-09-27 08:37:46 PDT
(In reply to comment #26)
> Peter, I'd set this to NEW if we had confirmed STR. There are several users
> reporting HDD usage, but not specific to "heavy" sites as per your original
> report, and not related to JavaScript.

Please ignore my first post. As I said before, this bug isn't about a concrete problem (yet); for now it's "only" a discovery bug. I'm not denying that others might have different problems. However, we are still trying to figure out what is causing them, so the description, name and categories are temporary. Or perhaps not. My point is that the moment we discover that little ugly bug, I'll change the required information accordingly.
Comment 28 Boris Zbarsky [:bz] 2010-09-27 09:27:03 PDT
EHoogeveen, a separate bug with some data from a profiler or debugger indicating what's on the callstack would be useful for your case.  So would just a list of files being accessed; I assume Windows has a way of generating that (similar to strace or the like, say).  Note that what you see is totally different from what Thomas, say, sees (e.g. we're surely not loading _fonts_ off the pen drive).  It's also certainly not a memory/gc issue, unless your swap partition is on the pen drive.  I assume your profile and Firefox install are on the pen drive, so presumably one of those is being accessed.  Again, a clear bug on your issue with information about what's being accessed would help most.  Please mention the bug number here.

Peter, do you want to just turn this into a tracking bug for specific instances of intensive disk usage?  It sounds to me like you do.....
Comment 29 Peter Henkel [:Terepin] 2010-09-27 09:37:53 PDT
Not yet. My laptop will return from repair within three days; I'll run more tests as requested and we'll see.
Comment 30 Emanuel Hoogeveen [:ehoogeveen] 2010-09-27 13:58:57 PDT
bz, I'll see what I can do - I should have some time to look into this on Wednesday. WonderCsabo did mention he tried removing the fonts and it didn't help, so his issue might be closer to mine.
Comment 31 (dormant account) 2010-09-27 16:35:03 PDT
bz, do you know if we can avoid reading in all of the fonts on the system?

I just got an xperf profile from supernova_00 on irc. His profile seems to include an awful lot of font io. Since we read fonts on startup, we get punished pretty hard if the windows indexer or some other io-heavy process is going on in parallel. 

Is there any chance of moving font enumeration off the main thread? 
Marking this as main-thread-io based on the description.
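As a rough illustration of what "moving font enumeration off the main thread" could look like in general, here is a Python sketch (not Gecko code). The scan here only lists font files and their sizes, standing in for the real enumeration; `scan_fonts`, `scan_fonts_in_background` and the extension list are assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Assumed set of font file extensions for this sketch.
FONT_EXTENSIONS = (".ttf", ".ttc", ".otf")


def scan_fonts(font_dir):
    """Return (filename, size_in_bytes) for each font file, largest first."""
    found = []
    with os.scandir(font_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.lower().endswith(FONT_EXTENSIONS):
                found.append((entry.name, entry.stat().st_size))
    return sorted(found, key=lambda item: item[1], reverse=True)


def scan_fonts_in_background(font_dir, executor):
    """Start the scan on a worker thread and return a Future immediately,
    so the calling (main) thread is free to keep handling UI events."""
    return executor.submit(scan_fonts, font_dir)
```

A caller would create a `ThreadPoolExecutor(max_workers=1)`, call `scan_fonts_in_background(r"C:\Windows\Fonts", pool)` at startup, and only call `.result()` on the returned future when the font list is actually needed. Note that this keeps the main thread from blocking but does not reduce the amount of disk traffic, a point raised later in comment 41.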
Comment 32 (dormant account) 2010-09-27 16:40:55 PDT
Is anyone else that's experiencing this bug using the AVG antivirus?
Comment 33 Boris Zbarsky [:bz] 2010-09-27 19:22:39 PDT
Taras, I'm the wrong person to ask.  We should really spin the font thing off into a separate bug and have jfkthame or jdaggett look into it, imo....
Comment 34 Emanuel Hoogeveen [:ehoogeveen] 2010-09-30 07:39:06 PDT
Sorry I haven't been able to do any testing yet - it'll probably be the weekend at this rate. Did find out I can't do much on my university's computers as I would need an admin account. Still, testing this at home on a fast computer (with a slow memory stick) might actually be more telling.
Comment 35 (dormant account) 2010-09-30 12:23:24 PDT
Ok, so antivirus isn't enough to cause this behavior. Are any of the people experiencing this firefox sync users?
Comment 36 Emanuel Hoogeveen [:ehoogeveen] 2010-09-30 12:43:20 PDT
Yes. Although I wasn't able to do any quantitative testing, turning off Sync seemed to help a lot on the university computers. Why would opening a new tab cause so much synchronous I/O though? Shouldn't most of this stuff be kept in RAM anyway? (assuming Sync is a big part of the problem)
Comment 37 (dormant account) 2010-09-30 12:54:43 PDT
(In reply to comment #36)
> Yes. Although I wasn't able to do any quantitative testing, turning off Sync
> seemed to help a lot on the university computers. Why would opening a new tab
> cause so much synchronous I/O though? Shouldn't most of this stuff be kept in
> RAM anyway? (assuming Sync is a big part of the problem)

I'm pretty sure opening a new tab isn't causing grave amounts of io. It's more like a large amount of io is already going on due to either sync and/or font loading(trying to figure out the culprit). So whatever io that needs to happen for a new tab is blocked.

By opening a new tab do you mean trying to load a page in it, or just doing 'ctrl+t'
Comment 38 Emanuel Hoogeveen [:ehoogeveen] 2010-09-30 12:59:25 PDT
(In reply to comment #37)
> By opening a new tab do you mean trying to load a page in it, or just doing
> 'ctrl+t'

Loading an actual page - though again, no quantitative testing. I just started turning things off to see how it affected browser responsiveness and page load time, with some I/O statistics on in the background. Turning off browsing history seemed to help, but turning off sync seemed to help a lot more.
Comment 39 (dormant account) 2010-09-30 13:00:55 PDT
(In reply to comment #38)
> (In reply to comment #37)
> > By opening a new tab do you mean trying to load a page in it, or just doing
> > 'ctrl+t'
> 
> Loading an actual page - though again, no quantitative testing. I just started
> turning things off to see how it effected browser responsiveness and page load
> time, with some I/O statistics on in the background. Turning off browsing
> history seemed to help, but turning off sync seemed to help a lot more.

Ok so try turning off your disk cache. The problem is that there is a bunch of io happening in background, that's causing disk cache to be blocked on reads.
Comment 40 Shawn Wilsher :sdwilsh 2010-09-30 13:18:22 PDT
(In reply to comment #39)
> Ok so try turning off your disk cache. The problem is that there is a bunch of
> io happening in background, that's causing disk cache to be blocked on reads.
But those reads from the disk cache should not be happening on the main thread anymore after bug 513008.
Comment 41 (dormant account) 2010-09-30 13:20:48 PDT
(In reply to comment #40)
> (In reply to comment #39)
> > Ok so try turning off your disk cache. The problem is that there is a bunch of
> > io happening in background, that's causing disk cache to be blocked on reads.
> But those reads from the disk cache should not be happening on the main thread
> anymore after bug 513008.

threading is irrelevant, the site will still take forever to render if the disk is busy.
Comment 42 Shawn Wilsher :sdwilsh 2010-09-30 13:22:06 PDT
(In reply to comment #41)
> threading is irrelevant, the site will still take forever to render if the disk
> is busy.
Right.  I thought you were implying that the UI was blocked due to this, which would be surprising to me.
Comment 43 (dormant account) 2010-09-30 13:32:50 PDT
(In reply to comment #42)
> (In reply to comment #41)
> > threading is irrelevant, the site will still take forever to render if the disk
> > is busy.
> Right.  I thought you were implying that the UI was blocked due to this, which
> would be surprising to me.

I was. We do occasionally load xul/etc from disk. I agree the likelihood of that is lower than slow webpage rendering.
Comment 44 Emanuel Hoogeveen [:ehoogeveen] 2010-09-30 14:07:39 PDT
Do you mean Firefox's Offline Storage settings? (check "Override automatic cache management", Limit cache to 0 MB of space) I tried that (this is from memory) but it didn't seem to help matters much. Turning off Sync helped a lot more, and I don't understand why Sync would cause (a lot of) I/O every time you navigate to a new page. Again though, I've yet to do any quantitative measurement, so I don't know what files Firefox is actually accessing. I'll try to do that this weekend.
Comment 45 (dormant account) 2010-09-30 14:08:55 PDT
(In reply to comment #44)
> Do you mean Firefox's Offline Storage settings? (check "Override automatic
> cache management", Limit cache to 0 MB of space) I tried that (this is from
> memory) but it didn't seem to help matters much. Turning off Sync helped a lot
> more, and I don't understand why Sync would cause (a lot of) I/O every time you
> navigate to a new page. Again though, I've yet to do any quantitative
> measurement, so I don't know what files Firefox is actually accessing. I'll try
> to do that this weekend.

Sync does some io soon after startup. It shouldn't happen every time you navigate to a page.
Comment 46 Mike Shaver (:shaver -- probably not reading bugmail closely) 2010-09-30 14:10:01 PDT
Copying some sync people here, but I think this wants to be a separate bug WRT Sync causing pain for users on slow-I/O configurations.  Thanks a ton for your info here, Emanuel!
Comment 47 Emanuel Hoogeveen [:ehoogeveen] 2010-09-30 14:12:35 PDT
No problem, though I feel a bit uncomfortable talking about this from memory. I'll check what's actually going on soon, and file a bug with my findings as I promised bz (or post them here if that would be better). Hopefully I'll see similar behavior on my system as I did on the university computers.
Comment 48 Ed Lee :Mardak 2010-09-30 15:13:30 PDT
Sync does track TabOpen/TabSelect/TabClose events and writes a JSON file to disk. I believe the disk writing part is done on a delay.
Comment 49 Mike Shaver (:shaver -- probably not reading bugmail closely) 2010-09-30 15:18:37 PDT
On a delay is still likely to be when the user is going to be interacting with the browser.  Is it written on a background thread?  Why do we need to write to disk at all, if the point is to synchronize into the cloud?  How does this file differ from session-store state that we track already?
Comment 50 Shawn Wilsher :sdwilsh 2010-09-30 15:29:15 PDT
(In reply to comment #48)
> Sync does track TabOpen/TabSelect/TabClose events and writes a JSON file to
> disk. I believe the disk writing part is done on a delay.
We write out sessionstore stuff with NetUtil.asyncCopy so we aren't doing it on the main thread anymore.  Can sync do this too?
Comment 51 Shawn Wilsher :sdwilsh 2010-09-30 15:30:03 PDT
(In reply to comment #50)
> We write out sessionstore stuff with NetUtil.asyncCopy so we aren't doing it on
> the main thread anymore.  Can sync do this too?
For reference: http://mxr.mozilla.org/mozilla-central/source/browser/components/sessionstore/src/nsSessionStore.js#3596
Comment 52 Mike Connor [:mconnor] 2010-09-30 15:32:51 PDT
Given how tab sync works, I don't think this needs to be on disk at all.  The only thing we care about is ordering, and even that is rather tenuous for across-browser-session persistence.  It's even less useful once we fix bug 600991 as this was the core reason we cared about ordering at all, so I've filed bug 600993 to make tab sync never ever touch disk.  Ever.

There's some other jsonSave calls we should look at making async + smarter, but tabs is the busiest AND unnecessary, so we can fix that for sure.
Comment 53 Ed Lee :Mardak 2010-09-30 15:38:06 PDT
(In reply to comment #50)
> We write out sessionstore stuff with NetUtil.asyncCopy so we aren't doing it on
> the main thread anymore.  Can sync do this too?
Sounds reasonable. As mconnor pointed out, we probably don't need to write to disk for tab data, but for other engines it would need to stick around so that on restarts, Sync remembers what data hasn't been synced yet. But even then, potentially that could be batched and deferred to shutdown assuming no crashes.
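The idea discussed in comments 48-53 (track changes in memory, batch them, and write to disk away from the main thread) can be sketched in Python as below. This is not Sync's code and does not use NetUtil.asyncCopy; the class `DeferredJsonWriter` and its methods are hypothetical names for the illustration.

```python
import json
import os
import threading


class DeferredJsonWriter:
    """Keep only the newest state in memory and write it out in one batch."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()
        self._pending = None  # latest unsaved state, or None if nothing to save
        self.writes = 0       # number of completed disk writes

    def update(self, state):
        """Record the newest state in memory only; no disk access here."""
        with self._lock:
            self._pending = state

    def flush_async(self):
        """Write the pending state on a worker thread, if there is any.
        Returns the started thread, or None when nothing needed saving."""
        with self._lock:
            state, self._pending = self._pending, None
        if state is None:
            return None
        worker = threading.Thread(target=self._write, args=(state,))
        worker.start()
        return worker

    def _write(self, state):
        tmp = self._path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(state, f)
        os.replace(tmp, self._path)  # swap in the finished file in one step
        self.writes += 1
```

Calling `update` on every tab event and `flush_async` on a timer or at shutdown turns many small writes into a single one, at the cost of losing unsaved state if the process crashes first, which is the trade-off comment 53 mentions.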
Comment 54 Boris Zbarsky [:bz] 2010-09-30 16:42:18 PDT
Peter, do you have Sync enabled?  Or are all the sync comments extraneous to the bug you're seeing?
Comment 55 Peter Henkel [:Terepin] 2010-10-01 02:05:54 PDT
Yes, Sync is enabled. But UI is locking up when HDD is busy with something else. Even Windows is telling me that Minefield has stopped working.
Comment 56 u88484 2010-10-01 06:47:55 PDT
(In reply to comment #54)
> Peter, do you have Sync enabled?  Or are all the sync comments extraneous to
> the bug you're seeing?

I think I also have sync enabled.  In the options panel all the options are checked but the button beside my username says 'connect'.  Does that mean sync is enabled or not enabled, or is that for me to force a sync right then?  

Also, I've been working with taras on IRC and have provided a bunch of traces.  I haven't provided one with sync disabled yet though.  So far it seems loading fonts and AVG running were the biggest reasons for my slow startup.  I think my places.sqlite database was/is also another factor.
Comment 57 Philipp von Weitershausen [:philikon] 2010-10-01 06:50:54 PDT
(In reply to comment #56)
> (In reply to comment #54)
> > Peter, do you have Sync enabled?  Or are all the sync comments extraneous to
> > the bug you're seeing?
> 
> I think I also have sync enabled.  In the options panel all the options are
> checked but the button beside my username says 'connect'.  Does that mean sync
> is enabled or not enabled, or is that for me to force a sync right then?  

It means Sync is enabled and tracking changes to your profile.
Comment 58 Boris Zbarsky [:bz] 2010-10-01 08:50:36 PDT
Peter, as an experiment, can you try disabling Sync and seeing whether the problem continues for you?
Comment 59 John Daggett (:jtd) 2010-10-03 20:11:25 PDT
Peter, if you have your machine back, what's the difference in the performance you see with FF4beta vs. FF3.6?  Others seem to be saying it's much worse with FF4, is that your experience?
Comment 60 Peter Henkel [:Terepin] 2010-10-04 02:27:05 PDT
Yes, it's worse.
Comment 61 (dormant account) 2010-10-04 15:58:52 PDT
(In reply to comment #60)
> Yes, it's worse.

Peter, could you record some xperf profiles for ff4 and ff3.6?
Comment 62 Boris Zbarsky [:bz] 2010-10-04 17:43:56 PDT
Joe, why did you just resummarize this bug to be about the font issue?  Comment 22 says that disabling d2d didn't fix this bug for the reporter....

If the dwrite issue is something you want to block on, it needs to be spun off into a separate bug, no, while we try to figure out what Peter is actually seeing here?
Comment 63 Shawn Wilsher :sdwilsh 2010-10-04 17:59:31 PDT
(In reply to comment #62)
> If the dwrite issus is something you want to block on, it needs to be spun off
> into a separate bug, no, while we try to figure out what Peter is actually
> seeing here?
I thought that was being handled by bug 600713 (kinda at least)
Comment 64 Boris Zbarsky [:bz] 2010-10-04 18:43:34 PDT
Peter, still waiting for data from you on whether turning off Sync helps...
Comment 65 Boris Zbarsky [:bz] 2010-10-04 20:24:40 PDT
Yeah, Shawn's right.  Undoing the summary changes Joe made, pending sorting out what Peter is actually seeing.
Comment 66 Peter Henkel [:Terepin] 2010-10-05 02:43:32 PDT
(In reply to comment #64)
> Peter, still waiting for data from you on whether turning off Sync helps...

Nope, still the same.
Comment 67 Emanuel Hoogeveen [:ehoogeveen] 2010-10-05 04:19:46 PDT
It appears my university's AV is not helping matters - I can't change its settings, but looking at its scanning statistics I see it scanning all sorts of files from Firefox - sessionstore-1.js, for instance. Obviously there's little FF can do about this; I guess many small files (that can be scanned quickly) might be better in this case than a few large ones, but I think the focus has actually been on reducing the number of files FF uses.
Comment 68 Emanuel Hoogeveen [:ehoogeveen] 2010-10-05 06:34:00 PDT
Some additional results: there's the obvious problem of loading all the font files, but in addition to that I seem to be getting long stalls related to a file called localstore-1.rdf. This file seems to be temporary as I can't find it in my profile, but I've had it deadlock the browser for as long as 30 seconds (probably related to the AV, but other files don't seem nearly as bad). I'm not sure if this happens often or just once after the browser is started.
Comment 69 Peter Henkel [:Terepin] 2010-10-05 07:55:42 PDT
(In reply to comment #61)
> (In reply to comment #60)
> > Yes, it's worse.
> 
> Peter, could you record some xperf profiles for ff4 and ff3.6?

Even with "tutorial" I have absolutely no idea how to do it.
Comment 70 Boris Zbarsky [:bz] 2010-10-05 08:05:31 PDT
OK, so going back to Peter's problem for a second... just to make sure:

1)  Turning off sync doesn't help.
2)  Turning off dwrite doesn't help.
3)  There's no antivirus involved.

Right?
Comment 71 Peter Henkel [:Terepin] 2010-10-05 08:17:41 PDT
Right (with AV's real-time detection disabled).
Yesterday I was downloading at around 2.5 MBps. Browsing with Firefox at the same time was like a walk in hell. Every damn action, like a new tab, new site, bookmarks, new menu, AOM, was causing Firefox to freeze, and if the freeze was long enough, it broke the title bar, so I had to restart FX quite often.
Comment 72 Boris Zbarsky [:bz] 2010-10-05 08:23:27 PDT
OK.  Let's talk about this xperf thing, then.  Do the steps under "XPerf" at http://blog.mozilla.com/tglek/2010/10/04/diagnosing-slow-startup/ make more sense?  If not, what's the first step that doesn't make sense?
Comment 73 Peter Henkel [:Terepin] 2010-10-05 08:38:10 PDT
OK, did it. Which of that information do you require?
Comment 74 Boris Zbarsky [:bz] 2010-10-05 09:10:53 PDT
Taras?  What do you want from Peter here?
Comment 75 (dormant account) 2010-10-05 09:39:24 PDT
(In reply to comment #73)
> OK, did it. Which of those information you require?

Can you email me the .etl file(compress it first)?
Comment 76 (dormant account) 2010-10-05 09:40:32 PDT
(In reply to comment #72)
> OK.  Let's talk about this xperf thing, then.  Do the steps under "XPerf" at
> http://blog.mozilla.com/tglek/2010/10/04/diagnosing-slow-startup/ make more
> sense?  If not, what's the first step that doesn't make sense?

Peter can you also try to vacuum your places as linked on there? How big is your places.sqlite file?
Comment 77 Peter Henkel [:Terepin] 2010-10-05 10:04:03 PDT
60 MB before vacuum, 50 MB after vacuum.
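For reference, vacuuming a SQLite database such as places.sqlite comes down to running SQLite's VACUUM statement on it. A minimal Python sketch is below; `vacuum_database` is a name made up for this example, and it should only be run on a copy of the file or with Firefox closed, since the database must not be in use.

```python
import os
import sqlite3


def vacuum_database(path):
    """Run VACUUM on the database and return (size_before, size_after) in bytes."""
    before = os.path.getsize(path)
    conn = sqlite3.connect(path)
    try:
        # VACUUM rebuilds the file, dropping pages freed by earlier deletes.
        conn.execute("VACUUM")
    finally:
        conn.close()
    return before, os.path.getsize(path)
```

The returned pair makes it easy to report the same kind of before/after figures as above.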
Comment 78 (dormant account) 2010-10-05 10:06:59 PDT
(In reply to comment #77)
> 60 MB before vacuum, 50 MB after vacuum.

Is the performance any better?
Comment 79 Peter Henkel [:Terepin] 2010-10-05 10:09:57 PDT
Nope.
Comment 80 (dormant account) 2010-10-05 12:48:25 PDT
Peter,
Comodo firewall is showing a lot of activity, try disabling that. There is a lot of weird non-firefox activity going on in your trace. This is probably the cause of your slowdown.


Your plugin-container is also pretty active, try disabling plugins if above doesn't help.
Comment 81 (dormant account) 2010-10-05 12:49:00 PDT
If you are still having issues, please send me another trace without Comodo firewall running.
Comment 82 Peter Henkel [:Terepin] 2010-10-05 12:52:36 PDT
That means uninstalling CIS. *Sigh*, oh well, I'll do it tomorrow and post the results of my test.
Comment 83 Peter Henkel [:Terepin] 2010-10-06 04:53:45 PDT
OK, disabled CIS. Small improvement in start-up time, no improvement in HDD usage.
I have 10 plugins, 4 are disabled, rest are essential for web browsing.
Comment 84 (dormant account) 2010-10-06 09:58:36 PDT
(In reply to comment #83)
> OK, disabled CIS. Small improvement in start-up time, none improvement in HDD
> usage.
> I have 10 plugins, 4 are disabled, rest are essential for web browsing.

Can you do another xperf trace?
Comment 85 (dormant account) 2010-10-06 10:00:01 PDT
(In reply to comment #84)
> (In reply to comment #83)
> > OK, disabled CIS. Small improvement in start-up time, none improvement in HDD
> > usage.
> > I have 10 plugins, 4 are disabled, rest are essential for web browsing.
> 

Also, I'm not asking you to disable plugins permanently. Just disable them for a few minutes to see if anything changes(and maybe record another xperf trace).
Comment 86 (dormant account) 2010-10-11 13:08:33 PDT
(In reply to comment #83)
> OK, disabled CIS. Small improvement in start-up time, no improvement in HDD
> usage.
> I have 10 plugins, 4 are disabled, rest are essential for web browsing.

From the last profile that Peter sent me, this looks like bug 600713 on a slightly busier than usual system with a slow hard drive. That causes slow startups. Looks like at least 20 seconds of IO due to loading Firefox, extensions and fonts.

Beyond that there is some activity in places.sqlite that causes a 4s delay during use.
Comment 87 Peter Henkel [:Terepin] 2010-10-12 05:27:47 PDT
So it's a dupe of that bug then?
Comment 88 (dormant account) 2010-10-12 06:01:39 PDT
(In reply to comment #87)
> So it's a dupe of that bug then?

Maybe. Is Firefox still HDD-intensive if you wait 5 minutes after startup before using it? If so, that's likely the same bug.
Comment 89 Peter Henkel [:Terepin] 2010-10-12 06:04:33 PDT
OK, closing this bug for now. If the fix for bug 600713 isn't enough, I'll reopen it.
Thanks for the help!

*** This bug has been marked as a duplicate of bug 600713 ***
Comment 90 John Daggett (:jtd) 2010-10-12 06:23:05 PDT
Actually, I think we should leave this as is.  There are now two bugs that affect the problems reported here, so it's not really a simple duplicate.  Better to leave this as a meta bug for those other two for now.
Comment 91 (dormant account) 2010-11-19 12:41:02 PST
Peter, can you try this build
http://people.mozilla.com/~tglek/startup/firefox-4.0b8pre.en-US.win32.zip

I commented out a bunch of font loading code. Make sure gfx.font_rendering.directwrite.enabled is set to false.

Curious if this performs better for the page rendering case.
Comment 92 Peter Henkel [:Terepin] 2010-11-20 10:14:12 PST
Can't, it crashes before startup.
Comment 93 John Daggett (:jtd) 2010-12-01 05:20:28 PST
On bug 602792, I've posted some patches to deal with some of the heavy I/O associated with cold startup when Direct2D/DirectWrite is enabled.

I'd appreciate it if those with machine environments affected by this could check out this tryserver build and note whether this improves their cold startup times or not:

  http://bit.ly/gEARUZ

See bug 602792, comment 59 for details and an example testcase to use.

The build dumps font startup timing info to a file, 'fonttiming.out' in the bin directory.  This includes time spent in InitFontList (e.g. GetSystemFontCollection) along with timestamps of all font table reads.

I've seen a lot of variability in DirectWrite performance.  Some machines seem to do fine, pulling most data out of a font cache even after cold startup.  Other environments are more adversely affected, especially low-spec machines and those with lots of crapware running (i.e. large numbers of services that are constantly interacting with the file system, file-level virus scanners etc.)
Comment 94 u88484 2010-12-01 17:28:51 PST
(In reply to comment #93)
> The build dumps font startup timing info to a file, 'fonttiming.out' in the bin
> directory.

I can't find this file on my system.  I downloaded the win32 zip non-debug build.
Comment 95 John Daggett (:jtd) 2010-12-01 17:52:54 PST
(In reply to comment #94)
> I can't find this file on my system.  I downloaded the win32 zip non-debug
> build.

After you run Firefox, the file will be created in the same directory as the exe (not the bin dir, that's only for dev builds, my mistake).

Do you see improved cold startup times with the tryserver build?  Also, are you running with a file-scanning virus checker running?
Comment 96 u88484 2010-12-01 17:57:42 PST
(In reply to comment #95)
> (In reply to comment #94)
> > I can't find this file on my system.  I downloaded the win32 zip non-debug
> > build.
> 
> After you run Firefox, the file will be created in the same directory as the
> exe (not the bin dir, that's only for dev builds, my mistake).
> 
Yeah, still not there.  I searched all of C:\ for fonttiming, fonttiming.* and fonttiming.out

> Do you see improved cold startup times with the tryserver build?  
Between two and three seconds.  Total time went from 43 to just over 40 seconds, but I have other startup issues that I've been working with Taras about.

> Also, are you running with a file-scanning virus checker running?
Nope, I got rid of AVG a while ago to shave almost a minute off my startup times.
Comment 97 John Daggett (:jtd) 2010-12-01 18:07:11 PST
(In reply to comment #96)
> (In reply to comment #95)
> > (In reply to comment #94)
> > > I can't find this file on my system.  I downloaded the win32 zip non-debug
> > > build.
> > 
> > After you run Firefox, the file will be created in the same directory as the
> > exe (not the bin dir, that's only for dev builds, my mistake).
> > 
> Yeah, still not there.  I searched all of C:\ for fonttiming, fonttiming.* and
> fonttiming.out

Hmm, that's curious.  It should be right next to the firefox.exe executable in the tryserver build directory on your machine.

> > Do you see improved cold startup times with the tryserver build?  
> Between two and three seconds.  Total time went from 43 to just over 40
> seconds but I have other startup issues that I've been working with Taras
> about.

Might help to use Process Monitor to track exactly what's going on:

  http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx

Run it just before starting Firefox after a reboot and it will capture all system events, including I/O events.
Comment 98 Benjamin Smedberg [:bsmedberg] 2010-12-15 14:45:30 PST
Not going to block on the meta bug at this point. If problems remain after the dependencies land, please renominate with data.
Comment 99 Peter Henkel [:Terepin] 2011-01-23 08:16:11 PST
With the fixes from bug 602792, startup is much better; however, HDD usage during page rendering remains unchanged. I have a feeling this bug is about something completely different.
Comment 100 Scott A. 2011-01-23 12:33:23 PST
(In reply to comment #99)
> With fixes from bug 602792 startup is much better, however HDD usage during
> page rendering remains untouched. I have a feeling this bug is about
> something completely different.

Page rendering does do a lot of churning on my slower laptop drive, occasionally causing a pause before the page renders. Maybe it should be split into a separate bug that deals with disk cache and IO.
Comment 101 Peter Henkel [:Terepin] 2011-01-23 12:42:49 PST
This bug was and still is about this issue. Could one of the devs comment on this, please?
Comment 102 John Daggett (:jtd) 2011-01-23 16:35:16 PST
Scott, Peter, I think the best thing would be to run Process Monitor to diagnose what's going on during FF startup.

  http://technet.microsoft.com/en-us/sysinternals/bb896645

Steps for use, after installing Process Monitor:

1. Set up Minefield to load with your typical start pages
2. Reboot
3. Run Process Monitor, reset the filter to default if necessary
4. Run Minefield
5. Wait until all pages are loaded
6. Stop event logging in Process Monitor (click on the magnifier, third icon from the left)

This will record *all* events on the system, which is important to verify if other events are taking place concurrently, such as other services doing lots of io during startup.

Filtering is really simple: move your mouse over any item in the event list (e.g. appname) and right click to include/exclude that in the filter.  From the "Tools" menu, select "File Summary..." to get a detailed report of io.  Note that only io for the currently filtered events will be summarized.

It would be really interesting to get a breakdown of file io for the entire system while Firefox is loading (i.e. unfiltered) and for the Firefox process itself (Firefox-only filter applied).

If you need any help with this, let me know.
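The per-file breakdown described above can also be approximated offline from a Process Monitor CSV export. This is a minimal sketch, assuming the export has the usual "Process Name", "Operation", "Path" and "Detail" columns and that read/write events carry a "Length: N" field in Detail; the function name summarize_io is hypothetical and the output is not a substitute for the built-in File Summary report.

```python
import csv
import io
import re
from collections import Counter

# Matches the byte count in a Detail field such as "Offset: 0, Length: 4,096"
LENGTH_RE = re.compile(r"Length:\s*([\d,]+)")


def summarize_io(csv_file, process_name=None):
    """Return [(path, total_bytes), ...] sorted by bytes read/written, largest first.

    process_name=None keeps every process (the unfiltered, whole-system view);
    pass e.g. "firefox.exe" for the Firefox-only view.
    """
    totals = Counter()
    for row in csv.DictReader(csv_file):
        if row.get("Operation") not in ("ReadFile", "WriteFile"):
            continue
        name = row.get("Process Name", "")
        if process_name and name.lower() != process_name.lower():
            continue
        match = LENGTH_RE.search(row.get("Detail", ""))
        if match:
            totals[row.get("Path", "")] += int(match.group(1).replace(",", ""))
    return totals.most_common()


# Example (file name is hypothetical; utf-8-sig skips a leading BOM if present):
#   with open("Logfile.CSV", newline="", encoding="utf-8-sig") as f:
#       for path, nbytes in summarize_io(f, "firefox.exe")[:20]:
#           print(f"{nbytes:>12,}  {path}")
```

Running it once with no process filter and once with "firefox.exe" gives the two views asked for: io for the whole system during startup, and io for the Firefox process alone.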
Comment 103 John Daggett (:jtd) 2011-01-24 01:06:17 PST
As noted in bug 602792, comment 103, you might also try cleaning out font cache files.  The larger those are, the greater the chance that the FontCache service will fail to start up quickly enough and the dwrite client code will do a full font folder read.  For faster machines, having those font cache files will speed up rendering, but on slower machines this appears to cause very slow cold startup times.
Comment 104 Peter Henkel [:Terepin] 2011-05-09 12:44:38 PDT
I have finally recorded a video showing the described problem: http://www.youtube.com/watch?v=NaRDZf5pwYM
Comment 105 (dormant account) 2011-12-08 11:46:58 PST
John, I think your hardcoding + glyph cache work will take care of this. Can you close/dup this bug?
Comment 106 (dormant account) 2012-01-19 15:54:10 PST
(In reply to Taras Glek (:taras) from comment #105)
> John, I think your hardcoding + glyph cache work will take care of this. Can
> you close/dup this bug?

Leaving it as snappy:P1 until then
Comment 107 Amarjeet Singh Rai 2012-01-27 07:29:59 PST
Needs to be re-opened.

That has been happening to me for a very long time.
I finally found out what it is.

I have a lot of fonts. So I have to wait a long time for this.

Only load the fonts needed. Please fix this.

It has made me use Firefox less and less.
Comment 108 Amarjeet Singh Rai 2012-02-27 15:22:31 PST
This is a duplicate of this: https://bugzilla.mozilla.org/show_bug.cgi?id=705594
Comment 109 John Daggett (:jtd) 2012-05-06 19:10:20 PDT
This has been fixed by bug 705594: we no longer enumerate cmaps on font fallback when using DirectWrite.  Instead, we use a custom DirectWrite text renderer to extract the appropriate fallback font.
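
To illustrate why this matters for disk I/O, here is a toy model, not Gecko's actual code. It assumes a font's cmap (character map) must be read from disk before its coverage is known, and contrasts a fallback that scans cmaps font by font with one that asks a system-level resolver for a single font name. All names (ToyFont, fallback_by_cmap_scan, fallback_via_system, system_lookup) are hypothetical.

```python
class ToyFont:
    """Stand-in for an installed font; cmap_reads counts simulated disk reads."""

    def __init__(self, name, codepoints):
        self.name = name
        self._codepoints = frozenset(codepoints)
        self.cmap_reads = 0

    def load_cmap(self):
        # In a real system this is where the font file would be touched.
        self.cmap_reads += 1
        return self._codepoints


def fallback_by_cmap_scan(fonts, codepoint):
    """Read cmaps one font at a time until one covers the codepoint."""
    for font in fonts:
        if codepoint in font.load_cmap():
            return font
    return None


def fallback_via_system(fonts, codepoint, system_lookup):
    """Ask a resolver for a font name, then read only that font's cmap."""
    name = system_lookup(codepoint)
    for font in fonts:
        if font.name == name and codepoint in font.load_cmap():
            return font
    return None
```

In this model the scan touches every font ahead of the match, so its cost grows with the number of installed fonts, while the resolver path touches one font regardless of how many are installed.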
