Open Bug 459891 (build-perf) Opened 16 years ago Updated 2 years ago

Mozilla Firefox build time on Windows is extremely slow

Categories

(Firefox Build System :: General, defect)

x86
Windows Vista
defect

Tracking

(Not tracked)

People

(Reporter: plam173, Unassigned)

References

(Depends on 4 open bugs, )

Details

(Whiteboard: [tracking][buildfaster:p3])

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3

Mozilla Firefox build time on Windows is extremely slow. Compared to other platforms, the build time on Windows can take up to 2-3 times longer. I am currently trying to look for the source of this problem and possibly enhance the build time.

Reproducible: Always

Steps to Reproduce:
1.
2.
3.
Version: unspecified → 3.0 Branch
Assuming you've done the obvious stuff like disable virus-scan and use FAT32 instead of NTFS, the next steps are all "hard". Although if you have the ability to create profiles of where we're spending time, that could be useful.
Assignee: nobody → plam173
Status: UNCONFIRMED → NEW
Ever confirmed: true
Version: 3.0 Branch → Trunk
Well, right now, I have managed to create a simple profile which just calculates where time was spent according to the directories that are 1 level down from root. I am going to try and build a more detail profile soon. But for now, this is what I have.

http://zenit.senecac.on.ca/wiki/index.php/Profile_the_build_system#Release_0.1_-_Oct_13
Patrick: I think Benjamin just thought this was a bug report, instead of a bug about the work you're doing. We often get people complaining that the build is slow, or things are broken.
Oh, oops, sorry for not making the bug descriptioin clear or filing the bug properly. I'm still not too familiar with bugzilla and I guess I should have been more descriptive in the description.

Right now, the info I have posted is just for the rebuilt of mozilla. I'm going to analyze the rest of my data and have the built time from scratch up as well within this week.

Ted: Any pointers what the next step I should do now?
Patrick: so that chart is from a rebuild after doing a full build, with nothing changed? That's the case we're most interested in, I think, since in a build from scratch, most of the time is spent compiling. In a rebuild with nothing changed, most of the time is wasted, since ideally it would take no time at all.

Shaver had some ideas for speeding things up a bit, I might have a patch you can try sometime this week for comparison.

One thing I'd love to see, if you can figure out how (perhaps using Process Explorer? http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx) is a list of all the programs that get executed in a build (or rebuild) and how much time they take up. I suspect part of our slowness is due to lots of process spawning, but it's hard to know if that's a real problem or not. Clearly we run 'make' itself way too many times, but there may be some other things that the build system does that we could make better.
Added a few dependencies on things I'm currently working on that can help here.
It'd be nice to get a log of something like:
pid, ppid, start time, end time, process name

for every process in the build.  We may be able to use some of the work done for http://www.bootchart.org/ to process the logs.

Not sure if Process Explorer or some other app could get this data for us on windows.  If not, we may be able to write our own:
http://msdn.microsoft.com/en-us/library/aa390425%28vs.85%29.aspx
I managed to find Process Monitor (http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx) which I think does the trick. I tracks down the process name, pid, ppid, time, duration, and many other things.

I'll try to get a log posted for a rebuild of firefox with no change asap.

Any hints on what to do after this point?
Patrick: making things faster is sort of dependent on knowing what makes it slow. I've marked a bunch of bugs as dependent bugs here, I think they'll all have a positive effect on build time. Ideally, you could produce some data for us showing what is taking the most time in a rebuild currently, and if it's not something that's already covered by a bug, figure out what to fix, file a new bug and fix it. :)
I've got a CSV file containing the logs of all the processes during the rebuild. The file is rather big (112MB, not sure where to post it) and I'm not sure what to do with the data. The log has the Process Name, PID, PPID, Time started, Duration, Operation, Command Line, and Path.

It captured the following process: basename.exe, bash.exe, cat.exe, cmp.exe, cp.exe, dirname.exe, iconv.exe, make.exe, makensis.exe, mkdir.exe, mv.exe, nsinstall.exe, perl.exe, python.exe, rm.exe, sed.exe, sh.exe, tar.exe, touch.exe, uname.exe, xpidl.exe, xpt_link.exe, zip.exe.

There is a total of 624229 processes captures in this log and 98346 of them are make.exe. And the Operations called by make.exe are: Load Image, Process Create, Process Exit, Process Start, Thread Create, and Thread Exit.
I'm not sure why, but all the processes in this log had a duration of 0 even down to 7 decimal points. So I don't even know how to go about calculating the time used by the processes.
To be clear: this log is just a do-nothing rebuild, right?

Is it possible to get a log with the working directory also?

Sounds like the duration information is bogus... many of the intermediate make invocations will have long duration, but will spend little or no actual CPU time; they will just sit around and wait for other sub-makes to finish. Is it possible to collect total CPU time per process, and/or the time the process ended?
Yes, this log is for a do-nothing rebuild.

Working directory as in where in the mozilla code the process is being called?

Maybe the duration is for how long it needs to invoke the process? And not the actual duration of the process from start to exit. Let me dig around and see if I can get the CPU time for the processes.
Well... the working directory will help tell us where in the code it was called... but it's just the result of GetWorkingDirectory() for the process.
I still haven't been able to log their current working directory, but I was able to get their process time now.

As of now, with the profiling done so far, I suspect that the cause for the slowness in compiling is due to the excessive make call throughout the whole structure. So I believe this is related to bug 461444 (remove cases of excessive recursion in makefiles) and 462463 (make dependencies smarter).

I'll try to get the working directories of the make.exe and see if there is such a problem exisiting in the directories using a lot of time during the build.
Assignee: plam173 → nobody
Sorry guys, I had a project change and won't be able to spend time on this project.
This is interesting: I ran trace profiling of a no-change rebuild of xpcom/ on solaris. A few things stand out:

1) when DIRS isn't set we're still launching a shell to recurse into subdirs: it just happens to do nothing:

`/bin/sh -c set -e; true`

2) We're checking for jar.mn in each directory with a shell command: we could avoid launching the subshell by using a $(wildcard jar.mn) check

3) running mddepend.pl is by far our largest consumer of CPU
Depends on: 466486
Depends on: 466492
Product: Firefox → Core
QA Contact: build.config → build-config
Depends on: 485412
Depends on: 518136
Depends on: 717620
Blocks: 755684
Whiteboard: [buildfaster:?]
Severity: enhancement → normal
Whiteboard: [buildfaster:?] → [tracking][buildfaster:p3]
Depends on: 832472
Product: Core → Firefox Build System
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: