Closed Bug 909506 Opened 11 years ago Closed 7 years ago

Add a non-recursive build system generator to the tree

Categories

(Firefox Build System :: General, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: ehsan.akhgari, Unassigned)

Attachments

(1 file)

We've been working on an idea to give us extremely fast incremental non-recursive builds. The code is hosted at <https://github.com/bgirard/hacky.mk> but the basic idea is pretty simple. Here are the goals and non-goals:

Goals:
* Bring the overhead of no-op builds to <1s on all platforms.
* Build non-recursively, never leave a core unused.
* Focus on the Gecko hackers' edit/compile/test cycle by making the compile step finish as fast as possible.
* Focus on a pragmatic solution. Where you need to decide between implementation niceties and faster builds, prefer the latter.
* Focus on being a temporary workaround for making the lives of developers better, and have it as our goal to remove this solution when our current build system can handle the first two goals here.
* Facilitate other nice benefits, such as generating Visual Studio projects.

Non-goals:
* Handle clobber builds or builds you do after pulling in a new tree with tons of changes. Those are a one-time cost for developers.
* Handle all of our build rules. We'd like to focus only on improvements affecting Gecko C++ hackers, namely C++ compilation and linking.

Right now, the hacky.mk project can generate ninja projects which pretty much have the first two properties in the Goals section. It can also generate Visual Studio projects. It does not as of yet handle anything except C++ compilation and linking, and it does not currently have fallback mechanisms to handle the case where a Makefile/moz.build file changes. The latter is something that we will fix before we advertise this to the larger developer audience.

Hacky.mk works by modifying the rules.mk file to capture the command lines used to compile and link C++ code, and from that it generates things like ninja files. It uses the same header dependency tracking mechanism as the current build system.
For now, we would like to land the hacky.mk backend, disabled by default initially, and iterate over it in the tree before we enable it for everyone by default.
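To make the "capture command lines, then generate ninja files" idea concrete, here is a minimal sketch in Python. It is not hacky.mk's actual code: the record layout (`output`, `inputs`, `command`) and the function names `ninja_escape` and `generate_ninja` are assumptions made for illustration, and header dependency tracking is left out entirely.

```python
def ninja_escape(path):
    # Ninja treats '$', ' ' and ':' specially in paths, so escape them.
    # '$' is handled first so the escapes added afterwards are not doubled.
    return path.replace("$", "$$").replace(" ", "$ ").replace(":", "$:")


def generate_ninja(captured):
    """Turn captured build commands into the text of a build.ninja file.

    `captured` is a hypothetical list of dicts, one per compile or link
    step, each with 'output', 'inputs' and 'command' keys.
    """
    lines = [
        # One generic rule; every build statement supplies its own command.
        "rule cmd",
        "  command = $cmd",
        "  description = BUILD $out",
        "",
    ]
    for entry in captured:
        out = ninja_escape(entry["output"])
        ins = " ".join(ninja_escape(i) for i in entry["inputs"])
        lines.append(f"build {out}: cmd {ins}")
        # A literal '$' in a variable value must be written as '$$'.
        lines.append("  cmd = " + entry["command"].replace("$", "$$"))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    example = [
        {"output": "a.o", "inputs": ["a.cpp"],
         "command": "c++ -c a.cpp -o a.o"},
        {"output": "prog", "inputs": ["a.o"],
         "command": "c++ a.o -o prog"},
    ]
    print(generate_ninja(example))
```

Because every step ends up in one flat file, ninja sees the whole dependency graph at once, which is what makes the non-recursive, keep-every-core-busy goal possible.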
I just discussed this again with glandium on irc. He basically objects to the rules.mk modifications the most, and he said he'd be willing to consider an alternate version of this which would use a recurse.mk modification like this to extract the build command lines during every build as opposed to just clobber builds.
Note that I don't actually know how recurse.mk is used yet. :-)
I no longer have any plans to work on this. The idea is out there, I hope somebody else will make it happen.
The suggested recurse.mk mechanism for recording cflags, etc. is flawed in that it won't properly report target-specific variables. For those, we need hacks in rules.mk, similar to what hacky.mk does today. Or, we can remove target-specific variables (even better).

I was working on porting hacky.mk to the tree in a less... hacky way. My solution didn't require a full build to gather the metadata. Instead, it traversed the tree without building C++ files. Unfortunately, that still took ~40s to run. ~40s overhead to generate Ninja, Visual Studio, etc. projects seems steep.

Unless we can get the recurse.mk method working, we'll need to move all variables and logic around formulating compiler commands into Python, i.e. we'll need to port large parts of config.mk to Python. We need to do this anyway - it's rather unfortunate the timeline couldn't be delayed further.

FWIW, the more we look into pymake, the more obvious it is that it's beyond saving. I want an alternative build backend (at least for C++) in Q4. Expect this to be discussed during goal setting.
(In reply to Gregory Szorc [:gps] from comment #4)
> The suggested recurse.mk mechanism for recording cflags, etc is flawed in
> that it won't properly report target-specific variables. For those, we need
> hacks in rules.mk, similar to what hacky.mk does today. Or, we can remove
> target-specific variables (even better).

I support removing target-specific variables. Very very much.

> I was working on porting hacky.mk to the tree in a less... hacky way. My
> solution didn't require a full build to gather the metadata. Instead, it
> traversed the tree without building C++ files. Unfortunately, that still
> took ~40s to run. ~40s overhead to generate Ninja, Visual Studio, etc
> projects seems steep.

Hm, that doesn't seem steep at all. That's a one-time cost; 40s sounds fantastic! What am I missing?
That's way faster than configure...
That 40s cost would be spent every time a moz.build file changed [and the build backend needed updating]. This could be prohibitively expensive for some local development scenarios. Although, I suppose we could provide it as an option and those that didn't mind the 40s lag could use that build backend. Choices are king.
I don't expect that changing a moz.build file is so common that it would be a big deal to anyone except perhaps build system hackers. It's certainly not normally in the tight edit/compile/test cycle.
Yeah, absolutely. A 40s cost to pay for changing a moz.build file is one I (and I suspect any other platform dev) would happily pay to get 1s no-op builds otherwise.

(In reply to Gregory Szorc [:gps] from comment #7)
> Choices are king.

No! Bad! Choices are *awful*. Choices are only useful if there are two equally important cases that are totally orthogonal to each other. Otherwise, pick one and run with it. Having choices and multiple paths is partially how we ended up with the current build system mess :(
(In reply to Benjamin Smedberg [:bsmedberg] from comment #8)
> I don't expect that changing a moz.build file is so common that it would be
> a big deal to anyone except perhaps build system hackers. It's certainly not
> normally in the tight edit/compile/test cycle.

We are pulling more and more data into moz.build files. Recently it is test manifests. Soon it will be jar.mn. Will people tolerate a 40s delay when updating a test manifest or jar.mn file? There is a very vocal minority that is already upset about 3s moz.build traversal times. Avoiding pissing those people off even more is why I say there should be a choice to make a tradeoff.

We'd only be talking about a "preferred" backend choice for as long as the new backend isn't better than the current in every way imaginable. We're going to need choices anyway, as history has told us RelEng won't be able to deploy Ninja, Tup, etc to release automation machines in a timeline that appeases us. I'd rather not delay 1s no-op builds until RelEng is ready nor create more fire drills for RelEng. That being said, if whatever we do decreases build times drastically, I would expect RelEng to adjust priorities to yield said benefits sooner.

Hopefully most of these choices can be made automatically in configure (or equivalent). Maybe we can design things so the 40s overhead isn't realized for changes that don't apply (i.e. ignore jar.mn and test manifest only changes). This slightly complicates things, but is doable.
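The "ignore jar.mn and test manifest only changes" idea could be sketched roughly as below. This is an illustration in Python, not anything that exists in the tree: the function name, the `NON_COMPILE_NAMES` and `NON_COMPILE_SUFFIXES` constants, and the assumption that test manifests are recognizable by an `.ini` suffix are all hypothetical.

```python
import os

# Hypothetical: build-definition files that never change C++ compile or
# link command lines, so the expensive backend can skip regeneration.
NON_COMPILE_NAMES = {"jar.mn"}
# Assumption: test manifests are identified by their file suffix.
NON_COMPILE_SUFFIXES = (".ini",)


def needs_compile_backend_regen(changed_paths):
    """Return True if any changed build input could affect compilation.

    `changed_paths` is the list of build-definition files that changed
    since the backend was last generated.
    """
    for path in changed_paths:
        name = os.path.basename(path)
        if name in NON_COMPILE_NAMES or name.endswith(NON_COMPILE_SUFFIXES):
            continue  # jar.mn or test manifest only: no impact on C++ builds
        return True  # e.g. a moz.build or Makefile change
    return False


if __name__ == "__main__":
    print(needs_compile_backend_regen(["browser/jar.mn"]))
    print(needs_compile_backend_regen(["dom/moz.build", "browser/jar.mn"]))
```

The check is deliberately conservative: anything not explicitly known to be irrelevant triggers a regeneration, so a wrong guess costs 40s rather than producing a stale backend.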
(In reply to Gregory Szorc [:gps] from comment #10)
> (In reply to Benjamin Smedberg [:bsmedberg] from comment #8)
> > I don't expect that changing a moz.build file is so common that it would be
> > a big deal to anyone except perhaps build system hackers. It's certainly not
> > normally in the tight edit/compile/test cycle.
>
> We are pulling more and more data into moz.build files. Recently it is test
> manifests. Soon it will be jar.mn. Will people tolerate a 40s delay when
> updating a test manifest or jar.mn file? There is a very vocal minority that
> is already upset about 3s moz.build traversal times.

Which people, and where are they being vocal? And what are they building where 3s moz.build traversal is noticeable to them? (Other than the fact that the files are being traversed and how long that took is visibly exposed.) We need to get them represented explicitly and figure out what their needs are.
> We are pulling more and more data into moz.build files. Recently it is test
> manifests. Soon it will be jar.mn. Will people tolerate a 40s delay when
> updating a test manifest or jar.mn file?

Do you expect that this will happen much? I don't imagine that this will be a common occurrence during the tight edit/compile/test cycles.

> We'd only be talking about a "preferred" backend choice for as long as the
> new backend isn't better than the current in every way imaginable. We're
> going to need choices anyway, as history has told us RelEng won't be able to
> deploy Ninja, Tup, etc to release automation machines in a timeline that
> appeases us.

On anything other than a very short timeline, that's not good engineering. Maintaining multiple systems and especially diverging what developers do and what build automation does is an obvious recipe for messes. Release engineering serves the needs of developers and if we get a system that does <1s rebuilds, we can expect releng to support that with whatever new tools are necessary.

> Maybe we can design things so the 40s overhead isn't realized
> for changes that don't apply (i.e. ignore jar.mn and test manifest only
> changes). This slightly complicates things, but is doable.

That would be a good followup at least.

Note: pymake was not engineered to be a production-grade make system. It was a technology focused on being able to understand make syntax for conversion. It's kinda neat that we've been able to use it for Windows parallel builds, but I'm not opposed to something else that's better. (I am opposed to going back to msys paths and make). But if we could re-engineer pymake to fix whatever its problems are, that would be good too.
People complain a lot in IRC. There are a few distinguished engineers in that list. I tend to pay attention to their concerns. FWIW, I forgot that we recently regressed from ~3s to ~8s (on my machine) when we added manifest parsing. That inspired bug 922517.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #5)
> (In reply to Gregory Szorc [:gps] from comment #4)
> > The suggested recurse.mk mechanism for recording cflags, etc is flawed in
> > that it won't properly report target-specific variables. For those, we need
> > hacks in rules.mk, similar to what hacky.mk does today. Or, we can remove
> > target-specific variables (even better).
>
> I support removing target-specific variables. Very very much.

That's not possible, we need them for things like sse and neon. We need to move them to moz.build instead.
I don't see anything actionable in this bug. So closing. We've made a lot of progress on build backend optimizations since this bug was last updated. `mach build faster` has existed for over a year. And there is currently work around moving compiler flags computation out of make files. This will enable generation of lots of new build backends for compiling.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Is there a meta bug we can follow if we want to know when such new backends are unblocked? (Seems like bug 847009 blocks on a lot more stuff than would be needed.)
Flags: needinfo?(gps)
I found bug 1362612.
Flags: needinfo?(gps)
And bug 774049 (which hasn't been updated in a while but it is the most appropriate bug to track things).
Product: Core → Firefox Build System