Closed Bug 522770 Opened 10 years ago Closed 10 years ago

Implement enough of fakelibs to work around MSVC limitations

Categories

(Firefox Build System :: General, defect, P4)

x86
Windows XP
defect

Tracking

(Not tracked)

RESOLVED FIXED
mozilla2.0b3

People

(Reporter: bent.mozilla, Assigned: khuey)

References

()

Details

Attachments

(1 file, 3 obsolete files)

MSVC won't link incrementally if any static libs have changed, and that basically means libxul will never link quickly. We should fix that.
The basic idea here is that we shouldn't be making "real" static libraries. Instead, where we would make it, make a text file with a list of .obj files. And where we would link it, list each file on the final commandline.

We'd need some way to separate out the "static libraries which actually need to be static libraries" (libxpcomglue_s.a and the few others which we actually ship in the SDK) from the ones which merely exist to be intermediate products of a broken system.
Assignee: nobody → ted.mielczarek
Status: NEW → ASSIGNED
Attached patch WIP (incomplete) (obsolete) — Splinter Review
I started hacking on this because I thought it would help me with something else. The easy solution didn't quite seem to work, so I started down this route, but I don't have time to finish right now. Note that I only tested this on Windows with pymake. I think msys GNU make will be difficult, since @filename gets mangled with absolute pathnames. Also, fixing all the LIB_SUFFIX -> FAKELIB_SUFFIX stuff will be long and torturous, and difficult to ensure that I've fixed all the cases for different build configurations.
Status: ASSIGNED → NEW
Summary: Make incremental linking work by removing static libs or providing a direct object list instead → Make incremental linking work by removing static libs and providing a direct object list instead
Would it be possible to at least make it work for gcc in the meantime to see how it affects mac builds?
It's not any more or less work to make it work for one or the other, both compilers support the exact same syntax for using response files.
Attached patch working patch (obsolete) — Splinter Review
This works on Linux. It's the same underlying trick, but a simpler approach. We produce a .fake lib alongside each static lib, instead of producing just the fake lib. This way, we can just use $(wildcard) to use .fake libs if present, and fall back to regular static libs when not, so I don't have to globally search and replace stuff. This also makes it easier because I can just ignore stuff like host binaries since they don't matter as much.

I have this ifdef'ed off on Windows+GNU make right now, since I don't know how to work around the MSYS path munging. Should work with PyMake, but I need to test.
Attachment #415628 - Attachment is obsolete: true
Actually this doesn't seem sufficient for linking libxul, since it sticks stuff in SHARED_LIBRARY_LIBS and I'm not munging that. Trying the simple thing of munging SHARED_LIBRARY_LIBS fails because we do all that silliness about unpacking and repacking static libraries, so we then wind up with fake static libs referencing object files that don't exist. :-/
If we could make this technique work reliably everywhere (msys gmake, mostly) we could probably ditch all the ugliness behind SHARED_LIBRARY_LIBS in rules.mk, since there's no point in repacking static libs if you don't need the static libs themselves.
Attached patch mostly working patch (obsolete) — Splinter Review
Ok, this seems to do the trick here. I need to do a local clobber, and I've also pushed it to try to see if I broke anything else.

I fixed the SHARED_LIBRARY_LIBS thing by:
a) Making it so that if we have .fake equivalents of all our SHARED_LIBRARY_LIBS, we just cat their contents instead of unpacking the static libs
b) If we don't (and some things like libffi we won't, since we're building them with their own build system), then just don't delete the object files after unpacking the static lib, so the new .fake lib we're building can reference them
Attachment #435150 - Attachment is obsolete: true
Ted, can we get this in bsmedberg's review queue?
I think it broke on try, but I forgot to see what broke. I should push it again and pay attention.
Going to double-check this this week.
This is a potential fix to bug 582335, which is currently busting all opt Windows builds on TraceMonkey. Ted's latest patch doesn't compile and I'd like to see if this would help 582335 or not -- any chance someone can fix it enough so that it compiles on Windows? Or if that's a ton of trouble, I could probably get what I need by invoking link.exe manually -- I just need to know what that would look like with this patch, I'm having trouble decoding it.
Ted might be able to fix it up reasonably quickly, but it would definitely take me a while to dive into it and fix it up.  To do this manually you need to gather up a list of the .obj files that end up in gklayout.lib, and then invoke link for xul.dll with all of those .obj files instead of gklayout.lib.

Running |dumpbin -ARCHIVEMEMBERS| on gklayout.lib should give you the list you're looking for.
Thanks for the reply, Kyle, I'll give it a try.
Bumping the severity on this because it's impacting TraceMonkey opt builds (tracked in bug 582335) -- they haven't had any since http://hg.mozilla.org/tracemonkey/rev/80382d88b92c, which apparently tipped us over the edge of needed address space when linking libxul. I tested a libxul link that replaced gklayout.lib with all of the things which go into gklayout.lib, which worked fine.

If someone has the time and energy to push this along, that be really helpful. TraceMonkey is unable to merge back to mozilla-central in the meantime.

Ted has an updated version of this patch (last updated early July) in his mq repo: http://hg.mozilla.org/users/tmielczarek_mozilla.com/mq
Severity: normal → major
I'll make this happen.
Assignee: ted.mielczarek → me
Severity: major → critical
Status: NEW → ASSIGNED
Priority: -- → P1
mozilla-central is currently closed because of this.  I have a patch in hand that I'm testing now and will get post-hoc review on once it is complete.
Severity: critical → blocker
I landed the following four changesets on mozilla-central

http://hg.mozilla.org/mozilla-central/rev/e9c9b7d21a0c
http://hg.mozilla.org/mozilla-central/rev/d71692333564
http://hg.mozilla.org/mozilla-central/rev/070d9d46d88b
http://hg.mozilla.org/mozilla-central/rev/ef5b7c79da4a

I'll collect these into a single patch and place it up for review at some point after some sleep.  The tree is stable and relatively healthy.
FWIW, I think the tree can reopen, but I don't have the magical powers to do that.
Severity: blocker → normal
Priority: P1 → P4
Depends on: 583584
Blocks: 583591
Comment on attachment 435597 [details] [diff] [review]
mostly working patch

>diff --git a/configure.in b/configure.in
>--- a/configure.in
>+++ b/configure.in
>@@ -7979,14 +7979,9 @@
>     AC_SUBST(PNG_FUNCTIONS_FEATURE)
>     AC_SUBST(QT_SURFACE_FEATURE)
> 
>-    if test "$_WIN32_MSVC"; then
>-        MOZ_CAIRO_LIBS='$(DEPTH)/gfx/cairo/cairo/src/mozcairo.lib $(DEPTH)/gfx/cairo/libpixman/src/mozlibpixman.lib'
>-    else
>-        MOZ_CAIRO_LIBS='$(DEPTH)/gfx/cairo/cairo/src/$(LIB_PREFIX)mozcairo.$(LIB_SUFFIX) $(DEPTH)/gfx/cairo/libpixman/src/$(LIB_PREFIX)mozlibpixman.$(LIB_SUFFIX)'" $CAIRO_FT_LIBS"
>-
>-        if test "$MOZ_X11"; then
>-            MOZ_CAIRO_LIBS="$MOZ_CAIRO_LIBS $XLDFLAGS -lXrender -lfreetype -lfontconfig"
>-        fi
>+    MOZ_CAIRO_LIBS='$(call EXPAND_LIBNAME_PATH,mozcairo,$(DEPTH)/gfx/cairo/cairo/src) $(call EXPAND_LIBNAME_PATH,mozlibpixman,$(DEPTH)/gfx/cairo/libpixman/src)' $CAIRO_FT_LIBS
>+    if test "$MOZ_X11"; then
>+        MOZ_CAIRO_LIBS="$MOZ_CAIRO_LIBS $XLDFLAGS -lXrender -lfreetype -lfontconfig"
>     fi
> 
>     CAIRO_FEATURES_H=gfx/cairo/cairo/src/cairo-features.h
When removing the test for Win32_MSVC were those double-quotes around $CAIRO_FT_LIBS removed intentionally? Asking because with the missing double-quotes autoconf.mk gets an empty MOZ_CAIRO_LIBS entry on OS/2.
This seems to have completely painted comm-central red, what changes are needed to fix that?
(In reply to comment #20)
> Comment on attachment 435597 [details] [diff] [review]
> mostly working patch
> 
> >diff --git a/configure.in b/configure.in
> >--- a/configure.in
> >+++ b/configure.in
> >@@ -7979,14 +7979,9 @@
> >     AC_SUBST(PNG_FUNCTIONS_FEATURE)
> >     AC_SUBST(QT_SURFACE_FEATURE)
> > 
> >-    if test "$_WIN32_MSVC"; then
> >-        MOZ_CAIRO_LIBS='$(DEPTH)/gfx/cairo/cairo/src/mozcairo.lib $(DEPTH)/gfx/cairo/libpixman/src/mozlibpixman.lib'
> >-    else
> >-        MOZ_CAIRO_LIBS='$(DEPTH)/gfx/cairo/cairo/src/$(LIB_PREFIX)mozcairo.$(LIB_SUFFIX) $(DEPTH)/gfx/cairo/libpixman/src/$(LIB_PREFIX)mozlibpixman.$(LIB_SUFFIX)'" $CAIRO_FT_LIBS"
> >-
> >-        if test "$MOZ_X11"; then
> >-            MOZ_CAIRO_LIBS="$MOZ_CAIRO_LIBS $XLDFLAGS -lXrender -lfreetype -lfontconfig"
> >-        fi
> >+    MOZ_CAIRO_LIBS='$(call EXPAND_LIBNAME_PATH,mozcairo,$(DEPTH)/gfx/cairo/cairo/src) $(call EXPAND_LIBNAME_PATH,mozlibpixman,$(DEPTH)/gfx/cairo/libpixman/src)' $CAIRO_FT_LIBS
> >+    if test "$MOZ_X11"; then
> >+        MOZ_CAIRO_LIBS="$MOZ_CAIRO_LIBS $XLDFLAGS -lXrender -lfreetype -lfontconfig"
> >     fi
> > 
> >     CAIRO_FEATURES_H=gfx/cairo/cairo/src/cairo-features.h
> When removing the test for Win32_MSVC were those double-quotes around
> $CAIRO_FT_LIBS removed intentionally? Asking because with the missing
> double-quotes autoconf.mk gets an empty MOZ_CAIRO_LIBS entry on OS/2.

No, that's a mistake.  I'll put them back.

(In reply to comment #21)
> This seems to have completely painted comm-central red, what changes are needed
> to fix that?

Not sure, looking.
The comm-central bustage appears unrelated, but I'm keeping an eye on it.
(In reply to comment #22)
> No, that's a mistake.  I'll put them back.
Thanks
 
> (In reply to comment #21)
> > This seems to have completely painted comm-central red, what changes are needed
> > to fix that?
> 
> Not sure, looking.

With re-adding the quotes for CAIRO_FT_LIBS I was able to build a running Seamonkey on OS/2 - indicating that indeed the comm-central redness is not due to Kyle's checkin here.
(In reply to comment #24)
> (In reply to comment #22)
> > No, that's a mistake.  I'll put them back.
> Thanks
> 
> > (In reply to comment #21)
> > > This seems to have completely painted comm-central red, what changes are needed
> > > to fix that?
> > 
> > Not sure, looking.
> 
> With re-adding the quotes for CAIRO_FT_LIBS I was able to build a running
> Seamonkey on OS/2 - indicating that indeed the comm-central redness is not due
> to Kyle's checkin here.

Thanks for catching this, it broke our Android builders too.  I just didn't notice because they don't show up on the main TBPL.
Attached patch As checked inSplinter Review
This is what I've checked in, including the followups and Jacek's cross compile patch.  There's probably some good room for improvement (I don't think the doubleslashing is necessary on Windows, for instance).
Attachment #435597 - Attachment is obsolete: true
Attachment #461925 - Flags: superreview?
Attachment #461925 - Flags: review?
Attachment #461925 - Flags: superreview?(benjamin)
Attachment #461925 - Flags: superreview?
Attachment #461925 - Flags: review?(mitchell.field)
Attachment #461925 - Flags: review?
Hmm, seems that SeaMonkey and Thunderbird on Windows are still red - same patch needed in the comm-central build system to get things working there? Or something else?
(In reply to comment #27)
> Hmm, seems that SeaMonkey and Thunderbird on Windows are still red - same patch
> needed in the comm-central build system to get things working there? Or
> something else?

We talked on IRC, but for everyone else's benefit, it's being worked on.

Also, we probably shouldn't install the foo.lib.fake files to dist/lib.
(In reply to comment #28)
> Also, we probably shouldn't install the foo.lib.fake files to dist/lib.

Why not? A lot of things link libraries from dist/lib. (Presumably we don't want them to go to dist/sdk/, which is stuff we'd package in the xulrunner SDK.)
(In reply to comment #29)
> (In reply to comment #28)
> > Also, we probably shouldn't install the foo.lib.fake files to dist/lib.
> 
> Why not? A lot of things link libraries from dist/lib. (Presumably we don't
> want them to go to dist/sdk/, which is stuff we'd package in the xulrunner
> SDK.)

I seem to remember running into a lot of issues with @<abs path in dist/lib with forward slashes here> on Windows.  They're probably fixable.
I'm having a problem building on windows that might be related to these changes - I get the following error build js/src/shell - LINK : fatal error LNK1104: cannot open file 'tbsmart:\objdir-tb\mozilla\js\src\
jsapi.obj' 

/tbsmart is the root of my tree. I've started with a clean obj-dir. Is this problem due to this patch?
Attachment #461925 - Flags: review?(mitchell.field) → review+
Attachment #461925 - Flags: superreview?(benjamin) → superreview+
Since I mostly hijacked this bug, I'm going to close it as fixed and open followups for everything else that needs to be done, both in terms of fallout from my changes and actually fixing this bug as filed.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Summary: Make incremental linking work by removing static libs and providing a direct object list instead → Implement enough of fakelibs to work around MSVC limitations
Target Milestone: --- → mozilla2.0b3
Blocks: 602842
Blocks: 608498
No longer blocks: 584476
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.