Open Bug 484799 Opened 15 years ago Updated 1 year ago

make {package,installer} should rebase on windows

Categories

(Firefox Build System :: General, enhancement)

x86
Windows XP
enhancement

Tracking

(Not tracked)

People

(Reporter: ted, Unassigned)

References

(Blocks 1 open bug, )

Details

(Keywords: helpwanted, Whiteboard: [ToDo: comment 40+41+47+48] [ts][win])

Attachments

(4 files, 2 obsolete files)

There's a "rebase" target in the top-level Makefile (that no longer gets called). It used to get called via the "deliver" target by Tinderbox:
http://mxr.mozilla.org/mozilla/source/tools/tinderbox/post-mozilla-rel.pl#181

AFAICT, that means we're no longer rebasing our Windows builds. I think this is probably bad for startup time. We should make the packaging target do this on Windows builds.
Bug 479978 comment 22:
{
From  Ted Mielczarek (:luser)   2009-03-23 08:44:12 PDT

Remove it from the deliver: target below as well.
}
Fwiw,

http://mxr.mozilla.org/mozilla-central/search?string=rebase&case=on
{
/toolkit/mozapps/installer/packager.mk
    * line 257 -- rebasedlls* \
}

http://mxr.mozilla.org/comm-central/search?string=rebase&case=on&find=%2FMakefile%5C.in%24
{
/Makefile.in
    * line 50 -- ...profiledbuild check rebase splitsymbols upload...
}
Yeah, neither of those are anything useful.
ted, how can we test what effect on startup this might have? also, can you elaborate on what needs to be done to fix this?
Whiteboard: [ts]
Attached patch Roughed in (obsolete) — Splinter Review
As a quick rough guess, I'd say since we used to rebase right after signing NSS, you'd copy-paste a little like this, push it to the try server, see that http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1255400576.1255407103.18806.gz&fulltext=1 says it rebased, and then wait eight or ten hours for a Talos run.
Since Dietrich knows how to run standalone Talos, it'd be easier than that:
1) Grab any Windows nightly or release build, run in Talos for a baseline
2) Open a MozillaBuild shell, cd to the directory where you installed the build, and run:
/bin/find . -name "*.dll" > rebase.lst
rebase -b 60000000 -R . -G rebase.lst
3) Run the newly rebased build through standalone Talos again.

I have no idea where the 0x60000000 base address came from, presumably someone plucked it out of thin air. I don't even know how you choose a decent address these days with ASLR.

philor: just two things with that patch. a) I'd move the rebase outside of the PKG_SKIP_STRIP block, since they're not logically connected. b) Would you mind removing the signnss and rebase targets from the top-level Makefile.in, since nothing calls them anymore anyway?
0x60000000 is recommended by a CodeProject post I read on this; I wouldn't be at all surprised if that is the origin of this.  I believe 0x10000000 is the default base address that MSVC links DLLs at.
Maybe we should pick a different address then, since every app in the world that gets rebased is likely to wind up at that address. ;-)
I don't think it matters; it's all virtual and offset from the base of the process.  Likely 0x60000000 is chosen as being large enough not to step on the main app's toes and small enough to leave space for all DLLs.  We can/should experiment with other values, but I'm guessing this one is fine.  Also, it's easy to verify whether or not DLLs are being rebased: MSVC's debugger will warn you about whether or not the DLL has been loaded at its expected base address.

Whatever changes we make here might need to be reflect by our soccorro(sp?) data (on Win32, at least), has anyone accounted for that?
Ah, so it's just an offset from the process base. I had no idea that was how that worked!

Also, I don't see how this is in any way related to our crash reporting stuff. That uses the module GUID that the linker encodes in the DLL, which is completely separate AFAIK.
(In reply to comment #7)
> philor: just two things with that patch. a) I'd move the rebase outside of the
> PKG_SKIP_STRIP block, since they're not logically connected.

Actually, whoever fixes this really ought to move it somewhere completely different, since I put it in stage-package, so we'd get the ~1% win on Talos, but not in the installers we actually ship.
Bleh, you're right. What a mess.
Summary: make package should rebase on windows → make {package,installer} should rebase on windows
Assignee: nobody → ted.mielczarek
We need to be careful: I don't think we want to rebase twice (once for the ZIP and once for the installer), because it's possible for rebase to be non-deterministic.
This adds a rebase to both the stage-package and installer-stage targets, and also removes the unused targets (deliver, signnss, rebase) from the top-level Makefile. I tested this locally and it looks good, but I've also pushed it to the try server just to be sure.
Attachment #405982 - Attachment is obsolete: true
Attachment #407016 - Flags: review?(benjamin)
Status: NEW → ASSIGNED
(In reply to comment #14)
> We need to be careful: I don't think we want to rebase twice (once for the ZIP
> and once for the installer), because it's possible for rebase to be
> non-deterministic.

How so?
Incidentally, I tried rebasing manually on WinCE to see if my modules would get loaded where I expected them, and didn't have any luck in a cursory go.  I'll try some different offsets and see if I can find the magic number for this.
Is the ordering of files returned by /bin/find deterministic? I did not think it was.
Oh, I suppose not. We could pipe that through | sort to make it stable. Would that fix your concern?
I think so... I hope that rebase itself is deterministic, and I can't find any documentation either way.
Attachment #407016 - Flags: review?(benjamin) → review-
i tried updating the patch w/ input from ted, but patch no longer applies cleanly. ted, can you update this when you're able?
Sure, I uploaded a rebased patch for you yesterday, maybe you missed that:
http://people.mozilla.com/~tmielczarek/package-rebase.patch
But if you don't have time I'll fix this when I get home.
Bah, no i missed it, was on/off IRC. will try to land this weekend.
Note that the patch still needs review.
Oh, i guess i was reading comment #20 over-optimistically :)
I stuck a sort in between the find and the rebase. This WFM and should make it deterministic as far as we know.
Attachment #407016 - Attachment is obsolete: true
Attachment #411497 - Flags: review?(benjamin)
Attachment #411497 - Flags: review?(benjamin) → review+
Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/228506b717f7
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment on attachment 411497 [details] [diff] [review]
rebase with sort first
[Backout: Comment 48 + Comment 40]

"approval1.9.2=?":
No risk, startup perf increase.
Attachment #411497 - Flags: approval1.9.2?
Flags: in-testsuite-
Target Milestone: --- → mozilla1.9.3a1
This c-c patch is fine as-is, whether m-1.9.2 is fixed or not.
Attachment #414604 - Flags: review?(bugzilla)
Comment on attachment 414611 [details] [diff] [review]
(Cv1) Remove obsolete 'rebasedlls*'
[Checkin: Comment 42]

Whatever. I'm sure there's plenty other cruft in that particular list.
Attachment #414611 - Flags: review?(ted.mielczarek) → review+
Keywords: checkin-needed
Whiteboard: [ts] → [c-n: Cv1 to m-c after 'Firefox 3.6 RC code freeze'] [ts]
Attachment #411497 - Flags: approval1.9.2? → approval1.9.2+
Comment on attachment 411497 [details] [diff] [review]
rebase with sort first
[Backout: Comment 48 + Comment 40]


http://hg.mozilla.org/releases/mozilla-1.9.2/rev/d17a52e750da
Attachment #411497 - Attachment description: rebase with sort first → rebase with sort first [Checkin: Comment 28 & 33]
This patch broke Fennec's nightly Windows desktop builds:

From http://tinderbox.mozilla.org/showlog.cgi?log=Mobile/1259917201.1259921629.17579.gz&fulltext=1

Stripping package directory...
signing nss libraries
Rebasing  ../../dist/fennec
/bin/find  ../../dist/fennec -name "*.dll" -a -not -name "MSVC*" | sort | xargs rebase -b 60000000
/bin/sh: ../../dist/bin/shlibsign: No such file or directory
/bin/sh: ../../dist/bin/shlibsign: No such file or directory

REBASE: Total Size of mapping 0x0000000000e50000
REBASE: Range 0x0000000060000000 -0x0000000060e50000
make[2]: Leaving directory `/e/builds/moz2_slave/w32mob-1.9.2-nightly/mozilla-1.9.2/objdir/mobile/mobile/installer'
make[1]: Leaving directory `/e/builds/moz2_slave/w32mob-1.9.2-nightly/mozilla-1.9.2/objdir/mobile/mobile/installer'
make[2]: *** [stage-package] Error 123
make[1]: *** [all] Error 2
make: *** [package] Error 2
And rather amusingly, broke Fennec's nightly Windows desktop mozilla-central builds when it landed on mozilla-central almost three weeks ago.
If you run that command manually on a Windows desktop build, can you see if it succeeds or fails?

Are the Fennec desktop builds using MSVC8 or 9?
It looks like MSVC9.
mikel in #mobile would like to see this fixed or backed out since he needs to test his addon in windows fennec for the addon contest on monday.  passing that on.
reverted d17a52e750da on 1.9.2 due to Fennec build bustage.
(In reply to comment #34)
> signing nss libraries
> /bin/sh: ../../dist/bin/shlibsign: No such file or directory
> /bin/sh: ../../dist/bin/shlibsign: No such file or directory

Fwiw,
http://mxr.mozilla.org/mozilla-central/search?string=SIGN_NSS&case=on
http://mxr.mozilla.org/mozilla-central/search?string=shlibsign&case=on
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 414611 [details] [diff] [review]
(Cv1) Remove obsolete 'rebasedlls*'
[Checkin: Comment 42]


http://hg.mozilla.org/mozilla-central/rev/dc2e649c530f
Attachment #414611 - Attachment description: (Cv1) Remove obsolete 'rebasedlls*' → (Cv1) Remove obsolete 'rebasedlls*' [Checkin: Comment 42]
Keywords: checkin-needed
Whiteboard: [c-n: Cv1 to m-c after 'Firefox 3.6 RC code freeze'] [ts] → [ts]
Attachment #414604 - Flags: review?(bugzilla) → review+
Comment on attachment 414604 [details] [diff] [review]
(Bv1-CC) Remove obsolete m-c targets
[Checkin: Comment 43]


http://hg.mozilla.org/comm-central/rev/a02bc393512e
Attachment #414604 - Attachment description: (Bv1-CC) Remove obsolete m-c targets → (Bv1-CC) Remove obsolete m-c targets [Checkin: Comment 43]
Whiteboard: [ts] → [ToDo: comment 40+41] [ts]
Attachment #411497 - Attachment description: rebase with sort first [Checkin: Comment 28 & 33] → rebase with sort first [Checkin: Comment 28; m-1.9.2 backout: Comment 40]
Flags: wanted1.9.2?
I don't think we should take this on 1.9.2 given the problems it caused with releases + the problems it's causing mobile. It didn't show any measurable perf improvement on Talos. In fact, we should probably just back it out on trunk to fix Fennec builds.
Flags: wanted1.9.2? → wanted1.9.2-
(In reply to comment #44)
> In fact, we should probably just back it out on trunk to
> fix Fennec builds.

That would be the Mobile team's vote as well.
Well, was the Fennec build failure investigated/explained?
FTR, rebasing the NSS libraries requires regenerating the .chk files, or else FIPS mode is broken. See bug 521849.
Blocks: 521849
(In reply to comment #44)
> I don't think we should take this on 1.9.2 given the problems it caused with
> releases + the problems it's causing mobile. It didn't show any measurable perf
> improvement on Talos. In fact, we should probably just back it out on trunk to
> fix Fennec builds.

This patch also breaks our new Windows build machines for some reason. Given that, and this comment, I've backed it out on trunk with changeset:   38306:936010e7309f
If someone else wants to investigate the issues here, that's fine, but I'm not going to. Even when we had it in it didn't make a noticeable difference in Ts. Perhaps it has better effects on machines with lots of other modules mapped, but without some clear data here it doesn't seem worth the effort.
Assignee: ted.mielczarek → nobody
Status: REOPENED → NEW
Attachment #411497 - Attachment description: rebase with sort first [Checkin: Comment 28; m-1.9.2 backout: Comment 40] → rebase with sort first [Backout: Comment 48 + Comment 40]
Makefile.in cleanups shouldn't have needed to be backed out.
Ted, shall I push them again?
Sure, feel free, they're just dead code removal.
Attachment #428368 - Attachment description: (Dv1) Makefile.in obsolete target removals part only → (Dv1) Makefile.in obsolete target removals part only [Checkin: Comment 52]
Severity: normal → enhancement
Whiteboard: [ToDo: comment 40+41] [ts] → [ToDo: comment 40+41] [ts] []
Target Milestone: mozilla1.9.3a1 → ---
Fwiw:
http://msdn.microsoft.com/en-us/library/ms810432.aspx
Rebasing Win32 DLLs: The Whole Story
(But it was 1995: Windows 95 and Windows NT 3.51...)
Keywords: helpwanted
Whiteboard: [ToDo: comment 40+41] [ts] [] → [ToDo: comment 40+41+47+48+53] [ts]
Fwiw,

http://www.drdobbs.com/windows/184416272
Rebasing Win32 DLLs
{
Applications (.exe files) start at 0x00400000 for all compilers that I tested.
they will never need to be relocated.

(gives more technical details (wrt dlls) than the following article)
}

http://www.codeproject.com/KB/DLL/rebase.aspx
Using the Rebase utility in project makefile
{
A Dynamic-link library created in Microsoft Visual C++ has default base address 0x10000000.
this takes additional time.
when the Memory Manager needs memory pages for something else, it saves pages with the Dll code to the system paging file.
}

http://dependencywalker.com/
Dependency Walker 2.2
{
I tried it on a recent SeaMonkey 2.1 nightly: (static, not libxul)
0x00400000 seamonkey.exe 
0x10000000 mozjs.dll, mozsqlite3.dll, nspr4.dll, nss3.dll, ..., xpcom_core.dll
}

I can't speak about actual performance (depending on user memory availability/usage too), but, based (;-)) on the explanation, it looks like we do still want to rebase!
Whiteboard: [ToDo: comment 40+41+47+48+53] [ts] → [ToDo: comment 40+41+47+48] [ts]
Blocks: 447581
Whiteboard: [ToDo: comment 40+41+47+48] [ts] → [ToDo: comment 40+41+47+48] [ts][win]
Product: Core → Firefox Build System
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.