Builds spit out by tinderbox (Atlantia and Triton) crash on startup. I will attach the Apple log to the bug for information purposes. I also submitted one Talkback report - TB16010765H.
Marcia: Give this build a try: http://ftp.mozilla.org/pub/mozilla.org/mozilla/nightly/2006-03-06-12-1.7/
Status: NEW → ASSIGNED
Assignee: nobody → preed
Status: ASSIGNED → NEW
Marcia reports the above build does install and doesn't immediately crash. The fix was adding --with-macos-sdk=/Developer/SDKs/MacOSX10.2.8.sdk to the mozconfig for this build on atlantia.
Status: NEW → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
Reopening the bug. The builds consistently crash using 10.4.5, but don't crash using 10.3.9. I confirmed this on two separate machines running 10.4.5. When I crashed on the second machine I got an Apple report similar to the one attached to the bug, which referenced: Symbol not found: _PR_Realloc. Adding Mark to the bug, and I will investigate 10.2 as well.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Per preed's request, setting the blocking flag for 1.7.13 on this bug. Testing on 10.2.8, I do not get the hang when running the build, but rather a "confirm migration" dialog window comes up and the spinning wheel of death continues, but no crash. Only way to get out is to force quit. I started my testing in all instances with a fresh profile.
Summary: Mac 1.7.13 candidate builds crash on startup → [Mac 10.4.5] Mac 1.7.13 candidate builds crash on startup
Oh, duh. The 1.7 branch has weaker SDK support that makes it difficult to completely weed out dependencies on the NSPR implementation erroneously exported by the system libraries. Aviary 1.0.x builds were produced by 10.3 systems using the 10.2 SDK, so I'd think someone would have hit this already. The one significant difference between those builds and this build is that those, as well as most other release builds, were static. In a static build, libgfx_mac (per Marcia's crash log) and many other libs don't get their NSPR symbols resolved until the final app link. That's significant here because the library order is different when linking libgfx_mac.dylib and linking the app with a static libgfx_mac.a: in the former case, the system libraries come BEFORE the NSPR libraries, creating the painful broken dependency at issue here. In the latter case, the NSPR libraries come first, so the app won't depend on the system's copy of NSPR. I suggest flipping the --enable-static switch. Someone should then validate that on all links that include NSPR, NSPR comes before the system libs. This can be done by scrutinizing the build log, or by running otool on Mach-O files. If this MUST be done anti-statically, we'll need some Makefile tweaks to fix the link order.
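To illustrate the link-order point above, here is a toy model in Python. It is only a sketch, not how the Mach-O linker is implemented: it assumes the linker binds each undefined symbol to the first library on the link line that exports it. The library labels "system" and "nspr" and the export tables are hypothetical; only the symbol _PR_Realloc comes from the crash report.

```python
def bind_symbol(symbol, link_order, exports):
    """Return the first library in link order that exports the symbol."""
    for lib in link_order:
        if symbol in exports.get(lib, set()):
            return lib
    return None  # nothing on the link line defines the symbol


# Hypothetical export tables as seen at build time: the system libraries
# erroneously export an NSPR symbol alongside the real NSPR library.
build_time_exports = {
    "system": {"_PR_Realloc", "_malloc"},
    "nspr": {"_PR_Realloc"},
}

# Link order when building libgfx_mac.dylib: system libs come first.
dylib_link_order = ["system", "nspr"]
# Link order for the final app link in a static build: NSPR comes first.
static_link_order = ["nspr", "system"]

bound_in_dylib = bind_symbol("_PR_Realloc", dylib_link_order, build_time_exports)
bound_in_static = bind_symbol("_PR_Realloc", static_link_order, build_time_exports)
```

In this model the dylib link binds _PR_Realloc to the system library, so the result only runs on an OS release whose system libraries still export that symbol. With NSPR first, the binding goes to NSPR and does not depend on the OS version.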
(In reply to comment #6) > I suggest flipping the --enable-static switch. Someone should then validate > that on all links that include NSPR, NSPR comes before the system libs. This > can be done by scrutinizing the build log, or by running otool on Mach-O files. > > If this MUST be done anti-statically, we'll need some Makefile tweaks to fix > the link order. I'm not sure whether or not it MUST be done dynamically, but I'm concerned about the impact for people using 1.0.x on older machines and such. Maintenance releases like this shouldn't have huge changes (like this one, I'm presuming) to them, either in code, configuration, or environment. I'm concerned that flipping this flag will reveal other problems that we'll sit and iterate on for a while, wasting time. (I'm actually *more* concerned that testing *won't* reveal a problem, and out in the field, 1.0.8 would become a dead release for Mac users, due to some change like this.) What's the likelihood of that sort of regression? If it's anything but "extremely low," then we're going to have to rebuild a Mac with as close a configuration as we can get to barcelona (based on the logs we have from builds done on that machine).
We're not talking about aviary 1.0.x. Those were always produced by 10.3 boxes as static builds using the 10.2 SDK. Nothing's changed in that build configuration, and nothing is expected to change. The fact that they're static builds is precisely the reason that this bug was never an issue for aviary 1.0.x users on 10.2 and 10.4. This is solely limited to suite 1.7.x, which up until now was produced by 10.2 boxes. I've given two options: build as static (if this is desired), and tweak the makefiles to fix the link order. Both are exceedingly unlikely to cause any sort of problem: static builds, if approved for suite, have been proven by the aviary 1.0.x products. Fixing the link order to move system libraries after NSPR libraries won't cause a problem, because there's never a case where NSPR would define something we'd prefer to take from the system. Of course, there's the third option of slapping 10.2 back on a system and building with that. Your call. Keep in mind that now that you've moved from 10.2 to 10.3, there are other toolchain changes - you're using a different compiler, you're using a different linker, etc. There's a world of difference between the 10.2 and 10.3 tools. Most of these changes are benign for these purposes (and some may in fact fix latent bugs), and the newer tools have largely been proven by being used for the aviary 1.0.x products throughout the 1.0.x lifecycle, but if eliminating any and all differences in the build environment is your goal, then maybe 10.2 is a better choice for suite.
(In reply to comment #8) > We're not talking about aviary 1.0.x. Those were always produced by 10.3 boxes > as static builds using the 10.2 SDK. Yeah, I misspoke. I keep making that mistake, since we're globbing the FFx 1.0.x/Suite 1.7.13 releases together. Made that mistake on my blog post too, when I suggested removing the Moz 1.7 tinderbox. ;-)
To be specific, the Makefile tweak I'm proposing is here: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/gfx/src/mac/Makefile.in&rev=126.96.36.199&mark=81#78 Move $(TK_LIBS) (which references system libraries) to be last, before $(NULL). This is a change that I initially made in bug 292530, which landed on the trunk prior to the 1.8 branch point. On that bug, you'll also find a full backport of the critical portion of the work done there to the 1.7 branch, including strengthened SDK support and the ability to build on Tiger (using the 3.3 compiler). I'm not suggesting seeking 1.7 approval for that whole patch now, though. I just did a quick (and maybe incomplete) audit of the Mach-O files in the current atlantia build and found system libs linked before NSPR libs in the following cases: libmozjs, libgkgfx_mac, libwidget_mac. libmozjs and libwidget_mac remain unfixed even in the current trunk (I'll open a new bug on myself to fix for trunk and 1.8, not 1.8.0). libmozjs is always a dylib even during a static build, so any problems would have shown up in the aviary and 1.8.0 builds. I didn't catch any trouble in libwidget_mac either when I initially fixed all of this, although this is untested by aviary builds because libwidget_mac is static there. It might be prudent to make the suggested gfx Makefile change locally on atlantia just to spit out a test build for QA purposes before deciding how to proceed.
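The audit described above could be scripted roughly as follows. This is a sketch under assumptions, not the procedure actually used: it assumes the NSPR dylibs are named libnspr4, libplc4, and libplds4, that system libraries live under /usr/lib/ or /System/Library/, and that the order of entries printed by `otool -L` reflects the link order. The helper names are made up for illustration.

```python
import subprocess

# Assumed NSPR dylib base names and system library locations.
NSPR_NAMES = ("libnspr4", "libplc4", "libplds4")
SYSTEM_PREFIXES = ("/usr/lib/", "/System/Library/")


def parse_otool_libs(otool_output):
    """Return the install names listed by `otool -L`, in order."""
    libs = []
    # The first line of the output is the name of the inspected file.
    for line in otool_output.splitlines()[1:]:
        line = line.strip()
        if line:
            # Entries look like "<install name> (compatibility version ...)".
            libs.append(line.split(" (")[0])
    return libs


def system_libs_before_nspr(libs):
    """Return system libraries listed before the first NSPR library."""
    nspr_positions = [i for i, lib in enumerate(libs)
                      if any(name in lib for name in NSPR_NAMES)]
    if not nspr_positions:
        return []  # no NSPR dependency, so there is nothing to check
    first_nspr = nspr_positions[0]
    return [lib for lib in libs[:first_nspr]
            if lib.startswith(SYSTEM_PREFIXES)]


def check_file(path):
    """Run otool on one Mach-O file and return the offending system libs."""
    result = subprocess.run(["otool", "-L", path], capture_output=True,
                            text=True, check=True)
    return system_libs_before_nspr(parse_otool_libs(result.stdout))
```

Calling check_file on each dylib in the build's output directory and printing any non-empty result would flag files such as the three listed above. An empty list means either no NSPR dependency or NSPR listed ahead of every system library.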
Hi, these builds work on 10.3.6. Maybe we should have separate builds for every version of Mac OS X: a build for 10.1, a build for 10.2, a build for 10.3, and another build for 10.4. Does this sound good? And a question: you are all trying to build on 10.3, but which version? 10.3.9, or a lower version of 10.3? I know that you all used 10.2.8 before, so I will assume 10.3.9 unless you say otherwise. Have you considered building on 10.4? I am just wondering.
(In reply to comment #10) > To be specific, the Makefile tweak I'm proposing is here: I talked about this with QA and others and we decided to reinstall 10.2 on a machine and re-create barcelona's build environment. So I'll start sifting through logs and doing that tomorrow.
Barcelona is now building, but the tests are failing again: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1141895340.18012.gz&fulltext=1 Mark: do you think you might be able to take a quick peek and see if it's obvious why the tests would be failing? Thanks!
It's only Ts that's failing. The AliveTest, Tp, and Txul are all completing successfully. Given that the 15 second Ts timeout should be more than enough even for barcelona, my bet is that startup-test.html wasn't copied into the tinderbox directory.
(In reply to comment #14) > It's only Ts that's failing. The AliveTest, Tp, and Txul are all completing > successfully. Given that the 15 second Ts timeout should be more than enough > even for barcelona, my bet is that startup-test.html wasn't copied into the > tinderbox directory. The mozilla/tools/performance CVS module was missing on barcelona, so I checked it out and made the symlink for startup-test.html in the tinderbox dir.
(In reply to comment #15) > The mozilla/tools/performance CVS module was missing on barcelona, so I checked > it out and make the symlink for startup-test.html in the tinderbox dir. I'm *so* glad this stuff is so well documented. *rolls eyes* Looks like we now have a build for QA; I'll move the tbox and respin one last time. Thanks for the help, guys!
Status: REOPENED → RESOLVED
Resolution: --- → FIXED
verified with Mozilla 1.7.13 build from 2006-03-09
Status: RESOLVED → VERIFIED
Keywords: fixed1.7.13 → verified1.7.13