Closed Bug 550335 Opened 14 years ago Closed 11 years ago

fix_macosx_stack fails on 64-bit builds

Categories

(Firefox Build System :: General, defect)

x86_64
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla24

People

(Reporter: jruderman, Assigned: mccr8)

References

Details

(Keywords: 64bit)

Attachments

(1 file, 2 obsolete files)

How can automation.py or fix_macosx_stack.py determine whether we're running a 64-bit build? Does the answer change if we want to support universal binaries?
It's probably a tricky problem, since we are likely to ship a universal build in the future. There must be some way to determine what the preferred architecture of the system is, and determine what the architecture of the app is, and sort it out from that.
Maybe look at MacPorts (http://macports.org/ , http://guide.macports.org/ , http://trac.macports.org/ , http://trac.macports.org/wiki/Tickets ) and how they solve(d) quite the same challenge/problem...

Maybe a hint/keyword: "build_arch"...

/opt/local/etc/macports.conf:
# CPU architecture to compile for. Defaults to i386 or ppc on Mac OS X 10.5
# and earlier, depending on the CPU type detected at runtime.
# On Mac OS X 10.6 the default is x86_64 if the CPU supports it, i386 otherwise.
#build_arch
$ otool -vf ./foo

This will dump out universal binary architectures. For example, the Mac OS X 10.6 Xcode binary which has 3 architectures included:

[josh@mpro106 MacOS] otool -vf Xcode 
Fat headers
fat_magic FAT_MAGIC
nfat_arch 3
architecture x86_64
    cputype CPU_TYPE_X86_64
    cpusubtype CPU_SUBTYPE_X86_64_ALL
    capabilities CPU_SUBTYPE_LIB64
    offset 4096
    size 224400
    align 2^12 (4096)
architecture i386
    cputype CPU_TYPE_I386
    cpusubtype CPU_SUBTYPE_I386_ALL
    capabilities 0x0
    offset 229376
    size 180320
    align 2^12 (4096)
architecture ppc7400
    cputype CPU_TYPE_POWERPC
    cpusubtype CPU_SUBTYPE_POWERPC_7400
    capabilities 0x0
    offset 413696
    size 191728
    align 2^12 (4096)
(In reply to comment #3)

> $ otool -vf ./foo

Yes. And/or: $ file ./foo

file -- determine file type (for details, see 'man file'):

$ file Xcode
Xcode: Mach-O universal binary with 3 architectures
Xcode (for architecture x86_64):	Mach-O 64-bit executable x86_64
Xcode (for architecture i386):	Mach-O executable i386
Xcode (for architecture ppc7400):	Mach-O executable ppc
In C, you'd call NXGetLocalArchInfo.  No idea what to do in Python.
On Snow Leopard you have to be careful which Python version you run because IIRC with Python 2.6 all it is doing is running the file command and grepping for the answer.

the platform.architecture() call will tell you if the environment is "capable" of running 64 bit, not if it *is* a 64 bit machine.  To do that you need to query sys.maxint.

On my new MacBookPro with Snow Leopard and Python 2.5:

import sys, platform
sys.platform  # 'darwin':
sys.maxint    # 2147483647
platform.architecture() #('32bit', '')

Same with Python 2.6:

import sys, platform
sys.platform  # 'darwin':
sys.maxint    # 9223372036854775807
platform.architecture() #('64bit', '')
http://hg.mozilla.org/mozilla-central/rev/d39bed41e215 fixes the perma-orange but doesn't fix this bug.
The right fix for this bug is to have the dumper print the name of the architecture as part of the stack trace.  It's just asking for trouble to try to guess what architecture produced the stack trace, when it could just as well tell us explicitly.
I agree.  And we'll need that change even if we later switch to a gdb-based or breakpad-based stack post-processor.
Depends on: 570286
I split that out as bug 570286.
Turns out my "fix for perma-orange" left random orange in the form of hangs.  I hit this on my own machines as well; I think python's "pty" module isn't behaving well when the sub-process exits early.

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1276037058.1276039433.9885.gz&fulltext=1#err0

I could write a temporary hack that passes the architecture from automation.py to fix_macosx_stack.py.  This would be temporary because it will break if we start making universal builds; the permanent fix would be bug 570286.

Or we can just ignore this for now and hope we can replace fix_macosx_stack entirely soon (requires fixing both bug 570287 and bug 558947).
No longer blocks: 558947
A super-quick workaround would be to stop calling fix_macosx_stack.py on 64-bit.
To elaborate on the "temporary hack", automation.py could detect the arch of the Firefox binary:

subprocess.Popen(["file", app], stdout=subprocess.PIPE).communicate()[0].strip().split(" ")[-1]

(can't just do this in fix_macosx_stack.py, because some of the libraries used by Firefox are universal)
fix_stack_using_bpsyms.py is the way forward (but needs a similar fix). fix_macosx_stack is old.
No longer blocks: support-10.6_x64
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
No longer depends on: 570286
Depends on: 570286
On second thought, fix_macosx_stack is significantly more convenient for local debug builds.  Using "--enable-breakpad" and "make buildsymbols" isn't a lot of fun.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Is fix_macosx_stack.py still used in the build system?  If it is, I'd like to add a file fix_macosx_stack64.py or something because it is annoying to have to manually patch this file every time I want to do refcount logging.  There's also an additional error in here where it eats the first character of the symbol names.
It looks like the only place we reference it is in automation.py, but it's currently disabled (note the "and False"):
http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#747

In any event, I think this whole discussion could be subverted, and we could simply check whether we're on OS X 10.5 and use the 32-bit behavior there, and otherwise use the 64-bit behavior, since by default we only run Firefox as 32-bit on 10.5.
Attached patch fixes 64, breaks 32 (obsolete) — Splinter Review
My slightly modified version of Jesse's patch.
Attachment #430436 - Attachment is obsolete: true
Comment 17 sounds reasonable enough.  Local builds are usually not going to be universal (32 & 64).
Assignee: nobody → continuation
Attached patch fix itSplinter Review
I run this script a few times a year for refcount logging, and 32-bit OSX support is dead and buried, so it would be nice to fix this.

njn, it looks like DMD is the only thing in the tree to use this script.  Do people run DMD much on OSX?  Do you think this script is okay to land without breaking DMD?  I kind of wonder how it works there, but not for me for refcount logs.  If you have a local OSX DMD build sitting around, njn, if you could test this, I'd appreciate it.  Otherwise I guess I can test it.
Attachment #627248 - Attachment is obsolete: true
Attachment #761273 - Flags: review?(ted)
Attachment #761273 - Flags: feedback?(n.nethercote)
This is just Jesse's patch from 3.5 years ago, I don't claim to understand it.  I just know that I can't fix stacks on OSX without this patch applied.

Try run was okay, which isn't surprising, as we likely don't test this on a try push. https://tbpl.mozilla.org/?tree=Try&rev=b3bb655d7107
Comment on attachment 761273 [details] [diff] [review]
fix it

Review of attachment 761273 [details] [diff] [review]:
-----------------------------------------------------------------

I just tried the patch.  It changes DMD's output, but AFAICT for the better.  Here's an example:

  Allocated at
    replace_malloc (DMD.cpp:1226, in libdmd.dylib) 0x10000b923
-   __strtodg$UNIX2003 (in libsystem_c.dylib) + 945 0x7fff89ae63c8
-   __strtodg$UNIX2003 (in libsystem_c.dylib) + 4493 0x7fff89ae71a4
+   malloc_zone_malloc (in libsystem_c.dylib) + 77 0x7fff89ae63c8
+   malloc (in libsystem_c.dylib) + 44 0x7fff89ae71a4
    gfxImageSurface::AllocateAndInit(long, int, bool) (gfxImageSurface.cpp:125,
in XUL) 0x10264de7e


Those __strtodg entries were always suspect, so I'm happy for this to land.


> There's also an additional error in here where it eats the first character of
> the symbol names.

I fixed that a while back;  see bug 819833 comment 4.
Attachment #761273 - Flags: feedback?(n.nethercote) → feedback+
> 32-bit OSX support is dead and buried

My Nightly is still universal i386 + x86_64.  But yeah, it's dead enough.
Thanks for testing, Nick!

> I fixed that a while back

Yeah, I remember you fixing that, thanks.  My comment was from a year ago. ;)
Comment on attachment 761273 [details] [diff] [review]
fix it

Review of attachment 761273 [details] [diff] [review]:
-----------------------------------------------------------------

Since we're not using this in automation this is effectively NPOTB anyway.
Attachment #761273 - Flags: review?(ted) → review+
No longer depends on: 570286
https://hg.mozilla.org/mozilla-central/rev/300c21d59506
Status: REOPENED → RESOLVED
Closed: 14 years ago11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: