Closed Bug 550335 Opened 13 years ago Closed 10 years ago
_macosx _stack fails on 64-bit builds
How can automation.py or fix_macosx_stack.py determine whether we're running a 64-bit build? Does the answer change if we want to support universal binaries?
It's probably a tricky problem, since we are likely to ship a universal build in the future. There must be some way to determine what the preferred architecture of the system is, and determine what the architecture of the app is, and sort it out from that.
Maybe look at MacPorts (http://macports.org/ , http://guide.macports.org/ , http://trac.macports.org/ , http://trac.macports.org/wiki/Tickets ) and how they solve(d) quite the same challenge/problem... Maybe a hint/keyword: "build_arch"... /opt/local/etc/macports.conf: # CPU architecture to compile for. Defaults to i386 or ppc on Mac OS X 10.5 # and earlier, depending on the CPU type detected at runtime. # On Mac OS X 10.6 the default is x86_64 if the CPU supports it, i386 otherwise. #build_arch
$ otool -vf ./foo This will dump out universal binary architectures. For example, the Mac OS X 10.6 Xcode binary which has 3 architectures included: [josh@mpro106 MacOS] otool -vf Xcode Fat headers fat_magic FAT_MAGIC nfat_arch 3 architecture x86_64 cputype CPU_TYPE_X86_64 cpusubtype CPU_SUBTYPE_X86_64_ALL capabilities CPU_SUBTYPE_LIB64 offset 4096 size 224400 align 2^12 (4096) architecture i386 cputype CPU_TYPE_I386 cpusubtype CPU_SUBTYPE_I386_ALL capabilities 0x0 offset 229376 size 180320 align 2^12 (4096) architecture ppc7400 cputype CPU_TYPE_POWERPC cpusubtype CPU_SUBTYPE_POWERPC_7400 capabilities 0x0 offset 413696 size 191728 align 2^12 (4096)
(In reply to comment #3) > $ otool -vf ./foo Yes. And/or: $ file ./foo file -- determine file type (for details, see 'man file'): $ file Xcode Xcode: Mach-O universal binary with 3 architectures Xcode (for architecture x86_64): Mach-O 64-bit executable x86_64 Xcode (for architecture i386): Mach-O executable i386 Xcode (for architecture ppc7400): Mach-O executable ppc
In C, you'd call NXGetLocalArchInfo. No idea what to do in Python.
On Snow Leopard you have to be careful which Python version you run because IIRC with Python 2.6 all it is doing is running the file command and grepping for the answer. the platform.architecture() call will tell you if the environment is "capable" of running 64 bit, not if it *is* a 64 bit machine. To do that you need to query sys.maxint. On my new MacBookPro with Snow Leopard and Python 2.5: import sys, platform sys.platform # 'darwin': sys.maxint # 2147483647 platform.architecture() #('32bit', '') Same with Python 2.6: import sys, platform sys.platform # 'darwin': sys.maxint # 9223372036854775807 platform.architecture() #('64bit', '')
http://hg.mozilla.org/mozilla-central/rev/d39bed41e215 fixes the perma-orange but doesn't fix this bug.
The right fix for this bug is to have the dumper print the name of the architecture as part of the stack trace. It's just asking for trouble to try to guess what architecture produced the stack trace, when it could just as well tell us explicitly.
I agree. And we'll need that change even if we later switch to a gdb-based or breakpad-based stack post-processor.
I split that out as bug 570286.
Turns out my "fix for perma-orange" left random orange in the form of hangs. I hit this on my own machines as well; I think python's "pty" module isn't behaving well when the sub-process exits early. http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1276037058.1276039433.9885.gz&fulltext=1#err0 I could write a temporary hack that passes the architecture from automation.py to fix_macosx_stack.py. This would be temporary because it will break if we start making universal builds; the permanent fix would be bug 570286. Or we can just ignore this for now and hope we can replace fix_macosx_stack entirely soon (requires fixing both bug 570287 and bug 558947).
No longer blocks: 558947
A super-quick workaround would be to stop calling fix_macosx_stack.py on 64-bit.
To elaborate on the "temporary hack", automation.py could detect the arch of the Firefox binary: subprocess.Popen(["file", app], stdout=subprocess.PIPE).communicate().strip().split(" ")[-1] (can't just do this in fix_macosx_stack.py, because some of the libraries used by Firefox are universal)
fix_stack_using_bpsyms.py is the way forward (but needs a similar fix). fix_macosx_stack is old.
No longer blocks: support-10.6_x64
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
On second thought, fix_macosx_stack is significantly more convenient for local debug builds. Using "--enable-breakpad" and "make buildsymbols" isn't a lot of fun.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Is fix_macosx_stack.py still used in the build system? If it is, I'd like to add a file fix_macosx_stack64.py or something because it is annoying to have to manually patch this file every time I want to do refcount logging. There's also an additional error in here where it eats the first character of the symbol names.
It looks like the only place we reference it is in automation.py, but it's currently disabled (note the "and False"): http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#747 In any event, I think this whole discussion could be subverted, and we could simply check whether we're on OS X 10.5 and use the 32-bit behavior there, and otherwise use the 64-bit behavior, since by default we only run Firefox as 32-bit on 10.5.
My slightly modified version of Jesse's patch.
Attachment #430436 - Attachment is obsolete: true
Comment 17 sounds reasonable enough. Local builds are usually not going to be universal (32 & 64).
I run this script a few times a year for refcount logging, and 32-bit OSX support is dead and buried, so it would be nice to fix this. njn, it looks like DMD is the only thing in the tree to use this script. Do people run DMD much on OSX? Do you think this script is okay to land without breaking DMD? I kind of wonder how it works there, but not for me for refcount logs. If you have a local OSX DMD build sitting around, njn, if you could test this, I'd appreciate it. Otherwise I guess I can test it.
This is just Jesse's patch from 3.5 years ago, I don't claim to understand it. I just know that I can't fix stacks on OSX without this patch applied. Try run was okay, which isn't surprising, as we likely don't test this on a try push. https://tbpl.mozilla.org/?tree=Try&rev=b3bb655d7107
Comment on attachment 761273 [details] [diff] [review] fix it Review of attachment 761273 [details] [diff] [review]: ----------------------------------------------------------------- I just tried the patch. It changes DMD's output, but AFAICT for the better. Here's an example: Allocated at replace_malloc (DMD.cpp:1226, in libdmd.dylib) 0x10000b923 - __strtodg$UNIX2003 (in libsystem_c.dylib) + 945 0x7fff89ae63c8 - __strtodg$UNIX2003 (in libsystem_c.dylib) + 4493 0x7fff89ae71a4 + malloc_zone_malloc (in libsystem_c.dylib) + 77 0x7fff89ae63c8 + malloc (in libsystem_c.dylib) + 44 0x7fff89ae71a4 gfxImageSurface::AllocateAndInit(long, int, bool) (gfxImageSurface.cpp:125, in XUL) 0x10264de7e Those __strtodg entries were always suspect, so I'm happy for this to land. > There's also an additional error in here where it eats the first character of > the symbol names. I fixed that a while back; see bug 819833 comment 4.
Attachment #761273 - Flags: feedback?(n.nethercote) → feedback+
> 32-bit OSX support is dead and buried My Nightly is still universal i386 + x86_64. But yeah, it's dead enough.
Thanks for testing, Nick! > I fixed that a while back Yeah, I remember you fixing that, thanks. My comment was from a year ago. ;)
Comment on attachment 761273 [details] [diff] [review] fix it Review of attachment 761273 [details] [diff] [review]: ----------------------------------------------------------------- Since we're not using this in automation this is effectively NPOTB anyway.
Attachment #761273 - Flags: review?(ted) → review+
Status: REOPENED → RESOLVED
Closed: 13 years ago → 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
You need to log in before you can comment on or make changes to this bug.