Closed
Bug 1305159
Opened 8 years ago
Closed 8 years ago
[10.12] usage of deprecated NSAddImage causes lldb to stall forever
Categories
(Core :: XPCOM, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla52
People
(Reporter: drno, Assigned: drno)
References
Details
Attachments
(2 files)
4.76 KB,
text/plain
|
Details | |
58 bytes,
text/x-review-board-request
|
froydnj
:
review+
gchang
:
approval-mozilla-aurora+
|
Details |
When I execute commands like './mach run --debug' or './mach gtest --debug' lldb gets started, but after issuing the 'run' command the loading stall forever in the attached stack trace.
Assignee | ||
Comment 1•8 years ago
|
||
Stack trace when hitting CTRL+C while lldb stalls.
Assignee | ||
Comment 2•8 years ago
|
||
Current guess is that we still use NSAddImage, an API which has been deprecated since 2007. Trying to replace NSSAddImage() with the dlopen() code we have for Linux appears to load the libs, but fails somewhere else.
Assignee | ||
Comment 3•8 years ago
|
||
NSAddImage documentation http://mirror.informatimago.com/next/developer.apple.com/documentation/DeveloperTools/Conceptual/MachORuntime/5rt_api_reference/chapter_11_section_4.html#//apple_ref/doc/c_ref/NSAddImage
Assignee | ||
Comment 4•8 years ago
|
||
So if I take out the OSX code which tries to use NSAddImage() etc and let it simply use the Linux(?) code instead which uses dlopen() and dlsym() all libs appear to get loaded, but dlsym() fails to locate _NS_GetFrozenFunctions in the loaded XUL lib, even though nm clearly shows that XUL has that symbol.
Assignee | ||
Comment 5•8 years ago
|
||
The warnings output I get when compiling nsXPCOMGlue.cpp 0:11.94 Warning: -Wdeprecated-declarations in /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp: 'NSAddImage' is deprecated: first deprecated in macOS 10.5 0:11.94 /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:96:29: warning: 'NSAddImage' is deprecated: first deprecated in macOS 10.5 [-Wdeprecated-declarations] 0:11.94 LibHandleType libHandle = NSAddImage(aDependentLib, 0:11.94 ^ 0:11.94 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/mach-o/dyld.h:223:34: note: 'NSAddImage' has been explicitly marked deprecated here 0:11.94 extern const struct mach_header* NSAddImage(const char* image_name, uint32_t options) __OSX_AVAILABLE_BUT_DEPRECATED(__MAC_10_1,__MAC_10_5,__IPHONE_NA,__IPHONE_NA); 0:11.94 ^ 0:11.94 Warning: -Wdeprecated-declarations in /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp: 'NSLinkEditError' is deprecated: first deprecated in macOS 10.5 0:11.94 /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:104:5: warning: 'NSLinkEditError' is deprecated: first deprecated in macOS 10.5 [-Wdeprecated-declarations] 0:11.94 NSLinkEditError(&linkEditError, &errorNum, &fileName, &errorString); 0:11.94 ^ 0:11.94 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/mach-o/dyld.h:210:13: note: 'NSLinkEditError' has been explicitly marked deprecated here 0:11.94 extern void NSLinkEditError(NSLinkEditErrors *c, int *errorNumber, const char** fileName, const char** errorString) __OSX_AVAILABLE_BUT_DEPRECATED(__MAC_10_1,__MAC_10_5,__IPHONE_NA,__IPHONE_NA); 0:11.94 ^ 0:11.94 Warning: -Wdeprecated-declarations in /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp: 'NSLookupSymbolInImage' is deprecated: first deprecated in macOS 10.5 0:11.94 /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:122:11: warning: 'NSLookupSymbolInImage' is deprecated: first deprecated in macOS 10.5 [-Wdeprecated-declarations] 0:11.94 sym = NSLookupSymbolInImage(aLibHandle, aSymbol, 0:11.94 ^ 0:11.94 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/mach-o/dyld.h:175:17: note: 'NSLookupSymbolInImage' has been explicitly marked deprecated here 0:11.94 extern NSSymbol NSLookupSymbolInImage(const struct mach_header* image, const char* symbolName, uint32_t options) __OSX_AVAILABLE_BUT_DEPRECATED(__MAC_10_1,__MAC_10_5,__IPHONE_NA,__IPHONE_NA); 0:11.94 ^ 0:11.94 Warning: -Wdeprecated-declarations in /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp: 'NSIsSymbolNameDefined' is deprecated: first deprecated in macOS 10.4 0:11.94 /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:126:9: warning: 'NSIsSymbolNameDefined' is deprecated: first deprecated in macOS 10.4 [-Wdeprecated-declarations] 0:11.94 if (NSIsSymbolNameDefined(aSymbol)) { 0:11.94 ^ 0:11.94 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/mach-o/dyld.h:169:17: note: 'NSIsSymbolNameDefined' has been explicitly marked deprecated here 0:11.94 extern bool NSIsSymbolNameDefined(const char* symbolName) __OSX_AVAILABLE_BUT_DEPRECATED(__MAC_10_1,__MAC_10_4,__IPHONE_NA,__IPHONE_NA); 0:11.94 ^ 0:11.94 Warning: -Wdeprecated-declarations in /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp: 'NSLookupAndBindSymbol' is deprecated: first deprecated in macOS 10.4 0:11.94 /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:127:13: warning: 'NSLookupAndBindSymbol' is deprecated: first deprecated in macOS 10.4 [-Wdeprecated-declarations] 0:11.94 sym = NSLookupAndBindSymbol(aSymbol); 0:11.94 ^ 0:11.94 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/mach-o/dyld.h:172:17: note: 'NSLookupAndBindSymbol' has been explicitly marked deprecated here 0:11.94 extern NSSymbol NSLookupAndBindSymbol(const char* symbolName) __OSX_AVAILABLE_BUT_DEPRECATED(__MAC_10_1,__MAC_10_4,__IPHONE_NA,__IPHONE_NA); 0:11.94 ^ 0:11.94 Warning: -Wdeprecated-declarations in /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp: 'NSAddressOfSymbol' is deprecated: first deprecated in macOS 10.5 0:11.94 /Users/nohlmeier/src/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:134:21: warning: 'NSAddressOfSymbol' is deprecated: first deprecated in macOS 10.5 [-Wdeprecated-declarations] 0:11.94 return (NSFuncPtr)NSAddressOfSymbol(sym); 0:11.95 ^ 0:11.95 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/mach-o/dyld.h:181:21: note: 'NSAddressOfSymbol' has been explicitly marked deprecated here 0:11.95 extern void * NSAddressOfSymbol(NSSymbol symbol) __OSX_AVAILABLE_BUT_DEPRECATED(__MAC_10_1,__MAC_10_5,__IPHONE_NA,__IPHONE_NA); 0:11.95 ^ 0:11.96 cd legacydb; /Applications/Xcode.app/Contents/Developer/usr/bin/make libs 0:12.00 6 warnings generated.
Assignee | ||
Comment 6•8 years ago
|
||
'./mach run' works as expected and even though it also calls NSAddImage() at the same place (verified via a printf) it does not hang as it does when running in lldb.
Assignee | ||
Comment 7•8 years ago
|
||
If I let dlsym() search for 'NS_GetFrozenFunctions' instead of '_NS_GetFrozenFunctions' it works. But then fails to locate the XRE functions next.
Comment hidden (mozreview-request) |
Assignee | ||
Comment 9•8 years ago
|
||
Marking it P1 because I assume this prevents every developer on 10.12 from using any 'mach' command with --debug.
Priority: -- → P1
Assignee | ||
Updated•8 years ago
|
Attachment #8794435 -
Flags: review?(nfroyd)
Attachment #8794435 -
Flags: review?(michelangelo)
Attachment #8794435 -
Flags: review?(erahm)
Attachment #8794435 -
Flags: review?(doug.turner)
Attachment #8794435 -
Flags: review?(benjamin)
Assignee | ||
Comment 10•8 years ago
|
||
Marked all module peers, owner and emeritus for review in the hope that who ever gets to it first can quickly review this.
Assignee | ||
Updated•8 years ago
|
Summary: [10.12] firefox stalls forever when started in lldb → [10.12] usage of deprecated NSAddImage causes lldb to stall forever
Comment 11•8 years ago
|
||
mozreview-review |
Comment on attachment 8794435 [details] Bug 1305159: replace NSAddImage with dlopen on OSX. https://reviewboard.mozilla.org/r/80906/#review79702 I'm a fan of using more cross-platform-y APIs when possible...but does this actually solve all your problems? Comment 7 makes it sound like more than this patch is needed.
Attachment #8794435 -
Flags: review?(nfroyd) → review+
Comment 12•8 years ago
|
||
Comment on attachment 8794435 [details] Bug 1305159: replace NSAddImage with dlopen on OSX. You should also verify with your try run that OSX 10.10 debug M-2, M-e10s-4, and the Windows R-e10s crashes (R-e10s and JSReftest) aren't due to your patch via retriggers.
Attachment #8794435 -
Flags: review?(erahm)
Attachment #8794435 -
Flags: review?(doug.turner)
Attachment #8794435 -
Flags: review?(benjamin)
Comment 13•8 years ago
|
||
In the past, it was required to use NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME so that dynamic XPCOM components (in particular libbrowsercomps) would find dependent libraries correctly. Has that changed? Or we could wait until we fold browsercomps back in...
Assignee | ||
Comment 14•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #11) > I'm a fan of using more cross-platform-y APIs when possible...but does this > actually solve all your problems? Comment 7 makes it sound like more than > this patch is needed. If you look closely at the patch, the original code actually adds the leading '_' to all function names only (!) on OSX via the define of the underscore. So it looks like NSAddImage actually required the leading underscore, but non of the other API's for looking up the symbols need them. That's why I deleted that define all together.
Comment 15•8 years ago
|
||
(In reply to Nils Ohlmeier [:drno] from comment #14) > (In reply to Nathan Froyd [:froydnj] from comment #11) > > I'm a fan of using more cross-platform-y APIs when possible...but does this > > actually solve all your problems? Comment 7 makes it sound like more than > > this patch is needed. > > If you look closely at the patch, the original code actually adds the > leading '_' to all function names only (!) on OSX via the define of the > underscore. So it looks like NSAddImage actually required the leading > underscore, but non of the other API's for looking up the symbols need them. > That's why I deleted that define all together. Ah, I see. So when you wrote comment 7, you had only removed LEADING_UNDERSCORE from the lookup of NS_GetFrozenFunctions, but when you wrote the patch, you had figured out that LEADING_UNDERSCORE removal had to go a bit farther than that? Nothing else indicated that you had fixed the XRE symbol lookup otherwise...
Flags: needinfo?(drno)
Assignee | ||
Comment 16•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #15) > Ah, I see. So when you wrote comment 7, you had only removed > LEADING_UNDERSCORE from the lookup of NS_GetFrozenFunctions, but when you > wrote the patch, you had figured out that LEADING_UNDERSCORE removal had to > go a bit farther than that? Nothing else indicated that you had fixed the > XRE symbol lookup otherwise... Exactly. In comment #7 I only had removed the LEADING_UNDERSCORE form the GetFrozenFunctionsFunc lookup in XPCOMGlueLoad(). When I started looking where the XRE symbols get the underscore added I figured out that it was caused by the define, which really only affected OSX.
Flags: needinfo?(drno)
Assignee | ||
Comment 17•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #12) > You should also verify with your try run that OSX 10.10 debug M-2, M-e10s-4, > and the Windows R-e10s crashes (R-e10s and JSReftest) aren't due to your > patch via retriggers. All of them turned at least partially green.
Assignee | ||
Comment 18•8 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #13) > In the past, it was required to use > NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME so that dynamic XPCOM > components (in particular libbrowsercomps) would find dependent libraries > correctly. > > Has that changed? > > Or we could wait until we fold browsercomps back in... So I manually ran Firefox with dlopen() and used the location bar and the search bar. Everything appears to behave normal. That is only on 10.12. I'll try to give it a spin on 10.8 as well.
Assignee | ||
Comment 20•8 years ago
|
||
So I manually tested this also on 10.9, which seems to be the latest OS X version supported by Nightly. Same result as with 10.12: location and search bar appear to working as expected. In other words: Am I good to land this?
Flags: needinfo?(nfroyd)
Flags: needinfo?(benjamin)
Comment 21•8 years ago
|
||
(In reply to Nils Ohlmeier [:drno] from comment #20) > So I manually tested this also on 10.9, which seems to be the latest OS X > version supported by Nightly. Same result as with 10.12: location and search > bar appear to working as expected. > > In other words: Am I good to land this? Via some search engine spelunking, I found: http://opensource.apple.com/source/dyld/dyld-360.22/src/dyldAPIs.cpp From that code (and all other versions available on opensource.apple.com), it looks like dlopen replicates the functionality of MATCH_FILENAME_BY_INSTALLNAME. So I think we're good on that front. The one wrinkle is that the Unix codepath would technically not be equivalent to the mac code we had before, as NSLookupSymbolInImage (called from GetSymbol) will only consider the particular library handle that we pass it, whereas dlsym will consider the image and any dependent images that would be loaded. We'd need to use the (Mac-specific, I think) RTLD_FIRST flag to replicate NSLookupSymbolInImage's behavior, which doesn't seem like a bad idea. But AFAICT from looking at the source code, using dlsym and using NSLookupSymbolInImage are completely equivalent here. So please make the RTLD_FIRST change, and then if that passes testing, I think we're good to go.
Flags: needinfo?(nfroyd)
Comment hidden (mozreview-request) |
Comment 23•8 years ago
|
||
Yes, I think as long as Firefox is not broken, we no longer care about breaking other DLLs and this should be ok to land.
Flags: needinfo?(benjamin)
Assignee | ||
Comment 24•8 years ago
|
||
Looks like the remaining test failure are indeed intermittents. So I'm planing on landing this Thursday/tomorrow morning west coast time (so I can monitor how it's going).
Comment 25•8 years ago
|
||
You can land it now, and I'll monitor it.. This bug has been giving me grief for one week :)
Comment 26•8 years ago
|
||
Pushed by drno@ohlmeier.org: https://hg.mozilla.org/integration/autoland/rev/b9b87a4102da replace NSAddImage with dlopen on OSX. r=froydnj
Assignee | ||
Comment 28•8 years ago
|
||
I would guess/assume so. As it would be good to get the ability to debug aurora and beta with macos Sierra. If it would be my call I would let the patch settle for some time on Nightly before uplifting. But I think it should be modules owners call if this too risky or not.
Flags: needinfo?(drno) → needinfo?(nfroyd)
Comment 29•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/b9b87a4102da
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Comment 30•8 years ago
|
||
(In reply to Nils Ohlmeier [:drno] from comment #28) > I would guess/assume so. As it would be good to get the ability to debug > aurora and beta with macos Sierra. If it would be my call I would let the > patch settle for some time on Nightly before uplifting. But I think it > should be modules owners call if this too risky or not. Uplifting would be OK by me.
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 31•8 years ago
|
||
Comment on attachment 8794435 [details] Bug 1305159: replace NSAddImage with dlopen on OSX. Approval Request Comment [Feature/regressing bug #]: The XPCOM code on OSX is/was using an API which has been deprecated since 2007. The original code is as old the first import from CVS. [User impact if declined]: Developers on Mac OSX 10.12 can't start Firefox or any CPP unit test in lldb debugger. [Describe test coverage new/current, TreeHerder]: No new tests, but every existing test which starts a Firefox binary on OSX should cover this. [Risks and why]: There is a small risk that if something isn't working as expected the Firefox binary won't start anymore. But the new API's used through this patch are well known cross platform API's which should not cause problems. Further more I assume that something fundamental as startup failure would have gotten reported by now. [String/UUID change made/needed]: N/A
Attachment #8794435 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → drno
Updated•8 years ago
|
status-firefox51:
--- → affected
Assignee | ||
Updated•8 years ago
|
status-firefox49:
--- → affected
status-firefox50:
--- → affected
Comment 32•8 years ago
|
||
Comment on attachment 8794435 [details] Bug 1305159: replace NSAddImage with dlopen on OSX. This patch helps dev debug in macos. Take it in 51 aurora.
Attachment #8794435 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 33•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/bcf7dee34cd6
Updated•8 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•