Closed
Bug 878791
Opened 12 years ago
Closed 11 years ago
Firefox on FreeBSD crashes due to loading libc.so.6
Categories
(Toolkit Graveyard :: OS.File, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla29
People
(Reporter: dimitry, Assigned: Yoric)
References
Details
(Keywords: crash, Whiteboard: [Async])
Attachments
(1 file, 8 obsolete files)
1.14 KB,
patch
|
glandium
:
review+
|
Details | Diff | Splinter Review |
On the FreeBSD ports mailing list, I noticed reports [1] about Firefox
21 crashes, which had very strange backtraces. For example:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2d8c7e00 (LWP 101017/DOM Worker)]
0x282a854b in strcasecmp_l () from /lib/libc.so.7
(gdb) where
#0 0x282a854b in strcasecmp_l () from /lib/libc.so.7
#1 0x282a8666 in strcasecmp () from /lib/libc.so.7
#2 0x282d195e in .cerror () from /lib/libc.so.7
#3 0x283477b8 in .got () from /usr/local/lib/libffi.so.6
#4 0x28345e87 in ffi_call_SYSV () from /usr/local/lib/libffi.so.6
#5 0x28345cbe in ffi_call () from /usr/local/lib/libffi.so.6
#6 0x2a6c295b in js::SizeOfDataIfCDataObject ()
from /usr/local/lib/firefox/libxul.so
#7 0x2a41d167 in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#8 0x2a3ec016 in js::AutoCTypesActivityCallback::AutoCTypesActivityCallback ()
from /usr/local/lib/firefox/libxul.so
#9 0x2a41d127 in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#10 0x2a41969f in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#11 0x2a412563 in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#12 0x2a41d09a in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#13 0x2a3ec016 in js::AutoCTypesActivityCallback::AutoCTypesActivityCallback ()
from /usr/local/lib/firefox/libxul.so
#14 0x2a41d127 in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#15 0x2a41969f in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#16 0x2a412563 in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#17 0x2a41d09a in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#18 0x2a41d5e3 in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#19 0x2a3a12f6 in JS_CallFunctionValue () from /usr/local/lib/firefox/libxul.so
#20 0x2927b2aa in std::vector<mozilla::gfx::Glyph, std::allocator<mozilla::gfx::Glyph> >::_M_insert_aux () from /usr/local/lib/firefox/libxul.so
#21 0x29c43dc3 in std::vector<mozilla::plugins::IPCByteRange, std::allocator<mozilla::plugins::IPCByteRange> >::_M_fill_insert ()
from /usr/local/lib/firefox/libxul.so
#22 0x29c43cc2 in std::vector<mozilla::plugins::IPCByteRange, std::allocator<mozilla::plugins::IPCByteRange> >::_M_fill_insert ()
from /usr/local/lib/firefox/libxul.so
#23 0x2a41d127 in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#24 0x2a41d5e3 in js::AutoMaybeTouchDeadZones::~AutoMaybeTouchDeadZones ()
from /usr/local/lib/firefox/libxul.so
#25 0x2a3a124c in JS_CallFunctionName () from /usr/local/lib/firefox/libxul.so
#26 0x29279429 in std::vector<mozilla::gfx::Glyph, std::allocator<mozilla::gfx::Glyph> >::_M_insert_aux () from /usr/local/lib/firefox/libxul.so
#27 0x29290724 in std::vector<mozilla::gfx::Glyph, std::allocator<mozilla::gfx::Glyph> >::_M_insert_aux () from /usr/local/lib/firefox/libxul.so
#28 0x29287475 in std::vector<mozilla::gfx::Glyph, std::allocator<mozilla::gfx::Glyph> >::_M_insert_aux () from /usr/local/lib/firefox/libxul.so
#29 0x29288f6e in std::vector<mozilla::gfx::Glyph, std::allocator<mozilla::gfx::Glyph> >::_M_insert_aux () from /usr/local/lib/firefox/libxul.so
#30 0x29282933 in std::vector<mozilla::gfx::Glyph, std::allocator<mozilla::gfx::Glyph> >::_M_insert_aux () from /usr/local/lib/firefox/libxul.so
#31 0x29e118c4 in XRE_AddJarManifestLocation ()
from /usr/local/lib/firefox/libxul.so
#32 0x29dcbf68 in std::vector<mozilla::plugins::IPCByteRange, std::allocator<mozilla::plugins::IPCByteRange> >::_M_fill_insert ()
from /usr/local/lib/firefox/libxul.so
#33 0x29e10e30 in XRE_AddJarManifestLocation ()
from /usr/local/lib/firefox/libxul.so
#34 0x2b38a59a in PR_CreateThread () from /usr/local/lib/libplds4.so.1
#35 0x281b3f1a in pthread_getprio () from /lib/libthr.so.3
#36 0x00000000 in ?? ()
The common property of all the backtraces was that they originated from
ffi_call() invocations. Those invocations tried to call into a
perfectly normal libc function, such as open(2), but for some reason
ended up in a completely different one, usually having disastrous
results. Most of the time, this ended in segfaults.
After a lot of debugging, where libffi, libc, libthr (the FreeBSD
pthread library) and others were suspected, instrumented and found to
work correctly, the cause turned to be in Firefox itself.
In Firefox's OS.File implementation for UNIX, it loads libc.so.6 if that
is available, and then uses libffi to call into it. However, the system
C library on recent FreeBSD versions is called libc.so.7 instead.
For compatibility with older applications, there is an installable
package which provides libc.so.6, so if the compatibility package is
installed, Firefox will load that on top of the already loaded
libc.so.7. Since several internal symbols of these different library
verions conflict, this usually leads to problems, such as the segfault
mentioned above.
As a workaround, users can uninstall the compatibility package, or
remove the libc.so.6 file manually, but it is probably better if this is
fixed in Firefox itself. Currently,
toolkit/components/osfile/osfile_unix_allthreads.jsm does:
(function(exports) {
[...]
// Open libc
let libc;
let libc_candidates = [ "libSystem.B.dylib",
"libc.so.6",
"libc.so" ];
for (let i = 0; i < libc_candidates.length; ++i) {
try {
libc = ctypes.open(libc_candidates[i]);
break;
} catch (x) {
LOG("Could not open libc ", libc_candidates[i]);
}
}
I propose adding libc.so.7 into the list, just before libc.so.6, like in
the attached patch. This should not lead to trouble on Linux, where
there is only libc.so.6. Alternatively, it could be made OS-dependent,
but I am not sure if that is worth the complexity here.
[1]: <http://lists.freebsd.org/pipermail/freebsd-ports/2013-May/083791.html>
Updated•12 years ago
|
Assignee: nobody → general
Severity: normal → critical
Component: General → JavaScript Engine
Keywords: crash
Product: Firefox → Core
Updated•12 years ago
|
Attachment #757393 -
Attachment is patch: true
![]() |
||
Comment 1•12 years ago
|
||
This is a toolkit issue, not a jseng one.
Assignee: general → nobody
Component: JavaScript Engine → OS.File
Product: Core → Toolkit
Reporter | ||
Comment 2•12 years ago
|
||
Apologies, this is the correct patch, which adds libc.so.7 before libc.so.6, instead of replacing it.
Attachment #757393 -
Attachment is obsolete: true
Assignee | ||
Comment 3•12 years ago
|
||
Comment on attachment 757453 [details] [diff] [review]
Add libc.so.7 before libc.so.6 in the list of libc candidates for OS.File
Review of attachment 757453 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good to me.
Attachment #757453 -
Flags: review+
Comment 4•12 years ago
|
||
I have exactly same issue with uClibc which has libc.so.0.9.32 on Alpine Linux. I know for sure that libc.so.0 is used for other uClibc setups.
I think we can do better than have a static list of known/supported libc's that will break late whenever a new/unknown version shows up. (this is not the way ELF is supposed to work).
I think we have 2 options to solve this properly:
1) Implement a find_library() similar to python's ctype.find_library does, and parse the output of ldconfig -p.
2) from configure script find the current platforms proper libc.so name, something like:
libc_file=$(cat <<EOF | python -
from ctypes.util import find_library
print(find_library('c'))
EOF
)
Store that some place as a constant during compile time and use that constant from toolkit.
It should be safe to detect it compile time since if it changes due to ABI breakage, the application will need to rebuild anyways.
Comment 5•12 years ago
|
||
A patch that works around the issue for Alpine Linux
Comment 6•12 years ago
|
||
I haven not even compile tested this patch but it gives an idea how we could detect the libc at compile time.
Attachment #759130 -
Attachment is obsolete: true
Comment 7•12 years ago
|
||
I have tested this patch and it appears to actually work. It should fix the problem for freebsd too.
Attachment #759693 -
Attachment is obsolete: true
Updated•12 years ago
|
Attachment #759721 -
Flags: review?(dteller)
Reporter | ||
Comment 8•12 years ago
|
||
Just as a quick verification, the Python fragment indeed gives the wanted result on FreeBSD:
$ python -c "from ctypes.util import find_library; print(find_library('c'))"
libc.so.7
So that looks good to me.
Assignee | ||
Comment 9•12 years ago
|
||
Comment on attachment 759721 [details] [diff] [review]
A patch that actually works
Review of attachment 759721 [details] [diff] [review]:
-----------------------------------------------------------------
Are you sure it's the right patch?
There should be no ".orig" files. That's artifacts from a hg merge.
Attachment #759721 -
Flags: review?(dteller)
Comment 10•12 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] from comment #9)
> Comment on attachment 759721 [details] [diff] [review]
> A patch that actually works
>
> Review of attachment 759721 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Are you sure it's the right patch?
> There should be no ".orig" files. That's artifacts from a hg merge.
I manually copied the .orig file from the release source (firefox-21) and manually did git diff -u. (i think)
Do you want me to rebase it against a mercurial branch? which branch in that case?
Comment 11•12 years ago
|
||
Regarding FreeBSD, we have recently switched libc.so to be an ld script (this will be the case in the next major release FreeBSD 10), similar to what we can find on Linux (at least on Debian):
debian$ cat /usr/lib/x86_64-linux-gnu/libc.so
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/libc_nonshared.a AS_NEEDED ( /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 ) )
freebsd$ cat /usr/lib/libc.so
/* $FreeBSD: head/lib/libc/libc.ldscript 251668 2013-06-12 21:12:05Z jlh $ */
GROUP ( /lib/libc.so.7 /usr/lib/libssp_nonshared.a )
Can you point me where the change is to be done please?
I tried naively to the following command in the Firefox source tree
$ grep -rl 'libc\.so' .
But it finds 23 files. This would greatly help me if someone knowing the Firefox source tree could share his knowledge.
As a side note, I think trying directly libc.so is your best bet. Do you know the rationale that led to open the major-numbered library directly instead of the standard symlink (or ld script)?
Comment 12•12 years ago
|
||
Regarding FreeBSD, we have recently switched libc.so to be an ld script (this will be the case in the next major release FreeBSD 10), similar to what we can find on Linux (at least on Debian):
debian$ cat /usr/lib/x86_64-linux-gnu/libc.so
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/libc_nonshared.a AS_NEEDED ( /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 ) )
freebsd$ cat /usr/lib/libc.so
/* $FreeBSD: head/lib/libc/libc.ldscript 251668 2013-06-12 21:12:05Z jlh $ */
GROUP ( /lib/libc.so.7 /usr/lib/libssp_nonshared.a )
Can you point me where the change is to be done please?
I tried naively to the following command in the Firefox source tree
$ grep -rl 'libc\.so' .
But it finds 23 files. This would greatly help me if someone knowing the Firefox source tree could share his knowledge.
As a side note, I think trying directly libc.so is your best bet. Do you know the rationale that led to open the major-numbered library directly instead of the standard symlink (or ld script)?
Comment 13•12 years ago
|
||
Wouldn't ldscript break dlopen() ?
js> ctypes.open("libc.so")
typein:2:0 Error: couldn't open library libc.so
>>> ctypes.CDLL("libc.so")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/ctypes/__init__.py", line 365, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /usr/lib/libc.so: invalid file format
And I'm not sure dlopen(NULL) is supported/desirable by js-ctypes.
Assignee | ||
Comment 14•11 years ago
|
||
(In reply to Natanael Copa from comment #10)
> I manually copied the .orig file from the release source (firefox-21) and
> manually did git diff -u. (i think)
>
> Do you want me to rebase it against a mercurial branch? which branch in that
> case?
Well, it's not a problem of rebasing. There should be no .orig file in a patch.
Assignee | ||
Updated•11 years ago
|
Whiteboard: [Async]
Comment 15•11 years ago
|
||
This patch should have no .orig ans was rebased against a fresh hg clone.
Attachment #759721 -
Attachment is obsolete: true
Attachment #8346464 -
Flags: review+
Updated•11 years ago
|
Attachment #8346464 -
Flags: review+ → review?(dteller)
Assignee | ||
Comment 16•11 years ago
|
||
Comment on attachment 8346464 [details] [diff] [review]
detect-libc-filename.patch
Review of attachment 8346464 [details] [diff] [review]:
-----------------------------------------------------------------
The OS.File part is fine for me.
Asking ted about the configure.in part.
::: dom/system/OSFileConstants.cpp
@@ +836,5 @@
> if (!SetStringProperty(cx, objPath, "tmpDir", gPaths->tmpDir)) {
> return false;
> }
>
> + if (!SetStringProperty(cx, objPath, "libcFilename", NS_LITERAL_STRING(MOZ_LIBC_FILENAME))) {
Just "libc".
::: toolkit/components/osfile/modules/osfile_unix_allthreads.jsm
@@ +45,2 @@
> "libc.so.6",
> "libc.so" ];
Well, if OS.Constants.Path.libc is correct, we don't need to try the others, do we?
Attachment #8346464 -
Flags: review?(ted)
Attachment #8346464 -
Flags: review?(dteller)
Attachment #8346464 -
Flags: feedback+
Comment 17•11 years ago
|
||
This is dangerous to hard-code "libc.so.6". You may end up using a stale libc from a previous version of the system.
How does it work on Linux? On Debian, /usr/lib/libc.so is also an ldscript, so there should be already a solution in place that can be re-used on FreeBSD 10+.
-- Jeremie
Comment 18•11 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] (away 20 Dec to Jan 4th - please use "needinfo?") from comment #16)
> @@ +45,2 @@
> > "libc.so.6",
> > "libc.so" ];
>
> Well, if OS.Constants.Path.libc is correct, we don't need to try the others,
> do we?
That is correct. We should not even need to loop over different alternatives. This should be enough:
try {
libc = ctypes.open(OS.Constants.Path.libcFilename);
catch (x) {
LOG("Could not open libc ", OS.Constants.Path.libcFilename);
}
I left all the alternatives there in case the unexpected happens.
(I just tested run the python's find_library('c') on an OSX box and there it returns libc.dylib which appears to be a symlink libc.dylib -> libSystem.dylib -> libSystem.B.dylib. I'm not sure if libc.dylib vs libSystem.B.dylib matters.)
Comment 19•11 years ago
|
||
(In reply to Jeremie Le Hen from comment #17)
> This is dangerous to hard-code "libc.so.6".
The existence of this bug itself is a proof of that...
My patch detects the libc file name at build time. (if libc name changes runtime you'll have to recompile it all anyways)
Comment 20•11 years ago
|
||
(In reply to Natanael Copa from comment #19)
> (In reply to Jeremie Le Hen from comment #17)
> > This is dangerous to hard-code "libc.so.6".
>
> The existence of this bug itself is a proof of that...
>
> My patch detects the libc file name at build time. (if libc name changes
> runtime you'll have to recompile it all anyways)
You're right. I suggest you entirely remove the "libc.so.6" from the list then. I think it's better to break blatantly rather than sneakily.
Comment 21•11 years ago
|
||
Comment on attachment 8346464 [details] [diff] [review]
detect-libc-filename.patch
Review of attachment 8346464 [details] [diff] [review]:
-----------------------------------------------------------------
I don't see how this makes sense. The machines we build on and the machines we run on have no connection, so how can detecting this at build time possibly work? The Windows/Mac cases seem to be trivial (the Python module literally just returns the MSVCRT or libc.dylib for them), and the Linux one sounds like we could just try a list of useful names.
Also in the Windows case the Python ctypes version is just wrong, since it'll return the CRT that Python was built against, not the one that Firefox was built against.
Python ctypes ref:
http://svn.python.org/view/python/branches/release27-maint/Lib/ctypes/util.py?revision=84835&view=markup
Attachment #8346464 -
Flags: review?(ted) → review-
Comment 22•11 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #21)
> Comment on attachment 8346464 [details] [diff] [review]
> detect-libc-filename.patch
>
> Review of attachment 8346464 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I don't see how this makes sense. The machines we build on and the machines
> we run on have no connection, so how can detecting this at build time
> possibly work?
Do you support crosscompiling of firefox? (eg build Firefox for linux on windows or build for ARM on x86). If you do support crosscompile then the patch will not work.
> The Windows/Mac cases seem to be trivial (the Python module
> literally just returns the MSVCRT or libc.dylib for them), and the Linux one
> sounds like we could just try a list of useful names.
How do you define the list of useful names? How do you define the search order? What if your system you run it on has multiple libc's installed (eg you have both glibc, uclibc and musl libc installed on same system. ELF does not forbid that...). There are atleast 2 systems that breaks due to the current list and as soon any POSIX system updates to an new libc ABI (or somebody wants to port to a new system) the list will need to be updated. Would you be willing to maintain it?
Wouldn't it be better to try detect the libc firefox will be linked against? We could even testcompile a minimal C prog with same CC and CFLAGS as firefox and use readelf to find out what we linked against. Something like:
echo "int main(void) { return 0; }" > link-test.c
# CC, CFLAGS, LDFLAGS should correspond to whatever firefox will be built with
$CC $CFLAGS -o link-test link-test.c $LDFLAGS
MOZ_LIBC_FILENAME=$(readelf -d link-test | grep 'NEEDED.*\[libc\.' | sed 's/.*\[\(.*\)\].*/\1/')
rm link-test link-test.c
> Also in the Windows case the Python ctypes version is just wrong, since
> it'll return the CRT that Python was built against, not the one that Firefox
> was built against.
AFAIK, Windows will not use this at all as the issue we are trying to resolve does not affect it. (oh btw, it looks like on windows 'kernel32.dll' will be used regardless of what libc Firefox was built against)
> Python ctypes ref:
> http://svn.python.org/view/python/branches/release27-maint/Lib/ctypes/util.
> py?revision=84835&view=markup
Yeah, the implementation was buggy on Alpine Linux so i have patched that code too. (which is why i knew about it in the first place)
Comment 23•11 years ago
|
||
Yes, we support cross-compiling. We have Android builds that are built on Linux, we're working on cross-compiling from Linux to Darwin, and there are people out there cross-compiling from Linux to Windows.
Comment 24•11 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #23)
> Yes, we support cross-compiling.
So we cannot use the python trick. I think the only way to do it properly is to compile/link a minimal test program and check what libc it got linked to.
Comment 25•11 years ago
|
||
This patch will test compile and use readelf -d | grep NEEDED to find what libc we linked against. It should work even when crosscompiling.
It also removes the libc.so.6 from candidate list as requested but still keeps libSystem.B.dylib and libc.so as fallback in case the detection goes wrong.
Attachment #8346464 -
Attachment is obsolete: true
Attachment #8348624 -
Flags: review?(ted)
Comment 26•11 years ago
|
||
Comment on attachment 8348624 [details] [diff] [review]
detect-libc-filename.patch
Review of attachment 8348624 [details] [diff] [review]:
-----------------------------------------------------------------
I'm punting this to glandium.
Attachment #8348624 -
Flags: review?(ted) → review?(mh+mozilla)
Assignee | ||
Comment 28•11 years ago
|
||
OS.File dynamically links to this libc and its functions and uses it build the Unix implementation of the cross-platform API:
http://dxr.mozilla.org/mozilla-central/toolkit/components/osfile/osfile_unix_back.jsm
Flags: needinfo?(dteller)
Comment 29•11 years ago
|
||
Can't we use the special "a.out" instead of libc?
Flags: needinfo?(dteller)
Assignee | ||
Comment 30•11 years ago
|
||
I seem to remember that a.out doesn't exist on all platforms, but we can put it as first choice and see what gives.
Flags: needinfo?(dteller)
Assignee | ||
Comment 31•11 years ago
|
||
Quick test.
Try: https://tbpl.mozilla.org/?tree=Try&rev=0a74f9cdd685
Assignee | ||
Comment 32•11 years ago
|
||
Comment on attachment 8350595 [details] [diff] [review]
Loading a.out as the first choice of libc
Well, the results are encouraging.
Attachment #8350595 -
Flags: review?(nfroyd)
Attachment #8350595 -
Flags: review?(mh+mozilla)
![]() |
||
Comment 33•11 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] (away 20 Dec to Jan 4th - please use "needinfo?") from comment #32)
> Comment on attachment 8350595 [details] [diff] [review]
> Loading a.out as the first choice of libc
>
> Well, the results are encouraging.
ISTR that we had issues with this dynamic library loading at some time in the past and/or there might have been some reason we decided to not use a.out. (Those Android tests on the try push look...unencouraging, for instance.) Investigating that and finding the bug(s) would put this bug above the "review in 2 minutes or less" threshold I'm currently try to hold to at this point. If you want to do my research for me, that'd be great! :) Otherwise, I'll go code spelunking in the new year.
Comment 34•11 years ago
|
||
Comment on attachment 8348624 [details] [diff] [review]
detect-libc-filename.patch
I'd rather find a way that doesn't involve configure.
Attachment #8348624 -
Flags: review?(mh+mozilla) → feedback-
Comment 35•11 years ago
|
||
Comment on attachment 8350595 [details] [diff] [review]
Loading a.out as the first choice of libc
Review of attachment 8350595 [details] [diff] [review]:
-----------------------------------------------------------------
Try says this doesn't work so well, so you may want to try to put a.out last and remove libc.so.6.
Attachment #8350595 -
Flags: review?(nfroyd)
Attachment #8350595 -
Flags: review?(mh+mozilla)
Attachment #8350595 -
Flags: feedback-
Comment 36•11 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #34)
> Comment on attachment 8348624 [details] [diff] [review]
> detect-libc-filename.patch
>
> I'd rather find a way that doesn't involve configure.
The problem is that the FFI does dlopen(3) of the libc file. What the the file actually is depends completely on the linker at build time. (eg what file is used when linking with -lc). The libc file name is a constant that does not change once the application is linked. Detecting this from configure is IMHO the simplest way to do it reliably.
If not using configure for this, then I think the only sane way to do it is to parse the ELF header in a known binary (for example the firefox binary itself) and read the NEEDED section and find out what libc the firefox binary was actually linked against.
The information is there so there is no need to blindly shoot at a guess list of candidates and hope something hits. There is no guarantee that you will hit the correct libc or any libc at all.
Comment 37•11 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #29)
> Can't we use the special "a.out" instead of libc?
No. a.out has not really anything to do with libc. http://en.wikipedia.org/wiki/A.out
Comment 38•11 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #35)
> Comment on attachment 8350595 [details] [diff] [review]
> Loading a.out as the first choice of libc
>
> Review of attachment 8350595 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Try says this doesn't work so well, so you may want to try to put a.out last
> and remove libc.so.6.
NACK. There are no sane systems that calls their shared libc for a.out.
Also, on most current GNU/Linux'es the correct file to open will be libc.so.6. The correct filename depends on system/toolchain.
Comment 39•11 years ago
|
||
(In reply to Natanael Copa from comment #38)
> (In reply to Mike Hommey [:glandium] from comment #35)
> > Comment on attachment 8350595 [details] [diff] [review]
> > Loading a.out as the first choice of libc
> >
> > Review of attachment 8350595 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >
> > Try says this doesn't work so well, so you may want to try to put a.out last
> > and remove libc.so.6.
>
> NACK. There are no sane systems that calls their shared libc for a.out.
>
> Also, on most current GNU/Linux'es the correct file to open will be
> libc.so.6. The correct filename depends on system/toolchain.
Picking the last message to reply to numerous statements that have been made through this bug.
I have the feeling there's some kind of over-engineering going on. You cannot sanely try to cope with every single operating system weirdness you will find out there. This is a never-ending chase.
I suggest going the easy and straightforward way: just try to open "libc.so" and MacOS X' equivalent and fail blatantly if you cannot (I personally don't know any OS that calls his libc "a.out"):
- First, this will avoid failing in an awkward manner which may be difficult to debug.
- Second, each maintainer probably has a better insight to deal with his own OS weirdness (they may even have already the bits in place to work around this problem: FreeBSD allows to redirect a dlopen(2)'ed file to another another file [1]).
Performing some introspection on Firefox' own binary (that is, looking for DT_NEEDED entries and trying to match the right one) might be appealing, but you have to keep in mind to honor the ELF spec and the system's implementation specificities: DT_NEEDED entries contains only the basename of the shared library, you have to mimic all the runtime linker's ways to look up the right path ($LD_LIBRARY_PATH, the binary's DT_RPATH, the system config), **and do it in the right order**. Otherwise you may end up loading the wrong library and fail in an awkward manner again.
If you really want to relieve maintainers from dealing with this problem as much as possible, then one method you may try is to make Firefox execute ldd(1) on itself, capture the output and try to match the right library; that way you delegate the path dance to the system'd runtime linker. You have to cope with the different format of ldd(1) output out there though; I don't even know if it exists everywhere. But in that case you can still fall back to the straightforward method.
[1] http://www.freebsd.org/cgi/man.cgi?query=libmap.conf
-- Jeremie
Comment 40•11 years ago
|
||
(In reply to Jeremie Le Hen from comment #39)
> I suggest going the easy and straightforward way: just try to open "libc.so"
> and MacOS X' equivalent and fail blatantly if you cannot (I personally don't
> know any OS that calls his libc "a.out"):
> - First, this will avoid failing in an awkward manner which may be difficult
> to debug.
I agree that its better to fail early.
> - Second, each maintainer probably has a better insight to deal with his own
> OS weirdness (they may even have already the bits in place to work around
> this problem: FreeBSD allows to redirect a dlopen(2)'ed file to another
> another file [1]).
>
>
> Performing some introspection on Firefox' own binary (that is, looking for
> DT_NEEDED entries and trying to match the right one) might be appealing, but
> you have to keep in mind to honor the ELF spec and the system's
> implementation specificities:
[snipping some insightsful rows]
> If you really want to relieve maintainers from dealing with this problem as
> much as possible, then one method you may try is to make Firefox execute
> ldd(1) on itself, capture the output and try to match the right library;
> that way you delegate the path dance to the system'd runtime linker. You
> have to cope with the different format of ldd(1) output out there though; I
> don't even know if it exists everywhere. But in that case you can still
> fall back to the straightforward method.
Talked with a friend and he said "why do they need to dlopen()? why not just use dlsym(RTLD_DEFAULT, ...)"
And this is a really a good point. Theoretically we don't need open it since it already is open. It also looks that calling dlopen(0,...) will return a handle to the global symbol table[1].
I think that the proper solution would be to use that rather than trying to again try dlopen the libc. It would save us from lots of headache.
Maybe something in the direction of:
--- a/nsprpub/pr/src/linking/prlink.c
+++ b/nsprpub/pr/src/linking/prlink.c
@@ -795,17 +795,21 @@ pr_LoadLibraryByPathname(const char *nam
/* ensure the file exists if it contains a slash character i.e. path */
/* DARWIN's dlopen ignores the provided path and checks for the */
/* plain filename in DYLD_LIBRARY_PATH */
if (strchr(name, PR_DIRECTORY_SEPARATOR) == NULL ||
PR_Access(name, PR_ACCESS_EXISTS) == PR_SUCCESS) {
h = dlopen(name, dl_flags);
}
#else
- h = dlopen(name, dl_flags);
+ if (strcmp(name, "RTLD_GLOBAL") == 0) {
+ h = dlopen(0, dl_flags); /* or: h = RTLD_DEFAULT; */
+ } else {
+ h = dlopen(name, dl_flags);
+ }
#endif
#elif defined(USE_HPSHL)
int shl_flags = 0;
shl_t h;
/*
* Use the DYNAMIC_PATH flag only if 'name' is a plain file
* name (containing no directory) to match the behavior of
[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html
![]() |
||
Comment 41•11 years ago
|
||
(In reply to Natanael Copa from comment #40)
> And this is a really a good point. Theoretically we don't need open it since
> it already is open. It also looks that calling dlopen(0,...) will return a
> handle to the global symbol table[1].
>
> I think that the proper solution would be to use that rather than trying to
> again try dlopen the libc. It would save us from lots of headache.
Just FYI, NSPR does this:
http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/linking/prlink.c#166
and assigns that handle to the special name "a.out":
http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/linking/prlink.c#192
So that's why we've been talking about using "a.out". If you'd like to try the patch that David pushed to Try that uses "a.out", please do! That would be a useful data point.
But "a.out" doesn't work on Android. So it's not a fully general solution, hence Mike's comment 35.
Comment 42•11 years ago
|
||
(In reply to Nathan Froyd (:froydnj) (AFKish through 3 January 2014) from comment #41)
> (In reply to Natanael Copa from comment #40)
> > And this is a really a good point. Theoretically we don't need open it since
> > it already is open. It also looks that calling dlopen(0,...) will return a
> > handle to the global symbol table[1].
> >
> > I think that the proper solution would be to use that rather than trying to
> > again try dlopen the libc. It would save us from lots of headache.
>
> Just FYI, NSPR does this:
>
> http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/linking/prlink.
> c#166
>
> and assigns that handle to the special name "a.out":
>
> http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/linking/prlink.
> c#192
Oh.. that explains alot. The use 'a.out' made me think people were going nuts but it all makes sense now. Sorry.
> So that's why we've been talking about using "a.out". If you'd like to try
> the patch that David pushed to Try that uses "a.out", please do! That would
> be a useful data point.
I expect it to just work, but I will test it.
> But "a.out" doesn't work on Android. So it's not a fully general solution,
> hence Mike's comment 35.
Well, the behavior is mentioned in POSIX[1] so either Android is not POSIX compliant (and needs special handling anyways) or is broken and should be fixed. I assume dlopen'ing libc.so.6 won't work on Android either.
I believe this is the direction we want to go.
[1] http://pubs.opengroup.org/onlinepubs/000095399/functions/dlopen.html
Comment 43•11 years ago
|
||
(In reply to Natanael Copa from comment #38)
> (In reply to Mike Hommey [:glandium] from comment #35)
> > Comment on attachment 8350595 [details] [diff] [review]
> > Loading a.out as the first choice of libc
> >
> > Review of attachment 8350595 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >
> > Try says this doesn't work so well, so you may want to try to put a.out last
> > and remove libc.so.6.
>
> NACK. There are no sane systems that calls their shared libc for a.out.
>
> Also, on most current GNU/Linux'es the correct file to open will be
> libc.so.6. The correct filename depends on system/toolchain.
After gaining insight in the special handling of 'a.out', I'd like to revert my NACK to an ACK. Sorry for my ignorance.
Assignee | ||
Comment 44•11 years ago
|
||
Here's a trivial patch that seems to do the trick.
Try: https://tbpl.mozilla.org/?tree=Try&rev=8290759c67ce
Attachment #757453 -
Attachment is obsolete: true
Attachment #8348624 -
Attachment is obsolete: true
Attachment #8350595 -
Attachment is obsolete: true
Attachment #8358332 -
Flags: review?(mh+mozilla)
Comment 45•11 years ago
|
||
Comment on attachment 8358332 [details] [diff] [review]
Loading a.out as the first choice of libc, v2
Review of attachment 8358332 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/osfile/modules/osfile_unix_allthreads.jsm
@@ +40,5 @@
>
> // Open libc
> let libc;
> +let libc_candidates = [ "libc.so",
> + "a.out" ];
Does that work on mac?
Attachment #8358332 -
Flags: review?(mh+mozilla) → review+
Assignee | ||
Comment 46•11 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #45)
> Does that work on mac?
According to my tests, it does.
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Comment 47•11 years ago
|
||
Assignee: nobody → dteller
Keywords: checkin-needed
Comment 48•11 years ago
|
||
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Comment 49•11 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from comment #46)
> (In reply to Mike Hommey [:glandium] from comment #45)
> > Does that work on mac?
>
> According to my tests, it does.
Looks like it does not at least for the 32bit mode. There is bug 960509 on file now.
Assignee | ||
Comment 50•11 years ago
|
||
Looks like we should readd "libSystem.B.dylib" as last item of the list.
Updated•2 years ago
|
Product: Toolkit → Toolkit Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•