Open Bug 650772 Opened 14 years ago Updated 3 years ago

Strip soname version in PR_LoadLibrary() on OpenBSD

Categories

(NSPR :: NSPR, defect)

All
OpenBSD
defect

Tracking

(Not tracked)

People

(Reporter: gaston, Unassigned)

References

Details

Attachments

(1 file, 1 obsolete file)

Currently, i have a bunch of local patches (for widget/src/qt/nsSound.cpp, widget/src/gtk2/nsBidiKeyboard.cpp, widget/src/gtk2/nsScreenManagerGtk.cpp, etc for most PR_LoadLibrary() calls removing the .X version from the lib name parameter passed as argument. ie: - gtklib = PR_LoadLibrary("libgtk-x11-2.0.so.0"); + gtklib = PR_LoadLibrary("libgtk-x11-2.0.so"); This is needed for two reasons: - OpenBSD doesn't install libs with symlinks like Linux and other Unices do, libfoo.so.X doesn't exist. We only install libfoo.so.X.Y, where X.Y comes from the ports tree infrastructure. - There can be libfoo.so.2.0 and libfoo.so.1.1 installed at the same time, so OpenBSD's loader (ld.so) takes care of loading the correct library when dlopen() is asked for libfoo.so. Of course all my local patches for removing the .X from PR_LoadLibrary parameter are not upstreamable, so i was wondering if pr_LoadLibraryByPathname() in pr/src/linking/prlink.c could be amended (of course within #ifdef __OpenBSD__) so that anything after .so was stripped out, likely in the #if defined(XP_UNIX) block right before dlopen() call. I'd like an opinion from someone used to NSPR internal, and if that's positive i'll start working on a patch implementing that.
Blocks: openbsdmeta
To make a demo of the actual problem, here's a little C program and its associated result on OpenBSD (of course, the library is libXinerama.so.5.0, but the version changes over time...) - PR_LoadLibrary("libXinerama.so.1") code stolen from widget/src/gtk2/nsScreenManagerGtk.cpp : $cat nsprdlopen.c #include <nspr/prtypes.h> #include <nspr/prlink.h> #include <stdio.h> int main() { PRLibrary *mXineramalib; mXineramalib = PR_LoadLibrary("libXinerama.so"); if (mXineramalib) printf("Success!!\n"); else printf ("LoadLibrary failed\n"); mXineramalib = PR_LoadLibrary("libXinerama.so.1"); if (mXineramalib) printf("Success!!\n"); else printf ("LoadLibrary failed\n"); mXineramalib = PR_LoadLibrary("libXinerama.so.5.0"); if (mXineramalib) printf("Success!!\n"); else printf ("LoadLibrary failed\n"); } $./nsprdlopen Success!! LoadLibrary failed Success!!
We should fix the callers of PR_LoadLibrary that need this behavior. PR_LoadLibrary should faithfully pass its pathname argument to dlopen. In your nsprdlopen.c test program, it seems that you should test "libXinerama.so.5" instead of "libXinerama.so.1" in the second PR_LoadLibrary call. It seems that if OpenBSD's loader (ld.so) is willing to handle a libfoo.so argument intelligently, it should also be willing to handle a libfoo.so.X argument intelligently to make porting to OpenBSD easier.
(In reply to comment #2) > We should fix the callers of PR_LoadLibrary that need this behavior. > PR_LoadLibrary should faithfully pass its pathname argument to > dlopen. That looked easier to me to 'fix' PR_LoadLibrary() only in one place, instead of fixing the multiple callers, and also the callers that might be added in the future. As of now, theres ~15 callers (in OpenBSD, 11 of them are patched to remove the .X) with a fixed version number which doesn't match OpenBSD, and lots of others where the libSpec comes from a variable. > In your nsprdlopen.c test program, it seems that you should test > "libXinerama.so.5" instead of "libXinerama.so.1" in the second > PR_LoadLibrary call. If i try to load libXinerama.so.5, it works. But the current code in nsScreenManagerGtk.cpp tries to load libXinerama.so.1, which doesnt exist. Problem is, the .X changes over time when libraries are updated, be it for Xinerama, gtk, gnome-2, Xss, etc.. and you don't control it. It doesn't make sense to track the 'real version' in OpenBSD, so falling back to just looking for .so without version looks simpler. > It seems that if OpenBSD's loader (ld.so) is willing to handle > a libfoo.so argument intelligently, it should also be willing > to handle a libfoo.so.X argument intelligently to make porting > to OpenBSD easier. What do you suggest ? Fixing all callers to do +#ifdef __OpenBSD__ +PR_LoadLibrary("libfoo.so") +#else PR_LoadLibrary("libfoo.so.X") +#endif I find that rather ugly. stripping whatever is after .so in the argument passed to PR_LoadLibrary() looks simpler to me. Or maybe right after the current dlopen call(), in a #ifdef __OpenBSD__ block, if dlopen() failed strip the libSpec, and retry dlopen() ?
Can you cd into the directory where libXinerama.so.5.0 is installed, and run the command ls -l libXinerama.so* ? What is the output?
Seems i haven't made myself clear enough. For each library installed on the system, there's libfoo.a, libfoo.la, and libfoo.so.X.Y, where you can't know in advance X.Y since it changes over time when libs are updated (ie major bumped when functions are removed/abi changes, minor bumped when functions are added). The loader is smart in the sense that it will open libfoo.so.X.Y if asked for libfoo.so or libfoo.so.X. And there's no symlink with so.X -> so.X.Y.Z between libs like linux does. /usr/X11R6/lib/libXinerama.a /usr/X11R6/lib/libXinerama.la /usr/X11R6/lib/libXinerama.so.5.0 /usr/local/lib/libgtk-x11-2.0.a /usr/local/lib/libgtk-x11-2.0.la /usr/local/lib/libgtk-x11-2.0.so.2200.0
Here's a first version of a patch implementing this. If the first dlopen() fails, we strip major.minor from name, and retry dlopen() with the stripped name. With that patch, PR_LoadLibrary("libXinerama.so.1") is now successful, even if only libXinerama.so.5.0 is present on the system. The previous testcase coverage works. What's your opinion ?
Attachment #530126 - Flags: review?(wtc)
Better patch putting the strncpy outside of the if(!h) block, so that we ensure sname is set before unconditionally being assigned later to lm->name.
Attachment #530126 - Attachment is obsolete: true
Attachment #530126 - Flags: review?(wtc)
Attachment #530129 - Flags: review?(wtc)
Landry, there is something wrong with your approach. When "libfoo.so.2.0 and libfoo.so.1.1 installed at the same time", which one will be loaded by [dlopen(libfoo.so)] ? Dynloader can't choose, that's application responsibility. Application must supply major version, because major change means loss of backward compatibility. It addition, with your approach you have to patch all applications which do versioned dynamic library loading, not only NSPR/Mozilla. Have you ?
(In reply to comment #8) > Landry, there is something wrong with your approach. > > When "libfoo.so.2.0 and libfoo.so.1.1 installed at the same time", which one > will be loaded by [dlopen(libfoo.so)] ? I'm not the one who wrote ld.so, but from my understanding dlopen("libfoo.so") is smart in the sense that it opens the lib with the highest major.minor, which is what the user generally wants. For a binary, ld.so loads the library registered in the binary's dependencies (objdump -p | grep NEEDED) It's quite rare that several versions of the same lib are present, that's only when a binary not yet upgraded by the pkg tools still needs the 'old' version registered in its dependencies. In the case of dlopen(), that's different. See for reference : http://www.openbsd.org/cgi-bin/man.cgi?query=dlopen and http://www.openbsd.org/cgi-bin/man.cgi?query=ld.so > Dynloader can't choose, that's application responsibility. Application must > supply major version, because major change means loss of backward > compatibility. That's why on OpenBSD, major.minor is overridden and controlled by the ports infrastructure. So whatever happens, the approach of 'supply the major version of the lib we want to dlopen()' in PR_LoadLibrary() won't work. > It addition, with your approach you have to patch all applications which do > versioned dynamic library loading, not only NSPR/Mozilla. Have you ? That's what the infrastructure of the portstree does (ie we have our own libtool which takes care of overriding the versionning with the one set in the portstree), and there are some patches in it to fix the rare offending dlopen() calls. For example, i have 7 or 8 patches in each of the mozilla ports, which would go away if PR_LoadLibrary() is 'fixed' by the patch.
(In reply to comment #9) > (In reply to comment #8) > > When "libfoo.so.2.0 and libfoo.so.1.1 installed at the same time", which one > > will be loaded by [dlopen(libfoo.so)] ? > > I'm not the one who wrote ld.so, but from my understanding dlopen("libfoo.so") > is smart in the sense that it opens the lib with the highest major.minor, which > is what the user generally wants. > > For a binary, ld.so loads the library registered in the binary's dependencies > (objdump -p | grep NEEDED) You are confusing load-time and run-time dynamic linking. 1) load-time: object deps are recorded in the binary (objdump ...), resolved during object load. No need not to dlopen() at run-time. 2) run-time: application invokes dlopen(libfoo). In this case, object binary has no records about foo (otherwise, foo would already be loaded). Only application knows (in it's mind), which version of foo it wants. > ... dlopen("libfoo.so") ... opens the lib with the highest major.minor Incorrect. If application is designed to work with v1.0, and you put v2.0 under, the application will most likely crash. > It's quite rare that several versions of the same lib are present, that's only > when a binary not yet upgraded by the pkg tools still needs the 'old' version > registered in its dependencies. Incorrect. Some applications will upgrade to the foo.2.0 faster, some - slower, some - will not upgrade at all, staying with v1.0. You can not force all applications in the system to use the single version of the library. In addition, there are closed-source applications, which you can't even patch and recompile. What will you do, if such application [dlopen(libfoo.so.1)] ?
(In reply to comment #10) > (In reply to comment #9) > > (In reply to comment #8) > > > When "libfoo.so.2.0 and libfoo.so.1.1 installed at the same time", which one > > > will be loaded by [dlopen(libfoo.so)] ? > > > > I'm not the one who wrote ld.so, but from my understanding dlopen("libfoo.so") > > is smart in the sense that it opens the lib with the highest major.minor, which > > is what the user generally wants. > > > > For a binary, ld.so loads the library registered in the binary's dependencies > > (objdump -p | grep NEEDED) > > You are confusing load-time and run-time dynamic linking. I know the difference, thanks. > 1) load-time: object deps are recorded in the binary (objdump ...), resolved > during object load. No need not to dlopen() at run-time. > > 2) run-time: application invokes dlopen(libfoo). In this case, object binary > has no records about foo (otherwise, foo would already be loaded). Only > application knows (in it's mind), which version of foo it wants. And in that particular case, it asks for a version of foo which will never match what is currently installed, since the versions are different on OpenBSD. > > ... dlopen("libfoo.so") ... opens the lib with the highest major.minor > > Incorrect. If application is designed to work with v1.0, and you put v2.0 > under, the application will most likely crash. > > > It's quite rare that several versions of the same lib are present, that's only > > when a binary not yet upgraded by the pkg tools still needs the 'old' version > > registered in its dependencies. > > Incorrect. Some applications will upgrade to the foo.2.0 faster, some - slower, > some - will not upgrade at all, staying with v1.0. You can not force all > applications in the system to use the single version of the library. That's what we do in OpenBSD, ie the portstree controls/adapts ports when needed, and generally there's only a single version of the lib installed. And it works fine so far. > In addition, there are closed-source applications, which you can't even patch > and recompile. What will you do, if such application [dlopen(libfoo.so.1)] ? Closed source applications providing binaries for OpenBSD ? gimme a break.. like if adobe was going to provide a binary flash player for the bsds. Whatever does dlopen(libfoo.so.X) will break anyway, since the libfoo.so.X they expect _doesn't exist_, but is rather libfoo.so.Y.Z where Y.Z is a moving target (ie bumped when API changes). As of now, the model of asking for a particular version of libfoo.so works on other OSes (with the upstream versionning kept, and the symlink maze), but not on OpenBSD. Like it or not, things are this way, that's a particularity, and we have to live with it, hence actually patching all callers of PR_LoadLibrary(). When opening that issue, my only goal was to simplify the patching currently done to have things working ootb. I'm not giving an opinion on different models, i only want to ease maintenance of mozilla ports on OpenBSD and push my patches upstream in a portable way.
Comment on attachment 530129 [details] [diff] [review] If dlopen() fails on OpenBSD retry with major.minor stripped after .so Landry: thank you for the patch and the explanation in comment 5. The correct fix is for an application to load libXinerama.so.5 on OpenBSD. It is wrong for Mozilla to assume that the file name is libXinerama.so.1 on all Unix platforms: http://mxr.mozilla.org/mozilla-central/source/widget/src/gtk2/nsScreenManagerGtk.cpp#177 It seems that the OpenBSD libXinerama.so package owner created a porting problem for OpenBSD by using a different library major version from other Unix platforms (5 vs. 1).
Attachment #530129 - Flags: review?(wtc) → review-
(In reply to comment #12) > Comment on attachment 530129 [details] [diff] [review] [review] > If dlopen() fails on OpenBSD retry with major.minor stripped after .so > > Landry: thank you for the patch and the explanation in comment 5. > > The correct fix is for an application to load libXinerama.so.5 > on OpenBSD. It is wrong for Mozilla to assume that the file > name is libXinerama.so.1 on all Unix platforms: > http://mxr.mozilla.org/mozilla-central/source/widget/src/gtk2/ > nsScreenManagerGtk.cpp#177 The issue with that is that we would need to fix ALL PR_LoadLibrary() callers, and that EVERYTIME the major changes. It doesn't happen often for libXinerama, but happens quite often for some libs (ie happens at each gtk 'major' update for example). I wouldnt want to have to reopen a bug report each time a library loaded by mozilla changes in the ports-tree. > It seems that the OpenBSD libXinerama.so package owner created > a porting problem for OpenBSD by using a different library major > version from other Unix platforms (5 vs. 1). It's not a package, it's in the basesystem. OpenBSD follows strict rules regarding ABI/API changes and library versionning, other oses don't follow the same rules.
So, since all sane and simple approaches were rejected, i guess it's time to create a new bug report patching all offending PR_LoadLibrary() callers ? Ie for example in gfx/thebes/GLContextProviderGLX.cpp : +#ifdef __OpenBSD__ + const char *libGLfilename = "libGL.so"; +#else const char *libGLfilename = "libGL.so.1"; +#endif That's fugly, but since it seems to be what you want.... and NO, i won't add a major number, since it's prone to change at each library update.
Ok, i've discussed a bit more this issue with our developers, and our dlopen works in two cases : - give it the full path of the lib : "/usr/X11R6/lib/libGL.so.12.0", where of course 12.0 is not guaranteed over time, and doesn't necessarly mean the same thing as libGL.so.1 on linux. - give it "libGL.so" only, and it will figure out which library to open. If you give it libGL.so.1, it won't find it and error out. Fixing all the callers of PR_LoadLibrary() is a real hassle, and all the OpenBSD developers i've talked to about that issue agree with me that given our dlopen() works, the best way to fix that is in PR_LoadLibrary itself, striping what's passed after .so. Wan-Teh, could you please reconsider your opinion on the patch ? I've started fixing the callers, and so far the patches (bug #667325 and bug #687320) were ignored/waiting in bugzilla. Given that the patch only touches OpenBSD, there can't be a regression for other oses, and only improvements for us.
Attachment #530129 - Flags: review- → review?(wtc)
Comment on attachment 530129 [details] [diff] [review] If dlopen() fails on OpenBSD retry with major.minor stripped after .so Asking for ted's feedback, based on suggestions from other folks..
Attachment #530129 - Flags: review?(wtc) → review?(ted.mielczarek)
Comment on attachment 530129 [details] [diff] [review] If dlopen() fails on OpenBSD retry with major.minor stripped after .so Landry: I am sorry I cannot take this patch. PR_LoadLibrary is the cross-platform abstraction of dlopen. Your asking me to change PR_LoadLibrary would be like my asking the dlopen maintainers to change dlopen. NSPR is not used by just Mozilla products. Other NSPR clients may not want this automatic retry behavior. I am sure I would get the same answer from the dlopen maintainers.
We should make this WONTFIX, I believe. I agree that magic retry behavior in NSPR is not the right solution, and the calling code should be modified.
Your call.(In reply to Wan-Teh Chang from comment #17) > Comment on attachment 530129 [details] [diff] [review] > If dlopen() fails on OpenBSD retry with major.minor stripped after .so > > Landry: I am sorry I cannot take this patch. > > PR_LoadLibrary is the cross-platform abstraction of dlopen. > Your asking me to change PR_LoadLibrary would be like my asking > the dlopen maintainers to change dlopen. NSPR is not used > by just Mozilla products. Other NSPR clients may not want > this automatic retry behavior. I am sure I would get the > same answer from the dlopen maintainers. In our case, the dlopen maintainers on OpenBSD agreed that fixing that in PR_LoadLibrary was the way to go. Oh well, your call.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #18) > We should make this WONTFIX, I believe. I agree that magic retry behavior in > NSPR is not the right solution, and the calling code should be modified. Well.. so we should sprinkle ugly #ifdef __OpenBSD__ around all the PR_LoadLibrary() callers ? don't you agree this is just looking plain wrong ? And given the lack of feedback i had in bug #667325 and bug #687320, do you realize how this is a painful task for me ?
Comment on attachment 530129 [details] [diff] [review] If dlopen() fails on OpenBSD retry with major.minor stripped after .so Deferring to wtc here.
Attachment #530129 - Flags: review?(ted.mielczarek)

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: wtc → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: