Closed Bug 14676 Opened 21 years ago Closed 20 years ago
dlopen() bug in FreeBSD 3.x causes problems with components, also impacts Purify on Solaris.
FreeBSD 3.x has a bug in its implementation of dlopen(). If the same C++ symbol is exported from multiple libraries opened with dlopen(), every library picks up the version of the symbol present in the *first* library opened with dlopen(). See <http://www.freebsd.org/cgi/query-pr.cgi?pr=12438> for details on this FreeBSD bug.

This bug, naturally, wreaks havoc on Mozilla components; both bug 6323 and bug 13154 seem to have been due to it, though in each case it's been specifically fixed (or perhaps worked around) by declaring the offending symbols static. However, there is enough copy-and-pasted code in Mozilla that tracking down every multiply-defined symbol seems an impossible task. (I don't currently have any specific behavioral bugs that can be traced to this, though, now that these two bugs have been fixed; so QA can probably take a break on this one.)

The behavior of dlopen() has been changed in FreeBSD 4.0-CURRENT; I think this is why neither of the above two bugs was apparently present on 4.0-CURRENT machines.

The correct workaround for this bug, I think, is to force Mozilla to be compiled with --enable-low-fat on FreeBSD. This flag strips all symbols from the component binaries except the four that are required for components (and are looked up with dlsym()). Note that on FreeBSD versions prior to 3.3 (or 4.0-CURRENT), you'll want to apply the fix described in bug 14241 to --enable-low-fat, or dlopen() will spew huge amounts of debugging information.
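For anyone trying to picture the failure mode described above, here is a rough sketch in Python. It is only a toy model of symbol lookup, not the actual FreeBSD runtime linker algorithm. The library names (libfoo, libbar), the symbol name (GetHelper), and the "own library first" order used for contrast are all hypothetical and were picked purely for illustration; the report above does not say what lookup order 4.0-CURRENT uses.

```python
# Toy model of dlopen() symbol resolution. This is NOT the real rtld code;
# every name below (libfoo, libbar, GetHelper) is hypothetical.

class Library:
    def __init__(self, name, exported, local=None):
        self.name = name
        self.exported = dict(exported)   # symbols visible to other libraries
        self.local = dict(local or {})   # "static" symbols, never exported


class Loader:
    def __init__(self, global_first):
        # global_first=True models the behaviour reported for FreeBSD 3.x:
        # the first library opened wins for every exported symbol.
        self.global_first = global_first
        self.opened = []

    def dlopen(self, lib):
        self.opened.append(lib)
        return lib

    def resolve(self, lib, symbol):
        """Return the definition that code inside `lib` ends up using."""
        # A static symbol is bound inside its own library and is never
        # subject to cross-library lookup.
        if symbol in lib.local:
            return lib.local[symbol]
        if self.global_first:
            search = self.opened
        else:
            # Assumed contrast case: prefer the library's own definition.
            search = [lib] + [other for other in self.opened if other is not lib]
        for candidate in search:
            if symbol in candidate.exported:
                return candidate.exported[symbol]
        raise LookupError(symbol)


def demo(global_first, use_static):
    """Open two libraries that both define GetHelper and resolve it in each."""
    def make(name):
        syms = {"GetHelper": name + "::GetHelper"}
        if use_static:
            return Library(name, exported={}, local=syms)
        return Library(name, exported=syms)

    loader = Loader(global_first)
    foo = loader.dlopen(make("libfoo"))
    bar = loader.dlopen(make("libbar"))
    return loader.resolve(foo, "GetHelper"), loader.resolve(bar, "GetHelper")
```

With the first-opened-wins order and exported symbols, demo(True, False) hands libbar the libfoo definition, which is the collision described above. With the symbols modeled as static, demo(True, True) gives each library its own copy even under the buggy order, which is why declaring the offending symbols static worked around the earlier bugs.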
Toshok, could you help with this one?
*** Bug 20857 has been marked as a duplicate of this bug. ***
I've posted a request to the freebsd-stable mailing list asking if the fix to dlopen() could be brought into the FreeBSD -STABLE (3.x) branch, as this would fix this problem (and fixing it this way seems to be the Right Thing(tm) to do, since it is FreeBSD that is broken and not Mozilla). http://docs.freebsd.org/cgi/getmsg.cgi?fetch=937122+0+current/freebsd-stable
John D. Polstra and Jordan K. Hubbard of FreeBSD responded very quickly (within a matter of hours) and merged the dlopen() fix from -CURRENT into -STABLE, just in time for 3.4-RELEASE. http://docs.FreeBSD.org/cgi/getmsg.cgi?fetch=950810+0+current/freebsd-stable I can verify that Mozilla now works on a freshly built FreeBSD 3.4-STABLE system! The long-lasting breakage since pre-M10 is now over. Can someone else also verify this? There seem to be a bunch of other related bugs which should also be fixed, or at least affected in a positive way, by this fix. Summary: FreeBSD 3.4 fixes this. Thanks to toshok for excellent debugging skills! :) (The severity of this bug should have been blocker or critical, but it's about to be closed now, so I don't know if there's any need to change it.)
The patch submitted by email@example.com works really well on FreeBSD 3.2. I merged the patch into my tinderbox build as well, and I am now GREEN and verified. Look for bab71-131 FreeBSD 3.2. Which leads me to the question: will this patch make it into the tree, so FreeBSDers don't have to fix this by hand? pete
Pete, the correct solution for this problem was to fix FreeBSD's dlopen() implementation. With the correct implementation (as in 3.4) there is no need to patch Mozilla. While toshok's patch worked, it's still a band-aid, and won't prevent potential future problems of the same kind. The only thing you need to do is upgrade FreeBSD to 3.4-STABLE, or at least merge the dlopen() fixes from 3.4 into your FreeBSD. That will be the correct fix to this problem. Beware though, this is my opinion and I'm not authoritative.
It looks like the fix made it into 3.4-RELEASE, so even folks installing from the 3.4 CD should be fine.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Fixed (upgrade FreeBSD to 3.4).
*** Bug 17504 has been marked as a duplicate of this bug. ***
*** Bug 13154 has been marked as a duplicate of this bug. ***
*** Bug 26278 has been marked as a duplicate of this bug. ***
I'm re-opening this. This bug causes problems for Purify on Solaris such that it can't load the i18n converter stuff correctly at all. Without this getting fixed in the tree, we'll have no Unix Purify coverage on much of i18n. This is readily fixable from within Mozilla and it should be fixed.
Severity: normal → major
Status: RESOLVED → REOPENED
Component: XPCOM → Internationalization
Resolution: FIXED → ---
Target Milestone: M15 → M16
Changing summary to match problem.
Summary: dlopen() bug in FreeBSD 3.x causes problems with components → dlopen() bug in FreeBSD 3.x causes problems with components, also impacts Purify on Solaris.
Yeah, Bruce, this patch will affect absolutely nothing. It only renames some of the classes. I would love to see it in the tree. Attaching the latest diffs. pete
Ftang, you need to decide if you want to take pete's patch. It is a name change and has nothing to do with this bug. Bruce, regarding the bug, what are you seeing? What is the cause of the Purify core dump: symbols not being marked static in components?
Status: REOPENED → ASSIGNED
I'm not seeing a coredump. I'm seeing attempts to get converter streams involving mozilla/intl/uconv/ failing. The crash/coredump was resolved back in September, 1999. I only have Purify through tomorrow (!) so, when I get back home tonight, I'll apply the patch that pete attached and see if that fixes my problem. If I recall correctly from months ago, it should, but we'll see tonight.
Wait, I don't understand what the bug is now. pete's patch isn't about this bug, I think. So I don't know why you would want to try pete's patch out. Can you resummarize why you reopened the bug?
If you are getting converter streams failing, then yes, apply this patch. I would guess it should fix the problem. pete
This was never a Mozilla bug, hence it was closed. If there are the same kind of problems in Purify, Purify needs to be fixed. Mozilla just pushes the environment it is run in to the edge. The patch mentioned by pete (originally from toshok) might temporarily remedy the problem, but it is not a fix (not trying to be counterproductive, just important to emphasize this).
Per my understanding, this bug is closed. If there is a Purify problem, it should become a separate bug.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago → 20 years ago
Resolution: --- → FIXED