Closed Bug 14676 Opened 21 years ago Closed 20 years ago

dlopen() bug in FreeBSD 3.x causes problems with components, also impacts Purify on Solaris.

Categories

(Core :: Internationalization, defect, P3, major)

x86
FreeBSD
defect

RESOLVED FIXED

People

(Reporter: lennox, Assigned: dp)

Attachments

(1 file)

FreeBSD 3.x has a bug in its implementation of dlopen().  If the same C++ symbol
is exported from multiple libraries opened with dlopen(), every library picks up
the version of the symbol present in the *first* library opened with dlopen.
See <http://www.freebsd.org/cgi/query-pr.cgi?pr=12438> for details on this
FreeBSD bug.
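
For readers unfamiliar with the loading pattern involved, here is a minimal sketch, assuming a POSIX system that provides <dlfcn.h>. The helper name LoadSymbol is hypothetical and this is not Mozilla's component loader; it only illustrates the dlopen()/dlsym() sequence that the report is talking about.

```cpp
#include <cassert>
#include <cstdio>
#include <dlfcn.h>

// Hypothetical helper, not Mozilla code: open a shared library and look up
// one symbol in it. Pass nullptr as `path` to search the running program.
// On success the library handle is stored in *handle_out (release it with
// dlclose() when done); on failure nullptr is returned and *handle_out is
// left as nullptr.
void* LoadSymbol(const char* path, const char* name, void** handle_out) {
    assert(name != nullptr && handle_out != nullptr);
    *handle_out = nullptr;

    // RTLD_LOCAL asks the loader not to make this library's symbols
    // available for resolving references in libraries opened later.
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return nullptr;
    }

    void* symbol = dlsym(handle, name);
    if (symbol == nullptr) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "unknown error");
        dlclose(handle);
        return nullptr;
    }

    *handle_out = handle;
    return symbol;
}
```

Going by the description above, the trouble on FreeBSD 3.x would not be in a lookup like this one but in how references inside each opened library get bound: if two libraries opened this way both export the same C++ symbol, both end up using the definition from whichever was opened first.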

This bug, naturally, wreaks havoc on Mozilla components; both bug 6323 and bug
13154 seem to have been due to it, though in each case it's been specifically
fixed (or perhaps worked around) by declaring offending symbols static.
However, there is enough copy-and-pasted code in Mozilla that tracking down every
multiply-defined symbol seems an impossible task.  (I don't currently have any
specific behavioral bugs that can be tracked to this, though, now that these two
bugs have been fixed; so QA can probably take a break on this one.)
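
To illustrate the "declare it static" workaround mentioned above, here is a small sketch. All names in it are made up for the example and do not come from the Mozilla tree. The point is only that a helper with internal linkage is not exported from the shared object, so a same-named helper in another component cannot collide with it at load time.

```cpp
#include <cassert>
#include <cstddef>

// Internal linkage: `static` keeps this helper private to the translation
// unit, so it is not exported and cannot be confused with a copy-and-pasted
// helper of the same name in another component.
static char ToUpperAscii(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

// External linkage: a hypothetical entry point that other code would find
// with dlsym(). Uppercases ASCII letters in place and returns how many
// characters were changed.
extern "C" std::size_t ExampleComponent_Uppercase(char* text) {
    assert(text != nullptr);
    std::size_t changed = 0;
    for (char* p = text; *p != '\0'; ++p) {
        const char upper = ToUpperAscii(*p);
        if (upper != *p) {
            *p = upper;
            ++changed;
        }
    }
    return changed;
}
```

In C++ an unnamed namespace gives the same internal linkage and also works for classes, which `static` does not cover.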

The behavior of dlopen() has been changed in FreeBSD 4.0-CURRENT; I think this
is why neither of the above two bugs was apparently present on 4.0-CURRENT
machines.

The correct workaround for this bug, I think, is to force Mozilla to be compiled
with --enable-low-fat on FreeBSD.  This flag strips all symbols except the four
that are required for components (and are opened with dlsym()) from the
component binaries.

Note that on FreeBSD versions prior to 3.3 (or 4.0-CURRENT), you'll want to
apply the fix described in bug 14241 to --enable-low-fat or dlopen() will spew
huge amounts of debugging information.
QA Contact: beppe → dp
Status: NEW → ASSIGNED
Target Milestone: M15
Toshok, could you help with this one?
Bug 20857, bug 13154 and bug 17504 all seem to depend on (or duplicate) this one.
*** Bug 20857 has been marked as a duplicate of this bug. ***
I've posted a request to freebsd-stable mailing list asking if the fix to
dlopen() could be brought into the FreeBSD -STABLE (3.x) branch, as this would
fix this problem (and fixing it this way seems to be the Right Thing(tm) to do
since it is FreeBSD that is broken and not Mozilla).

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=937122+0+current/freebsd-stable
John D. Polstra and Jordan K. Hubbard of FreeBSD responded very quickly (matter
of hours) and merged the dlopen() fix from -CURRENT into -STABLE, just in time
before 3.4-RELEASE.

http://docs.FreeBSD.org/cgi/getmsg.cgi?fetch=950810+0+current/freebsd-stable

I can verify that Mozilla now works on a freshly built FreeBSD 3.4-STABLE
system! The long-lasting breakage since pre-M10 is now over.

Can someone else also verify this?

There seem to be a bunch of other related bugs which also should be fixed or
affected in a positive way by this fix.

Summary: FreeBSD 3.4 fixes this.

Thanks to toshok for excellent debugging skills! :)

(The severity of this bug should have been blocker or critical, but it's about
to be closed now so I don't know if there's any need to change it.)
The patch submitted by toshok@hungry.com works really well on FreeBSD 3.2.

I merged the patch into my tinderbox build as well, and I am now GREEN and
verified.

Look for bab71-131 FreeBSD 3.2.

Which leads me to the question:
will this patch make it into the tree,
so FreeBSD users don't have to fix this by hand?

pete
Pete, the correct solution for this problem was to fix FreeBSD's dlopen()
implementation. With the correct implementation (as in 3.4) there is no need to
patch Mozilla. While toshok's patch was working, it's still a band-aid, and won't
prevent potential future problems of the same kind.

The only thing you need to do is to upgrade FreeBSD to 3.4-STABLE, or at least
to merge the dlopen() fixes from 3.4 into your FreeBSD. That will be the correct
fix to this problem. Beware though, this is my opinion and I'm not
authoritative.
It looks like the fix made it into 3.4-RELEASE, so even folks installing from
the 3.4 CD should be fine.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Fixed (upgrade FreeBSD to 3.4).
*** Bug 17504 has been marked as a duplicate of this bug. ***
*** Bug 13154 has been marked as a duplicate of this bug. ***
*** Bug 26278 has been marked as a duplicate of this bug. ***
I'm re-opening this.  This bug causes problems for Purify on Solaris such that
it can't load the i18n converter stuff correctly at all.  Without this getting
fixed in the tree, we'll have no Unix Purify coverage on much of i18n.  This is
readily fixable from within Mozilla and it should be fixed.
Severity: normal → major
Status: RESOLVED → REOPENED
Component: XPCOM → Internationalization
Resolution: FIXED → ---
Target Milestone: M15 → M16
Changing summary to match problem.
Summary: dlopen() bug in FreeBSD 3.x causes problems with components → dlopen() bug in FreeBSD 3.x causes problems with components, also impacts Purify on Solaris.
Yea Bruce, this patch will affect absolutely nothing. It only renames some of
the classes. I would love to see it in the tree. Attaching the latest diffs.

pete
Attached patch: latest diffs
Ftang you need to decide if you want to take pete's patch. It is a name change
and has nothing to do with this bug.

Bruce, regarding the bug, what are you seeing? What is the cause of the Purify
core dump: symbols not being marked static in components?
Status: REOPENED → ASSIGNED
I'm not seeing a coredump. I'm seeing attempts to get converter streams
involving mozilla/intl/uconv/ failing.  The crash/coredump was resolved back in
September, 1999.  I only have Purify through tomorrow (!) so, when I get back
home tonight, I'll apply the patch that pete attached and see if that fixes my
problem.  If I recall correctly from months ago, it should, but we'll see
tonight.
Wait, I don't understand what the bug is now. pete's patch isn't about this bug,
I think. So I don't know why you would want to try pete's patch out. Can you
resummarize why you reopened the bug?
If you are seeing converter streams failing, then yes, apply this patch.

I would guess it should fix the problem.


pete
This was never a Mozilla bug, hence it was closed.

If there are the same kind of problems in Purify, Purify needs to be fixed.
Mozilla just pushes the environment it is run in to the edge.

The patch mentioned by pete (originally from toshok) might temporarily remedy
the problem, but it is not a fix (not trying to be counterproductive, just
important to emphasize this).
Per my understanding, this bug is closed. If there is a Purify problem, it
should become a separate bug.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED