Closed Bug 440714 Opened 12 years ago Closed 12 years ago

Firefox 3 failed to start on Solaris with GNOME 2.23.3

Categories

(Core :: Graphics, defect, critical)

All
OpenSolaris
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: ginnchen+exoracle, Assigned: ginnchen+exoracle)

References

Details

Attachments

(2 files, 5 obsolete files)

WIth latest JDS vermillion installed, the bundled Firefox 3 could not be started.

The stack is like this,
--- called from signal handler with signal 11 (SIGSEGV) ---
 00000000 ???????? (fb38cdd8, 0, 0, fb385964, 0, ffbf7778)
 fb37c000 pixman_composite_rect_general (ffbf7778, ffbf7798, f7b79920, ffbf1700, f7b79920, 50) + ec
 fe01db7c pixman_image_composite_rect (1, 14, 0, 0, 0, 0) + c4
 fe01da7c pixman_walk_composite_region (0, 0, 0, 14, 0, 14) + cac
 fe01e618 _moz_pixman_image_composite (0, 1, 0, 14, 0, 0) + 600
 fdfcbafc _cairo_image_surface_composite (0, f78db640, 0, f8eacdf0, 0, 0) + 200
 fdfdd55c _cairo_surface_composite (f78db640, f78db640, 0, 1, 0, 0) + a0
 fdfdfa38 _composite_trap_region (0, f78db640, 14, f8eacdf0, 14, ffbfdfe8) + 1c4
 fdfdfe58 _clip_and_composite_trapezoids (f78db640, 0, f8eacdf0, ffbfe07c, 0, 1) + 328
 fdfe0044 _cairo_surface_fallback_paint (f8eacdf0, 1, f78db640, 14, 0, 14) + e4
 fdfdda34 _cairo_surface_paint (0, 0, fe7c5c28, f8eacdf0, fcdf7e47, 1) + 80

The cause is there're some global  "C" functions of pixman in libxul.so conflict with /usr/lib/libcario.so.

We have cairo-rename.h to deal with this issue.
But it doesn't define all the functions.

Also pixman-remap.h is not used by any program now. We can delete it.
Attached patch patch (obsolete) — Splinter Review
Attachment #325977 - Flags: review?(vladimir)
Attached patch patch (update to trunk) (obsolete) — Splinter Review
the removal of pixman-remap.h is done by other commits.
Attachment #325977 - Attachment is obsolete: true
Attachment #331286 - Flags: review?(shriyash_21)
Attachment #325977 - Flags: review?(vladimir)
Attachment #331286 - Flags: review?(shriyash_21) → review?(vladimir)
Comment on attachment 331286 [details] [diff] [review]
patch (update to trunk)

Weird, those symbols shouldn't even be exported.. but this is fine for now.
Attachment #331286 - Flags: review?(vladimir) → review+
Pushed
http://hg.mozilla.org/index.cgi/mozilla-central/rev/4691e4934089

BTW:
It's also reproducible with latest trunk on GNOME 2.22.x now.

For pixman-priviate.h, I didn't see anything stopping these functions being exported.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
backed out, since it failed on Mac OS X.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
For pixman, it is supposed to build with "-fvisibility=hidden", and
#define PIXMAN_EXPORT __attribute__ ((visibility("default")))
However, it's only for GCC 4.0 or higher, and it is turned off if MOZ_ENABLE_LIBXUL is not defined.

I don't know why pixman-mmx.c is broken by my patch yet.
I'll make a build to find out.
pixman-mmx.o is broken because it uses a special build rule in Makefile, which doesn't generate .deps file.

Attached patch patch v2 (obsolete) — Splinter Review
Changes:
1) rename mmx functions as well
2) Fix Makefile.in, use %.$(OBJ_SUFFIX) instead, so that
OS_COMPILE_CFLAGS $(filter-out %/.pp,-Wp,-MD,$(MDDEPDIR)/$(*F).pp) will work for these files.
Attachment #331286 - Attachment is obsolete: true
Attachment #331708 - Flags: review?(vladimir)
This is ok, the problem is that it's not maintainable -- the mmx functions and others should not be exported from any library.  If they are, that's a bug in the upstream.  There's no easy way to generate a rename script for the non-public methods in pixman, so there will always be a chance that some will be missed in the future.  Is there any way to fix the core problem without renaming?
I agree.

I don't think it's a bug in the upstream.
It's kind of 50% in the upstream C file, and 50% in our Makefile.in.

I need some time to address it.
What about we land this fix first, and file a followup bug to get rid of the root cause?

BTW: Do you think we need to take care of Mac OS X, Windows for this issue? There's unlikely a conflict of  cairo library on Mac OS X and Windows.

Blocks: 449757
Attached patch patch v3 (obsolete) — Splinter Review
Use hidden attribute for pixman functions.
This patch works for me on Solaris, Linux, Mac OS X. I have only tried debug build.

I don't know why we have
ifndef MOZ_ENABLE_LIBXUL
VISIBILITY_FLAGS =
endif

pixman will link into libthebes.so.
So far, I don't have any problem with hidden attributes for pixman functions.

I don't know how VC works for exporting symbols, but this bug is unlikely reproducible on Windows anyway.

BTW: For the dep file of pixman-mmx.c issue, I'll file another bug to discuss.
Attachment #331708 - Attachment is obsolete: true
Attachment #333738 - Flags: review?(vladimir)
Attachment #331708 - Flags: review?(vladimir)
Comment on attachment 333738 [details] [diff] [review]
patch v3

I think this is ok, but let's get benjamin to confirm -- also please send this off to the try servers to make sure that removing the visibility override in pixman doesn't break anything.
Attachment #333738 - Flags: superreview?(benjamin)
Attachment #333738 - Flags: review?(vladimir)
Attachment #333738 - Flags: review+
Comment on attachment 333738 [details] [diff] [review]
patch v3

I just found it's not enough.
If we set VISIBILITY_FLAGS for Sun Studio we will need more code like NP_VISIBILITY_DEFAULT to recover the visibility for Sun Studio.

It would be a little hard to find them all. :(
Attachment #333738 - Attachment is obsolete: true
Attachment #333738 - Flags: superreview?(benjamin)
To summarize what's the root cause and what I'm going to do:

Cause:
1) In Solaris GNOME 2.22, 2.23, libcairo exports libpixman private functions as global.
I've fixed this bug in latest trunk.
See: http://bugs.freedesktop.org/show_bug.cgi?id=17183
2) Mozilla also exports libpixman private functions as global for Sun Studio compiler or disable-libxul.
3) When linking libxul.so, MOZ_PANGO_LIBS is ahead of MOZ_CAIRO_LIBS, therefore private functions in system cairo lib are possible to be binded, if bug 1) is present.

Fixes:
1) We always need VISIBILITY_FLAGS for libpixman. Switch the order in libxul-rules.mk to get around the Solaris system library bug.

2) Use __global and __hidden for Sun Studio compiler for all Mozilla modules. In general this is helpful to make Mozilla safer on Solaris. Mozilla private functions will not be visible outside of the module.
Some bugs will be gone, e.g.
https://bugzilla.mozilla.org/show_bug.cgi?id=437041

The patch is a little big. But I think it's safe, and I've tested on try-server.

Status: REOPENED → ASSIGNED
Attached patch patch part1Splinter Review
Attachment #335330 - Flags: review?(benjamin)
Attached patch patch part2 (obsolete) — Splinter Review
For Sun Studio,
-xldscope=hidden equals to -fvisibility=hidden
__hidden equals to __attribute__((visibility("hidden")))
__global equals to __attribute__((visibility ("default")))

Sun Studio doesn't allow __hidden be followed by semicolon.
So I have to switch the order for some files.
Attachment #335331 - Flags: review?(benjamin)
Comment on attachment 335331 [details] [diff] [review]
patch part2

I'm not sure whether

NS_HIDDEN void Method(); 

is allowed in GCC, but I'm sure that

nsIID NS_HIDDEN kIID;

is an incorrect declaration. The __attribute__((visibility("hidden")) would modify the *type* in that case, and not the declaration, which is not at all what we want.
Attachment #335331 - Flags: review?(benjamin) → review-
Comment on attachment 335330 [details] [diff] [review]
patch part1

I don't think the VISIBILITY_FLAGS logic is correct: in the non-libxul configuration, how would cairo link against pixman symbols (which are in a shared library?) without turning off the visibility flags?

In any case, this needs review by a thebes/gfx peer.
Attachment #335330 - Flags: review?(benjamin) → review?(pavlov)
Comment on attachment 335330 [details] [diff] [review]
patch part1

vlad: what does this do to extensions?  anything?
Attachment #335330 - Flags: review?(pavlov) → review?(vladimir)
Comment on attachment 335330 [details] [diff] [review]
patch part1

Shouldn't do anything -- extensions shouldn't be touching pixman or cairo directly anyway.  This part of the fix looks fine to me.
Attachment #335330 - Flags: review?(vladimir) → review+
Attachment #335330 - Flags: superreview?(benjamin)
(In reply to comment #18)
> (From update of attachment 335330 [details] [diff] [review])
> I don't think the VISIBILITY_FLAGS logic is correct: in the non-libxul
> configuration, how would cairo link against pixman symbols (which are in a
> shared library?) without turning off the visibility flags?

In the non-libxul configuration, cairo and pixman are in one shared library -
libthebes.so.
(In reply to comment #17)
> (From update of attachment 335331 [details] [diff] [review])
> I'm not sure whether
> 
> NS_HIDDEN void Method(); 
> 
> is allowed in GCC

It is allowed and widely used in current Mozilla code.

>but I'm sure that 
> nsIID NS_HIDDEN kIID;
> 
> is an incorrect declaration. The __attribute__((visibility("hidden")) would
> modify the *type* in that case, and not the declaration, which is not at all
> what we want.
> 

I don't understand what you mean.

visibility is not a valid "type attribute", compiler will give a warning, if we use it for a type.

I wrote a test program on Mac OS X.
test.cpp:

typedef int T1;
T1 __attribute__((visibility("hidden"))) foo1;
T1 foo2 __attribute__((visibility("hidden")));
__attribute__((visibility("hidden"))) T1 foo3;
T1 foo4;

int main()
{ return 0; }

gcc test.cpp; nm -m a.out | grep foo

The result is,
0000203c (__DATA,__common) non-external (was a private external) _foo1
00002038 (__DATA,__common) non-external (was a private external) _foo2
00002034 (__DATA,__common) non-external (was a private external) _foo3
00002030 (__DATA,__common) external _foo4
Currently, if you apply a decl attribute to a type, GCC will fix things for you by moving the attribute to the associated type. But this is incorrect usage and will probably change in future releases of GCC.

But I'm a little confused, having read up the bug some more: you're using -xldscope=hidden. Is that equivalent to GCC -fvisibility=hidden (i.e. it makes definitions hidden-visibility, but doesn't affect declarations?) Or is it more like the more intrusive #pragma visibility solution we use on more modern versions of GCC?

In either case, why do you need the extra NS_HIDDEN declarations?
I don't actually need it.
Actually GCC4 doesn't need it, either.

I put it in the patch in case some modules use global visibility by default, i.e. set VISIBILITY_FLAG to empty, e.g. cairo did it.

We can do it case by case anyway.
Attached patch patch part2 v2Splinter Review
No change to NS_HIDDEN
Attachment #335331 - Attachment is obsolete: true
Attachment #336329 - Flags: review?(benjamin)
Attachment #335330 - Flags: superreview?(benjamin) → superreview+
Attachment #336329 - Flags: review?(benjamin) → review+
Pushed
http://hg.mozilla.org/mozilla-central/rev/8b5344112ce7
Status: ASSIGNED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Duplicate of this bug: 437041
You need to log in before you can comment on or make changes to this bug.