Closed Bug 341874 Opened 18 years ago Closed 18 years ago

Crash invoking the system cairo library

Categories

(Core :: Graphics, defect)

x86
SunOS
defect
Not set
major

Tracking

()

RESOLVED FIXED

People

(Reporter: alfred.peng, Assigned: alfred.peng)

Details

Attachments

(3 files, 6 obsolete files)

I've built Firefox with thebes on Ubuntu and Solaris. The dynamic library libthebes.so will link to libmozcairo.a and the system cairo library(gtk2 depends on it) at the same time on these two platforms. On Ubuntu, Firefox will call into its own cairo library and works fine. But on Solaris, it crashes at _cairo_user_data_array_set_data of system cairo.

To have our own cairo library in the source tree may be just a temporary solution I think. Firefox should depend on the system cairo library at last. To bypass this  build problem, we also need a temporary Solaris specific patch for this crash.
Attached patch Patch v1 (obsolete) — Splinter Review
Add -Bsymbolic to the ld flags.

The manual for the ld option "-Bsymbolic" on Solairs:
"
Symbol definitions have symbolic linker scoping which is more restrictive than global linker scoping. All references to the symbol from within the dynamic load module being linked bind to the symbol defined within the module.
"
Attachment #225999 - Flags: review?(vladimir)
Hmm, I don't think #ifndef GNU_CXX is the right thing here -- is there a specific define for sun's compiler that could be used?
I don't find a corresponding define for sun's compiler. Do I need to add one in configure.in?
Attached patch Patch v2Splinter Review
Or we can make the patch like this?
Attachment #225999 - Flags: review?(vladimir)
Comment on attachment 226009 [details] [diff] [review]
Patch v2

wouldn't this be a good idea on all ELF platforms?
(GNU ld does support -Bsymbolic according to its manpage)
I found this crash problem on Solaris platform with SunCC compiler.

When the cairo library releases a newer build and all the distributions upgrade their system cairo library, I think Firefox should only depend on one version of cairo library, the system one. The patch is just a temporary solution for Solaris, and it should be removed at last.

So my idea is to keep the other platforms unchange if they can work well in the current situation.

Vald, roc, what's your opinions?
So, one thing that slipped my mind is that we rename our in-tree cairo symbols; if you're crashing anywhere, it means that a symbol was missed in the renaming.  dump libthebes.so's symbols, and see if there are any _cairo_* symbols exported -- they should all be _moz_cairo_*.  There should be no conflicts between the system cairo and the in-tree cairo.

Also, our official builds will always ship with our in-tree cairo; the option to use system cairo is only there for distros who ship their own builds of Firefox.
Attached file stack on Solaris (obsolete) —
The core dump call stack on Solaris. It seems that the rename works for cairo_surface_set_user_data=>_moz_cairo_surface_set_user_data. But it still calls into _cairo_user_data_array_set_data in the system cairo library. Do we need to rename this symbol?
ah, that's because only GCC uses the visibility(hidden) parts of cairo, I suspect:
http://lxr.mozilla.org/seamonkey/source/gfx/cairo/cairo/src/cairoint.h#97
Attached patch Patch v3 (obsolete) — Splinter Review
Modify the patch to disable the option for system cairo.
Attachment #225999 - Attachment is obsolete: true
Attachment #226134 - Flags: review?(vladimir)
Assignee: nobody → alfred.peng
The patch v3 is to keep Solaris consistent with the other platforms when using system cairo library. I think that can avoid some Solaris specific problems in the future. vlad, any comments? BTW, do I need a sr for the patch?
I just noticed that -Bsymbolic is also used for other OSes/compilers, see:
http://lxr.mozilla.org/seamonkey/source/config/rules.mk#526

Maybe Solaris should just be added there?
Attached patch Patch v4 (obsolete) — Splinter Review
Add the "-Bsymbolic" option to SunOS and change thebes to use it.

Christian, thanks for pointing that out:)
vlad, bsmedbeg, what's your idea?
Attachment #226134 - Attachment is obsolete: true
Attachment #226771 - Flags: superreview?(benjamin)
Attachment #226771 - Flags: review?(vladimir)
Attachment #226134 - Flags: review?(vladimir)
Comment on attachment 226771 [details] [diff] [review]
Patch v4

>Index: config/rules.mk

> #
>+# SunOS: add -Bsymbolic flag for components
>+# 
>+ifeq ($(OS_ARCH),SunOS)
>+ifdef IS_COMPONENT
>+EXTRA_DSO_LDOPTS += -Bsymbolic
>+endif
>+endif 

This is not the right place to add this flag; it should probably be added to http://lxr.mozilla.org/mozilla/source/configure.in#2247

>Index: gfx/thebes/src/Makefile.in

> MODULE		= thebes
> LIBRARY_NAME	= thebes
>+IS_COMPONENT	= 1
>+MODULE_NAME 	= nsGfxThebesModule
> LIBXUL_LIBRARY	= 1
> EXPORT_LIBRARY	= 1

This cannot possibly be correct. Thebes is not a component.
Attachment #226771 - Flags: superreview?(benjamin)
Attachment #226771 - Flags: superreview-
Attachment #226771 - Flags: review?(vladimir)
bsmedberg: That file has -Bsymbolic for other OSes too

Alfred Peng: sorry, I didn't realize that those flags were only used for components
Let's not perpetuate the bad code.
(In reply to comment #16)
> This is not the right place to add this flag; it should probably be added to
> http://lxr.mozilla.org/mozilla/source/configure.in#2247
> 

I tried to add that option there, and it doesn't work. Solaris uses the "-solaris" case instead of the "-sunos" one.

My question is: whether there is any negative impact on the Solaris build if we add the "-Bsymbolic" option to the whole build. Maybe we should fall back to the first patch.
(In reply to comment #19)
> Maybe we should fall back to the first patch.
                                   ~~~~~ oops, the second patch.
Comment on attachment 226009 [details] [diff] [review]
Patch v2

Some linker engineer doesn't recommend us to use the option. So I'd like to stick to this patch.
Attachment #226009 - Flags: superreview?(benjamin)
What "whole build"? EXTRA_DSO_LDOPTS and MKSHLIB have the same effects on the final build, don't they?
I think changing MKSHLIB in configure.in will affect all the compile or link stages("whole build":-)). And EXTRA_DSO_LDOPTS will only have an impact where IS_COMPONENT is defined.

Since the option isn't recommended, to limit the usage of that option to a minimal range maybe a good choice.
Both of those statements are incorrect: MKSHLIB is only used to make shared libraries, and EXTRA_DSO_LDOPTS is also used for making shared libraries, entirely independent of whether IS_COMPONENT is set.
Thanks for correcting me:)

The idea of patch v4 is to keep Solaris consistent with the other platforms, to add the option to EXTRA_DSO_LDOPTS where IS_COMPONENT is defined. And it seems to be a bad idea. If we add the option to MKSHLIB, that will affect all the shared libraries. Maybe that will lead some other problems in the future.

Take this bug as an example, we should only link to one version of a library instead of two, which is more reasonable. If this is the truth, this option will be unnecessary.
Solaris is still suffering from this problem. bs, could you give us some suggestion?
Comment on attachment 226009 [details] [diff] [review]
Patch v2

Uck, ok
Attachment #226009 - Flags: superreview?(benjamin) → superreview+
Oops, content/canvas/src directory of the seamonkey build is also affected by the cairo issue.
For the seamonkey issue, I tried to add option "-Bsymbolic" to DSO_LDOPTS in configure.in for Solaris. It doesn't work. Then I change the Makefile.in in content/canvas/src. It works.
Attached file Stack 2 on Solaris (obsolete) —
The crash still exists even I add the option. Here is the new call stack. It seems that the the static library "libmozcairo.a" calls to the system cairo, which causes the crash finally. The option "-Bsymbolic" doesn't work for a static library.
Are you still seeing the crash with a pull after the bug 345874 checkin?
I've already rebuilt the head(seamonkey). It works fine on my box. No crash anymore. Thanks.
Attached file Stack 3 on Solaris (obsolete) —
I've just rebuilt Firefox here. It still crashes on my box. And here is the core stack. And patch v2 can fix the bug.

Some other investigation: I've also checked tor's patch. In the patch, some private methods are renamed. From the stack 3, we can see that the problem starts from "_cairo_gstate_paint". It's in the system cairo. Maybe we should also rename it. Any ideas from you guys?
Attachment #226102 - Attachment is obsolete: true
Attached patch Patch v5Splinter Review
Suncc also provides similiar functionality of visibility attribute extension. So I made this patch. It's better than v2 I think.

One issue I'm still concern with is this approach can only fix the problem for Suncc compiler. How about the other compilers besides gcc and suncc?

Another approach is to rename all the private methods(several hundred of them). I tried to rename all of them in cairo-rename.h. After building sucessfully, firefox crashes at startup time here:
http://lxr.mozilla.org/seamonkey/source/gfx/cairo/cairo/src/cairo-xlib-surface.c#2451
Attachment #226771 - Attachment is obsolete: true
Attachment #231908 - Flags: review?(benjamin)
Comment on attachment 231908 [details] [diff] [review]
Patch v5

I like this.
Attachment #231908 - Flags: review?(benjamin) → review+
Me too:-) Then do I need a sr for it?
that patch should be first checked in to cairo itself, imo
Comment on attachment 231908 [details] [diff] [review]
Patch v5

Yes, please submit this upstream.
Attachment #231908 - Flags: review?(vladimir)
I've now pushed the patch out into upstream cairo:

http://gitweb.freedesktop.org/?p=cairo;a=commit;h=04757a3aa8deeff3265719ebe01b021638990ec6

Thanks,

-Carl

Disclaimer: This action on my part does not constitute an endorsement of mozilla's private copy of cairo. I would still much prefer to see mozilla start  using/preferring system cairo.
Attached file Stack 4 on Solaris
With the patch v5 applied, I got another crash with the firefox trunk build 20060807. Is it a relative issue? Or a known issue?
Attachment #230739 - Attachment is obsolete: true
Attachment #230876 - Attachment is obsolete: true
Patch v5 checked in.
Checking in gfx/cairo/cairo/src/cairoint.h;
/cvsroot/mozilla/gfx/cairo/cairo/src/cairoint.h,v  <--  cairoint.h
new revision: 1.26; previous revision: 1.25
done
Bug 353162 addresses the latest crash issue in comment 40.

=> Fixed
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: