Closed Bug 1185424 Opened 9 years ago Closed 7 years ago

build fails on SunOS due to "NSModules are not ordered appropriately"

Categories

(Toolkit Graveyard :: Build Config, defect)

38 Branch
Unspecified
Other
defect
Not set
normal

Tracking

(firefox57 fixed)

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: richard.palo, Assigned: petr.sumbera)

References

()

Details

Attachments

(1 file, 3 obsolete files)

User Agent: Mozilla/5.0 (X11; SunOS i86pc; rv:31.0) Gecko/20100101 Firefox/31.0
Build ID: 20150715204604

Steps to reproduce:

SunOS discussion moved from https://bugzilla.mozilla.org/show_bug.cgi?id=1054034
Component: Untriaged → Build Config
OS: Unspecified → Other
Product: Firefox → Toolkit
We are still hiting this on Solaris. And we have patch file which solves/workarounds the issue:

--- a/toolkit/library/StaticXULComponentsEnd/StaticXULComponentsEnd.cpp
+++ b/toolkit/library/StaticXULComponentsEnd/StaticXULComponentsEnd.cpp
@@ -5,9 +5,10 @@
 #ifdef _MSC_VER
 /* Sections on Windows are in two parts, separated with $. When linking,
  * sections with the same first part are all grouped, and ordered
  * alphabetically with the second part as sort key. */
 #  pragma section(".kPStaticModules$Z", read)
 #  undef NSMODULE_SECTION
 #  define NSMODULE_SECTION __declspec(allocate(".kPStaticModules$Z"), dllexport)
 #endif
-NSMODULE_DEFN(end_kPStaticModules) = nullptr;
+static const mozilla::Module kEndModule = { 0 };
+NSMODULE_DEFN(end_kPStaticModules) = &kEndModule;
diff --git a/toolkit/library/StaticXULComponentsStart.cpp b/toolkit/library/StaticXULComponentsStart.cpp
--- a/toolkit/library/StaticXULComponentsStart.cpp
+++ b/toolkit/library/StaticXULComponentsStart.cpp
@@ -1,3 +1,4 @@
 #include "mozilla/Module.h"
 
-NSMODULE_DEFN(start_kPStaticModules) = nullptr;
+static const mozilla::Module kStartModule = { 0 };
+NSMODULE_DEFN(start_kPStaticModules) = &kStartModule;

Unfortunatelly I don't really follow it.

Can I have any comment on this?

Why NSModule symbols have to be ordered properly? Any background on this?

I see that Linux is using linker script:

cat /var/tmp/firefox/toolkit/library/StaticXULComponents.ld
SECTIONS {
  .data.rel.ro : {
    *(.kPStaticModules)
  }
}

Is it somehow related to the issue?

Thank you!
Flags: needinfo?(mh+mozilla)
(In reply to Petr Sumbera from comment #1)
> We are still hiting this on Solaris. And we have patch file which
> solves/workarounds the issue:
> 
> --- a/toolkit/library/StaticXULComponentsEnd/StaticXULComponentsEnd.cpp
> +++ b/toolkit/library/StaticXULComponentsEnd/StaticXULComponentsEnd.cpp
> @@ -5,9 +5,10 @@
>  #ifdef _MSC_VER
>  /* Sections on Windows are in two parts, separated with $. When linking,
>   * sections with the same first part are all grouped, and ordered
>   * alphabetically with the second part as sort key. */
>  #  pragma section(".kPStaticModules$Z", read)
>  #  undef NSMODULE_SECTION
>  #  define NSMODULE_SECTION __declspec(allocate(".kPStaticModules$Z"),
> dllexport)
>  #endif
> -NSMODULE_DEFN(end_kPStaticModules) = nullptr;
> +static const mozilla::Module kEndModule = { 0 };
> +NSMODULE_DEFN(end_kPStaticModules) = &kEndModule;
> diff --git a/toolkit/library/StaticXULComponentsStart.cpp
> b/toolkit/library/StaticXULComponentsStart.cpp
> --- a/toolkit/library/StaticXULComponentsStart.cpp
> +++ b/toolkit/library/StaticXULComponentsStart.cpp
> @@ -1,3 +1,4 @@
>  #include "mozilla/Module.h"
>  
> -NSMODULE_DEFN(start_kPStaticModules) = nullptr;
> +static const mozilla::Module kStartModule = { 0 };
> +NSMODULE_DEFN(start_kPStaticModules) = &kStartModule;
> 
> Unfortunatelly I don't really follow it.
> 
> Can I have any comment on this?
> 
> Why NSModule symbols have to be ordered properly? Any background on this?

The modules are defined here and there in the tree, and are expected to all appear between start_kPStaticModules and end_kPStaticModules, which is what the order check validates. The reason that needs to happen is that because the code that enumerates the modules starts looking at starts_kPStaticModules + 1 and stops at end_kPStaticModules.
 
> I see that Linux is using linker script:
> 
> cat /var/tmp/firefox/toolkit/library/StaticXULComponents.ld
> SECTIONS {
>   .data.rel.ro : {
>     *(.kPStaticModules)
>   }
> }
> 
> Is it somehow related to the issue?

That groups them all together. That also respects the file order, so start and end always end up first and last.
Flags: needinfo?(mh+mozilla)
On Solaris starts_kPStaticModules and end_kPStaticModules end in read only data section (R) instead of initialized data section (D) as all other ns*_NSModule.

$ gnm -g obj-x86_64-pc-solaris2.12/toolkit/library/libxul.so |  grep _NSModule$ | grep -vw refptr | sort
000000000324d770 R start_kPStaticModules_NSModule
000000000324d778 R end_kPStaticModules_NSModule
00000000075d01c8 D nsPrefModule_NSModule
..

As for Linux on my test Ubuntu box it seems that toolkit/library/StaticXULComponents.ld  doesn't have any effect. Without using it symbols are still ordered properly.

I was told by our linker experts that we could come up with similar Solaris ld map file. Unfortunatelly the syntax used is quite new and wasn't released in any Solaris version yet. Plus providing for the start and end symbols, while doable, is still a bit of a PITA that involves providing a couple of CRT-like objects to bracket the link.

Therfore it seems that the only viable solution now is to use workaround which was already mentioned above. And which we are using to build Firefox for some time.

For record I need to state that commenting out check for order is no solution. Becuase Firefox would core dump before it actually starts:

00007fffbfffd500 libc.so.1`__lwp_sigqueue+0xa()
00007fffbfffd520 libc.so.1`raise+0x19()
00007fffbfffd560 libxul.so`_ZN13nsProfileLock18FatalSignalHandlerEiP7siginfoPv+0xab()
00007fffbfffd5b0 libxul.so`_ZN2jsL20UnixExceptionHandlerEiP7siginfoPv+0xed()
00007fffbfffd5c0 libc.so.1`__sighndlr+6()
00007fffbfffd660 libc.so.1`call_user_handler+0x2f1()
00007fffbfffd690 libc.so.1`sigacthandler+0xde(b, 7fffbfffe4a0, 7fffbfffd6b0)
00007fffbfffe620 libxul.so`_ZN7mozilla3dom6danger12GetJSContextEv+9()
00007fffbfffe640 libxul.so`_ZN7mozilla3dom9AutoJSAPI4InitEv+0x17()
00007fffbfffe6e0 libxul.so`_ZN7mozilla15ScriptPreloaderC1Ev+0x292()
00007fffbfffe710 libxul.so`_ZN7mozilla15ScriptPreloader12GetSingletonEv+0x85()
00007fffbfffe8e0 libxul.so`NS_InitXPCOM2+0x828()
00007fffbfffe900 libxul.so`_ZN18ScopedXPCOMStartup10InitializeEv+0x2b()
00007fffbfffe980 libxul.so`_ZN7XREMain8XRE_mainEiPPcRKN7mozilla15BootstrapConfigE+0x741()
00007fffbfffeb50 libxul.so`_Z8XRE_mainiPPcRKN7mozilla15BootstrapConfigE+0x13a()
00007fffbfffeb60 libxul.so`_ZN7mozilla13BootstrapImpl8XRE_mainEiPPcRKNS_15BootstrapConfigE+0x11()
00007fffbfffefa0 _ZL7do_mainiPPcS0_+0x16f()
00007fffbfffefd0 main+0xac()
00007fffbfffefe0 _start+0x43()

Also applying workaround just for end_kPStaticModules_NSModule would not help either. Check for order is now ok. But starts_kPStaticModules + 1 logic would not work. It's actually starts_kPStaticModules + unknown number.

00007fffbfffd390 libc.so.1`__lwp_sigqueue+0xa()
00007fffbfffd3b0 libc.so.1`raise+0x19()
00007fffbfffd3f0 libxul.so`_ZN13nsProfileLock18FatalSignalHandlerEiP7siginfoPv+0xab()
00007fffbfffd440 libxul.so`_ZN2jsL20UnixExceptionHandlerEiP7siginfoPv+0xed()
00007fffbfffd450 libc.so.1`__sighndlr+6()
00007fffbfffd4f0 libc.so.1`call_user_handler+0x2f1()
00007fffbfffd520 libc.so.1`sigacthandler+0xde(b, 7fffbfffe330, 7fffbfffd540)
00007fffbfffe4d0 libxul.so`_ZN22nsComponentManagerImpl23InitializeStaticModulesEv+0x5c()
00007fffbfffe5a0 libxul.so`_ZN22nsComponentManagerImpl4InitEv+0x4a()
00007fffbfffe770 libxul.so`NS_InitXPCOM2+0x79f()
00007fffbfffe790 libxul.so`_ZN18ScopedXPCOMStartup10InitializeEv+0x2b()
00007fffbfffe810 libxul.so`_ZN7XREMain8XRE_mainEiPPcRKN7mozilla15BootstrapConfigE+0x741()
00007fffbfffe9e0 libxul.so`_Z8XRE_mainiPPcRKN7mozilla15BootstrapConfigE+0x13a()
00007fffbfffe9f0 libxul.so`_ZN7mozilla13BootstrapImpl8XRE_mainEiPPcRKNS_15BootstrapConfigE+0x11()
00007fffbfffee30 _ZL7do_mainiPPcS0_+0x16f()
00007fffbfffee60 main+0xac()
00007fffbfffee70 _start+0x43()

We don't really know how starts_kPStaticModules + 1 "enumeration" occurs.  We didn't find the use of the symbols
start_kPStaticModules_NSModule or nsBrowserCompsModule_NSModule. But maybe it's technique to order "hot" code (laid hot code together).
Attached patch Bug1185424.patch (obsolete) — Splinter Review
Attachment #8895788 - Flags: review?(mh+mozilla)
Comment on attachment 8895788 [details] [diff] [review]
Bug1185424.patch

Review of attachment 8895788 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/library/StaticXULComponentsEnd/StaticXULComponentsEnd.cpp
@@ +13,5 @@
> +#ifdef XP_SOLARIS
> +/* Avoids to land end_kPStaticModules in read only data section (R) instead of
> + * initialized data section (D) as all other ns*_NSModule. */
> +static const mozilla::Module kEndModule = { 0 };
> +NSMODULE_DEFN(end_kPStaticModules) = &kEndModule;

FYI, {start,end}_kPStaticModules are used in nsComponentManagerImpl::InitializeStaticModules in xpcom/components/nsComponentManager.cpp, and that's the only thing they're used for. Their value is never actually used. They're only used as boundaries for the other pointers to mozilla::Modules. So you don't actually need to make them point to a valid mozilla::Module. So you could for example do something like:

NSMODULE_DEFN(end_kPStaticModules) = &NSMODULE_NAME(end_kPStaticModules) (with the right cast to make it happy). That would avoid adding two useless mozilla::Modules.
Attachment #8895788 - Flags: review?(mh+mozilla)
Attached patch Bug1185424.patch (obsolete) — Splinter Review
Maybe there is no reason to use ifdef solaris. Or should I?
Attachment #8897603 - Flags: review?(mh+mozilla)
Comment on attachment 8897603 [details] [diff] [review]
Bug1185424.patch

Review of attachment 8897603 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/library/StaticXULComponentsEnd/StaticXULComponentsEnd.cpp
@@ +10,4 @@
>  #  undef NSMODULE_SECTION
>  #  define NSMODULE_SECTION __declspec(allocate(".kPStaticModules$Z"), dllexport)
>  #endif
> +NSMODULE_DEFN(end_kPStaticModules) = (mozilla::Module*)&NSMODULE_NAME(end_kPStaticModules);

Please add a comment like
// This could be null, but this needs a dummy value to ensure it actually ends up in the same section as other NSMODULE_DEFNs, instead of being moved to a separate readonly section.
Attachment #8897603 - Flags: review?(mh+mozilla) → review+
Attached patch Bug1185424.patch (obsolete) — Splinter Review
Attachment #8895788 - Attachment is obsolete: true
Attachment #8897603 - Attachment is obsolete: true
Attachment #8898747 - Flags: review?(mh+mozilla)
Comment on attachment 8898747 [details] [diff] [review]
Bug1185424.patch

Review of attachment 8898747 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/library/StaticXULComponentsStart.cpp
@@ +1,3 @@
>  #include "mozilla/Module.h"
>  
> +NSMODULE_DEFN(start_kPStaticModules) = (mozilla::Module*)&NSMODULE_NAME(start_kPStaticModules);

Please duplicate the comment here.
Attachment #8898747 - Flags: review?(mh+mozilla) → review+
Attached patch Bug1185424.patchSplinter Review
Attachment #8898747 - Attachment is obsolete: true
Attachment #8899745 - Flags: review?(mh+mozilla)
Attachment #8899745 - Flags: review?(mh+mozilla) → review+
Attachment #8899745 - Flags: checkin?
Assignee: nobody → petr.sumbera
Attachment #8899745 - Flags: checkin?
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3fd81dad7c8b
Fix build failures on SunOS due to "NSModules are not ordered appropriately". r=glandium
https://hg.mozilla.org/mozilla-central/rev/3fd81dad7c8b
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Product: Toolkit → Toolkit Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: