Open Bug 1465038 Opened 6 years ago Updated 2 years ago

Calling "if (!JS_Init()) exit(1); JS_ShutDown();" causes segfault

Categories

(Core :: JavaScript Engine, defect, P3)

62 Branch
Unspecified
Linux
defect

Tracking

()

Tracking Status
firefox62 --- affected

People

(Reporter: dpa-mozilla, Unassigned)

References

(Blocks 1 open bug)

Details

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 Safari/605.1.15 Epiphany/605.1.15

Steps to reproduce:

I installed SpiderMoney 62a1 (changeset:   420197:f01bb6245db1) with

PYTHON=/usr/local/bin/python2 /mercurial/mozilla-central/js/src/configure --with-linux-headers=/src/kernel/linux-4.15.12/usr/include --enable-accessibility --enable-cookies --enable-cpp-rtti --enable-dbus --enable-debug-js-modules --with-system-zlib --with-system-nspr --with-system-bz2 --enable-system-pixman --enable-system-sqlite --enable-system-cairo --enable-necko-wifi --enable-libjpeg-turbo --disable-gconf --enable-debug --enable-debug-symbols --enable-raw --enable-readline --enable-startup-notification 

Afterwards I made gjs-1.52.3 to require mozjs-62a1, run ./configure and then the script tried running this program:

#define DEBUG 1
#include <js/Initialization.h>

int
main ()
{

if (!JS_Init()) exit(1);
JS_ShutDown();

  ;
  return 0;
}

Which is compiled with:
   g++ -o conftest -g -O2 -pthread -include /usr/local/include/mozjs-62a1/js/RequiredDefines.h -I/usr/local/include/gobject-introspection-1.0 -I/usr/local/lib/libffi-3.2.1/include -I/usr/local/include/glib-2.0 -I/usr/local/lib/glib-2.0/include -I/usr/local/include -I/usr/local/include/mozjs-62a1 -I/usr/local/include/nspr  conftest.cpp -L/usr/local/lib -L/usr/local/lib/../lib -L/usr/local/lib -lgirepository-1.0 -lffi -lgthread-2.0 -pthread -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lmozjs-62a1

Doing ./conftest fails with the the backtrace

#0  0x0000000000000000 in  ()
#1  0x00007ffff5000892 in js::Mutex::Mutex(js::MutexId const&) (id=..., this=0x7ffff6e8bb00 <js::jit::CacheIRSpewer::cacheIRspewer>) at /mercurial/mozilla-central/js/src/threading/Mutex.h:57
#2  0x00007ffff5000892 in js::jit::CacheIRSpewer::CacheIRSpewer() (this=0x7ffff6e8bb00 <js::jit::CacheIRSpewer::cacheIRspewer>) at /mercurial/mozilla-central/js/src/jit/CacheIRSpewer.cpp:34
#3  0x00007ffff5000892 in __static_initialization_and_destruction_0(int, int) (__priority=65535, __initialize_p=1) at /mercurial/mozilla-central/js/src/jit/CacheIRSpewer.cpp:31
#4  0x00007ffff7de6e4a in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffe438, env=env@entry=0x7fffffffe448) at dl-init.c:72
        j = <optimized out>
        jm = <optimized out>
        addrs = <optimized out>
        init_array = <optimized out>
#5  0x00007ffff7de6f56 in call_init (env=0x7fffffffe448, argv=0x7fffffffe438, argc=1, l=<optimized out>) at dl-init.c:118
        preinit_array = <optimized out>
        preinit_array_size = <optimized out>
        i = 6
#6  0x00007ffff7de6f56 in _dl_init (main_map=0x7ffff7ffe170, argc=1, argv=0x7fffffffe438, env=0x7fffffffe448) at dl-init.c:119
        preinit_array = <optimized out>
        preinit_array_size = <optimized out>
        i = 6
#7  0x00007ffff7dd906a in _dl_start_user () at /lib64/ld-linux-x86-64.so.2
#8  0x0000000000000001 in  ()
#9  0x00007fffffffe6b0 in  ()
#10 0x0000000000000000 in  ()

According to https://gitlab.gnome.org/GNOME/gjs/issues/128 this was known to fail in FF < 60, but was supposed to be fixed in >=60.
Summary: spidermonkey → Calling "if (!JS_Init()) exit(1); JS_ShutDown();" causes segfault
Hi! I think you worked on this before? Can you take a look.
Flags: needinfo?(philip.chimento)
this issue is linux specific, afaik.
you need to link to mozglue code, which is unfortunately not included in libmozjs.

mozglue symbols are weak, and it doesn't fail while loading even if the symbols are not found,
and the function's address is kept null, and it crashes when you call it.

here's the dump of OBJDIR/js/src/shell/js.list, generated while building js shell.
the list includes object files which needs to be linked, especially mozglue and maybe some others:

INPUT("Unified_cpp_js_src_shell0.o")
INPUT("../../../memory/build/Unified_cpp_memory_build0.o")
INPUT("../../../memory/mozalloc/mozalloc_abort.o")
INPUT("../../../memory/mozalloc/Unified_cpp_memory_mozalloc0.o")
INPUT("../../../mozglue/misc/AutoProfilerLabel.o")
INPUT("../../../mozglue/misc/ConditionVariable_posix.o")
INPUT("../../../mozglue/misc/Mutex_posix.o")
INPUT("../../../mozglue/misc/Printf.o")
INPUT("../../../mozglue/misc/StackWalk.o")
INPUT("../../../mozglue/misc/TimeStamp.o")
INPUT("../../../mozglue/misc/TimeStamp_posix.o")
INPUT("../../../mfbt/lz4.o")
INPUT("../../../mfbt/Compression.o")
INPUT("../../../mfbt/Decimal.o")
INPUT("../../../mfbt/Unified_cpp_mfbt0.o")
INPUT("../../../mfbt/Unified_cpp_mfbt1.o")
INPUT("../editline/Unified_c_js_src_editline0.o")
OS: Unspecified → Linux
Flags: needinfo?(philip.chimento)
Flags: needinfo?(philip.chimento)
Argh, I thought this was fixed in ESR60 ... in any case it was fixed the last time I tested it on Linux in 59! (I _thought_ I had tested a 60 alpha on Linux as well, but I'm not at all sure of that.)

I'll take a closer look as soon as I can, but I didn't really know what I was doing with mozbuild when I was trying to fix it for 52, so any help is appreciated.

The idea is (IIRC) that JS_STANDALONE implies MOZ_GLUE_IN_PROGRAM which links the mozglue files into libmozjs, instead of having to link them into the executable in the end.
Ah, I knew I had tested it somehow; here are CI results from 60.0.0 where it tries to link and execute a small embedder program (https://github.com/ptomato/mozjs/blob/mozjs60/testjs.cpp) with the freshly built mozjs. It does seem to work with GCC 6, 7, 8, and Clang on Travis CI's Ubuntu image...

https://travis-ci.org/ptomato/mozjs/builds/377607183
I figured out what's missing here. For standalone SpiderMonkey it's *required* to configure with --disable-jemalloc. I missed this in the arguments list earlier.

It would be nice if we could add a patch that will fail the build (say, at "make install" time, which only applies to standalone SpiderMonkey, so wouldn't affect the rest of Firefox) if --disable-jemalloc was not given. @evilpie, do you know if that's possible or who could advise on that?

If that's not wanted, then we can probably close this issue.
Flags: needinfo?(philip.chimento) → needinfo?(evilpies)
I confirm that --disable-jemalloc helps.
I have no idea what the answer to this is. Steve Fink has been working on embedding, but he is on PTO for a while.
Flags: needinfo?(evilpies)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
@sfink, I'm guessing you might be back from PTO by this time? If so, do you have any comments on this:

> It would be nice if we could add a patch that will fail the build (say, at "make install" time, which only applies to standalone SpiderMonkey, so wouldn't affect the rest of Firefox) if --disable-jemalloc was not given.
Flags: needinfo?(sphink)

We discussed this on IRC today and a likely course of action might be to add a JS_TARBALL_BUILD option similar to (and which implies) JS_STANDALONE (https://searchfox.org/mozilla-central/source/js/moz.configure#18). One idea is to enable it if build_project == 'mozjs' although it would still be nice to find a way to set it automatically (as is done for js in https://searchfox.org/mozilla-central/source/js/src/configure.in#25) without requiring embedders to pass an extra configure option.

Flags: needinfo?(sphink)

Since I always go back to this bug to try to remember what the situation is, and I just now traced through the configure code to understand how it works, I'm going to paste my exploration here (originally an email to dev-tech-js-engine) for posterity:


I just traced through some of our configure goop. I'm nowhere close to being an expert on this stuff, so don't trust anything I say here. But I think the most relevant part is: https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/js/src/old-configure.in#1239-1252

--disable-jemalloc will clear out MOZ_MEMORY, which from the above configure snippet will result in MOZ_GLUE_IN_PROGRAM being set to the empty string for JS_STANDALONE (mozjs). MOZ_GLUE_IN_PROGRAM can be read as "mozglue is linked into the executable as opposed to the library", and you want it in the library.

The magic unicorn in the build system will then look at MOZ_GLUE_IN_PROGRAM, and if it is clear, be chopped up and processed into glue to compensate and will no longer be able to help you. As a result, mozglue will not be linked into the library. (We don't give out the unicorn glue, sorry; that stuff is valuable.)

Uh... sorry, having a lot of trouble tracking down how this affects things. That's the best explanation I have at the moment.

I'll try harder. https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/build/gecko_templates.mozbuild#39-50 seems to be where the unicorn-chopping must happen. I have no idea how 'mozglue' gets set there, but you can see that if the magic unicorn set it to 'library' before dying and MOZ_GLUE_IN_PROGRAM is clear, then you'll end up with USE_LIBS including 'mozglue'. I'll take it on faith that USE_LIBS will result in the right thing happening.

GeckoBinary seems an odd thing to use for configuring building for a library, but the unicorn was probably a little occupied with its impending doom to worry about that, and https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/build/docs/defining-binaries.rst#326-345 seems to confirm. https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/build/gecko_templates.mozbuild#99-110 shows that GeckoSharedLibrary does indeed use it. I guess my brain associates "binary" with "executable", but this code appears to use "Program" for that.

And sure enough, https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/js/src/build/moz.build#25-26 uses GeckoSharedLibrary (the actual library name mozjs appears to be set here: https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/js/src/old-configure.in#1589 )

Anyway: yes, --disable-jemalloc should be adequate to get the linking correct. And our tarball sucks, in that it produces a mozjs depending on weak symbols in a library that it does not provide, which results in things linking happily but then trying to call null pointers when you call any of those weak symbols. Not only should we switch the tarball's default, we should also add a canary call (or simple null pointer test) that gives some clue as to what's actually going wrong if you end up in this configuration.

Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.