Bug 699520
Opened 14 years ago
Closed 14 years ago
Compile failure using modified Clang from Address Sanitizer on Ubuntu 11.10
Categories: Firefox :: General, defect
Status: RESOLVED WORKSFORME
People: Reporter: gkw, Unassigned
Whiteboard: [sg:want]
Attachments: 1 file (19.77 KB, text/plain)
Using Ehsan's instructions at the URL, I'm unable to compile a build of Firefox using Clang modified by Address Sanitizer at:
http://code.google.com/p/address-sanitizer/wiki/HowToBuild
I'm on Ubuntu 11.10 on a 2009 Mac mini with 2 GB of RAM. Christian indicates that he was also unable to compile on his machine, so this might be the same issue.
===
Mozconfig:
export CC=/home/fuzz1/address-sanitizer/asan_clang_Linux/bin/clang
export CXX=/home/fuzz1/address-sanitizer/asan_clang_Linux/bin/clang++
export CFLAGS='-fasan -Dxmalloc=myxmalloc'
export CXXFLAGS='-fasan -Dxmalloc=myxmalloc'
export LDFLAGS=-ldl
. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/objdir-ff-dbg-asan
mk_add_options MOZ_MAKE_FLAGS="-j3"
ac_add_options --enable-application=browser
ac_add_options --enable-debug
ac_add_options --disable-optimize
ac_add_options --disable-jemalloc
ac_add_options --disable-crashreporter
Comment 1•14 years ago
@gkw: Yes, those are exactly the error messages I had as well. I created a workaround (though I don't know whether it is the right approach) that gets me through all the errors, but the resulting build does not start up (which might be a different issue). My exact workaround was:
1. Added -fasan to linker flags
export LDFLAGS='-ldl -fasan'
2. Edited security/coreconf/Linux.mk line 140 to include -fasan:
OS_LIBS = $(OS_PTHREAD) -ldl -lc -fasan
The second change seemed necessary because during the NSS build my LDFLAGS were apparently ignored (which might be a bug of its own).
Can you try with these changes and let me know if it compiles at least?
Comment 2•14 years ago
It seems my instructions in comment #1 were insufficient. While that configuration solves some problems, I still get this when linking the first C++ shared object:
/usr/bin/python2.7 /home/decoder/LangFuzz/mozilla-central-browser/config/pythonpath.py -I../../config /home/decoder/LangFuzz/mozilla-central-browser/config/expandlibs_exec.py --uselist -- /home/decoder/LangFuzz/asan/address-sanitizer/asan_clang_Linux/bin/clang++ -fno-rtti -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wno-invalid-offsetof -Wno-variadic-macros -Werror=return-type -pedantic -Wno-long-long -fasan -Dxmalloc=myxmalloc -fno-exceptions -fno-strict-aliasing -fshort-wchar -pthread -pipe -DDEBUG -D_DEBUG -DTRACING -g -fno-omit-frame-pointer -fPIC -shared -Wl,-z,defs -Wl,-h,libmozalloc.so -o libmozalloc.so mozalloc.o mozalloc_abort.o mozalloc_oom.o -lpthread -ldl -fasan -Wl,-rpath-link,/home/decoder/LangFuzz/mozilla-central-browser/objdir-ff-asan3/dist/bin -Wl,-rpath-link,/usr/local/lib
mozalloc.o: In function `moz_free':
/home/decoder/LangFuzz/mozilla-central-browser/memory/mozalloc/mozalloc.cpp:98: undefined reference to `__asan_report_store8'
I had to change LDFLAGS like this to get it fully working (and here I'm pretty sure it's the WRONG way, but I don't know how else to fix it):
export LDFLAGS='-ldl -L/path/to/address-sanitizer/asan_clang_Linux/lib/ -lasan64 -lpthread -lstdc++'
It only seems to be required for C++, because for example NSS links fine without this.
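For reference, the compiler settings from the original mozconfig and the linker change from this comment can be collected in one place. This is only a sketch of the workaround as described, not a verified configuration: ASAN_ROOT is a helper variable introduced here for illustration, and its value is the placeholder path from the LDFLAGS line above. The `mk_add_options` and `ac_add_options` lines of the mozconfig are left out because they only work inside the Mozilla build system.

```shell
# Sketch only: ASAN_ROOT is a hypothetical helper variable, and the path is the
# placeholder used in the comment above. Adjust it to the local checkout.
ASAN_ROOT=/path/to/address-sanitizer/asan_clang_Linux

# Compiler and flags as in the reporter's mozconfig.
export CC="$ASAN_ROOT/bin/clang"
export CXX="$ASAN_ROOT/bin/clang++"
export CFLAGS='-fasan -Dxmalloc=myxmalloc'
export CXXFLAGS='-fasan -Dxmalloc=myxmalloc'

# Workaround from this comment: link the ASan runtime library explicitly,
# since passing -fasan in LDFLAGS was not enough for the C++ shared objects.
export LDFLAGS="-ldl -L$ASAN_ROOT/lib/ -lasan64 -lpthread -lstdc++"

echo "LDFLAGS=$LDFLAGS"
```

Keeping the path in a single variable avoids the compiler paths and the library path drifting apart when the checkout moves.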
Comment 3•14 years ago
After having compiled Firefox with the changes made in Comment 1 and Comment 2 (which might already be incorrect!), I first get a crash immediately when libxul.so is loaded and tries to set a signal handler:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff3806663 in _PR_UnixInit () at /home/decoder/LangFuzz/mozilla-central-browser/nsprpub/pr/src/md/unix/unix.c:2909
#2 0x00007ffff36fb0ce in _PR_InitStuff () at /home/decoder/LangFuzz/mozilla-central-browser/nsprpub/pr/src/misc/prinit.c:246
#3 0x00007ffff36facc9 in _PR_ImplicitInitialization () at /home/decoder/LangFuzz/mozilla-central-browser/nsprpub/pr/src/misc/prinit.c:251
#4 0x00007ffff3653bb0 in PR_NewLogModule (name=<value optimized out>) at /home/decoder/LangFuzz/mozilla-central-browser/nsprpub/pr/src/io/prlog.c:385
#5 0x00007fffcdad7934 in __cxx_global_var_init () from /home/decoder/LangFuzz/mozilla-central-browser/objdir-ff-asan2/dist/bin/libxul.so
#6 0x00007fffcdad7989 in global constructors keyed to a () from /home/decoder/LangFuzz/mozilla-central-browser/objdir-ff-asan2/dist/bin/libxul.so
#7 0x00007fffe0d51f56 in __do_global_ctors_aux () from /home/decoder/LangFuzz/mozilla-central-browser/objdir-ff-asan2/dist/bin/libxul.so
#8 0x00007fffcdaaeafb in _init () from /home/decoder/LangFuzz/mozilla-central-browser/objdir-ff-asan2/dist/bin/libxul.so
[...]
#15 0x00007ffff79baf16 in dlopen_doit (a=0x7fffffff5510) at dlopen.c:67
[...]
Commenting out that signal handler code just gives me the next crash where some other signal handler is set in toolkit/xre/nsSigHandlers.cpp:267. That signal handler depends on the environment variable XRE_NO_WINDOWS_CRASH_DIALOG, so using
export XRE_NO_WINDOWS_CRASH_DIALOG=1
in the shell (!) bypasses that location.
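The emphasis on "in the shell" matters because the variable has to be in the environment of the running Firefox process, not in the build configuration. A minimal sketch of that behaviour follows; it uses `sh -c` as a stand-in for the Firefox binary, which is an assumption made purely so the example can run anywhere.

```shell
# Export the variable in the current shell, as described above.
export XRE_NO_WINDOWS_CRASH_DIALOG=1

# Any child process started from this shell inherits exported variables.
# `sh -c` is only a stand-in for launching the Firefox binary here.
child_value=$(sh -c 'printf %s "$XRE_NO_WINDOWS_CRASH_DIALOG"')
echo "child process sees XRE_NO_WINDOWS_CRASH_DIALOG=$child_value"
```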
Starting Firefox now gives me:
JS Component Loader: ERROR (null):0
too much recursion
JS Component Loader: ERROR (null):0
uncaught exception: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIXPCComponents_Utils.import]" nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)" location: "JS frame :: file:///home/decoder/LangFuzz/mozilla-central-browser/objdir-ff-asan/dist/bin/components/Weave.js :: <TOP_LEVEL> :: line 42" data: no]
WARNING: Cannot create startup observer : service,@mozilla.org/weave/service;1: file /home/decoder/LangFuzz/mozilla-central-browser/embedding/components/appstartup/src/nsAppStartupNotifier.cpp, line 119
(followed by many more of these and Firefox terminating).
We already figured out that the FILE_NOT_FOUND is wrong: the file is there and is found, but the loader seems to ignore the actual error code and reports "file not found" instead of propagating the real error (that might be a bug of its own). The real error is the "too much recursion" from the JS engine. Jimb already had a brief look at this with me and said that the stack is not deep enough to cause a "too much recursion" error. Either one of the steps I performed previously broke something, or this is a separate bug where Address Sanitizer somehow interferes with the JS engine's recursion detection.
(On a side note, I was not able to follow jimb's instructions for debugging this because I did not manage to get a non-optimized build with clang; some variables were always missing, such as the cx variable. I even tried -O0 -g, with no luck.)
That's all I have from my side. Cc'ing some more people who might be able to comment on and/or debug this.
Comment 4•14 years ago
I made some progress here: it seems that even with vanilla clang at least one JS shell test fails (with "Too much recursion"), so it could be that clang uses more stack space than GCC for some reason. Adding instrumentation on top of that (e.g. ASAN or my own coverage instrumentation based on LLVM/Clang) might cause those errors to manifest even earlier if the instrumentation itself consumes stack space.
Some of the failures in the JS shell I debugged recurse over js::frontend::EmitTree before the error happens, and that is also the case for Firefox + ASAN if I remember correctly. According to bhackett, these frames are huge and there is a bug (bug 704369) to refactor that area. This refactoring could solve some of the issues seen here as well.
Depends on: 704369
Comment 5•14 years ago
I tried the workarounds above again in combination with the fixes in bug 704369 and bug 708870, and I still get "too much recursion" errors from the JS engine on startup. What puzzles me is that if I remove the recursion check, I don't get the crash I would have expected. Instead, Firefox starts up normally and then terminates due to bug 709483 (which seems to be the first true positive produced by Address Sanitizer here). After fixing that bug, I get another startup crash in mozilla::widget::GfxInfoBase::GetFeatureStatusImpl, which could also be a true positive but needs more investigation.
My key question is why the "too much recursion" errors occur at all, and why removing the checks does not lead to a crash.
Comment 6•14 years ago
Filed bug 709580 for the GfxInfoBase error which is indeed also a true positive. After fixing this bug, Firefox starts up!
This only leaves us to determine which of the hacks above are required and which are not, and what's the matter with the recursion checking in JS.
Comment 7•14 years ago
> This only leaves us to determine which of the hacks above are required and
> which are not, and what's the matter with the recursion checking in JS.
Try printing the stack addresses that led us to conclude that we are recursing too much. Maybe we are getting the shadow address somehow?
Comment 8•14 years ago
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #7)
> Try printing the stack addresses that led us to conclude that we are
> recursing too much. Maybe we are getting the shadow address somehow?
I checked the limits and noticed that the default browser limits are a little low. Luke suggested increasing them, so I increased the stack limit by a factor of 4 (patched in three places in the browser), and voilà, it works =) Starts up, no errors. I'll include this in the manual I'm writing.
Comment 9•14 years ago
And here it is: https://developer.mozilla.org/en/Building_Firefox_with_Address_Sanitizer
The question now is how to proceed from here. Shall we create a new bug about running Firefox tests on a clang+asan tbpl and file bugs blocking it? Rafael already pointed out that there is a bug for a clang tbpl machine; I'm not sure whether we should merge the two.
Comment 10•14 years ago
Now that all of the dependencies are fixed, can you please try to see if a build from mozilla-central works these days?
Comment 11•14 years ago
(In reply to Ehsan Akhgari [:ehsan] from comment #10)
> Now that all of the dependencies are fixed, can you please try to see if a
> build from mozilla-central works these days?
As I described in the other bug, I successfully went through the whole build process and documented it here: https://developer.mozilla.org/en/Building_Firefox_with_Address_Sanitizer
The manual also has a section on optimized builds, which might be interesting because these builds are really fast.
Comment 12•14 years ago
Christian has been successful in compiling Firefox as per comment 11, and I have been successful in compiling a js shell with ASAN too.
Resolving WFM because the problem is indeed fixed, but not by any specific patch.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME