Closed Bug 1233568 Opened 5 years ago Closed 5 years ago

Address Sanitizer builds for NSS

Categories

(NSS :: Build, defect, P1)

defect

Tracking

(firefox46 affected)

RESOLVED FIXED
Tracking Status
firefox46 --- affected

People

(Reporter: jld, Assigned: jld)

References

Details

Attachments

(3 files, 4 obsolete files)

From the big NSS wishlist triage last Friday: we'd like to get builds of NSS with Address Sanitizer, but it was mentioned that there were problems of some kind with the use of arena allocation.

Firefox ASan builds definitely have sanitizer instrumentation in most of NSS (exceptions seem to be asm files, which ASan can't help with, or C files with only data items), and the Gecko/Firefox tests pass — but they may not be covering everything that NSS's own tests do.  (Coverage data would also be nice….)  The Gecko build doesn't seem to be doing anything special to the NSS or NSPR builds here, but it does interpose jemalloc as malloc()/etc. by default (for everything, including system libraries), which might be relevant.

This bug is to find out what's needed to get ASan working for NSS by itself.
So far this is working fine on Linux with Clang 3.6 (give or take some fighting with the build system, which I need to convert into a committable form); the gtests have finished, and all.sh is proceeding.

Leak Sanitizer… well, it seems to work, in that it's on by default and complains voluminously about the NSS tools.  I've disabled it, but it might be usable with enough suppressions (which Gecko might already have some/most of).
I think I know what the problem with NSPR arena allocation was, because I've accidentally reproduced it: building NSS with -fsanitize=address but not NSPR.  There are macros in nspr/lib/ds/plarena.h that integrate it with dynamic analysis tools like Valgrind and ASan; the NSS code expands them them with ASan enabled and poisons memory when it's freed back to the arena, but the non-inlined part of the arena code in NSPR expands the macros with ASan disabled and doesn't know to unpoison the memory before accessing it or freeing it back to the malloc level. 

The part of the NSS top-level makefile that invokes the NSPR configure/make is relevant here, in terms of what does and doesn't automatically carry over.
See Also: → 1234990
Attached patch Work In Progress (2015-12-23) (obsolete) — Splinter Review
This patch allows setting USE_ASAN=1 to use Address Sanitizer on Linux or OS X.  I haven't tried Windows yet, but there apparently is ASan support in at least an experimental state.

Tested with: Linux Clang 3.6 x86_64, Linux GCC 4.8 x86_64 and x86, OS X 10.10 "clang-700.1.81".  Works with and without optimization (and enables debug symbols on optimized builds, under the assumption that you probably want debug info and you probably don't want to ship the result, but you might like the tests to be less slow than with unoptimized code and unoptimized ASan annotations).

If trying to build 32-bit on a 64-bit Linux host: your distribution may not have packaged the 32-bit sanitizer runtime such that you can install it at the same time as the 64-bit (e.g., Ubuntu 14.04 has this problem for Clang but not GCC); also, if you're overriding the compiler by setting CC in the environment, this will confuse the NSPR build (but not the NSS build) into not setting -m32, so you'll have to add it to the $CC and $CCC you set.

Also, there's an EXTRA_SANITIZER_FLAGS environment variable for supplying other flags to be used alongside -fsanitize=address; I've used this to try out sanitizer coverage, but I might make that its own variable.
Hey Jed also make sure to add "-fno-omit-frame-pointer" to the CFLAGS of both the NSPR and NSS builds for good stack traces[1].

Also I think EXTRA_SANITIZER_FLAGS is a good idea for MSan and UBSan when it is added later. It could be used for flags such as "-fno-sanitize-recover=..." etc.

[1]http://clang.llvm.org/docs/AddressSanitizer.html#usage
Duplicate of this bug: 1203286
Attached patch create_dir.patch (obsolete) — Splinter Review
Also how about adding something like this to create a separate directory for the builds? I think this makes sense because you may want to have ASan and non ASan debug builds and also in the future UBSan and MSan as well.

I don't know if this should live in arch.mak though
Attached patch bug1233568-nss-asan-hg0.diff (obsolete) — Splinter Review
This fixes my FIXMEs.  Attachment 8707696 [details] [diff] works fine as a separate patch, so I'm leaving it as a separate patch.

One thing I've run into, when running tests/all.sh on Linux with GCC just now, are a few instances of this (and sometimes several next to each other with the same pid):

Assertion failure: 0 == rv, at ../../../../pr/src/pthreads/ptthread.c:870
==126342== AddressSanitizer CHECK failed: ../../../../src/libsanitizer/asan/asan_rtl.cc:397 "((curr_thread)) != (0)" (0x0, 0x0)

The first assertion is in _pt_thread_death in NSPR:

    869         rv = pthread_setspecific(pt_book.key, NULL);
    870         PR_ASSERT(0 == rv);

The second is probably this, from __asan_handle_no_return[*]; it eventually winds up in AsanTSDGet which calls pthread_getspecific:

    396   AsanThread *curr_thread = asanThreadRegistry().GetCurrent();
    397   CHECK(curr_thread);

[*] https://github.com/gcc-mirror/gcc/blob/gcc-4_8_4-release/libsanitizer/asan/asan_rtl.cc#L397

I wonder if we've managed to run this on a non-main thread after the main thread has already run enough of the exit()/destructor code such that thread specifics no longer work.

In any case it doesn't cause test failures, which may or may not be a good thing.

Also, the presence of an NSPR assertion points out (it was an opt build, because I wanted the tests to finish in a reasonable amount of time) that I'm passing --enable-debug to the NSPR build when I think I actually wanted --enable-debug-symbols.  Or maybe just to pass the -g (and -fno-omit-frame-pointer, if applicable?) directly.  As-is it works, but it's a little not-least-surprise.
Attachment #8701699 - Attachment is obsolete: true
Attached patch bug1233568-nss-asan-hg1.diff (obsolete) — Splinter Review
Fixed things to enable only -g, not assertions, unconditionally for NSPR; and to pass along -fno-omit-frame-pointer; and to correctly document what ASan needs it for (or at least follow what the ASan documentation I found said about it, which more or less makes sense).
Attachment #8717300 - Attachment is obsolete: true
Attachment #8707696 - Flags: review?(wtc)
Attachment #8718530 - Flags: review?(wtc)
Comment on attachment 8718530 [details] [diff] [review]
bug1233568-nss-asan-hg1.diff

Review of attachment 8718530 [details] [diff] [review]:
-----------------------------------------------------------------

Jed: thanks for the patch. I suggest some changes.
You can just indicate whether you agree or disagree
with my suggestions, and I can make the changes you
agree with before I check in this patch.

::: Makefile
@@ +89,5 @@
> +ifdef SANITIZER_CFLAGS
> +ifdef BUILD_OPT
> +NSPR_CONFIGURE_OPTS += --enable-debug-symbols
> +endif
> +NSPR_COMPILERS += CFLAGS='$(SANITIZER_CFLAGS)' CXXFLAGS='$(SANITIZER_CFLAGS)' LDFLAGS='$(SANITIZER_LFLAGS)'

Maybe NSPR_COMPILERS should be renamed to reflect
its purpose better. It is a bunch of environment
variables for the NSPR configure script. DO you
have any suggestion? Perhaps NSPR_CONFIGURE_ENV?

::: coreconf/Linux.mk
@@ +151,5 @@
>  DSO_CFLAGS		= -fPIC
>  DSO_LDOPTS		= -shared $(ARCHFLAG) -Wl,--gc-sections
>  # The linker on Red Hat Linux 7.2 and RHEL 2.1 (GNU ld version 2.11.90.0.8)
>  # incorrectly reports undefined references in the libraries we link with, so
>  # we don't use -z defs there.

Nit: please put the new comment here, so that the two
reasons for not using -z defs form a contiguous comment
block.

@@ +160,1 @@
>  DSO_LDOPTS		+= $(if $(findstring 2.11.90.0.8,$(shell ld -v)),,$(ZDEFS_FLAG))

We should ask the Red Hat NSS developers to find out
if we still need to support building NSS on RHEL 2.1.
We may be able to remove the special case for GNU ld
2.11.90.0.8 now.

::: coreconf/sanitizers.mk
@@ +1,4 @@
> +# Address Sanitizer support; include this in OS-specific .mk files
> +# *after* defining the variables that are appended to here.
> +
> +ifeq ($(USE_ASAN), 1)

Nit: if USE_ASAN is only set to 1, in NSS makefiles
we usually use the simpler test of "ifdef USE_ASAN".

::: tests/common/init.sh
@@ +44,5 @@
>  NSS_STRICT_SHUTDOWN=1
>  export NSS_STRICT_SHUTDOWN
>  
> +# If using ASan, disable LSan; see bug 1246801.
> +export ASAN_OPTIONS=detect_leaks=0${ASAN_OPTIONS:+,$ASAN_OPTIONS}

Do we not need to quote the new value of ASAN_OPTIONS, like this?

  	export ASAN_OPTIONS="detect_leaks=0${ASAN_OPTIONS:+,$ASAN_OPTIONS}"
Comment on attachment 8707696 [details] [diff] [review]
create_dir.patch

Review of attachment 8707696 [details] [diff] [review]:
-----------------------------------------------------------------

Hi Jed, Tyson,

Is this patch necessary? Are you doing a regular build and an
ASAN build in the same NSS source tree and need to run the
NSS test suite against both builds?
(In reply to Wan-Teh Chang from comment #10)
> Comment on attachment 8707696 [details] [diff] [review]
> create_dir.patch
> 
> Review of attachment 8707696 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Hi Jed, Tyson,
> 
> Is this patch necessary? Are you doing a regular build and an
> ASAN build in the same NSS source tree and need to run the
> NSS test suite against both builds?

Hey Wan-Teh,

Yes I think this patch is absolutely necessary. At the very least I think the test suite should be run on a regular basis with an ASan build. Being able to easily build a project written in C/C++ with ASan is, IMO, fundamental to the stability and security of the project, especially something like NSS.

In terms of fuzzing this is very helpful since fuzzing with ASan is much more effective then fuzzing without it. And last this is a good start point for adding support for other sanitizes such as Memory, Thread and Undefined Behavior Sanitizer.

That's my point of view, Jed or others may have more to add.
There might be some confusion here — I think comment #10 is specifically about the patch to add "_ASAN" to the objdir.  It's still possible to run the tests with ASan enabled without it.

On that topic:

* I've been doing ASan and non-ASan builds in the same tree while working on this, for whatever that's worth.  Having the builds coexist has been useful.

* We already stamp the objdirs with a few other build attributes that might not pass that criterion, like the compiler type (gcc vs. clang).

* Changing between ASan and non-ASan without that patch means manually removing all the objdirs — including NSPR; see earlier comments about how NSPR and its callers need to match ASan-ness — and rebuilding.
(In reply to Wan-Teh Chang from comment #9)
> ::: Makefile
> @@ +89,5 @@
> > +ifdef SANITIZER_CFLAGS
> > +ifdef BUILD_OPT
> > +NSPR_CONFIGURE_OPTS += --enable-debug-symbols
> > +endif
> > +NSPR_COMPILERS += CFLAGS='$(SANITIZER_CFLAGS)' CXXFLAGS='$(SANITIZER_CFLAGS)' LDFLAGS='$(SANITIZER_LFLAGS)'
> 
> Maybe NSPR_COMPILERS should be renamed to reflect
> its purpose better. It is a bunch of environment
> variables for the NSPR configure script. DO you
> have any suggestion? Perhaps NSPR_CONFIGURE_ENV?

That's a good point.  NSPR_CONFIGURE_ENV sounds fine.
 
> ::: coreconf/Linux.mk
> @@ +151,5 @@
> >  DSO_CFLAGS		= -fPIC
> >  DSO_LDOPTS		= -shared $(ARCHFLAG) -Wl,--gc-sections
> >  # The linker on Red Hat Linux 7.2 and RHEL 2.1 (GNU ld version 2.11.90.0.8)
> >  # incorrectly reports undefined references in the libraries we link with, so
> >  # we don't use -z defs there.
> 
> Nit: please put the new comment here, so that the two
> reasons for not using -z defs form a contiguous comment
> block.

No objection to this change.

> @@ +160,1 @@
> >  DSO_LDOPTS		+= $(if $(findstring 2.11.90.0.8,$(shell ld -v)),,$(ZDEFS_FLAG))
> 
> We should ask the Red Hat NSS developers to find out
> if we still need to support building NSS on RHEL 2.1.
> We may be able to remove the special case for GNU ld
> 2.11.90.0.8 now.

I filed bug 1249457.

> ::: coreconf/sanitizers.mk
> @@ +1,4 @@
> > +# Address Sanitizer support; include this in OS-specific .mk files
> > +# *after* defining the variables that are appended to here.
> > +
> > +ifeq ($(USE_ASAN), 1)
> 
> Nit: if USE_ASAN is only set to 1, in NSS makefiles
> we usually use the simpler test of "ifdef USE_ASAN".

The idea was that USE_ASAN=0 shouldn't enable ASan.  But NSS seems to be not entirely consistent about whether any nonempty string is true vs. only "1"; contrast uses of BUILD_OPT and USE_64.

> ::: tests/common/init.sh
> @@ +44,5 @@
> >  NSS_STRICT_SHUTDOWN=1
> >  export NSS_STRICT_SHUTDOWN
> >  
> > +# If using ASan, disable LSan; see bug 1246801.
> > +export ASAN_OPTIONS=detect_leaks=0${ASAN_OPTIONS:+,$ASAN_OPTIONS}
> 
> Do we not need to quote the new value of ASAN_OPTIONS, like this?
> 
>   	export ASAN_OPTIONS="detect_leaks=0${ASAN_OPTIONS:+,$ASAN_OPTIONS}"

Nice catch; thanks.  Yes, that should either be quoted or separated into an assignment and plain `export ASAN_OPTIONS`, like the NSS_STRICT_SHUTDOWN example in the diff context.  (This file is #!/bin/bash, which suggests the extended form of export is valid.)
(In reply to Jed Davis [:jld] from comment #12)

Oh I see. Sorry I did misunderstand wtc :)
Updated with the suggested changes that I agreed with (see comment #13).
Attachment #8718530 - Attachment is obsolete: true
Attachment #8718530 - Flags: review?(wtc)
Attachment #8723342 - Flags: review?(wtc)
Comment on attachment 8723342 [details] [diff] [review]
bug1233568-nss-asan-hg3.diff

Review of attachment 8723342 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc. I edited your patch and checked it in yesterday.

https://hg.mozilla.org/projects/nss/rev/47879a76b71a

::: Makefile
@@ +89,5 @@
> +ifdef SANITIZER_CFLAGS
> +ifdef BUILD_OPT
> +NSPR_CONFIGURE_OPTS += --enable-debug-symbols
> +endif
> +NSPR_CONFIGURE_ENV += CFLAGS='$(SANITIZER_CFLAGS)' CXXFLAGS='$(SANITIZER_CFLAGS)' LDFLAGS='$(SANITIZER_LFLAGS)'

I folded this long line.

Can we change "SANITIZER_LFLAGS" to "SANITIZER_LDFLAGS"? ("L" vs. "LD")?
LFLAGS is the Lex flags.

::: coreconf/Darwin.mk
@@ +138,5 @@
>  endif
> +
> +
> +include $(CORE_DEPTH)/coreconf/sanitizers.mk
> +DARWIN_SDK_SHLIBFLAGS += $(SANITIZER_FLAGS)

I deleted a blank line.

::: coreconf/Linux.mk
@@ +152,5 @@
>  DSO_LDOPTS		= -shared $(ARCHFLAG) -Wl,--gc-sections
>  # The linker on Red Hat Linux 7.2 and RHEL 2.1 (GNU ld version 2.11.90.0.8)
>  # incorrectly reports undefined references in the libraries we link with, so
>  # we don't use -z defs there.
> +# Also, -z defs conflicts with Address Sanitizer: it emits relocations

I changed "it" to "which" to make it clear that we're referring
to Address Sanitizer, not "-z defs".

::: tests/common/init.sh
@@ +44,5 @@
>  NSS_STRICT_SHUTDOWN=1
>  export NSS_STRICT_SHUTDOWN
>  
> +# If using ASan, disable LSan; see bug 1246801.
> +ASAN_OPTIONS=detect_leaks=0${ASAN_OPTIONS:+,$ASAN_OPTIONS}

I added quotes around the new value of ASAN_OPTIONS.

If the old value of ASAN_OPTIONS contains whitespace, the quotes
will be necessary.

It seems that you use a comma as the separator. I can't find
documentation on what separators ASAN_OPTIONS uses. The examples
I found on the web use spaces or colons.
Attachment #8723342 - Flags: review?(wtc) → review+
Attachment #8723342 - Flags: checked-in+
I edited Tyson's patch and checked it in. Mainly, I moved
the code to a more appropriate location in the file, rather
than in the middle of the code defining OS_ARCH and OS_RELEASE.

https://hg.mozilla.org/projects/nss/rev/a61e85d5a367
Attachment #8707696 - Attachment is obsolete: true
Attachment #8707696 - Flags: review?(wtc)
Attachment #8724601 - Flags: review?(twsmith)
Attachment #8724601 - Flags: feedback?(jld)
Attachment #8724601 - Flags: checked-in+
You should have checked the tree status at test.nss-crypto.org prior to check in.

The NSS trunk is currently frozen for the NSS 3.23 release process, until tomorrow, and the release hasn't been tagged yet.

In the future, please check that page, and if the tree is closed, please set the checkin-needed keyword on the bug instead.

I don't know yet if I'll have to back you out. If nothing else happens, it might work to tag the earlier snapshot.
Kai: I am very sorry! You can back out these two checkins if necessary.
(In reply to Wan-Teh Chang from comment #16)
> Can we change "SANITIZER_LFLAGS" to "SANITIZER_LDFLAGS"? ("L" vs. "LD")?
> LFLAGS is the Lex flags.

No objection to making that change.

> ::: coreconf/Linux.mk
> @@ +152,5 @@
> >  DSO_LDOPTS		= -shared $(ARCHFLAG) -Wl,--gc-sections
> >  # The linker on Red Hat Linux 7.2 and RHEL 2.1 (GNU ld version 2.11.90.0.8)
> >  # incorrectly reports undefined references in the libraries we link with, so
> >  # we don't use -z defs there.
> > +# Also, -z defs conflicts with Address Sanitizer: it emits relocations
> 
> I changed "it" to "which" to make it clear that we're referring
> to Address Sanitizer, not "-z defs".

Thanks.

> ::: tests/common/init.sh
> @@ +44,5 @@
> >  NSS_STRICT_SHUTDOWN=1
> >  export NSS_STRICT_SHUTDOWN
> >  
> > +# If using ASan, disable LSan; see bug 1246801.
> > +ASAN_OPTIONS=detect_leaks=0${ASAN_OPTIONS:+,$ASAN_OPTIONS}
> 
> I added quotes around the new value of ASAN_OPTIONS.
> 
> If the old value of ASAN_OPTIONS contains whitespace, the quotes
> will be necessary.

They might be helpful for clarity, but they're not necessary: this context isn't subject to word splitting after parameter substitution.  (Also, I just realized that bash doesn't word-split arguments to `export` — although dash does — so it wasn't broken before like I thought it was.)

> It seems that you use a comma as the separator. I can't find
> documentation on what separators ASAN_OPTIONS uses. The examples
> I found on the web use spaces or colons.

Yes, that was wrong.  The compiler-rt source confirms:

      // Read until the next space or colon.
      end = pos + internal_strcspn(pos, " :\r\n\t");

Sorry about that.  Do you want to fix that (once the tree freeze issue is sorted out) or should I prepare a patch?

At this point it might make the most sense to back the patches out and then reland them with fixes after the tree is unfrozen.
Attachment #8724601 - Flags: feedback?(jld) → feedback+
A followup to fix the mistakes that got in.  Two items mentioned above:

* Use correct delimiter in ASAN_OPTIONS.
* Variable name change (LDFLAGS vs. LFLAGS).

Also, one that I discovered while testing, which I'd regressed from an earlier version of the patch while renaming things (sorry about that):

* Unbreak Mac ASan build -- shlib linker flags were wrong.
Attachment #8727694 - Flags: review?(wtc)
Comment on attachment 8727694 [details] [diff] [review]
Patch with fixups

Review of attachment 8727694 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc. Thanks! Patch checked in:
https://hg.mozilla.org/projects/nss/rev/eba25272d225
Attachment #8727694 - Flags: review?(wtc)
Attachment #8727694 - Flags: review+
Attachment #8727694 - Flags: checked-in+
Status: NEW → RESOLVED
Closed: 5 years ago
OS: Unspecified → All
Priority: -- → P1
Hardware: Unspecified → All
Resolution: --- → FIXED
Target Milestone: --- → 3.24
Attachment #8724601 - Flags: review?(twsmith) → review+
See Also: → 1288722
See Also: → 1294394
You need to log in before you can comment on or make changes to this bug.