Closed Bug 624878 Opened 10 years ago Closed 10 years ago

Remove dangerous uses of vanilla (throw-on-failure) |operator new|

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: njn, Assigned: njn)

References

Details

(Whiteboard: hardblocker, fixed-in-tracemonkey)

Attachments

(2 files, 3 obsolete files)

The JS engine has several uses of vanilla |operator new|:  in jstracer.cpp, in YARR, in JM, in CTypes.  On my Linux box, it throws if it cannot allocate, even though -fno-exceptions is specified.  (But if I try to catch the exception, GCC aborts with "error: exception handling disabled, use -fexceptions to enable", argh!)

We overload 'new' in TraceRecorder with a version that uses js_calloc().  I think we should do that globally within the JS engine, for all the |operator new| variants.

This blocks 624645 -- we can't fix the unhandled OOMs in InitJIT() until we have a way to detect them!
This should block 2.0, lots of potential for crashes.

Also, the places that call operator new need to check for NULL.
blocking2.0: --- → ?
To clarify a point that shaver raised in IRC:  there is lots of potential for new crashes that didn't occur in 3.6.  That's because (a) there are new unchecked allocations in JM and YARR, and (b) 4.0 uses more memory than 3.6, so is likely to hit OOM more often.
blocking2.0: ? → betaN+
Whiteboard: softblocker
(In reply to comment #0)
> The JS engine has several uses of vanilla |operator new|:  in jstracer.cpp, in
> YARR, in JM, in CTypes.  On my Linux box, it throws if it cannot allocate, even
> though -fno-exceptions is specified.

That's because your libstdc++ is compiled with -fexceptions.  Surprise!

(In reply to comment #1)
> This should block 2.0, lots of potential for crashes.

Exceptions thrown from gecko code are also a potential security vulnerability when plugins and/or binary extensions are loaded.
Here's a straw-man patch.  It removes the new/delete overloadings from TraceRecorder and puts global ones in jsapi.h.  I used Valgrind to print backtraces within the newly-added new/new[]/delete/delete[] variants (see http://blog.mozilla.com/nnethercote/2011/01/11/using-valgrind-to-get-stack-traces/ for how) and confirmed that the places I'd eyeballed as using vanilla 'new' are correctly using the newly-added variants.

cjones:  mozalloc.h has a lot of extra stuff on its declarations of new/new[]/delete/delete[], esp. to do with 'throw' declarations.  I guess I'll need to emulate them?  I've only tested on Linux so far.

brendan:  putting these in jsapi.h is a bad idea because it exposes them to embedders.  jsprvtd.h looks like a better place -- it appears to be #included in every .cpp file within SpiderMonkey but isn't #included in jsapi.h.

Brendan also suggested just hand-fixing the current places that use vanilla new to use something else, but if global replacement is simple I'd prefer that as it prevents accidental use of vanilla 'new' later on.
Attachment #503009 - Flags: feedback?(jones.chris.g)
Attachment #503009 - Flags: feedback?(brendan)
Comment on attachment 503009 [details] [diff] [review]
draft patch (against TM 60026:0ccd15d19393)

This is going to bring on a lot of pain.  Implementing global operator new/delete without generating copious compiler warnings, breaking C sources, accidentally exporting the overrides, and in this case, not competing with Gecko's operator new/delete, is not going to be fun.  And this impl will probably end up duplicating all the goop we had to add for mozalloc.  I'd really rather we moved mozalloc or its successor into js/src and shared it.

If the goal is to prevent folks from using libstdc++ new, and there are ~10-20 uses currently, I would suggest instead fixing the current callers and then preventing its use in the future.  It would be fairly easy to cook up a Makefile rule that used nm and/or dumpbin (probably only need to run it on one platform) to check libmozjs.so or its .o's for dependencies on libstdc++ new.
Attachment #503009 - Flags: feedback?(jones.chris.g) → feedback-
Comment 5, as last para of comment 4 says, is the way. My IRC words were a bit more than a suggestion. Recommendation, at this point.

/be
(In reply to comment #6)
> Comment 5, as last para of comment 4 says, is the way. My IRC words were a bit
> more than a suggestion. Recommendation, at this point.

Now that cjones has confirmed that global overrides are a bad idea, sure.
This patch:

- Adds js_new/js_delete to jsutil.h and replaces all occurrences of vanilla
  new/delete with them.  39 new/new[] calls were converted.  I added NULL
  checks to several that were missing;  16 of them still lack NULL checks;
  they're all in YARR and more difficult to fix, and covered by bug 574459.

  I chose the names js_new()/js_delete() to mirror js_malloc()/js_free().
  There is also js_vec_new()/js_vec_delete();  I chose those names to mirror
  GCC's builtin_vec_{new,delete}, and they were as good as anything else I
  could think of.

- A couple of places where we had:

    new Foo(0)

  had to be changed to:

    new Foo((T*)NULL)

  I guess the template instantiation needs to know the exact type.

- Also adds JS_NEW/JS_DELETE macros for the cases where classes have private
  constructors/destructors.  A bit ugly, but it was the neatest solution I
  could come up with, and they're only used in a few places.

- Removes the vanilla new/delete overrides in TraceRecorder, there wasn't
  no point having them now that we have global js_new()/js_delete().

- Removes ValueDeleter<T> by inlining it into its one use-point, and moves
  deleteAllValues;  it's easier to understand in its new place, the T and T*
  parameters were getting confusing through the multiple layers of templated
  calls.

- Adds a script, find_vanilla_new_calls, which is run when 'make check' is
  run in js/src/, but only on Linux.  That should be good enough to catch
  any vanilla new calls that creep in.  There are identical copies in
  config/ and js/src/config to satisfy the sync checking.

  I'm currently running tests on the try server, for both cases:  where we
  still have vanilla new calls (to ensure they're detected) and where we
  don't have vanilla new calls (to ensure it's accepted).

Valgrind's alloc/free mismatch checking was a huge help with this task.  Props to Dirk Mueller for implementing it.

A questions:
- Is it ok to hardwire libmozjs.so in to the find_new_vanilla_call
  invocation in js/src/Makefile.in?

I'm asking Luke to review particularly for C++ language lawyering, Brendan
for feedback particularly on API additions, and cjones for feedback
particularly on find_new_vanilla_calls.  Thanks!  This patch should reduce
our crash-on-OOM rate greatly.
Attachment #503009 - Attachment is obsolete: true
Attachment #503642 - Flags: review?(lw)
Attachment #503642 - Flags: feedback?(brendan)
Attachment #503009 - Flags: feedback?(brendan)
This should be a hardblocker, IMO.  See comment 2.
js_new etc. kind of makes the usage of new/delete pretty gross :/  Would be nice to find a way to integrate mozalloc so as not have add a speedtrap for someone writing code for JS.

Also, I think js_vec_delete is wrong -- it needs to call the destructor on every element, and in this case will only call it on the first, no?

Looking at mozalloc.h -- operator delete doesn't seem to call /any/ destructors?  Or is this ok because the global operators are invoked after destructors?  (Since they only have a void* to work with)
(In reply to comment #8)
> - Adds a script, find_vanilla_new_calls, which is run when 'make check' is
>   run in js/src/, but only on Linux.  That should be good enough to catch
>   any vanilla new calls that creep in.  There are identical copies in
>   config/ and js/src/config to satisfy the sync checking.

I could be wrong, but AIUI the sync checking allows files in js/src/config which don't exist in config/.

(BTW, kudos on adding the `make check` script.)
(In reply to comment #10)
> js_new etc. kind of makes the usage of new/delete pretty gross :/  Would be
> nice to find a way to integrate mozalloc so as not have add a speedtrap for
> someone writing code for JS.

I'm not sure what you mean by a speedtrap, but I'm open to hearing other suggestions that will (a) fix our unchecked allocation problems for Firefox 4.0, and (b) still allow JS embedders to use JS_USE_CUSTOM_ALLOCATOR.  

(W.r.t. (b), ap_bb was complaining on IRC just the other day that he was having to replace all the vanilla new/deletes in SM with calls to his custom allocators).

> Also, I think js_vec_delete is wrong -- it needs to call the destructor on
> every element, and in this case will only call it on the first, no?

Mmm, yeah, that sounds right, good catch.
(In reply to comment #11)
> 
> I could be wrong, but AIUI the sync checking allows files in js/src/config
> which don't exist in config/.

I originally only had it in js/src/config/ but check-sync-dirs.py barfed on the tryserver.
(In reply to comment #9)
> This should be a hardblocker, IMO.  See comment 2.

OK. It seems borderline, in the sense that we really could release without it, but it also seems more important than most of the softblockers.
Whiteboard: softblocker → hardblocker
(In reply to comment #12)
> (In reply to comment #10)
> > js_new etc. kind of makes the usage of new/delete pretty gross :/  Would be
> > nice to find a way to integrate mozalloc so as not have add a speedtrap for
> > someone writing code for JS.
> 
> I'm not sure what you mean by a speedtrap, but I'm open to hearing other
> suggestions that will (a) fix our unchecked allocation problems for Firefox
> 4.0, and (b) still allow JS embedders to use JS_USE_CUSTOM_ALLOCATOR.  

Just meant that someone doing work on js, especially a new contributor, will get bitten.. in Fx proper we're trying to move things to more standard c++, so this in JS seems like a step in the wrong direction (especially given that you guys are using more normal c++ features anyway :-).  But I agree that I don't really see many other options, especially when JS_USE_CUSTOM_ALLOCATOR is considered.
(In reply to comment #15)
 > 
> Just meant that someone doing work on js, especially a new contributor, will
> get bitten..

Hopefully find_vanilla_new_calls will minimize any bad consequences.
> A questions:
> - Is it ok to hardwire libmozjs.so in to the find_new_vanilla_call
>   invocation in js/src/Makefile.in?

Apparently not, I got "nm: './../../dist/bin/libmozjs.so': No such file" on the try server.
And yes, SpiderMonkey is weird because it is distributed standalone.  And it contains Nanojit, which is also distributed standalone (more or less), so there's lots of inconsistencies there, esp. w.r.t. memory management.  I hope to reduce the inconsistencies in the future, but that's a way off.
(In reply to comment #10)
> 
> Also, I think js_vec_delete is wrong -- it needs to call the destructor on
> every element, and in this case will only call it on the first, no?

Aw jeez, I just realized this means I need to store the number of elements for each js_vec_new().  What a pain.  I guess the standard thing is to store the size to the left of the returned pointer, but I'm worried about getting the alignment right.  Hmm.
 
> Looking at mozalloc.h -- operator delete doesn't seem to call /any/
> destructors?  Or is this ok because the global operators are invoked after
> destructors?  (Since they only have a void* to work with)

That's a very good question that I'd dearly like to know the answer to.
(In reply to comment #19)
> 
> > Looking at mozalloc.h -- operator delete doesn't seem to call /any/
> > destructors?  Or is this ok because the global operators are invoked after
> > destructors?  (Since they only have a void* to work with)
> 
> That's a very good question that I'd dearly like to know the answer to.

But the operator new overrides don't call the constructors... I guess if you override new/delete you just have to do the memory allocation/deallocation and the system calls the constructors/destructors for you?  Whereas I'm defining new functions js_new/js_delete so I have to do that work myself...?
(In reply to comment #20)
> But the operator new overrides don't call the constructors... I guess if you
> override new/delete you just have to do the memory allocation/deallocation and
> the system calls the constructors/destructors for you?

Right.

> Whereas I'm defining
> new functions js_new/js_delete so I have to do that work myself...?

In your case, you're using placement new for js_new, so that'll call the constructor.  But for the destructor in this case, you're on your own.. and yeah, you have to store the length somewhere.  You still want to return a 16-byte aligned pointer, so I think you can get away with allocating 16 bytes more and using that to store your length.
This patch implements js_vec_new/js_vec_delete properly, ie. so that the destructors are run for each element in the array when js_vec_delete is called.
Attachment #503642 - Attachment is obsolete: true
Attachment #503746 - Flags: review?
Attachment #503746 - Flags: feedback?(brendan)
Attachment #503642 - Flags: review?(lw)
Attachment #503642 - Flags: feedback?(brendan)
Attachment #503746 - Flags: feedback?(jones.chris.g)
Comment on attachment 503746 [details] [diff] [review]
patch v2 (against TM 60213:dff3fccf7407)

Looks OK.  A sugar-ier option for single-object alloction would be
|new (js) T(...)| and jsdelete(p), but that relies on being able to
detect use of |delete p| instead of jsdelete(p) which your comments in
find_vanilla_new_calls suggest is surprisingly hard.  Not sure what's
going on there ... do you have a small testcase?

>+template <class T>
>+JS_ALWAYS_INLINE T *js_vec_new(size_t n) {
>+	/* The length is stored just before the vector memory. */
>+    void *memory = js_malloc(JSMinAlignment + sizeof(T) * n);

I don't know if SM discipline requires callers to check beforehand for
possible overflow of |new T[n]|, but it would be useful to check here
if only in debug builds.

Regardless of |new (js)| vs js_new(), you definitely want js_vec_new()
instead of new[], because gcc refuses to check for overflow of new[]
allocs.  MSVC does check.
Attachment #503746 - Flags: feedback?(jones.chris.g) → feedback+
(In reply to comment #23)
> I don't know if SM discipline requires callers to check beforehand for
> possible overflow of |new T[n]|, but it would be useful to check here
> if only in debug builds.

Good catch -- unless n is constant, this must be a runtime check in release builds that forces failure. Multiplication wraparound bugs are sg:critical in general.

/be
A few comments:

Finding vanilla new calls from 'make check' is awesome.

Template argument deduction makes the T in js_delete<T>(p) unnecessary, so you can drop the <T> and just write js_delete(p).

An alternative to JS_NEW/JS_DELETE is to just use js_new/js_delete like normal and have the hiding class befriend the js_new/js_delete templates:

class A {
  ~A() {}
  template <class T> friend void js_delete(T *);
};

A little arduous to write at first but we will read them a lot more than we write them and macros are a brain drag when reading/maintaining code.
+1 on avoiding UGLY_MACROS.

/be
(In reply to comment #25)
> 
> Template argument deduction makes the T in js_delete<T>(p) unnecessary, so you
> can drop the <T> and just write js_delete(p).

Oh, I did not know that.  Good.

> An alternative to JS_NEW/JS_DELETE is to just use js_new/js_delete like normal
> and have the hiding class befriend the js_new/js_delete templates:
> 
> class A {
>   ~A() {}
>   template <class T> friend void js_delete(T *);
> };

That is better, yes.

This is why everyone needs a good (language) lawyer :)
Attachment #503746 - Flags: review? → review?(lw)
Comment on attachment 503746 [details] [diff] [review]
patch v2 (against TM 60213:dff3fccf7407)

r+ with comment 25 addressed.

>     if (!tm->recordAttempts->init(PC_HASH_COUNT))
>     if (!tm->loopProfiles->init(PC_HASH_COUNT))
>-        abort();
>+        return false;

Wow, InitJIT already returned a bool; that was just flagrantly abortious.

Two rather-nitty nits; feel free to ignore:
 - For the purpose of short code, I think all the branches and null-setting in FinishJIT could be removed, leaving a linear list of delete statements.
 - Since "arrays" are the language-level feature, not "vectors", I would have expected the name to be js_arr_delete/js_delete_arr
Attachment #503746 - Flags: review?(lw) → review+
Comment on attachment 503746 [details] [diff] [review]
patch v2 (against TM 60213:dff3fccf7407)

Luke's review covers any feedback I can give, except let's spell out array, no arr shorthand.

/be
Attachment #503746 - Flags: feedback?(brendan) → feedback+
(In reply to comment #25)
> 
> Finding vanilla new calls from 'make check' is awesome.

And proving its worth already -- it appears that shell builds don't include CTypes but browser builds do, and I didn't realize that I needed to remove vanilla new/delete from CTypes code until the script failed in a browser build.
(In reply to comment #23)
> Comment on attachment 503746 [details] [diff] [review]
> patch v2 (against TM 60213:dff3fccf7407)
> 
> Looks OK.  A sugar-ier option for single-object alloction would be
> |new (js) T(...)| and jsdelete(p), but that relies on being able to
> detect use of |delete p| instead of jsdelete(p) which your comments in
> find_vanilla_new_calls suggest is surprisingly hard.  Not sure what's
> going on there ... do you have a small testcase?

Here's the smallest I could manage:

  class Foo { virtual ~Foo() {} };
  int main(void) { Foo* f = new Foo(); return 0; }

I like the symmetry of js_new/js_delete, and others seem happy with it, so I'll stick with it.
Updated patch.  Nobody asked for additional review but there ended being many changes so I'm putting the revised patch up in case anyone wants a look.  I'll land it tomorrow once it passes tryserver, assuming no-one complains.

Changes against v2:

- find_vanilla_new_calls now fails if it can't find the given file.

- Used $(LIBRARY) in the Makefile, it resolves to libjs_static.a in both
  shell builds and browser builds;  browser builds don't build libmozjs.so.

- Renamed js_vec_{new,delete} as js_array_{new,delete}.

- Added an overflow check in js_array_new().

- Removed the unnecessary <T> from js_delete/js_array_delete calls.

- Added an initializer for TraceRecorder::addPropShapeBefore, which was
  necessary because TraceRecorder is no longer allocated with js_calloc(),
  and so isn't auto-zeroed.

- Removed the unnecessary NULL checks before deleting things in FinishJIT();
  I kept the NULL assignments, though, out of caution.

- Removed JS_NEW/JS_DELETE.  Getting the interaction between templates,
  namespaces and friends correct was a real trip, esp. since a GCC bug
  (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32204) means that this form
  doesn't work:

    template <class T> friend void js_delete(T *);

  because js_delete gets resolved to the local namespace instead of the 
  global namespace.  So I used this form instead:

    friend void js_delete<>(t *);

  where |t| is the actual type, not the template type.  Good enough, though
  getting it all to compile was rather fiddly.  Many thanks to someone named
  "jag" on #developers for help with this.
Attachment #503746 - Attachment is obsolete: true
jag is (I think) Peter Annema, formerly of Netscape, possibly still at Microsoft via their acquisition of Danger (which used/uses Gecko in a transcoding proxy).

/be
Bloody hell, MSVC can't handle the templating used to remove JS_NEW/JS_DELETE:

ExecutableAllocator.h(142) : error C2275: 'JSC::ExecutablePool' : illegal use of this type as an expression
ExecutableAllocator.h(203) : error C2143: syntax error : missing ';' before '<'
ExecutableAllocator.h(203) : error C2433: 'JSC::js_new' : 'friend' not permitted on data declarations
ExecutableAllocator.h(203) : error C2365: 'JSC::js_new' : redefinition; previous definition was 'function'
ExecutableAllocator.h(109) : see declaration of 'JSC::js_new'
ExecutableAllocator.h(203) : error C2238: unexpected token(s) preceding ';'
ExecutableAllocator.h(215) : error C2275: 'JSC::ExecutableAllocator' : illegal use of this type as an expression
ExecutableAllocator.h(215) : error C2059: syntax error : ')'
RegexPattern.h(63) : error C2143: syntax error : missing ';' before '<'
RegexPattern.h(63) : error C2433: 'JSC::Yarr::js_new' : 'friend' not permitted on data declarations
RegexPattern.h(63) : error C2238: unexpected token(s) preceding ';'
RegexPattern.h(73) : error C2275: 'JSC::Yarr::CharacterClassTable' : illegal use of this type as an expression
RegexPattern.h(293) : error C2275: 'JSC::Yarr::PatternAlternative' : illegal use of this type as an expression
RegexPattern.h(229) : see declaration of 'JSC::Yarr::PatternAlternative'


Argh!  What now?  Go back to the macros?  Insist that constructors/destructors be public?  Sigh.
Yep, that's me, and no longer at MS.

You could try:

  friend Foo *::js_new<Foo,Bar>(const Bar*);

and then in using it:

  Foo* foo = ::js_new<Foo,Bar>(bar);


Did MSVC like the old, templated friend declarations, btw?
(In reply to comment #35)
> Yep, that's me, and no longer at MS.
> 
> You could try:
> 
>   friend Foo *::js_new<Foo,Bar>(const Bar*);
> 
> and then in using it:
> 
>   Foo* foo = ::js_new<Foo,Bar>(bar);
> 
> 
> Did MSVC like the old, templated friend declarations, btw?

I didn't try them, I don't have a Windows machine so trying things is painful.

For all the time I've spent on this, there are only a handful of places where its needed.  I'm just going to inline the body of js_new into those places.  Thanks for your help, though, it's been educational!
template <class T> T* makeT() { return new T; }

namespace N
{
  class Foo
  {
    //template <class T> friend T* ::makeT(); // works with MSVC
    //friend Foo* makeT<>();                  // works with g++
    friend Foo* ::makeT<Foo>();               // works with MSVC and g++
    Foo() {}
  };
}

int main()
{
  makeT<N::Foo>();
}
As does this:

friend Foo* ::makeT<>();                  // works with MSVC and g++

I'm hoping that this (once applied everywhere) will clear up the other errors too. I haven't installed a full Mozilla build env yet.
Attached patch Builds on msvcSplinter Review
jag's right, you need:
  friend void ::js_delete(XYZ *);
when the class in which you are declaring a friend is not in the global namespace.  (Otherwise, instead of finding the decl in the global namespace, it injects a new decl in the enclosing namespace and befriends *that*.  (Yes, friend decls in C++ are awful.)

With that simple change (attached) and not even changing use sites), everything builds find on msvc.  There is this one lame warning emitted, though:

  warning C4396: 'js_delete' : the inline specifier cannot be used when a friend declaration refers to a specialization of a function template

for practically every file.  I think I know what its complaining about and its lame so I think we should add this to the existing list of lame-msvc-warnings-we-ignore.
I'm still inclined to just manually inline the relevant js_new/js_delete calls.  If GCC and MSVC both have trouble, the compilers on Solaris and AIX and whatever else probably will too.  And having to write ::js_new and ::js_delete is ugly.  Basically, I think the friend looks no better than inlining the small function.
(In reply to comment #40)
> I'm still inclined to just manually inline the relevant js_new/js_delete calls.
>  If GCC and MSVC both have trouble, the compilers on Solaris and AIX and
> whatever else probably will too.

If it already works as posted, perhaps we can wait and see?

> And having to write ::js_new and ::js_delete is ugly.  Basically, I think the
> friend looks no better than inlining the small function.

Two chars vs. several lines...  Also, we often have to ::qualify in headers, so its not out of place.

::js_new isn't
(In reply to comment #41)
> 
> If it already works as posted, perhaps we can wait and see?

Sorry, no.  I wasted a couple of hours on this yesterday, I don't want to waste any more in the future.  Worse is better.
Manually inlining the js_new/js_delete calls ended up making the patch 9 lines shorter, primarily because the 11 argument variant of js_new() could be removed.

http://hg.mozilla.org/tracemonkey/rev/4046ef71ddc2
Whiteboard: hardblocker → hardblocker, fixed-in-tracemonkey
(In reply to comment #43)
> 11 argument variant

IHNJH, IJLS "11 argument variant". HAND.
I personally like the "could be removed" part :)
Duplicate of this bug: 626610
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.