Closed Bug 868814 Opened 11 years ago Closed 9 years ago

Fold mozalloc into mozglue

Categories

(Core :: Memory Allocator, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox39 --- disabled
firefox40 --- fixed

People

(Reporter: jwatt, Assigned: glandium)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

Shouldn't we be linking mfbt against mozcrt?

I'm encountering a crash in some patches that I've been working in optimized Windows builds because je_free() ends up being called on memory allocated by msvcr100.dll!malloc.

To be more specific, I have an mfbt function that returns a std::string. If this string is long enough that its internal buffer can't store its contents, then it allocates memory under the following stack:

msvcr100.dll!malloc(unsigned int size)
msvcr100.dll!operator new(unsigned int size)
mozglue.dll!std::_Allocate<char>(unsigned int _Count, char * __formal)
mozglue.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy(unsigned int _Newsize, unsigned int _Oldlen)
mozglue.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Grow(unsigned int _Newsize, bool _Trim)
mozglue.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::append(unsigned int _Count, char _Ch)
mozglue.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::operator+=(char _Ch)
mozglue.dll!StringBuilder::append(char c)
mozglue.dll!WebCore::Decimal::toString()
xul.dll!mozilla::dom::HTMLInputElement::ConvertNumberToString(WebCore::Decimal aValue, nsAString_internal & aResultString)

When this string is returned from the mfbt function, it is assigned directly to a stack std::string in HTMLInputElement.cpp where we end up under the following stack:

xul.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign(const char * _Ptr, unsigned int _Count)
xul.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> >(const char * _Ptr)
xul.dll!mozilla::dom::HTMLInputElement::ConvertNumberToString(WebCore::Decimal aValue, nsAString_internal & aResultString)

In the assign() function, the contents of the mfbt std::string are transferred to the HTMLInputElement std::string by assigning the mfbt string's pointer to the allocated memory to the HTMLInputElement std::string:

  _Myt& assign(_Myt&& _Right)
    {  // assign by moving _Right
    if (this == &_Right)
      ;
    else if (get_allocator() != _Right.get_allocator()
      && this->_BUF_SIZE <= _Right._Myres)
      *this = _Right;
    else
      {  // not same, clear this and steal from _Right
      _Tidy(true);
      if (_Right._Myres < this->_BUF_SIZE)
        _Traits::move(this->_Bx._Buf, _Right._Bx._Buf,
          _Right._Mysize + 1);
      else
        {  // copy pointer
->       this->_Bx._Ptr = _Right._Bx._Ptr;
        _Right._Bx._Ptr = 0;
        }
      this->_Mysize = _Right._Mysize;
      this->_Myres = _Right._Myres;

      _Right._Mysize = 0;
      _Right._Myres = 0;
      }
    return (*this);
    }

Then when the HTMLInputElement std::string is destroyed, we end up here:

mozglue.dll!arena_dalloc(void * ptr, unsigned int offset)
mozglue.dll!je_free(void * ptr)
xul.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Tidy(bool _Built, unsigned int _Newsize)
xul.dll!mozilla::dom::HTMLInputElement::ConvertNumberToString(WebCore::Decimal aValue, nsAString_internal & aResultString)

where we, unsurprisingly, crash in arena_dalloc.
(In reply to Jonathan Watt [:jwatt] from comment #0)
> To be more specific, I have an mfbt function that returns a std::string.

That sounds fishy.
Hmm, I guess we _are_ actually linking mfbt against mozcrt, since mozcrt is part of mozglue, and the "bad" stack of an allocation from mfbt code shows we're using mozglue.dll:

msvcr100.dll!malloc(unsigned int size)
msvcr100.dll!operator new(unsigned int size)
mozglue.dll!std::_Allocate<char>(unsigned int _Count, char * __formal)
mozglue.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy(unsigned int _Newsize, unsigned int _Oldlen)
mozglue.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Grow(unsigned int _Newsize, bool _Trim)
mozglue.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::append(unsigned int _Count, char _Ch)
mozglue.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::operator+=(char _Ch)
mozglue.dll!StringBuilder::append(char c)
mozglue.dll!WebCore::Decimal::toString()
xul.dll!mozilla::dom::HTMLInputElement::ConvertNumberToString(WebCore::Decimal aValue, nsAString_internal & aResultString)

A "good" stack allocating for a std::string in HTMLInputElement.cpp looks like this:

mozglue.dll!choose_arena()
mozglue.dll!imalloc(unsigned int size)
mozglue.dll!je_malloc(unsigned int size)
mozalloc.dll!moz_xmalloc(unsigned int size)
xul.dll!std::_Allocate<char>(unsigned int _Count, char * __formal)
xul.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy(unsigned int _Newsize, unsigned int _Oldlen)
xul.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Grow(unsigned int _Newsize, bool _Trim)
xul.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign(const char * _Ptr, unsigned int _Count)
xul.dll!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> >(const char * _Ptr)
xul.dll!mozilla::dom::HTMLInputElement::ConvertNumberToString(WebCore::Decimal aValue, nsAString_internal & aResultString)

I guess the std::basic_string stuff is in xul.dll because these functions are inline functions in opt Windows builds.

I'm not sure why xul.dll!std::_Allocate would use mozalloc.dll!moz_xmalloc whereas mozglue.dll!std::_Allocate would use msvcr100.dll!operator new, though. That seems...weird.
Mike, can you shed any light on what's going wrong with the linking here?
Flags: needinfo?(mh+mozilla)
The problem is there is no operator new override in mozglue, so it ends up calling msvcrt's operator new, which uses malloc. I have WIP patches that fold mozalloc into mozglue, which would solve this.
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #4)
> The problem is there is no operator new override in mozglue.

More specifically, the STL use from mfbt (in mozglue) calls operator new, which ends up calling msvcrt's operator new, since there is no operator new override in mozglue.
(In reply to Mike Hommey [:glandium] from comment #4)
> I have WIP patches that fold mozalloc into mozglue, which would solve this.

Is there a bug number and an estimated timescale for this? If it can be ready and land before the next uplift I can probably hold off, but otherwise I'm going to need to do something else.
I'll recheck how far it is from completion tomorrow.
Summary: Consider linking mfbt against mozcrt → Fold mozalloc into mozglue
Blocks: 868874
(In reply to Mike Hommey [:glandium] from comment #7)
> I'll recheck how far it is from completion tomorrow.

It needs changes to mozcrt (specifically, it needs a static-linking version of mozcrt). These tend to be tricky things, so I doubt I can make it for the next uplift.
Okay, thanks, Mike.
Assignee: nobody → mh+mozilla
Blocks: 1134920
Component: MFBT → Memory Allocator
OS: Windows 7 → All
Hardware: x86 → All
Please note that the mess with _impl in mozalloc.cpp will have to be sorted out sooner rather than later because it prevents doing the next sensible thing, which is to replace moz_malloc/moz_free with malloc/free, because anything that is in mozglue and uses malloc/free then ends up using the system one due to how mozglue is linked on windows (because, inconveniently, when you have a def file that says that malloc is je_malloc, the linker doesn't care about that for symbol resolution within the library itself).
Attachment #8571185 - Flags: review?(n.nethercote)
Blocks: 1138293
Blocks: 1138321
Comment on attachment 8571185 [details] [diff] [review]
Fold mozalloc library into mozglue

Review of attachment 8571185 [details] [diff] [review]:
-----------------------------------------------------------------

r=me because I don't see any problem but I don't understand most of this patch. And I don't know if anybody other than you really does... the bus factor on this code is not good :(

My only suggestion is to, if you haven't already, rgrep for "mozalloc" to see if there are any remaining mentions that you haven't addressed.

::: memory/mozalloc/mozalloc.cpp
@@ +28,5 @@
> +#define MALLOC_FUNS MALLOC_FUNCS_MALLOC
> +#include "malloc_decls.h"
> +
> +extern "C" MOZ_MEMORY_API char *strdup_impl(const char *);
> +extern "C" MOZ_MEMORY_API char *strndup_impl(const char *, size_t);

Is there a good reason why strdup/strndup are not mentioned in malloc_decls.h?

::: memory/mozalloc/mozalloc_oom.h
@@ +13,4 @@
>  /**
>   * Called when memory is critically low.  Returns iff it was able to 
>   * remedy the critical memory situation; if not, it will abort().
>   * 

Nit: remove trailing whitespace while you are here?

@@ +18,1 @@
>   * used indepedently of mozalloc.h.

Nit: I think this comment is now irrelevant? The re-#define of MOZ_ALLOC_EXPORT above was removed. Also, "independently" is misspelled but that won't matter if it gets removed.

::: memory/mozalloc/staticruntime/moz.build
@@ +2,5 @@
> +# vim: set filetype=python:
> +# This Source Code Form is subject to the terms of the Mozilla Public
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +NO_VISIBILITY_FLAGS = True

Nit: blank line between header comment and this line.

::: xpcom/glue/nomozalloc/Makefile.in
@@ -2,5 @@
> -# This Source Code Form is subject to the terms of the Mozilla Public
> -# License, v. 2.0. If a copy of the MPL was not distributed with this
> -# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> -
> -DIST_INSTALL	= 1

Nit: extraneous whitespace.

::: xpcom/glue/nomozalloc/moz.build
@@ -22,5 @@
> -FORCE_STATIC_LIB = True
> -
> -if CONFIG['_MSC_VER']:
> -    DEFINES['_USE_ANSI_CPP'] = True
> -    # Don't include directives about which CRT to use

Nit: '.' at end of sentence.
Attachment #8571185 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #11)
> Comment on attachment 8571185 [details] [diff] [review]
> Fold mozalloc library into mozglue
> 
> Review of attachment 8571185 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me because I don't see any problem but I don't understand most of this
> patch. And I don't know if anybody other than you really does... the bus
> factor on this code is not good :(

Would the following changes help?
- replace MFBT_API with MOZ_GLUE_API
- have the build system generate _staticruntime libraries on its own.

> My only suggestion is to, if you haven't already, rgrep for "mozalloc" to
> see if there are any remaining mentions that you haven't addressed.
> 
> ::: memory/mozalloc/mozalloc.cpp
> @@ +28,5 @@
> > +#define MALLOC_FUNS MALLOC_FUNCS_MALLOC
> > +#include "malloc_decls.h"
> > +
> > +extern "C" MOZ_MEMORY_API char *strdup_impl(const char *);
> > +extern "C" MOZ_MEMORY_API char *strndup_impl(const char *, size_t);
> 
> Is there a good reason why strdup/strndup are not mentioned in
> malloc_decls.h?

The main reason is that all the declarations in malloc_decls.h are for replace-malloc implementations, and those don't need to implement str(n)dup. That could be arranged, but I'd rather keep that to a followup.

> ::: xpcom/glue/nomozalloc/Makefile.in
> @@ -2,5 @@
> > -# This Source Code Form is subject to the terms of the Mozilla Public
> > -# License, v. 2.0. If a copy of the MPL was not distributed with this
> > -# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> > -
> > -DIST_INSTALL	= 1
> 
> Nit: extraneous whitespace.
> 
> ::: xpcom/glue/nomozalloc/moz.build
> @@ -22,5 @@
> > -FORCE_STATIC_LIB = True
> > -
> > -if CONFIG['_MSC_VER']:
> > -    DEFINES['_USE_ANSI_CPP'] = True
> > -    # Don't include directives about which CRT to use
> 
> Nit: '.' at end of sentence.

Those last two are removed files :)
https://hg.mozilla.org/mozilla-central/rev/beed1c584a22
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Depends on: 1141660
Depends on: 1141759
Depends on: 1142006
Blocks: 1142434
Depends on: 1168291
Depends on: 1173683
Depends on: 1170141
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: