Implement xmalloc regardless of whether jemalloc is disabled

RESOLVED FIXED

Status

()

Core
XPCOM
RESOLVED FIXED
9 years ago
3 years ago

People

(Reporter: Benjamin Smedberg, Assigned: cjones)

Tracking

(Blocks: 2 bugs)

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 10 obsolete attachments)

72.22 KB, patch
Details | Diff | Splinter Review
920 bytes, patch
cjones
: review+
Details | Diff | Splinter Review
(Reporter)

Description

9 years ago
We want to depend on xmalloc. In order to do so, we need an implementation even when jemalloc is not enabled. It can be a dirt-simple nullcheck+abort.
What is (or should be) the relation between this bug and bug 427109 ?
(Reporter)

Comment 2

9 years ago
What part of comment 0 is unclear?
Assignee: benjamin → jones.chris.g
Summary: Implement xmalloc when jemalloc is disabled → Implement xmalloc regardless of whether jemalloc is disabled
Created attachment 391233 [details] [diff] [review]
v1

First step towards infallible malloc.  Next steps are (i) replacing usage of fallible allocators in Moz code; (ii) same for NSPR; (iii) eliminating newly vacuous null checks.  These will be done for separate bugs.
Attachment #391233 - Flags: review?(benjamin)
Blocks: 507082
(Reporter)

Updated

8 years ago
Attachment #391233 - Flags: review?(benjamin) → review-
(Reporter)

Comment 4

8 years ago
Comment on attachment 391233 [details] [diff] [review]
v1

Per IRC discussion, this should be a sharedlib and linked up just about everywhere...
The next patch will also
  #define malloc moz_malloc
instead of
  #define malloc moz_xmalloc
This will allow the library to land now, without any behavioral or performance differences; moz_malloc() is an always_inline wrapper around malloc().
I should clarify that that change allows landing this library along with bug 507082 now.
Created attachment 391364 [details] [diff] [review]
v2
Attachment #391233 - Attachment is obsolete: true
Attachment #391364 - Flags: review?(benjamin)
(Reporter)

Updated

8 years ago
Attachment #391364 - Flags: review?(benjamin) → review+
(Reporter)

Comment 8

8 years ago
Comment on attachment 391364 [details] [diff] [review]
v2

>diff --git a/memory/mozalloc/mozalloc.h b/memory/mozalloc/mozalloc.h

>+extern const fallible_t fallible;

I don't think this should be an extern symbol. Looking up the dynamic address of this symbol is going to be more costly than using a stack temporary. Avoid the global altogether.
Created attachment 391881 [details] [diff] [review]
v3

This patch changed some from the reviewed version --- I stopped trying to support the symbols that jemalloc implements (posix_memalign et al.) when they aren't present on the platform.  This led to too many linker woes because of how we link with jemalloc.

r? again in case bsmedberg wants another look.
Attachment #391881 - Flags: review?(benjamin)
Comment on attachment 391881 [details] [diff] [review]
v3


>+ * |#define MOZALLOC_DEFINE_MACRO_WRAPPERS| to make libc "allocating
>+ * functions" never fail (return NULL).
>+ *
>+ * XXX: these symbols are defined to the fallible allocators.  in the
>+ * future they will be defined to the infallible allocators.

Why is this XXX?  I see the comment earlier about this allowing this to land now, but I don't understand it, especially given that operator new is doing xmalloc.  When would we make the switch?

I'd rather that we switch to infalliable variants everywhere, and then fix things to be falliable as we run into them.
(In reply to comment #10)
> (From update of attachment 391881 [details] [diff] [review])
> 
> >+ * |#define MOZALLOC_DEFINE_MACRO_WRAPPERS| to make libc "allocating
> >+ * functions" never fail (return NULL).
> >+ *
> >+ * XXX: these symbols are defined to the fallible allocators.  in the
> >+ * future they will be defined to the infallible allocators.
> 
> Why is this XXX?  I see the comment earlier about this allowing this to land
> now, but I don't understand it, especially given that operator new is doing
> xmalloc.  When would we make the switch?
> 

#define'ing malloc et al. is sucky and fragile, that's all.  We'd switch whenever someone got bored enough to rewrite the places we currently use malloc() et al. to be explicitly fallible or explicitly infallible.

XXX/pieinthesky would perhaps have been more appropriate.

> I'd rather that we switch to infalliable variants everywhere, and then fix
> things to be falliable as we run into them.

That's the plan.
Whoops, I withdraw this response.  Yeah, this is just a good old-fashioned bug on my part; they were /supposed/ to be defined to moz_xmalloc, etc, and have a comment saying "this sucks, don't define them in the future."

Thanks for the catch.

(In reply to comment #11)
> (In reply to comment #10)
> > (From update of attachment 391881 [details] [diff] [review] [details])
> > 
> > >+ * |#define MOZALLOC_DEFINE_MACRO_WRAPPERS| to make libc "allocating
> > >+ * functions" never fail (return NULL).
> > >+ *
> > >+ * XXX: these symbols are defined to the fallible allocators.  in the
> > >+ * future they will be defined to the infallible allocators.
> > 
> > Why is this XXX?  I see the comment earlier about this allowing this to land
> > now, but I don't understand it, especially given that operator new is doing
> > xmalloc.  When would we make the switch?
> > 
> 
> #define'ing malloc et al. is sucky and fragile, that's all.  We'd switch
> whenever someone got bored enough to rewrite the places we currently use
> malloc() et al. to be explicitly fallible or explicitly infallible.
> 
> XXX/pieinthesky would perhaps have been more appropriate.
> 
> > I'd rather that we switch to infalliable variants everywhere, and then fix
> > things to be falliable as we run into them.
> 
> That's the plan.
Created attachment 393994 [details] [diff] [review]
back to default-infallible

Per newsgroup discussion.
Attachment #391364 - Attachment is obsolete: true
Attachment #391881 - Attachment is obsolete: true
Attachment #391881 - Flags: review?(benjamin)
Created attachment 395384 [details] [diff] [review]
same as before, fixed rotted macro #define
Attachment #393994 - Attachment is obsolete: true
Duplicate of this bug: 507082
Duplicate of this bug: 507252
Depends on: 507083
Blocks: 507249
Created attachment 395990 [details] [diff] [review]
backup of WIP

Playing nice with the wince shunt will be harder than I thought.
Attachment #395384 - Attachment is obsolete: true
Created attachment 396039 [details] [diff] [review]
v4

libmozalloc changed pretty significantly in addition to the wince hackery, requesting re-review.
Attachment #395990 - Attachment is obsolete: true
Attachment #396039 - Flags: review?(benjamin)
Attachment #396039 - Attachment description: v3 → v4
Attachment #396039 - Flags: review?(vladimir)
Attachment #396039 - Flags: review?(bugmail)
(Reporter)

Comment 19

8 years ago
Comment on attachment 396039 [details] [diff] [review]
v4

I didn't believe we were going to redefine malloc by default for this pass: only change the behavior of operator new/new[]. I don't think we should have infallible malloc() until we have made at least the most obvious content-controllable allocation spots explicitly use the fallible version.
My reading of the discussion was not to use infallible malloc until we had a /plan/ for being able handle fallible content-controlled allocations.  My impression was that we finalized the plan: return ERROR_OOM as today, and guarantee that the error code can reach a "bail point", like a loop processing tokens (and further guarantee that the "bail point" can indeed pop out to the event loop).  dwitte's magical call graph will enable a nice static analysis of both requirements.

Should this bug really block on the insertion of a few |new (fallible) (...)| calls?  I have no idea where these should go, so some one would need to pick up that slack before enabling infallible malloc.  The downside to blocking is that we couldn't start using infallible malloc immediately and would need to keep adding in soon-to-be-useless OOM checks.
Blocks: 512868
(Reporter)

Comment 21

8 years ago
I believe we should land this immediately without wrapping malloc/free. This gives us the ability to start using infallible malloc now (via moz_xmalloc) as well as avoiding regressing large-image DOSes. We get most of the benefit from infallible operator new immediately.
Created attachment 397222 [details] [diff] [review]
v5
Attachment #396039 - Attachment is obsolete: true
Attachment #397222 - Flags: review?(benjamin)
Attachment #396039 - Flags: review?(vladimir)
Attachment #396039 - Flags: review?(bugmail)
Attachment #396039 - Flags: review?(benjamin)
Attachment #397222 - Flags: review?(vladimir)
Attachment #397222 - Flags: review?(bugmail)
(Reporter)

Updated

8 years ago
Attachment #397222 - Flags: review?(benjamin) → review+
Comment on attachment 397222 [details] [diff] [review]
v5

Just looking at the shunt portion of this patch

>   extern const nothrow_t nothrow;
>+
>+  struct bad_alloc {};
> };

nit, I don't think we need the newline


> inline void *operator new(size_t size, const std::nothrow_t&) throw() {
>-  return (void*) malloc(size);
>-}
>-inline void operator delete(void *ptr) throw() {
>-  free(ptr);
>-}
>-inline void *operator new[](size_t size) throw() {
>-  return (void*) malloc(size);
>+  return malloc(size);
> }
> inline void *operator new[](size_t size, const std::nothrow_t&) throw() {
>-  return (void*) malloc(size);
>-}
>-inline void operator delete[](void *ptr) throw() {
>-  return free(ptr);
>+  return malloc(size);
> }

looks like we're missing operator delete when we don't have infallible malloc enabled.  If not, please add a comment to explain how this works and how we don't get an allocator mismatch
Attachment #397222 - Flags: review?(bugmail) → review-
Created attachment 401363 [details] [diff] [review]
v6, address blassey's comments

Not r? back to bsmedberg because nothing major has changed.
Attachment #397222 - Attachment is obsolete: true
Attachment #401363 - Flags: review?(bugmail)
Attachment #397222 - Flags: review?(vladimir)
Attachment #401363 - Flags: review?(vladimir)
Attachment #401363 - Flags: review?(vladimir) → review+
Comment on attachment 401363 [details] [diff] [review]
v6, address blassey's comments

This looks good to me, though we'll want a quiet/closed tree for the landing I bet.
Brad, have an ETA on re-view?
Attachment #401363 - Flags: review?(bugmail) → review+
Duplicate of this bug: 518880
Blocks: 518881
Created attachment 403551 [details] [diff] [review]
updated v6

Changed MOZ_ATTR_* to NS_* and removed duplicate NS_ALWAYS_INLINE definition from nsUTF8Utils.h.  Ready to land.
Created attachment 403942 [details] [diff] [review]
v6 updated again
[Checkin: See comment 40]

Chris Double found a problem with non-libxul builds.  Fixed in latest patch.  Ready to land again, but waiting for tryserver double check (though it won't catch anything as the change is non-libxul only).
Attachment #403551 - Attachment is obsolete: true
Pushed http://hg.mozilla.org/mozilla-central/rev/8cbc47eee659
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
Depends on: 520114
WINNT 5.2 mozilla-central leak test build has been orange since this landed.
THhs appears to have completely broken comm-central builds.
Seamonkey and Thunderbird trees are ablaze.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
That's being tracked in bug 520114.
Status: REOPENED → RESOLVED
Last Resolved: 8 years ago8 years ago
Resolution: --- → FIXED
(In reply to comment #31)
> WINNT 5.2 mozilla-central leak test build has been orange since this landed.

I went to bed last night before I was sure this was causing the orange.  I'm looking into it right now.
Backout out.

Spent the entire day trying to set up a Windows build, to no avail, hence no progress on the orange.  Much gnashing of teeth has resulted on my part.

Tinkerty-tonk.  And I intend that to sting.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I believe I have found one memory issue at least on windows. I am getting assertions with the patch in for heap pointer validation inside browsercomps. The cause of this seems to mainly be that browsercomps is statically linked to the CRTs, meaning it has it's own heap. What happens is that inside nsModule.cpp, inside NS_GENERIC_FACTORY_CONSTRUCTOR_INIT, NS_NEWXPCOM is used to allocate the memory for the new object. 

This file includes nscore.h, and since MOZ_NO_MALLOC is not defined it will then proceed to allocate from mozmalloc.cpp, which is linked to the shared CRTs.

When the object is then released, it calls the delete which is executed from the object release function, which lives inside the object's cpp file, which does not include nscore.h. That delete will then run through the normal CRT path, which in the case of this component is the static library. When then executing the delete the static library's heap manager concludes the given pointer is not valid for that heap, since it was allocated on the shared heap for mozmalloc.cpp. This could cause all kinds of free()'s to not occur, since the static heap manager can't execute them.

Atleast.. this is what I think...
I ran into one more interesting issue when trying to fix this issue. I added a MOZ_NO_MOZALLOC define to the files in browsercomps. But for some reason it was still using the new operator in mozalloc.h. I might've very well done something wrong in my build settings (although I did make sure the file was compiled with that define), or it might have been pulled in from somewhere else. But one thing I noticed was that the new operator existed as a function, as in, not inlined. My other idea was that this could mean that the linker resolved the new call in nsModule.obj to the new function that probably existed in one of the other libs included in the link. If the latter is the case, that could give very annoying situations when really wanting the standard C++ new.
While I was at it I decided to look a bit more. The latter situation mentioned earlier was indeed the case - setting the new operators in mozalloc.h static fixed the issue. I also had to define MOZ_NO_MOZALLOC in xpcom/glue to make it work completely. Since otherwise the static linked in xpcomglue_s.lib would still be using the wrong allocation calls for its generic factory and nsVoidArray and several others. After that the build now works with no leaks reported.
Blocks: 520114
No longer depends on: 520114
Bas, thanks for looking into these failures.  I decided to implement another fix, namely implementing |moz_free| in libmozalloc and #define free moz_free.  This fixed windows debug non-libxul builds locally for me.
http://hg.mozilla.org/mozilla-central/rev/d8c18f04396e
http://hg.mozilla.org/mozilla-central/rev/5257693bcfbf
http://hg.mozilla.org/mozilla-central/rev/81bd90ae5899
http://hg.mozilla.org/mozilla-central/rev/130a52770a7b
Status: REOPENED → RESOLVED
Last Resolved: 8 years ago8 years ago
Resolution: --- → FIXED
This patch seems to break qt builds as the standard qt headers contain classes that have function names like "realloc", which really doesn't support having macros with the same names.

More specific the problem I see is with the qt header  .../include/qt4/Qt/qbytearray.h

I'm sure much thought has gone into using macros but is there a plan to do it in another way?
Mike, can you please file a followup?  Fixing this is a matter of shuffling header include order.  We have the same problem with js/src classes.

#define'ing malloc et al. is stupid and fragile, but we're working within the constraint that JS and Gecko have to share malloc et al. symbols at link time, but JS and Gecko (will) have different expectations about their semantics at compile time.  A global rewrite of malloc et al. to moz_xmalloc et al. in Gecko was deemed unpalatable because it would perpetuate the NS_Alloc/nsMemory tradition of being "nonstandard" and throwing away mindshare of new Gecko hackers.
Also, there was a dev.t.platform discussion about this a while back, if you're interested.
I'm on it - bug 550261.  Thanks:)
Depends on: 550261

Updated

8 years ago
Depends on: 550542
Blocks: 550548
Blocks: 550593

Updated

8 years ago
Depends on: 550666

Comment 45

8 years ago
Created attachment 430834 [details] [diff] [review]
mingw fix

This patch broke mingw build. I get arror:

In file included from ../../../../dist/include/nscore.h:54,
                 from ../../../../dist/include/nsAutoLock.h:109,
                 from /home/jacek/mozilla-build/wine-gecko-git/modules/plugin/base/src/nsNPAPIPlugin.cpp:50:
../../../../dist/include/mozilla/mozalloc.h: In function 'void* operator new(size_t)':
../../../../dist/include/mozilla/mozalloc.h:214: error: declaration of 'void* operator new(size_t) throw ()' throws different exceptions
/usr/local/lib/gcc/i686-mingw32/4.4.3/../../../../i686-mingw32/include/c++/4.4.3/new:91: error: from previous declaration 'void* operator new(size_t) throw (std::bad_alloc)'
../../../../dist/include/mozilla/mozalloc.h: In function 'void* operator new [](size_t)':
../../../../dist/include/mozilla/mozalloc.h:220: error: declaration of 'void* operator new [](size_t) throw ()' throws different exceptions
/usr/local/lib/gcc/i686-mingw32/4.4.3/../../../../i686-mingw32/include/c++/4.4.3/new:92: error: from previous declaration 'void* operator new [](size_t) throw (std::bad_alloc)'

The attached patch fixes the problem.

Updated

8 years ago
Attachment #430834 - Flags: review?(vladimir)
Attachment #403942 - Attachment description: v6 updated again → v6 updated again [Checkin: See comment 40]
No longer blocks: 507082
(In reply to comment #40)
> http://hg.mozilla.org/mozilla-central/rev/81bd90ae5899

Nit: MOZALLOC_LIB is included twice before NSPR_LIBS and twice after.
Any reason not to unify that? (or is it on purpose?)

What is MOZ_ALLOCATING_FUNCS supposed to do?
http://mxr.mozilla.org/comm-central/search?string=MOZ_ALLOCATING_FUNCS&case=1&find=%2Fconfigure%5C.in
And NS_NORETURN?
http://mxr.mozilla.org/comm-central/search?string=NS_NORETURN&case=1&find=%2Fconfigure%5C.in
Blocks: 550692
(In reply to comment #46)

And HAVE_JEMALLOC_VALLOC and HAVE_JEMALLOC_MEMALIGN?
http://mxr.mozilla.org/comm-central/search?string=HAVE_JEMALLOC_&case=on
(See also bug 550692.)
(In reply to comment #40)
> http://hg.mozilla.org/mozilla-central/rev/d8c18f04396e
> http://hg.mozilla.org/mozilla-central/rev/5257693bcfbf
> http://hg.mozilla.org/mozilla-central/rev/81bd90ae5899
> http://hg.mozilla.org/mozilla-central/rev/130a52770a7b

this caused a startup crash on Windows Mobile. Stack:

 	0x03f666e0	
>	mozce_shunt.dll!arena_dalloc(arena_s* arena = 0x00000000, arena_chunk_s* chunk = 0x67bf39fe, void* ptr = 0xffffc808) Line: 4227, Byte Offsets: 0x38	C
 	mozce_shunt.dll!putenv_internal(char* key = 0x00000000, char* value = 0x67bf39fe, int flag = -14328) Line: 102, Byte Offsets: 0xb8	C++
 	mozce_shunt.dll!putenv(const char* envstr = 0x00000000) Line: 144, Byte Offsets: 0x104	C++
 	nspr4.dll!PR_SetEnv(const char* string = 0x00000000) Line: 97, Byte Offsets: 0x80	C
 	xul.dll!LaunchChild(nsINativeAppSupport* aNative = 0x00000000, int aBlankCommandLine = 1740585470) Line: 1722, Byte Offsets: 0x50	C++
 	xul.dll!XRE_main(int argc = 0, char** argv = 0x67bf39fe, nsXREAppData* aAppData = 0xffffc808) Line: 3596, Byte Offsets: 0x2958	C++
 	fennec.exe!NS_internal_main(int argc = 0, char** argv = 0x67bf39fe) Line: 147, Byte Offsets: 0x2f0	C++
 	fennec.exe!wmain(int argc = 0, wchar_t** argv = 0x67bf39fe) Line: 120, Byte Offsets: 0x178	C++
 	fennec.exe!mainWCRTStartup(HINSTANCE__* hInstance = 0x00000000, HINSTANCE__* hInstancePrev = 0x67bf39fe, unsigned short* lpszCmdLine = 0xffffc808, int nCmdShow = 0) Line: 188, Byte Offsets: 0x94	C++
 	0x03f672ac
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Updated

8 years ago
Depends on: 550805

Updated

8 years ago
Depends on: 550823
Brad, are you seeing bug 550805?
(In reply to comment #46)
> (In reply to comment #40)
> > http://hg.mozilla.org/mozilla-central/rev/81bd90ae5899
> 
> Nit: MOZALLOC_LIB is included twice before NSPR_LIBS and twice after.
> Any reason not to unify that? (or is it on purpose?)
> 

I didn't notice that, but it's probably fallout from all the changes to *LIBS* variables.  TBH it's probably not worth fixing, unless you have a patch in hand.

> What is MOZ_ALLOCATING_FUNCS supposed to do?
> http://mxr.mozilla.org/comm-central/search?string=MOZ_ALLOCATING_FUNCS&case=1&find=%2Fconfigure%5C.in

This was going to be used to selectively enable allocators in mozalloc that aren't present on all platforms and build configurations, such as posix_memalign on non-jemalloc windows.  It's a TODO though.  Bug 550692 appears to cover the TODO.

> And NS_NORETURN?
> http://mxr.mozilla.org/comm-central/search?string=NS_NORETURN&case=1&find=%2Fconfigure%5C.in

This is going to be used to mark the eventual "abort on OOM" function that mozalloc_handle_oom() is a placeholder for currently.  It's also a generally useful annotation.
Blocks: 550991
Depends on: 551098
Status: REOPENED → RESOLVED
Last Resolved: 8 years ago8 years ago
Resolution: --- → FIXED
Attachment #430834 - Flags: review?(vladimir) → review?(jones.chris.g)
Depends on: 550371
Comment on attachment 430834 [details] [diff] [review]
mingw fix

Looks fine, but note that this fix will conflict with the similar one for Solaris in bug 550371.
Attachment #430834 - Flags: review?(jones.chris.g) → review+

Updated

8 years ago
Keywords: checkin-needed
Attachment #401363 - Attachment is obsolete: true

Updated

8 years ago
Whiteboard: [c-n attachment 430834]
http://hg.mozilla.org/mozilla-central/rev/7d81ca137806
Keywords: checkin-needed
Whiteboard: [c-n attachment 430834]
Depends on: 555387
Depends on: 550401
No longer depends on: 555387
Depends on: 1108934
You need to log in before you can comment on or make changes to this bug.