Closed Bug 441324 Opened 16 years ago Closed 14 years ago

Implement xmalloc regardless of whether jemalloc is disabled

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: benjamin, Assigned: cjones)

References

Details

Attachments

(2 files, 10 obsolete files)

72.22 KB, patch
Details | Diff | Splinter Review
920 bytes, patch
cjones
: review+
Details | Diff | Splinter Review
We want to depend on xmalloc. In order to do so, we need an implementation even when jemalloc is not enabled. It can be a dirt-simple nullcheck+abort.
What is (or should be) the relation between this bug and bug 427109 ?
What part of comment 0 is unclear?
Assignee: benjamin → jones.chris.g
Summary: Implement xmalloc when jemalloc is disabled → Implement xmalloc regardless of whether jemalloc is disabled
Attached patch v1 (obsolete) — Splinter Review
First step towards infallible malloc.  Next steps are (i) replacing usage of fallible allocators in Moz code; (ii) same for NSPR; (iii) eliminating newly vacuous null checks.  These will be done for separate bugs.
Attachment #391233 - Flags: review?(benjamin)
Attachment #391233 - Flags: review?(benjamin) → review-
Comment on attachment 391233 [details] [diff] [review]
v1

Per IRC discussion, this should be a sharedlib and linked up just about everywhere...
The next patch will also
  #define malloc moz_malloc
instead of
  #define malloc moz_xmalloc
This will allow the library to land now, without any behavioral or performance differences; moz_malloc() is an always_inline wrapper around malloc().
I should clarify that that change allows landing this library along with bug 507082 now.
Attached patch v2 (obsolete) — Splinter Review
Attachment #391233 - Attachment is obsolete: true
Attachment #391364 - Flags: review?(benjamin)
Attachment #391364 - Flags: review?(benjamin) → review+
Comment on attachment 391364 [details] [diff] [review]
v2

>diff --git a/memory/mozalloc/mozalloc.h b/memory/mozalloc/mozalloc.h

>+extern const fallible_t fallible;

I don't think this should be an extern symbol. Looking up the dynamic address of this symbol is going to be more costly than using a stack temporary. Avoid the global altogether.
Attached patch v3 (obsolete) — Splinter Review
This patch changed some from the reviewed version --- I stopped trying to support the symbols that jemalloc implements (posix_memalign et al.) when they aren't present on the platform.  This led to too many linker woes because of how we link with jemalloc.

r? again in case bsmedberg wants another look.
Attachment #391881 - Flags: review?(benjamin)
Comment on attachment 391881 [details] [diff] [review]
v3


>+ * |#define MOZALLOC_DEFINE_MACRO_WRAPPERS| to make libc "allocating
>+ * functions" never fail (return NULL).
>+ *
>+ * XXX: these symbols are defined to the fallible allocators.  in the
>+ * future they will be defined to the infallible allocators.

Why is this XXX?  I see the comment earlier about this allowing this to land now, but I don't understand it, especially given that operator new is doing xmalloc.  When would we make the switch?

I'd rather that we switch to infalliable variants everywhere, and then fix things to be falliable as we run into them.
(In reply to comment #10)
> (From update of attachment 391881 [details] [diff] [review])
> 
> >+ * |#define MOZALLOC_DEFINE_MACRO_WRAPPERS| to make libc "allocating
> >+ * functions" never fail (return NULL).
> >+ *
> >+ * XXX: these symbols are defined to the fallible allocators.  in the
> >+ * future they will be defined to the infallible allocators.
> 
> Why is this XXX?  I see the comment earlier about this allowing this to land
> now, but I don't understand it, especially given that operator new is doing
> xmalloc.  When would we make the switch?
> 

#define'ing malloc et al. is sucky and fragile, that's all.  We'd switch whenever someone got bored enough to rewrite the places we currently use malloc() et al. to be explicitly fallible or explicitly infallible.

XXX/pieinthesky would perhaps have been more appropriate.

> I'd rather that we switch to infalliable variants everywhere, and then fix
> things to be falliable as we run into them.

That's the plan.
Whoops, I withdraw this response.  Yeah, this is just a good old-fashioned bug on my part; they were /supposed/ to be defined to moz_xmalloc, etc, and have a comment saying "this sucks, don't define them in the future."

Thanks for the catch.

(In reply to comment #11)
> (In reply to comment #10)
> > (From update of attachment 391881 [details] [diff] [review] [details])
> > 
> > >+ * |#define MOZALLOC_DEFINE_MACRO_WRAPPERS| to make libc "allocating
> > >+ * functions" never fail (return NULL).
> > >+ *
> > >+ * XXX: these symbols are defined to the fallible allocators.  in the
> > >+ * future they will be defined to the infallible allocators.
> > 
> > Why is this XXX?  I see the comment earlier about this allowing this to land
> > now, but I don't understand it, especially given that operator new is doing
> > xmalloc.  When would we make the switch?
> > 
> 
> #define'ing malloc et al. is sucky and fragile, that's all.  We'd switch
> whenever someone got bored enough to rewrite the places we currently use
> malloc() et al. to be explicitly fallible or explicitly infallible.
> 
> XXX/pieinthesky would perhaps have been more appropriate.
> 
> > I'd rather that we switch to infalliable variants everywhere, and then fix
> > things to be falliable as we run into them.
> 
> That's the plan.
Attached patch back to default-infallible (obsolete) — Splinter Review
Per newsgroup discussion.
Attachment #391364 - Attachment is obsolete: true
Attachment #391881 - Attachment is obsolete: true
Attachment #391881 - Flags: review?(benjamin)
Attached patch backup of WIP (obsolete) — Splinter Review
Playing nice with the wince shunt will be harder than I thought.
Attachment #395384 - Attachment is obsolete: true
Attached patch v4 (obsolete) — Splinter Review
libmozalloc changed pretty significantly in addition to the wince hackery, requesting re-review.
Attachment #395990 - Attachment is obsolete: true
Attachment #396039 - Flags: review?(benjamin)
Attachment #396039 - Attachment description: v3 → v4
Attachment #396039 - Flags: review?(vladimir)
Comment on attachment 396039 [details] [diff] [review]
v4

I didn't believe we were going to redefine malloc by default for this pass: only change the behavior of operator new/new[]. I don't think we should have infallible malloc() until we have made at least the most obvious content-controllable allocation spots explicitly use the fallible version.
My reading of the discussion was not to use infallible malloc until we had a /plan/ for being able handle fallible content-controlled allocations.  My impression was that we finalized the plan: return ERROR_OOM as today, and guarantee that the error code can reach a "bail point", like a loop processing tokens (and further guarantee that the "bail point" can indeed pop out to the event loop).  dwitte's magical call graph will enable a nice static analysis of both requirements.

Should this bug really block on the insertion of a few |new (fallible) (...)| calls?  I have no idea where these should go, so some one would need to pick up that slack before enabling infallible malloc.  The downside to blocking is that we couldn't start using infallible malloc immediately and would need to keep adding in soon-to-be-useless OOM checks.
I believe we should land this immediately without wrapping malloc/free. This gives us the ability to start using infallible malloc now (via moz_xmalloc) as well as avoiding regressing large-image DOSes. We get most of the benefit from infallible operator new immediately.
Attached patch v5 (obsolete) — Splinter Review
Attachment #396039 - Attachment is obsolete: true
Attachment #397222 - Flags: review?(benjamin)
Attachment #396039 - Flags: review?(vladimir)
Attachment #396039 - Flags: review?(bugmail)
Attachment #396039 - Flags: review?(benjamin)
Attachment #397222 - Flags: review?(vladimir)
Attachment #397222 - Flags: review?(benjamin) → review+
Comment on attachment 397222 [details] [diff] [review]
v5

Just looking at the shunt portion of this patch

>   extern const nothrow_t nothrow;
>+
>+  struct bad_alloc {};
> };

nit, I don't think we need the newline


> inline void *operator new(size_t size, const std::nothrow_t&) throw() {
>-  return (void*) malloc(size);
>-}
>-inline void operator delete(void *ptr) throw() {
>-  free(ptr);
>-}
>-inline void *operator new[](size_t size) throw() {
>-  return (void*) malloc(size);
>+  return malloc(size);
> }
> inline void *operator new[](size_t size, const std::nothrow_t&) throw() {
>-  return (void*) malloc(size);
>-}
>-inline void operator delete[](void *ptr) throw() {
>-  return free(ptr);
>+  return malloc(size);
> }

looks like we're missing operator delete when we don't have infallible malloc enabled.  If not, please add a comment to explain how this works and how we don't get an allocator mismatch
Attachment #397222 - Flags: review?(bugmail) → review-
Attached patch v6, address blassey's comments (obsolete) — Splinter Review
Not r? back to bsmedberg because nothing major has changed.
Attachment #397222 - Attachment is obsolete: true
Attachment #401363 - Flags: review?(bugmail)
Attachment #397222 - Flags: review?(vladimir)
Attachment #401363 - Flags: review?(vladimir)
Comment on attachment 401363 [details] [diff] [review]
v6, address blassey's comments

This looks good to me, though we'll want a quiet/closed tree for the landing I bet.
Brad, have an ETA on re-view?
Attachment #401363 - Flags: review?(bugmail) → review+
Attached patch updated v6 (obsolete) — Splinter Review
Changed MOZ_ATTR_* to NS_* and removed duplicate NS_ALWAYS_INLINE definition from nsUTF8Utils.h.  Ready to land.
Chris Double found a problem with non-libxul builds.  Fixed in latest patch.  Ready to land again, but waiting for tryserver double check (though it won't catch anything as the change is non-libxul only).
Attachment #403551 - Attachment is obsolete: true
Pushed http://hg.mozilla.org/mozilla-central/rev/8cbc47eee659
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
WINNT 5.2 mozilla-central leak test build has been orange since this landed.
THhs appears to have completely broken comm-central builds.
Seamonkey and Thunderbird trees are ablaze.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
That's being tracked in bug 520114.
Status: REOPENED → RESOLVED
Closed: 15 years ago15 years ago
Resolution: --- → FIXED
(In reply to comment #31)
> WINNT 5.2 mozilla-central leak test build has been orange since this landed.

I went to bed last night before I was sure this was causing the orange.  I'm looking into it right now.
Backout out.

Spent the entire day trying to set up a Windows build, to no avail, hence no progress on the orange.  Much gnashing of teeth has resulted on my part.

Tinkerty-tonk.  And I intend that to sting.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I believe I have found one memory issue at least on windows. I am getting assertions with the patch in for heap pointer validation inside browsercomps. The cause of this seems to mainly be that browsercomps is statically linked to the CRTs, meaning it has it's own heap. What happens is that inside nsModule.cpp, inside NS_GENERIC_FACTORY_CONSTRUCTOR_INIT, NS_NEWXPCOM is used to allocate the memory for the new object. 

This file includes nscore.h, and since MOZ_NO_MALLOC is not defined it will then proceed to allocate from mozmalloc.cpp, which is linked to the shared CRTs.

When the object is then released, it calls the delete which is executed from the object release function, which lives inside the object's cpp file, which does not include nscore.h. That delete will then run through the normal CRT path, which in the case of this component is the static library. When then executing the delete the static library's heap manager concludes the given pointer is not valid for that heap, since it was allocated on the shared heap for mozmalloc.cpp. This could cause all kinds of free()'s to not occur, since the static heap manager can't execute them.

Atleast.. this is what I think...
I ran into one more interesting issue when trying to fix this issue. I added a MOZ_NO_MOZALLOC define to the files in browsercomps. But for some reason it was still using the new operator in mozalloc.h. I might've very well done something wrong in my build settings (although I did make sure the file was compiled with that define), or it might have been pulled in from somewhere else. But one thing I noticed was that the new operator existed as a function, as in, not inlined. My other idea was that this could mean that the linker resolved the new call in nsModule.obj to the new function that probably existed in one of the other libs included in the link. If the latter is the case, that could give very annoying situations when really wanting the standard C++ new.
While I was at it I decided to look a bit more. The latter situation mentioned earlier was indeed the case - setting the new operators in mozalloc.h static fixed the issue. I also had to define MOZ_NO_MOZALLOC in xpcom/glue to make it work completely. Since otherwise the static linked in xpcomglue_s.lib would still be using the wrong allocation calls for its generic factory and nsVoidArray and several others. After that the build now works with no leaks reported.
Blocks: 520114
No longer depends on: 520114
Bas, thanks for looking into these failures.  I decided to implement another fix, namely implementing |moz_free| in libmozalloc and #define free moz_free.  This fixed windows debug non-libxul builds locally for me.
This patch seems to break qt builds as the standard qt headers contain classes that have function names like "realloc", which really doesn't support having macros with the same names.

More specific the problem I see is with the qt header  .../include/qt4/Qt/qbytearray.h

I'm sure much thought has gone into using macros but is there a plan to do it in another way?
Mike, can you please file a followup?  Fixing this is a matter of shuffling header include order.  We have the same problem with js/src classes.

#define'ing malloc et al. is stupid and fragile, but we're working within the constraint that JS and Gecko have to share malloc et al. symbols at link time, but JS and Gecko (will) have different expectations about their semantics at compile time.  A global rewrite of malloc et al. to moz_xmalloc et al. in Gecko was deemed unpalatable because it would perpetuate the NS_Alloc/nsMemory tradition of being "nonstandard" and throwing away mindshare of new Gecko hackers.
Also, there was a dev.t.platform discussion about this a while back, if you're interested.
I'm on it - bug 550261.  Thanks:)
Depends on: 550542
Blocks: 550548
Blocks: 550593
Depends on: 550666
Attached patch mingw fixSplinter Review
This patch broke mingw build. I get arror:

In file included from ../../../../dist/include/nscore.h:54,
                 from ../../../../dist/include/nsAutoLock.h:109,
                 from /home/jacek/mozilla-build/wine-gecko-git/modules/plugin/base/src/nsNPAPIPlugin.cpp:50:
../../../../dist/include/mozilla/mozalloc.h: In function 'void* operator new(size_t)':
../../../../dist/include/mozilla/mozalloc.h:214: error: declaration of 'void* operator new(size_t) throw ()' throws different exceptions
/usr/local/lib/gcc/i686-mingw32/4.4.3/../../../../i686-mingw32/include/c++/4.4.3/new:91: error: from previous declaration 'void* operator new(size_t) throw (std::bad_alloc)'
../../../../dist/include/mozilla/mozalloc.h: In function 'void* operator new [](size_t)':
../../../../dist/include/mozilla/mozalloc.h:220: error: declaration of 'void* operator new [](size_t) throw ()' throws different exceptions
/usr/local/lib/gcc/i686-mingw32/4.4.3/../../../../i686-mingw32/include/c++/4.4.3/new:92: error: from previous declaration 'void* operator new [](size_t) throw (std::bad_alloc)'

The attached patch fixes the problem.
Attachment #430834 - Flags: review?(vladimir)
Attachment #403942 - Attachment description: v6 updated again → v6 updated again [Checkin: See comment 40]
No longer blocks: 507082
(In reply to comment #40)
> http://hg.mozilla.org/mozilla-central/rev/81bd90ae5899

Nit: MOZALLOC_LIB is included twice before NSPR_LIBS and twice after.
Any reason not to unify that? (or is it on purpose?)

What is MOZ_ALLOCATING_FUNCS supposed to do?
http://mxr.mozilla.org/comm-central/search?string=MOZ_ALLOCATING_FUNCS&case=1&find=%2Fconfigure%5C.in
And NS_NORETURN?
http://mxr.mozilla.org/comm-central/search?string=NS_NORETURN&case=1&find=%2Fconfigure%5C.in
(In reply to comment #46)

And HAVE_JEMALLOC_VALLOC and HAVE_JEMALLOC_MEMALIGN?
http://mxr.mozilla.org/comm-central/search?string=HAVE_JEMALLOC_&case=on
(See also bug 550692.)
(In reply to comment #40)
> http://hg.mozilla.org/mozilla-central/rev/d8c18f04396e
> http://hg.mozilla.org/mozilla-central/rev/5257693bcfbf
> http://hg.mozilla.org/mozilla-central/rev/81bd90ae5899
> http://hg.mozilla.org/mozilla-central/rev/130a52770a7b

this caused a startup crash on Windows Mobile. Stack:

 	0x03f666e0	
>	mozce_shunt.dll!arena_dalloc(arena_s* arena = 0x00000000, arena_chunk_s* chunk = 0x67bf39fe, void* ptr = 0xffffc808) Line: 4227, Byte Offsets: 0x38	C
 	mozce_shunt.dll!putenv_internal(char* key = 0x00000000, char* value = 0x67bf39fe, int flag = -14328) Line: 102, Byte Offsets: 0xb8	C++
 	mozce_shunt.dll!putenv(const char* envstr = 0x00000000) Line: 144, Byte Offsets: 0x104	C++
 	nspr4.dll!PR_SetEnv(const char* string = 0x00000000) Line: 97, Byte Offsets: 0x80	C
 	xul.dll!LaunchChild(nsINativeAppSupport* aNative = 0x00000000, int aBlankCommandLine = 1740585470) Line: 1722, Byte Offsets: 0x50	C++
 	xul.dll!XRE_main(int argc = 0, char** argv = 0x67bf39fe, nsXREAppData* aAppData = 0xffffc808) Line: 3596, Byte Offsets: 0x2958	C++
 	fennec.exe!NS_internal_main(int argc = 0, char** argv = 0x67bf39fe) Line: 147, Byte Offsets: 0x2f0	C++
 	fennec.exe!wmain(int argc = 0, wchar_t** argv = 0x67bf39fe) Line: 120, Byte Offsets: 0x178	C++
 	fennec.exe!mainWCRTStartup(HINSTANCE__* hInstance = 0x00000000, HINSTANCE__* hInstancePrev = 0x67bf39fe, unsigned short* lpszCmdLine = 0xffffc808, int nCmdShow = 0) Line: 188, Byte Offsets: 0x94	C++
 	0x03f672ac
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 550805
Depends on: 550823
(In reply to comment #46)
> (In reply to comment #40)
> > http://hg.mozilla.org/mozilla-central/rev/81bd90ae5899
> 
> Nit: MOZALLOC_LIB is included twice before NSPR_LIBS and twice after.
> Any reason not to unify that? (or is it on purpose?)
> 

I didn't notice that, but it's probably fallout from all the changes to *LIBS* variables.  TBH it's probably not worth fixing, unless you have a patch in hand.

> What is MOZ_ALLOCATING_FUNCS supposed to do?
> http://mxr.mozilla.org/comm-central/search?string=MOZ_ALLOCATING_FUNCS&case=1&find=%2Fconfigure%5C.in

This was going to be used to selectively enable allocators in mozalloc that aren't present on all platforms and build configurations, such as posix_memalign on non-jemalloc windows.  It's a TODO though.  Bug 550692 appears to cover the TODO.

> And NS_NORETURN?
> http://mxr.mozilla.org/comm-central/search?string=NS_NORETURN&case=1&find=%2Fconfigure%5C.in

This is going to be used to mark the eventual "abort on OOM" function that mozalloc_handle_oom() is a placeholder for currently.  It's also a generally useful annotation.
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
Attachment #430834 - Flags: review?(vladimir) → review?(jones.chris.g)
Comment on attachment 430834 [details] [diff] [review]
mingw fix

Looks fine, but note that this fix will conflict with the similar one for Solaris in bug 550371.
Attachment #430834 - Flags: review?(jones.chris.g) → review+
Keywords: checkin-needed
Attachment #401363 - Attachment is obsolete: true
Whiteboard: [c-n attachment 430834]
Keywords: checkin-needed
Whiteboard: [c-n attachment 430834]
Depends on: 555387
Depends on: 550401
No longer depends on: 555387
Depends on: 1108934
You need to log in before you can comment on or make changes to this bug.