DLL blocklist is using the wrong implementations of malloc and free

RESOLVED FIXED in Firefox 46

Status

()

Core
mozglue
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: aklotz, Assigned: aklotz)

Tracking

(Blocks: 2 bugs)

Trunk
mozilla48
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox45+ wontfix, firefox46+ fixed, firefox47+ fixed, firefox48 fixed)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(3 attachments)

Some UniquePtrs in WindowsDllBlocklist end up using jemalloc for operator new and MSVCRT for operator delete.

After a considerable amount of digging, I realized that this MOZ_NO_MALLOC code is generating references to operator new and operator delete that need to be resolved by the linker. Since those operators are declared as extern, and MSVCRT's operators are not inline, they end up being resolved by the linker and we are left with whatever it works out.

I am 99.9% certain that this is the cause of the Windows heap corruption that we were seeing in bug 1243098 and friends.

My knee-jerk response is to implement inline new and delete operators for the DLL blocklist so that it uses its own. OTOH, I have yet to get my head around all of the complexities surrounding mozalloc et al that I might be missing an alternative.
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #0)
> My knee-jerk response is to implement inline new and delete operators for
> the DLL blocklist so that it uses its own. OTOH, I have yet to get my head
> around all of the complexities surrounding mozalloc et al that I might be
> missing an alternative.

Mike, any suggestions?
Flags: needinfo?(mh+mozilla)
Blocks: 1218473
Blocks: 1247969
Try removing that MOZ_NO_MOZALLOC. Back when the blocklist was moved to mozglue, it made sense, but now that mozalloc is in mozglue as well, it makes less sense.
Flags: needinfo?(mh+mozilla)
[Tracking Requested - why for this release]:
This could cause heap corruption which is a pretty serious quality issue. Based on what I've seen, adding additional heap operations to the DLL blocklist is enough to trigger this.
tracking-firefox45: --- → ?
tracking-firefox46: --- → ?
tracking-firefox47: --- → ?
Tracking for 45+ since this approach may fix a group of crashes (maybe startup crashes).
status-firefox45: --- → affected
status-firefox46: --- → affected
tracking-firefox45: ? → +
tracking-firefox46: ? → +
tracking-firefox47: ? → +
Aaron, is this something you intend to work on? If not, can you suggest who might take a deeper look? Thanks.
Flags: needinfo?(aklotz)
Yeah, I'll take it.
Assignee: nobody → aklotz
Status: NEW → ASSIGNED
Flags: needinfo?(aklotz)
Too late for 45 as today is release day!
status-firefox45: affected → wontfix
Created attachment 8727993 [details] [diff] [review]
RFC

After disabling MOZ_NO_MOZALLOC, the problems persist:

- Allocating goes through jemalloc because operator new calls moz_xmalloc.
- Freeing calls free_impl, which maps to free() as expected outside of mozglue, but ends up referencing MSVC free() within mozglue.

It looks to me that any references to malloc_impl or free_impl in this code need to be map to the mozglue *_impl functions.

This patch does that as a successful test... is this the right idea?
Attachment #8727993 - Flags: feedback?(mh+mozilla)
Comment on attachment 8727993 [details] [diff] [review]
RFC

Review of attachment 8727993 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozglue/build/WindowsDllBlocklist.cpp
@@ +2,5 @@
>  /* This Source Code Form is subject to the terms of the Mozilla Public
>   * License, v. 2.0. If a copy of the MPL was not distributed with this
>   * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
>  
> +#include "mozmemory_wrap.h"

I'd #define MOZ_MEMORY_IMPL above this include, instead of putting it in moz.build.

@@ +6,5 @@
> +#include "mozmemory_wrap.h"
> +#define MALLOC_FUNCS MALLOC_FUNCS_MALLOC
> +#define MALLOC_DECL(name, return_type, ...) \
> +  extern "C" MOZ_MEMORY_API return_type name ## _impl(__VA_ARGS__);
> +#include "malloc_decls.h"

Copying the comment from memory/mozalloc/mozalloc.cpp could be a good thing.
Attachment #8727993 - Flags: feedback?(mh+mozilla) → feedback+
Summary: MOZ_NO_MOZALLOC code in mozglue needs inline new and delete operators → DLL blocklist is using the wrong implementations of malloc and free
Created attachment 8732978 [details]
MozReview Request: Bug 1249849: Make sure the correct implementations of malloc and free are used in DLL blocklist; r?glandium

Review commit: https://reviewboard.mozilla.org/r/41485/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/41485/
Attachment #8732978 - Flags: review?(mh+mozilla)
Comment on attachment 8732978 [details]
MozReview Request: Bug 1249849: Make sure the correct implementations of malloc and free are used in DLL blocklist; r?glandium

https://reviewboard.mozilla.org/r/41485/#review38409

Can you file a followup to make things such that we don't need to solve this one libmozglue .cpp file at a time?
Attachment #8732978 - Flags: review?(mh+mozilla) → review+
Blocks: 1259271

Comment 12

a year ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/c26f9e1dbad7

Comment 13

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/c26f9e1dbad7
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
status-firefox48: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Comment on attachment 8732978 [details]
MozReview Request: Bug 1249849: Make sure the correct implementations of malloc and free are used in DLL blocklist; r?glandium

Approval Request Comment
[Feature/regressing bug #]: DLL Blocklist
[User impact if declined]: Heap corruption leading to crashes or undefined behaviour
[Describe test coverage new/current, TreeHerder]: Verified locally by stepping into code to ensure that the correct malloc and free functions are being called.
[Risks and why]: Very low, trivial fix
[String/UUID change made/needed]: None
Attachment #8732978 - Flags: approval-mozilla-beta?
Attachment #8732978 - Flags: approval-mozilla-aurora?
Aaron, do you think it is best to land this cautiously on aurora first, then beta? Or jump right to land this on beta 6 next Monday? Is there anything else we can do to verify this or test around it? I am torn between landing it for beta 6 so we have time to respond to problems, vs. verifying on aurora and landing in beta 7 or 8.  Fixing a bunch of startup crashes and hangs sounds very tempting!  What do you think?
Flags: needinfo?(aklotz)
I'd recommend that we go straight to beta. This patch only affects one source file and I have manually verified that it fixes all allocations within that source file.
Flags: needinfo?(aklotz)
Comment on attachment 8732978 [details]
MozReview Request: Bug 1249849: Make sure the correct implementations of malloc and free are used in DLL blocklist; r?glandium

Straight to beta it is, sounds good for crash fixes
Attachment #8732978 - Flags: approval-mozilla-beta?
Attachment #8732978 - Flags: approval-mozilla-beta+
Attachment #8732978 - Flags: approval-mozilla-aurora?
Attachment #8732978 - Flags: approval-mozilla-aurora+

Comment 18

a year ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/9b19d8fe7dec
status-firefox47: affected → fixed
Backed out of aurora in https://hg.mozilla.org/releases/mozilla-aurora/rev/0e6e693a2ab2 for Windows build failures: 
https://treeherder.mozilla.org/logviewer.html#?job_id=2246246&repo=mozilla-aurora
Flags: needinfo?(aklotz)
status-firefox47: fixed → affected
Created attachment 8735998 [details] [diff] [review]
Build fix for non-nightly channels

This is needed to fix problems on non-Nightly builds. Since malloc_decls.h is not exported when MOZ_REPLACE_ALLOC is not defined, we still need to be able to find it in the case of mozglue.
Flags: needinfo?(aklotz)
Attachment #8735998 - Flags: review?(mh+mozilla)
Attachment #8735998 - Flags: review?(mh+mozilla) → review+
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #20)
> Created attachment 8735998 [details] [diff] [review]
> Build fix for non-nightly channels
> 
> This is needed to fix problems on non-Nightly builds. Since malloc_decls.h
> is not exported when MOZ_REPLACE_ALLOC is not defined, we still need to be
> able to find it in the case of mozglue.

so this need now to land on beta and aurora ?
Flags: needinfo?(aklotz)
Yes. It should also land on central or else we'll have the same problem come merge day.
Flags: needinfo?(aklotz)
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #22)
> Yes. It should also land on central or else we'll have the same problem come
> merge day.

landed on m-i as https://hg.mozilla.org/integration/mozilla-inbound/rev/cc70dae5693b
landed on aurora as 

remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/26411f370dad
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/b7488a21db02
status-firefox47: affected → fixed
remote:   https://hg.mozilla.org/releases/mozilla-beta/rev/d1b69a42b55e
remote:   https://hg.mozilla.org/releases/mozilla-beta/rev/c08d80f639f7
status-firefox46: affected → fixed

Comment 26

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/cc70dae5693b
You need to log in before you can comment on or make changes to this bug.