Closed Bug 748654 Opened 8 years ago Closed 8 years ago

Drop patch using moz_alloc in ANGLE to remove one possible cause for the allocator mismatch crashes

Categories

(Core :: Canvas: WebGL, defect)

ARM
Android
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla15
Tracking Status
firefox14 --- fixed

People

(Reporter: bjacob, Assigned: bjacob)

References

Details

Attachments

(2 files)

This is just a pie-in-the-sky theory, not tested yet.

We've had lots of strange ANGLE crashes on Android, that look like they could be allocator mismatches. See bug 709947, we stopped using shader translation on Android because of them, which is a blocker for webgl conformance on android, and even a security/stability concern by itself. What's more, recently we've seen new similar crashes in other parts of ANGLE on Android, see bug 746794.

Now I just remembered that among our local patches against ANGLE, we have the patch for bug 680840,

  gfx/angle/angle-use-xmalloc.patch

which makes ANGLE link to $(MOZALLOC_LIB) and use moz_xmalloc to fix a security bug in the old preprocessor.

To be tried: unapply this patch, re-fix bug 680840 by manually aborting on malloc failure instead of using moz_xmalloc, and re-enable shader translation on Android. See if the crashes reported in above bugs are still reproducible.
Blocks: 746794
Blocks: 743748
Comment on attachment 618777 [details] [diff] [review]
instead, manually abort on allocation failure

See bug 680840. We have to abort here as this code does not properly handle the OOM case. This is the old preprocessor, soon to be replaced with the new preprocessor, so we just want to avoid crashes here, no need to aim for the optimal fix.
Attachment #618777 - Flags: review?(jgilbert)
Comment on attachment 618777 [details] [diff] [review]
instead, manually abort on allocation failure

Review of attachment 618777 [details] [diff] [review]:
-----------------------------------------------------------------

I wonder if we should really patch our ANGLE to use our NS_RUNTIMEABORT() stuff, but OOM is such a broken case that it's probably not worth it.
Attachment #618777 - Flags: review?(jgilbert) → review+
Attachment #618776 - Flags: review?(jgilbert) → review+
Comment on attachment 618776 [details] [diff] [review]
drop patch making us use mozalloc in angle

[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: won't be able to re-enable shader translation on android, which is a serious stability/security improvement
Testing completed (on m-c, etc.): inbound
Risk to taking this patch (and alternatives if risky): none if landed together with the next patch which reproduces the same behavior in slightly different way
String changes made by this patch: none
Attachment #618776 - Flags: approval-mozilla-aurora?
Comment on attachment 618777 [details] [diff] [review]
instead, manually abort on allocation failure

[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: this patch is necessary to land together with the other one. Otherwise we'd be reopening a security hole.
Testing completed (on m-c, etc.): inbound
Risk to taking this patch (and alternatives if risky): none
String changes made by this patch: none
Attachment #618777 - Flags: approval-mozilla-aurora?
We're seeing crashes on Boot2Gecko similar to #746794 even after enabling these patches.  The allocator doesn't seem to be the issue, but it looks to me like something is corrupting heap metadata.  Digging into it now.
Can you explain why you don't think there's an allocator mismatch here?
It's very, very likely there actually is an allocator mismatch here.  Disabling jemalloc fixes the issue on Boot2Gecko.  Tracking the issue down now.
Target Milestone: mozilla14 → mozilla15
Comment on attachment 618776 [details] [diff] [review]
drop patch making us use mozalloc in angle

[Triage Comment]
Approving for Aurora 14 alongside bug 743748.
Attachment #618776 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #618777 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Renaming this bug to clarify what we now know.
Summary: Theory: ANGLE crashes on Android are caused by use of moz_xmalloc in ANGLE patch, causing allocator mismatch → Drop patch using moz_alloc in ANGLE to remove one possible cause for the allocator mismatch crashes
Any suggestions on tests that would allow manual verification of this issue?
Not really. The hope was that this would make all the allocator mismatch crashes go away, but it at best fixed only a small part of them.
You need to log in before you can comment on or make changes to this bug.