Bug 1137007 makes ASAN on Fedora 21 (clang 3.5.0) confused and angry

RESOLVED FIXED in Firefox 39

Status

RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: bwc, Assigned: jld)

Tracking

Trunk
mozilla39
x86
Linux
Points:
---

Firefox Tracking Flags

(firefox39 fixed)

Details

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
This changeset is causing ASAN to report spurious stack-overflow and SIGSEGV errors which, strangely enough, don't seem to result in program termination. Occasionally I'm seeing unit-tests hang in addition to the spurious errors, but I am not 100% sure this is related (will investigate further). The initial output always seems to be the same:

ASAN:SIGSEGV
=================================================================
==17888==ERROR: AddressSanitizer: stack-overflow on address 0x000000000002 (pc 0x0037120fae91 bp 0x7fff0752a560 sp 0x000000000002 T0)
 

I've observed this in sdp_unittest, jsep_session_unittest, signaling_unittests, and running the browser.
(Reporter)

Comment 1

4 years ago
Just observed the test-case hanging on this changeset (actually, it looks like it is trying to crash, given that I see an instance of abrt-hook-ccpp pointing at it, but it seems to just spin indefinitely).
(Reporter)

Comment 2

4 years ago
Huh. Killing the abrt-hook-ccpp seems to unstick the unit-test. Bizarre.
(Assignee)

Comment 3

4 years ago
I'm going to guess that ASAN doesn't like me calling clone(2) directly like that.  Does it work with MOZ_ASSUME_USER_NS=1 (or 0; it doesn't actually matter yet, but any value will make it skip the check) set in the environment?
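For reference, a minimal sketch of the kind of environment check being described, assuming that only the presence of the variable matters ("any value will make it skip the check"). The helper name userns_check_is_skipped is hypothetical; only the variable name MOZ_ASSUME_USER_NS comes from this bug, and the real code may decide differently.

```c
#include <stdlib.h>

/* Hypothetical helper, not the code in the tree: reports whether the
 * user-namespace detection would be skipped. */
static int userns_check_is_skipped(void)
{
    /* Assumption: any value counts, including "0", because only the
     * presence of the variable is tested. */
    return getenv("MOZ_ASSUME_USER_NS") != NULL;
}
```

With this reading, running a test as `MOZ_ASSUME_USER_NS=1 ./sdp_unittest` and as `MOZ_ASSUME_USER_NS=0 ./sdp_unittest` would both bypass the clone-based probe.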
Flags: needinfo?(docfaraday)
(Reporter)

Comment 4

4 years ago
That seems to prevent the problem, at least with the unit-tests. When I get in the office tomorrow, I'll try the same with mochitest.
(Assignee)

Updated

4 years ago
Assignee: nobody → jld
(Assignee)

Comment 5

4 years ago
(In reply to Jed Davis [:jld] from comment #3)
> I'm going to guess that ASAN doesn't like me calling clone(2) directly like
> that.

Worse than that: it looks like ASAN's stack-overflow error is correct.  I forgot to explicitly pass the clone(2) argument that sets the child stack pointer if non-null, and taking care of that seems to fix it.  (I'm running into other errors as well, but they seem to be unrelated.)
(Assignee)

Comment 6

4 years ago
(In reply to Jed Davis [:jld] from comment #5)
> Worse than that: it looks like ASAN's stack-overflow error is correct.  I
> forgot to explicitly pass the clone(2) argument that sets the child stack
> pointer if non-null, and taking care of that seems to fix it.

…and this also explains the “doesn't seem to result in program termination” from comment #0 — it's in a cloned child process, and its parent just wants to know if it was created successfully, so whether it successfully evaluates _exit(0) or crashes on a bad stack pointer doesn't actually matter.  (The fun times would start if the garbage that winds up in rSP is a valid pointer to a word holding a pointer to executable memory; so, really, let's not do that.)
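To illustrate the failure mode described in comments 5 and 6, here is a minimal sketch of a clone-based probe that passes every argument explicitly. It is not the attached patch: the helper name can_clone_with_flags is hypothetical, and the argument order shown (flags first, child stack second) is the raw-syscall layout on x86 and x86-64, which is an assumption about the target. Because syscall() is variadic, leaving out the trailing arguments means the kernel reads whatever happens to be in those argument registers, so the child stack pointer can end up as garbage such as the 0x2 in the ASAN report.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <linux/sched.h>   /* CLONE_VM, CLONE_NEWUSER */
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a child could be created with the
 * given extra clone flags, 0 if the clone call itself failed. */
static int can_clone_with_flags(unsigned long extra_flags)
{
    /* A null child stack means "keep the parent's stack pointer", which is
     * only safe when the child gets its own copy of the address space. */
    assert((extra_flags & CLONE_VM) == 0);

    /* Pass all five arguments. The second one is the child stack pointer;
     * omitting it would hand the kernel an uninitialized register value. */
    long pid = syscall(SYS_clone, (unsigned long)SIGCHLD | extra_flags,
                       NULL, NULL, NULL, NULL);
    if (pid < 0) {
        return 0;   /* the kernel refused, e.g. the flag is unsupported */
    }
    if (pid == 0) {
        _exit(0);   /* child: creation succeeded, nothing else to do */
    }

    /* Parent: reap the child. Its exit status is deliberately ignored,
     * since only successful creation matters for the probe. */
    int status;
    while (waitpid((pid_t)pid, &status, 0) < 0 && errno == EINTR) {
    }
    return 1;
}
```

Under these assumptions, `can_clone_with_flags(CLONE_NEWUSER)` would be the user-namespace probe, and its result depends on the kernel and its configuration. Ignoring the child's exit status is also why a crashing child went unnoticed by the parent here.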
(Assignee)

Comment 7

4 years ago
Created attachment 8576368
bug1142263-userns-detection-oops-hg0.diff

Does this patch fix the mystery crashes on your end?
Flags: needinfo?(docfaraday)
Attachment #8576368 - Flags: feedback?(docfaraday)
(Reporter)

Comment 8

4 years ago
Comment on attachment 8576368
bug1142263-userns-detection-oops-hg0.diff

This seems to have done the trick!
Attachment #8576368 - Flags: feedback?(docfaraday) → feedback+
(Assignee)

Updated

4 years ago
Attachment #8576368 - Flags: review?(gdestuynder)
(Assignee)

Comment 9

4 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e8b0913ccdf1 (build-only try run because automation's Linux is too old to reach this case, but I've tested locally, and see comment #8).
Keywords: checkin-needed
(Assignee)

Updated

4 years ago
Duplicate of this bug: 1142862
https://hg.mozilla.org/mozilla-central/rev/46472d25b238
Status: NEW → RESOLVED
Last Resolved: 4 years ago
status-firefox39: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39