disable trace malloc stacks on OS X too

RESOLVED FIXED in mozilla10

Status

()

RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: espindola, Assigned: espindola)

Tracking

unspecified
mozilla10
x86_64
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [buildfaster])

Attachments

(1 attachment, 3 obsolete attachments)

Created attachment 568573 [details] [diff] [review]
disable trace malloc stacks on OS X too

Try shows that the comment is out of date:
https://tbpl.mozilla.org/?tree=Try&rev=586bb0d70ed6

Trying to cleanup the stack walking code showed that the unwind.h unwinder is the one that halts in 10.5, so the comment probably dates from when that was the one we used.
Attachment #568573 - Flags: review?(dbaron)
Comment on attachment 568573 [details] [diff] [review]
disable trace malloc stacks on OS X too

r=dbaron if you make the same change in testing/xpcshell/runxpcshelltests.py as well.
Attachment #568573 - Flags: review?(dbaron) → review+

Updated

7 years ago
Whiteboard: [buildfaster]
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/248ffe091880 - even though I claimed that with our diminished expectations there's no such thing as too much intermittent orange, it turns out that having more than 50% of our 10.5 debug builds hang in trace-malloc actually *is* too much orange.
We might have a chicken and egg problem :-(

For us to be able to enable trace malloc on 64 bits, we have to disable stack walking, but disabling it makes 10.5 orange. Part of the problem is that the files are in multiple hg repos, so we cannot atomically switch from doing tracing only on 10.5 to only on 10.6.

I will first try to enable the unwind.h based unwinder on 10.5 and see if I can make it work. If that fails, I will try to get both changes pushed as close to each other as possible.
OK. What I think is happening:

When stacks_enabled is true, we call NS_StackWalk. NS_StackWalk checks for pthread_cond_wait$UNIX2003 being on the stack, if it is, it returns NS_ERROR_UNEXPECTED. Seeing it we abort any work we would do to avoid a deadlock.

When stacks_enabled is false, we still create a dummy frame and do a bit of work on it, which can causes us to deadlock.

What I am testing is replacing the dummy frame with an empty one and fixing any tooling that assumes we have at least one frame. Failing that I think the best option would be to move the OS_X hack into the code and call NS_StackWalk even if stacks_enabled is false, just to decide if we should set immediate_abort or not.
Created attachment 569384 [details] [diff] [review]
avoid doing work when stacks are disabled
Attachment #568573 - Attachment is obsolete: true
Attachment #569384 - Flags: review?(dbaron)
Comment on attachment 569384 [details] [diff] [review]
avoid doing work when stacks are disabled

Setting immediate_abort to true will break the memory statistics; we can't do that.
Attachment #569384 - Flags: review?(dbaron) → review-
Created attachment 570260 [details] [diff] [review]
walk the stack to find out if immediate_abort must be set

https://tbpl.mozilla.org/?tree=Try&rev=89bddfc32a2f
Assignee: nobody → respindola
Attachment #569384 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #570260 - Flags: review?(dbaron)
I can #ifdef the stack walk for OS X if it is too expensive in other platforms. At least on OS X the expensive part looks to be dladdr.
Created attachment 570308 [details] [diff] [review]
on os x, walk the stack to find out if immediate_abort must be set

Windows was not very happy with us walking the stack more often. This patch is similar to the previous one, but changes only OS X.

https://tbpl.mozilla.org/?tree=Try&rev=80ef7c939d81
Attachment #570260 - Attachment is obsolete: true
Attachment #570260 - Flags: review?(dbaron)
Attachment #570308 - Flags: review?(dbaron)
Comment on attachment 570308 [details] [diff] [review]
on os x, walk the stack to find out if immediate_abort must be set

r=dbaron
Attachment #570308 - Flags: review?(dbaron) → review+
But in the future please put useful commit messages in patches posted for review.
(In reply to David Baron [:dbaron] from comment #13)
> But in the future please put useful commit messages in patches posted for
> review.

The process I use to make sure I am posting for review what I am testing is to push to try, wget that change and attach it to bugzila. That way I see the same hash in two fields in the bugzilla attach page (in the file name and in the try url).

That is why the patches have the try commit as the message.
However, commit messages are important, and I want to review them as part of the code review.  You can still include try syntax and a commit message, as long as you remember to remove the try syntax later.
https://hg.mozilla.org/mozilla-central/rev/93e293c87580
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
You need to log in before you can comment on or make changes to this bug.