Closed Bug 1113416 Opened 10 years ago Closed 9 years ago

Hang monitor + on-demand decompression can cause deadlocks

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla37

People

(Reporter: jchen, Assigned: jchen)

References

Details

Attachments

(1 file)

I believe this is the real cause behind bug 1059797.

Basically when a thread hangs, the hang monitor triggers a sighandler and reads the thread's profiler pseudostack, which may contain string constants in static storage.

However, if the page containing the constant hasn't been decompressed yet, on-demand decompression will kick in to load the string constant.

If the original hang signal happened during a memory allocation, on-demand decompression will try to allocate memory again, and cause a deadlock when trying to take a lock that's already taken.
When we're inside the hang monitor's signal handler, we must not read any string labels. Doing so may result in on-demand decompression kicking in on Android, which may result in a deadlock.
Attachment #8544767 - Flags: review?(snorp)
Comment on attachment 8544767 [details] [diff] [review]
Bug 1113416 - Don't read stack labels inside hang monitor sighandler;

Review of attachment 8544767 [details] [diff] [review]:
-----------------------------------------------------------------

This is some crazy stuff.
Attachment #8544767 - Flags: review?(snorp) → review+
This makes so much more sense :)
Comment on attachment 8544767 [details] [diff] [review]
Bug 1113416 - Don't read stack labels inside hang monitor sighandler;

That said, this should be reviewed by a peer of the xpcom module.
Attachment #8544767 - Flags: review?(nfroyd)
Comment on attachment 8544767 [details] [diff] [review]
Bug 1113416 - Don't read stack labels inside hang monitor sighandler;

Review of attachment 8544767 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know what the stacks in BHM usually look like, so I'm curious about your thoughts on the comment below.

::: xpcom/threads/BackgroundHangMonitor.cpp
@@ +362,5 @@
>  
> +  // Remove unwanted "js::RunScript" frame from the stack
> +  for (const char** f = &mHangStack.back(); f >= mHangStack.begin(); f--) {
> +    if (!mHangStack.IsInBuffer(*f) && !strcmp(*f, "js::RunScript")) {
> +      mHangStack.erase(f);

My only concern is that we now have this O(N^2)-ish algorithm here.  Maybe it doesn't matter that much, as we *probably* won't have many RunScript entries and those that we do are likely to be close to the end, so the copying is minimal?
Attachment #8544767 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #5)
> Comment on attachment 8544767 [details] [diff] [review]
> Bug 1113416 - Don't read stack labels inside hang monitor sighandler;
> 
> Review of attachment 8544767 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I don't know what the stacks in BHM usually look like, so I'm curious about
> your thoughts on the comment below.
> 
> ::: xpcom/threads/BackgroundHangMonitor.cpp
> @@ +362,5 @@
> >  
> > +  // Remove unwanted "js::RunScript" frame from the stack
> > +  for (const char** f = &mHangStack.back(); f >= mHangStack.begin(); f--) {
> > +    if (!mHangStack.IsInBuffer(*f) && !strcmp(*f, "js::RunScript")) {
> > +      mHangStack.erase(f);
> 
> My only concern is that we now have this O(N^2)-ish algorithm here.  Maybe
> it doesn't matter that much, as we *probably* won't have many RunScript
> entries and those that we do are likely to be close to the end, so the
> copying is minimal?

Right, stacks are usually less than 10 entries and RunScript only appears for JS entries near the end of the stack. And this block only runs infrequently because it only runs when there is a hang happening on a thread.
https://hg.mozilla.org/mozilla-central/rev/6cbf0b1ec001
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: