common deadlocks of nsProxyObjectManager.mProxyCreationLock

RESOLVED FIXED in mozilla6

Status

()

Core
XPCOM
P2
normal
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: dbaron, Assigned: dbaron)

Tracking

({hang})

Trunk
mozilla6
x86_64
Linux
Points:
---

Firefox Tracking Flags

(firefox5 affected)

Details

Attachments

(4 attachments)

(Assignee)

Description

6 years ago
I restarted my build yesterday after running for quite a while, and I started seeing very common deadlocks (maybe once per hour of use) of nsProxyObjectManager.mProxyCreationLock .  (I'm running a debug+opt Linux x86-64 build of mozilla-central.)

When I restarted yesterday, my tree was based on https://hg.mozilla.org/mozilla-central/rev/48d6abe0a436 plus local patches; I updated this morning to https://hg.mozilla.org/mozilla-central/rev/1346f7ada9d5 and it didn't help.

Attachments and diagnosis coming.
(Assignee)

Updated

6 years ago
Keywords: hang
(Assignee)

Comment 1

6 years ago
Created attachment 526628 [details]
tail of stdout of process when the hang happened
(Assignee)

Comment 2

6 years ago
Created attachment 526631 [details]
gdb output showing the problem

This gdb output has a better stack of the same thread.

The problem here is that thread #12 is trying to acquire the same lock twice, due to what appears to me to be a longstanding bug in the XPCOM proxy code.  Then all the other threads that are active eventually try to acquire the same lock, and everything comes to a halt.

The key frames in this stack are frame #9 and frame #6, both of which want to hold the same lock.

I believe this is a bug in the code in frame #9, which should not be calling |delete newpeo| here:

    // Now that we're locked again, check for races by repeating the
    // linked-list check.
    for (peo = mFirst; peo; peo = peo->mNext) {
        if (peo->GetClass()->GetProxiedIID().Equals(aIID)) {
            delete newpeo;
            *aResult = static_cast<nsISupports*>(peo->mXPTCStub);
            peo->LockedAddRef();
            return NS_OK;
        }
    }

since it is not safe to delete a proxy event object while holding the lock.

I believe this code needs to explicitly unlock around the |delete newpeo|.  (nsProxy*Event*Object doesn't seem to have a LockedRelease method... nor would we want to call such a method since its Release method removes it from the table that we haven't added it to yet.)
(Assignee)

Comment 3

6 years ago
FWIW, this codepath looks like it could be relatively hard to hit, since it's the code that's handling the "somebody else already inserted one of these while we were unlocked" case.  I'm not sure if what's causing me to start seeing this all the time now is some recent change to the code or some recent change to what sites I have loaded or my internet connectivity.
(Assignee)

Comment 4

6 years ago
Created attachment 526632 [details] [diff] [review]
patch
Attachment #526632 - Flags: review?(benjamin)
(Assignee)

Comment 5

6 years ago
That said, in this instance this is clearly part of a load of https://si3.twimg.com/profile_images/1158205226/ft2010-closeup_normal.png which was part of loading https://twitter.com/ .  (It's https://twitter.com/LeaVerou 's profile picture.)
Attachment #526632 - Flags: review?(benjamin) → review+
(Assignee)

Updated

6 years ago
status-firefox5: --- → affected
(Assignee)

Comment 6

6 years ago
https://hg.mozilla.org/mozilla-central/rev/6e5fcc89c65e
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Assignee)

Comment 7

6 years ago
I'm really not sure why I suddenly started seeing this; I think the build I was using when I started seeing it was right *before* the pull to Aurora, so I'm a little worried about Firefox 5.  But I'm inclined not to try to get it in unless I hear other reports of users seeing hangs.  (Though I'm not sure how well we'll hear about it...)
Priority: -- → P2
Target Milestone: --- → mozilla6

Updated

6 years ago
Duplicate of this bug: 400450
(Assignee)

Comment 9

6 years ago
Created attachment 527198 [details] [diff] [review]
additional patch

After reading bug 400450, I think we should probably also do this.  (I probably should have noticed this sooner.)
Attachment #527198 - Flags: review?(benjamin)
Attachment #527198 - Flags: review?(benjamin) → review+
(Assignee)

Comment 10

6 years ago
http://hg.mozilla.org/mozilla-central/rev/127b9421c7c3
You need to log in before you can comment on or make changes to this bug.