Closed Bug 684887 Opened 9 years ago Closed 7 years ago

Make the component manager lock non-reentrant and protect nsFactoryEntry.mFactory

Categories

(Core :: XPCOM, defect)

x86_64
Windows 7
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: benjamin, Assigned: benjamin)

References

Details

Attachments

(2 files, 1 obsolete file)

The component manager monitor is currently the reentrant (Java) style, which doesn't perform as well as a standard non-reentrant monitor and is also papering over some unfortunate calling patterns. Also nsFactoryEntry.mFactory isn't protected by the lock, which may be leading to the symptoms in bug 679993.
This passed tryserver.
Attachment #558521 - Flags: review?(jones.chris.g)
Attachment #558521 - Flags: review?(dbaron)
Why not just switch to Mutex?  Why do you need Monitor?

Also, is there any reason cjones's review wouldn't be sufficient?  I don't feel any need to review this.
Comment on attachment 558521 [details] [diff] [review]
Make the XPCOM monitor non-reentrant and protect nsFactoryEntry.mFactory, rev. 1

dbaron is right, no need for a Monitor.

>+        mon.Unlock();
> 
>         if (!currentThread) {
>             currentThread = NS_GetCurrentThread();
>             NS_ASSERTION(currentThread, "This should never be null!");
>         }
> 
>         // This will process a single event or yield the thread if no event is
>         // pending.
>         if (!NS_ProcessNextEvent(currentThread, PR_FALSE)) {
>             PR_Sleep(PR_INTERVAL_NO_WAIT);
>         }
> 
>-        mon.Enter();
>+        mon.Lock();

This should be an explicitly-scoped MutexAutoUnlock.

>-    mon.Exit();
>+    mon.Unlock();
> 
>     nsresult rv = CreateInstance(aClass, nsnull, aIID, getter_AddRefs(service));
>+    if (NS_SUCCEEDED(rv) && !service) {
>+        NS_ERROR("Factory did not return an object but returned success!");
>+        return NS_ERROR_SERVICE_NOT_FOUND;
>+    }
> 
>-    mon.Enter();
>+    mon.Lock();
> 

Same here.

>-        mon.Exit();
>+        mon.Unlock();
> 
>         if (!currentThread) {
>             currentThread = NS_GetCurrentThread();
>             NS_ASSERTION(currentThread, "This should never be null!");
>         }
> 
>         // This will process a single event or yield the thread if no event is
>         // pending.
>         if (!NS_ProcessNextEvent(currentThread, PR_FALSE)) {
>             PR_Sleep(PR_INTERVAL_NO_WAIT);
>         }
> 
>-        mon.Enter();
>+        mon.Lock();

And here.

>-    mon.Exit();
>+    mon.Unlock();
> 
>     nsresult rv = CreateInstanceByContractID(aContractID, nsnull, aIID,
>                                              getter_AddRefs(service));
> 
>-    mon.Enter();
>+    mon.Lock();
> 

And here.

>+    // Monitor held
>     void RegisterCIDEntry(const mozilla::Module::CIDEntry* aEntry,
>                           KnownModule* aModule);

It's common to add a "Locked" suffix on the end of methods that are
supposed to hold a mutex.  Nice little syntactic reminder.

r=me with s/Monitor/Mutex, MutexAutoUnlock above, and nit picked.
Attachment #558521 - Flags: review?(jones.chris.g) → review+
Attachment #558521 - Flags: review?(dbaron)
I do not want to land this until bug 429592 lands, since it could cause vaguely scary deadlocks.
Duplicate of this bug: 833349
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #5)
> I do not want to land this until bug 429592 lands, since it could cause
> vaguely scary deadlocks.

Bug 429592 is marked fixed, can we land this now?
Flags: needinfo?(benjamin)
It is preffed off. To have confidence that this bug would not cause deadlocks, we'd really need to make the same-thread lock detection in mozilla::Mutex/PR_Lock a fatal error.
Flags: needinfo?(benjamin)
Are there plans to pref it on?
No. Given the problems we uncovered the first time, we have no realistic plan to turn it on.
We need someone to fix NSPR to abort if you reenter a lock on the same thread.
(In reply to ben turner [:bent] from comment #11)
> We need someone to fix NSPR to abort if you reenter a lock on the same
> thread.

Unfortunately that's hard.

https://bugzilla.mozilla.org/show_bug.cgi?id=476536#c8
This is a hand-rebased version of attachment 558521 [details] [diff] [review], but there are no functional changes so I don't think this needs re-review.
Attachment #558521 - Attachment is obsolete: true
Oh and on patch B there's an obvious bug: AssertCurrentThreadOwns should have an == instead of a !=
Duplicate of this bug: 844817
Duplicate of this bug: 844794
Comment on attachment 722312 [details] [diff] [review]
Part B - Make a SafeMutex which crashes in case of recursive locking, rev. 1

Review of attachment 722312 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with these things addressed.

::: xpcom/components/nsComponentManager.h
@@ +78,5 @@
> +        , mOwnerThread(nullptr)
> +    { }
> +    ~SafeMutex()
> +    { }
> +    

Nit: You've got some extra whitespace here, and you could also remove the destructor since it is no different than the one the compiler will generate for you.

@@ +100,5 @@
> +        // This method is a debug-only check
> +        MOZ_ASSERT(mOwnerThread != PR_GetCurrentThread());
> +    }
> +
> +    MOZ_NEVER_INLINE void AssertNotCurrentThreadOwns() const

Hm, I don't understand the reason for the MOZ_NEVER_INLINE here. Even if the compiler is forced to treat this as a function call it still knows the consequences of calling it, right?

@@ +110,5 @@
> +    }
> +
> +private:
> +    mozilla::Mutex mMutex;
> +    PRThread* mOwnerThread;

So after thinking about this a little I think that we should mark this as volatile. That way we can have a reasonable expectation that the optimizer won't pull funny tricks with this address (which still sounds unlikely, but see the last example in http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong).
Attachment #722312 - Flags: review?(bent.mozilla) → review+
I always add destructors to my classes, blank if necessary. I made AssertNotCurrentThreadOwns non-inline so that it would show up in crash report signatures. I've fixed the whitespace and added volatile.
https://hg.mozilla.org/mozilla-central/rev/625ba43c97c6
https://hg.mozilla.org/mozilla-central/rev/401981c0ea51
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Duplicate of this bug: 690604
Can we uplift this to Aurora to reduce the number of branches affected by bug 690604?
This patch is pretty risky and I don't think it's a good candidate for uplift.
Depends on: 881237
You need to log in before you can comment on or make changes to this bug.