Closed
Bug 412456
Opened 15 years ago
Closed 15 years ago
###!!! ASSERTION: nsSecureBrowserUIImpl not thread-safe
Categories
(Core :: Security: PSM, defect, P1)
Core
Security: PSM
Tracking
()
RESOLVED
FIXED
People
(Reporter: cbook, Assigned: KaiE)
Details
(Keywords: assertion, regression)
Attachments
(3 files)
30.62 KB,
patch
|
rrelyea
:
review-
|
Details | Diff | Splinter Review |
36.01 KB,
patch
|
Details | Diff | Splinter Review | |
37.96 KB,
patch
|
rrelyea
:
review+
|
Details | Diff | Splinter Review |
###!!! ASSERTION: nsSecureBrowserUIImpl not thread-safe: '_mOwningThread.GetThad() == PR_GetCurrentThread()', file c:/Debug/mozilla/security/manager/boot/srnsSecureBrowserUIImpl.cpp, line 169 Steps to reproduce: Created a new Firefox Profile Installed IE View Extension Browser Restart to enable this Extension While Restart and restore of 3 Tabs (2 default Startpages for Trunk Build and AMO IE Homepage (https://addons.mozilla.org/en-US/firefox/addon/35) i was running into this Assertion
Comment 1•15 years ago
|
||
See also bug 370875.
I'm getting this on linux whenever I load Bugzilla. I think it's a fairly recent regression, within the last week or two.
OS: Windows XP → All
Hardware: PC → All
Updated•15 years ago
|
Flags: blocking1.9?
Keywords: regression
Comment 3•15 years ago
|
||
Based on comment 2 I'd like to understand whether this is a bad issue or not
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
Comment 4•15 years ago
|
||
At the very least, this adds to debugging pain when a testcase is attached to bugzilla.
Comment 5•15 years ago
|
||
Here's a stack: #0 nsSecureBrowserUIImpl::AddRef (this=0x1961e450) at ../../../../../source/mozilla/security/manager/boot/src/nsSecureBrowserUIImpl.cpp:167 #1 0x00317c6d in NS_TableDrivenQI (aThis=0x1961e450, entries=0x5e1208, aIID=@0xeddb3c0, aInstancePtr=0xb00a08fc) at nsISupportsImpl.cpp:49 #2 0x005d62db in nsSecureBrowserUIImpl::QueryInterface (this=0x1961e450, aIID=@0xeddb3c0, aInstancePtr=0xb00a08fc) at ../../../../../source/mozilla/security/manager/boot/src/nsSecureBrowserUIImpl.cpp:186 #3 0x0031410d in nsQueryInterface::operator() (this=0xb00a0914, aIID=@0xeddb3c0, answer=0xb00a08fc) at nsCOMPtr.cpp:47 #4 0x0ed7d318 in nsCOMPtr<nsISSLStatusProvider>::assign_from_qi (this=0xb00a0988, qi={mRawPtr = 0x1961e450}, aIID=@0xeddb3c0) at nsCOMPtr.h:1275 #5 0x0ed7d372 in nsCOMPtr<nsISSLStatusProvider>::nsCOMPtr (this=0xb00a0988, qi={mRawPtr = 0x1961e450}) at nsCOMPtr.h:645 #6 0x0ed7d390 in nsCOMPtr<nsISSLStatusProvider>::nsCOMPtr (this=0xb00a0988, qi={mRawPtr = 0x1961e450}) at nsCOMPtr.h:645 #7 0x0ed7bcc0 in nsNSSSocketInfo::SetNotificationCallbacks (this=0x10e4d2b0, aCallbacks=0x10e830d4) at ../../../../../source/mozilla/security/manager/ssl/src/nsNSSIOLayer.cpp:357 #8 0x092eb66e in nsSocketTransport::BuildSocket (this=0x11ad5210, fd=@0xb00a0c50, proxyTransparent=@0xb00a0c4c, usingSSL=@0xb00a0c48) at ../../../../source/mozilla/netwerk/base/src/nsSocketTransport2.cpp:1042 #9 0x092ec04f in nsSocketTransport::InitiateSocket (this=0x11ad5210) at ../../../../source/mozilla/netwerk/base/src/nsSocketTransport2.cpp:1110 #10 0x092ed4f7 in nsSocketTransport::OnSocketEvent (this=0x11ad5210, type=1, status=0, param=0x11a53df0) at ../../../../source/mozilla/netwerk/base/src/nsSocketTransport2.cpp:1426 #11 0x092ef856 in nsSocketEvent::Run (this=0x10ec6330) at ../../../../source/mozilla/netwerk/base/src/nsSocketTransport2.cpp:98 #12 0x00391ca8 in nsThread::ProcessNextEvent (this=0x6342f0, mayWait=1, result=0xb00a0dac) at ../../../source/mozilla/xpcom/threads/nsThread.cpp:510 #13 0x0031dab8 in NS_ProcessNextEvent_P (thread=0x6342f0, mayWait=1) at nsThreadUtils.cpp:227 #14 0x092f1418 in nsSocketTransportService::Run (this=0x867000) at ../../../../source/mozilla/netwerk/base/src/nsSocketTransportService2.cpp:562 #15 0x00391ca8 in nsThread::ProcessNextEvent (this=0x6342f0, mayWait=1, result=0xb00a0ecc) at ../../../source/mozilla/xpcom/threads/nsThread.cpp:510 #16 0x0031dab8 in NS_ProcessNextEvent_P (thread=0x6342f0, mayWait=1) at nsThreadUtils.cpp:227 #17 0x00391eb7 in nsThread::ThreadFunc (arg=0x6342f0) at ../../../source/mozilla/xpcom/threads/nsThread.cpp:254 #18 0x0055f360 in _pt_root (arg=0x635900) at ../../../../../source/mozilla/nsprpub/pr/src/pthreads/ptthread.c:221 #19 0x93142075 in _pthread_start () #20 0x93141f32 in thread_start () This is on a thread other than the main thread.
Updated•15 years ago
|
Assignee: nobody → kengert
Component: General → Security: PSM
QA Contact: general → psm
Comment 6•15 years ago
|
||
This is the third most common assertion on the top sites.
Assignee | ||
Comment 7•15 years ago
|
||
I think it's time to add a lock to nsSecureBrowserUIImpl and declare it as threadsafe.
Assignee | ||
Comment 8•15 years ago
|
||
This patch declares the implementation as thread-safe. The remaining portions of the patch add the locks to really make it so. I'm using a lock whenever a member variable is read or updated. I'm trying to keep the lock periods to the minimum, never hold the lock when calling the outside world.
Attachment #306555 -
Flags: review?(rrelyea)
Updated•15 years ago
|
Flags: tracking1.9+ → blocking1.9+
Priority: P2 → P1
Comment 9•15 years ago
|
||
Comment on attachment 306555 [details] [diff] [review] Patch v1 r- In general it looks like you are handling access to the various interal states with locks, however, since the code was not originally thread-safe it makes it very difficult to determine if we have potential race conditions. In reviewing I've found a couple of things which may or may not be issues. Part of the resolution could either be code changes or explanations why this particular race is not an issue... Issue 1) There appears to be a merge issue from the current trunk. This isn't a race condition (well maybe a check-in race). Places where the mWindow weak reference is made. The current code fetches a new window with a query context. I think that change and this one needs to be in sync (is it possible for the tmpWindow copy of the mWindow weak reference still has the weak reference characteristic of mWindow for instance?). Issue 2: Are auto locks reentrant? If so, this is not an issue, but if they are not (NSPR PR_Locks are not for instance), then you have a potential deadlock in OnStateChange where you call ResetStateTracking while holding the lock. Issue 3: Again in OnStateChange, you grap the current mTopLevel, (correctly with the lock), call ObtainEvenSink with it, then set the value holding the lock again. If it is possible for ObtainEventSink to come up with a different answer for temp_ToplevelEventSink in consecutive calls, then you have a race condition setting mToplevelEventSink. Issue 4: Same type of race in UpdateSecurity state. You call it based on the setting of a global. It in turn modifies some globals. Without knowing what the invariance model is, it's impossible to say whether or not there is a race condition here. In general the code does a good job of correctly referencing or setting known 'globals' under locks, but it's not clear that these are being set correctly in the race case without a clear definition of what the true invariants are. Other than Issue 1 you may be able to resolve the other issue simply with documentation. bob
Attachment #306555 -
Flags: review?(rrelyea) → review-
Assignee | ||
Comment 10•15 years ago
|
||
(In reply to comment #9) > Issue 1) There appears to be a merge issue from the current trunk. This isn't a > race condition (well maybe a check-in race). Places where the mWindow weak > reference is made. The current code fetches a new window with a query context. > I think that change and this one needs to be in sync (is it possible for the > tmpWindow copy of the mWindow weak reference still has the weak reference > characteristic of mWindow for instance?). Indeed, this change was made very recently. I'm attaching a new patch, which has the changes merged, but none of your other requests (yet) addressed. If we dereference while holding the lock, it's fine to copy a reference to nsCOMPtr for our use. We'll use that object temporary for completing our current activity, and release the reference afterwards. If this causes the object to be freed, it's all good. Our next attempt to dereference the weak reference will do the right thing, whatever happened in between. In other words, our tmpWindow is not a weak reference, it's a strong reference, and it must be that way.
Assignee | ||
Comment 11•15 years ago
|
||
Bugzilla's interdiff is insufficient again. If you'd like to see the difference between patch v1 and patch v2, download both files, and use tkdiff. (FYI: On Fedora Linux it's contained in the tkcvs package.)
Assignee | ||
Comment 12•15 years ago
|
||
(In reply to comment #9) > Issue 2: Are auto locks reentrant? If so, this is not an issue, but if they are > not (NSPR PR_Locks are not for instance), then you have a potential deadlock in > OnStateChange where you call ResetStateTracking while holding the lock. The underlying lock is a PRMonitor*. I'm not using nsAutoLock, I'm using nsAutoMonitor. nsAutoMonitor calls PR_EnterMonitor on contruction/scope entry and calls PR_ExitMonitor on destruction/scope exit, so we are reentrant and the code location you found is safe. > Issue 3: Again in OnStateChange, you grap the current mTopLevel, (correctly > with the lock), call ObtainEvenSink with it, then set the value holding the > lock again. If it is possible for ObtainEventSink to come up with a different > answer for temp_ToplevelEventSink in consecutive calls, then you have a race > condition setting mToplevelEventSink. I don't think there is a race. I don't think we'll ever see parallel calls to OnStateChange on the same object. Actually, the sink only ever gets set once for each instance of nsSecureBrowserUIImpl. We need to know the pointer to "our" window (that this tracker is associated to), and it won't change. So even if there were a race, it wouldn't matter. However, your question made me thinking. Should we ensure it's not necessary that OnStateChange is reentrant? Well, it has never been. But I added a mechanism to detect unexpected reentrance. Search for nsAutoAtomic and you'll understand. In my testing this was never hit. But this adds a bit of overhead. I think we should avoid it. (If you agree, I'll attach a follow up patch with nsAutoAtomic and the assertion and variable m...ReentranceDetection removed.) > Issue 4: Same type of race in UpdateSecurity state. You call it based on the > setting of a global. It in turn modifies some globals. Without knowing what the > invariance model is, it's impossible to say whether or not there is a race > condition here. I reviewed the code, only the following two events can eventually result in a call to UpdateSecurityState: - OnStateChange notification - OnLocationChange notification I used the above nsAutoAtomic mechanism to detect any races within this group of two functions. I didn't see a race in my testing. This is what I expected. The notifications for a given window come in serialized.
Assignee | ||
Comment 13•15 years ago
|
||
Attachment #307505 -
Flags: review?(rrelyea)
Assignee | ||
Updated•15 years ago
|
Attachment #307494 -
Attachment description: Patch v2 → Patch v2 (proposed for check in)
Comment 14•15 years ago
|
||
Comment on attachment 307505 [details] [diff] [review] Patch v3 Summary of resolutions to review issues. Issue 1: this looks much safer, particularly given the previous code. Issue 2: OK, PR_Monitors have the correct reentry semantic. Issues 3 & 4: The guard you provide properly detects what would otherwise be race issues. RE the performance issue: Atomic operations should be realtively fast (certainly compared to acquisition of locks), and adding the check will instantly show up any threading issues rather than have races create unexpected UI issues without any visibility. I personally think this is worth it, but it also depends on the frequency of calls to these functions and their sensitivity to performance issues. If this patch produces noticeable performance problems from the user level, then the trade-off is not so good. I have a hard time believing it will, but will accept patch V2 if that is the case. bob
Attachment #307505 -
Flags: review?(rrelyea) → review+
Assignee | ||
Updated•15 years ago
|
Attachment #307494 -
Attachment description: Patch v2 (proposed for check in) → Patch v2
Assignee | ||
Comment 15•15 years ago
|
||
Comment on attachment 307505 [details] [diff] [review] Patch v3 ok, let's check in this one
Attachment #307505 -
Attachment description: Patch v3 (with race testing, performance penalty) → Patch v3
Assignee | ||
Comment 16•15 years ago
|
||
fixed
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 17•10 years ago
|
||
I am updating MapInternaltoExternalState() in bug 822367 and am considering removing the static keyword from the method. It was added here in Patch 1: https://bugzilla.mozilla.org/attachment.cgi?id=306555&action=diff#mozilla/security/manager/boot/src/nsSecureBrowserUIImpl.h_sec4 Does anyone see an issue with this?
Comment 18•10 years ago
|
||
(In reply to Tanvi Vyas [:tanvi] from comment #17) > I am updating MapInternaltoExternalState() in bug 822367 and am considering > removing the static keyword from the method. It was added here in Patch 1: > https://bugzilla.mozilla.org/attachment.cgi?id=306555&action=diff#mozilla/ > security/manager/boot/src/nsSecureBrowserUIImpl.h_sec4 > > Does anyone see an issue with this? If it seems reasonable to you, then it seems reasonable to me. Just have a PSM peer review your patch for bug 822367.
You need to log in
before you can comment on or make changes to this bug.
Description
•