Closed
Bug 1447685
Opened 3 years ago
Closed 3 years ago
Deadlock during startup in nsRFPService::RandomMidpoint
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
FIXED
mozilla61
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox59 | --- | unaffected |
firefox60 | + | fixed |
firefox61 | + | fixed |
People
(Reporter: sergey.kogan, Assigned: tjr)
References
Details
(Keywords: hang)
Attachments
(2 files)
20.35 KB,
image/png
|
Details | |
59 bytes,
text/x-review-board-request
|
keeler
:
review+
jcristau
:
approval-mozilla-beta+
|
Details |
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36 Steps to reproduce: Start Firefox with an URL as a command line argument (https://yandex.ru for example). Actual results: Firefox sometimes shows the main window but the window is blank and doesn't respond to input. See the screenshot attached. The issue is pretty rare. It started to happen on 03/15 with Firefox 60.0.0.6645 and continued with 60.0.0.6648. All cases were on Windows 10 RS3. It happened 8 times so far. Please contact me to get a memory dump.
Digging into the memory dump shows that this is a simple deadlock. The main thread is waiting inside nsRFPService::RandomMidpoint trying to acquire a lock. Line 346: https://dxr.mozilla.org/mozilla-central/rev/6ff60a083701d08c52702daf50f28e8f46ae3a1c/toolkit/components/resistfingerprinting/nsRFPService.cpp#346 The call stack is: ... 06 ntdll!RtlEnterCriticalSection 07 mozglue!mozilla::detail::MutexImpl::lock 08 xul!mozilla::OffTheBooksMutex::Lock 09 xul!mozilla::StaticMutex::Lock 0a xul!mozilla::BaseAutoLock<mozilla::StaticMutex>::{ctor} 0b xul!mozilla::nsRFPService::RandomMidpoint <---------------- trying to acquire the lock 0c xul!mozilla::nsRFPService::ReduceTimePrecisionImpl 0d xul!mozilla::nsRFPService::ReduceTimePrecisionAsUSecsWrapper 0e xul!NowAsMillis 0f xul!js::date_now 10 xul!js::CallJSNative ... Meanwhile that critical section is locked by another thread which is also inside nsRFPService::RandomMidpoint: ... 07 xul!mozilla::MonitorAutoLock::Wait 08 xul!mozilla::SyncRunnable::DispatchToThread 09 xul!mozilla::SyncRunnable::DispatchToThread 0a xul!EnsureNSSInitializedChromeOrContent 0b xul!mozilla::psm::Constructor<nsRandomGenerator,0,1,1> 0c xul!mozilla::GenericFactory::CreateInstance 0d xul!nsComponentManagerImpl::CreateInstanceByContractID 0e xul!nsComponentManagerImpl::GetServiceByContractID 0f xul!CallGetService 10 xul!nsGetServiceByContractIDWithError::operator() 11 xul!nsCOMPtr_base::assign_from_gs_contractid_with_error 12 xul!nsCOMPtr<nsIRandomGenerator>::{ctor} 13 xul!mozilla::nsRFPService::RandomMidpoint <------------- running the frames above under the lock 14 xul!mozilla::nsRFPService::ReduceTimePrecisionImpl 15 xul!mozilla::nsRFPService::ReduceTimePrecisionAsUSecsWrapper 16 xul!NowAsMillis 17 xul!js::date_now 18 xul!js::CallJSNative ... From the code of EnsureNSSInitializedChromeOrContent it seems that it tries to dispatch a runnable to the main thread and waits for the result. The problem is that the main thread is busy waiting for the lock that was locked by the second thread. Other dumps that I made have the same deadlock with the same stacks.
![]() |
||
Updated•3 years ago
|
Severity: normal → critical
Component: Untriaged → DOM: Events
Keywords: hang
Product: Firefox → Core
Version: unspecified → 60 Branch
Updated•3 years ago
|
Component: DOM: Events → DOM
Flags: needinfo?(tom)
Assignee | ||
Comment 2•3 years ago
|
||
Firstly, thank you so much for digging into this and finding the root cause. Hrm. The first thing that comes to mind is a hack. If NSS is not initialized, and we're not on the main thread - don't do anything that might initialize NSS. But as the repro steps illustrate: we would be returning unjittered time to web content. Practically speaking, one needs to run an attack using a high res timer for 'some amount of time' and NSS will surely be initialized and start jitting time early on in the process. So it's.... probably safe? But it's definitely not correct and it's quite ugly.
Flags: needinfo?(tom)
Assignee | ||
Updated•3 years ago
|
Assignee: nobody → tom
status-firefox59:
--- → unaffected
status-firefox60:
--- → affected
status-firefox61:
--- → affected
tracking-firefox60:
--- → ?
tracking-firefox61:
--- → ?
Updated•3 years ago
|
status-firefox-esr52:
--- → unaffected
Tom - how about not holding sLock while getting a handle on the random generator service? i.e. // If we don't have a seed, we need to get one. if (MOZ_UNLIKELY(!sSecretMidpointSeed)) { nsCOMPtr<nsIRandomGenerator> randomGenerator = do_GetService("@mozilla.org/security/random-generator;1", &rv); if (NS_WARN_IF(NS_FAILED(rv))) { return rv; } StaticMutexAutoLock lock(sLock); if (MOZ_LIKELY(!sSecretMidpointSeed)) { rv = randomGenerator->GenerateRandomBytes(kSeedSize, &sSecretMidpointSeed); if (NS_WARN_IF(NS_FAILED(rv))) { return rv; } } }
Flags: needinfo?(tom)
Comment hidden (mozreview-request) |
![]() |
||
Comment 5•3 years ago
|
||
mozreview-review |
Comment on attachment 8961482 [details] Bug 1447685 Move nsIRandomGenerator creation out of a lock to avoid a deadlock https://reviewboard.mozilla.org/r/230256/#review235856
Attachment #8961482 -
Flags: review?(dkeeler) → review+
Assignee | ||
Updated•3 years ago
|
Flags: needinfo?(tom)
Keywords: checkin-needed
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/autoland/rev/e63bed369163 Move nsIRandomGenerator creation out of a lock to avoid a deadlock r=keeler
Keywords: checkin-needed
Comment 7•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e63bed369163
Status: UNCONFIRMED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Assignee | ||
Comment 8•3 years ago
|
||
Sergei - any chance you could confirm this is fixed in the latest Nightly?
Flags: needinfo?(sergey.kogan)
(In reply to Tom Ritter [:tjr] from comment #8) > Sergei - any chance you could confirm this is fixed in the latest Nightly? Unfortunately the issue is hard to reproduce. I've seen it 9 times since 03/15 and during that time we've done roughly 8000 launches of Firefox 60.* in our test suite.
Flags: needinfo?(sergey.kogan)
Assignee | ||
Comment 10•3 years ago
|
||
Comment on attachment 8961482 [details] Bug 1447685 Move nsIRandomGenerator creation out of a lock to avoid a deadlock Approval Request Comment [Feature/Bug causing the regression]: Deadlock in Time Jittering [User impact if declined]: Occasional deadlocks [Is this code covered by automated tests?]: No [Has the fix been verified in Nightly?]: No; I couldn't repro it. But Nightly works so it didn't break it entirely... [Needs manual test from QE? If yes, steps to reproduce]: No [List of other uplifts needed for the feature/fix]: No [Is the change risky?]: No [Why is the change risky/not risky?]: It's moving code out of a critical section
Attachment #8961482 -
Flags: approval-mozilla-beta?
Comment 11•3 years ago
|
||
Comment on attachment 8961482 [details] Bug 1447685 Move nsIRandomGenerator creation out of a lock to avoid a deadlock fix a deadlock regression in beta60
Attachment #8961482 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 12•3 years ago
|
||
bugherderuplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/8ab8e9ddda16
Updated•2 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•