Closed Bug 1679933 Opened 4 years ago Closed 4 years ago

Firefox freeze on the startup

Categories

(Core :: Networking: HTTP, defect, P1)

x86_64
macOS
defect

Tracking

RESOLVED FIXED
86 Branch
Tracking Status
firefox-esr78 --- fixed
firefox85 blocking fixed
firefox86 --- fixed

People

(Reporter: tetsuharu, Assigned: kershaw)

Details

(Whiteboard: [necko-triaged])

Attachments

(2 files)

Steps to reproduce

  1. Launch Firefox.
  2. Try to open a new window from the context menu on the icon in Dock

Actual Result

  • Firefox becomes unresponsive and shows the rainbow cursor.
  • Firefox does not recover, and I need to force-quit it.
  • I have been reproducing this bug for the past few days.
  • I do not face this bug rarely on a new profile

(In reply to Tetsuharu OHZEKI [:tetsuharu] (UTC+9) from comment #0)

  • Firefox becomes unresponsive and shows the rainbow cursor.
  • Firefox does not recover, and I need to force-quit it.
  • I have been reproducing this bug for the past few days.
  • I do not face this bug rarely on a new profile

Can you clarify whether this means you do not see this with a new profile?

Do you know which Nightly this started happening with?

Flags: needinfo?(tetsuharu.ohzeki)

Can you clarify whether this means you do not see this with a new profile?

Yes. I face this bug rarely on a new profile.

Do you know which Nightly this started happening with?

Sorry, I don't remember the exact date or which Nightly this started happening with.
I think I hit this bug frequently from 11/20 (Fri) to 11/23 (Mon). At least, I believe it was around the time the U.S. entered the Thanksgiving holiday.

Additional information:

  • With yesterday's Nightly, I have not hit this bug, but I'm not sure that it has actually been fixed, because it sometimes does not happen. The steps to reproduce seem to depend on some timing issue.
  • My main profile sets fission.autostart=true.
Flags: needinfo?(tetsuharu.ohzeki)
Attached file stack information

This is the information I got from the macOS crash-report dialog shown after force-quitting Firefox.

Based on the stack info, it looks like Firefox may have been stuck in a deadlock somewhere in the HTTP code, so I'm moving this over there for more investigation. Is this still occurring?

Component: General → Networking: HTTP
Flags: needinfo?(tetsuharu.ohzeki)
Product: Firefox → Core

Is this still occurring?

Yes.
It seems the behavior has changed somewhat. Before, this bug was reproducible every time I launched Firefox.

But now,

  • I hit this bug when launching Firefox for the first time after the daily update, and it does not always happen.
  • I think this bug happens if I open the context menu on the Dock icon and open a new window.
  • But the timing varies a bit.
  • If this bug does not happen within 10 seconds after launching Firefox, I never hit it for the rest of that session.
Flags: needinfo?(tetsuharu.ohzeki)

The CacheIO thread holds mRCWNLock and tries to dispatch a SyncRunnable to the main thread:
45 mozilla::net::nsHttpChannel::OnCacheEntryCheck(nsICacheEntry*, nsIApplicationCache*, unsigned int*) + 2485 (XUL + 7218629) [0x107e7d5c5] 1-45
45 mozilla::net::nsHttpChannel::OpenCacheInputStream(nsICacheEntry*, bool, bool) + 1539 (XUL + 7225683) [0x107e7f153] 1-45
45 mozilla::net::CacheEntry::GetSecurityInfo(nsISupports**) + 197 (XUL + 41189653) [0x109ee3115] 1-45
45 NS_DeserializeObject(nsTSubstring<char> const&, nsISupports**) + 131 (XUL + 6224003) [0x107d8a883] 1-45
45 nsBinaryInputStream::ReadObject(bool, nsISupports**) + 274 (XUL + 5336850) [0x107cb1f12] 1-45
45 nsCOMPtr_base::assign_from_helper(nsCOMPtr_helper const&, nsID const&) + 44 (XUL + 5114444) [0x107c7ba4c] 1-45
45 nsCreateInstanceByCID::operator()(nsID const&, void**) const + 42 (XUL + 5481626) [0x107cd549a] 1-45
45 nsComponentManagerImpl::CreateInstance(nsID const&, nsISupports*, nsID const&, void**) + 183 (XUL + 5470679) [0x107cd29d7] 1-45
45 nsresult mozilla::psm::NSSConstructor<mozilla::psm::TransportSecurityInfo>(nsISupports*, nsID const&, void**) + 56 (XUL + 84060120) [0x10c7c57d8] 1-45
45 EnsureNSSInitializedChromeOrContent() + 583 (XUL + 21727959) [0x108c53ad7] 1-45
45 mozilla::SyncRunnable::DispatchToThread(nsIEventTarget*, bool) + 156 (XUL + 5904060) [0x107d3c6bc] 1-45

The main thread is waiting on mRCWNLock.
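
For readers unfamiliar with this pattern, the lock cycle described above can be modeled roughly as follows. This is only a minimal sketch using standard C++ threads, not the actual Gecko code: `gRCWNLock` and `SyncDispatchCompletesWhileWorkerHoldsLock` are hypothetical names, the posted task is reduced to a promise, and the "main" thread uses `try_lock()` instead of blocking so that the program can exit and report the cycle rather than hang.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the channel's mRCWNLock.
std::mutex gRCWNLock;

// Models one startup race. Returns true if the task that the worker posted
// synchronously got to run on the "main" thread, false if both threads would
// be stuck waiting on each other.
bool SyncDispatchCompletesWhileWorkerHoldsLock() {
  std::promise<void> workerHasLock;
  std::promise<bool> mainOutcome;  // true: main was able to run the task
  std::future<void> hasLock = workerHasLock.get_future();
  std::future<bool> outcome = mainOutcome.get_future();
  bool taskCompleted = false;

  std::thread cacheIO([&] {
    // The worker takes the lock first ...
    std::lock_guard<std::mutex> hold(gRCWNLock);
    workerHasLock.set_value();
    // ... and, still holding it, waits for the main thread to run its task
    // (the synchronous dispatch step in the stack).
    taskCompleted = outcome.get();
  });

  // "Main" thread: it needs the same lock before it can return to its event
  // loop and run the posted task.
  hasLock.wait();
  bool gotLock = gRCWNLock.try_lock();
  if (gotLock) {
    gRCWNLock.unlock();
  }
  // A real main thread would block on lock() here forever. The model reports
  // the failed attempt instead, which lets the worker give up and exit.
  mainOutcome.set_value(gotLock);

  cacheIO.join();
  assert(taskCompleted == gotLock);
  return taskCompleted;
}
```

In this model the call returns false: the worker never releases the lock before its task runs, and the task can never run because the main thread needs that lock first.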

Kershaw, do I recall correctly that we have this SyncRunnable only because of some test? EnsureNSSInitializedChromeOrContent should always be called on the main thread.

Dana, do you know what has changed recently?

Severity: -- → S3
Flags: needinfo?(kershaw)
Flags: needinfo?(dkeeler)
Priority: -- → P1
Whiteboard: [necko-triaged]

(In reply to Dragana Damjanovic [:dragana] from comment #6)

Kershaw, do I recall correctly that we have this SyncRunnable only because of some test? EnsureNSSInitializedChromeOrContent should always be called on the main thread.

No, I think this is a different problem, and this bug may be a regression from bug 1634065.

Flags: needinfo?(kershaw)

I'm not sure bug 1634065 would have changed this one way or another. It seems like we have a preexisting issue where if EnsureNSSInitializedChromeOrContent() has never been called, it could think it needs to dispatch to the main thread, even if NSS has already been initialized (which causes a problem if the currently running code is holding a lock that the main thread is waiting on). My guess is if we replaced nsCOMPtr<nsISupports> psm = do_GetService(PSM_COMPONENT_CONTRACTID, &rv); with EnsureNSSInitializedChromeOrContent() at [0], this wouldn't happen.

[0] https://searchfox.org/mozilla-central/rev/6bb59b783b193f06d6744c5ccaac69a992e9ee7b/netwerk/base/nsNetUtil.cpp#2718
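
To illustrate the idea behind this suggestion, here is a rough model in standard C++. It is a sketch under assumptions, not the real PSM code: `InitState`, `EnsureInitialized` and `SyncDispatchesFromWorker` are hypothetical names, and the blocking dispatch to the main thread is replaced by a counter. The point it tries to show is that the helper only skips the main-thread round trip if the helper itself has already run once.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct InitState {
  // The thread that creates the state plays the role of the main thread.
  std::thread::id mainThread = std::this_thread::get_id();
  // Set only by EnsureInitialized() itself, not by other init paths.
  std::atomic<bool> ensured{false};
  // How often an off-main-thread caller would have had to block on main.
  std::atomic<int> syncDispatches{0};

  // Hypothetical stand-in for the "ensure initialized" helper.
  void EnsureInitialized() {
    if (ensured.load()) {
      return;  // fast path: no main-thread round trip needed
    }
    if (std::this_thread::get_id() != mainThread) {
      // The real code would block here on a synchronous dispatch to the
      // main thread; the model only counts the attempt.
      ++syncDispatches;
    }
    ensured.store(true);
  }
};

// Returns how many blocking dispatches a worker thread would perform,
// with or without an early call on the main thread.
int SyncDispatchesFromWorker(bool eagerInitOnMain) {
  InitState state;
  if (eagerInitOnMain) {
    state.EnsureInitialized();  // done once, early, on the main thread
  }
  std::thread worker([&] { state.EnsureInitialized(); });
  worker.join();
  assert(state.ensured.load());
  return state.syncDispatches.load();
}
```

With `eagerInitOnMain` set, the worker takes the fast path and never needs the main thread, so it cannot wait on it while holding a lock. Without it, the worker performs one blocking dispatch.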

Flags: needinfo?(dkeeler)

ni myself to take a look.

Assignee: nobody → kershaw
Flags: needinfo?(kershaw)

I agree with Dana. Calling EnsureNSSInitializedChromeOrContent() in net_EnsurePSMInit seems to be the best way to fix this.

Flags: needinfo?(kershaw)
Pushed by kjang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5f0a8b3326e7 Call EnsureNSSInitializedChromeOrContent() during nsHttpHandler initialization r=necko-reviewers,dragana

Could you try to use this build to verify if this issue is fixed?

Thanks.

Flags: needinfo?(tetsuharu.ohzeki)
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 86 Branch

(In reply to Kershaw Chang [:kershaw] from comment #14)

Could you try to use this build to verify if this issue is fixed?

Thank you for your effort!
This build seems to fix the bug. However, I cannot confidently confirm that your patch fixed it, because the bug is most reproducible right after updating Firefox (in other words, with recent builds it is hard to reproduce at any other time).

I'll check again with the next Nightly build, which will arrive tomorrow morning (UTC+9).

Flags: needinfo?(tetsuharu.ohzeki)

(In reply to Tetsuharu OHZEKI [:tetsuharu] (UTC+9) from comment #16)

I'll check again with the next Nightly build, which will arrive tomorrow morning (UTC+9).

I think this has been fixed.
Thanks!

I see this on Linux too: Fedora 33 / Firefox 85.

I was unable to reproduce the issue with Firefox 85.0a1 (2020-11-30), either before or after updating it, on macOS 10.15.7.

The reporter says the fix is working (see Comment 16 and Comment 17), so I will remove the qa+ flag based on that. If further investigation is needed, please don't hesitate to ni me.

Flags: qe-verify+
See Also: → 1689032
Blocks: 1689032
See Also: 1689032

Comment on attachment 9197123 [details]
Bug 1679933 - Call EnsureNSSInitializedChromeOrContent() during nsHttpHandler initialization

Beta/Release Uplift Approval Request

  • User impact if declined: Firefox could freeze on start up.
    See bug 1689032. It seems some users have reported the same issue.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This patch is quite straightforward and has already been on beta and nightly for a while.
  • String changes made/needed: N/A
Attachment #9197123 - Flags: approval-mozilla-release?

Do we know when/why this started? Any idea how many users may have been affected, and why?

Severity: S3 → S1
Flags: needinfo?(kershaw)

(In reply to Julien Cristau [:jcristau] from comment #21)

Do we know when/why this started? Any idea how many users may have been affected, and why?

I think the code that triggers this deadlock was in bug 1325341 (since the lock is mRCWNLock), so it has been there for years. I think it's probably some recent changes that change the thread timings and make this deadlock happen more often.
Unfortunately, I can't tell how many users are affected, since it's all about timing to hit this.

Flags: needinfo?(kershaw)

(In reply to Kershaw Chang [:kershaw] from comment #22)

(In reply to Julien Cristau [:jcristau] from comment #21)

Do we know when/why this started? Any idea how many users may have been affected, and why?

I think the code that triggers this deadlock was in bug 1325341 (since the lock is mRCWNLock), so it has been there for years. I think it's probably some recent changes that affect the thread timings and make this deadlock happen more often.
Unfortunately, I can't tell how many users are affected, since it's all about timing to hit this.

In reply to "when", I never experienced this problem up to and including 84.0.2. I first got this problem in 85.0, and still get it in 85.0.1.

Comment on attachment 9197123 [details]
Bug 1679933 - Call EnsureNSSInitializedChromeOrContent() during nsHttpHandler initialization

fixing a deadlock on startup, approved for 85.0.2

Attachment #9197123 - Flags: approval-mozilla-release? → approval-mozilla-release+

Is this worth taking on ESR78 to be safe?

Flags: needinfo?(kershaw)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #27)

Is this worth taking on ESR78 to be safe?

OK, let's uplift this to ESR78 to be safe.

Flags: needinfo?(kershaw)

Comment on attachment 9197123 [details]
Bug 1679933 - Call EnsureNSSInitializedChromeOrContent() during nsHttpHandler initialization

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Firefox could freeze on start up.
  • User impact if declined: Firefox could freeze on start up.
  • Fix Landed on Version: 86
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch is already verified on 86.
  • String or UUID changes made by this patch: N/A
Attachment #9197123 - Flags: approval-mozilla-esr78?

Tried again reproducing this issue (following some leads from Bug 1689032 as well) in order to verify this on Firefox 85.0.2 and on ESR, but had no luck.

Tried creating dirty profiles (adding addons, cache from websites, etc), installing and uninstalling the browser, updating it over and over from 84.0.2 to 85.0.1, and we couldn't encounter any freezes. Tests were performed on macOS 11.2, Ubuntu 20.04 and Windows 10.

Comment on attachment 9197123 [details]
Bug 1679933 - Call EnsureNSSInitializedChromeOrContent() during nsHttpHandler initialization

Low-risk fix for a longstanding issue which can cause startup freezes in some situations. Approved for 78.8esr.

Attachment #9197123 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+