Closed Bug 1871011 Opened 1 year ago Closed 1 year ago

cannot browse any website, on multiple OSs

Categories

(Core :: Security: PSM, defect, P1)

Firefox 120
defect

Tracking


RESOLVED FIXED
124 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox121 --- wontfix
firefox122 --- wontfix
firefox123 --- fixed
firefox124 --- fixed

People

(Reporter: twain43, Assigned: keeler)

References

(Regression)

Details

(Keywords: regression, Whiteboard: [psm-assigned])

Attachments

(4 files, 2 obsolete files)

Steps to reproduce:

Updated to Firefox 120+ (the issue persists from Firefox 120 through 121; I just tried to update and had to roll back), on both macOS Monterey 12.7.2 and Windows 11 (latest 12/2023 update).

I am familiar with the possibility of issues with plugins and extensions, so I also tried a clean install, backing up and deleting all my Profile folders on both systems.
I disabled any proxy, HTTPS-Only and safe DNS settings to rule out any issue related to misconfiguration. No luck.

Rolling back to Firefox 119.0.1 (clean or with -force-downgrade option for Profile recovery) works fine on both platforms.

Actual results:

Firefox starts fine, but cannot connect to ANY website at all.

Depending on the system, the following happens:

  • On macOS, the "connection timed out" error occurs, yet if you keep retrying and refreshing, the browser eventually starts working after 5 to 10 minutes (the timing is random, AFAICT). If you close the program, however, the browser gets stuck again;
  • On Windows 11, Firefox cannot connect to any website at all, not even after waiting and refreshing the pages.

Expected results:

Browsing the web without issues.

Can you use https://mozilla.github.io/mozregression/ to get us a regression range?

Flags: needinfo?(twain43)

The Bugbug bot thinks this bug should belong to the 'Core::Networking' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Networking
Product: Firefox → Core

(In reply to Robert Longson [:longsonr] from comment #1)

Can you use https://mozilla.github.io/mozregression/ to get us a regression range?

Yes I did...I set release dates for 119.0.1 (last good, released on 2023-11-07) to 120.0 (first bad, released on 2023-11-21).

Bisecting on mozilla-central [2023-11-07 to 2023-11-21]:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5d6699b34edce04ffd8886be86de9d604d88a89a&tochange=7bc9f9c659dd5c04c341cef4f898cb08574d9cdd

At a certain point, the GUI switched to:
"Bisecting on autoland [5d6699b3 - f1fb5f0a]"
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=5d6699b34edce04ffd8886be86de9d604d88a89a&tochange=f1fb5f0afb5896b5c30e6a9b1439a9c4de4a3db2

All tested builds failed...so I guess this might be something edited immediately after 119.0.1.
I am attaching the log of the GUI, hopefully it can provide more info...

Flags: needinfo?(twain43)
Attached file mozregression GUI log (obsolete)

This is what your log says:

Bug 1849265 - P1 Synchronize WorkerPrivate::mParentStatus with mStatus while entering Killing status. r=asuth

Differential Revision: https://phabricator.services.mozilla.com/D191633

So basically that revision might have broken something?
Not a tech, here, just a worried user. ;)
Just let me know if I can do anything else to help.

Can you type about:support in the browser's address bar, and paste its contents here?

Actually, I redid the mozregression test, since I noticed I was only testing 121+ builds, and I guessed this was due to a mismatch between "release dates" and actual "build dates".

I am attaching a new log, this time covering tests of BUILDS ranging from v.119 to v.120 instead of using dates, to narrow it down better. This should provide a better regression range.

Attached file Mozregression v119-120
Attachment #9369620 - Attachment is obsolete: true

(In reply to Mayank Bansal from comment #7)

Can you type about:support in the browser's address bar, and paste its contents here?

Sure. I am attaching it as well (this is for the working 119.0.1 version currently running...)

Attached file raw data from about:support page (obsolete)

I think about:support from the version of Firefox that's not working will be more useful.
You may also want to share logs from the new version using the steps here: https://firefox-source-docs.mozilla.org/networking/http/logging.html

Flags: needinfo?(bugmail)
Keywords: regression
Regressed by: 1848815

What happens if you set security.enterprise_roots.enabled to false in about:config?

Flags: needinfo?(twain43)
Attachment #9369628 - Attachment is obsolete: true
Flags: needinfo?(twain43)

(In reply to Robert Longson [:longsonr] from comment #13)

What happens if you set security.enterprise_roots.enabled to false in about:config?

No changes.

This logs the following actions:
open a new tab, trying to browse to a bookmark;
open a new tab, trying to browse to www.youtube.com

(In reply to twain43 from comment #15)

(In reply to Robert Longson [:longsonr] from comment #13)

What happens if you set security.enterprise_roots.enabled to false in about:config?

No changes.

I rebooted Firefox and now it IS working. So it might be related indeed!

(Clearing needinfo because it sounds like the identified ServiceWorker patch in comment 5 was determined to be an erroneous regression identification, which makes sense because the change cannot affect navigation. Also the suggestion in comment 13 seems to have been correct per comment 17.)

(In reply to twain43 from comment #17)

I rebooted Firefox and now it IS working. So it might be related indeed!

To provide context links, the change to security.enterprise_roots.enabled landed in https://hg.mozilla.org/mozilla-central/rev/81b65e80ca14f2dcc9bfb919173d95d1831a00c0 in Firefox 120 via bug 1848815, which does match the timeline, it sounds like.

Flags: needinfo?(bugmail)

:mhowell, since you are the author of the regressor, bug 1848815, could you take a look? Also, could you set the severity field?

For more information, please visit BugBot documentation.

Flags: needinfo?(mhowell)

All my patch did was flip the default for the enterprise roots pref, I wasn't involved in implementing it at all, so I don't think I can really help here. Sorry. :(

Flags: needinfo?(mhowell)

(In reply to twain43 from comment #16)

Created attachment 9369636 [details]
LOG obtained with about:logging

This logs the following actions:
open a new tab, trying to browse to a bookmark;
open a new tab, trying to browse to www.youtube.com

I saw this error in the log.

2023-12-20 11:14:05.729986 UTC - [Parent 4584: Socket Thread]: V/nsHttp canceling transaction: tls handshake takes too long: tls handshake last 45791ms, timeout is 30000ms.
2023-12-20 11:14:05.729993 UTC - [Parent 4584: Socket Thread]: V/nsHttp nsHttpConnection::CloseTransaction[this=14723ce00 trans=14cb5ea00 reason=804b000e]

Also, given that disabling security.enterprise_roots.enabled seems to work, this issue appears to be related to PSM/NSS.
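
For anyone scanning a similar log, the quoted nsHttp line can be picked apart with a small script. This is only a sketch: the helper name parseHandshakeTimeout is made up for illustration, and the regular expression assumes the message wording stays exactly as in the line quoted above.

```javascript
// Hypothetical helper (not part of Firefox): extracts the TLS handshake
// duration and the timeout from an nsHttp log line like the one quoted above.
function parseHandshakeTimeout(line) {
  const match = /tls handshake last (\d+)ms, timeout is (\d+)ms/.exec(line);
  if (!match) {
    return null; // the line is not a handshake-timeout message
  }
  const elapsedMs = Number(match[1]);
  const timeoutMs = Number(match[2]);
  return { elapsedMs, timeoutMs, overrunMs: elapsedMs - timeoutMs };
}

// Log line copied from the attached log excerpt.
const line =
  "2023-12-20 11:14:05.729986 UTC - [Parent 4584: Socket Thread]: " +
  "V/nsHttp canceling transaction: tls handshake takes too long: " +
  "tls handshake last 45791ms, timeout is 30000ms.";

console.log(parseHandshakeTimeout(line));
```

For the line above, the handshake ran 15791 ms past the 30000 ms limit before the transaction was canceled.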

Component: Networking → Security: PSM

It would be helpful to know what certificates are being imported. Please:

  • Enable the browser console by setting the preference devtools.chrome.enabled to true in about:config
  • Open the browser console with ctrl + shift + j (command + shift + j on macOS)
  • Run the following:
    Cc["@mozilla.org/psm;1"].getService(Ci.nsINSSComponent).getEnterpriseRoots()
    Cc["@mozilla.org/psm;1"].getService(Ci.nsINSSComponent).getEnterpriseIntermediates()
  • Menu-click on the output from each of those commands, click Copy Object, and paste the results in a text file and email that to me.

Thank you!
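
If copying the nested output by hand is awkward, one possible alternative is to serialize the result to text first. The sketch below assumes that each call returns an array of byte arrays (one per certificate); the helper name bytesToBase64 is hypothetical and the sample data is a stand-in so the snippet also runs outside Firefox.

```javascript
// Hypothetical helper: converts an array of byte arrays into base64 strings
// so the whole result can be printed and copied as one line of text.
function bytesToBase64(certs) {
  return certs.map(bytes =>
    btoa(Array.from(bytes, b => String.fromCharCode(b)).join(""))
  );
}

// In the Browser Console, assuming the call returns an array of byte arrays:
//   const roots = Cc["@mozilla.org/psm;1"]
//     .getService(Ci.nsINSSComponent).getEnterpriseRoots();
//   console.log(JSON.stringify(bytesToBase64(roots)));

// Stand-in data (not real certificates) so the sketch runs anywhere.
const sample = [[0x30, 0x82, 0x01], [0x30, 0x03]];
console.log(JSON.stringify(bytesToBase64(sample)));
```

The printed JSON string can then be pasted into a text file as requested above.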

Flags: needinfo?(twain43)

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #22)

It would be helpful to know what certificates are being imported. Please:

  • Enable the browser console by setting the preference devtools.chrome.enabled to true in about:config
  • Open the browser console with ctrl + shift + j (command + shift + j on macOS)
  • Run the following:
    Cc["@mozilla.org/psm;1"].getService(Ci.nsINSSComponent).getEnterpriseRoots()
    Cc["@mozilla.org/psm;1"].getService(Ci.nsINSSComponent).getEnterpriseIntermediates()
  • Menu-click on the output from each of those commands, click Copy Object, and paste the results in a text file and email that to me.

Thank you!

I am trying to do so...but the nested lines in the "Array" result seem to go on forever...there's bound to be an "export all sub-lines" command or option, somewhere...if I try to "save all", the console just saves the top-most results (so just "Array[]")...

Flags: needinfo?(twain43)

I replied to your email, but the trick is to menu-click on just the output of each command (it'll look like an arrow pointing to the left and then a triangle pointing to the right and then Array... - click on the Array part) and select Copy Object.

I provided you with some screenshots via email...clicking "Copy Object" while on the Array part seems only to export: "[]"...

The severity field is not set for this bug.
:keeler, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(dkeeler)

Building certificate chains (at startup, no less) is slow. Using the policy
APIs to determine if an enterprise certificate is a trusted root or an
intermediate should be faster.

Assignee: nobody → dkeeler
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

I've been communicating with the reporter. This approach appears to solve the issue on macOS. I've asked them to file a new bug so we can investigate the issue on Windows.

Severity: -- → S2
Flags: needinfo?(dkeeler)
Priority: -- → P1
Whiteboard: [psm-assigned]

Set release status flags based on info from the regressing bug 1848815

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #28)

I've been communicating with the reporter. This approach appears to solve the issue on macOS. I've asked them to file a new bug so we can investigate the issue on Windows.

See bug 1873371, just posted. I uploaded a mozlog as well.

Pushed by dkeeler@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f760cb85cb74 use policy APIs rather than building certificate chains to determine enterprise roots on macOS r=jschanck
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 123 Branch

The patch landed in nightly and beta is affected.
:keeler, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox122 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(dkeeler)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 123 Branch → ---
Regressions: 1873851
Regressions: 1873724
Duplicate of this bug: 1873932

:keeler Fx123 goes to beta next week, wondering if you might have a fix before then?
This seems risky for a Fx122 uplift but what do you think?

See Also: 1873932
No longer duplicate of this bug: 1873932
Regressions: 1873932

I'm not sure this is going to make 122. I'm working on bug 1874054 to prevent the issue we saw in bug 1873851 first.

Flags: needinfo?(dkeeler)

Setting Fx122 to wontfix given the risk.

Pushed by dkeeler@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e0ed03857de1 use policy APIs rather than building certificate chains to determine enterprise roots on macOS r=jschanck
Status: REOPENED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch

Dana, is that something we would want to uplift to beta 123 or should it ride the 124 train? Thanks

Flags: needinfo?(dkeeler)

Potentially - I first want to check with the reporter that it's working as intended.

Twain - does the latest Nightly work for you? Thanks!

Flags: needinfo?(dkeeler) → needinfo?(twain43)

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #43)

Potentially - I first want to check with the reporter that it's working as intended.

Twain - does the latest Nightly work for you? Thanks!

Hello, I just downloaded and tried it.
Working fine, here.

Flags: needinfo?(twain43)

Comment on attachment 9371202 [details]
Bug 1871011 - use policy APIs rather than building certificate chains to determine enterprise roots on macOS r?jschanck

Beta/Release Uplift Approval Request

  • User impact if declined: Potential for very slow startup on macOS.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This was manually verified. Also, we landed bug 1874054 to prevent disastrous things like the imported roots affecting the trust of built-in roots.
  • String changes made/needed: none
  • Is Android affected?: No
Attachment #9371202 - Flags: approval-mozilla-beta?

Comment on attachment 9371202 [details]
Bug 1871011 - use policy APIs rather than building certificate chains to determine enterprise roots on macOS r?jschanck

Approved for 123 beta 6, thanks.

Attachment #9371202 - Flags: approval-mozilla-beta? → approval-mozilla-beta+