Closed Bug 1898884 Opened 9 months ago Closed 9 months ago

Firefox seems to be stuck in some sort of migration when security.nocertdb is true, causing certain parts of the browser UI to malfunction (first time starting 127; workaround: restart 127)

Categories

(Firefox :: General, defect)

Firefox 127
Desktop
Unspecified
defect

Tracking

VERIFIED FIXED
128 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox126 --- unaffected
firefox127 + wontfix
firefox128 --- verified

People

(Reporter: aoia7rz7l, Assigned: ssachdev)

References

Details

(Keywords: regression)

Attachments

(1 file)

Tested in 127.0beta on both Windows 10 and Linux.

Prerequisites:

  1. security.nocertdb is set to true.
  2. browser.migration.version is below 145. (e.g. it is 144 in 126.0b9)
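One way to set up these prerequisites is a user.js file in the profile directory. The sketch below only renders the file contents; `buildUserJs` is a hypothetical helper, not part of Firefox. Note that user.js is re-applied on every startup, so the browser.migration.version line should be removed after the first run, otherwise the pref is reset each time.

```javascript
// Hypothetical helper: render user.js lines from a plain object of prefs.
// JSON.stringify produces the literal forms user_pref() expects for
// strings, integers, and booleans.
function buildUserJs(prefs) {
  return Object.entries(prefs)
    .map(
      ([name, value]) =>
        `user_pref(${JSON.stringify(name)}, ${JSON.stringify(value)});`
    )
    .join("\n");
}

// The two prerequisites from this report.
const reproPrefs = {
  "security.nocertdb": true,
  "browser.migration.version": 144,
};

const userJs = buildUserJs(reproPrefs);
console.log(userJs);
```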

STR:

Start Firefox (in a new profile if you want).

Expected Behavior:

Nothing remarkable. browser.migration.version is set to 145 (it is when using a vanilla 127.0beta profile).

Actual Behavior:

Certain parts of the UI seem to be completely broken. These include

  1. FxA-related menu items showing up in AppMenu and Firefox View when identity.fxaccounts.enabled is set to false.
  2. Firefox Home is completely blank and the entire Firefox Home Content section in about:preferences#home disappeared.
  3. Firefox Suggest-related options in Address Bar — Firefox Suggest disappeared and that section is now titled Address Bar.
  4. Unable to close Firefox when closing the last tab. I had to do so via Quit in the AppMenu or the close button in the top right corner.
  5. Context menu when right-clicking the tab bar is malformed, showing duplicate options such as Duplicate Tab and Duplicate Tabs and other irrelevant ones like Hide Toolbars and Exit Full Screen Mode.
  6. Clicking OK on the first-time confirmation dialog for browser.tabs.haveShownCloseAllDuplicateTabsWarning does nothing.
  7. Enabling and disabling browser.taskbar.lists.enabled has no effect on Firefox's Windows Jumplist.
  8. All fields in about:support are empty.

browser.migration.version is stuck, for example, at 144, if you were upgrading from 126.0b9.

Workaround 1:

Set browser.migration.version to 145 and restart Firefox.

Workaround 2:

Set security.nocertdb to false and restart Firefox. Once browser.migration.version is automatically raised to 145 I can set it back to true without any problem.

I wasn't able to find a bad Nightly build so I resorted to mozregression --repo mozilla-beta and it returned

Last good revision: 0ec860346abeb203e68bd7f5c00b3b19dae388ce
First bad revision: e78a64baff46c8e9bee42350c1048c8569824619
Pushlog: https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=0ec860346abeb203e68bd7f5c00b3b19dae388ce&tochange=e78a64baff46c8e9bee42350c1048c8569824619

which doesn't seem very useful when it's essentially the difference between 126.0b10 and 127.0b1. However, https://searchfox.org/mozilla-central/source/browser/components/BrowserGlue.sys.mjs#3801 points to bug 1889232, which is within this regression range.

The Bugbug bot thinks this bug should belong to the 'Firefox::Migration' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Migration

I worry that this is related to bug 1890883 / bug 1898323. It's the end of the 127 cycle so we're a bit late to try to fix this at this point.

When this happens, are there any errors in the browser console?

Do you use any saved password or creditcard functionality in Firefox, and if so, can you still fill those without being prompted?

What happens if you go to about:preferences and try to tick either of the "require device sign in to..." checkboxes?

Blocks: 1898323
Component: Migration → General
Flags: needinfo?(aoia7rz7l)

The bug has a release status flag that shows some version of Firefox is affected, thus it will be considered confirmed.

Status: UNCONFIRMED → NEW
Ever confirmed: true

The bug is marked as tracked for firefox127 (beta). We have limited time to fix this, the soft freeze is in a day. However, the bug still isn't assigned.

:pluk, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(pluk)

(In reply to :Gijs (he/him) from comment #2)

When this happens, are there any errors in the browser console?

I can see

NS_ERROR_ABORT: User canceled primary password entry crypto-SDR.sys.mjs:85
    encrypt resource://gre/modules/crypto-SDR.sys.mjs:85
    setSecurePref resource://gre/modules/shared/FormAutofill.Utils.sys.mjs:207
    setOSAuthEnabled resource://gre/modules/shared/FormAutofill.Utils.sys.mjs:234
    _migrateUI resource:///modules/BrowserGlue.sys.mjs:4504
    BG__beforeUIStartup resource:///modules/BrowserGlue.sys.mjs:1465
    BG_observe resource:///modules/BrowserGlue.sys.mjs:1113

NS_ERROR_ABORT: User canceled primary password entry crypto-SDR.sys.mjs:85

immediately after startup. Note that the second error doesn't seem to be expandable.

With 127.0b9 on Windows, when both .optout prefs were set after the migration and I lowered the value of browser.migration.version to force it to "re-migrate", I saw an extra

NS_ERROR_FAILURE: Couldn't decrypt string crypto-SDR.sys.mjs:197
    decrypt resource://gre/modules/crypto-SDR.sys.mjs:197
    getSecurePref resource://gre/modules/shared/FormAutofill.Utils.sys.mjs:193
    getOSAuthEnabled resource://gre/modules/shared/FormAutofill.Utils.sys.mjs:223
    _migrateUI resource:///modules/BrowserGlue.sys.mjs:4489
    BG__beforeUIStartup resource:///modules/BrowserGlue.sys.mjs:1465
    BG_observe resource:///modules/BrowserGlue.sys.mjs:1113

before the two NS_ERROR_ABORT errors. On earlier betas (e.g. 127.0b1) this NS_ERROR_FAILURE did not appear at startup, but it appeared every time I opened about:preferences. Removing both .optout prefs makes the NS_ERROR_FAILURE error go away. I did not see any NS_ERROR_FAILURE error on Linux builds.

Do you use any saved password or creditcard functionality in Firefox, and if so, can you still fill those without being prompted?

Unfortunately I don't really use these functionalities so I don't know what to expect from them.

What happens if you go to about:preferences and try to tick either of the "require device sign in to..." checkboxes?

I only see a Require device sign in to fill and manage passwords checkbox option on Windows, and nothing on Linux.

On Windows, the checkbox appears to be selected by default. "Unselecting" the option will set security.osreauthenticator.blank_password to true, and both security.osreauthenticator.password_last_changed_hi and security.osreauthenticator.password_last_changed_lo to a non-zero integer.

However, the same option is automatically "selected" again as soon as I reload about:preferences. If I do not touch the option instead and disable security.nocertdb (and so on), it is automatically unselected after restart. If anything I'd say this is just another case of interrupted BrowserGlue startup breaking the UI.

Flags: needinfo?(aoia7rz7l)

OK, I've confirmed this is related to the changes in bug 1890883.

The reason this is breaking is that when this pref is set, any calls to the login manager's encrypt function throw an exception, similar to if the user were prompted for a primary password and clicked "cancel" - it means we cannot encrypt/decrypt with the profile's own secure storage. This sort of makes sense because if the nocertdb pref is turned on, no such storage exists.

We are trying to encrypt the opt-out ("turn this feature off") value into the prefs storage for the credit card / pwd manager autofill OS authentication setting, which is what is invoking the encrypt function (via https://searchfox.org/mozilla-central/rev/b476ffaef761ff85c012e2d93050cf444ff7be34/toolkit/components/passwordmgr/LoginHelper.sys.mjs#1617 and its FormAutofill cousin).

However, after the changes in bug 1898323, we now also check if a migration for Firefox 127 has already run by checking the last mstone value. This means that if you hit this case and update a second time, the migration will succeed because we end up deciding that you had this feature turned on in your profile, so we don't end up writing to prefs or needing to encrypt/decrypt anything. That wasn't the case when this bug was filed, so the symptoms have changed somewhat since comment 0.

Of course, the feature makes no sense in the context of having this security.nocertdb pref enabled as you cannot store passwords anyway so there's no point "protecting" the non-existing passwords with an OS auth prompt.
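The failure mode described above can be modelled in a few lines. This is a simplified sketch, not the real BrowserGlue code or the landed patch: the `Map` stands in for the prefs service, `os-auth.optout` is a made-up pref name, and the `guardNoCertDb` option is only one possible shape of a fix.

```javascript
// Simplified model; none of these functions are the real Firefox code.
const UI_VERSION = 145;

// Stand-in for the encrypt call, which throws when the profile has no
// secure storage (security.nocertdb is true).
function encrypt(prefs, value) {
  if (prefs.get("security.nocertdb")) {
    throw new Error("NS_ERROR_ABORT: User canceled primary password entry");
  }
  return `encrypted:${value}`;
}

function migrateUI(prefs, { guardNoCertDb }) {
  if (prefs.get("browser.migration.version") < 145) {
    // With the guard on, skip the secure write when there is no secure
    // storage to write to.
    const skipSecureWrite = guardNoCertDb && prefs.get("security.nocertdb");
    if (!skipSecureWrite) {
      // "os-auth.optout" is a hypothetical pref name for illustration.
      prefs.set("os-auth.optout", encrypt(prefs, "opt-out"));
    }
  }
  // Only reached if nothing above threw, so an exception leaves the
  // version pref at its old value.
  prefs.set("browser.migration.version", UI_VERSION);
}

// Stand-in for the startup path that calls the migration.
function startup(prefs, options) {
  try {
    migrateUI(prefs, options);
    return "startup completed";
  } catch (e) {
    return `startup aborted: ${e.message}`;
  }
}

const makeProfile = () =>
  new Map([
    ["security.nocertdb", true],
    ["browser.migration.version", 144],
  ]);

const stuck = makeProfile();
const stuckResult = startup(stuck, { guardNoCertDb: false });

const guarded = makeProfile();
const guardedResult = startup(guarded, { guardNoCertDb: true });

console.log(stuckResult, stuck.get("browser.migration.version"));
console.log(guardedResult, guarded.get("browser.migration.version"));
```

In the unguarded run the exception escapes the migration before the version pref is written, which matches the "stuck at 144" symptom from comment 0.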

Note that the mstone check is a bit fickle; it will go back to being false if you update to a beta with 128 in the version number (as that's not 127). So in the scenario:

  1. update to 127, migration breaks. You use the browser anyway...
  2. update to a 128 beta (starting next week)

then we'd go back to the scenario in comment 0. The same applies to users who have skipped 127 beta altogether (i.e. are updating from a 126 beta directly to 128 or later).
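To make the fickleness concrete, here is a minimal sketch that assumes the check amounts to asking whether the last recorded milestone string belongs to a 127 build. `ranOn127` and the "127.0"-style string format are assumptions for illustration, not the actual implementation.

```javascript
// Hypothetical predicate: true only when the recorded milestone's major
// version is 127.
function ranOn127(lastMstone) {
  return lastMstone.split(".")[0] === "127";
}

const mstoneChecks = ["126.0", "127.0", "128.0"].map((mstone) => ({
  mstone,
  treatedAsMigrated: ranOn127(mstone),
}));

console.log(mstoneChecks);
```

Only a 127 milestone counts as "already migrated"; both a 126 and a 128 value fall through to the path that attempts the secure write.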

So we do actually need to do something here to remedy this, but we're probably fine for 127 release/beta users. And of course, it seems likely that very few people have this pref set...

No longer blocks: 1898323
Depends on: 1898323
Keywords: regression
Summary: Firefox seems to be stuck in some sort of migration when security.nocertdb is true, causing certain parts of the browser UI to malfunction → Firefox seems to be stuck in some sort of migration when security.nocertdb is true, causing certain parts of the browser UI to malfunction (first time starting 127; workaround: restart 127)
Assignee: nobody → ssachdev
Status: NEW → ASSIGNED
Attachment #9405782 - Attachment description: Bug 1898884 - Firefox seems to be stuck in some sort of migration when security.nocertdb is true, causing certain parts of the browser UI to malfunction (first time starting 127; workaround: restart 127). r=gijs! → Bug 1898884 - Disabling and hiding the OS Authentication checkboxes when "security.nocertdb" is true. r=gijs!
Pushed by ssachdev@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/011f084bdf6d Disabling and hiding the OS Authentication checkboxes when "security.nocertdb" is true. r=Gijs,settings-reviewers,credential-management-reviewers,dimi
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 128 Branch
Flags: needinfo?(pluk)

I cannot reproduce the issues mentioned in comment 0 by opening Firefox 126.0b9 with a new profile with a user.js that sets security.nocertdb: true on Windows 10 x64. Also, after updating to Firefox 128.0, browser.migration.version is updated to 148 from 144 (126.0b9).
Can you please verify if the issues stated in comment 0 are still reproducible with Firefox 128.0? Or are there any additional steps that need to be taken to reproduce the issue besides having security.nocertdb: true on startup? Thank you in advance!

Flags: needinfo?(aoia7rz7l)

(In reply to Alexandru Trif, Desktop QA [:atrif] from comment #10)

I cannot reproduce the issues mentioned in comment 0 by opening Firefox 126.0b9 with a new profile with user.js, which sets the security.nocertdb: true on Windows 10x64.

Hi! Sorry for the late reply, but this issue only affected 127.0beta. I only mentioned 126.0b9 just to show that the value of browser.migration.version had incremented by 1 after upgrading from 126.0b9 to 127.0b1.

Also after updating to Firefox 128.0 the browser.migration.version is updated to 148 from 144(126.0b9).
Or are there any additional steps that need to be taken to reproduce the issue besides having security.nocertdb: true on startup? Thank you in advance!

If browser.migration.version is immediately updated to 148 after startup, that means the browser is no longer stuck in migration.

AFAICT this issue is no longer reproducible in 128.0.

Flags: needinfo?(aoia7rz7l)

(In reply to aoia7rz7l from comment #11)

Thank you for the response. I confirm that updating to 128.0 from 127.0 will update browser.migration.version to 148 while having security.nocertdb: true on startup. Closing this per this comment and comment 11.

Status: RESOLVED → VERIFIED
Flags: qe-verify+