Open Bug 2034342 Opened 1 month ago Updated 13 hours ago

something wrong - certificate error notifications - with gmail SSL connections (IMAP and SMTP at least)

Categories

(MailNews Core :: Networking, defect, P1)

Thunderbird 151
defect

Tracking

(thunderbird151+ affected, thunderbird152 affected, thunderbird153+ affected)

Tracking Status
thunderbird151 + affected
thunderbird152 --- affected
thunderbird153 + affected

People

(Reporter: mkmelin, Assigned: leggert)

References

(Blocks 2 open bugs)

Details

(Keywords: regression, regressionwindow-wanted)

Attachments

(1 file)

We have had a few people in the matrix channel report SSL connection problems when sending.
A few times the UI has reported that the certificate for imap.gmail.com was not valid. (Cert not from a valid source, I think the notification said.)
When that happened, I tried "Test connection to server" in the account settings, and it said it couldn't connect.
But that's wrong, as openssl s_client -connect imap.gmail.com:993 worked just fine when I tried that instead.
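
For anyone who wants to repeat that out-of-band check without the openssl CLI, here is a minimal Python sketch of the same idea. It validates against the system trust store through OpenSSL, not through NSS as Thunderbird does, so a success here only shows that the server's chain looks fine from outside Thunderbird. The function names are made up for illustration.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Build a client context that verifies the chain and the hostname."""
    # create_default_context() turns on CERT_REQUIRED and hostname checking.
    return ssl.create_default_context()


def probe_tls(host: str, port: int, timeout: float = 10.0) -> dict:
    """Open an implicit-TLS connection and report what was negotiated.

    Raises ssl.SSLCertVerificationError if the certificate does not validate
    against the system trust store, or OSError on a network failure.
    """
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),
                "cipher": tls.cipher()[0],
                "not_after": cert.get("notAfter"),
            }
```

Calling probe_tls("imap.gmail.com", 993) would exercise the same endpoint as the s_client command above.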

I saw this on Daily first like a week ago, and it's not consistent. The problem has now, again, resolved itself.
Perhaps some network problem is causing it, but then I don't see how the connection would work outside of Thunderbird.

Maybe networking code changes in mozilla-central?

I've seen that in the past few days too. Not sure if the OpenSSL client working is a sign that something is wrong in TB, or that the problem on Gmail's side is intermittent.

For what it's worth, on the SMTP side of things I managed to turn on debug logs before the issue went away, and it seems like the TCPSocket was raising the error SSL_ERROR_DECRYPT_ERROR_ALERT.

Thanks brendan, sounds like it could be the same cause as bug 2034343

I mean bug 2033073 as the original cause, because it talked about the same error code.

See Also: → 2033073

(In reply to Kai Engert [:KaiE:] from comment #4)

I mean bug 2033073 as the original cause, because it talked about the same error code.

I'm not sure, since bug 2033073 seems to be about mitigations for existing SSL_ERROR_DECRYPT_ERROR_ALERT failures, not about recategorizing new ones. It does implement a strategy to improve robustness against those failures because they might be caused by encryption key rotation, so maybe we could implement the same kind of mitigation, i.e. removing the SSL token from the network's cache and retrying. Though the cache removal seems to be a C++-only thing so we wouldn't be able to do that for SMTP.

Edit: oh you meant sharing the same root cause, not that bug being the originator, I misread, sorry.
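
To make the mitigation idea above concrete ("remove the token from the cache and retry"), here is a rough Python sketch. Everything in it is hypothetical: TokenCache, DecryptErrorAlert and the connect callback are stand-ins for the real networking pieces and do not correspond to any Necko or TCPSocket API.

```python
from typing import Callable, Dict, Optional


class DecryptErrorAlert(Exception):
    """Stand-in for a handshake failing with SSL_ERROR_DECRYPT_ERROR_ALERT."""


class TokenCache:
    """Hypothetical in-memory store of TLS resumption tokens keyed by peer."""

    def __init__(self) -> None:
        self._tokens: Dict[str, bytes] = {}

    def get(self, peer: str) -> Optional[bytes]:
        return self._tokens.get(peer)

    def put(self, peer: str, token: bytes) -> None:
        self._tokens[peer] = token

    def remove(self, peer: str) -> None:
        self._tokens.pop(peer, None)


def connect_with_retry(
    peer: str,
    cache: TokenCache,
    connect: Callable[[str, Optional[bytes]], str],
) -> str:
    """Try to resume; on a decrypt-error alert, evict the token and retry once."""
    try:
        return connect(peer, cache.get(peer))
    except DecryptErrorAlert:
        # The token may predate a server-side key rotation, so forget it
        # and fall back to a full handshake.
        cache.remove(peer)
        return connect(peer, None)
```

The retry is limited to one attempt on purpose: if a full handshake also fails, the error is presumably not caused by a stale token and should surface to the user.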

Summary: something wrong with gmail SSL connections (IMAP and SMTP at least) → something wrong - certificate error notifications - with gmail SSL connections (IMAP and SMTP at least)

Also started seeing this on beta 151.0b1 yesterday. Initial fetch fails with this, second fetch makes one of the gmail accounts work and the other one forget about the error, third fetch also syncs the second gmail account. Haven't really looked at the networking details at all, but all the behaviors match what's been described here so far.

Severity: -- → S2

Ditto on the Daily MacOS in recent days; intermittent cert errors for gmail, that then disappear when I try to investigate.

Do we still want to try and get mozregression working to confirm the regressor? I haven't found a setup that reliably triggers this so far, but I also don't understand the suspected regressor, and understanding it would surely help with finding a setup.

I think it also being in 151 beta tracks, since the regressor was uplifted there.

Version: unspecified → Thunderbird 151
Attached file ssllog.moz_log.zip

Log with export MOZ_LOG=timestamp,sync,nsHttp:5,cache2:5,nsSocketTransport:5,nsHostResolver:5,pipnss:5
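
For readers unfamiliar with the MOZ_LOG value above: it is a comma-separated list in which name:level entries select log modules (5 being verbose) and bare words such as timestamp and sync are options. The Python sketch below is a simplified reading for illustration only, not the real parser; in particular it would misclassify options that themselves take a colon argument.

```python
from typing import Dict, List, Tuple


def parse_moz_log(value: str) -> Tuple[List[str], Dict[str, int]]:
    """Split a MOZ_LOG value into bare options and module:level pairs."""
    options: List[str] = []
    modules: Dict[str, int] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, level = part.partition(":")
        if sep:
            # Entries with a colon are treated as module:level.
            modules[name] = int(level)
        else:
            options.append(part)
    return options, modules
```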

At the moment, an ASan build triggers it all the time for me at startup.

I guess it's possible this will get fixed by the follow up? https://bugzilla.mozilla.org/show_bug.cgi?id=2033073#c18

Duplicate of this bug: 2037680

So from bug 2033073 this should now be resolved on daily and fixed in the next beta if that is the regressor.

Unfortunately I don't think it's resolved. I had it with a later build as well, I believe.

It's only really managed to stick on daily yesterday and they also mentioned bug 2035453 might be needed for beta too. So it's possible there's more to come, but should we be relying on that?

(In reply to Martin Giger [:freaktechnik] from comment #15)

It's only really managed to stick on daily yesterday and they also mentioned bug 2035453 might be needed for beta too. So it's possible there's more to come, but should we be relying on that?

So we don't think (or can't confirm) that we are fixed on daily?

With no response from leggert we're basically out of time for fixing our beta 151. Which wouldn't be a big deal, except that 151 hits release in 1.5 weeks, which would not be good.

And there's now also Bug 2037866 - nsHttpTransaction::PrepareConnInfoForRetry strips HTTPS-RR alt-route on TLS resumption errors

Confirmed it's not yet gone from trunk. I just got it with Daily 2026-05-08

Note that it may not be an issue for 151 beta anymore. https://bugzilla.mozilla.org/show_bug.cgi?id=2033073#c41 (assuming same regressor)

Not that it's needed, but another confirmation: still seeing it on: 152.0a1 (2026-05-08) (aarch64)

MacOS 26.4.1 (25E253)

Linux OpenSUSE Tumbleweed
FWIW, I have no problem with two Gmail accounts using 152.0a1 (2026-05-08) (64-bit).
There's no problem with a different Gmail account using 151.0b3 (64-bit) either.
I've seen the problem before with both, 151.0b3 (64-bit), and daily, albeit with an earlier daily than 2026-05-08.
So could this be a temporary glitch at Google?
The cert I'm getting from imap.gmail.com is from Mon, 20 Apr 2026 08:36:31 GMT, valid through Mon, 13 Jul 2026 08:36:30 GMT.
SHA-256 Fingerprint F0:A3:11:30:12:90:88:5D:C3:29:CF:21:25:B4:AD:D7:41:18:FD:C0:64:F5:91:05:13:1E:76:93:9E:16:3B:CC
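
If others want to compare the certificate they are being served against the fingerprint above, a small Python sketch like the following produces the same colon-separated format from the certificate's DER bytes. The helper name is made up for illustration.

```python
import hashlib


def sha256_fingerprint(der_bytes: bytes) -> str:
    """Return the SHA-256 digest of DER bytes as colon-separated upper-case hex."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

The DER bytes can be obtained in Python from a connected ssl.SSLSocket with getpeercert(binary_form=True). Note that large services may serve different certificates from different front ends, so a mismatch by itself does not prove a problem.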

This might have been related to the series of TLS token persistence patches I did, but Nightly should now work. Is anyone still seeing issues?

I'm still seeing it, intermittently, with 152.0a1 (2026-05-10) (aarch64)

MacOS 26.4.1 (25E253)

Do you have an ssl_tokens_cache.bin file in your profile directory? If yes, quit, delete it and restart. Does the problem still happen?

If yes, temporarily set network.ssl_tokens_cache_persistence to false and see if that fixes it?
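
The two steps above can be sketched in Python as follows, assuming Thunderbird is not running and profile_dir points at the affected profile. The function name and the .bak suffix are made up for illustration; the sketch renames the file instead of deleting it so it can still be inspected later. Writing the pref to user.js re-applies it on every start, so remove the line again to undo the "temporary" setting.

```python
from pathlib import Path

# Standard user.js syntax for overriding a pref at startup.
PREF_LINE = 'user_pref("network.ssl_tokens_cache_persistence", false);\n'


def reset_token_cache(profile_dir: Path, disable_persistence: bool = False) -> bool:
    """Move ssl_tokens_cache.bin aside; optionally pin the persistence pref to false.

    Returns True if a cache file was found and renamed.
    Run this only while Thunderbird is not running.
    """
    cache = profile_dir / "ssl_tokens_cache.bin"
    found = cache.exists()
    if found:
        # Keep a copy for later analysis instead of deleting outright.
        cache.replace(profile_dir / "ssl_tokens_cache.bin.bak")

    if disable_persistence:
        user_js = profile_dir / "user.js"
        existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
        if PREF_LINE.strip() not in existing:
            if existing and not existing.endswith("\n"):
                existing += "\n"
            user_js.write_text(existing + PREF_LINE, encoding="utf-8")
    return found
```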

I did have that file, which I've now deleted. No gmail SSL errors seen so far, but it seemed intermittent before, so may take a while to be sure.

The file doesn't seem to have yet been re-created, although I still have that setting set true.

Thanks very much.

You might see it reoccur. I'm still landing patches against this functionality; it's pref'ed enabled only for Nightly and Early Beta. Let me know if it does?

(In reply to Lars Eggert [:lars] from comment #24)

You might see it reoccur. I'm still landing patches against this functionality; it's pref'ed enabled only for Nightly and Early Beta. Let me know if it does?

Will do; thanks Lars.

Duplicate of this bug: 2040366

I was also affected, I'm using my own local Thunderbird Beta build.

It wasn't sufficient to make the preference change that Lars had suggested in comment 22.

In addition, I had to manually delete the mentioned file from my profile directory.

(In reply to Lars Eggert [:lars] from comment #24)

You might see it reoccur. I'm still landing patches against this functionality; it's pref'ed enabled only for Nightly and Early Beta. Let me know if it does?

How is that set? We should verify that the pref works on the Thunderbird release channels, too. (In other words, make sure that the Thunderbird release channels don't get that functionality.)

Would it be reasonable to NOT set that pref for Thunderbird nightly/beta at all?

Flags: needinfo?(leggert)

Interestingly it's been fine for me so far this week with (the official) b4. Which might be because the early beta flag finally took hold in our builds - the pref isn't even known anymore.

Re beta working, see comment 17.

For Daily, still broken at least today.

Only deleting ssl_tokens_cache.bin didn't help.
I set network.ssl_tokens_cache_persistence to false and didn't see the issue. But I had to revert that to true, as I stopped getting IMAP mail from MS365 at all, for some reason. At least I think that was what caused it, as I had restarted many times while troubleshooting, and then after resetting the pref it instantly started working again.

The problem re-appeared for me with Beta 151 after a reboot, despite having network.ssl_tokens_cache_persistence already set to false.

Why do I have a new file ssl_tokens_cache.bin in my profile directory, despite having the pref set to false?

Are we certain that the bug is related to Lars' work?

The issue we see is a certificate validation error, can that be related to TLS tokens?

Setting network.ssl_tokens_cache_persistence = false has a bug: even when it is set to false, the bin file is still created.

Try deleting the ssl_tokens_cache.bin file and setting network.ssl_tokens_cache_capacity = 0.

Based on the tracking flags in bug 2033073, it appears that the functionality got disabled in 151.

I've updated my local build to the 151-beta-end-code. I didn't get a failure when starting that build. I'll keep using that for another few days and will check whether I can still reproduce or whether it's gone.

Thanks Alice!

Flags: needinfo?(leggert)
See Also: → 2040637

Should be fixed by bug 2040637.

Assignee: nobody → leggert

(In reply to Lars Eggert [:lars] from comment #35)

Should be fixed by bug 2040637.

That bug says it's fixed for 153+ and that 151 is unaffected.
Is that a different issue?

For this behavior here we need a fix that works on the 151 branch.
Did we already get a fix here?

It's possible that latest 151 is already working.
I cannot reproduce the bug currently with the latest 151 beta snapshot, using a local build based on tag FIREFOX_BETA_151_END.

Problem recurred for me, after deleting ssl_tokens_cache.bin; now with Daily 153.0a1 (2026-05-20) (aarch64).

That was with network.ssl_tokens_cache_persistence still true, and the file was recreated.

Have now deleted the file again, and set network.ssl_tokens_cache_persistence to false.

The problem has recurred for me today; 153.0a1 (2026-05-23) (aarch64)

This was despite having deleted the file ssl_tokens_cache.bin, set network.ssl_tokens_cache_persistence to false, and restarted. The file was nevertheless recreated.

My issue:

Sending of the message failed.
Peer reports failure of signature verification or key exchange.
The configuration related to smtp.gmail.com must be corrected.

has reoccurred. It magically went away a couple of weeks ago without a TB change, and came back today. I recently updated to 152.0b1, but I was able to send after the update. This is a couple of days later.

I was seeing this earlier today for a gmail account after updating daily to 153.0a1 (2026-05-24) - I hadn't updated in a few weeks - Troubleshooting Information claims my previous update was to 151.0a1 20260418. Now also seeing it for a fastmail account that had been working well all day; it started failing after I changed some "Copies & Folders" locations in account settings - I didn't touch server settings. I don't see how that could have affected caching or server response.

Works now after setting network.ssl_tokens_cache_persistence to false and deleting ssl_tokens_cache.bin. Working also after changing network.ssl_tokens_cache_persistence back to true - but testing has been minimal.

Problem eventually returned with network.ssl_tokens_cache_persistence=true.
I could see this being a release blocker.

Affected: Thunderbird Beta 152.0 (20260520174921) on Windows 11 Pro 25H2.

Both imap.gmail.com and imap.ziggo.nl showed 'Non-overridable TLS error occurred. Handshake error or probably the TLS version or certificate used by server is incompatible.' Started suddenly with no deliberate changes.

Root cause in my case: corrupted ssl_tokens_cache.bin in profile directory, consistent with the workaround described in this bug. Also had security.enterprise_roots.enabled = true in user.js (Windows enterprise root store enabled), which may have been a contributing factor.

Fix: deleted ssl_tokens_cache.bin, SiteSecurityServiceState.bin, and AlternateServices.bin from profile, and set security.enterprise_roots.enabled = false in user.js. Errors stopped after restart.

Kai, does https://searchfox.org/firefox-main/source/modules/libpref/init/StaticPrefList.yaml#15458 not cause this feature to be disabled on Thunderbird?

Adding a cleaner report to supplement comments 42/43 (accidental duplicates, sorry).

Thunderbird Beta 152.0 (20260520174921), Windows 11 Pro 25H2.

Both imap.gmail.com and imap.ziggo.nl (a Dutch ISP, non-Google) failed simultaneously with 'Non-overridable TLS error'. This indicates the issue is not Gmail-specific and can affect other IMAP servers using TLS.

Onset was sudden with no user-made changes. Profile had ssl_tokens_cache.bin present.

Fix: deleted ssl_tokens_cache.bin (plus SiteSecurityServiceState.bin and AlternateServices.bin), restarted. Errors resolved.

(In reply to Lars Eggert [:lars] from comment #44)

Kai, does https://searchfox.org/firefox-main/source/modules/libpref/init/StaticPrefList.yaml#15458 not cause this feature to be disabled on Thunderbird?

On Thunderbird, IS_EARLY_BETA_OR_EARLIER seems to be set to true for beta and daily(/nightly), so the pref is true for these channels. Most reports in here seem to be from either beta or daily.

Duplicate of this bug: 2042626

Not certain, but this could also be the cause of some intermittent account creation errors I've been seeing with my gmail account, where I set up the account with a good token, but I get a nondescript auth error in accounthub. Eventually things succeed if I just keep pressing the continue button in accounthub without changing any information.

(In reply to Eleanor Dicharry from comment #48)

Yes, I think that is related. We could get a (slightly) better error message there and the OS notification about a failed connection, but we're suppressing them for some reason. Whether doing that is a good idea or not is a discussion for another day…

Priority: -- → P1