Closed Bug 1386601 Opened 7 years ago Closed 5 years ago

Crash downloading mail in nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature

Categories

(Thunderbird :: Security, defect)

x86
Windows 7
defect
Not set
critical

Tracking

(thunderbird_esr52 wontfix, thunderbird_esr60+ affected, thunderbird59 wontfix, thunderbird60 wontfix, thunderbird61 wontfix, thunderbird62 wontfix, thunderbird63 wontfix, thunderbird65 wontfix, thunderbird66 verified)

RESOLVED FIXED
Thunderbird 66.0
Tracking Status
thunderbird_esr52 --- wontfix
thunderbird_esr60 + affected
thunderbird59 --- wontfix
thunderbird60 --- wontfix
thunderbird61 --- wontfix
thunderbird62 --- wontfix
thunderbird63 --- wontfix
thunderbird65 --- wontfix
thunderbird66 --- verified

People

(Reporter: wsmwk, Assigned: KaiE)

References

Details

(5 keywords, Whiteboard: [tbird topcrash][regression:TB58?])

Crash Data

User Story

Beta picture:
no crashes after 60.0b11 but frequent during and before that, for both ...
nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature 
and
nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature

no crashes after 60.0b9, but frequent before that ...
PORT_ArenaAlloc_Util | nssTrust_GetCERTCertTrustForCert | fill_CERTCertificateFields

rare but consistent non-Windows crashes in beta  for nssCertificate_Destroy | <name omitted> | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature 

bug 1453518's signature becomes rare after 60.0b11  nssCertificate_Destroy | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature

--

Assuming the above items did not go away because of a fix, perhaps they have been replace by rising stars...

nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature  bp-e9981019-32db-4d10-93cb-ec2c70190109

and

 nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCertificate | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature  bp-095b69fe-e3fc-4ed9-b640-f9e8c0190112

Attachments

(2 files, 2 obsolete files)

#85 crash nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature

#172 crash nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature

The combination makes this ~top 50-60 crash

nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature
bp-e0b48c1d-1355-40a7-8402-922c10170801.
 0 	nss3.dll	nssCertificate_Destroy	security/nss/lib/pki/certificate.c:95
1 	nss3.dll	CERT_DestroyCertificate	security/nss/lib/certdb/stanpcertdb.c:768
2 	nss3.dll	CERT_DestroyCertArray	security/nss/lib/certdb/certdb.c:2231
3 	nss3.dll	NSS_CMSSignedData_ImportCerts	security/nss/lib/smime/cmssigdata.c:635
4 	xul.dll	nsCMSMessage::CommonVerifySignature(unsigned char*, unsigned int)	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:266
5 	xul.dll	nsCMSMessage::VerifyDetachedSignature(unsigned char*, unsigned int)	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:218
6 	xul.dll	SMimeVerificationTask::CalculateResult()	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:394
7 	xul.dll	mozilla::CryptoTask::Run()	security/manager/ssl/CryptoTask.cpp:53 


 nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature 
bp-41302fcf-01c2-48f2-9ebe-1caac0170802
 0 	nss3.dll	nssCertificate_Destroy	security/nss/lib/pki/certificate.c:95
1 	nss3.dll	CERT_DestroyCertificate	security/nss/lib/certdb/stanpcertdb.c:768
2 	nss3.dll	CERT_DestroyCertArray	security/nss/lib/certdb/certdb.c:2231
3 	nss3.dll	CERT_ImportCerts	security/nss/lib/certdb/certdb.c:2461
4 	nss3.dll	NSS_CMSSignedData_ImportCerts	security/nss/lib/smime/cmssigdata.c:614
5 	xul.dll	nsCMSMessage::CommonVerifySignature(unsigned char*, unsigned int)	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:266
6 	xul.dll	nsCMSMessage::VerifyDetachedSignature(unsigned char*, unsigned int)	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:218
7 	xul.dll	SMimeVerificationTask::CalculateResult()	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:394
8 	xul.dll	mozilla::CryptoTask::Run()	security/manager/ssl/CryptoTask.cpp:53
#6 crash for 59.0b2, so topcrash.
Looks like for 99% of users it's a once and done crash - users don't get repeated crashes.
Big percentage from Germany.

recent example
bp-199d93cd-9849-4079-884e-54ff50180326
0 	nss3.dll	nssCertificate_Destroy	security/nss/lib/pki/certificate.c:95
1 	nss3.dll	CERT_DestroyCertificate	security/nss/lib/certdb/stanpcertdb.c:768
2 	nss3.dll	CERT_DestroyCertArray	security/nss/lib/certdb/certdb.c:2231
3 	nss3.dll	NSS_CMSSignedData_ImportCerts	security/nss/lib/smime/cmssigdata.c:635
4 	xul.dll	nsCMSMessage::CommonVerifySignature(unsigned char*, unsigned int)	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:266
5 	xul.dll	nsCMSMessage::VerifyDetachedSignature(unsigned char*, unsigned int)	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:218
6 	xul.dll	SMimeVerificationTask::CalculateResult()	C:/builds/moz2_slave/tb-rel-c-esr52-w32_bld-0000000/build/mailnews/mime/src/nsCMS.cpp:394
7 	xul.dll	mozilla::CryptoTask::Run()	security/manager/ssl/CryptoTask.cpp:53
See Also: → 1338654
Gene, you had some recent comcast experience?

Walt is crashing (reproducible) "Creating my Comcast POP3 account first is common for each of my reports. Doesn't crash if I create my Verizon POP3 account first." 
bp-555b6593-39a6-4c5c-a05d-f353c0180329 60.0b1
bp-e3692055-e8c5-43d4-8bfe-4b4c50180405 60.0b2
(In reply to Wayne Mery (:wsmwk) from comment #2)
> bp-e3692055-e8c5-43d4-8bfe-4b4c50180405 60.0b2

Downloaded and installed 60.0b2 for testing on Windows
Created a new profile, launched TB
Created my Comcast account, downloaded emails
Clicked Keep for Lightning, closed the other prompt.
Set Skip Integration in that dialog box.
Moved the cursor and TB crashed.
(In reply to Wayne Mery (:wsmwk) from comment #2)
> Gene, you had some recent comcast experience?

No, not on comcast here, unless maybe they allow free accounts. Will check, but have my doubts...
It seems like you have to have a paid account or a recently expired account.
See Also: → 1453518
Walt, in bp-d35276aa-a526-4c93-9c3d-5e55e0180719 you wrote "Just setup my Comcast account, dowloaded emails, moved the mouse to select the Inbox and TB crashed."  And in bug 1453518 comment 5 you had a similar crash.  

Can you reproduce this?
Flags: needinfo?(wls220spring)
Around June 12 the crash rate quadrupled.  But I do not understand why, because we didn't issue any new versions or change update rates in that time period.

The overwhelming majority are German locale, 70%
Some users also see signature PORT_ArenaAlloc_Util | nssTrust_GetCERTCertTrustForCert | fill_CERTCertificateFields 

Combined crashes signatures put ranking at #21 for 52.9.1 

bug 1453518 comment 5 "Looking at Troubleshooting Information, Firefox 60.0.2 release has the "Expected minimum versions" of NSS, NSSSMIME, NSSSSL and NSSUTIL at 3.36.4. In TB 60 beta they are version 3.36.1."

I found one user who crashes about once a week bp-65731f71-21fe-4039-a3dd-997a10180726 (jon)

Magnus, what do you think we need to move this forward?
Crash Signature: NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature ] → NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature ] [@ PORT_ArenaAlloc_Util | nssTrust_GetCERTCertTrustForCert | fill_CERTCertificateFields ]
Flags: needinfo?(mkmelin+mozilla)
Whiteboard: [tbird topcrash]
(In reply to Wayne Mery (:wsmwk) from comment #5)
> Walt, in bp-d35276aa-a526-4c93-9c3d-5e55e0180719 you wrote "Just setup my
> Comcast account, dowloaded emails, moved the mouse to select the Inbox and
> TB crashed."  And in bug 1453518 comment 5 you had a similar crash.  
> 
> Can you reproduce this?

No, I could not reproduce it testing TB 60rc (build2) again, which generated that report. I also tested TB 60rc (build3) and it didn't crash.
Flags: needinfo?(wls220spring)
Marco mentioned a crash on IRC:
https://crash-stats.mozilla.com/report/index/49749f49-291f-42c3-820f-209ed0180726
He has steps in the report. I think it's related to Enigmail. Why else would all this crypto code be involved?
FWIW I believe I saw similiar crashes while running (I think mozmill) tests for another bug. I did not have Enigmail, so I don't think it's specific to that.

Given that the relevant functions talk about certs, it seems likely that this could happen anytime TLS/HTTPS connections are made.
the majority of the crash reports do not have enigmail installed. In fact when I was investigating over the last few days I found no such examples.
> Given that the relevant functions talk about certs, it seems likely that this could happen anytime TLS/HTTPS connections are made.

Probably not as all the crashes come from CMS/SMIME code. That code is pretty messy. This looks like a double free to me where it tries to destroy a cert that has been destroyed before. I see a couple points where this could come from. But I'd need something to reproduce the issue before trying to fix something.
Agreed we probably need something reproducible to pin it down.
Flags: needinfo?(mkmelin+mozilla)
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #12)
> > Given that the relevant functions talk about certs, it seems likely that this could happen anytime TLS/HTTPS connections are made.
> 
> Probably not as all the crashes come from CMS/SMIME code. That code is
> pretty messy. This looks like a double free to me where it tries to destroy
> a cert that has been destroyed before. I see a couple points where this
> could come from. But I'd need something to reproduce the issue before trying
> to fix something.

Yeah I agree after examining some more. In fact, I must've see a different crash because I'm on macOS and these crashes don't seem to appear on that platform.
WaltS48 told me on IRC that he got crashes setting up an existing Comcast POP3 account. The same didn't happen for his Verizon account. No S/MIME in use at all. Oh, that's already stated in comment #2.
#1 crash for TB60, which is a massive increase. Still almost every crash is a different user, i.e. no duplicate crashes, as noted in comment 1

The few users I have been in contact with think it is when thunderbird is checking for new messages
Where did you get the secport.c#l373 from? Comment #0 and all the crash reports I've seen have certificate.c:95 and that's a "use after free":
https://dxr.mozilla.org/mozilla-esr60/rev/dd52b41d2b775e5c7261ce52795268b7670635fc/security/nss/lib/pki/certificate.c#95
Some change made things worse so adding "regression" even though this is not a new crash signature.
Keywords: regression
IIUC you say that most stacks originate in CMS code, which is used by S/MIME. It would be good to find out why this code is reached by users who are simply configuring a new account.

I suspect Enigmail doesn't use S/MIME, but maybe it uses CMS in some way? What happens immediately after setting up an email account? Does TB immediately start scanning some of the email messages? Could users have S/MIME messages in their mailboxes, which are immediately scanned by TB, e.g. for spam filtering? Could that automatically trigger some S/MIME processing? Maybe you could try to trace if nsCMSMessage::CommonVerifySignature / NSS_CMSSignedData_ImportCerts are reached frequently.

Does TB have any other code that uses CMS messages, maybe some encrypted chat service?
Flags: needinfo?(kaie)
Blocks: 1453518
In the case of bug 1453518, users are simply getting mail.
(In reply to Magnus Melin from comment #13)
> Agreed we probably need something reproducible to pin it down.

I no longer have access to email addresses of crash reporters.  So we likely need to generate our own data.  I'm made one correlation to Enigmail noted below.


Most crashes are still nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature

***Release channel shows modest rate increase June 14-15 at the release of 52.8.0)[1][2], and big increase August 6-7 at the release of version 60(comment 16) [1]. 52.8.0 ONLY got security patches, which were many https://www.mozilla.org/en-US/security/advisories/mfsa2018-13/  BUT 52.8.0 uptake would have been 90+% by June 1 (rate 100 was set on May 21) the 52.8.0 crash rate increase isn't until June 14 [2].  *This is when Enigmail 2.0.7 released, June 13 https://addons.thunderbird.net/en-US/thunderbird/addon/enigmail/versions/?page=1#version-2.0.7 

Non-release channels OTOH have consistent crash rate going back to April [3] (which is our oldest data).  The earliest crashes on record are: 54.0.b2, 58.0a1 and (mostly) 58.0b1. In comparison to 52.8.0 it is unclear why 58.0b1 apparently increased, because I don't think it many if any security patches there https://www.thunderbird.net/en-US/thunderbird/58.0beta/releasenotes/  (it does have the compact patch)

bp-dc9da22d-3feb-4824-befc-e6dce0180427	2018-04-27 06:12:47 54.0b2 20170524131922	
bp-3dfb8641-8d7f-4af6-a8dc-f37100180422	2018-04-22 17:43:03 58.0a1 20171024030201	
bp-a1d4d47c-08f1-4721-b3ad-6450c0180906	2018-09-06 23:52:19 58.0b1 20171124151037	
bp-d3aadd70-4860-462f-8c6f-fbfbd0180829	2018-08-29 07:48:11 58.0b1 20171124151037	
bp-a357d0f4-5b24-45bb-bf1c-4cd930180329	2018-03-29 10:59:25 58.0b1 20171124151037	


[1] https://crash-stats.mozilla.com/signature/?release_channel=release&signature=nssCertificate_Destroy%20%7C%20CERT_DestroyCertificate%20%7C%20CERT_DestroyCertArray%20%7C%20NSS_CMSSignedData_ImportCerts%20%7C%20nsCMSMessage%3A%3ACommonVerifySignature&date=%3E%3D2017-12-31T20%3A28%3A00.000Z&date=%3C2018-09-09T23%3A28%3A55.000Z#graphs

[2] 52.8.0 June 14 https://crash-stats.mozilla.com/signature/?release_channel=release&signature=nssCertificate_Destroy%20%7C%20CERT_DestroyCertificate%20%7C%20CERT_DestroyCertArray%20%7C%20NSS_CMSSignedData_ImportCerts%20%7C%20nsCMSMessage%3A%3ACommonVerifySignature&date=%3E%3D2018-06-01T08%3A28%3A00.000Z&date=%3C2018-06-29T11%3A28%3A00.000Z#graphs

[3] https://crash-stats.mozilla.com/signature/?release_channel=%21release&signature=nssCertificate_Destroy%20%7C%20CERT_DestroyCertificate%20%7C%20CERT_DestroyCertArray%20%7C%20NSS_CMSSignedData_ImportCerts%20%7C%20nsCMSMessage%3A%3ACommonVerifySignature&date=%3E%3D2017-12-31T20%3A28%3A00.000Z&date=%3C2018-09-09T23%3A28%3A55.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=build_id&_sort=-date&page=1#graphs
Enigmail does not use any NSS functions, nor anything related to X.509 certificates. 

The stacks clearly show that the error happens within S/MIME signature verification, when certificates are imported from signed messages. That's nothing Enigmail would do.
(In reply to Patrick Brunschwig from comment #24)
> Enigmail does not use any NSS functions, nor anything related to X.509
> certificates. 

Thanks for weighing in.  

> The stacks clearly show that the error happens within S/MIME signature
> verification, when certificates are imported from signed messages. That's
> nothing Enigmail would do.

Indeed I did not intend to lay blame at enigmail.  The June 14 correlation [4] must be caused by something other than enigmail 2.0.7 [4].  I spot checked some crashes in the time period and none have enigmail installed.  But there clearly is a bump so there must be some other explanation.

As for the big version 60 bump, this might originate in 58.0b2 or 58.0b3 - a regression that increased the crash rate of the signature which obviously existed prior to version 58.


[4] https://crash-stats.mozilla.com/signature/?release_channel=release&signature=nssCertificate_Destroy%20%7C%20CERT_DestroyCertificate%20%7C%20CERT_DestroyCertArray%20%7C%20NSS_CMSSignedData_ImportCerts%20%7C%20nsCMSMessage%3A%3ACommonVerifySignature&date=%3E%3D2018-06-03T16%3A28%3A00.000Z&date=%3C2018-06-23T19%3A28%3A00.000Z#graphs
Whiteboard: [tbird topcrash] → [tbird topcrash][regression:TB55?]
Whiteboard: [tbird topcrash][regression:TB55?] → [tbird topcrash][regression:TB58?]
Crash Signature: NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature ] [@ PORT_ArenaAlloc_Util | nssTrust_GetCERTCertTrustForCert | fill_CERTCertificateFields ] → NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature ] [@ PORT_ArenaAlloc_Util | nssTrust_GetCERTCertTrustForCert | fill_CERTCertificateFields ] [@ nssCertificate_Destroy | <name omitted> | CERT_ImportCerts | NSS_CMSSignedData_ImportCert…
#1 crash for version 60.  So we need a way forward that might not involve having a testcase, because we don't have access to the users, except the one in bug 1453518
Flags: needinfo?(mkmelin+mozilla)
Summary: Crash in nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature → Crash downloading mail in nssCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature
via PM, crash reporter Jon has been put in contact with Magnus, Gene and Jorg for followup.
bp-65731f71-21fe-4039-a3dd-997a10180726 63.0a1
Other cert related signatures (I have not compared stacks)

* nssCertificate_Destroy | <name omitted> | <name omitted> | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature 
bp-7dd63446-525c-4bf0-a3f2-946de0181103 -- ranked #82 for 60.3.0
 0 	libnss3.dylib	nssCertificate_Destroy	security/nss/lib/pki/certificate.c:95
1 	libnss3.dylib	<name omitted>	security/nss/lib/pki/certificate.c:141
2 	libnss3.dylib	<name omitted>	security/nss/lib/certdb/certdb.c:2232
3 	libnss3.dylib	NSS_CMSSignedData_ImportCerts	security/nss/lib/smime/cmssigdata.c:635
4 	XUL	nsCMSMessage::CommonVerifySignature(unsigned char*, unsigned int)	comm/mailnews/mime/src/nsCMS.cpp:227
5 	XUL	mozilla::CryptoTask::Run()	security/manager/ssl/CryptoTask.cpp:39 

*  PORT_FreeArena_Util | CERT_DestroyCertificate | CERT_DestroyCertArray | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature bp-65a69f05-f294-4470-a93e-6efbd0181106  -- ranked #33 for 60.3.0
 0 	nss3.dll	PORT_FreeArena_Util	security/nss/lib/util/secport.c:373
1 	nss3.dll	CERT_DestroyCertificate	security/nss/lib/certdb/stanpcertdb.c:819
2 	nss3.dll	CERT_DestroyCertArray	security/nss/lib/certdb/certdb.c:2232
3 	nss3.dll	CERT_ImportCerts	security/nss/lib/certdb/certdb.c:2462
4 	nss3.dll	NSS_CMSSignedData_ImportCerts	security/nss/lib/smime/cmssigdata.c:614
5 	xul.dll	nsCMSMessage::CommonVerifySignature(unsigned char*, unsigned int)	comm/mailnews/mime/src/nsCMS.cpp:227
6 	xul.dll	nsCMSMessage::VerifyDetachedSignature(unsigned char*, unsigned int)	comm/mailnews/mime/src/nsCMS.cpp:183
7 	xul.dll	SMimeVerificationTask::CalculateResult()	comm/mailnews/mime/src/nsCMS.cpp:354 

*  nssCertificate_Destroy | <name omitted> | NSS_CMSSignedData_Destroy | NSS_CMSContentInfo_Destroy | NSS_CMSMessage_Destroy | nsCMSMessage::~nsCMSMessage  bp-160146c4-1dd9-4334-af70-3abc50181030 -- Mac-only
 0 	libnss3.dylib	nssCertificate_Destroy	security/nss/lib/pki/certificate.c:95
1 	libnss3.dylib	<name omitted>	security/nss/lib/pki/certificate.c:141
2 	libnss3.dylib	NSS_CMSSignedData_Destroy	security/nss/lib/smime/cmssigdata.c:74
3 	libnss3.dylib	NSS_CMSContentInfo_Destroy	security/nss/lib/smime/cmscinfo.c:60
4 	libnss3.dylib	NSS_CMSMessage_Destroy	security/nss/lib/smime/cmsmessage.c:99
5 	XUL	nsCMSMessage::~nsCMSMessage()	comm/mailnews/mime/src/nsCMS.cpp:63
6 	XUL	<name omitted>	comm/mailnews/mime/src/nsCMS.cpp:37
7 	XUL	SMimeVerificationTask::~SMimeVerificationTask()	xpcom/base/nsCOMPtr.h:313
Flags: needinfo?(mkmelin+mozilla)
The stacks from comment 28 suggest that it's memory corruption.

In the middle stack, we attempt to free memory, and we crash when performing a memory block consistency check. The "magic identifier" of the block is unexpected. This sounds like we're attempt to free a block that has already been freed, or has been overwritten (as a consequence of some other bug).

The first and the third stack are similar. After processing some S/MIME signature data, we're trying to clean up. While doing so, we run into an unexpected scenario. The internal NSS data structures are in an inconsistent state, and we crash trying to dereference a NULL pointer - of a pointer that should never be NULL.
The earlier comments are true.
We need reliable steps to reproduce, so we can run them in a debugger.
"Use after free", comment #18 :-(
Jon is still our only testcase.
What do you think - is Jon's data sufficient?
Flags: needinfo?(kaie)
Without STR, no.
Flags: needinfo?(kaie)

Signatures more common in beta than the previously documented signatures

nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature
bp-e9981019-32db-4d10-93cb-ec2c70190109
Crashing Thread (63), Name: SMimeVerify
0 nss3.dll nssCertificate_Destroy security/nss/lib/pki/certificate.c:95
1 nss3.dll NSSCertificate_Destroy security/nss/lib/pki/certificate.c:141
2 nss3.dll CERT_DestroyCertificate security/nss/lib/certdb/stanpcertdb.c:817
3 nss3.dll CERT_DestroyCertArray security/nss/lib/certdb/certdb.c:2232
4 nss3.dll NSS_CMSSignedData_ImportCerts security/nss/lib/smime/cmssigdata.c:635
5 xul.dll nsCMSMessage::CommonVerifySignature(unsigned char*, unsigned int) comm/mailnews/mime/src/nsCMS.cpp:224
6 xul.dll nsCMSMessage::VerifyDetachedSignature(unsigned char*, unsigned int) comm/mailnews/mime/src/nsCMS.cpp:180
7 xul.dll SMimeVerificationTask::CalculateResult() comm/mailnews/mime/src/nsCMS.cpp:353
8 xul.dll mozilla::CryptoTask::Run() security/manager/ssl/CryptoTask.cpp:39

nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCertificate | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature
bp-095b69fe-e3fc-4ed9-b640-f9e8c0190112
Crashing Thread (69), Name: SMimeVerify
0 nss3.dll nssCertificate_Destroy security/nss/lib/pki/certificate.c:95
1 nss3.dll NSSCertificate_Destroy security/nss/lib/pki/certificate.c:141
2 nss3.dll CERT_DestroyCertificate security/nss/lib/certdb/stanpcertdb.c:817
3 nss3.dll CERT_ImportCerts security/nss/lib/certdb/certdb.c:2539
4 nss3.dll NSS_CMSSignedData_ImportCerts security/nss/lib/smime/cmssigdata.c:614
5 xul.dll nsCMSMessage::CommonVerifySignature(unsigned char*, unsigned int) comm/mailnews/mime/src/nsCMS.cpp:224
6 xul.dll nsCMSMessage::VerifyDetachedSignature(unsigned char*, unsigned int) comm/mailnews/mime/src/nsCMS.cpp:180
7 xul.dll SMimeVerificationTask::CalculateResult() comm/mailnews/mime/src/nsCMS.cpp:353
8 xul.dll mozilla::CryptoTask::Run() security/manager/ssl/CryptoTask.cpp:36

User Story: (updated)

(In reply to Kai Engert (:kaie:) from comment #29)

The first and the third stack are similar. After processing some S/MIME
signature data, we're trying to clean up. While doing so, we run into an
unexpected scenario. The internal NSS data structures are in an inconsistent
state, and we crash trying to dereference a NULL pointer - of a pointer that
should never be NULL.

Sorry, my above reasoning was wrong, but it's a memory corruption.

All stacks that have certificate.c:95 as the topmost code crash with the following code:

1 nssCertificate_Destroy(NSSCertificate *c) {
2 if (c) {
3 nssDecodedCert *dc = c->decoding;

Because we arrive at line 3, we know that c is non-null.
If we crash in line 3, pointer c points to invalid memory.

All crashes appear to happen while we verify an S/MIME signature. The crash occurrs on the thread named "SMimeVerify", that's expected. We perform the verification in the background, because it can be slow.

While looking at several crash reports, I noticed that at the time of the crash, we usually have at least two SMimeVerify threads running in parallel. In theory, that's fine. The user could have clicked on a first signed message, and then clicks on a second signed message, before the first verification has succeeded.

In theory, NSS should be fully thread safe, and allow the above concurrency. Lacking other ideas, we could attempt to avoid that concurrency, and allow only one SMimeVerify thread to actively make calls into NSS at any time, and see if it prevents this crash, or decreases its occurrence.

How about using this patch for nightly, as an experiment?

Comment on attachment 9036649 [details] [diff] [review]
smime-verify-serialize-v1.patch

rs=jorgk, I'll land it tonight. Too bad I have switched off Nightlies due to the tree bustage since we don't know how badly TB is broken.

Attachment #9036649 - Flags: review+
Keywords: leave-open
Comment on attachment 9036649 [details] [diff] [review]
smime-verify-serialize-v1.patch

Review of attachment 9036649 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/src/nsCMS.cpp
@@ +382,5 @@
>    nsCOMPtr<nsICMSMessage> mMessage;
>    nsCOMPtr<nsISMimeVerificationListener> mListener;
>    nsCString mDigestData;
> +
> +  static mozilla::Mutex *mLock;

fyi there's a static mutex type in the tree: https://searchfox.org/mozilla-central/source/xpcom/base/StaticMutex.h

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/279f3823f223
experimental patch to investigate Thunderbird topcrash, serializes S/MIME verification. rs=jorgk

Keywords: checkin-needed

We don't get many crash reports for nightly. Not seeing this crash in nightly, for a couple of days, seems normal. This could make it difficult to conclude whether this patch helps or not. Let's see how frequently we'll crash during the next week.

Links to related crash reports for nightly 66 (click "reports" on the pages):
https://is.gd/wr1kUg - https://is.gd/17PXKm - https://is.gd/6yzmJ4 - https://is.gd/B66Apf - https://is.gd/Xn6l0q

(In reply to Kai Engert (:kaie:) from comment #41)

We don't get many crash reports for nightly. Not seeing this crash in nightly, for a couple of days, seems normal. This could make it difficult to conclude whether this patch helps or not.

This is typical for us - we often cannot draw conclusions until a patch hits beta channel. (nightly topcrashes are rare)

Actually, the signatures cited so far in this bug are even rare in beta - despite the high rate in release channel. But if the following two signatures are the same problem, then we can make a judgement when patch hits beta.

  • nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature

https://crash-stats.mozilla.com/signature/?product=Thunderbird&signature=nssCertificate_Destroy%20%7C%20NSSCertificate_Destroy%20%7C%20CERT_DestroyCertificate%20%7C%20CERT_DestroyCertArray%20%7C%20NSS_CMSSignedData_ImportCerts%20%7C%20nsCMSMessage%3A%3ACommonVerifySignature&version=65.0b0&date=%3C2019-01-16T17%3A37%3A07%2B00%3A00&date=%3E%3D2019-01-09T17%3A37%3A07%2B00%3A00

  • nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCertificate | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature

https://crash-stats.mozilla.com/signature/?product=Thunderbird&signature=nssCertificate_Destroy%20%7C%20NSSCertificate_Destroy%20%7C%20CERT_DestroyCertificate%20%7C%20CERT_ImportCerts%20%7C%20NSS_CMSSignedData_ImportCerts%20%7C%20nsCMSMessage%3A%3ACommonVerifySignature&version=65.0b0&date=%3C2019-01-16T17%3A37%3A07%2B00%3A00&date=%3E%3D2019-01-09T17%3A37%3A07%2B00%3A00

I can stick it into TB 65 beta 3 which I'm planning to prepare tomorrow.

(In reply to Wayne Mery (:wsmwk) from comment #42)

Actually, the signatures cited so far in this bug are even rare in beta - despite the high rate in release channel. But if the following two signatures are the same problem, then we can make a judgement when patch hits beta.

I think most of the crashes from the following query probably point to the same kind of issue:
https://is.gd/CZU16E

Comment on attachment 9036649 [details] [diff] [review]
smime-verify-serialize-v1.patch

[Approval Request Comment]
Regression caused by (bug #): unknown
User impact if declined: none
Testing completed (on c-c, etc.): no reports from nightly yet
Risk to taking this patch (and alternatives if risky): user interface slower or stuck

Should we take this experiment for beta?
Attachment #9036649 - Flags: approval-comm-beta?
Comment on attachment 9036649 [details] [diff] [review]
smime-verify-serialize-v1.patch

Why not. I think we'll do another 65 beta in a week.
Attachment #9036649 - Flags: approval-comm-beta? → approval-comm-beta+
Target Milestone: --- → Thunderbird 66.0
See Also: → 1522968

65.0bx crashes. Will need a few days of beta 4 to determine if the crashes gone, so should know by Monday Feb 4

this is the most common, with 0-8 crashes per day averaging 5/day - nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCertificate | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature bp-c1ecea9d-a9fd-43f8-a234-791e10190108

nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCertArray | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature bp-5047eed3-a6ea-4089-b670-db9720190128

Crash Signature: NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature ] [@ PORT_FreeArena_Util | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature ] → NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature ] [@ PORT_FreeArena_Util | CERT_ImportCerts | NSS_CMSSignedData_ImportCerts | nsCMSMessage::CommonVerifySignature ] [@ nssCertificate_Destroy | NSSCertificate_Destroy | CERT_DestroyCe…

What's the build ID of beta 4? Is it 20190123180341 ?

I don't see any crashes with the recent 65 betas, and no related crashes with 66 beta at all.
Wayne, can you confirm?

Flags: needinfo?(vseerror)

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #39)

  • static mozilla::Mutex *mLock;

fyi there's a static mutex type in the tree:
https://searchfox.org/mozilla-central/source/xpcom/base/StaticMutex.h

Dana, thanks a lot for pointing me to that, that's cleaner. If we decide to keep the mutex, I agree we should use that. Patch in bug 1522968 is updated.

See Also: → 1529003

Thanks Wayne, All crashes from 65beta are with build IDs that are older than beta 4.

I think the data confirms that our workaround helps.

If this workaround fixes the crash and memory corruption, it means that the S/MIME code inside Thunderbird isn't threadsafe. I've filed bug 1529003 to track that and get it potentially fixed at the NSS level. Bug 1529003 will require analysis, and it's not clear how much work that will be.

I suggest to keep the workaround for Thunderbird 68. Beta testing should show if the workaround has any negative effects.

At worst, if S/MIME verification is sometimes slow, users could experience delayed update of the signed/encrypted message status (icons shown).

If you'd like to consider applying the workaround to Thunderbird 60.x, we'd have to accept this risk.

Better slow than crashing ;-) - So not threadsafe leads to double-free?

(In reply to Jorg K (GMT+1) from comment #55)

So not threadsafe leads to double-free?

One thread might free it, and the data might get immediately overwritten, while another thread might still have a pointer to it, and read/write it.

Attached patch backport-esr60-1386601.patch (obsolete) — Splinter Review

(In reply to Jorg K (GMT+1) from comment #55)

Better slow than crashing ;-)

This patch backports the serialization to the esr60 branch (while using the better approach suggested by Dana).

Assignee: nobody → kaie
Comment on attachment 9045029 [details] [diff] [review]
backport-esr60-1386601.patch

Dana, would you please take a look. Thanks in advance.
Attachment #9045029 - Flags: feedback?(dkeeler)
Attachment #9045029 - Flags: approval-comm-esr60?
Comment on attachment 9045029 [details] [diff] [review]
backport-esr60-1386601.patch

Review of attachment 9045029 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/src/nsCMS.cpp
@@ +415,5 @@
>    nsCOMPtr<nsISMimeVerificationListener> mListener;
>    nsCString mDigestData;
>    int16_t mDigestType;
> +
> +  mozilla::StaticMutex sMutex;

This needs to be declared static, right?
Attachment #9045029 - Flags: feedback?(dkeeler) → feedback-
Attached patch backport-esr60-1386601-v2.patch (obsolete) — Splinter Review

Of course you're right, because it's defined as a class member, thanks.

Attachment #9045029 - Attachment is obsolete: true
Attachment #9045029 - Flags: approval-comm-esr60?
Attachment #9045062 - Flags: review?(dkeeler)
Attachment #9045062 - Flags: approval-comm-esr60?
Comment on attachment 9045062 [details] [diff] [review]
backport-esr60-1386601-v2.patch

Review of attachment 9045062 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comment addressed

::: mailnews/mime/src/nsCMS.cpp
@@ +415,5 @@
>    nsCOMPtr<nsISMimeVerificationListener> mListener;
>    nsCString mDigestData;
>    int16_t mDigestType;
> +
> +  static mozilla::StaticMutex sMutex;

Looks like this needs to also be defined somewhere concrete:

`mozilla::StaticMutex SMimeVerificationTask::sMutex;`

(see https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla3dom5cache7Manager7Factory6sMutexE&redirect=false for a similar situation)
Attachment #9045062 - Flags: review?(dkeeler) → review+
Attachment #9045062 - Attachment is obsolete: true
Attachment #9045062 - Flags: approval-comm-esr60?
Attachment #9045093 - Flags: review+
Attachment #9045093 - Flags: approval-comm-esr60?
Status: NEW → ASSIGNED
Comment on attachment 9045093 [details] [diff] [review]
backport-esr60-1386601-v3.patch

Let's take it for 60.5.2, thanks for the reminder.
Attachment #9045093 - Flags: approval-comm-esr60? → approval-comm-esr60+

Agreed, the crash is gone for newer 66 betas

Great work!

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: needinfo?(vseerror)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: