Crash in [@ Botan::`anonymous namespace'::RSA_Verify_Operation::verify_mr]
Categories
(MailNews Core :: Security: OpenPGP, defect, P1)
Tracking
(thunderbird_esr78+ fixed, thunderbird82 fixed)
People
(Reporter: wsmwk, Assigned: rjl)
References
Details
(Keywords: crash, topcrash-thunderbird)
Crash Data
Attachments
(3 files)
430.16 KB,
application/x-xpinstall
|
Details | |
62.41 KB,
application/x-gzip
|
Details | |
47 bytes,
text/x-phabricator-request
|
rjl
:
approval-comm-beta+
wsmwk
:
approval-comm-esr78+
|
Details | Review |
This bug is for crash report bp-f777a3ac-fd81-4d38-988d-da04b0200718.
Top 10 frames of crashing thread:
0 kernelbase.dll RaiseException
1 rnp.dll _CxxThrowException /builds/worker/workspace/obj-build/comm/third_party/rnp/f:/dd/vctools/crt/vcruntime/src/eh/throw.cpp:129
2 rnp.dll Botan::`anonymous namespace'::RSA_Verify_Operation::verify_mr comm/third_party/botan/src/lib/pubkey/rsa/rsa.cpp:603
3 rnp.dll Botan::PK_Ops::Verification_with_EMSA::is_valid_signature comm/third_party/botan/src/lib/pubkey/pk_ops.cpp:123
4 rnp.dll Botan::PK_Verifier::verify_message comm/third_party/botan/src/lib/pubkey/pubkey.cpp:328
5 rnp.dll Botan::KeyPair::signature_consistency_check comm/third_party/botan/src/lib/pubkey/keypair/keypair.cpp:77
6 rnp.dll Botan::RSA_PrivateKey::check_key const comm/third_party/botan/src/lib/pubkey/rsa/rsa.cpp:350
7 rnp.dll std::_Func_impl_no_alloc<`lambda at /builds/worker/workspace/obj-build/comm/third_party/botan/build/include/botan/internal/ffi_util.h:89:38', int>::_Do_call
8 rnp.dll Botan_FFI::ffi_guard_thunk comm/third_party/botan/src/lib/ffi/ffi.cpp:93
9 rnp.dll Botan_FFI::apply_fn<Botan::Private_Key, 2140551262, `lambda at /builds/worker/checkouts/gecko/comm/third_party/botan/src/lib/ffi/ffi_pkey.cpp:148:11'> /builds/worker/workspace/obj-build/comm/third_party/botan/build/include/botan/internal/ffi_util.h:89
Reporter | ||
Comment 1•4 years ago
|
||
Comment 2•4 years ago
|
||
Comment 3•4 years ago
|
||
Nickolay, could you please change RNP to fail gracefully (with an error result) if you get an exception from Botan?
Do you understand why key generation is failing?
Comment 4•4 years ago
|
||
An Botan exception should NEVER trigger a crash.
Comment 5•4 years ago
|
||
All exceptions inside of RNP should be catched on FFI boundary, and error code must be returned instead.
Will try to reproduce the issue, it looks like definitely Botan's one (as it fails on validation of just-generated key).
Reporter | ||
Comment 6•4 years ago
|
||
Another example CxxThrowException | Botan::`anonymous namespace'::RSA_Verify_Operation::verify_mr bp-cd9232e5-02f9-4b5d-adf0-c64180200911
Comment 7•4 years ago
|
||
This was discussed here:
https://github.com/rnpgp/rnp/issues/1287
Botan and RNP both handle exceptions on FFI boundary, so most likely it is some sort of memory corruption.
Reporter | ||
Comment 9•4 years ago
|
||
Kai, the combined number of crashes of the two signatures now makes this a topcrash for 78.2.2 - approximately in the top 30.
Lev, can you provide Kai with a reproducible testcase? If so, you can describe it here, and if there is an example file please email it to him
Comment 10•4 years ago
|
||
I'll try to make keyring without any private keys which leads to crash
Comment 11•4 years ago
|
||
Kai, I've sent you GPG keyring which leads to crash.
Crash id is bp-a022cb5d-c3b6-4488-b3b2-487d60200920
Comment 13•4 years ago
|
||
I could try to write bisect script which will create keyring without half the keys, try to import it, rinse-n-repeat, to find key which cause crash. But not right now...
And, of course, it is possible to delete all public keys from keyring and start life with a clean slate :) All public keys must be on key servers!
Comment 14•4 years ago
|
||
I cannot reproduce the crash, I tried win32, win64, linux32, linux64, all work for me.
Lev, I wonder if the crash happens while importing your secret key, because you said it crashes shortly after you enter the password to unlock it.
You could try to backup your gnupg directory, then temporarily move the public key ring file away, so that only the secret keys are in your gnupg directory. Then start the migration. Does it crash?
Comment 15•4 years ago
|
||
Well, it seems obvious that it's about a private key, when looking at the crash stack...
Comment 16•4 years ago
|
||
Wayne, does it look like all reported crashes are on Windows with a 32bit executable?
Comment 17•4 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #12)
Is a workaround possible?
We don't know what's happening.
Reporter | ||
Comment 18•4 years ago
|
||
No, not all 32bit. 1/3 64bit and 2/3 32bit - probably reflective of our user base
Comment 19•4 years ago
|
||
Kai, crash happens in the middle of importing process, much later, than it asks for PK password. I
I've checked, that crash repeats when there is no PK in directory, exactly with same data as I sent you. Crash bp-a022cb5d-c3b6-4488-b3b2-487d60200920 is WITHOUT PK.
I'll try to import ONLY PK tomorrow, but, again, I've checked, that sent data cause crash, at least on my computer.
Maybe, I need to cleanup leftovers from previous trys somehow?
Maybe, I could run some "debug" build for more comprehensive diagnostics?
Comment 20•4 years ago
|
||
(PK is "private key" in previous comment, I've realized, that it could be Public key as well)
Also, I could try to remove public part of my key from keyring too, as now Thunderbird shows error about it if private key is not found (instead request for password).
Comment 21•4 years ago
|
||
For example, logging each key ID BEFORE import could be useful, as we will know key which leads to crash.
Comment 22•4 years ago
|
||
Good idea. You can use config editor to set preference extensions.enigmail.logDirectory to a directory of your choice to enable logging of the Enigmail migrator Add-on. In that directory a file enigdbug.txt will be created.
With that, I get log lines like this:
2020-09-23 10:11:26.481 [DEBUG] setupWizard2.js: importKeys: pubFile: /tmp/enig-exp/0x....asc
where ... is the key ID
If it crashes, I wonder if the newest ID will be flushed to file prior to the crash.
I can create a test version of that Add-on that adds flushing after each write to the logfile.
Comment 23•4 years ago
|
||
Lev, you can save this to file, then use add-on manager, settings wheel icon, and select import from file, then pick this one. Restart TB to be sure it's picked up. Then start migration again from the profile. It might be even slower than before, because of the many writes to file. But if you crash, the last line in enigdbug.txt will hopefully refer to the key that causes the crash.
Comment 24•4 years ago
|
||
Crashed with test version of Enigmail migration plugin.
Crash ID bp-90a2eae8-f830-4717-8c51-f5f230200923
Comment 25•4 years ago
|
||
The last line referred to key 7420DF86BCE15A458DCE997639278DA8109E6244.
Can you try to
gpg --export-keys --armor 7420DF86BCE15A458DCE997639278DA8109E6244 > key-7420DF86BCE15A458DCE997639278DA8109E6244.asc
with Thunderbird OpenPGP key manager, use file open public key, select that file.
Does that crash for you?
(I still don't crash.)
It's interesting that your public keyring is 19 MB in total, and just this one is 1.5 MB big.
It has 3296 signatures from other keys, and 1 signature is reported by gnupg as "bad".
Comment 26•4 years ago
|
||
Finally, I can reproduce the crash.
Comment 27•4 years ago
|
||
With my local Windows optimized build, we get an exception in botan rsa.cpp:
BigInt public_op(const BigInt& m) const
{
if(m >= m_public->get_n())
--> throw Invalid_Argument("RSA public op - input is too large");
return m_public->public_op(m);
}
The opt build doesn't have the variable contents, they are optimized away.
I will create a local debug build in the hope it will tell us more details.
Comment 28•4 years ago
|
||
... still experimenting with the opt build (debug build takes time)...
This is very confusing.
I have wrapped the above statement like this:
try {
if(m >= m_public->get_n())
--> throw Invalid_Argument("RSA public op - input is too large");
}
} catch (Exception &e) {
BigInt none; return none;
} catch (...) {
BigInt none; return none;
}
Regardless, the Visual Studio debugger reports that the "throw" is unhandled.
I don't understand how this is possible.
Comment 29•4 years ago
|
||
Kai, as from discussion here: https://github.com/rnpgp/rnp/issues/1287#issuecomment-684971878, it looks like some sort of memory corruption.
After this call Botan's FFI wraps all exception, and afterwards our FFI also handles all exceptions.
Comment 30•4 years ago
|
||
I'm in the MSVS debugger, and I see ffi_guard_thunk is in the call stack a few levels above.
But the runtime code doesn't go there. Instead, it reports an unhandled exception and stops.
The exception handling doesn't work for some reason.
Comment 31•4 years ago
|
||
(In reply to Nickolay Olshevsky from comment #29)
Kai, as from discussion here: https://github.com/rnpgp/rnp/issues/1287#issuecomment-684971878, it looks like some sort of memory corruption.
I'm not yet convinced it's memory corruption.
I've changed RSA_Public_Operation::public_op() to return an empty BigInt on failure (instead of throwing), in the hope that this causes the operation to fail. (At least it continues.)
With this change, I'm able to successfull import the public key. And I'm able to perform lots of functionality afterwards. If we had a memory corruption, I'd expect that I'd soon crash afterwards when performing other operations. But that doesn't happen, no crash.
Maybe it is memory corruption. Maybe memory corruption causes exception handling to no work. But I think it's possible that simply the exception handling is broken or disabled or overridden in some way, that we cannot think of yet. Maybe as a consequence of using the Thunderbird build environment.
I've just learned that we haven't ran the botan/rnp tests suites using the binaries that are produced as part of the Thunderbird build.
I'd like to try building botan/rnp with their usual approach, to check if that fixes the crash.
Comment 32•4 years ago
|
||
Yes, there are some keys of FreeBSD security officers, for example, which have a tons of signatures.
OpenPGP Key Manager -> Import Public Key -> CRASH bp-77341c71-544e-4a8f-8872-504170200923
But at least I could remove this key from keyring and try again! :-)
And, I think, this key could be added as testcase into testsuite :-)
Thank you for all your work, and good luck with fixing this bug. If I could provide any additional information - let me know!
Comment 33•4 years ago
|
||
Nickolay, Rob, I think I have confirmed that my theory was correct. The crash is caused by non-working exception handling.
When building RNP and Botan as part of the Thunderbird build system, I'm guessing that some of the build flags disable the C++ exception handling system in a way that breaks Botan's and RNP's expectations.
I've manually created alternative binaries using the original library build rules, cross compiling on Fedora using MinGW for Windows.
With those binaries, it works fine, no crash.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 34•4 years ago
|
||
In Windows distributions of Thunderbird, the directory that contains thunderbird.exe also contains a file rnp.dll - this is a file that bundles multiple libraries, RNP, Botan and json-c. That file is currently created as part of the Thunderbird build system.
I've manually created alternative binaries, that don't crash.
If you want to test them, I've uploaded them to my personal server.
Version information:
https://kuix.de/mozilla/openpgp/info-libs-20200923.txt
Windows 32 bit:
https://kuix.de/mozilla/openpgp/rnp-botan-32-libs-20200923.zip
https://kuix.de/mozilla/openpgp/rnp-botan-32-libs-20200923.zip.asc
Windows 64 bit:
https://kuix.de/mozilla/openpgp/rnp-botan-64-libs-20200923.zip
https://kuix.de/mozilla/openpgp/rnp-botan-64-libs-20200923.zip.asc
Files with extension .asc are detached signatures, to show they were really made by me
(key fingerprint 21D16E67E18398C8DA9DDF2E1C27423725007724).
Comment 35•4 years ago
|
||
Lev, if you want to test:
Find the directory that contains your thunderbird.exe
Rename file rnp.dll to a different filename, for example orig-rnp.dll
Download the archive that matches your Thunderbird binary, either 32 or 64 bit.
Extract the archive.
Copy the individual files to the thunderbird directory.
Run Thunderbird
Comment 36•4 years ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #35)
Lev, if you want to test:
Yes, it works, thank you!
Assignee | ||
Comment 37•4 years ago
|
||
Clang-cl/MSVC default exceptions are SEH style and do not
propagate the same as standard c++ exceptions that RNP is
expecting. The /EHs flag sets the correct behavior.
Assignee | ||
Comment 38•4 years ago
•
|
||
I ran Botan's configure.py with MSVC set as the compiler and compared the compile flags set there against what is set in the moz.build files. The one that stood out is /EHs
, which in MSVC allows for throwing structured exceptions.
https://docs.microsoft.com/en-us/cpp/build/reference/eh-exception-handling-model?view=vs-2017
The compiler always generates code that supports asynchronous structured exception handling (SEH). By default (that is, if no /EHsc, /EHs, or
/EHa option is specified), the compiler supports SEH handlers in the native C++ catch(...) clause. However, it also generates code that only partially supports C++ exceptions. The default exception unwinding code doesn't destroy automatic C++ objects outside of try blocks that go out of scope because of an exception. Resource leaks and undefined behavior may result when a C++ exception is thrown.
There's a lot there that I don't follow, but I understood that the default mode may result in undefined behavior when a c++ exception is thrown.
So the fix is change -Xclang -fcxx-exceptions
to /EHs
. I built locally with this and verified that there is no crash on trunk or on 78.2.0 (that's just what was on my Windows VM already).
Try build (trunk):
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=653f4dc92d67b8ed6b032163c7f0e326caa20e0d
Comment 39•4 years ago
|
||
Using /EHs fixes the issue in my local build.
Comment 40•4 years ago
|
||
Thanks for the explanation, didn't know about this key/difference in exception handling (oh, those happy times when I worked with single compiler for single OS..).
Some offtopic: Did you have a chance to test whether latest RNP is build ok under Windows/MSVC (without MinGW)?
Assignee | ||
Comment 41•4 years ago
|
||
(In reply to Nickolay Olshevsky from comment #40)
Thanks for the explanation, didn't know about this key/difference in exception handling (oh, those happy times when I worked with single compiler for single OS..).
Some offtopic: Did you have a chance to test whether latest RNP is build ok under Windows/MSVC (without MinGW)?
Not sure what you're asking?
Does it build on a Windows host?
I did not check that, but when this lands in comm-central (probably later today) we have a win64 build job that builds on a Windows host to catch those kinds of problems.
Comment 42•4 years ago
|
||
BTW, Thunderbird updated itself to 78.3.0 release, restored rpn.dll
in process and crashes at start after that. As I missed update, it was scary expirience :)
Replacing rpn.dll
again with special build from this bug helps.
Comment 43•4 years ago
|
||
(In reply to Rob Lemley [:rjl] from comment #41)
(In reply to Nickolay Olshevsky from comment #40)
Some offtopic: Did you have a chance to test whether latest RNP is build ok under Windows/MSVC (without MinGW)?
Not sure what you're asking?
Nickolay, we only used MinGW for my experimental build.
Usually the automated builds use clang-cl, cross-compiling to Windows.
We don't use the MSVC compiler, IIUC, as Rob has explained here:
https://github.com/rnpgp/rnp/issues/1287#issuecomment-685067756
We link to MSVC libraries, but we don't use MSVC compiler for building.
Regarding "latest RNP", we haven't tested newer builds than the 2020-09-13 yet.
Comment 45•4 years ago
|
||
Pushed by kaie@kuix.de:
https://hg.mozilla.org/comm-central/rev/41caf6adc1ed
Use standard c++ exception handling in Botan/RNP. r=kaie
Updated•4 years ago
|
Comment 46•4 years ago
|
||
Comment on attachment 9177536 [details]
Bug 1653884 - Use standard c++ exception handling in Botan/RNP. r=kaie
[Approval Request Comment]
Regression caused by (bug #): none
User impact if declined: users crash when working with correspondents OpenPGP keys
Testing completed (on c-c, etc.): manually
Risk to taking this patch (and alternatives if risky): very little risk, build flags now aligned with upstream
Assignee | ||
Comment 47•4 years ago
|
||
Comment on attachment 9177536 [details]
Bug 1653884 - Use standard c++ exception handling in Botan/RNP. r=kaie
[Triage Comment]
Approved by wsmwk via Matrix for 82.0b1
Assignee | ||
Comment 48•4 years ago
|
||
bugherder uplift |
Thunderbird 82.0b1 build2:
https://hg.mozilla.org/releases/comm-beta/rev/b37594670eee
Comment 49•4 years ago
|
||
Rob, I meant that now it should be possible to build rnp on Windows natively, without MinGW/MSYS2.
Reporter | ||
Comment 50•4 years ago
|
||
Comment on attachment 9177536 [details]
Bug 1653884 - Use standard c++ exception handling in Botan/RNP. r=kaie
[Triage Comment]
Approved for esr78
Comment 51•4 years ago
|
||
Description
•