Closed Bug 1276192 Opened 8 years ago Closed 8 years ago

UAF in TlsAgent cancelling timer

Categories

(NSS :: Test, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mt, Assigned: mt)

Details

Attachments

(3 files, 1 obsolete file)

Attached patch tmpSplinter Review
See https://public-artifacts.taskcluster.net/BzlGtG_FRfieTdtQC40I0A/0/public/logs/live_backing.log

The reason that this happens is a little convoluted.  In DTLS tests, we set a timer based on whatever DTLS_GetHandshakeTimeout() says.  Before that, we cancel the old timer, but fail to null it.  If DTLS_GetHandshakeTimeout() fails, we fail to update the timer_handle_ value in the TlsAgent.  Because when the timer runs, it deletes the object we reference.  But we still have a handle and we then try to cancel again when the TlsAgent is destructed.

Note that DTLS_GetHandshakeTimeout() fails often because it fails when we don't have a timeout set.  That's all the time because once the handshake is complete, only the side that has a holddown timer has any timers running.

Null the pointer when we cancel and it should be OK.
Attachment #8757256 - Flags: review?(ttaubert)
Summary: UAF in tests → UAF in TlsAgent cancelling timer
Comment on attachment 8757256 [details] [diff] [review]
tmp

Review of attachment 8757256 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8757256 - Flags: review?(ttaubert) → review+
This still fails for me locally with the same error

==29194==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030000729b0 at pc 0x000000428401 bp 0x7ffc82d6b6e0 sp 0x7ffc82d6b6d0
WRITE of size 8 at 0x6030000729b0 thread T0
    #0 0x428400 in nss_test::Poller::Timer::Cancel() /home/franziskus/Code/nss/external_tests/ssl_gtest/test_io.h:105
    #1 0x4290ef in nss_test::Timeout::Cancel() /home/franziskus/Code/nss/external_tests/ssl_gtest/gtest_utils.h:32
    #2 0x42905b in nss_test::Timeout::~Timeout() /home/franziskus/Code/nss/external_tests/ssl_gtest/gtest_utils.h:24
    #3 0x4bedf3 in nss_test::TlsConnectTestBase::Receive(unsigned long) /home/franziskus/Code/nss/external_tests/ssl_gtest/tls_connect.cc:376
    #4 0x4bec72 in nss_test::TlsConnectTestBase::SendReceive() /home/franziskus/Code/nss/external_tests/ssl_gtest/tls_connect.cc:372
    #5 0x48d357 in nss_test::TlsCipherSuiteTest_ConnectWithCipherSuite_Test::TestBody() /home/franziskus/Code/nss/external_tests/ssl_gtest/ssl_ciphersuite_unittest.cc:77
    #6 0x505748 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) gtest/src/gtest.cc:2362
    #7 0x4fc75b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) gtest/src/gtest.cc:2398
    #8 0x4dc503 in testing::Test::Run() gtest/src/gtest.cc:2435
    #9 0x4dd0ab in testing::TestInfo::Run() gtest/src/gtest.cc:2610
    #10 0x4dd945 in testing::TestCase::Run() gtest/src/gtest.cc:2728
    #11 0x4e84d0 in testing::internal::UnitTestImpl::RunAllTests() gtest/src/gtest.cc:4591
    #12 0x506f70 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) gtest/src/gtest.cc:2362
    #13 0x4fde05 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) gtest/src/gtest.cc:2398
    #14 0x4e6159 in testing::UnitTest::Run() gtest/src/gtest.cc:4209
    #15 0x495131 in RUN_ALL_TESTS() ../../external_tests/google_test/gtest/include/gtest/gtest.h:2304
    #16 0x49528b in main /home/franziskus/Code/nss/external_tests/ssl_gtest/ssl_gtest.cc:33
    #17 0x7fe908fc3740 in __libc_start_main (/usr/lib/libc.so.6+0x20740)
    #18 0x410208 in _start (/home/franziskus/Code/nss/external_tests/ssl_gtest/Linux4.5_x86_64_cc_glibc_PTH_64_ASAN_DBG.OBJ/ssl_gtest+0x410208)
Flags: needinfo?(martin.thomson)
What do you need to run with ASAN and I'll try it myself.  I want to use rr.

BTW, I checked it in anyway: https://hg.mozilla.org/projects/nss/rev/d0f1c1030393
Attached patch tmp2 (obsolete) — Splinter Review
Franziskus, that's a different error to the one I looked at.  That's a worse one in a sense.  The problem there is that the timeout handle is being hit after it timed out and was deleted.  Try this please.
Flags: needinfo?(martin.thomson)
Attachment #8757265 - Flags: review?(franziskuskiefer)
Attachment #8757265 - Attachment is patch: true
Attachment #8757265 - Attachment mime type: text/x-patch → text/plain
Assignee: nobody → martin.thomson
Attached patch tmp3Splinter Review
This is nicer, I think.
Attachment #8757265 - Attachment is obsolete: true
Attachment #8757265 - Flags: review?(franziskuskiefer)
Attachment #8757267 - Flags: review?(franziskuskiefer)
(In reply to Martin Thomson [:mt:] from comment #3)
> What do you need to run with ASAN and I'll try it myself.  I want to use rr.
> 
> BTW, I checked it in anyway:
> https://hg.mozilla.org/projects/nss/rev/d0f1c1030393

yeah, I think there're multiple errors.

You can build and run tests with USE_ASAN=1
Comment on attachment 8757267 [details] [diff] [review]
tmp3

Review of attachment 8757267 [details] [diff] [review]:
-----------------------------------------------------------------

While this is looking good to me it make gtests fail. And this time not because of some ASAN error but an actual test error (but only when built with ASAN).
[  FAILED  ] CipherSuiteAEAD12/TlsCipherSuiteTest.ConnectWithCipherSuite/7, where GetParam() = ("TLS", 771, 49188)
[  FAILED  ] CipherSuiteAEAD12/TlsCipherSuiteTest.ConnectWithCipherSuite/8, where GetParam() = ("TLS", 771, 49192)
[  FAILED  ] CipherSuiteAEAD12/TlsCipherSuiteTest.ConnectWithCipherSuite/16, where GetParam() = ("DTLS", 771, 49188)
[  FAILED  ] CipherSuiteAEAD12/TlsCipherSuiteTest.ConnectWithCipherSuite/17, where GetParam() = ("DTLS", 771, 49192)
Attachment #8757267 - Flags: review?(franziskuskiefer)
franziskus, I get no failures.  Tons of leaks at the end, but no failures.  Did you make clean?

So many leaks I can't even copy and paste them.
Flags: needinfo?(franziskuskiefer)
This seems to be the most common stack trace for a leak.

Direct leak of 40032 byte(s) in 834 object(s) allocated from:
    #0 0x7fd767461f30 in calloc (/lib64/libasan.so.3+0xc6f30)
    #1 0x7fd76667f901 in PR_Calloc ../../../../pr/src/malloc/prmem.c:443
    #2 0x7fd7666698b7 in PR_CreateIOLayerStub ../../../../pr/src/io/prlayer.c:447
    #3 0x4a333a in nss_test::DummyPrSocket::CreateFD(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, nss_test::Mode) /home/martin/code/nss/external_tests/ssl_gtest/test_io.cc:292
    #4 0x4acb33 in nss_test::TlsAgent::Init() /home/martin/code/nss/external_tests/ssl_gtest/tls_agent.h:65
Given the error that franziskus was seeing seems legitimate, I think that I'd still like to make the change in Attachment #8757267 [details] [diff].
running USE_ASAN=1 make nss_clean_all nss_build_all and USE_ASAN=1 NSS_TESTS="ssl_gtests" NSS_CYCLES="standard" ./all.sh with Attachment #8757267 [details] [diff] gives me the errors from comment 7.

Looking at the output the first failure is the same as without the patch, but the without leaking. So it looks to me like the patch fixes the leak but not the test failure.

The cipher suites 49188 and 49192 were introduced in bug 923089. I'll do some debugging and see if I can figure out what's going wrong.
Flags: needinfo?(franziskuskiefer)
The problem appears to be the build. We don't link ssl_gtest against libsoftokn, which makes my machine use the system libsoftokn, which obviously can't handle the sha384 case. Adding |$(DIST)/lib/$(LIB_PREFIX)softokn3.$(DLL_SUFFIX)| to the EXTRA_LIBS in the ssl_gtest manifest.mn solves this. Now I also see leaks similar to the ones in bug 1276612.
Attachment #8758165 - Flags: review?(martin.thomson)
Attachment #8758165 - Flags: review?(martin.thomson) → review+
Franziskus, should we also link to libnss?
Comment on attachment 8757267 [details] [diff] [review]
tmp3

Review of attachment 8757267 [details] [diff] [review]:
-----------------------------------------------------------------

I assume that it's OK now.
Attachment #8757267 - Flags: review?(franziskuskiefer)
Comment on attachment 8757267 [details] [diff] [review]
tmp3

Review of attachment 8757267 [details] [diff] [review]:
-----------------------------------------------------------------

yeah, this patch seems to fix a couple of leaks and works for me with the fixed manifest.

> Franziskus, should we also link to libnss?
hm, that's a good question. I actually assumed that adding things to REQUIRE would link against the newly built libs. But it seems it doesn't (or only statically?). I think we should land this as is and file a follow up to investigate how our build system works and if there are more pitfalls like this.
Attachment #8757267 - Flags: review?(franziskuskiefer) → review+
Both patches landed.  Let's see what happens to the leak reporting now...

https://hg.mozilla.org/projects/nss/rev/d406bc66ef47
https://hg.mozilla.org/projects/nss/rev/658c53ba58c1
That's enough changes for this bug.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.25
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: