Closed Bug 1978603 Opened 11 months ago Closed 6 months ago

Assertion failure: timer->cb == NULL, at ../../lib/ssl/dtlscon.c:922

Categories

(NSS :: Libraries, defect, P2)

Tracking

(firefox-esr115 wontfix, firefox-esr140 wontfix, firefox146 wontfix, firefox147 wontfix, firefox148 fixed)

RESOLVED FIXED
Tracking Status
firefox-esr115 --- wontfix
firefox-esr140 --- wontfix
firefox146 --- wontfix
firefox147 --- wontfix
firefox148 --- fixed

People

(Reporter: anna.weine, Assigned: anna.weine)

References

(Blocks 1 open bug)

Details

(Keywords: sec-audit, Whiteboard: [fuzzblocker] [adv-main148-])

Attachments

(1 file, 2 obsolete files)

48 bytes, text/x-phabricator-request
Details | Review

OSS-Fuzz: https://oss-fuzz.com/testcase-detail/4899719191265280

INFO: found LLVMFuzzerCustomMutator (0x5d3d99562a80). Disabling -len_control by default.
	INFO: Running with entropic power schedule (0xFF, 100).
	INFO: Seed: 2703421472
	INFO: Loaded 1 modules   (104398 inline 8-bit counters): 104398 [0x5d3d9a0969f0, 0x5d3d9a0b01be),
	INFO: Loaded 1 PC tables (104398 PCs): 104398 [0x5d3d9a0b01c0,0x5d3d9a247ea0),
	/mnt/scratch0/clusterfuzz/bot/builds/clusterfuzz-builds_nss_c7f62f1d50ead7b467de2ea6701001cb50f4098f/revisions/dtls-server: Running 1 inputs 100 time(s) each.
	Running: /mnt/scratch0/clusterfuzz/bot/inputs/fuzzer-testcases/e5d5e213f52f1275a6aa30470ad4d5b74ca082d5e5d4173cdff50855489crash
	Assertion failure: timer->cb == NULL, at ../../lib/ssl/dtlscon.c:922
	AddressSanitizer:DEADLYSIGNAL
	=================================================================
	==1293==ERROR: AddressSanitizer: ABRT on unknown address 0x05390000050d (pc 0x7cb175a3f00b bp 0x7ffd4b2c8a70 sp 0x7ffd4b2c87f0 T0)
	    #0 0x7cb175a3f00b in raise /build/glibc-LcI20x/glibc-2.31/sysdeps/unix/sysv/linux/raise.c:51:1
	    #1 0x7cb175a1e858 in abort /build/glibc-LcI20x/glibc-2.31/stdlib/abort.c:79:7
	    #2 0x5d3d99da8fbb in PR_Assert nspr/pr/src/io/prlog.c:556:3
	    #3 0x5d3d99649c49 in dtls_StartTimer nss/lib/ssl/dtlscon.c:922:5
	    #4 0x5d3d99649c49 in dtls_StartRetransmitTimer nss/lib/ssl/dtlscon.c:950:12
	    #5 0x5d3d99649c49 in dtls_FlushHandshakeMessages nss/lib/ssl/dtlscon.c:594:18
	    #6 0x5d3d99652006 in dtls13_MaybeSendKeyUpdate nss/lib/ssl/dtls13con.c:828:10
	    #7 0x5d3d995fb7e0 in tls13_SendKeyUpdate nss/lib/ssl/tls13con.c:1028:14
	    #8 0x5d3d995d04bc in tls13_CheckKeyUpdate nss/lib/ssl/sslsecur.c:827:14
	    #9 0x5d3d995cf7b0 in ssl_SecureRecv nss/lib/ssl/sslsecur.c:874:13
	    #10 0x5d3d995edda6 in ssl_Read nss/lib/ssl/sslsock.c:3283:10
	    #11 0x5d3d995641cc in TlsCommon::DoHandshake(PRFileDesc*, bool) nss/fuzz/targets/lib/tls/common.cc:66:18
	    #12 0x5d3d9956236f in LLVMFuzzerTestOneInput nss/fuzz/targets/tls_server.cc:87:3
	    #13 0x5d3d99d2d740 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:614:13
	    #14 0x5d3d99d189b5 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:327:6
	    #15 0x5d3d99d1e44f in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:862:9
	    #16 0x5d3d99d496f2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
	    #17 0x7cb175a20082 in __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/libc-start.c:308:16
	    #18 0x5d3d9948997d in _start
	
	AddressSanitizer can not provide additional info.
	SUMMARY: AddressSanitizer: ABRT (/lib/x86_64-linux-gnu/libc.so.6+0x4300b) (BuildId: 0702430aef5fa3dda43986563e9ffcc47efbd75e)
	==1293==ABORTING
	
Summary: Assertion failure: timer->cb == NULL, at ../../lib/ssl/dtlscon.c:918 → Assertion failure: timer->cb == NULL, at ../../lib/ssl/dtlscon.c:922
Severity: -- → S2
Priority: -- → P2
Attached file (secure) (obsolete) —

So, what happens in the bug is the following:

Server sends 2 handshake messages at the same time. Each of them starts a retransmit timer.

Pretty much, it raises a couple of questions:

Why SendSessionTicket is disabled here (https://searchfox.org/mozilla-central/source/security/nss/lib/ssl/tls13con.c#6205), whereas it's still reacheable? See the test patch.
Do we need to enable it? Even for consistency?

More general question, should we disable calling KeyUpdate (or any other post-handshake handshake message) until we get an ACK for the previous (or any) post-handshake handshake message? Should we also document it in dtls1.3 beta RFC? :D

And if the answer to the previous question is 'no', should we be less strict about the timers? Instead of asserting that the only one timer of the same type is in progress, should we 'restart' the timer when a new timer request has arrived?

I believe that we might want to have just one handshake message in progress at a time, thus we won't need to have a more complicated timer logic (and this would also let us catching bugs using fuzzers).

Any opinion?

Flags: needinfo?(mt)
Flags: needinfo?(djackson)
Assignee: nobody → anna.weine
Status: NEW → ASSIGNED

I think we can get away with only allowing one outgoing post-handshake message at a time unless we have a use for session tickets / key updates / post-handshake client auth in the WebRTC setting.

Rather than rely entirely on the fuzzer, this might be worth writing an exhaustive test case for this (e.g. a nested loop over the different types of post-handshake message and trying to send both without waiting for an ACK).

The advice in 5.8.4 of the RFC seems reasonably clear to me. I think in the general case, folks are going to want to be able to handle of these messages independently and that would point to having a more complex state machine, but I don't know that we need to pay that cost now.

Flags: needinfo?(djackson)

I think we can get away with only allowing one outgoing post-handshake message at a time unless we have a use for session tickets / key updates / post-handshake client auth in the WebRTC setting.

We can send several, but the message HS2 will wait until the ACK of HS1 arrives before actually sending it.

e. I think in the general case, folks are going to want to be able to handle of these messages independently and that would point to having a more complex state machine, but I don't know that we need to pay that cost now.

I'm not sure that introducing several state machines is worth doing. It immediately increases the complexity (and a headache) of designing/debugging the system. You might need to care about communicating between two state machines.

Group: crypto-core-security

I'm not following the test, but I would instead say that -- for DTLS -- we reject attempts to start one or other if the timer is running.

Key update is tricky here, but we can keep retrying that one.

As Dennis suggests, it would be good to run the full matrix of post handshake messages (these two, plus authentication covers it) from both sides in tests.

Flags: needinfo?(mt)

(In reply to Martin Thomson [:mt:] from comment #6)

I'm not following the test, but I would instead say that -- for DTLS -- we reject attempts to start one or other if the timer is running.

The test is simple, 2 post-handshake messages at the same time. I will, obviously, implement a better test, but it's simpler to look at the test compared to the fuzzer info.

As Dennis suggests, it would be good to run the full matrix of post handshake messages (these two, plus authentication covers it) from both
sides
in tests.

Yes, I totally agree here.

I don't understand the security implications of comment 5 where this was turned into a security bug. Can you suggest a sec- rating for this? Or is it more of a an investigative "keep it hidden while we investigate" caution?

Flags: needinfo?(anna.weine)

It's the latter, yes.

what'd be the appropriate rating? sec-other?

Flags: needinfo?(anna.weine)

sec-audit is closer to this scenario. sec-other is more for bugs where we know that bug itself is definitely not a security problem, but it's related to/mentions/might reveal information about another bug that is a vulnerability. And then usually there's a comment that explains that relationship so we know when it's eventually safe to unhide it. Sometimes that can be implied by relations like "depends on"/blocks or "see also", but an explicit comment is generally better.

Keywords: sec-audit
Attached file (secure) (obsolete) —
Attached file (secure)
Attachment #9517125 - Attachment is obsolete: true
Attachment #9502179 - Attachment description: WIP: Bug 1978603 - DTLS: SESSION_TICKET + KeyUpdate timer failure test → (secure)
Attachment #9502179 - Attachment is obsolete: true

Pushed by nkulatova@mozilla.com:
https://hg.mozilla.org/projects/nss/rev/c1b5aed1a9d6
Allowing RT be started several times r=djackson

Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Group: crypto-core-security → core-security-release
QA Whiteboard: [sec] [qa-triage-done-c149/b148]
Whiteboard: [fuzzblocker] → [fuzzblocker] [adv-main148-]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: