Closed Bug 1644209 Opened 4 years ago Closed 4 years ago

Intermittent [ FAILED ] Tls13PskTest/Tls13PskTest.ClientVerifyHashType/1, where GetParam() = (1, 4867) (36 ms)

Categories

(NSS :: Test, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: intermittent-bug-filer, Assigned: kjacobs)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Filed by: kjacobs [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=305481936&repo=nss
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/bbpvKSuyRBexxncPtbeVQQ/runs/0/artifacts/public/logs/live_backing.log


Version: DTLS 1.3
server: Changing state from INIT to CONNECTING
client: Changing state from INIT to CONNECTING
handshake old: [92] fefd829c0ba069767415a5d1dd422d2d90eacd2a8e4a32c1e1095e0ffcca45c1...
handshake new: [92] fefd829c0ba069767415a5d1dd422d2d90eacd2a8e4a32c1e1095e0ffcca45c1...
record old: [104] 0200005c000000000000005cfefd829c0ba069767415a5d1dd422d2d90eacd2a...
record new: [104] 0200005c000000000000005cfefd829c0ba069767415a5d1dd422d2d90eacd2a...
server: Filtered packet: [269] 16fefd000000000000000000680200005c000000000000005cfefd829c0ba069...
client: Fatal alert sent: 50
../../gtests/ssl_gtest/tls_agent.cc:803: Failure
Expected equality of these values:
  expected
    Which is: '/' (47, 0x2F)
  alert->description
    Which is: '2' (50, 0x32)
client: Handshake failed with error SSL_ERROR_RX_MALFORMED_SERVER_HELLO: SSL received a malformed Server Hello handshake message.
client: Changing state from CONNECTING to ERROR
[  FAILED  ] Tls13PskTest/Tls13PskTest.ClientVerifyHashType/1, where GetParam() = (1, 4867) (36 ms)

This one seems a bit more reproducible than the other DTLS intermittents: my MacBook can reproduce it after a few hundred iterations. It appears the SelectedCipherSuiteReplacer filter is occasionally clobbering the ServerHello, so the client sends a decode_error alert (50) instead of the illegal_parameter alert (47) that the test expects. The decode_error originates from one of the highlighted lines in [1].

I've confirmed there is no test failure in stream-variant test runs, and removing the filter causes the test to no longer fail in datagram-variant runs.

[1] https://searchfox.org/mozilla-central/rev/35b97af64a55d1d30caa4d6e9fabc1a7fbabc509/security/nss/lib/ssl/ssl3con.c#6748,6777

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INCOMPLETE

This still happens in our Fedora builds of 3.56.

The root cause here is that SelectedCipherSuiteReplacer doesn't expect a [legacy_]session_id in TLS 1.3+. Further, the version check itself doesn't normalize DTLS versions, so DTLS 1.2 compares as greater than TLS 1.3 (`0xfefd > 0x0304`) and the header is mis-parsed.
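To illustrate the version comparison problem, here is a minimal sketch. It is not the NSS code: `NormalizeVersion`, `IsPreTls13Naive`, and `IsPreTls13` are hypothetical names chosen for this example. Only the wire values are taken from the TLS/DTLS specifications (`0xfefd` is also what the ServerHello in the log above starts with).

```cpp
#include <cstdint>

// Wire-format version codes from the TLS and DTLS specifications.
constexpr std::uint16_t kTls11 = 0x0302;
constexpr std::uint16_t kTls12 = 0x0303;
constexpr std::uint16_t kTls13 = 0x0304;
constexpr std::uint16_t kDtls10Wire = 0xfeff;
constexpr std::uint16_t kDtls12Wire = 0xfefd;
constexpr std::uint16_t kDtls13Wire = 0xfefc;

// Hypothetical helper: map a DTLS wire version to its TLS equivalent so
// versions can be compared numerically. TLS values pass through unchanged.
constexpr std::uint16_t NormalizeVersion(std::uint16_t wire) {
  return wire == kDtls10Wire ? kTls11
       : wire == kDtls12Wire ? kTls12
       : wire == kDtls13Wire ? kTls13
       : wire;
}

// Naive check: compares the raw wire value against TLS 1.3.
constexpr bool IsPreTls13Naive(std::uint16_t wire) { return wire < kTls13; }

// Normalized check: compares the TLS-equivalent value.
constexpr bool IsPreTls13(std::uint16_t wire) {
  return NormalizeVersion(wire) < kTls13;
}

// The raw DTLS 1.2 value is numerically larger than 0x0304, so the naive
// check wrongly treats it as "TLS 1.3 or later".
static_assert(!IsPreTls13Naive(kDtls12Wire), "raw 0xfefd is not < 0x0304");
static_assert(IsPreTls13(kDtls12Wire), "normalized DTLS 1.2 is < TLS 1.3");
```

With the naive check, any branch guarded by "version < TLS 1.3" is never taken for a datagram ServerHello, which matches the failure showing up only in datagram-variant runs.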

Normally this produces the same error code that the test looks for, so the failure was not apparent. Patch incoming.

Assignee: nobody → kjacobs.bugzilla
Severity: normal → S4
Status: RESOLVED → REOPENED
Priority: P5 → P1
Resolution: INCOMPLETE → ---

This patch corrects the SelectedCipherSuiteReplacer filter to always parse the session_id variable (legacy_session_id for TLS 1.3+). The previous code attempted to skip it in 1.3+ but did not account for DTLS wire versions, resulting in intermittent failures.
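The approach described above can be sketched roughly as follows. This is an illustration only, not the attached patch: `FindCipherSuiteOffset` and `ReplaceCipherSuite` are hypothetical helpers, and the sketch assumes the input is the ServerHello body starting at the version field. The point is that the cipher suite offset is derived from the session_id length byte in every version, so no version comparison is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// ServerHello body layout: version(2) | random(32) |
// [legacy_]session_id<0..32> with a 1-byte length prefix | cipher_suite(2) | ...
constexpr std::size_t kSessionIdLenOffset = 2 + 32;

// Hypothetical helper: returns true and stores the cipher_suite position in
// *offset, or returns false if the body is too short or the session_id
// length is out of range.
bool FindCipherSuiteOffset(const std::vector<std::uint8_t>& body,
                           std::size_t* offset) {
  if (body.size() <= kSessionIdLenOffset) return false;
  const std::size_t sid_len = body[kSessionIdLenOffset];
  if (sid_len > 32) return false;
  const std::size_t pos = kSessionIdLenOffset + 1 + sid_len;
  if (body.size() < pos + 2) return false;
  assert(pos >= kSessionIdLenOffset + 1);  // never lands on the length byte
  *offset = pos;
  return true;
}

// Hypothetical helper: overwrites the selected cipher suite in place and
// leaves the session_id bytes untouched.
bool ReplaceCipherSuite(std::vector<std::uint8_t>* body, std::uint16_t suite) {
  std::size_t pos = 0;
  if (!FindCipherSuiteOffset(*body, &pos)) return false;
  (*body)[pos] = static_cast<std::uint8_t>(suite >> 8);
  (*body)[pos + 1] = static_cast<std::uint8_t>(suite & 0xff);
  return true;
}
```

Because the session_id length is always read, the same code path handles stream and datagram variants, and an empty session_id simply yields offset 35.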

Status: REOPENED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.59