Closed Bug 1276899 Opened 8 years ago Closed 8 years ago

GTests keep reading after read error

Categories

(NSS :: Test, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ekr, Unassigned)

Details

Attachments

(1 file)

You're not supposed to do this.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
What about DTLS?

See https://public-artifacts.taskcluster.net/HfsKhOR2R2mVLDxOIAN_2w/1/public/logs/live_backing.log

[ RUN      ] DatagramOnly/TlsConnectDatagram.ShortRead/0
Version: DTLS 1.3
rsa: Changing state from INIT to CONNECTING
client: Changing state from INIT to CONNECTING
client: Handshake
client: Would have blocked
rsa: Handshake
rsa: Would have blocked
Poll() waiters = 2 timers = 17
client: Readable
client: Handshake
client: Handshake success
client: Changing state from CONNECTING to CONNECTED
client: Handshake success
client: Readable
client: ReadBytes -1
client: Read error -5998: The operation would have blocked
client: Re-arming
Poll() waiters = 2 timers = 15
rsa: Readable
rsa: Handshake
rsa: Handshake success
rsa: Changing state from CONNECTING to CONNECTED
rsa: Handshake success
Connected with version 772 cipher suite TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
rsa: Writing 1200 bytes
Poll() waiters = 2 timers = 16
client: Readable
client: ReadBytes -1
client: Read error -12155: Not enough room in buffer for DTLS record.
client: Writing 50 bytes
rsa: Writing 50 bytes
Poll() waiters = 2 timers = 17
rsa: Readable
rsa: ReadBytes 50
rsa: Re-arming
Poll() waiters = 2 timers = 17
rsa: Readable
rsa: ReadBytes -1
rsa: Read error -5998: The operation would have blocked
rsa: Re-arming
Poll() waiters = 2 timers = 16
Poll() waiters = 2 timers = 15
Poll() waiters = 2 timers = 14
Poll() waiters = 2 timers = 13
Poll() waiters = 2 timers = 12
Poll() waiters = 2 timers = 11
tls_connect.cc:378: Failure
Value of: client_->received_bytes()
  Actual: 0
Expected: amount
Which is: 50
[  FAILED  ] DatagramOnly/TlsConnectDatagram.ShortRead/0, where GetParam() = 772 (2179 ms)
Flags: needinfo?(ekr)
Maybe that test is invalid, but I think that SSL_ERROR_RX_SHORT_DTLS_READ is recoverable.
Yeah, that seems like a bug in the tests, because we should be recovering in that case.
Flags: needinfo?(ekr)
Is this the sort of thing you are looking for?
Attachment #8759516 - Flags: review?(ekr)
Comment on attachment 8759516 [details] [diff] [review]
bug1276899-1.patch

Review of attachment 8759516 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM. But, why are the tests passing in automation without this?
Attachment #8759516 - Flags: review?(ekr) → review+
It's manifesting as an intermittent.
That's concerning. Any theory on why?
Probably not answering your question, but adding some more information: TlsConnectDatagram.ShortRead tests currently fail almost permanently when run by ASan builds on Taskcluster. This seems be due to a timing issue because ASan debug builds and the docker host are both somewhat slow. I can reproduce it locally by running ssl_gtests (built with USE_ASAN=1) in the Taskcluster docker image limited to 50% of a single core. With mt's patch above the failures are gone.
Yeah, my question is "Why does it ever work"
Well, your change only stops the polling from happening.  It doesn't block future attempts to call ReadBytes().  Reads also happen as a result of the handshake timer.  Maybe in those builds where it succeeds it is reading as a consequence of the handshake timer.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: