Closed Bug 848139 Opened 11 years ago Closed 11 years ago

Telemetry for Server Hello size

Categories

(Core :: Security: PSM, enhancement)

x86_64
Linux
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: mcmanus, Assigned: mcmanus)

References

Details

Attachments

(1 file)

overflowing initial TCP send windows with large certificates or chains of intermediaries is a great way to add extra round trips to your SSL handshake.

the tls-cached-info extension is working its way through the standards track as a way to work around that happening repeatedly for the same hosts. There are (at least) patches for it for nss and openssl from agl.

If it solves a real problem we could consider backing that both in firefox and maybe for mod_ssl/apache open source efforts. If the data shows its not much of an issue then we can cross it off the list.

This is a patch to gather telemetry on the problem.

Rather than messing with nss this manages to implement a close enough approximation of what we need from psm - the number of bytes sent from the server before the cert verification callback is made is tracked through a plaintext nspr layer that just counts bytes. That layer is removed at the end of the handshake.
Attached patch patch 0Splinter Review
Attachment #721470 - Flags: review?(bsmith)
Comment on attachment 721470 [details] [diff] [review]
patch 0

Honza, brian suggested you could take this review (and one other) to unblock it.

This patch counts tcp data bytes of the SSL session handshake to figure out how applicable a strategy like tls-cached-info (or variant thereof) would be. I want to have data to inform feature work where its possible.

I hear b2g could make use of a similar counter (that counted data transferred, instead of plaintext inside the tunnel) for application throttling/billing etc.. so this ought to provide a PoC for that too.
Attachment #721470 - Flags: review?(bsmith) → review?(honzab.moz)
Comment on attachment 721470 [details] [diff] [review]
patch 0

Review of attachment 721470 [details] [diff] [review]:
-----------------------------------------------------------------

r=honzab, but not manually tested.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +204,5 @@
> +      PR_GetIdentitiesLayer(mFd, nsSSLIOLayerHelpers::nsSSLPlaintextLayerIdentity);
> +    if (poppedPlaintext) {
> +      PR_PopIOLayer(mFd, nsSSLIOLayerHelpers::nsSSLPlaintextLayerIdentity);
> +      poppedPlaintext->dtor(poppedPlaintext);
> +    }

When we add bytesWritten and leave the layer in, we can get the SSL overhead telemetry, right?  Followup bug?

@@ +453,5 @@
>    if (errorCode) {
>      SetCanceled(errorCode, errorMessageType);
>    }
>  
> +  if (mPlaintextBytesRead && !mHandshakeCompleted && !errorCode) {

Why is !mHandshakeCompleted in the condition?

@@ +2657,5 @@
>    if (layer) {
>      layer->dtor(layer);
>    }
> +  if (plaintextLayer) {
> +    PR_PopIOLayer(fd, PR_TOP_IO_LAYER);

You should pop by the plaintextLayer identity rather.
Attachment #721470 - Flags: review?(honzab.moz) → review+
> >  
> > +  if (mPlaintextBytesRead && !mHandshakeCompleted && !errorCode) {
> 
> Why is !mHandshakeCompleted in the condition?

this code runs asynchronously (bounced from the cert verification threadpool to the socket thread) so its possible the handshake has actually completed at this execution point and encrypted data is flowing.. in which case the counter is going to be inaccurate for measuring server hello. In practice, I have never seen that happen so the pragmatic approach of just throwing those samples out covers it.
(In reply to Patrick McManus [:mcmanus] from comment #4)
> > >  
> > > +  if (mPlaintextBytesRead && !mHandshakeCompleted && !errorCode) {
> > 
> > Why is !mHandshakeCompleted in the condition?
> 
> this code runs asynchronously (bounced from the cert verification threadpool
> to the socket thread) so its possible the handshake has actually completed
> at this execution point and encrypted data is flowing.. in which case the
> counter is going to be inaccurate for measuring server hello. In practice, I
> have never seen that happen so the pragmatic approach of just throwing those
> samples out covers it.

Ok, maybe worth some comment in the code then?
(In reply to Honza Bambas (:mayhemer) from comment #3)

> Why is !mHandshakeCompleted in the condition?

I thought about this some more and !mHandshakeCompleted isn't required. While this code is async from the cert verification thread, this is the function that calls back NSS with the results of the cert verification, and it does it on the socket thread so no other data can be flowing simultaneously.

so I removed it.

thanks.
in an email bsmith gave go ahead to push psm telemetry patches without separate sr

https://hg.mozilla.org/integration/mozilla-inbound/rev/771c895e8b55
https://hg.mozilla.org/mozilla-central/rev/771c895e8b55
Assignee: nobody → mcmanus
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: