Closed Bug 1359987 Opened 7 years ago Closed 6 years ago

Create telemetry to understand HSTS Priming costs / benefits

Categories

(Core :: DOM: Security, enhancement, P2)

51 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: kmckinley, Assigned: kmckinley)

References

Details

(Whiteboard: [domsecurity-active])

Attachments

(1 file)

In order to evaluate the impact to our users, more telemetry needs to be gathered.

Create telemetry to understand, relative to all requests
1) the % of requests that generate an HSTS priming request
2) the % of requests that are maximally pessimal HSTS priming requests, relative to all requests and relative to HSTS priming requests
3) the overall % of requests that get upgraded due to HSTS priming
I believe we should be able to generate an approximate value for #1 by looking at the total number of requests in HTTP_TRANSACTION_IS_SSL and comparing the number or requests which result in priming success, fail, or timeout. The total number of requests which may be held up by HSTS priming includes success, fail, and timeout, plus the results marked as "cached", which indicates a result was returned prior to sending priming. Custom telemetry could also be written for this, which would largely mimic HTTP_TRANSACTION_IS_SSL.

#2 should be able to be generated from HSTS_PRIMING_REQUEST_DURATION for all priming requests, however this telemetry is not currently in the set on atmo or telemetry.mozilla.org. A similar method to #1 could be used to compare to all requests, or it could be folded into custom telemetry for #1.

For #3, additional information would need to be added to the HSTS cache to indicate that the host was added to the cache by HSTS priming. When we have that, the existing HTTP_SCHEME_UPGRADE telemetry could be duplicated and extended to include the source of the upgrade.
Summary: Create telemetry to understand HSTS Priming cost → Create telemetry to understand HSTS Priming costs / benefits
Priority: -- → P2
Whiteboard: [domsecurity-active]
Blocks: 1365432
Comment on attachment 8872820 [details]
Bug 1359987 - Update HSTS priming telemetry  p=francois

https://reviewboard.mozilla.org/r/144334/#review149162

This already looks pretty good in general. Please address my concerns and flag me again once Honza has looked it over. Ultimately Honza will be the better person to review from a Necko point of view.

::: netwerk/base/nsNetUtil.cpp:2583
(Diff revision 1)
> +
> +      if (aLoadInfo->GetForceHSTSPriming()) {
> +        // don't log requests which might be upgraded due to HSTS Priming
> +        // they get logged in nsHttpChannel::OnHSTSPrimingSucceeded or
> +        // nsHttpChannel::OnHSTSPrimingFailed if the load is allowed to proceed.
> +        return NS_OK;

I think *aShouldUpgrade is undefined at this point, please assign a value.

::: netwerk/base/nsNetUtil.cpp:2625
(Diff revision 1)
> +      if (aLoadInfo->GetIsHSTSPrimingUpgrade()) {
> +        // if the upgrade occured due to HSTS priming, it was logged in
> +        // nsHttpChannel::OnHSTSPrimingSucceeded before redirect
> +        return NS_OK;
> +      }
> +    }

I think the same is true for those two returns

::: toolkit/components/telemetry/Histograms.json:10093
(Diff revision 1)
> +    "alert_emails": ["seceng@mozilla.org"],
> +    "bug_numbers": [1359987],
> +    "expires_in_version": "65",
> +    "kind": "enumerated",
> +    "n_values": 10,
> +    "description": "How often does a request result in HSTS priming? (0=No priming, 1=Sent request, 2=Skip priming due to cached HSTS, 3=Skip priming due to cached negative result, 4=Failed to start priming, 5=No load info, 6=Already upgraded"

Please align those values with your enumes. My suggestion:

0=No priming
1=Priming sent,
2=Priming skipped due to cached HSTS,
3=Priming skipped due to cached NO HSTS,
4=Priming failed (request error),
5=Priming skipped (missing loadinfo),
6=Priming skipped (already upgraded)"

In general this makes me wonder:
*) What's the difference between 0 (no priming) and e.g. 2,3,5,6. Maybe 0 should 'Priming skipped (because ....).
*) But then we should it flip that yet slightly around so that 0=Priming sent, and then starting at 1 we should list all the different reasons why it's skipped.
Attachment #8872820 - Flags: review?(ckerschb)
Comment on attachment 8872820 [details]
Bug 1359987 - Update HSTS priming telemetry  p=francois

https://reviewboard.mozilla.org/r/144334/#review149170

a side note: what happens when there is a non-secure request N to a server, a secure request S to the same server happening in parallel, and then the priming request P induced by the non-secure one.  If the S request makes it sooner than P and sets an entry to the HSTS cache, are we unleashing the N request (with either upgrade or block) right away or are we still waiting for P?  Just pointing this out in case someone may find interesting to collect telemetry also for cases when the P request finds an entry in the HSTS cache when done.  I can imagine pages to combine http: and https: scheme to the same host, so that we get priming requests for free (definitely when https: are found in the markup sooner than http:).

::: netwerk/base/nsNetUtil.cpp:2614
(Diff revision 1)
>        }
>      } else {
>        Telemetry::Accumulate(Telemetry::HTTP_SCHEME_UPGRADE, 1);
>      }
>    } else {
> +    if (aLoadInfo) {

nit: 

} else if () {

::: netwerk/protocol/http/HSTSPrimerListener.h:32
(Diff revision 1)
>  
>  /*
> + * How often do we send an HSTS priming request (over all requests)
> + */
> +enum HSTSPrimingRequest {
> +  // no HSTS priming request needed

not needed - means exactly what?

::: netwerk/protocol/http/HSTSPrimerListener.h:40
(Diff revision 1)
> +  eHSTS_PRIMING_REQUEST_SENT = 1,
> +  // channel marked for priming, but already had a cached result
> +  eHSTS_PRIMING_REQUEST_CACHED_HSTS = 2,
> +  // channel marked for priming, but already had a cached result
> +  eHSTS_PRIMING_REQUEST_CACHED_NO_HSTS = 3,
> +  // an error occured sending the priming request

what errors are these exactly?  or is only one of:
- the channel failed (OnStopRequest with a failed status) 
- not getting 200 from the HTTP server
- getting redirected to a different origin

::: netwerk/protocol/http/HSTSPrimerListener.h:44
(Diff revision 1)
> +  eHSTS_PRIMING_REQUEST_CACHED_NO_HSTS = 3,
> +  // an error occured sending the priming request
> +  eHSTS_PRIMING_REQUEST_ERROR = 4,
> +  // the channel had no load info, so is ineligible for priming
> +  eHSTS_PRIMING_REQUEST_NO_LOAD_INFO = 5,
> +  // the request was upgraded before HSTS priming

can you be more specific what this means?

::: netwerk/protocol/http/nsHttpChannel.cpp:485
(Diff revision 1)
>  
>  nsresult
>  nsHttpChannel::TryHSTSPriming()
>  {
>      if (mLoadInfo) {
> +        if(mLoadInfo->GetIsHSTSPriming()) {

if (

::: netwerk/protocol/http/nsHttpChannel.cpp:8615
(Diff revision 1)
>  {
> +    // If "security.mixed_content.use_hsts" is false, record the result of
> +    // HSTS priming and block or proceed with the load as required by
> +    // mixed-content blocking
> +    bool wouldBlock = mLoadInfo->GetMixedContentWouldBlock();
> +    mLoadInfo->ClearHSTSPriming();

please add a comment explaining why you call this.

::: netwerk/protocol/http/nsHttpChannel.cpp:8664
(Diff revision 1)
>   */
>  nsresult
>  nsHttpChannel::OnHSTSPrimingFailed(nsresult aError, bool aCached)
>  {
>      bool wouldBlock = mLoadInfo->GetMixedContentWouldBlock();
> +    mLoadInfo->ClearHSTSPriming();

as well here

::: toolkit/components/telemetry/Histograms.json:10087
(Diff revision 1)
>      "kind": "enumerated",
>      "n_values": 16,
>      "description": "How often do we get back an HSTS priming result which upgrades the connection to HTTPS? 0=cached (no upgrade), 1=cached (do upgrade), 2=cached (blocked), 3=already upgraded, 4=priming succeeded, 5=priming succeeded (block due to pref), 6=priming succeeded (no upgrade due to pref), 7=priming failed (block), 8=priming failed (accept), 9=timeout (block), 10=timeout (accept)"
>    },
> +  "MIXED_CONTENT_HSTS_PRIMING_REQUESTS": {
> +    "record_in_processes": [ "main", "content" ],

is this really to be recorded in content as well?
Comment on attachment 8872820 [details]
Bug 1359987 - Update HSTS priming telemetry  p=francois

https://reviewboard.mozilla.org/r/144334/#review149180
Attachment #8872820 - Flags: review?(honzab.moz) → review+
(In reply to Honza Bambas (:mayhemer) from comment #4)
> Comment on attachment 8872820 [details]
> Bug 1359987 - Update HSTS priming telemetry
> 
> https://reviewboard.mozilla.org/r/144334/#review149170
> 
> a side note: what happens when there is a non-secure request N to a server,
> a secure request S to the same server happening in parallel, and then the
> priming request P induced by the non-secure one.  If the S request makes it
> sooner than P and sets an entry to the HSTS cache, are we unleashing the N
> request (with either upgrade or block) right away or are we still waiting
> for P?  Just pointing this out in case someone may find interesting to
> collect telemetry also for cases when the P request finds an entry in the
> HSTS cache when done.  I can imagine pages to combine http: and https:
> scheme to the same host, so that we get priming requests for free
> (definitely when https: are found in the markup sooner than http:).

Yes, if we see the STS header before we fire off the HSTS priming request, we use that result. There are two places where that could happen. First, if we saw the S request before the N request hits NS_ShouldSecureUpgrade, the request will just get upgraded there, and record eHSTS_PRIMING_REQUEST_ALREADY_UPGRADED in TryHSTSPriming(). If the STS cache is updated between the NS_ShouldSecureUpgrade call and the StartHSTSPriming, then StartHSTSPriming also checks and redirects if an entry exists.

> ::: netwerk/base/nsNetUtil.cpp:2614
> (Diff revision 1)
> >        }
> >      } else {
> >        Telemetry::Accumulate(Telemetry::HTTP_SCHEME_UPGRADE, 1);
> >      }
> >    } else {
> > +    if (aLoadInfo) {
> 
> nit: 
> 
> } else if () {

The if is to protect the two cases where it is HTTPS, but we should not record for HTTP_SCHEME_UPGRADE from null check, not an else if ().
(In reply to Kate McKinley [:kmckinley, :Kate] from comment #6)
> (In reply to Honza Bambas (:mayhemer) from comment #4)
> > Comment on attachment 8872820 [details]
> > Bug 1359987 - Update HSTS priming telemetry
> > 
> > https://reviewboard.mozilla.org/r/144334/#review149170
> > 
> > a side note: what happens when there is a non-secure request N to a server,
> > a secure request S to the same server happening in parallel, and then the
> > priming request P induced by the non-secure one.  If the S request makes it
> > sooner than P and sets an entry to the HSTS cache, are we unleashing the N
> > request (with either upgrade or block) right away or are we still waiting
> > for P?  Just pointing this out in case someone may find interesting to
> > collect telemetry also for cases when the P request finds an entry in the
> > HSTS cache when done.  I can imagine pages to combine http: and https:
> > scheme to the same host, so that we get priming requests for free
> > (definitely when https: are found in the markup sooner than http:).
> 
> Yes, if we see the STS header before we fire off the HSTS priming request,

The question was about when you see the header _during_ the HSTS priming request is in progress.  But that is a micro-optimization only.  I was just curious.

> > } else if () {
> 
> The if is to protect the two cases where it is HTTPS, but we should not
> record for HTTP_SCHEME_UPGRADE from null check, not an else if ().

Ah!  Sorry, yes, overlooked you collect a telemetry there.
Comment on attachment 8872820 [details]
Bug 1359987 - Update HSTS priming telemetry  p=francois

https://reviewboard.mozilla.org/r/144334/#review150102

Thanks Kate. r=me
Attachment #8872820 - Flags: review?(ckerschb) → review+
Comment on attachment 8872820 [details]
Bug 1359987 - Update HSTS priming telemetry  p=francois

Francois, can you please review the new telemetry MIXED_CONTENT_HSTS_PRIMING_REQUESTS? This is collected on every request to tell us whether we sent an HSTS priming request, with a few reasons why why skipped it.
Attachment #8872820 - Flags: feedback?(francois)
Comment on attachment 8872820 [details]
Bug 1359987 - Update HSTS priming telemetry  p=francois

https://reviewboard.mozilla.org/r/144334/#review150380

::: toolkit/components/telemetry/Histograms.json:10099
(Diff revision 2)
>      "n_values": 10,
>      "description": "How often would blocked mixed content be allowed if HSTS upgrades were allowed? 0=display/no-HSTS, 1=display/HSTS, 2=active/no-HSTS, 3=active/HSTS"
>    },
>    "MIXED_CONTENT_HSTS_PRIMING": {
>      "record_in_processes": ["main", "content"],
>      "alert_emails": ["seceng@mozilla.org"],

Since you're updating this probe, let's also fix the email address: seceng-telemetry@mozilla.com

::: toolkit/components/telemetry/Histograms.json:10104
(Diff revision 2)
>      "alert_emails": ["seceng@mozilla.org"],
>      "bug_numbers": [1246540],
>      "expires_in_version": "60",
>      "kind": "enumerated",
>      "n_values": 16,
> -    "description": "How often would blocked mixed content be allowed if HSTS upgrades were allowed, including how often would we send an HSTS priming request? 0=display/no-HSTS, 1=display/HSTS, 2=active/no-HSTS, 3=active/HSTS, 4=display/no-HSTS-priming, 5=display/do-HSTS-priming, 6=active/no-HSTS-priming, 7=active/do-HSTS-priming"
> +    "description": "How often would blocked mixed content be allowed if HSTS upgrades were allowed, including how often would we send an HSTS priming request? 0=passive/HSTS, 1=active/HSTS, 2=passive/no priming, 3=passive/priming, 4=active/no priming, 5=active/priming, 6=passive/will HSTS upgrade, 7=active/will HSTS upgrade"

If you make this change, it will invalidate existing data.

The alternative would be to add a new `MIXED_CONTENT_HSTS_PRIMING_2` probe.

::: toolkit/components/telemetry/Histograms.json:10117
(Diff revision 2)
>      "n_values": 16,
>      "description": "How often do we get back an HSTS priming result which upgrades the connection to HTTPS? 0=cached (no upgrade), 1=cached (do upgrade), 2=cached (blocked), 3=already upgraded, 4=priming succeeded, 5=priming succeeded (block due to pref), 6=priming succeeded (no upgrade due to pref), 7=priming failed (block), 8=priming failed (accept), 9=timeout (block), 10=timeout (accept)"
>    },
> +  "MIXED_CONTENT_HSTS_PRIMING_REQUESTS": {
> +    "record_in_processes": [ "main" ],
> +    "alert_emails": ["seceng@mozilla.org"],

seceng-telemetry@mozilla.com

::: toolkit/components/telemetry/Histograms.json:10119
(Diff revision 2)
>    },
> +  "MIXED_CONTENT_HSTS_PRIMING_REQUESTS": {
> +    "record_in_processes": [ "main" ],
> +    "alert_emails": ["seceng@mozilla.org"],
> +    "bug_numbers": [1359987],
> +    "expires_in_version": "65",

Normally we ask telemetry to be limited to 6 releases (i.e. 62). It can be renewed before it expires if it's still useful.

::: toolkit/components/telemetry/Histograms.json:10126
(Diff revision 2)
> +    "n_values": 10,
> +    "description": "How often does a request result in HSTS priming? (0=Sent HSTS priming, 1=No priming, 2=Priming skipped due to cached HSTS, 3=Priming skipped due to cached NO HSTS, 4=Priming failed (request error), 5=Priming skipped (missing load info), 6=Priming skipped (already upgraded)"
> +  },
>    "HSTS_PRIMING_REQUEST_DURATION": {
>      "record_in_processes": ["main", "content"],
>      "alert_emails": ["seceng-telemetry@mozilla.org"],

seceng-telemetry@mozilla.com (.com not .org)

::: toolkit/components/telemetry/Histograms.json:10128
(Diff revision 2)
> +  },
>    "HSTS_PRIMING_REQUEST_DURATION": {
>      "record_in_processes": ["main", "content"],
>      "alert_emails": ["seceng-telemetry@mozilla.org"],
> -    "bug_numbers": [1311893],
> -    "expires_in_version": "58",
> +    "bug_numbers": [1311893, 1359987],
> +    "expires_in_version": "65",

62?
Attachment #8872820 - Flags: review-
Attachment #8872820 - Flags: feedback?(francois)
Francois, can you have another look at this? I tried to flag you on mozreview, but I'm not sure notifications work right.
Flags: needinfo?(francois)
Comment on attachment 8872820 [details]
Bug 1359987 - Update HSTS priming telemetry  p=francois

https://reviewboard.mozilla.org/r/144334/#review153080

datareview+ with the fixes below

::: toolkit/components/telemetry/Histograms.json:10110
(Diff revision 4)
>      "n_values": 3,
>      "description": "A simple counter of daily mixed-content unblock operations and top documents loaded"
>    },
>    "MIXED_CONTENT_HSTS": {
>      "record_in_processes": ["main", "content"],
>      "alert_emails": ["seceng@mozilla.org"],

While you're at it, can you please update this one too?

::: toolkit/components/telemetry/Histograms.json:10127
(Diff revision 4)
>      "n_values": 16,
> -    "description": "How often would blocked mixed content be allowed if HSTS upgrades were allowed, including how often would we send an HSTS priming request? 0=display/no-HSTS, 1=display/HSTS, 2=active/no-HSTS, 3=active/HSTS, 4=display/no-HSTS-priming, 5=display/do-HSTS-priming, 6=active/no-HSTS-priming, 7=active/do-HSTS-priming"
> +    "description": "How often would blocked mixed content be allowed if HSTS upgrades were allowed, including how often would we send an HSTS priming request? 0=passive/HSTS, 1=active/HSTS, 2=passive/no priming, 3=passive/priming, 4=active/no priming, 5=active/priming, 6=passive/will HSTS upgrade, 7=active/will HSTS upgrade"
>    },
>    "MIXED_CONTENT_HSTS_PRIMING_RESULT": {
> -    "record_in_processes": ["main", "content"],
> -    "alert_emails": ["seceng@mozilla.org"],
> +    "record_in_processes": [ "main" ],
> +    "alert_emails": ["seceng-telemetry@mozilla.org"],

.com
Attachment #8872820 - Flags: review+
(In reply to Kate McKinley [:kmckinley, :Kate] from comment #14)
> Francois, can you have another look at this? I tried to flag you on
> mozreview, but I'm not sure notifications work right.

Yeah, it never showed up on my queue :(

I think the easiest way to do this is to just mark me as a reviewer in MozReview. It doesn't distinguish between the code review and the data review.
Flags: needinfo?(francois)
Pushed by kmckinley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f6bb270bd6cf
Update HSTS priming telemetry r=ckerschb,francois,mayhemer p=francois
https://hg.mozilla.org/mozilla-central/rev/f6bb270bd6cf
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
You need to log in before you can comment on or make changes to this bug.