Telemetry experiment to measure TLS 1.3 success/performance

Status: NEW
Assignee: Unassigned
Product: Firefox Health Report
Component: Client: Desktop
Opened: 10 months ago
Last modified: 5 days ago

People
(Reporter: ekr, Unassigned)

Tracking
Version: unspecified
Points: ---
Firefox Tracking Flags: (firefox52+ wontfix)

Attachments
(1 attachment, 1 obsolete attachment)

(Reporter)

Description

10 months ago
We want to do a telemetry experiment to measure TLS 1.3 success against a known-good server. This gives us two things:

1. Sanity checking that it works reliably under ideal conditions.
2. A measure of adverse network conditions (middleboxes, etc.) which might cause problems.

The initial experiment is very simple and will be run after we enable TLS 1.3 by default in Nightly:

- Connect to two hosts, one with TLS 1.3 and one without, both operated by Cloudflare on our behalf.
- Measure success rates, timing, etc.
- Report the results in a custom telemetry ping.
- Disable the experiment.

Ideally we would run it on the entire (not very large) Nightly population.
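
Roughly, the flow in bootstrap.js looks like this (a sketch, not the actual patch; makeRequest, report and disable are the helpers reviewed below):

function runExperiment(urls) {
  // Fetch every test URL, submit a single custom ping with the per-host
  // results, then uninstall the experiment so it never runs again.
  let todo = urls.map((url, i) => makeRequest(i, url, null));
  return Promise.all(todo)
    .then(results => report(results))  // submits the "tls-13-study-v1" ping
    .then(() => disable());            // experiment removes itself
}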
(Reporter)

Comment 1

10 months ago
Created attachment 8801356 [details] [diff] [review]
0001-Bug-1310338-TLS-1.3-telemetry-experiment.patch
(Reporter)

Comment 2

10 months ago
Comment on attachment 8801356 [details] [diff] [review]
0001-Bug-1310338-TLS-1.3-telemetry-experiment.patch

Review of attachment 8801356 [details] [diff] [review]:
-----------------------------------------------------------------

Note: I know that the start/stop times are wrong. I will fix them before landing, but I don't have a firm timeline yet.
Attachment #8801356 - Flags: review?(benjamin)

Comment 3

10 months ago
Comment on attachment 8801356 [details] [diff] [review]
0001-Bug-1310338-TLS-1.3-telemetry-experiment.patch

>diff --git a/experiments/tls13-compat-nightly52/code/bootstrap.js b/experiments/tls13-compat-nightly52/code/bootstrap.js

>+const kUNKNOWN_ERROR = 0x80040111; // NS_ERROR_NOT_AVAILABLE

This doesn't appear to be used anywhere else. There's a kPR_UNKNOWN_ERROR below though? Can we use Components.results.NS_ERROR_NOT_AVAILABLE as a constant here?
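
For example, a one-line sketch of that change:

const kUNKNOWN_ERROR = Components.results.NS_ERROR_NOT_AVAILABLE;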

>+function report(result) {
>+  console.log("Result");
>+  console.log(result);
>+  
>+  return TelemetryController.submitExternalPing(
>+    "tls-13-study-v1",
>+    {
>+      results: result
>+    }, {});

Somewhere, we need to have documented the format of this ping for data review. I think an .md or .rst file checked in here would be fine. It's really hard for me reading this code to know exactly what "results" will contain, and we want the data docs to be explicit.

Are you recording anything about HTTP proxy config, offline state, or captive portal state? Are you worried about any of those significantly skewing the results? Do you want to/can you record the resolved IP address of the request to help diagnose the frequency of direct connections versus captive portal?

>+function disable() {
>+  Experiments.instance().disableExperiment("FROM_API");
>+}

So the desired behavior here is for the experiment to install, immediately do the network requests, send the ping, and then uninstall? It might be worth documenting that flow, and the edge cases of what should happen in errors.

>+function startup(data, reason) {
>+  let todo = [];
>+  let shuffled = shuffleArray(kURLs);
>+  
>+  if (reason != ADDON_ENABLE) {
>+    return Promise.resolve(true);
>+  }

I'm a little worried about whether we send ADDON_ENABLE notifications reliably for experiments. Felipe or Georg, do you know if we have either manual or automated testing that verifies the sequence of startup events for experiments?

>+  
>+  for (var i in shuffled) {
>+    todo.push(makeRequest(i, shuffled[i], null ));
>+  }
>+
>+  return Promise.all(todo)
>+    .then(result => report(result))
>+    .catch(e => Cu.reportError(e))

You don't need any telemetry for the error case? I figured you'd want to send an error ping.

Also, in case there are bugs, recording the telemetry clientId with this ping can help with deduping. Is there a reason this is sensitive enough that we wouldn't want to record the clientId?
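
For reference, a minimal sketch of report() with the client id attached (assuming the standard TelemetryController options; not the actual patch):

function report(result) {
  // addClientId/addEnvironment ask TelemetryController to attach the
  // telemetry client id and environment block to this custom ping.
  return TelemetryController.submitExternalPing(
    "tls-13-study-v1",
    { results: result },
    { addClientId: true, addEnvironment: true });
}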
Flags: needinfo?(gfritzsche)
Flags: needinfo?(felipc)
Flags: needinfo?(ekr)
Attachment #8801356 - Flags: review?(benjamin) → review-

Comment 4

10 months ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #3)
> >+function startup(data, reason) {
> >+  let todo = [];
> >+  let shuffled = shuffleArray(kURLs);
> >+  
> >+  if (reason != ADDON_ENABLE) {
> >+    return Promise.resolve(true);
> >+  }
> 
> I'm a little worried about whether we send ADDON_ENABLE notifications
> reliably for experiments. Felipe or Georg, do you know if we have either
> manual or automated testing that verifies the sequence of startup events for
> experiments?

No, actually the startup events for experiments are known to be buggy (i.e., different from the convention). See here for some code that we use in every experiment:
http://hg.mozilla.org/webtools/telemetry-experiment-server/file/tip/experiments/e10s-beta46-noapz/code/bootstrap.js#l25

Since you only need to run this once, I recommend using the install() function, which will work correctly.
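
For example (a sketch based on the patch above, with the one-shot work moved out of startup()):

function install(data, reason) {
  // install() fires reliably when the experiment add-on is installed,
  // unlike ADDON_ENABLE in startup(), so no reason check is needed here.
  let todo = [];
  let shuffled = shuffleArray(kURLs);
  for (var i in shuffled) {
    todo.push(makeRequest(i, shuffled[i], null));
  }
  return Promise.all(todo)
    .then(result => report(result))
    .then(() => disable())              // uninstall immediately afterwards
    .catch(e => Cu.reportError(e));
}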
Flags: needinfo?(felipc)
Flags: needinfo?(gfritzsche)
(Reporter)

Comment 5

10 months ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #3)
> Comment on attachment 8801356 [details] [diff] [review]
> 0001-Bug-1310338-TLS-1.3-telemetry-experiment.patch
> 
> >diff --git a/experiments/tls13-compat-nightly52/code/bootstrap.js b/experiments/tls13-compat-nightly52/code/bootstrap.js
> 
> >+const kUNKNOWN_ERROR = 0x80040111; // NS_ERROR_NOT_AVAILABLE
> 
> This doesn't appear to be used anywhere else. There's a kPR_UNKNOWN_ERROR
> below though? Can we use Components.results.NS_ERROR_NOT_AVAILABLE as a
> constant here?

Sure, that should be fine. Do I need to pull in Components.results specifically?


> 
> >+function report(result) {
> >+  console.log("Result");
> >+  console.log(result);
> >+  
> >+  return TelemetryController.submitExternalPing(
> >+    "tls-13-study-v1",
> >+    {
> >+      results: result
> >+    }, {});
> 
> Somewhere, we need to have documented the format of this ping for data
> review. I think an .md or .rst file checked in here would be fine. It's
> really hard for me reading this code to know exactly what "results" will
> contain, and we want the data docs to be explicit.

I will add something.


> Are you recording anything about HTTP proxy config, offline state, or
> captive portal state? Are you worried about any of those significantly
> skewing the results? Do you want to/can you record the resolved IP address
> of the request to help diagnose the frequency of direct connections versus
> captive portal?

I don't think that's necessary. Given that we know we just installed
the experiment, I think it's reasonably likely we have a clear connection.
With that said, if we see a big delta we may need to revisit.



> >+function disable() {
> >+  Experiments.instance().disableExperiment("FROM_API");
> >+}
> 
> So the desired behavior here is for the experiment to install, immediately
> do the network requests, send the ping, and then uninstall? It might be
> worth documenting that flow, and the edge cases of what should happen in
> errors.

I added some text. Not sure what you mean about edge cases.


> >+function startup(data, reason) {
> >+  let todo = [];
> >+  let shuffled = shuffleArray(kURLs);
> >+  
> >+  if (reason != ADDON_ENABLE) {
> >+    return Promise.resolve(true);
> >+  }
> 
> I'm a little worried about whether we send ADDON_ENABLE notifications
> reliably for experiments. Felipe or Georg, do you know if we have either
> manual or automated testing that verifies the sequence of startup events for
> experiments?

I will use "install" as suggested by Georg. Should I still be calling disable()
> 
> >+  
> >+  for (var i in shuffled) {
> >+    todo.push(makeRequest(i, shuffled[i], null ));
> >+  }
> >+
> >+  return Promise.all(todo)
> >+    .then(result => report(result))
> >+    .catch(e => Cu.reportError(e))
> 
> You don't need any telemetry for the error case? I figured you'd want to
> send an error ping.

We should record XHR errors as errors in the results struct. And if the report
function fails, then that suggests we can't report.
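
Something along these lines (a sketch only; the HEAD request, timeout value, and field names are illustrative, not the actual makeRequest):

// bootstrap.js would need: Cu.importGlobalProperties(["XMLHttpRequest"]);
function makeRequest(index, url, unused) {
  return new Promise(resolve => {
    let req = new XMLHttpRequest();
    let start = Date.now();
    req.open("HEAD", url);
    req.timeout = 30000;  // illustrative 30s timeout
    // Resolve on failure too, so errors land in the results struct instead
    // of rejecting the Promise.all() that drives the experiment.
    req.onload = () => resolve({ index, url, status: req.status,
                                 elapsed: Date.now() - start });
    req.onerror = () => resolve({ index, url, error: req.channel ?
                                  req.channel.status : kPR_UNKNOWN_ERROR });
    req.ontimeout = () => resolve({ index, url, error: "timeout" });
    req.send();
  });
}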


> Also, in case there are bugs, recording the telemetry clientId with this
> ping can help with deduping. Is there a reason this is sensitive enough that
> we wouldn't want to record the clientId?

No, I just didn't think I needed it. Happy to do it.
Flags: needinfo?(ekr)

Comment 6

10 months ago
> Sure, that should be fine. Do I need to pull in Components.results
> specifically?

No.

> > >+function disable() {
> > >+  Experiments.instance().disableExperiment("FROM_API");
> > >+}
> > 
> > So the desired behavior here is for the experiment to install, immediately
> > do the network requests, send the ping, and then uninstall? It might be
> > worth documenting that flow, and the edge cases of what should happen in
> > errors.
> 
> I added some text. Not sure what you mean about edge cases.

I mean if one or more of the XHRs failed to connect at all, or timed out, or some other error caused those promises to reject, would we want to repeat the request again at next startup?

Mainly I want to make sure that the implemented behavior here matches what you actually want.

> I will use "install" as suggested by Georg. Should I still be calling
> disable()

Probably, since you want the experiment to immediately uninstall.
(Reporter)

Comment 7

10 months ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #6)
> > Sure, that should be fine. Do I need to pull in Components.results
> > specifically?
> 
> No.
> 
> > > >+function disable() {
> > > >+  Experiments.instance().disableExperiment("FROM_API");
> > > >+}
> > > 
> > > So the desired behavior here is for the experiment to install, immediately
> > > do the network requests, send the ping, and then uninstall? It might be
> > > worth documenting that flow, and the edge cases of what should happen in
> > > errors.
> > 
> > I added some text. Not sure what you mean about edge cases.
> 
> I mean if one or more of the XHRs failed to connect at all, or timed out, or
> some other error caused those promises to reject, would we want to repeat
> the request again at next startup?
> 
> Mainly I want to make sure that the implemented behavior here matches what
> you actually want.

Good point, but I think on balance that will cause more trouble than it's worth.

I actually specifically want to catch XHR failures.


> 
> > I will use "install" as suggested by Georg. Should I still be calling
> > disable()
> 
> Probably, since you want the experiment to immediately uninstall.
(Reporter)

Comment 8

10 months ago
Created attachment 8802704 [details] [diff] [review]
0001-Bug-1310338-TLS-1.3-telemetry-experiment.patch
Attachment #8801356 - Attachment is obsolete: true
(Reporter)

Comment 9

10 months ago
Comment on attachment 8802704 [details] [diff] [review]
0001-Bug-1310338-TLS-1.3-telemetry-experiment.patch

Review of attachment 8802704 [details] [diff] [review]:
-----------------------------------------------------------------

See the revised patch.
Attachment #8802704 - Flags: review?(benjamin)
(Reporter)

Comment 10

10 months ago
Question:
We may want to update this experiment later with more probes. What does that look like?
(Reporter)

Comment 11

10 months ago
Ping for re-review.
Flags: needinfo?(benjamin)

Comment 12

10 months ago
While the experiment is installed, you can update it by deploying a new version to the experiment server. You are likely to get a new install() invocation on upgrade. Once a particular experiment has been uninstalled, it won't ever be reinstalled/rerun for that user.
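
For example (a sketch; the reason constants are the standard bootstrap ones):

function install(data, reason) {
  if (reason == ADDON_UPGRADE) {
    // A redeployed version of the experiment with extra probes would
    // presumably land here.
  }
  // reason == ADDON_INSTALL on the first install for this profile.
}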
Flags: needinfo?(benjamin)

Updated

10 months ago
Attachment #8802704 - Flags: review?(benjamin) → review+

Comment 13

10 months ago
Eric, if you need QA assistance with getting this tested please ping RyanVM and he can help find a QA lead.
(Reporter)

Comment 14

10 months ago
Landed as: https://hg.mozilla.org/webtools/telemetry-experiment-server/rev/2efaab39bba4114ffc3f6f257e9f11f900c68701
(Reporter)

Comment 15

10 months ago
Bsmedberg,

I accidentally left this at 10% of the population, but I'd like to actually run it on the whole Nightly population (we need a big sample to get significance). Any objections?
(Reporter)

Updated

10 months ago
Flags: needinfo?(benjamin)

Comment 16

10 months ago
I sent email. I figured this was being deployed to nightly to see whether the experiment worked correctly: I don't think it is possible to get real product insights from nightly, and so in terms of answering real-world questions you need to deploy to a fraction of beta or maybe even release.
Flags: needinfo?(benjamin)
(Reporter)

Comment 17

10 months ago
I think it's still possible to get real insights from Nightly. It represents a biased sample, but so do all of our samples, and whether or not we see a difference, that's relevant. [Note that it's standard practice in clinical trials for drugs, for instance, to use totally unrepresentative subject samples, so it's not like there's something unscientific about this, though you do have to be careful about the conclusions you draw if your demographics are way out of whack.] In any case, this version of TLS 1.3 is only in 52, so while we might eventually extend the experiment to 52 Beta or release, that's going to take a lot longer.

Hope that clarifies things. Any objections to my replacing this number with 1.0?

Comment 18

10 months ago
PASS
Please consider this the requested QA sign-off. Experiment installation and data collection work as expected.
Eric, thanks a lot for all your assistance.
Please reference this link for details on install and data collection:
https://docs.google.com/spreadsheets/d/1jfpx0kT9N_eOpNO-fKv2EzC_WCp_WnQDhRabM7H1LuY/edit#gid=1087822068
Flags: needinfo?(lhenry)
Flags: needinfo?(ekr)
Flags: needinfo?(benjamin)

Comment 19

10 months ago
As long as Michelle has verified that the experiment uninstalls correctly, a 100% "sample" should be fine, because it uninstalls automatically and won't block other potential experiments. It would be good to set a maximum runtime, though: perhaps 2-3 days, so that if there is a bug it doesn't consume the entire Nightly population.
Flags: needinfo?(benjamin)

Comment 20

10 months ago
Sounds like a good idea to put a hard limit on this as Benjamin suggests. Thanks Michelle!
status-firefox52: --- → affected
tracking-firefox52: --- → +
Flags: needinfo?(lhenry)
(Reporter)

Comment 21

10 months ago
bsmedberg: which value are you talking about? "maxActiveSeconds": 604800?
Flags: needinfo?(ekr) → needinfo?(benjamin)

Comment 22

10 months ago
Yes, that's currently 7 days; can we reduce that to 2 or 3? I'm sorry, for some reason I misread and thought that wasn't set at all.
Flags: needinfo?(benjamin)
(Reporter)

Comment 23

10 months ago
I assume this is just the limit on how long it runs on a given device?

If so, then I'll cut it down to 3600s, which should be more than plenty.
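
In the experiment manifest that is a one-field change, something like the following (other manifest fields omitted; 3600 s = 1 hour, down from 604800 s = 7 days):

  "maxActiveSeconds": 3600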

Comment 24

10 months ago
Correct.
Sylvestre, FYI for 52 since you are the release owner.
Flags: needinfo?(sledru)
I'll go ahead and give relman approval here since Sylvestre is out, so this can ship to Nightly.
Matt, is it your team who would deploy this? Can you find an owner for it? Thanks.
Flags: needinfo?(mgrimes)

Comment 28

10 months ago
We may do this as a Shield study in the future, but this iteration is going out via Telemetry Experiments. I do not know who the owner would be in that case.
Flags: needinfo?(mgrimes)
Looks like it needs signing (https://wiki.mozilla.org/QA/Telemetry/Developing_a_Telemetry_Experiment#Signing_the_XPI). Then someone from the Ops team should be contacted to deploy it.
Flags: needinfo?(sledru)
This bug seems to have gone stale. Is this experiment done, or is there still some action needed here?
Flags: needinfo?(ekr)
(Reporter)

Comment 31

8 months ago
Thanks. I need to analyze the data and then revise it to run on 51 Beta per the long discussion elsewhere. I'll be doing that soon.
Flags: needinfo?(ekr)
Too late for firefox 52, mass-wontfix.
status-firefox52: affected → wontfix
(Reporter)

Comment 33

5 months ago
The experiment actually is done on 53. I am analyzing the data now.