Closed Bug 1525632 Opened 5 years ago Closed 5 years ago

DNS-over-HTTPS experiment 6 - investigating the impact of ECS (location hint) information with DNS

Categories

(Data Science :: Experiment Collaboration, task, P1)

Points:
3

Tracking

(data-science-status Evaluation & interpretation)

RESOLVED FIXED

People

(Reporter: selenamarie, Assigned: tdsmith)

References

()

Details

Attachments

(2 files)

Brief description of the request:
We'd like to conduct a study of DoH / ECS

Link to any assets:

https://docs.google.com/document/d/1v-oZtJCPHU9VD3UeXvt4F533Njbbr47qR9WvOlkv-uA/edit (not complete)

Is there a specific data scientist you would like, or someone who has helped to triage this request?

Tim!

Assignee: nobody → tdsmith
Status: NEW → ASSIGNED
data-science-status: --- → New
Component: General → Experiment Collaboration
Priority: -- → P1
Hardware: x86_64 → Unspecified
Points: --- → 3

v5 study with links to PHD, details, and sample payload: Bug 1502434

data-science-status: New → Planning
Depends on: 1529437

Saptarshi, can we use this bug to discuss the review? I think there will be a bit of back and forth before a r+.

Proposed pilot experiment: https://experimenter.services.mozilla.com/experiments/dns-over-https-experiment-6-pilot/

Proposed release experiment: https://experimenter.services.mozilla.com/experiments/dns-over-https-experiment-6-investigation-impact-of-ecs-location-hint-information-with-dns/

A look at the V5 data that's informing the power analysis for this experiment is here: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/84963/command/85076

What is the goal of the effort the experiment is supporting?

The goal is to understand the performance impact of the ECS extension to the DNS protocol in combination with DoH. ECS is intended to improve performance by allowing public resolvers, including DoH endpoints, to communicate a limited amount of additional information about the client's network location to upstream name servers. This additional information is used by some services to match clients with a nearby edge server, with the goal of minimizing latency and bottlenecks.
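For concreteness, the "limited amount of additional information" that ECS carries is a truncated client address prefix encoded as an EDNS0 option (RFC 7871). A minimal stdlib sketch of that wire format (the helper name is mine, not part of the experiment addon):

```python
import struct
import ipaddress

def encode_ecs_option(ip: str, source_prefix: int) -> bytes:
    """Encode an EDNS Client Subnet option (RFC 7871, option code 8)."""
    addr = ipaddress.ip_address(ip)
    family = 1 if addr.version == 4 else 2
    # Only ceil(source_prefix / 8) address bytes are sent.
    n = (source_prefix + 7) // 8
    # Mask bits beyond the prefix so no extra client detail leaks.
    net = ipaddress.ip_network(f"{ip}/{source_prefix}", strict=False)
    addr_bytes = net.network_address.packed[:n]
    # FAMILY (2 bytes), SOURCE PREFIX-LENGTH, SCOPE PREFIX-LENGTH (0 in queries)
    body = struct.pack("!HBB", family, source_prefix, 0) + addr_bytes
    # OPTION-CODE 8 (edns-client-subnet) + OPTION-LENGTH, then the body
    return struct.pack("!HH", 8, len(body)) + body
```

The truncation (commonly /24 for IPv4) is the privacy compromise at the heart of this experiment: enough locality for edge-server selection, but not the full client address.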

What is the hypothesis or research question?

ECS should improve DoH performance. We're interested in sizing the performance difference between DoH with ECS and DoH without ECS, and also against "native" DNS (i.e., DNS over UDP).

Detecting breakage is not a goal of this experiment. It will not be possible to calculate our usual retention metrics or excess opt-out rates.

Which measurements will be taken, and how do they support the hypothesis and goal? Are these measurements available in the targeted release channels? Has there been data steward review of the collection?

Comparisons:

  • DNS vs DoH-ECS
  • DNS vs DoH+ECS
  • DoH-ECS vs DoH+ECS

Outcomes:

  • TCP connect time
  • TLS handshake time
  • Time to first byte
  • Time to completion
  • Fetch time

Over:

  • IP version (4 vs 6)
  • Fetch endpoint protocol (HTTP vs HTTPS)
  • Resource to fetch
    • Partner 1: small payload, 100K payload
    • Partner 2: now.txt, 1M.bin

This is 3*5*6 = 90 comparisons. The measurements are provided by an addon. Data steward review is pending in bug 1529437; the schema of collected data is described at https://github.com/jonathanKingston/http-dns/pull/34.

Each client will make n measurements of each condition daily, submitted to telemetry together in a single payload, for each of the k days of the experiment where the client is active.

I think n=1 and k=9, but I'm confirming that.

Analysis plan

The v5 study was analyzed by considering each measurement of each lookup and resource fetch as an independent measurement. Clients submitted multiple measurements over the course of the study. All comparisons were done on an unpaired, pooled basis. 95% confidence intervals for the relative difference between DoH and native DNS were bootstrapped for several quantiles of each measurement across each of the facets.
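As a rough illustration of the bootstrap described above (a stdlib sketch with hypothetical helper names, not the actual v5 notebook code):

```python
import random

def quantile(xs, q):
    """Simple nearest-rank empirical quantile."""
    s = sorted(xs)
    return s[min(len(s) - 1, int(q * len(s)))]

def bootstrap_rel_diff_ci(doh, native, q=0.5, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap a CI for the relative difference of a quantile between
    two unpaired pools of measurements (hypothetical helper)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each pool with replacement, independently (unpaired).
        d = [rng.choice(doh) for _ in doh]
        n = [rng.choice(native) for _ in native]
        diffs.append(quantile(d, q) / quantile(n, q) - 1.0)
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

Repeating this per quantile, per facet, per comparison is what produces the 90-way multiple-comparison burden discussed below.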

For v6, an alternative to treating all observations as independent would be to do paired comparisons within a daily result-set from a client -- I think whether this is useful depends on how the noise associated with a single measurement compares to the variance between clients. I'm also not sure how to handle the fact that we'll get many result-sets from some clients and only one from others.

If we were only interested in central tendency, I could work with the distribution of per-user median measurements to make comparisons where n=the number of clients, but I think that's not an interesting way to describe the extremes of the data.
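The per-user median collapse mentioned above might look like this (a hypothetical helper, assuming observations arrive as (client_id, value) pairs):

```python
import statistics
from collections import defaultdict

def per_client_medians(observations):
    """Collapse repeated (client_id, value) measurements to one median
    per client, so each client contributes equally and n = the number
    of clients rather than the number of observations."""
    by_client = defaultdict(list)
    for client_id, value in observations:
        by_client[client_id].append(value)
    return {c: statistics.median(v) for c, v in by_client.items()}
```

This neutralizes the unequal result-set counts per client, but as noted, it throws away exactly the tail behavior that matters for the extremes.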

Do you have any thoughts about an elegant way to make comparisons with this sort of hierarchical structure?

Sample size calculation

Using the v5 results, I ran a sample size calculation based on a t test for a difference of the means of log-transformed data, computing the detectable percent difference in the (untransformed) mean as a function of observation count for each metric, taking alpha=0.05/90 as a Bonferroni correction and power=90% (chart attached). I still need to translate this into the number of users we need -- right now the x axis is "observations," treating each measurement as independent.
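A z-approximation of that calculation might look like the following sketch; `per_arm_n` and its parameterization are mine, using the fact that on the log scale a relative difference of pct_diff in the mean corresponds to a shift of log(1 + pct_diff):

```python
import math
from statistics import NormalDist

def per_arm_n(sigma_log, pct_diff, alpha=0.05 / 90, power=0.90):
    """Per-arm observation count for a two-sample test on log-transformed
    data, via the standard normal approximation to the t test.
    sigma_log: SD of the log-transformed metric (from the v5 data);
    pct_diff: detectable relative difference in the untransformed mean;
    alpha defaults to the Bonferroni-corrected 0.05/90."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    delta = math.log1p(pct_diff)  # effect size on the log scale
    return math.ceil(2 * ((z_a + z_b) * sigma_log / delta) ** 2)
```

Inverting this over a grid of pct_diff values gives the attached detectable-difference-vs-observations curve; halving the detectable difference roughly quadruples the required n.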

That chart is here, in cell 20; I'll attach a copy: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/84963/command/85824

The core product metrics have been deemed out-of-scope by the experiment owners so I haven't provided a sample size analysis for them.

--

How's this as a start? If you have any advice about the analysis structure, that'd be great; I'd be happy to chat over Vidyo.

Flags: needinfo?(sguha)

r+

Flags: needinfo?(sguha)

A couple of outstanding questions from the review were:

  • how many users opted out of v5: the opt-out rate was negligible, and
  • how many times we can expect to see a user in a week: Considering a 1% sample of release 65 users between March 3 and 10, the average number of days per user was 3.7.

Enrolling 1e5 users should give us good coverage.
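As a back-of-envelope check on that coverage claim (assuming n = 1 measurement per condition per active day, per comment 4 above):

```python
def expected_obs_per_condition(enrolled, avg_active_days, n_per_day=1):
    """Rough expected observation count per experiment condition:
    enrolled clients x average active days x measurements per day."""
    return enrolled * avg_active_days * n_per_day
```

With 1e5 enrolled clients and 3.7 average active days, that is on the order of 3.7e5 observations per condition, comfortably above the per-arm counts from the sample size curves.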

Experiment shipped yesterday; enrolments and early data look healthy.

data-science-status: Planning → Data Acquisition
data-science-status: Data Acquisition → Evaluation & interpretation
Attached file: Writeup draft

Saptarshi, can you take a look at this and comment on whether it has the information you need in order to be able to review it?

Attachment #9072286 - Flags: feedback?(sguha)
Comment on attachment 9072286 [details]
Writeup draft

Saptarshi and I spoke during all-hands and he gave me an r+ on the report, subject to a little bit of i-dotting and t-crossing.
Attachment #9072286 - Flags: feedback?(sguha)

Delivered at https://mozilla-private.report/doh_v6/index.html; will see if it's possible to publish this publicly.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED