Closed Bug 1481284 Opened 7 years ago Closed 6 years ago

adau observational study: how does recording scalar_parent_browser_engagement_total_uri_count in all modes affect adau (and other things)

Categories

(Shield :: Shield Study, enhancement)

enhancement
Not set
normal

Tracking

(firefox61+ fixed)

RESOLVED FIXED

People

(Reporter: joy, Unassigned)

Attachments

(2 files)

There are several requests here rather than hypotheses. If scalar_parent_browser_engagement_total_uri_count is recorded in both Private Browsing and non-Private Browsing modes (pbm and non-pbm), our aDAU will likely increase. A study suggests the increase might be 16-20% of current aDAU, but we can't assume that study is reliable. Hence we need estimates for the following:
- Currently 75.4% of profiles who submit on a day are aDAU. With the new recording system, what will the new percentage be (the untested model above says it will increase by 16%)?
- How much does this translate to in actual aDAU numbers (compared to existing)?
- What is the total URI count for a day for the aDAU profiles (compared to existing)?
- What is the distribution of URIs visited per profile per hour?
- How much more likely is a profile to have >=5 URIs on a day?
- How does the likelihood of an existing profile contributing to aDAU change?
- Can we validate the model in the study phd: https://docs.google.com/document/d/16k4wXKn7ih0H5t7SQd5Z5nzU2ZiBf6qPFqtzDdZI4CM/edit?ts=5b622c69
(Note: ideally we would record, for each profile, both the usual scalar_parent_browser_engagement_total_uri_count and a second version of scalar_parent_browser_engagement_total_uri_count that records in all modes; then we would know the exact change at the profile level. However, this approach leaks information and does not respect privacy, so it is not viable. By recording in every mode, we won't know whether the profile is in pbm or not at all, and given these are Category 1 metrics there is no issue in collecting this.)
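Several of the questions above reduce to one quantity: the share of submitting profiles that reach at least 5 URIs on a day. A minimal sketch of that comparison follows, assuming (as the study description states) that a profile contributes to aDAU only when its daily total URI count is at least 5. The function name and the per-profile counts are made up for illustration; they are not study data.

```python
from typing import Iterable

# Threshold from the study description: fewer than 5 URIs in a day
# means the profile does not contribute to aDAU.
ADAU_MIN_URIS = 5


def adau_share(daily_uri_counts: Iterable[int], min_uris: int = ADAU_MIN_URIS) -> float:
    """Fraction of submitting profiles whose daily URI count meets the threshold."""
    counts = list(daily_uri_counts)
    if not counts:
        raise ValueError("need at least one profile")
    active = sum(1 for c in counts if c >= min_uris)
    return active / len(counts)


# Illustrative (invented) daily URI counts, one entry per profile.
control = [0, 2, 7, 12, 4, 30, 5, 1]   # status quo: non-pbm only
test = [3, 6, 7, 15, 4, 33, 9, 1]      # recording in all modes

control_share = adau_share(control)
test_share = adau_share(test)
# Relative change in the aDAU share between branches.
relative_change = (test_share - control_share) / control_share
```

The same function applied per day and per branch would give the "new percentage" asked for in the first question, with the relative change being the figure to compare against the untested 16% estimate.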
Taken from the "details section" of the phd (comment 1).

Basic description of experiment:
- Currently we do not record scalar_parent_browser_engagement_total_uri_count (turi) in private browsing mode.
- Profiles whose total turi for a day is less than 5 will not contribute to aDAU.
- We believe that we are missing the count of URIs visited in private browsing mode (pbm). Had we counted these, the profiles above might have contributed to aDAU.
- If we change turi to record during all sessions, our aDAU will increase.
- In this experiment, for a small test group, turi will be counted in all modes (pbm and non-pbm) and we will determine the amount of increase.

What is the preference we will be changing?
- browser.engagement.total_uri_count.pbm (see 2nd patch in the bug linked to below)

What independent variable(s) (IVs) are you manipulating to affect measurements of your DV(s)? What different levels (values) can each IV assume?
- There is a test and a control branch. In the test branch we will record turi in all modes, but in the control branch it will only be recorded in non-pbm modes (the status quo).
- We will likely conduct separate analyses for some countries to see if there are country-specific effects.

What percentage of users do you want in each branch?
- 1:1 split between test and control: 135K profiles per branch, 270K profiles in total.
-- To estimate the proportion (of submitting profiles who are aDAU) within 0.5% of the true value with virtual certainty, we would need a sample of 114,947 within each branch.
-- To estimate total URIs within 1% of the true value with 95% certainty, we need 134,501 profiles per branch (having removed the top 0.1% outliers).
-- To estimate URIs/hr/profile within 1% of the true value with 95% certainty, we need 63,101 profiles per branch (having removed the 0.1% outliers).
-- To compare URIs visited per profile (after removing the top 0.1% outliers) with an effect size of 0.039 (detect a 1% change), power=0.95, alpha=0.01, we need a sample size of 24K per branch.

What channels and locales do you intend to ship to?
- Release 61+, all locales.

What is your intended go live date and how long will the study run?
- Go live in the 61 release time frame.
- Shortly after 61.0.2 go-live (sometime in the week of 6th August).
- Enrollment for one week.
- Monitor profiles for three weeks (so 1+3 for a total of four weeks).

Are there specific criteria for participants?
- No

What is the main effect you are looking for and what data will you use to make these decisions?
- Currently 75.4% of profiles who submit on a day are aDAU. With the new recording system, what will the new percentage be (the untested model says it will increase by 16%)?
- How much does this translate to in actual aDAU numbers (compared to existing)?
- What is the total URI count for a day for the aDAU profiles (compared to existing)?
- What is the distribution of URIs visited per profile per hour (compared to existing)?
- How much more likely is a profile to have >=5 URIs on a day?
- How does the likelihood of an existing profile contributing to aDAU change?

Who is the owner of the data analysis for this study?
- Saptarshi Guha

Will this experiment require uplift?
- I think Monday 6th (landed on moz-release)

QA Status of your code:
- Green - see bug

Do you plan on surveying users at the end of the study?
- No

Prior Work: see https://metrics.mozilla.com/protected/sguha/leak2.html
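The sample-size figures above can be sanity-checked with standard normal-approximation formulas. The sketch below is not the analysis behind the quoted numbers; it assumes a two-sided test and a two-sample comparison of means on a standardized effect size, and both function names are hypothetical. The confidence level behind "virtual certainty" is not stated in the description, so the 0.95 passed in the proportion example is a placeholder and will not reproduce 114,947.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def n_for_proportion(p: float, margin: float, confidence: float) -> int:
    """Profiles needed to estimate a proportion p within +/- margin (absolute)."""
    z = _STD_NORMAL.inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil(z * z * p * (1 - p) / margin ** 2)


def n_per_branch_two_sample(effect_size: float, alpha: float, power: float) -> int:
    """Profiles per branch to detect a standardized mean difference (two-sided)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Parameters quoted in the description: effect size 0.039, alpha 0.01, power 0.95.
n_compare = n_per_branch_two_sample(effect_size=0.039, alpha=0.01, power=0.95)

# 75.4% baseline and a 0.5% margin are from the description;
# the 0.95 confidence level is an assumed placeholder.
n_prop = n_for_proportion(p=0.754, margin=0.005, confidence=0.95)
```

Under these assumptions the two-sample figure lands a little under the 24K per branch quoted above, which suggests the quoted value was rounded up. The largest requirement (134,501 for total URIs) is what drives the 135K-per-branch target.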
Given current usage, a 0.2% sample ought to be enough to get the requisite sample size.
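The relationship between the sampling rate and the target enrollment is simple division. As a rough sketch: the eligible-population figure below is a hypothetical placeholder chosen only so the arithmetic is visible; the bug does not state how many profiles were eligible.

```python
def required_sample_fraction(target_profiles: int, eligible_profiles: int) -> float:
    """Fraction of eligible profiles that must enroll to reach the target."""
    if eligible_profiles <= 0:
        raise ValueError("eligible_profiles must be positive")
    return target_profiles / eligible_profiles


# 135K per branch, two branches (from the study description).
TARGET_TOTAL = 2 * 135_000

# Hypothetical eligible population, NOT a figure from the bug.
HYPOTHETICAL_ELIGIBLE = 135_000_000

fraction = required_sample_fraction(TARGET_TOTAL, HYPOTHETICAL_ELIGIBLE)
```

With a real count of eligible Release 61+ profiles in place of the placeholder, the same call shows whether 0.2% leaves any headroom over the 270K target.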
Edited to enrollment for two weeks.
Please launch on 15th August, 2018. Enrollment: two weeks. Study ends: 3 weeks from 15th August.
Requesting peer sign off
Flags: needinfo?(mdeboer)
Requesting sign off from QA
Flags: needinfo?(andreea.cupsa)
** Why this Study
We are not entirely capturing aDAU, and by recording scalar_parent_browser_engagement_total_uri_count in all modes we can arrive at the true aDAU. However, if we do start recording in all modes, we will see a sudden jump in our top-level metrics! The purpose of this study is to answer:
- What will this jump look like?
- Can we compute a proxy of aDAU using existing variables, and therefore not have to redefine scalar_parent_browser_engagement_total_uri_count?
Science review: R+
Please proceed with this study, r=me.
Flags: needinfo?(mdeboer)
Adding rrayborn for Data Review
Flags: needinfo?(rrayborn)
This has been tested by QA and it is ready to ship.
Flags: needinfo?(andreea.cupsa)
Attached file data_review.md
Attached file Data Request Review
r+ Saptarshi's request, which modifies the scope of an existing probe. Minor correction: this is level 2 "interaction data", not level 1 "technical data", but it is nonetheless approved for this use case.
Flags: needinfo?(rrayborn)
Attachment #9001381 - Flags: review+
Based on the risk matrix I'm flagging dcamp for a final sign off.
Flags: needinfo?(dcamp)
Given the QA and Data Review signoffs, signing off for RelMan pending dcamp's final approval.
Approved.
Flags: needinfo?(dcamp)
This study is now live.
Per Saptarshi this study has now ended. All clients should begin to unenroll. Once the analysis is complete we can close this bug.
Blocks: 1499488
Blocks: 1510132
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED