[Shield] Pioneer Study: JESTER (Javascript Execution Survey and Traffic Record) (jester-pioneer-shield-study), release 63, 64

RESOLVED FIXED

Status

enhancement
RESOLVED FIXED
10 months ago
13 days ago

People

(Reporter: mlopatka, Assigned: mlopatka)

Tracking

unspecified

Firefox Tracking Flags

(firefox63+ fixed, firefox64+ fixed)

Details

Attachments

(3 attachments, 9 obsolete attachments)

Summary: 
Measure the execution of JavaScript using openWPM instrumentation for comparison against data collected via a web crawl. Assess the overlap between published top sites and web traffic specific to the Firefox Pioneer panel including repeat visitation patterns.

PHD Doc: https://docs.google.com/document/d/10JEZ9WgAqqsveYGMgSs8xJMPxQoSExAXx5pf_Fya3sY/edit?usp=sharing

Addon repo: https://github.com/motin/dataleak-pioneer-shield-study
Assignee: nobody → mlopatka
Data Steward review requested (rrayborn)
Flags: needinfo?(rrayborn)
Comment on attachment 9019678 [details]
jestr_data_review.md

Sorry for the late reply Martin.  

I feel like "The add-on listens to navigation, web requests, cookie modifications and access to certain javascript API:s, as determined by openwpm-webext-instrumentation." is underspecified, especially considering that the linked repo is similarly underspecified.  The documentation in the public data review request is technically adequate, but I would appreciate you taking steps to directly articulate a rough explanation of the packet(s) themselves in the GitHub repo.

==

General: Great use of Pioneer.  Good use of existing control to avoid having to have conversations about general users.

#1) Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
The best documentation is in the PHD and the data review request, advised Martin to further articulate in GitHub
https://github.com/motin/jestr-pioneer-shield-study/blob/develop/docs/TELEMETRY.md

#2) Is there a control mechanism that allows the user to turn the data collection on and off?
Yes, respects Telemetry settings + is opt-in Pioneer population (category 3 consent)

#3) If the request is for permanent data collection, is there someone who will monitor the data over time?
Non-permanent, Martin monitoring

#4) Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Category 3 and below

#5) Is the data collection request for default-on or default-off?
Telemetry defaults

#6) Does the instrumentation include the addition of any new identifiers?
No

#7) Is the data collection covered by the existing Firefox privacy notice? 
Yes with the addition of the legal approved Pioneer language

#8) Does there need to be a check-in in the future to determine whether to renew the data?
No, temporary.
Flags: needinfo?(rrayborn)
Attachment #9019678 - Flags: review+
^fwollsen, please add a schema of the JESTr telemetry payload to the repo documentation and then request a "need info" from rrayborn for data steward sign off.
Flags: needinfo?(fwollsen)
> .. articulate a rough explanation of the packet(s) themselves in the GitHub repo.

This has been done to clarify the nature of the packets:
1. Typescript interfaces describing the telemetry packets have been added
2. The readme of the linked repo now explains the nature of the packets and where further information about the packet contents can be derived
3. The OpenWPM configuration used in the study has been added directly to the study add-on telemetry docs

All of the above can be found via links in https://github.com/motin/jestr-pioneer-shield-study/blob/develop/docs/TELEMETRY.md#shield-study-addon-pings-specific-to-this-study
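The TypeScript interfaces linked above are the authoritative schema. Purely as an illustration of the kind of shape such a packet might have (all field names below are hypothetical, not the actual schema), a runtime shape check could look like:

```javascript
// Hypothetical sketch of a telemetry packet shape check. The actual
// schema is defined by the TypeScript interfaces in the study repo.
function looksLikeJavascriptOperationPacket(p) {
  return (
    p !== null &&
    typeof p === "object" &&
    typeof p.navigationUuid === "string" && // ties the event to a page load
    typeof p.scriptUrl === "string" &&      // script performing the access
    typeof p.symbol === "string" &&         // e.g. "window.navigator.userAgent"
    ["get", "set", "call"].includes(p.operation) &&
    typeof p.timeStamp === "string"
  );
}
```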
Flags: needinfo?(fwollsen) → needinfo?(rrayborn)
LGTM, thanks for the updates Fred
Flags: needinfo?(rrayborn)
sorry for the bonus ping Rob, but does LGTM constitute a data steward R+ signoff for the JESTr study?
Flags: needinfo?(rrayborn)
Yes, sorry.  I r+'ed previously with some suggestions that the documentation be easier to parse for a quick audit by a less technical user.  The r+ there was because you had technically met the documentation/policy requirements at that point despite the suggestions.  Apologies for the confusion.
Flags: needinfo?(rrayborn)
Attached v0.3.0 for signing
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
Attached v0.4.0 for signing
Attachment #9021254 - Attachment is obsolete: true
Attachment #9021257 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
Flags: needinfo?(mcooper)
[Tracking Requested - why for this release]: This is a study on the Pioneer population in release 63 and 64.
Summary: [Shield] Pioneer Study: JESTER (Javascript Execution Survey and Traffic Reference) (jester-pioneer-shield-study) → [Shield] Pioneer Study: JESTER (Javascript Execution Survey and Traffic Reference) (jester-pioneer-shield-study), release 63, 64
Attached v0.5.0 for signing
Attachment #9021727 - Attachment is obsolete: true
Attachment #9021935 - Attachment is obsolete: true
Flags: needinfo?(mcooper)
@rscholl can we get the methodological/scientific review meeting outcome logged in the bug please?
Flags: needinfo?(rscholl)
Summary: [Shield] Pioneer Study: JESTER (Javascript Execution Survey and Traffic Reference) (jester-pioneer-shield-study), release 63, 64 → [Shield] Pioneer Study: JESTER (Javascript Execution Survey and Traffic Record) (jester-pioneer-shield-study), release 63, 64
QA request

We would like to request testing help with the JESTr Pioneer study (tjestr-pioneer-shield-study@pioneer.mozilla.org).

Project name: jestr-pioneer-shield-study

Description: The JESTr study helps Firefox developers understand the behavior of pages executing third-party JavaScript resources that may be used to track your web browsing activity.

Risk: Pioneer cohort only (Low Risk)

Study owner: Martin Lopatka (mlopatka@mozilla.com)

Priority: normal

Tracking Bug: 1496154
https://bugzilla.mozilla.org/show_bug.cgi?id=1496154

PHD document: https://docs.google.com/document/d/10JEZ9WgAqqsveYGMgSs8xJMPxQoSExAXx5pf_Fya3sY/edit

Channels: All Firefox channels, Pioneer cohort only.

Platform: Firefox Desktop (OSX/Windows/Linux)

Locales: en-US

Complexity: Simple one-time passive data collection effort (single branch). Telemetry sent using Pioneer encrypted Telemetry pipeline.
Flags: needinfo?(kraj)
@rhelmer, can you provide Firefox Engineering peer review for this Pioneer study?
Flags: needinfo?(rhelmer)
Science R+
Flags: needinfo?(rscholl)
Flags: needinfo?(kraj)
Hm, does this work on Nightly? I get this in browser console:

InvalidStateError: A mutation operation was attempted on a database that did not allow mutations.

Taking a look at the code now, but are there instructions for testing it locally too?
Flags: needinfo?(rhelmer) → needinfo?(mlopatka)
(In reply to Robert Helmer [:rhelmer] from comment #19)
> Hm, does this work on Nightly? I get this in browser console:
> 
> InvalidStateError: A mutation operation was attempted on a database that did
> not allow mutations.

Oh and the study immediately ends itself:

Error: endStudy, requested:  user-disable, but already ending ineligible

> 
> Taking a look at the code now, but are there instructions for testing it
> locally too?
(In reply to Robert Helmer [:rhelmer] from comment #19)
> Hm, does this work on Nightly? I get this in browser console:
> 
> InvalidStateError: A mutation operation was attempted on a database that did
> not allow mutations.
> 
> Taking a look at the code now, but are there instructions for testing it
> locally too?

This is very likely a Normandy problem, bug 1502182. I don't think it is related to this add-on.
Oh and also are there un-webpack'd versions of feature.js and content.js? Or are these common to all shield studies we ship already?
(In reply to Michael Cooper [:mythmon] from comment #21)
> (In reply to Robert Helmer [:rhelmer] from comment #19)
> > Hm, does this work on Nightly? I get this in browser console:
> > 
> > InvalidStateError: A mutation operation was attempted on a database that did
> > not allow mutations.
> > 
> > Taking a look at the code now, but are there instructions for testing it
> > locally too?
> 
> This is very likely a Normandy problem, bug 1502182. I don't think it is
> related to this add-on.

OK thanks.
> Taking a look at the code now, but are there instructions for testing it locally too?

Sure, check out https://github.com/motin/jestr-pioneer-shield-study/blob/develop/docs/DEV.md

> Oh and the study immediately ends itself:
> Error: endStudy, requested:  user-disable, but already ending ineligible

Don't forget to have the Pioneer add-on installed, or else the study add-on will uninstall itself. Full instructions available at https://github.com/motin/jestr-pioneer-shield-study/blob/develop/docs/TESTPLAN.md#install-the-add-on-and-enroll-in-the-study

> Oh and also are there un-webpack'd versions of feature.js and content.js? Or are these common to all shield studies we ship already?

These are specific to this study and the source codes are available in the folders https://github.com/motin/jestr-pioneer-shield-study/tree/develop/feature.js and https://github.com/motin/jestr-pioneer-shield-study/tree/develop/content.js respectively. They are in turn built by webpack by this configuration: https://github.com/motin/jestr-pioneer-shield-study/blob/develop/webpack.config.js#L7-L8
Attached v0.6.0 for signing.
Attachment #9022327 - Attachment is obsolete: true
(In reply to fwollsen from comment #24)
> > Taking a look at the code now, but are there instructions for testing it locally too?
> 
> Sure, check out
> https://github.com/motin/jestr-pioneer-shield-study/blob/develop/docs/DEV.md
> 
> > Oh and the study immediately ends itself:
> > Error: endStudy, requested:  user-disable, but already ending ineligible
> 
> Don't forget to have the Pioneer add-on installed, or else the study add-on
> will uninstall itself. Full instructions available at
> https://github.com/motin/jestr-pioneer-shield-study/blob/develop/docs/
> TESTPLAN.md#install-the-add-on-and-enroll-in-the-study
> 
> > Oh and also are there un-webpack'd versions of feature.js and content.js? Or are these common to all shield studies we ship already?
> 
> These are specific to this study and the source codes are available in the
> folders
> https://github.com/motin/jestr-pioneer-shield-study/tree/develop/feature.js
> and
> https://github.com/motin/jestr-pioneer-shield-study/tree/develop/content.js
> respectively. They are in turn built by webpack by this configuration:
> https://github.com/motin/jestr-pioneer-shield-study/blob/develop/webpack.
> config.js#L7-L8

Thanks!
Attachment #9023041 - Flags: review+
Flags: needinfo?(mcooper)
Flags: needinfo?(mlopatka)
Attached v1.0.0 for signing
Attachment #9022968 - Attachment is obsolete: true
Attachment #9023041 - Attachment is obsolete: true
Flags: needinfo?(rdalal)
R+ from QA.
I tested both the *-0.6.0-signed.xpi and *-1.0.0-signed.xpi and everything worked as expected.

:+1:
Flags: needinfo?(rdalal)
Pascal and Julien, if possible the team would like to ship this Pioneer study on Monday, 11/19. All approvals are in. Thanks!
Flags: needinfo?(pascalc)
Flags: needinfo?(jcristau)
pioneer / opt-in study, has review and qa, +1 for relman.
Flags: needinfo?(pascalc)
Flags: needinfo?(jcristau)
We're live on this.  Thanks all!
When this went live the telemetry endpoint started receiving roughly double the expected number of overall requests, which my preliminary observation tied to these pioneer study pings. I asked :lonnen and :mythmon as a precaution to shut it off.
Follow-up questions: how many requests and how much data should this study be sending? The volume I saw was concerning enough to make me believe we were rolling out to more than the pioneer opt-in audience. If this study is in fact behaving appropriately, and is properly scoped in its rollout, we can scale our infrastructure to process it, but as a precaution I asked that it be shut down so that I can understand what exactly is going on before we proceed.
question for Motin or Martin then: how often do we expect this to send an event? is there any batching of events in the code?
Flags: needinfo?(mlopatka)
To the question of whether we were accidentally enrolling users that were not in the Pioneer cohort: I see no evidence of that, and I think that I saw enough chattiness in the add-on code to account for all the traffic we saw. However, the number of events we saw is still concerning from an infrastructure point of view, since I'd expect normal Normandy enrollment to bring in another 10-20x more users than the number that already produced the worrying volume. I expect that the add-on will need to be modified to batch events, and to send fewer large pings instead of many small ones.
This is a chatty add-on by design, though with some stochasticity in how chatty it is, due to the unknown volume of browsing activity carried out by pioneers and the added variability in page behaviour (which may trigger monitored events).

The study *is* behaving as expected and there is no indication of any enrollment outside of the pioneer cohort. I have verified from the payloads collected between launch and back-out that only eligible pioneers were enrolled (zero exceptions). This is predicted behaviour that was observed and approved in the engineering review and QA. 

"If this study is in fact behaving appropriately, and is properly scoped in its rollout, we can scale our infrastructure to process it" 
^ This would be my preference, unless there is a real hardship here. 

We can reduce the total size of the payloads by a small factor by limiting the number of pings sent, thus excluding some number of shield envelopes (which include the environment etc.). This comes at the risk of increasingly large packets and known information loss in the payload. Likewise, batching at the level of navigation events may give us small wins in terms of the number of overall telemetry endpoint requests, but will lead to an unknown (and unknowable) amount of information loss as payloads exceeding 500kb (or any other threshold) are truncated to avoid loss in transit. 

Regarding the batching mentioned by :mythmon: "I expect that the add-on will need to be modified to batch events, and to send fewer large pings instead of many small ones."

By design this addon sends more frequent, smaller payloads. This is a design choice we made in light of the findings from the Pioneer Pathfinder study (https://bugzilla.mozilla.org/show_bug.cgi?id=1436789) where payload delivery become unreliable at larger payload sizes. As such, we put an upper limit on payloads sent by JESTr at 500kb excluding encryption overhead and shield-telemetry envelope. 
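The 500kb cap described above can be sketched as follows. This is a minimal illustration, not the study's actual truncation logic; the constant and function names are hypothetical, and the real implementation also has to account for encryption and envelope overhead:

```javascript
// Sketch of size-capped payload splitting, assuming a 500 KB limit on
// the serialized JSON payload (excluding encryption and shield-telemetry
// envelope overhead). All names are illustrative.
const MAX_PAYLOAD_BYTES = 500 * 1024;

function splitIntoCappedPayloads(events) {
  const payloads = [];
  let current = [];
  let currentSize = 2; // bytes for the enclosing "[]"
  for (const event of events) {
    // +1 accounts for the separating comma in the serialized array.
    const size = Buffer.byteLength(JSON.stringify(event), "utf8") + 1;
    if (currentSize + size > MAX_PAYLOAD_BYTES && current.length > 0) {
      payloads.push(current); // current batch is full; start a new one
      current = [];
      currentSize = 2;
    }
    current.push(event);
    currentSize += size;
  }
  if (current.length > 0) {
    payloads.push(current);
  }
  return payloads;
}
```

Splitting rather than truncating keeps every event, at the cost of more pings when a navigation produces a large burst of events.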

We can land some patches to refine the payload truncation in a more controlled manner, but that will not drastically reduce the increased load on the telemetry endpoint. :whd can you let me know what is a worrisome amount, and how much work we need to do to reduce the load on Telemetry to make this study palatable. 

I'm in the process of analyzing the payloads we received between launch and termination, and will provide some estimates on where we can save some space. Ultimately, though, this is a very extensive and detailed data collection effort; the total volume of information gathered is large, and there are limits to how much of a reduction we can accomplish without compromising the intended study objectives. 

How much scaling of the infrastructure are we willing to provide? (perhaps :lonnen can provide guidance here)

My only additional ideas at this point would be:
1- Reduce the duration of the study to limit the time window of increased load on our infrastructure. Would that help?
2- Randomly sub-sample the events that we report from the clients (discard some information to reduce the number of Telemetry events). This would mean an incomplete data collection and an inability to reconstruct potentially meaningful navigation actions at the analysis phase.
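Idea 2 amounts to client-side random sub-sampling before reporting. A minimal sketch (the sample rate and names are illustrative; an injectable RNG makes the behaviour testable):

```javascript
// Sketch of idea 2: randomly discard a fraction of events on the client
// to reduce the number of Telemetry events. Names are illustrative.
// rng is injectable so the sampling can be made deterministic in tests.
function subsample(events, sampleRate, rng = Math.random) {
  return events.filter(() => rng() < sampleRate);
}
```

As noted above, the trade-off is that dropped events can make some navigation sequences impossible to reconstruct at analysis time.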
Flags: needinfo?(whd)
Flags: needinfo?(mcooper)
Flags: needinfo?(chris.lonnen)
I'm familiar with the 500KB limit, and am sympathetic. I worked on Pathfinder, and it was a tough set of limitations to track down. In Pathfinder we bundled messages dynamically, creating several ~500KB payloads to send. We chose to maximize the size of payloads and minimize the number of requests, within our limitations.

Would it be reasonable to limit the number of users enrolled in the study? When we shut off the experiment, we had enrolled about 4500 users, and were already stressing the telemetry systems. I would expect that if we continued enrollment, we'd see another ~10-20x increase in users. If we could limit this to a smaller percentage of Pioneer users, it might make scaling Telemetry easier. In short, do you have a target number of users needed for the data science here?

I'll let :whd and :lonnen comment on scaling the infrastructure.
Flags: needinfo?(mcooper)
Thanks for the feedback :mythmon.
We have tracked down some of the extreme cases (websites) leading to a very high number of telemetry events and come up with a bundling strategy that should drastically cut down on events among those while still delivering a meaningful payload.
:motin and I walked through several ways we can try to lighten the load in terms of number of pings and total payload of the study.

He has filed issues to track progress before we upload a patch for the JESTr study.

related to payload size: 
https://github.com/mozilla/openwpm-webext-instrumentation/issues/23#issuecomment-440325169 

related to number of telemetry events triggered:
https://github.com/mozilla/openwpm-webext-instrumentation/issues/22

in addition to bundling payload delivery at the level of a navigation event, to maintain some sanity when we come to the analytical tasks. This should accomplish a substantial decrease in the load on Telemetry.

Nonetheless, even following the strategy of "maximize the size of payloads and minimize the number of requests, within our limitations", due to the unknown behaviour this study will encounter in the wild I think we should be ready to scale our infrastructure to some degree as we monitor the requests to the Telemetry endpoint. 

Thanks to :whd, :lonnen, and :mythmon for the quick, decisive action. 
We'll aim to relaunch after American Thanksgiving so patches can be crafted in Europe, but we can have a full crew on deck to monitor the rollout.
> This should accomplish a substantial decrease in the load on Telemetry.

This is good to hear.

> I think we should be ready to scale our infrastructure to some degree as we monitor the requests to the Telemetry endpoint. 

Our infrastructure is designed to scale pretty well. The piece that will not scale to the original estimated 10-20x without manual intervention is the message broker (kafka), mostly due to storage requirements. We can prepare to scale up if necessary, or lower retention and redundancy for pioneer to within reason if these become issues.

My primary goal is to avoid disruption to the general telemetry / data collection infrastructure if we have issues with pioneer load. I've consulted with :trink and we've developed a fail-safe by which we can switch off stream processing for pioneer to avoid overloading kafka and disrupting other consumers. This will require batch processing pioneer data but the effort to switch from streaming to batch should be minimal, and will only be triggered if necessary.

From a budget perspective we've been cleared to handle a doubling of data for a month if that's what it ends up being. We'll also need to spin up much larger analysis clusters in the pioneer environment to analyze this data when the time comes.

Overall with steps being taken to decrease load and simply being aware of the potential magnitude of this study I'm much more comfortable to proceed with a rollout next week.
Flags: needinfo?(whd)
Flags: needinfo?(chris.lonnen)
Flags: needinfo?(mlopatka)
Attaching v1.1.0 for signing.

Most telemetry events are now sent in batches oriented around web navigations, diminishing performance penalties associated with submitting many small packages via Pioneer telemetry. The 500kb limit now also applies to these batches instead of the individual packets, greatly decreasing the overall ping data volume.

For a more complete change log, issue links and code review of the changes since v1.0.0, please check out https://github.com/motin/jestr-pioneer-shield-study/pull/13

:mythmon, :rhelmer, :pdehaan - I'd be delighted if you can review and sign off on whether or not you believe this bundling strategy is sufficient (optionally also :whd if you have bandwidth, if not feel free to clear your ni).
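The navigation-oriented batching described above might look roughly like the following sketch (the field name is hypothetical; the linked PR is the real implementation), with the 500kb cap then applied per batch rather than per event:

```javascript
// Sketch of batching telemetry events by the navigation they belong to,
// so one ping carries the events for a single page visit. The field
// name navigationUuid is illustrative.
function batchByNavigation(events) {
  const byNavigation = new Map();
  for (const event of events) {
    const key = event.navigationUuid || "no-navigation";
    if (!byNavigation.has(key)) {
      byNavigation.set(key, []);
    }
    byNavigation.get(key).push(event);
  }
  // One candidate ping payload per navigation.
  return [...byNavigation.values()];
}
```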
Attachment #9025083 - Attachment is obsolete: true
Attachment #9025084 - Attachment is obsolete: true
Flags: needinfo?(whd)
Flags: needinfo?(rhelmer)
Flags: needinfo?(pdehaan)
Flags: needinfo?(mcooper)
Fred, does this version include removing npm-run-all from the dependencies? Do you think it needs to? Since that tool is only used in the build step, and given the nature of the problem, I think we'd probably be fine either way.
Flags: needinfo?(mcooper)
(In reply to fwollsen from comment #42)
> (optionally also :whd if you have bandwidth, if not feel free to clear your
> ni).

I don't have time to look at this but what is described sounds good to me.
Flags: needinfo?(whd)
> Fred, does this version include removing npm-run-all from the dependencies? Do you think it needs to?

It does not remove it, but upgrades it to a version that is known to not include the malicious dependency: https://github.com/motin/jestr-pioneer-shield-study/pull/13/commits/a91e4c9dcc38d9a5268b164fe29aa0a6e62aac47 (confirmation diff in package-lock.json: https://github.com/motin/jestr-pioneer-shield-study/pull/13/files#r236588621)

> Since that tool is only used in the build step, and given the nature of the problem, I think we'd probably be fine either.

+1
:mythmon can we get a signed xpi for testing?
Flags: needinfo?(mcooper)
Here is an add-on that is signed for testing.
Flags: needinfo?(mcooper)
(In reply to fwollsen from comment #42)
> Created attachment 9027714 [details]
> jestr_pioneer_shield_study-1.1.0.zip
> 
> Attaching v1.1.0 for signing.
> 
> Most telemetry events are now sent in batches oriented around web
> navigations, diminishing performance penalties associated with submitting
> many small packages via Pioneer telemetry. The 500kb limit now also applies
> to these batches instead of the individual packets, greatly decreasing the
> overall ping data volume.
> 
> For a more complete change log, issue links and code review of the changes
> since v1.0.0, please check out
> https://github.com/motin/jestr-pioneer-shield-study/pull/13
> 
> :mythmon, :rhelmer, :pdehaan - I'd be delighted if you can review and sign
> off on whether or not you believe this bundling strategy is sufficient
> (optionally also :whd if you have bandwidth, if not feel free to clear your
> ni).

I chatted with Fredrik and also took a look at the diff between this version and the previous, this approach lgtm.

This particular problem is difficult to catch with automated or manual testing, as we need a particular sort of web page to trigger the problem. We do know how to test for this scenario now.

Martin mentioned that he and :whd could watch this as it goes out just to make sure that everything is working correctly. I think we'll be OK now that there is a 500kb limit in place for batches, but this sounds like a good strategy to me just in case we run into anything unexpected.
Flags: needinfo?(rhelmer)
Re-deployed recipe 639 after substituting the 1.1 signed addon for 1.0.

Echoing the caution in some recent messages here, we're limiting to 10% of Pioneers for <= 24 hrs as a short-term precaution against "anything unexpected". I've set myself a reminder to check in first thing tomorrow.
Can we increase to 100% of the Pioneer cohort based on the load observed with the 10% sample?
@whd does it all look good from your end? Barring any infrastructure concerns I'd really like to see this shipped to 100% of the Pioneer cohort as specified in the PHD document.
Flags: needinfo?(whd)
Bumped to 100%
(In reply to mlopatka from comment #50)
> Can we increase to 100% of the Pioneer cohort based on the load observed
> with the 10% sample?
> @whd does it all look good from your end? Barring any infrastructure
> concerns I'd really like to see this shipped to 100% of the Pioneer cohort
> as specified in the PHD document.

This looks good to me, and I am fine with bumping to 100% (though per comment #51 that already happened).
Flags: needinfo?(whd)
Is Normandy still actively enrolling clients for this study?
Flags: needinfo?(tdowner)
Flags: needinfo?(jgaunt)
Yes, Martin - the Shield calendar has it set to run through 2019-01-22.
Flags: needinfo?(tdowner)
Flags: needinfo?(jgaunt)
thanks for the update!
Flags: needinfo?(pdehaan)

Disabled recipe 639 per mlopatka. JESTER is no longer live.

Status: NEW → RESOLVED
Closed: 13 days ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.