1.) Who is/are the point of contact(s) for this review?
Jonathan Eads (jeads <at> mozilla.com, jeads) and Carl Meyer (cmeyer <at> mozilla.com, carljm)
2.) Please provide a short description of the feature / application (e.g. problem solved, use cases, etc.):
Datazilla (https://github.com/mozilla/datazilla) includes a data model and web service for storing and analyzing performance test data for Mozilla software products. We will be using it to manage data for a variety of projects, including talos, b2g, stoneridge, jetperf, xperf, and peptest. The intention is to house all performance-related test data that the Automation and Tools (ateam) group is responsible for managing.
Each project uses its own database instance, and there is no co-location requirement, so projects can scale independently. Datazilla gives the group a way to reuse data models, web services, and web-based user interfaces across multiple projects.
3.) Please provide links to additional information (e.g. feature page, wiki) if available and not yet included in feature description:
Datazilla uses the datasource module for managing SQL statements.
This allows the application to encapsulate SQL in files that look like this:
You can see the set of active projects we are working on in Datazilla here:
4.) Does this request block another bug? If so, please indicate the bug number:
5.) This review will be scheduled amongst other requested reviews. What is the urgency or needed completion date of this review?
The urgency is high; this is a Q2 goal. It has taken longer than anticipated to build enough of the core architecture to accurately represent how the application will work. It would be great if we could get the review done within the next 3 weeks.
6.) To help prioritize this work request, does this project support a goal specifically listed on this quarter's goal list? If so, which goal?
7.) Does this feature or code change affect Firefox, Thunderbird or any product or service that Mozilla ships to end users?
No, not directly.
Are there any portions of the project that interact with 3rd party services?
Yes; at the moment, the production talos infrastructure. Test runners on build machines need to send data to Datazilla.
Will your application/service collect user data? If so, please describe:
The application does not collect user data. It collects performance test data in the form of a JSON structure which is described in https://github.com/mozilla/datazilla/blob/master/datazilla/model/sql/template_schema/schema_perftest.json.
8.) If you feel something is missing here or you would like to provide other kind of feedback, feel free to do so here (no limits on size):
We have a development environment set up on Mozilla-MPT. It has been ingesting data from the production talos test environment for ~2 months. We have used this data to develop the application, so the application has been tested with most of the production load it will initially manage.
Datazilla will provide a web-service interface that test runners on build machines will send test data to. This endpoint needs to be protected to prevent a bad actor from posting false data. To address this, Carl Meyer contacted Gary Kwong via email with a description of the problem. The question we're not sure how to answer is whether we should use an endpoint on a private VPN, with API-key-based authentication as a fallback option, or whether there is a better approach that we are not aware of.
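To make the API-key fallback concrete, here is a minimal sketch of one way it could work: each project shares a secret with its test runners, the runner signs the raw JSON body with HMAC-SHA256, and the endpoint rejects any post whose signature does not match. This is only an illustration for discussion, not what Datazilla implements; the names PROJECT_SECRETS, sign_payload, and is_authorized are hypothetical, and the secret shown is a placeholder.

```python
import hashlib
import hmac

# Hypothetical per-project shared secrets. A real deployment would load
# these from protected configuration, not from source code.
PROJECT_SECRETS = {"talos": b"example-secret"}


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of the raw request body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_authorized(project: str, payload: bytes, signature: str) -> bool:
    """Accept a submission only if its signature matches the project's secret."""
    secret = PROJECT_SECRETS.get(project)
    if secret is None:
        # Unknown project: nothing to verify against, so reject.
        return False
    expected = sign_payload(secret, payload)
    # compare_digest runs in constant time, so response timing does not
    # reveal how much of a forged signature was correct.
    return hmac.compare_digest(expected, signature)
```

A test runner would compute sign_payload(secret, body) and send the result alongside the post (for example in a request header); the endpoint would call is_authorized before storing anything. Signing the body rather than sending the key itself means the secret never travels with the request, though this alone does not prevent replay of a captured post.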
Datazilla is intended to replace the current graph server, which is used for collecting and displaying data from Talos performance tests. This replacement is not part of our Q2 goals, but it is part of the long-term purpose of the project.
9.) Desired Date of review (if known from https://email@example.com/Security%20Review.html) and whom to invite.
Friday, June 22nd or Friday, June 29th. Please invite: Jonathan Eads, Carl Meyer, Cameron Dawson, and Clint Talbert.
Matt and I will look at this
> Datazilla will provide a web-service interface that test runners on build
> machines will send test data to. This endpoint needs to be protected to
> prevent a bad actor from posting false data. To address this, Carl Meyer
> contacted Gary Kwong via email with a description of the problem. A
> question we're not sure how to answer is whether we should be using an
> endpoint that is on a private VPN, with a fallback option of API key based
> authentication or if there is a better approach that we are not aware of.
I looped in Guillaume and Joe via email previously; they might have other useful insights. I have also updated dchan about it.
Created attachment 638890
Security Review Report
I've attached my notes from this review.
Sorry, I forgot to mark this resolved.