Deep Fake Detection add on Data Review
Categories: Firefox :: General, task
People: Reporter: jmcbride, Unassigned
Attachments (2 files):
- 4.63 KB, text/plain (willkg: data-review-)
- 803.48 KB, image/png
Per https://wiki.mozilla.org/Data_Collection#Data_Collection_Categories filing this bug for data review.
Updated•8 months ago
Comment 1•8 months ago
Comment on attachment 9428328: deep_fake_detection_data_review.txt
Data Review Form (to be filled by Data Stewards)
1) Is there or will there be documentation that describes the schema for the
ultimate data set in a public, complete, and accurate way?
The data collected will be documented in the privacy policy, which is not yet
ready for review.
2) Is there a control mechanism that allows the user to turn the data collection
on and off?
If a user does not want content data collected, they must uninstall the
extension. The extension doesn't function without this.
Collection of errors and telemetry is on by default and can be turned off.
3) If the request is for permanent data collection, is there someone who will
monitor the data over time?
The technical data is typical product observability data of the kind required
for any product, e.g. errors.
4) Using the Data_Collection page on the Mozilla wiki, what collection type of
data do the requested measurements fall under?
Category 1: Technical data
- error reports
Category 2: Interaction data
- feature clicked
- release channel
Category 4: Highly sensitive or clearly identifiable personal data
- ip address--used to identify users and track their interaction data
- content submitted by the user for evaluation by the service's models.
5) Is the data collection request for default-on or default-off?
This request covers different data collections.
Errors and interaction data are default-on, but can be turned off.
User content is default-on and can't be turned off--the user must uninstall the
addon. The addon doesn't function without this data.
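The split described in the answers to questions 2 and 5 could be modelled roughly as below. This is only an illustrative sketch: the class, field, and kind names are hypothetical and are not taken from the add-on's code.

```python
from dataclasses import dataclass


@dataclass
class CollectionSettings:
    """Hypothetical per-user collection settings (names are illustrative)."""

    # Category 1 and 2 data: on by default, but the user can opt out.
    error_reports: bool = True
    interaction_data: bool = True

    def allowed(self, kind: str) -> bool:
        """Return whether data of the given kind may be sent."""
        if kind == "user_content":
            # Submitted content is what the service evaluates, so there is
            # no toggle for it; uninstalling the add-on is the only opt-out.
            return True
        if kind == "error_reports":
            return self.error_reports
        if kind == "interaction_data":
            return self.interaction_data
        # Any kind not listed above is treated as not collectable.
        return False


defaults = CollectionSettings()
opted_out = CollectionSettings(error_reports=False, interaction_data=False)
```

With these assumptions, `opted_out.allowed("user_content")` still returns `True`, which mirrors the point that content collection cannot be switched off separately.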
6) Does the instrumentation include the addition of any new identifiers?
No.
7) Is the data collection covered by the existing Firefox privacy notice?
They're writing a privacy notice specifically for this addon.
At this point, it's unclear whether data collection will be covered solely by
the new privacy notice or by both the new notice and the Firefox privacy
notice.
Assuming that the data collection is covered by at least one of those, we should
be good here.
8) Does the data collection use a third-party collection tool?
Sentry -- we use this for other products
Fakespot event server
Conclusion
The Category 1 and 2 data is fine.
The Category 4 data needs to go through Sensitive Data Review.
Because everything is done in one big request, I'm going to r- this.
I suggest we reduce the scope of this data collection review request to just the category 1 and 2 things by attaching a new review attachment. Then someone should create a new bug for the category 4 bits which will then go through sensitive data review. It's easier for the review group to focus on the specific things they need to review rather than have to look at everything.
Comment 2•7 months ago
This has been reviewed by Mozilla Security Risk Management as part of the Data Collection review process and is approved.
Comment 3•7 months ago
From a sensitive data review standpoint, I'd like to see us communicate more clearly what our operational model is (this personal data is available to Mozilla staff, which is suboptimal in my mind, but in line with expectations). I asked Michael whether this could be integrated into the legal notices and got pushback there, but I'd still like to see it in some user-visible documentation. Michael suggested SUMO, which seems fine.
Just reviewing comment #1, I'm concerned with this:
> ip address--used to identify users and track their interaction data
I seriously hope that we aren't using IP address in this way. That isn't a request to use OHTTP, but more of a hygiene thing. We should have some sort of session identifier that is exchanged under encryption for connecting interactions with active sessions and any interrelation between sessions. Again, I'd prefer if we didn't connect separate sessions from the same user, but understand that a good solution in that area is likely to be tricky. I'd expect that we aren't using IP, but if we are, that's a deal breaker for me.
With those two conditions, this is OK from the CTO side of things.
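As a rough illustration of the session identifier idea above, and not a description of how the service actually works, interactions could be keyed by a random per-session token instead of by IP address. All names here are hypothetical, and the sketch assumes the token only ever travels inside an encrypted connection.

```python
import secrets
from collections import defaultdict


def new_session_id() -> str:
    """Return a random identifier that carries no network information."""
    # 16 random bytes, URL-safe encoded. Assumed to be sent only over an
    # encrypted channel (for example inside an HTTPS request).
    return secrets.token_urlsafe(16)


class InteractionLog:
    """Groups interaction events by session identifier, never by IP."""

    def __init__(self):
        self._events = defaultdict(list)

    def record(self, session_id: str, event: str) -> None:
        self._events[session_id].append(event)

    def events_for(self, session_id: str) -> list:
        # .get() avoids creating an entry for unknown sessions.
        return list(self._events.get(session_id, []))


log = InteractionLog()
first = new_session_id()
second = new_session_id()  # a later session, possibly from the same user
log.record(first, "feature_clicked")
log.record(second, "feature_clicked")
```

Because each session gets a fresh token, nothing in this sketch links `first` and `second` to each other, which matches the stated preference for not connecting separate sessions from the same user.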
Comment 4•7 months ago
The Bugbug bot thinks this bug should belong to the 'Firefox::Shopping' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 5•7 months ago
From the Privacy/Legal and Data perspectives (commenting for Data on Mark's behalf), this is approved. Work is in progress on the best way to ensure clarity on the language flagged by Martin.
Comment 6•7 months ago
(In reply to BugBot [:suhaib / :marco/ :calixte] from comment #4)
> The Bugbug bot thinks this bug should belong to the 'Firefox::Shopping' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Firefox :: Shopping isn't the right component for this request. Not sure where it should be, so I'm moving the ticket over to General. Hopefully from there, one of the general triagers can find a better place for it.
Comment 7•7 months ago
I chatted with Joe and Chris about this, and the attached screenshot helped a great deal in clearing things up.
The current copy on that permissions dialog really makes this seem worse from a privacy perspective than it truly is. That's something that we could improve through iteration, but the current state encompasses the possibility that the data would be exposed to our staff.
I would like to see us work toward a system where our staff cannot access the content that is uploaded, but that requires some fairly sophisticated technology and I don't think that people presently expect that of us. As we move into building more AI-based stuff, that's something we should do, following Apple's lead perhaps. That would need to be a longer-term goal; we're not resourced at that level and the technology is still a little immature. The Anonym team here are likely to be great resources, but even they are pushing the limits of what is presently feasible.
As for the IP address thing, we established that that was just an accident of wording. As shown, we only collect IP addresses with permission and only for the purposes of helping diagnose problems.
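The IP address rule stated above can be sketched as a simple gate when building a diagnostic report. This is a hypothetical example: the function name, the report fields, and the way permission is passed in are assumptions, not the add-on's actual code.

```python
def build_diagnostic_report(error: str, ip_address: str,
                            ip_permission_granted: bool) -> dict:
    """Build an error report; the IP address is attached only with consent."""
    report = {"error": error}
    if ip_permission_granted:
        # Kept solely to help diagnose problems, not to link interactions.
        report["ip_address"] = ip_address
    return report


# 203.0.113.7 is an address from a range reserved for documentation.
without_ip = build_diagnostic_report("timeout", "203.0.113.7", False)
with_ip = build_diagnostic_report("timeout", "203.0.113.7", True)
```

Here `without_ip` contains only the error text, while `with_ip` also carries the address because permission was given.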