444351 - do not send client UUID / GUID with crash reports

There was some fervent discussion happening, mostly between ss and shaver (although I can't remember where), and I wasn't clear on the outcome. shaver: did you two ever reach consensus?

Benjamin Smedberg

Comment 6

•

17 years ago

Shaver, decision please?

Assignee: ted.mielczarek → shaver

Flags: blocking1.9.1? → blocking1.9.1+

Daniel Veditz [:dveditz]

Comment 7

•

17 years ago

Can't see why we need to muck with this on the stable branch. Besides, the client UUID was occasionally tremendously useful with talkback reports -- not to have the UUID itself, but to know whether a particular 1000-report crash was from 900 unique users or 1 user (maybe developing a broken addon?).

Flags: wanted1.9.0.x? → wanted1.9.0.x-

Benjamin Smedberg

Comment 8

•

17 years ago

ted, please push this.

Samuel Sidler (old account; do not CC)

Comment 9

•

17 years ago

I don't think this patch should get pushed to mozilla-central (or any branches) until a decision has been made. Has this even been discussed in a public forum like the newsgroups? It seems like this is a governance issue related to privacy and data, not simply a module owner decision. We should get further feedback. (And no, I don't think Shaver and I ever reached consensus.)

Benjamin Smedberg

Comment 10

•

17 years ago

I've heard arguments on both sides. We're balancing the reward of being able to associate the number of users experiencing crashes versus the privacy risk of being able to associate multiple crashes with a user and build up a user profile. My decision as module owner is that the risk is much greater than the reward here. If you'd like to challenge that decision, please do so in the newsgroups posthaste.

Jesse Ruderman

Comment 11

•

17 years ago

(In reply to comment #7)
> Can't see why we need to muck with this on the stable branch. Besides, the
> client UUID was occasionally tremendously useful with talkback reports -- not
> to have the UUID itself, but to know whether a particular 1000-report crash was
> from 900 unique users or 1 user (maybe developing a broken addon?).

I can think of several ways to accomplish this without the privacy problems associated with a GUID:

* Send a random 4-bit identifier instead of a GUID. If all the identifiers for a given crash signature are the same, we can guess that they're all from the same person. But we can't tell the difference between 100 users hitting a crash and 10000 users hitting a crash.

* Send a number that is a function of the number of crashes reported by the user in the last month. (For example, the buckets could be "1", "2 to 4", "5 to 15", and "16 or more"). If a given crash signature keeps showing up in the "16 or more" bucket, we can infer that the crash affects a small number of users a lot.
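A minimal sketch of the two suggestions in this comment, written in Python for illustration only. The function names are hypothetical and are not part of the crash reporter; the sketch also assumes the 4-bit identifier would be generated once per installation and then reused, which the comment implies but does not state.

```python
import secrets


def random_4bit_id() -> int:
    """Return a random identifier in the range 0..15 (4 bits).

    Assumption: a client would call this once and store the result,
    so that repeated reports from the same install carry the same value.
    """
    return secrets.randbelow(16)


def crash_count_bucket(crashes_last_month: int) -> str:
    """Map a monthly crash count to the coarse buckets named in the comment."""
    if crashes_last_month < 1:
        raise ValueError("a submitted report implies at least one crash")
    if crashes_last_month == 1:
        return "1"
    if crashes_last_month <= 4:
        return "2 to 4"
    if crashes_last_month <= 15:
        return "5 to 15"
    return "16 or more"
```

With only 16 possible identifiers, many unrelated users share each value, which is why this scheme can suggest "probably one person" but cannot count users.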

chris hofmann

Comment 12

•

17 years ago

In a lot of "hallway" conversations we have talked about getting rid of e-mail address collection and going to a system like Jesse suggests in comment 11. The latter suggestion in comment 11, rotating the "submitter id" either after a number of reports or on a monthly basis, seems like the best approach.

On the macro analysis level, we have gotten a lot of value in the past, and will again in the upcoming MTBF report, from knowing how many users have submitted reports versus the total number of reports submitted, and comparing those numbers release to release.

On the micro analysis level, understanding whether a specific crash is coming from one user, from just a few users, or is broad-based has played an important part in isolating and reproducing many crashes. Quickly understanding how broad-based the crash is helps direct the next level of analysis and which path might yield the most information. For example:

* If it is just a few users crashing, understanding similarities in their configuration is often an effective next step.

* If it is many users, looking at configuration similarities is not so effective and we look to other analysis techniques.

Please don't break either one of these macro or micro analysis tools.

Benjamin Smedberg

Comment 13

•

17 years ago

The MTBF report *should* be using the time-since-last-crash number, and shouldn't need to use the submitter ID at all. Is there a bug where it is being implemented? Yes, it is valuable to know whether a crash comes from a few or many users. But you can perform automated regression testing to find similarities in configuration without any unique IDs. It is sufficient to compare the DLL list and other semi-unique characteristics of the report; data which is already collected and available. I don't believe that the "number of unique users experiencing this crash" number is a sufficient benefit to offset the risk to user privacy of the unique ID.
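As one possible illustration of comparing the DLL list between reports, here is a small Python sketch that scores two module lists by Jaccard similarity. This is an assumed technique chosen for the example, not something the comment or Socorro specifies, and the function name is hypothetical.

```python
def module_similarity(modules_a: set[str], modules_b: set[str]) -> float:
    """Jaccard similarity of two loaded-module (DLL) lists.

    Returns 1.0 for identical sets and 0.0 for disjoint ones.
    """
    if not modules_a and not modules_b:
        return 1.0  # two empty lists are trivially identical
    shared = modules_a & modules_b
    combined = modules_a | modules_b
    return len(shared) / len(combined)
```

Reports for one crash signature that all score close to 1.0 against each other would hint at a shared configuration, without any per-user ID being sent.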

Benjamin Smedberg

Comment 14

•

17 years ago

Today's platform meeting decided to go ahead with this patch and remove GUIDs from crash reports, and continue investigating other less invasive ways of getting crash-per-user data, such as system signatures.

(not currently active) Ted Mielczarek

Assignee

Updated

•

17 years ago

Assignee: shaver → ted.mielczarek

(not currently active) Ted Mielczarek

Assignee

Comment 15

•

17 years ago

Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/e35d75754108

I'll push to 1.9.1 once this cycles green on trunk. We should consider taking this on the 1.9.0 branch as well.

Status: ASSIGNED → RESOLVED

Closed: 17 years ago

Resolution: --- → FIXED

chris hofmann

Comment 16

•

17 years ago

> But you can perform automated regression testing to find similarities in
> configuration without any unique IDs. It is sufficient to compare the DLL list
> and other semi-unique characteristics of the report; data which is already
> collected and available.

Could an attacker who has access to the database use this same approach to develop a "user profile"? Isn't this approach just a way of post-processing the data to create something just like a UUID, or a proxy for the UUID?

If we are going to develop tools like those suggested in bug 472358, couldn't an attacker use those tools to associate a collection of crashes with a specific user, then use the same system configuration or other attributes to go find other reports from the same user? If we don't develop those tools, couldn't an attacker develop them for themselves?

If it's technically feasible to post-process a collection of crash reports to determine whether they come from the same user, and so create a proxy for a UUID or an actual "post-processed" UUID, it's not clear to me what we have really gained by removing the UUID. It has only increased the processing burden, for us and for attackers, of creating the post-processed user identification.

I agree that there may be some privacy risks here, so we should articulate them and find solutions that prevent possible problems.

Ted is worried about an AOL-style disclosure, where a collection of URLs was enough to develop the browsing profile of specific individuals. That seems a valid concern, and that kind of attack seems theoretically possible if an attacker gets full access to the database right now. Here is how an attack like that might happen:

* Find a crash report with a URL that has user info embedded in it, e.g. http://socialnetworking.com/show_homepage?user=chofmann
* Take the crash report with that URL and get the config info for the PC that submitted that crash.
* Start matching that config info to that in other crash reports.
* If a match is found, then print the URL.

No UIDs are involved, but my browsing history is revealed by a script that does the steps above, and it's directly associated with me as a person.

The problem in this case is that we have made URL collection and transmission "opt-out" as part of breakpad/socorro, and automated the process of gathering and transmitting the full detailed host name and path information in every URL. We have much more precise data now with that system, but we all feel a bit too uneasy to publish that data and have it hang around on the servers with the rest of every crash report.

As I've said in several other forums, we really should audit each of the pieces of information that we are collecting, identify the pieces that are really sensitive, figure out if we still need them, and figure out if they should be opt-in. Some candidates for removal are:

* e-mail address - could be sensitive - not used in analysis; misleads users into thinking that we will likely contact them about a specific crash, when in practice we don't do this. Many users don't provide this now. Get rid of e-mail addresses.

* IP address - could be sensitive - not used in any analysis techniques we use now.

* URLs - could be very sensitive and reveal an individual's browsing history - used in analysis; really valuable in finding reproducible sites and common content patterns that cause crashes. Consider making this opt-in again, where the user types in the URL and discloses just the parts of the site they are comfortable with. Add back the right disclosure so we are more comfortable with having URL data made public again.

* UUID - by itself it does not seem valuable, since the number isn't connected to an individual. Could be reconstructed in post-processing by looking at other configuration details. What do we gain if we remove it?

Continue by looking at each of the data types we collect: process list... time of crash... etc...
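The linking steps described in this comment can be sketched in a few lines of Python, which shows how little work the re-identification would take. Everything here is hypothetical: the report fields (`url`, `build_id`, `theme`, `addons`) and the function names are assumptions made for the example, not the Socorro schema.

```python
from urllib.parse import parse_qs, urlparse


def config_key(report: dict) -> tuple:
    """Build a crude configuration fingerprint from semi-unique fields.

    Assumption: build ID, theme, and add-on list are the fields compared.
    """
    return (report["build_id"], report["theme"], tuple(sorted(report["addons"])))


def urls_linked_to_user(reports: list[dict], user: str) -> list[str]:
    """Return URLs from all reports whose configuration matches a report
    that names `user` in a `user=` query parameter."""
    seed_keys = {
        config_key(r)
        for r in reports
        if user in parse_qs(urlparse(r["url"]).query).get("user", [])
    }
    return [r["url"] for r in reports if config_key(r) in seed_keys]
```

No client UUID appears anywhere in the sketch; the configuration fingerprint plays that role, which is the concern raised above.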

Daniel Veditz [:dveditz]

Updated

•

17 years ago

Flags: wanted1.9.0.x- → wanted1.9.0.x?

(not currently active) Ted Mielczarek

Assignee

Comment 17

•

17 years ago

Pushed to 1.9.1: http://hg.mozilla.org/releases/mozilla-1.9.1/rev/97024aa042e5

Keywords: fixed1.9.1

(not currently active) Ted Mielczarek

Assignee

Updated

•

17 years ago

Attachment #328727 - Flags: approval1.9.0.6?

(not currently active) Ted Mielczarek

Assignee

Comment 18

•

17 years ago

Comment on attachment 328727
stop sending UserID

Pretty small change, it's all code removal. If we're serious about this we should take it on 1.9.0.

(not currently active) Ted Mielczarek

Assignee

Updated

•

17 years ago

Blocks: 392608

Daniel Veditz [:dveditz]

Comment 19

•

17 years ago

Comment on attachment 328727
stop sending UserID

Approved for 1.9.0.6, a=dveditz for release-drivers. Code freeze is tonight though -- please land ASAP

Attachment #328727 - Flags: approval1.9.0.6? → approval1.9.0.6+

Samuel Sidler (old account; do not CC)

Comment 20

•

17 years ago

QA: To verify, please ensure crash submission works (and shows up in Socorro), the UUID does not display in the crash reporter after a crash, and confirm with IT that the UUID is not present in the database (you can file a mozilla.org::Server Ops bug asking for information for a specific crash report ID).

(not currently active) Ted Mielczarek

Assignee

Comment 21

•

17 years ago

Checked in to 1.9.0:

Checking in toolkit/crashreporter/nsExceptionHandler.cpp;
/cvsroot/mozilla/toolkit/crashreporter/nsExceptionHandler.cpp,v <-- nsExceptionHandler.cpp
new revision: 1.40; previous revision: 1.39
done

To clarify what ss said, you can verify this client-side by using the crash me now extension, then clicking "Details" in the crash reporter window, and verifying that "UserID" is not in the list of data shown. Verifying it server-side would require an IT ticket.
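For anyone who wants to script the client-side check, here is a rough Python sketch. It assumes the text from the "Details" view has been copied out as one `Key: value` pair per line; that format and the function name `parse_details` are assumptions for illustration, not a documented crash reporter interface.

```python
def parse_details(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines into a dict, ignoring lines without a key."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so URLs in the value stay intact.
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


sample = "BuildID: 20090107033209\nProductName: Firefox\nVendor: Mozilla\n"
fields = parse_details(sample)
# The fix is verified client-side if no UserID field is present.
assert "UserID" not in fields
```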

Keywords: fixed1.9.0.6

u88484

Comment 22

•

17 years ago

Ted, I tested using the latest nightly with the crash me extension and here are my results.

Add-ons: bettergmail2@ginatrapani.org:0.4,{59c81df5-4b7a-477b-912d-4e0fdf64e5f2}:0.9.84,{1280606b-2510-4fe0-97ef-9b5a22eafe80}:0.3.9.1,crashme@ted.mielczarek.org:0.1,inspector@mozilla.org:2.0.1,{8620c15f-30dc-4dba-a131-7c5d20cf4a29}:2.0.2,toggleprivatebrowsing@supernova00.biz:1.8,UnsortedBookmarksMenu@alice:1.6,{972ce4c6-7e08-4474-a285-3208198ce6fd}:3.2a1pre
BuildID: 20090107033209
CrashTime: 1231434676
InstallTime: 1231348391
ProductName: Firefox
SecondsSinceLastCrash: 1112360
StartupTime: 1231434666
Theme: classic/1.0
URL: http://code.google.com/p/socorro/downloads/detail?name=crashme.xpi&can=2&q=
Vendor: Mozilla
Version: 3.2a1pre

This report also contains technical information about the state of the application when it crashed.

Now when I viewed the crash report online at http://crash-stats.mozilla.com/report/index/c08ec64f-27d5-41ba-935a-ae5d22090108?p=1#details I see a UUID in the details tab.

UUID c08ec64f-27d5-41ba-935a-ae5d22090108

Is this my client UUID or something different?

u88484

Comment 23

•

17 years ago

Crap, sorry for the bug spam. After submitting I noticed that the UUID is actually of the crash report haha. Sorry again about the bug spam!

Al Billings [:abillings - ex-MoCo]

Comment 24

•

17 years ago

For 1.9.0.6, I used http://crash-stats.mozilla.com/report/index/9659a7cb-f5ce-4470-acf2-098b02090116?p=1. This is verified but I need to check with server ops that the UUID is not in the DB.

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 25

•

17 years ago

Al, have you already filed a bug for the server-side verification?

Samuel Sidler (old account; do not CC)

Comment 26

•

17 years ago

Yes (although I don't recall the bug number) and it was verified.

Keywords: fixed1.9.0.6 → verified1.9.0.6

Nochum Sossonko [:Natch]

Updated

•

17 years ago

Keywords: fixed1.9.1

Samuel Sidler (old account; do not CC)

Comment 27

•

17 years ago

highmind63@gmail.com: Please do *not* remove the fixed1.9.1 keyword from bugs that have been fixed on the 1.9.1 branch (see comment 17 of this bug).

Keywords: fixed1.9.1

Nochum Sossonko [:Natch]

Comment 28

•

17 years ago

I removed it because it's verified1.9.1 now, but feel free to keep it.

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 29

•

17 years ago

Natch, fyi it was getting verified for 1.9.0.6, not for 1.9.1.

Al Billings [:abillings - ex-MoCo]

Comment 30

•

17 years ago

Sam and Henrik, it was Bug 474084 and it is verified.

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 31

•

17 years ago

Ok, so let's do the same thing for trunk and 1.9.1. I've filed bug 475531 for that.

Depends on: 474084, 475531

Henrik Skupin [:whimboo][⌚️UTC+2]

Comment 32

•

17 years ago

There are no user_id entries in the database for trunk and 1.9.1 builds. The ID is not even visible in the crash reporter details window. Marking as verified.

Status: RESOLVED → VERIFIED

Keywords: fixed1.9.1 → verified1.9.1

Target Milestone: --- → mozilla1.9.2a1

Daniel Veditz [:dveditz]

Updated

•

17 years ago

Flags: wanted1.9.0.x? → wanted1.9.0.x+