Open Bug 1177121 Opened 9 years ago Updated 2 years ago

Redesigning the crash experience

Categories

(Toolkit :: Crash Reporting, defect)

People

(Reporter: Yoric, Unassigned)

References

(Depends on 2 open bugs)

Details

Short summary of the Whistler Crash Experience conclusions.

The main reason we cannot send crash data automatically is that we do not know exactly what data the minidump may capture accidentally, so it may contain anything. On the other hand, we can send the metadata automatically without privacy concerns (update: we already do, as part of FHR/Telemetry).

1. For some reason, Firefox crashes;
2. The crash is stored on disk;
3. During the next startup, show a prompt (for instance, yellow bar, or Shield), with something along the lines of "Firefox has detected a problem, would you like to send us information so that we can debug it? Warning, this can contain private data. Yes/No/Always/Never" – if "Yes", send both the minidump and the metadata;
4. Users with FHR/Telemetry activated have the metadata sent regardless.

Note: We discussed using encryption and sending a minidump encrypted with a key we don't have, but this doesn't seem to add anything.

As a followup, we can actually skip the prompt in cases in which we don't care about the C++ stack, e.g. AsyncShutdown crashes.
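
The flow above can be sketched as a small decision function. This is only an illustrative sketch in Python, not Firefox code: the names `Consent`, `UploadPlan`, `plan_upload` and `needs_prompt` are hypothetical, and treating "Always" as "Yes" plus a remembered choice (and "Never" as a remembered "No") is an assumption, since the summary only spells out the "Yes" case.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Consent(Enum):
    """The four answers offered by the proposed prompt."""
    YES = "yes"
    NO = "no"
    ALWAYS = "always"
    NEVER = "never"


@dataclass(frozen=True)
class UploadPlan:
    send_minidump: bool    # may contain private data, needs consent
    send_metadata: bool    # sent with consent, or via FHR/Telemetry
    remember_choice: bool  # assumption: Always/Never are persisted


def plan_upload(consent: Consent, telemetry_enabled: bool) -> UploadPlan:
    """Decide what to send for one stored crash (steps 3 and 4)."""
    agreed = consent in (Consent.YES, Consent.ALWAYS)
    return UploadPlan(
        send_minidump=agreed,
        # Step 4: users with FHR/Telemetry have the metadata sent regardless.
        send_metadata=agreed or telemetry_enabled,
        remember_choice=consent in (Consent.ALWAYS, Consent.NEVER),
    )


def needs_prompt(saved_choice: Optional[Consent], needs_cpp_stack: bool) -> bool:
    """Follow-up idea: skip the prompt when the C++ stack is not needed
    (e.g. AsyncShutdown crashes), or when a choice was remembered
    (the latter is an assumption of this sketch)."""
    if not needs_cpp_stack:
        return False
    return saved_choice not in (Consent.ALWAYS, Consent.NEVER)
```

For example, `plan_upload(Consent.NO, telemetry_enabled=True)` sends the metadata but no minidump, which matches step 4.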
(In reply to David Rajchenbach-Teller [:Yoric] (away June 22 - July 7th, use "needinfo") from comment #0)
> Short summary of the Whistler Crash Experience conclusions.
> 
> The main reason for which we cannot send crash data automatically is that we
> do not know exactly what data may be captured by the minidump accidentally,
> so this may contain any data. On the other hand, we can send the metadata
> automatically without privacy concerns (update: we already do, as part of
> the FHR/Telemetry).
> 
> 1. For some reason, Firefox crashes;
> 2. The crash is stored on disk;
> 3. During the next startup, show a prompt (for instance, yellow bar, or
> Shield), with something along the lines of "Firefox has detected a problem,
> would you like to send us information so that we can debug it? Warning, this
> can contain private data. Yes/No/Always/Never" – if "Yes", send both the
> minidump and the metadata;
> 4. Users with FHR/Telemetry activated have the metadata sent regardless.
> 
> Note: We discussed using encryption and sending an minidump encrypted with a
> key we don't have, but this doesn't seem to add anything.
> 
> As a followup, we can actually skip the prompt in some cases in cases in
> which we don't care about the C++ stack, e.g. AsyncShutdown crashes.

For unified Telemetry we send metadata:
https://gecko.readthedocs.org/en/latest/toolkit/components/telemetry/telemetry/crash-ping.html

... but not for FHR:
https://gecko.readthedocs.org/en/latest/services/healthreport/healthreport/dataformat.html#org-mozilla-crashes-crashes
The crash report submission UI could be integrated into the post-crash session restore page, when shown.
You forget the deal-breaker arguments:
1) We need the comments on the crashes, and most people will not remember on next startup what they were doing when they crashed. Losing the comments would seriously impede our ability to diagnose crashes.
2) We'd never get crashes reported that happen before the prompt for sending, most importantly startup crashes.

It sounds to me that either we were in different sessions in Whistler or we understood the discussion very differently, as to me it sounded like we pretty firmly cannot change anything in our crash submission process (other than potentially for the shutdown hangs).
Actually, if I recall, the idea was
1) "let's restart Firefox automatically if it crashed during regular use" (i.e., not startup or shutdown, and unless we are looping);
2) if Firefox crashed before the user has had a chance to respond to the prompt (in particular, if it never showed up), show the crash reporter.
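
A rough sketch of these two rules, in Python and purely for illustration: `Phase`, `CrashContext`, the `recent_crash_count` field and the `loop_threshold` default are all hypothetical, and "looping" is assumed here to mean "too many crashes in a recent window", which the discussion did not define.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    STARTUP = "startup"
    RUNNING = "running"    # "regular use"
    SHUTDOWN = "shutdown"


@dataclass(frozen=True)
class CrashContext:
    phase: Phase
    recent_crash_count: int  # hypothetical counter, includes this crash
    prompt_unanswered: bool  # an earlier prompt was never answered or never shown


def should_auto_restart(ctx: CrashContext, loop_threshold: int = 3) -> bool:
    """Rule 1: restart only for crashes during regular use, unless we are looping."""
    looping = ctx.recent_crash_count >= loop_threshold
    return ctx.phase is Phase.RUNNING and not looping


def should_show_crash_reporter(ctx: CrashContext) -> bool:
    """Rule 2: fall back to the crash reporter if the user never had a
    chance to respond to the in-browser prompt."""
    return ctx.prompt_unanswered
```

Under these assumptions, a first crash during regular use restarts the browser, while a crash that happens with an earlier prompt still unanswered brings up the crash reporter.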
There were multiple ideas floating around, but I didn't get the feeling that there was anything this rather large group could agree on.

I think we should, instead of a random discussion like we had there, do an actual working meeting with a smaller group of stakeholders that can actually decide on a way forward.

That said, even that meeting will not result in a final implementation but in the cornerstones or guidelines of where we really want this to go. From there to an actual implementation, we'll need to account for these three points as well:

1) We need to have UX revisit the whole flow and not do just one-offs there.
2) We need in-depth privacy reviews of anything that switches away from "always have the user acknowledge sending a crash report" as there are multiple pieces of privacy-relevant data in there.
3) We need to think of how to not lose user comments to crash reports during that, as they often significantly help us diagnose the issues.

(At least, that's my opinion on this.)
This is a complete great-or-dead project which requires coordination between engineering/UX/product/QA. Please do not work on this project until it comes up as a great-or-dead priority and we can dedicate the necessary resources.
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #0)
> 3. During the next startup, show a prompt (for instance, yellow bar, or
> Shield), with something along the lines of "Firefox has detected a problem,
> would you like to send us information so that we can debug it? Warning, this
> can contain private data. Yes/No/Always/Never" – if "Yes", send both the
> minidump and the metadata;

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #3)
> 2) We'd never get crashes reported that happen before the prompt for
> sending, most importantly startup crashes.

I think it should be possible to implement the question "Firefox has detected a problem, would you like to send us information so that we can debug it? Warning, this can contain private data. Yes/No/Always/Never" in the breakpad app, too. I guess the breakpad app should be able to read and write this setting as well.
This should at least let a user set this preference even when FF is crashing all the time.


I'm missing a discussion about how to handle the URLs of the open windows.
I normally send in all crashes, but without the URLs, to protect my privacy.
(But after some of Robert's comments I ask myself whether this was enough protection ... ;-) )
Does Mozilla need the URLs? I guess Google Chrome, for example, doesn't send them ...
If a crash is related to a specific page, wouldn't the user note it in the comments?
Quoting from https://bugzilla.mozilla.org/show_bug.cgi?id=1138399 to move the discussion to this bug:

> (In reply to Ted Mielczarek [:ted.mielczarek] from comment #19)
> > I don't want to add a pref for this. We should just keep track of reports
> > that we tried and failed to send and attempt to re-submit them at some point
> > using the in-browser crash submission pipeline.
> 
> But what's about container crashes? Think they will not be tried to send, or?
> 
> 
> OK, know OT but just some short questions because interested/involved people
> are here and it should be much easier & faster then in the mailing list ...
> Would it be possible that when I (the user) load a crash report on Socorro
> via about:crashes that then Socorro knows that I (the user) was the
> submitter and I (the user) can add a user comment to the report as submitter
> of the report?
> Means: Include the TXT on the user system something like a unique identifier
> to the report that nobody else can know ???
> Can/should we fill a bug for this against Socorro ???

My idea would be to move the commenting on a crash to Socorro once FF comes up again.
Maybe this would also get more people involved in participating in Mozilla.

In this case I also had the idea of logging in to Socorro with a Firefox Account, for users who already have an account in their FF profiles.
If FF restarts after a crash and the user immediately has the chance to add comments to the crash (if he wants to), the comments are only missing when FF is not able to start ...
Robert, have comments on crashes that prevent FF from starting ever helped to fix a crash, or is there really a chance that these comments can help in this case?
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #0)
> 3. During the next startup, show a prompt (for instance, yellow bar, or
> Shield), with something along the lines of "Firefox has detected a problem,
> would you like to send us information so that we can debug it? Warning, this
> can contain private data. Yes/No/Always/Never" – if "Yes", send both the
> minidump and the metadata;

Some months ago there was a short discussion about such things, which I was involved in, somewhere else in a bug ... Does anybody know which bug this was? I would like to look at that discussion again, too.
Bug 1138399:
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #21)
> Just as a note there is no way to "only send [...] crash reports [...]
> without private data" as the core of the crash reports is the minidump which
> contains highly sensitive private data.

Robert, when I was talking about "no private data" I was always talking about the open URLs.
Sorry for confusion!
I think there was a bug about improving breakpad somewhere, too?
I think it is related, isn't it?

I think at the moment breakpad doesn't send the bitness (32/64-bit) of the OS and FF, right?
Wouldn't it help to know that?
(In reply to Tobias B. Besemer [:BesTo] from comment #7)
> I think it should be possible to implement the question "Firefox has
> detected a problem, would you like to send us information so that we can
> debug it? Warning, this can contain private data. Yes/No/Always/Never" in
> the breakpad-app, too. Guess the breakpad-app should be able to read/write
> this setting, too.

Not without a privacy review. The contents of the minidump in the crash report are more privacy-relevant than the URLs and emails potentially sent in the metadata.

(In reply to Tobias B. Besemer [:BesTo] from comment #12)
> I think there was a bug about improving breakpad somewhere, too?

The bug here is about the whole experience.

And as comment #6 explains, nobody working for Mozilla will work on any of this for now, until it becomes enough of a priority.
(In reply to Tobias B. Besemer [:BesTo] from comment #11)
> Bug 1138399:
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #21)
> > Just as a note there is no way to "only send [...] crash reports [...]
> > without private data" as the core of the crash reports is the minidump which
> > contains highly sensitive private data.

On the other hand, the definition of "crash reports [...] without private data" at Whistler was "only send the metadata, without the minidump."
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #14)
> On the other hand, the definition of "crash reports [...] without private
> data" at Whistler was "only send the metadata, without the minidump."

Which would be 95% useless reports, I don't want that, we already have enough reports we ignore.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #13)
> (In reply to Tobias B. Besemer [:BesTo] from comment #7)
> > I think it should be possible to implement the question "Firefox has
> > detected a problem, would you like to send us information so that we can
> > debug it? Warning, this can contain private data. Yes/No/Always/Never" in
> > the breakpad-app, too. Guess the breakpad-app should be able to read/write
> > this setting, too.
> 
> Not without a privacy review. The contents of the minidump in the crash
> report are more privacy-relevant than the URLs and emails potentially sent
> in the metadata.

What do you mean by "Not without a privacy review."?
It was intended as an additional way to get the setting set (Always/Never) if the first crash is handled by the breakpad app and can't be managed within FF.


(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #13)
> (In reply to Tobias B. Besemer [:BesTo] from comment #12)
> > I think there was a bug about improving breakpad somewhere, too?
> The bug here is about the whole experience.

I found this bug, too: Bug 765299 – Overhaul crash reporting UX


Robert, I'm just trying to do a little brainstorming on whether we can find a "cool" solution that keeps the comments and is secure (privacy-wise) for the user ... I think brainstorming in such a small group (this bug) may bring up an idea ...
(In reply to Tobias B. Besemer [:BesTo] from comment #12)
> I think there was a bug about improving breakpad somewhere, too?
> Think it is related, or?
> Think at the moment breakpad don't send out the bit-version (32/64bit) of
> the OS and FF, right?
> Wouldn't help it to know that?

Would e.g. the Linux distribution also be interesting for finding problems?
I think the switch from Breakpad to Crashpad should be included in a redo, so I am adding bug 1174687 as a dependency.
Depends on: 1174687
No, this is unrelated work.
No longer depends on: 1174687
(In reply to Tobias B. Besemer [:BesTo] from comment #18)
> Think the switch from Brackpad to Crashpad should be included in a re-do so
> I add bug 1174687 as a dependency.
> Depends on: 1174687

(In reply to (on vacation July 18- 25) Ted Mielczarek [:ted.mielczarek] from comment #19)
> No, this is unrelated work.
> No longer depends on: 1174687

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #13)
> (In reply to Tobias B. Besemer [:BesTo] from comment #12)
> > I think there was a bug about improving breakpad somewhere, too?
> The bug here is about the whole experience.
> And as comment #6 explains, nobody working for Mozilla will work on any of
> this for now, until it becomes enough of a priority.

(In reply to Benjamin Smedberg  [:bsmedberg] (away until 27-July) from comment #6)
> This is a complete great-or-dead project which requires coordination between
> engineering/UX/product/QA. Please do not work on this project until it comes
> up as a great-or-dead priority and we can dedicate the necessary resources.

Guess the switch to Crashpad is/should be part of this ...
(In reply to (on vacation July 18- 25) Ted Mielczarek [:ted.mielczarek] from comment #19)
> No, this is unrelated work.

I think the work for now should be to define a goal and make to-do lists and a roadmap for it ...
... and to find the dependencies on other open bugs that have to be done to reach the goal, or file new ones ... Sure, it's just brainstorming, and after the discussions it has to pass some other authorities/groups, but I guess the chance of delivering something great quickly is better when the ideas/plans are already somewhat worked out for the people who have to work with/on them later ...

Here are some work steps I would suggest as a start:
- We need to bring together the things that have to be included, like the comments that Robert needs, and clear up some open questions, like whether the URLs open in the browser are needed in the crash reports;
- We need to look at a/the solution in the browser, with messages or whatever;
- And at the app that is triggered when FF doesn't come up anymore;
- There is the question of whether things can/should be done on Socorro;
- And of how crash management in FF should work (e.g. about:crashes);
- Things/bugs/feature requests that have to be fixed/done in breakpad/crashpad;
- Sorting open bugs or creating them, and bringing them together in meta-bugs, e.g. for the individual parts;
- Maybe make a wiki page for it to bring all the information together;
-> Then look at everything again and see what else has to be done, or what we have to redo/improve in the plan/information.

I guess for the first steps it doesn't matter where we start, but we should discuss one step at a time until that step is done, then move to the next, and not jump around between the individual parts; work from one problem to all other connected problems, and not end up with "everybody talking about everything and nobody can follow anymore".
Just some suggestions ... I'm not part of Mozilla, so I can't decide, but I would be interested in participating, because I think planning this can be interesting, and maybe I have a knack for it and some good ideas ...
Feedback welcome!
Btw.: And a "This solution or none!" and "No compromise with me!" helps no-one ...
(In reply to Tobias B. Besemer [:BesTo] (QA) from comment #22)
> Btw.: And a "This solution or none!" and "No compromise with me!" helps
> no-one ...

Sorry, but Mozilla is not a democracy. We try to make the best decisions with the information we have, but we're not going to implement something unless the module owner or peers think it's the right idea.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #23)
> (In reply to Tobias B. Besemer [:BesTo] (QA) from comment #22)
> > Btw.: And a "This solution or none!" and "No compromise with me!" helps
> > no-one ...
> 
> Sorry, but Mozilla is not a democracy. We try to make the best decisions
> with the information we have, but we're not going to implement something
> unless the module owner or peers think it's the right idea.

I can't believe that "having no solution yet" should really be better than "not having a perfect solution yet" ...

Check Bug 1219672. I am sending in at least one crash per day, manually ...
How many other people do you think do the same?
How many other people do you think _don't_ do the same?
Can you tell me how critical this problem really is?
See Also: → 929045
Depends on: 1233757
Depends on: 1233758
Is there any news on this bug?
Severity: normal → S3