Prompt users with pending crash reports to submit them

RESOLVED FIXED in Firefox 49

Status

()

defect
P1
normal
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: blassey, Assigned: blassey, NeedInfo)

Tracking

(Blocks 3 bugs)

unspecified
Firefox 49
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(e10s+, firefox47 unaffected, firefox48 wontfix, firefox49 fixed)

Details

Attachments

(1 attachment, 6 obsolete attachments)

Posted patch prompt_unsumbitted.patch (obsolete) — Splinter Review
We're seeing unusually low crash submission rates with e10s. There are many theories as to why that might be, but I feel like prompting when we see pending crash reports on start up might go along way to alleviate the problem.
Attachment #8748496 - Flags: ui-review?(madhava)
Attachment #8748496 - Flags: review?(mconley)
Assignee: nobody → blassey.bugs
Comment on attachment 8748496 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8748496 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/locales/en-US/chrome/browser/browser.properties
@@ +772,5 @@
>  tabgroups.migration.anonGroup = Group %S
>  tabgroups.migration.tabGroupBookmarkFolderName = Bookmarked Tab Groups
> +
> +pendingCrashReports.labelSingle = You have an unsubmitted crash report
> +pendingCrashReports.lableMulti = You have %S unsubmitted crash reports

Driveby nit: the correct thing to do here is to use 1 string and have it have 2 semi-colon-separated versions, and then using PluralForm ( https://developer.mozilla.org/en-US/docs/Mozilla/Localization/Localization_and_Plurals#Developing_with_PluralForm ) to pick the right one. That way languages who deal differently with 0, 1, 3, 5, ... can do that, too.
Comment on attachment 8748496 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8748496 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not particularly jazzed about adding a new potential notificationbar at startup. I already feel like we have too many of those.

I think I'm going to wait for UX's opinion on this before conducting a review. In case, :madhava doesn't get to this in time, I'm ui-review?'ing shorlander as well, and they can race.
Attachment #8748496 - Flags: review?(mconley) → ui-review?(shorlander)
Comment on attachment 8748496 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8748496 [details] [diff] [review]:
-----------------------------------------------------------------

I agree with Mike's concerns around more notification bars at start-up. Or more notification bars in general for that matter. But I would be OK with doing this for pre-release channels only since collecting this kind of data is what they are for. We should be prepared to reevaluate if this turns out to be too annoying and if people are not engaging with it.
Attachment #8748496 - Flags: ui-review?(shorlander)
Attachment #8748496 - Flags: ui-review?(madhava)
Attachment #8748496 - Flags: ui-review+
FWIW, maybe it would be worth thinking about how to have some kind of "submit crash reports automatically" pref?

Although there are a couple issues to sort out with that:

One is making sure we're comfortable with the privacy aspect. We started down this path with plugin crashes a while back... But (iirc) people argued that dumps contain enough sensitive data that there needs to be significant consent before a submission happens. But that was a while ago, and our thinking on such things has changed before... Having it be more of a pre-release thing would be an interesting angle too.

The other is that we get useful data from the comments people leave, so a simplistic crash-and-autosubmit would lose that. But maybe this could just be wired up to when we're tearing down the crashui anyway (and would otherwise have not submitted).

Clearly a separate bug if we want to do that, and I've no fundamental objection to an infobar.
Posted patch prompt_unsumbitted.patch (obsolete) — Splinter Review
Addressing the drive-by about plurals
Attachment #8748496 - Attachment is obsolete: true
Attachment #8748496 - Flags: review?(mconley)
Attachment #8749263 - Flags: ui-review+
Attachment #8749263 - Flags: review?(mconley)
Priority: -- → P1
(In reply to Justin Dolske [:Dolske] from comment #4)
> FWIW, maybe it would be worth thinking about how to have some kind of
> "submit crash reports automatically" pref?
Yes, we've had that conversation outside of this bug. I'll file a separate bug to get that discussion rolling.
> 
> Although there are a couple issues to sort out with that:
> 
> One is making sure we're comfortable with the privacy aspect. We started
> down this path with plugin crashes a while back... But (iirc) people argued
> that dumps contain enough sensitive data that there needs to be significant
> consent before a submission happens. But that was a while ago, and our
> thinking on such things has changed before... Having it be more of a
> pre-release thing would be an interesting angle too.
> 
> The other is that we get useful data from the comments people leave, so a
> simplistic crash-and-autosubmit would lose that. But maybe this could just
> be wired up to when we're tearing down the crashui anyway (and would
> otherwise have not submitted).
My thinking here is that when you have a content crash you get the in-content UI where you can leave a comment. If you ignore that but have auto submit enabled, the next time we restart we'll just submit without a comment.
> 
> Clearly a separate bug if we want to do that, and I've no fundamental
> objection to an infobar.
Duplicate of this bug: 1241106
Comment on attachment 8749263 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8749263 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/browser.js
@@ +1366,5 @@
>  
>        PanicButtonNotifier.init();
>      });
> +
> +    this.checkForPendingCrashReports();

shorlander only gave his okay for doing this on pre-release channels, so we'll need to wrap this in a condition - though the right one currently escapes me.

Note that I'm also not crazy about having code that only runs in a particular way in beta or earlier, and another way in release. We've been burned by that before. Not sure what to recommend - perhaps using EARLY_BETA_OR_EARLIER.

@@ +1367,5 @@
>        PanicButtonNotifier.init();
>      });
> +
> +    this.checkForPendingCrashReports();
> +    

Trailing whitespace.

@@ +1383,5 @@
> +      let uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
> +      let count = 0;
> +      let files = [];
> +      let ids = [];
> +      while (entries.hasMoreElements()) {

This worries me for two reasons:

1) We're iterating this directory of files on the main thread during start up
2) We're doing this for every browser window that opens. Not just the first, but every single window for the session.

For (1), I suspect we'll want to do the iteration off main thread, and use something like OSFile.jsm[1] to do it. Thankfully, I think there's no issue with this work being asynchronous while it waits for the thread to determine whether or not there are crash reports on the disk.

For (2), we definitely only ever want to do this once, which makes me think this kind of thing should go into nsBrowserGlue.js (that's usually the "one-time-only" setup and teardown place for browser).

[1]: https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/OSFile.jsm

@@ +1418,5 @@
> +            label: gNavigatorBundle.getString("pendingCrashReports.ignoreAll"),
> +            callback: function() {
> +              files.forEach(function(file) {
> +                let ignoreFile = Cc['@mozilla.org/file/local;1'].createInstance(Ci.nsIFile);
> +                ignoreFile.initWithPath(file.path + ".ignore");

Does adding .ignore prevent these reports from ever being submitted through about:crashes in the future?

@@ +1419,5 @@
> +            callback: function() {
> +              files.forEach(function(file) {
> +                let ignoreFile = Cc['@mozilla.org/file/local;1'].createInstance(Ci.nsIFile);
> +                ignoreFile.initWithPath(file.path + ".ignore");
> +                ignoreFile.create(Ci.nsIFile.NORMAL_FILE_TYPE, 0o600); 

Trailing whitespace.

@@ +1426,5 @@
> +          },
> +          {
> +            label: gNavigatorBundle.getString("pendingCrashReports.viewAll"),
> +            callback: function() {
> +              gBrowser.loadURI("about:crashes");

Probably should open a new tab instead of loading in the current one.
Attachment #8749263 - Flags: review?(mconley) → review-
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #9)
> Comment on attachment 8749263 [details] [diff] [review]
> prompt_unsumbitted.patch
> 
> Review of attachment 8749263 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +1418,5 @@
> > +            label: gNavigatorBundle.getString("pendingCrashReports.ignoreAll"),
> > +            callback: function() {
> > +              files.forEach(function(file) {
> > +                let ignoreFile = Cc['@mozilla.org/file/local;1'].createInstance(Ci.nsIFile);
> > +                ignoreFile.initWithPath(file.path + ".ignore");
> 
> Does adding .ignore prevent these reports from ever being submitted through
> about:crashes in the future?
No. 

For some context, I thought about moving these reports from the pending/ folder to an ignored/ folder, but worried about breaking submitting through about:crashes. The advantage of moving them is that we'd never have to iterate over them again though.
Posted patch prompt_unsumbitted.patch (obsolete) — Splinter Review
As for the early beta versus all of beta. I'm wary of changing our crash submission rates midway though beta. Also, this code is pretty isolated, so its hard to imagine spillover effects from not calling it in release.
Attachment #8749263 - Attachment is obsolete: true
Attachment #8749621 - Flags: review?(mconley)
Is there a risk here that we hammer socorro with new reports? I have to wonder how many pending reports there are on beta and release, it's probably quite a few.
Comment on attachment 8749621 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8749621 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/nsBrowserGlue.js
@@ +1122,5 @@
>      }
>  
>      this._checkForOldBuildUpdates();
>  
> +    if (['default', 'nightly', 'aurora', 'beta'].includes(AppConstants.MOZ_UPDATE_CHANNEL)) {

maybe this should be ('release' != AppConstants.MOZ_UPDATE_CHANNEL)
There is a risk of adding a bunch of cruft to Socorro.

I think that we should start with nightly and see if this is working before we roll it out. And we should definitely limit this to a reasonable timeframe. I suggest one week.

The dates that socorro assigns to crashes are the date that they are received, not the date of the original crash. This normally doesn't matter because most crashes are submitted immediately, but it will cause things to be a bit wonky. We can't normally annotate crashes "after the fact", so adding an annotation that says this is a delayed submit might not be easily possible.
Comment on attachment 8749621 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8749621 [details] [diff] [review]:
-----------------------------------------------------------------

This looks right to me. I just have a minor suggestion regarding code organization.

::: browser/components/nsBrowserGlue.js
@@ +833,5 @@
> +            let count = 0;
> +            let files = [];
> +            let ids = [];
> +            iterator.forEach(
> +              function onEntry(file) {

So it looks like we already have CrashSubmit.pendingIDs that's doing main-thread I/O to get the list of pending crash reports. I think it makes more sense probably to do the following:

1) Add a new method to the CrashSubmit.jsm public API called pendingIDsAsync or something that returns a Promise that resolves with the list of pending crash ID file paths. That way, we can keep the notion of what a crash report filename looks like, etc, all encapsulated within CrashSubmit.jsm.
2) Have checkForPendingCrashReports use the method, and then check for the .ignore files using OS.File.exists.

@@ +872,5 @@
> +                    {
> +                      label: win.gNavigatorBundle.getString("pendingCrashReports.ignoreAll"),
> +                      callback: function() {
> +                        files.forEach(function(file) {
> +                    let ignoreFile = Cc['@mozilla.org/file/local;1'].createInstance(Ci.nsIFile);

Busted indentation. Also, this is on the main-thread still. I think we want to use OS.File.open with the create mode. (Don't forget to close it afterward)

@@ +1122,5 @@
>      }
>  
>      this._checkForOldBuildUpdates();
>  
> +    if (['default', 'nightly', 'aurora', 'beta'].includes(AppConstants.MOZ_UPDATE_CHANNEL)) {

I agree that this makes more sense.
Attachment #8749621 - Flags: review?(mconley) → feedback+
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #14)
> 
> The dates that socorro assigns to crashes are the date that they are
> received, not the date of the original crash. This normally doesn't matter
> because most crashes are submitted immediately, but it will cause things to
> be a bit wonky. We can't normally annotate crashes "after the fact", so
> adding an annotation that says this is a delayed submit might not be easily
> possible.

The CrashSubmit API allows us to pass "extraExtraData" (I know), which is merged with the .extra stuff that the crash dump normally came with. That extraExtraData gets added at submission time: https://dxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/CrashSubmit.jsm#432

So it should be possible to annotate these special "late" crashes. I assume this will make it easier to filter out cruft?
Flags: needinfo?(benjamin)
Yes. Please make sure that we've added the field to supersearch before we deploy this, so that we can search for/exclude these special crash reports.
Flags: needinfo?(benjamin)
Posted patch prompt_unsumbitted.patch (obsolete) — Splinter Review
Attachment #8749621 - Attachment is obsolete: true
Attachment #8749897 - Flags: ui-review+
Attachment #8749897 - Flags: review?(mconley)
Posted patch prompt_unsumbitted.patch (obsolete) — Splinter Review
Attachment #8749897 - Attachment is obsolete: true
Attachment #8749897 - Flags: review?(mconley)
Attachment #8749987 - Flags: ui-review+
Attachment #8749987 - Flags: review?(mconley)
The largest risk here is that it will have an influence on the traditional metrics on submitted crashes - Calixte and others looking into this data will have to deal with that. Thankfully, the new shiny Telemetry-based data will not be influenced by this. :)
It's surely great to get more data that people looking into fixing crashes can analyze!
Comment on attachment 8749987 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8749987 [details] [diff] [review]:
-----------------------------------------------------------------

I think this will work.

One concern I have, however, is that there was talk about having this be limited to some window of time (I believe bsmedberg suggested 2 weeks). We'd likely want to use the lastModifiedTime of the files to filter them out.

::: browser/components/nsBrowserGlue.js
@@ +123,5 @@
>                                    "resource:///modules/ContentCrashHandlers.jsm");
>  if (AppConstants.MOZ_CRASHREPORTER) {
>    XPCOMUtils.defineLazyModuleGetter(this, "PluginCrashReporter",
>                                      "resource:///modules/ContentCrashHandlers.jsm");
> +  XPCOMUtils.defineLazyModuleGetter(this, "CrashReports",

CrashReports isn't used anymore and can be removed.

@@ +863,5 @@
> +                                                 win.gNavigatorBundle.getString("pendingCrashReports.label")).replace("#1", count),
> +                                  "pending-crash-reports",
> +                                  "chrome://browser/skin/tab-crashed.svg",
> +                                  nb.PRIORITY_INFO_HIGH, buttons);
> +                  return;

I don't think this return is necessary.

@@ +1089,5 @@
>      }
>  
>      this._checkForOldBuildUpdates();
>  
> +    if ('release' != AppConstants.MOZ_UPDATE_CHANNEL) {

double-quotes please

::: browser/locales/en-US/chrome/browser/browser.properties
@@ +771,5 @@
>  # %S is the group number/ID
>  tabgroups.migration.anonGroup = Group %S
>  tabgroups.migration.tabGroupBookmarkFolderName = Bookmarked Tab Groups
> +
> +pendingCrashReports.label = You have an unsubmitted crash report;You have #1 unsubmitted crash reports

I've just realized we're going to have an uplift problem because we're adding new strings.

If it's truly important that we need to get this uplifted, we're going to want to coordinate with l10n.

::: toolkit/crashreporter/CrashSubmit.jsm
@@ +476,5 @@
>        memory.remove(false);
>      }
>    },
>  
> +  ignore: function CrashSubmit_ignore(id) {

Please add a docstring for this new method.

@@ +501,5 @@
>    pendingIDs: function CrashSubmit_pendingIDs() {
>      return getAllPendingMinidumpsIDs();
>    },
>  
> +  pendingIDsAsync: function CrashSubmit_pendingIDsAsync(includeIgnored) {

Please add a docstring for this new method.

includeIgnored never seems to be called with true, so maybe let's remove that feature for now unless there's need for it.

@@ +507,5 @@
> +      OS.File.stat(getDir("pending").path).then(
> +        function onSuccess(info) {
> +          if (info.isDir) {
> +            let iterator = new OS.File.DirectoryIterator(getDir("pending").path);
> +            let uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

This REGEX might be useful towards the top of this JSM where other similar REGEX's are defined.
Attachment #8749987 - Flags: review?(mconley) → review+
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #21)
> Comment on attachment 8749987 [details] [diff] [review]
> prompt_unsumbitted.patch
> 
> Review of attachment 8749987 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think this will work.
> 
> One concern I have, however, is that there was talk about having this be
> limited to some window of time (I believe bsmedberg suggested 2 weeks). We'd
> likely want to use the lastModifiedTime of the files to filter them out.

What's the rational for wanting to limit it?
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #22)
> (In reply to Mike Conley (:mconley) - Needinfo me! from comment #21)
> > Comment on attachment 8749987 [details] [diff] [review]
> > prompt_unsumbitted.patch
> > 
> > Review of attachment 8749987 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > I think this will work.
> > 
> > One concern I have, however, is that there was talk about having this be
> > limited to some window of time (I believe bsmedberg suggested 2 weeks). We'd
> > likely want to use the lastModifiedTime of the files to filter them out.
> 
> What's the rational for wanting to limit it?

I think the worry is that there could exist old profiles with many, many unsubmitted crash reports from years and years ago that would likely be irrelevant now.
It should be fine to send us batches of crash reports at the same time, our collectors should be able to handle the load. If you can give us the date when this feature will ship, we can make sure to monitor our systems and adjust them if need be. 

Limiting to sending only the "recent" crash reports seems like a good idea. As Benjamin mentioned, Socorro uses the date of reception as our "crash date" (we have a "client crash date" as well, but we don't use it for anything as we cannot really trust it). That means if you send a 2-years old report now, it will be considered to have happened today. It will still be set to whatever relic version of Firefox the user was using at the time, so that problem is mitigated, but it could still cause confusion. It seems, however, wiser to not send tons of crash reports at the same, especially if they have no value. I believe crash reports lose value pretty quickly, and anything older than a week or two will likely not be of interest to many people. 

Adding an annotation to those reports would be great, something like ``deferred_submission``, a simple flag set to 1 when submitting an old crash. I could see us adding special behavior for those crash reports, to use the client crash date instead of the reception date as our reference for example (though that's just an idea, we'll need to investigate more on our side).
Blocks: 1271336
Posted patch prompt_unsumbitted.patch (obsolete) — Splinter Review
Attachment #8749987 - Attachment is obsolete: true
Attachment #8750554 - Flags: review?(mconley)
Comment on attachment 8750554 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8750554 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/nsBrowserGlue.js
@@ +824,5 @@
> +  checkForPendingCrashReports: function() {
> +    if (AppConstants.MOZ_CRASHREPORTER) {
> +      let dateLimit = new Date();
> +      // We don't process crash reports older than 28 days, so don't bother submitting them
> +      dateLimit.setDate(dateLimit.getDate() - 28);

This magic 28 should probably be a constant somewhere - even at the top of this function is fine. Alternatively, as a property on the nsBrowserGlue object.

::: toolkit/crashreporter/CrashSubmit.jsm
@@ +24,5 @@
>  const SUCCESS = "success";
>  const FAILED  = "failed";
>  const SUBMITTING = "submitting";
>  
> +const  uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

UUID_REGEX, for consistency.

@@ +536,5 @@
> +                        let id = file.name.slice(0, -4);
> +                        if (uuidRegex.test(id)) {
> +                          return OS.File.stat(file.path).then(
> +                            function onSuccess(info) {
> +                              if (info.lastAccessDate.valueOf() > maxFileDate.valueOf()) {

Looks fine to me. I might open a follow-up bug to switch this to using Task instead to avoid the use of all of the callbacks, but I think I've harped on this patch enough and it can safely land as-is.

@@ +537,5 @@
> +                        if (uuidRegex.test(id)) {
> +                          return OS.File.stat(file.path).then(
> +                            function onSuccess(info) {
> +                              if (info.lastAccessDate.valueOf() > maxFileDate.valueOf()) {
> +                                ids.push(id); 

Trailing whitespace.
Attachment #8750554 - Flags: review?(mconley) → review+
Backed out for unexpected file access of crashreporter-override.ini and crashreporter.ini at startup:

https://hg.mozilla.org/integration/mozilla-inbound/rev/ad8395f134d7964187bc7491f41fbea18a133bd1

Push with failure: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=756948b3ac0f03a360d739ee5f54ad05d77d2386
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=27628230&repo=mozilla-inbound
00:54:39     INFO -  TEST-UNEXPECTED-FAIL : xperf: File '{firefox}\browser\crashreporter-override.ini' was accessed and we were not expecting it.  DiskReadCount: 2, DiskWriteCount: 0, DiskReadBytes: 8192, DiskWriteBytes: 0
00:54:39     INFO -  TEST-UNEXPECTED-FAIL : xperf: File '{firefox}\crashreporter.ini' was accessed and we were not expecting it.  DiskReadCount: 2, DiskWriteCount: 0, DiskReadBytes: 8192, DiskWriteBytes: 0

Furthermore, eslint expects the functions in CrashSubmit.jsm to always return a value.
Flags: needinfo?(blassey.bugs)
Comment on attachment 8750554 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8750554 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/locales/en-US/chrome/browser/browser.properties
@@ +771,5 @@
>  # %S is the group number/ID
>  tabgroups.migration.anonGroup = Group %S
>  tabgroups.migration.tabGroupBookmarkFolderName = Bookmarked Tab Groups
> +
> +pendingCrashReports.label = You have an unsubmitted crash report;You have #1 unsubmitted crash reports

You need to add a specific comment before this string to trigger the plural form checks.
https://developer.mozilla.org/en-US/docs/Mozilla/Localization/Localization_content_best_practices#Use_proper_plural_forms
I have a few concerns here (but I'm generally OK with the proposal), I'll see if I can enumerate them all:
1) Echoing the concerns above about infobars. I have infobar fatigue myself.
2) We should definitely limit this to recent crashes, for some definition of "recent". We do try to prune the number of pending dumps in a few places, but if someone crashes infrequently they could have some pending dumps hanging out from quite a while ago, and those are going to be nothing but noise. On my Linux machine:
```
$ ls -1 ~/.mozilla/firefox/Crash\ Reports/pending/*.dmp | wc -l
35
```

My Windows machine:
```
$ ls -1 ~/AppData/Roaming/Mozilla/Firefox/Crash\ Reports/pending/*.dmp | wc -l
136
```

My Mac:
``
$ ls -1 ~/Library/Application\ Support/Firefox/Crash\ Reports/pending/*.dmp | wc -l
      10
```

(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #6)
> My thinking here is that when you have a content crash you get the
> in-content UI where you can leave a comment. If you ignore that but have
> auto submit enabled, the next time we restart we'll just submit without a
> comment.

I'm not sure what you mean here--if you had auto-submit enabled we'd just submit the crash report in the first place, right?

I do think that longer-term in our glorious e10s future we are going to have to figure out some way to do some kind of auto-submission, even if it's more like Microsoft's Windows Error Reporting, where the client generates its own signature and doesn't submit a minidump unless the server asks it to (like when there are no minidumps for a specific signature).
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #33)``
> 
> (In reply to Brad Lassey [:blassey] (use needinfo?) from comment #6)
> > My thinking here is that when you have a content crash you get the
> > in-content UI where you can leave a comment. If you ignore that but have
> > auto submit enabled, the next time we restart we'll just submit without a
> > comment.
> 
> I'm not sure what you mean here--if you had auto-submit enabled we'd just
> submit the crash report in the first place, right?

I'm suggesting that we continue to show the in-content crash reporter so the user can elect to add notes, but if they take no action we'd auto submit after the next restart.
Flags: needinfo?(blassey.bugs)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #6)
> (In reply to Justin Dolske [:Dolske] from comment #4)
> > The other is that we get useful data from the comments people leave, so a
> > simplistic crash-and-autosubmit would lose that. But maybe this could just
> > be wired up to when we're tearing down the crashui anyway (and would
> > otherwise have not submitted).
> My thinking here is that when you have a content crash you get the
> in-content UI where you can leave a comment. If you ignore that but have
> auto submit enabled, the next time we restart we'll just submit without a
> comment.

Note that I brought up possible ideas and plans about auto-submit with the privacy team in Orlando, and I was basically told that we cannot submit crashes without prompting. So if you plan to do anything like auto-submit, please go through privacy team and data stewart approvals.
Comment on attachment 8752066 [details] [diff] [review]
prompt_unsumbitted.patch

Review of attachment 8752066 [details] [diff] [review]:
-----------------------------------------------------------------

Minor stylistic nits and suggestions, and one more important suggestion regarding closing out the iterator if things go wrong, but otherwise this is fine.

::: browser/components/nsBrowserGlue.js
@@ +830,5 @@
>    },
>  
> +  checkForPendingCrashReports: function() {
> +    // We don't process crash reports older than 28 days, so don't bother submitting them
> +    let numDays = 28;

const, and maybe rename this to something clearer, like:

PENDING_CRASH_REPORT_DAYS

::: toolkit/crashreporter/CrashSubmit.jsm
@@ +490,5 @@
> +
> +  ignore: function CrashSubmit_ignore(id) {
> +    let [dump, extra, mem] = getPendingMinidump(id);
> +    return OS.File.open(dump.path + ".ignore", {create: true},
> +                        {unixFlags:OS.Constants.libc.O_CREAT})

space between : and OS.Constants

@@ +505,5 @@
>      return getAllPendingMinidumpsIDs();
>    },
>  
>    /**
> +   * Get the list of pending crash IDs, exluding those marked to be ignored

Typo: exluding -> excluding

@@ +521,5 @@
> +      yield iterator.forEach(
> +        function onEntry(file) {
> +          if (file.name.endsWith(".dmp")) {
> +            return OS.File.exists(file.path + ".ignore")
> +              .then(ignoreExists => {if (!ignoreExists) {

Kinda wacky to have the if on the same line here - should probably put it on the next.

@@ +525,5 @@
> +              .then(ignoreExists => {if (!ignoreExists) {
> +                let id = file.name.slice(0, -4);
> +                if (UUID_REGEX.test(id)) {
> +                  return OS.File.stat(file.path)
> +                    .then(info => {if (info.lastAccessDate.valueOf() > maxFileDate.valueOf()) {

As before, probably best to put on the next line.

@@ +529,5 @@
> +                    .then(info => {if (info.lastAccessDate.valueOf() > maxFileDate.valueOf()) {
> +                      ids.push(id);
> +                    }});
> +                }
> +              } return null; });

Best to put the return on the next line, I think.

@@ +534,5 @@
> +          }
> +          return null;
> +        }
> +      );
> +      iterator.close();

We should probably wrap the iteration of the directory iterator in a try-catch-finally, and do the iterator.close in the finally, a la: https://dxr.mozilla.org/mozilla-central/source/devtools/server/actors/storage.js#1713

So that if things go wrong we don't have a dangling open file.
Attachment #8752066 - Flags: review?(mconley) → review+
as a note, this caused a 7% talos regression on osx for tresize:
https://treeherder.mozilla.org/perf.html#/alerts?id=1209

please run talos and work on fixing this regression prior to relanding :)
Flags: needinfo?(blassey.bugs)
Thankfully, we take screenshots during a test failure. The notification bar is coming up, and I think that's changed things for that particular test:

http://mozilla-releng-blobs.s3.amazonaws.com/blobs/mozilla-inbound/sha512/52fe4add1f50db5ba12ae61911b74c22081aaef58cbcf874795ed7b3eb0259462ee1eeaaa4de251b2cdbb0f3106ad472dc02562bae5d7ff93fa454ed9ea3c228

We should probably not have that notification bar show up during tests.
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #41)
> Thankfully, we take screenshots during a test failure. The notification bar
> is coming up, and I think that's changed things for that particular test:
> 
> http://mozilla-releng-blobs.s3.amazonaws.com/blobs/mozilla-inbound/sha512/
> 52fe4add1f50db5ba12ae61911b74c22081aaef58cbcf874795ed7b3eb0259462ee1eeaaa4de2
> 51b2cdbb0f3106ad472dc02562bae5d7ff93fa454ed9ea3c228
> 
> We should probably not have that notification bar show up during tests.

Do we have crash reports in the test profile? Should we not?
Joel, I think the right fix here is to clean up after our tests and delete any crash reports that are laying around. How can we do that?
Flags: needinfo?(jmaher)
I might not understand the problem here- we do cleanup our profile when we close the browser down.  Is this an ask for deleting crashes between tests?  If that is the case, we need to sort this out inside the javascript harness- that is interesting because we manage the crashes, etc. in python outside of the browser.  Could we assume a crash means the browser has closed down?
Flags: needinfo?(jmaher)
The Mochitest harness should not be putting any crashes in the pending directory--the crashreporter client does that and we run tests with `MOZ_CRASHREPORTER_NO_REPORT=1`, which doesn't launch the client. Any crashes that are there must either be from misbehaving harnesses or something else running Firefox.
If we crash during a test run, the test should fail. Otherwise, there shouldn't be any pending crash reports. Joel, can we have the harness delete these pending crash reports between runs?
Flags: needinfo?(jmaher)
we do delete the entire profile, and my understanding is that is where the crash reports are stored.  Please let me know if crash reports live outside of the profile and where specifically so we could look into deleting them.
Flags: needinfo?(jmaher)
They are stored outside of the profile. They are stored in "[UAppData]/Crash Reports/", where by default profiles are in "[UAppData]/Profiles/<salt>.<profile name>" and of course can be anywhere.
Flags: needinfo?(jmaher)
Depends on: 1274318
this seems reasonable, :gbrown, do you have some time to take this on?
Flags: needinfo?(jmaher) → needinfo?(gbrown)
Depends on: 1274395
Flags: needinfo?(gbrown)
Pushed by blassey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/721772475449
Prompt users with pending crash reports to submit them r=mconley ui-r=shorlander
https://hg.mozilla.org/mozilla-central/rev/721772475449
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 49
(In reply to Jonathan Howard from comment #52)
> Just realised now when updating;
> This will create some submissions for other channels. From users that run
> the others as well as nightly.
> 
> https://crash-stats.mozilla.com/search/
> ?product=Firefox&submitted_from_infobar=__true__&_facets=signature&_facets=re
> lease_channel&_facets=version&_columns=date&_columns=signature&_columns=produ
> ct&_columns=version&_columns=build_id&_columns=platform#facet-release_channel

How do you mean? Like, if I normally use Nightly, and have some unsubmitted crash reports, but then I use the same profile and run DevEdition, and while using DevEdition the unsubmitted crash reports are submitted?
Flags: needinfo?(jonathan)
Pending crashes are saved in a general Firefox folder rather than profile bound.
Process is opposite way. e.g. Use DevEdition and it generates the pending crash report then later start Nightly which submits it. Report keeps correct identifiable info.
So far the numbers seem insignificant in volume for the other channels.
Flags: needinfo?(jonathan)
blassey, do we want to uplift this?
Flags: needinfo?(blassey.bugs)
I don't think so
Flags: needinfo?(blassey.bugs)
This is enabled for nightly/aurora only, correct? We should not ever show this to release users, and probably not to beta users unless there's a short-term compelling need.
Flags: needinfo?(blassey.bugs)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #57)
> This is enabled for nightly/aurora only, correct? We should not ever show
> this to release users, and probably not to beta users unless there's a
> short-term compelling need.

The current patch is designed to show the notification for Nightly, Aurora and Beta.

I believe when this patch was first being designed, it was during the time where we were concerned about the e10s content process crash rate.

Now that we have an understanding of the content process crash reports, perhaps we should alter this patch to only display for Nightly and Aurora.

Your thoughts, blassey?
8 weeks to decide whether to let it ride to beta.
Not sure what is so bad (for the user) to consider not having reports from beta. To me a good description of the crashes are as leftovers. Notification bar seems appropriate to deal with them.
Slightly over 50% of reports on new build are from this. (Expecting backlog to clear soon.)

Super-search not working for false (same as for accessibility.)
API can return the results. (Example, don't read it as a comparison;)
https://crash-analysis.mozilla.com/rkaiser/datil/searchcompare/?common=product%3DFirefox%26release_channel%3Dnightly%26build_id%3D%3E%253D20160602030220&p1=submitted_from_infobar%3D%21true&p2=submitted_from_infobar%3Dtrue
As Mike said, its restricted to Nightly, Aurora and Beta. I'd lean towards leaving it own for Beta, I don't think the infobar is a terrible experience for the pre-release audience and we'll want it the next time there's a content shutdown (or say, system shutdown for the chrome process) related issue we need data on.
Flags: needinfo?(blassey.bugs)
Brad, do you want to uplift this crash reporter fix to Beta 48?
Flags: needinfo?(blassey.bugs)
bsmedberg indicated at some point that he didn't think we should let this hit beta.
I think it can ride the trains. I also think it should remain on for Beta.
Flags: needinfo?(blassey.bugs)
SGTM. wontfix for Beta 48.
hi, from my point of view it isn't a good idea to let this go into beta like it stands now. this would totally skew the stats we are currently using to asses the stability of a release and make it impossible to compare the general situation to prior releases & we wouldn't have a clear idea where we are stability-wise until we go to a general release audience.
on 49 the crash rate seems to have permanently tripled after the change landed & the rates are still not going down to prior levels weeks after: https://crash-analysis.mozilla.com/rkaiser/crash-report-tools/longtermgraph/?fxaurora

in addition it doesn't seem to be possible to run a socorro search excluding those crashes submitted by the info bar (bug 1282529).

if the data gathered from this mechanism is quite useful though, maybe we could just run it for one or two beta builds in the middle of the cycle...?
It sounds to me like this will give us more crash reports, which may be useful for diagnosing/fixing issues.  But we are worried it will mess up our metrics on crash-stats, which it will do whether we delay till 49 or have this on beta 48. Is there a way to filter (ie marking these as delayed crashes) as bsmedberg suggested in https://bugzilla.mozilla.org/show_bug.cgi?id=1269998#c17 ?
Flags: needinfo?(blassey.bugs)
From irc discussion sounds like the annotation is there but supersearch doesn't return the right results though the API can.  Socorro devs can you take a look? Looks like we have a bug on file, bug 1282529. 

Benjamin, if we manage to fix this in supersearch, are you ok with this riding to beta?  We want these unsubmitted crashes for diagnosing and fixing crash bugs, (even if reported in 48, and fixed in time for 49) though it may play some havoc with judging metrics through crash-stats.
Flags: needinfo?(mbrandt)
Flags: needinfo?(benjamin)
Flags: needinfo?(blassey.bugs)
My problem is not with the stats: we're already using the new telemetry-based stats, so that doesn't matter.

Unless we absolutely need this data, this is a very crappy experience for users. We're basically constantly reminding them that Firefox crashed at a time when they've already moved on. It will give users the impression that Firefox is much crashier, even though it's not. And based on our e10s situation, I don't think that the value of deploying this to beta is worth that amount of user unhappiness/confusion.
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #69)
> My problem is not with the stats: we're already using the new
> telemetry-based stats, so that doesn't matter.
> 
> Unless we absolutely need this data, this is a very crappy experience for
> users. We're basically constantly reminding them that Firefox crashed at a
> time when they've already moved on. It will give users the impression that
> Firefox is much crashier, even though it's not. And based on our e10s
> situation, I don't think that the value of deploying this to beta is worth
> that amount of user unhappiness/confusion.

In that case, we need to modify this line: http://searchfox.org/mozilla-central/rev/a7c8e9f3cc323fd707659175a46826ad12899cd1/browser/components/nsBrowserGlue.js#1152

It's currently causing the notification to show for Beta, Aurora and Nightly. Shall we modify it to only show the notification for Aurora and Nightly?
Flags: needinfo?(benjamin)
(In reply to Mike Conley (:mconley) - (Needinfo me!) from comment #70)
> (In reply to Benjamin Smedberg  [:bsmedberg] from comment #69)
> > My problem is not with the stats: we're already using the new
> > telemetry-based stats, so that doesn't matter.
> > 
> > Unless we absolutely need this data, this is a very crappy experience for
> > users. We're basically constantly reminding them that Firefox crashed at a
> > time when they've already moved on. It will give users the impression that
> > Firefox is much crashier, even though it's not. And based on our e10s
> > situation, I don't think that the value of deploying this to beta is worth
> > that amount of user unhappiness/confusion.
> 
> In that case, we need to modify this line:
> http://searchfox.org/mozilla-central/rev/
> a7c8e9f3cc323fd707659175a46826ad12899cd1/browser/components/nsBrowserGlue.
> js#1152
> 
> It's currently causing the notification to show for Beta, Aurora and
> Nightly. Shall we modify it to only show the notification for Aurora and
> Nightly?

Isn't Aurora the Developer build now? I don't think we should show it there either. Can we just show it on Nightly?
Flags: needinfo?(mbrandt) → needinfo?
(In reply to Stephen Horlander [:shorlander] from comment #71)
> 
> Isn't Aurora the Developer build now? I don't think we should show it there
> either. Can we just show it on Nightly?

Yep, we can do that - though originally we had decided that pre-release channels were okay (see comment 3). Has something changed between then and now, or what did you mean by pre-release channels in that comment? It's okay if we've changed our mind, I just want to be sure I understand.
Flags: needinfo? → needinfo?(shorlander)
(In reply to Mike Conley (:mconley) - (Needinfo me!) from comment #72)
> (In reply to Stephen Horlander [:shorlander] from comment #71)
> > 
> > Isn't Aurora the Developer build now? I don't think we should show it there
> > either. Can we just show it on Nightly?
> 
> Yep, we can do that - though originally we had decided that pre-release
> channels were okay (see comment 3). Has something changed between then and
> now, or what did you mean by pre-release channels in that comment? It's okay
> if we've changed our mind, I just want to be sure I understand.

I did mean pre-release including Beta. But if we don't think the extra data from beta users is necessary I would rather not use it there.

Most of the feedback I have received about this is that people hate notifications bars, especially notification bars that are out of context and not really actionable. This isn't unexpected and probably worth the trade-off for getting better crash stats.

For all of the reasons Benjamin lists in Comment 69, unless the data is vital then it's probably not worth both the annoyance factor and the perception of instability it creates.
Flags: needinfo?(shorlander)
Filed bug 1284622 to restrict the notification to Nightly.
OK, then the idea is that we won't be getting old content process crashes - only ones that come in for Nightly or that Nightly users submit from this notification. Do these crashes still show up in about:crashes? Maybe we could show the notification there (if you go to about:crashes, it gives you the option to submit them) in all pre-release channels.  That wouldn't get us a huge amount of data to go by, but if we came across particular users in SUMO or reporting bugs, we could ask them to submit from about:crashes.
Jury's still out. Bug 1284622 just got WONTFIX'd - I believe there's some disagreement about how to proceed here.

Needinfo'ing blassey who did the WONTFIX.
Flags: needinfo?(blassey.bugs)
Are there other ways we could prompt for these crash submissions with less potential bad reaction from users? We don't want to lose pre-release channel users, but a lot of why we want them is exactly so that we can get this kind of feedback. Seems like we should be able to strike a balance here.
As you said, Jury is out (and Jury is leaning towards Nightly & Aurora but questioning Beta). It is premature to be landing a patch to restrict this to Nightly.
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #77)
> Are there other ways we could prompt for these crash submissions with less
> potential bad reaction from users? We don't want to lose pre-release channel
> users, but a lot of why we want them is exactly so that we can get this kind
Yes, as I said, auto-submitting will prevent the dialog from ever showing. We're working on that.
> of feedback. Seems like we should be able to strike a balance here.
Flags: needinfo?(blassey.bugs)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #78)
> As you said, Jury is out (and Jury is leaning towards Nightly & Aurora but
> questioning Beta). It is premature to be landing a patch to restrict this to
> Nightly.

Out of curiosity, where is that decision going to be made? I just want to make sure we don't accidentally let the notification slip to Beta.
Flags: needinfo?(blassey.bugs)
I think that's a question for shorlander
Flags: needinfo?(blassey.bugs) → needinfo?(shorlander)
Doing some design work on bug 1270553. Will swing back around to the Beta question after that.
My strong preference is that if beta+ users haven't noticed that we've crashed, that we don't notify them about it (or not often, and not in their first two weeks of using Firefox). Notifications will make Firefox feel crashier, and we know from market research that the perception of crashiness is a much greater retention problem than actual crashiness.

So I'd like this to continue to be tracked to either turn this off for beta or implement the something better that shorlander is working on.
Flags: needinfo?(benjamin)
Gentle ping for shorlander... Beta uplift isn't too far off. What do we want to do here?
Depends on: 1318136
See Also: → 1354042
Blocks: 1418789
You need to log in before you can comment on or make changes to this bug.