Closed Bug 1059390 Opened 6 years ago Closed 6 years ago

Incomplete submission data

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set
normal

Tracking

(firefox33 fixed, firefox34 fixed)

RESOLVED FIXED
Tracking Status
firefox33 --- fixed
firefox34 --- fixed

People

(Reporter: poiru, Assigned: poiru)

References

Details

Attachments

(6 files, 2 obsolete files)

We need the local crash ID and the server-side ID (which are indeed different) in order to fullfill all use-cases of the data. Unfortunately, the submission data we've collected in the past has only contained one of them.

For the app crash submissions (bug 994707), we recorded the local ID. See https://hg.mozilla.org/mozilla-central/rev/e5514ddc55f7#l4.119 (`lines[0]` is the local ID)

For plugin/content crashes (bug 994708), we recorded the server-side ID. See http://hg.mozilla.org/mozilla-central/rev/45f3f1fe7b39#l2.63 (`ret.CrashID` is the server-side ID)

In bug 1024672, we converted the old submissions stored as separate crash records to be stored within the metadata of a particular crash. For the app crash submissions, the extracted ID[0] was the local ID so the submission will have been added into the metadata of the corrosponding crash. However, for plugin/content crashes, the extracted ID is the server-side ID, so addSubmission{Request,Response} will fail as they won't find a crash record with that ID.

After talking with bsmedberg on IRC about this, this is what we should do:
- Store the server-side ID going forward.
- For app crashes, we won't have to do anything as the conversion code will do the right thing. We won't have the server-side ID, but we never had it in the first place.
- For plugin/content crashes, we will salvage the fact that a crash submission was made on a particular date. We will have lost that data for Nightly users running a recent release, but we should have something to salvage on Aurora at least.

gps, to fix the last point, we cannot associate the submission with an existing crash since we don't have the local ID. We could instead introduce dummy crash records (e.g. with an ID of "plugin-crash-old") and add the submissions there. Or we could store them outside of the crash records and make providers.jsm use that data as well. Any preference?

[0]: http://hg.mozilla.org/mozilla-central/file/31919b319db8/toolkit/components/crashes/CrashManager.jsm#l740
Flags: needinfo?(gps)
Comment on attachment 8480113 [details] [diff] [review]
Part 1: Allow the remote Breakpad ID to be stored in crash metadata

Review of attachment 8480113 [details] [diff] [review]:
-----------------------------------------------------------------

If I wanted to fish for nits, I could say that previously stored crash records would have undefined instead of null for their remote id. When doing data "schema" changes it's common to upgrade the data to the current version on next write. That would mean adding a remoteID=null property on load so it automatically updates on save. But undefined is pretty much null is JS, so we should be fine.
Attachment #8480113 - Flags: review?(gps) → review+
Attachment #8480114 - Flags: review?(gps) → review+
If we don't have local crash IDs, could we add a new element to the JSON being saved by CrashStore? How about .remoteCrashes = {"remoteID": {...}, ... }

I'd strongly prefer we retain some local ID so we have consistency though. Could we invent a UUID for crashes that have no local UUID? The local UUID has no relevance beyond the possibility that it may correspond to a file on the local filesystem. If we have no file, I don't see much harm from inventing a UUID to be used in its place. Can you think of anything that would go wrong?
Flags: needinfo?(gps)
(In reply to Birunthan Mohanathas [:poiru] from comment #0)
> For plugin/content crashes (bug 994708), we recorded the server-side ID. See
> http://hg.mozilla.org/mozilla-central/rev/45f3f1fe7b39#l2.63 (`ret.CrashID`
> is the server-side ID)

I forgot to mention that this was already "fixed"[0] a week ago when we switched to using the local ID (bug, again, the server-side ID was not recorded).

However, that fix broke crash submission event file processing, which sadly did not have test coverage. The patch in comment 5 resolves that issue.

[0]: https://hg.mozilla.org/integration/mozilla-inbound/rev/31919b319db
(In reply to Gregory Szorc [:gps] (away Sep 10 through 27) from comment #4)
> I'd strongly prefer we retain some local ID so we have consistency though.
> Could we invent a UUID for crashes that have no local UUID? The local UUID
> has no relevance beyond the possibility that it may correspond to a file on
> the local filesystem. If we have no file, I don't see much harm from
> inventing a UUID to be used in its place. Can you think of anything that
> would go wrong?

It should be fine, but we will probably want to somehow differentiate those records so that MozSelfSupport can ignore them (for the purposes bug 1024677 comment 8). Would it be OK to e.g. suffix the ID with "-fake" or something similar?
Flags: needinfo?(gps)
This should be a safe patch for Aurora. I'll request uplift after review.
Attachment #8480247 - Flags: review?(gps)
Attachment #8480250 - Flags: review?(gps) → review?(benjamin)
https://hg.mozilla.org/mozilla-central/rev/28176a2e9861
https://hg.mozilla.org/mozilla-central/rev/47551e5606d5
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 34
Reopening, forgot to add leave-open earlier.
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Target Milestone: Firefox 34 → ---
Comment on attachment 8480169 [details] [diff] [review]
Part 3: Fix crash submission event file handling and add test

Review of attachment 8480169 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/crashes/CrashManager.jsm
@@ +497,5 @@
> +          // processed, the crash record has already been created.
> +          if (!store._data.crashes.get(crashID)) {
> +            store.addCrash(this.PROCESS_TYPE_MAIN, this.CRASH_TYPE_CRASH,
> +                           crashID, date);
> +          }

Why the |if| here? When I first wrote this code, processing of the same event file should have resulted in idempotent behavior. i.e. you wouldn't need guards like this.

What breaks if the |if| is removed?

@@ +513,5 @@
> +            }
> +
> +            let submissionID = Cc["@mozilla.org/uuid-generator;1"]
> +                                 .getService(Ci.nsIUUIDGenerator)
> +                                 .generateUUID().toString().slice(1, -1);

A future improvement is to have the crash reporter create its own UUID and record it in the event file. This would be crash.submission.2.

@@ +519,5 @@
> +
> +            store.addSubmissionAttempt(crashID, submissionID, date);
> +            store.addSubmissionResult(crashID, submissionID, date,
> +                                      succeeded ? this.SUBMISSION_RESULT_OK :
> +                                                  this.SUBMISSION_RESULT_FAILED);

Oh, hmm.

This is kinda unfortunate. We should really get the crash reporter saving 2 events: 1 for attempt and 1 for completion.

When I initially worked on this code, that's the path I went down. It doesn't look like the final implementation does it this way :/

I prefer 2 events because if you wait until submission has completed, you may lose the fact that submission attempted at all! This especially holds true in the crash reporter case: the system could be in an unstable state and the crash reporter itself may crash before submission result, losing submission state.

I would strongly encourage us to fix the crash reporter to emit 2 events rather than go with this approach. Is that doable?
Attachment #8480169 - Flags: review?(gps) → feedback+
Attachment #8480250 - Flags: review?(benjamin) → review+
Comment on attachment 8480250 [details] [diff] [review]
Store local ID rather than remote ID when submitting crashes through CrashSubmit

Approval Request Comment
[Feature/regressing bug #]: Bug 994707
[User impact if declined]: No direct impact, but our FHR data will not be as useful.
[Describe test coverage new/current, TBPL]: Tests hit the changed code path, but don't actually verify that it does the right thing. I performed manual testing to ensure that it works as expected.
[Risks and why]: Low risk as the change is simple and self-contained.
[String/UUID change made/needed]: No.
Attachment #8480250 - Flags: approval-mozilla-aurora?
Attachment #8480250 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Duplicate of this bug: 1060408
(In reply to Gregory Szorc [:gps] (away Sep 10 through 27) from comment #13)
> ::: toolkit/components/crashes/CrashManager.jsm
> @@ +497,5 @@
> > +          // processed, the crash record has already been created.
> > +          if (!store._data.crashes.get(crashID)) {
> > +            store.addCrash(this.PROCESS_TYPE_MAIN, this.CRASH_TYPE_CRASH,
> > +                           crashID, date);
> > +          }
> 
> Why the |if| here? When I first wrote this code, processing of the same
> event file should have resulted in idempotent behavior. i.e. you wouldn't
> need guards like this.
> 
> What breaks if the |if| is removed?

As dicussed on IRC, it was because CrashStore_ensureCrashRecord performs count limiting even when updating an existing crash record. This patch fixes that.
Attachment #8481373 - Flags: review?(gps)
(In reply to Gregory Szorc [:gps] (away Sep 10 through 27) from comment #13)
> I would strongly encourage us to fix the crash reporter to emit 2 events
> rather than go with this approach. Is that doable?

Is it OK if I do that in a follow-up later today? This is broken in Nightly and I'd like to get it fixed ASAP before further improvements.
Attachment #8480169 - Attachment is obsolete: true
Attachment #8481374 - Flags: review?(gps)
Attachment #8481373 - Flags: review?(gps) → review+
Comment on attachment 8481374 [details] [diff] [review]
Part 4: Fix crash submission event file handling and add test

submissionID confused me for a bit: I now understand that per-crash we may be recording multiple submission (in case one failed and we retried). In that case, I'd like it not to look so much like a UUID: can you use something like http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/downloads/tests/chrome/utils.js#52 to generate a random string instead?

r=me with that change
Attachment #8481374 - Flags: review?(gps) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/26061d23f040
https://hg.mozilla.org/integration/mozilla-inbound/rev/e24d5e087d36

(In reply to Benjamin Smedberg  [:bsmedberg] from comment #18)
> Comment on attachment 8481374 [details] [diff] [review]
> Part 4: Fix crash submission event file handling and add test
> 
> submissionID confused me for a bit: I now understand that per-crash we may
> be recording multiple submission (in case one failed and we retried). In
> that case, I'd like it not to look so much like a UUID: can you use
> something like
> http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/downloads/
> tests/chrome/utils.js#52 to generate a random string instead?

After discussion on IRC, we decided to go with a UUID and a "sub-" prefix.
Status: REOPENED → ASSIGNED
Flags: needinfo?(gps)
Attachment #8481653 - Flags: review?(gps) → review+
Status: ASSIGNED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.