Status

defect
RESOLVED FIXED
5 years ago
9 months ago

People

(Reporter: poiru, Assigned: poiru)

Tracking

unspecified
Dependency tree / graph

Firefox Tracking Flags

(firefox33 fixed, firefox34 fixed)

Details

Attachments

(6 attachments, 2 obsolete attachments)

We need the local crash ID and the server-side ID (which are indeed different) in order to fullfill all use-cases of the data. Unfortunately, the submission data we've collected in the past has only contained one of them.

For the app crash submissions (bug 994707), we recorded the local ID. See https://hg.mozilla.org/mozilla-central/rev/e5514ddc55f7#l4.119 (`lines[0]` is the local ID)

For plugin/content crashes (bug 994708), we recorded the server-side ID. See http://hg.mozilla.org/mozilla-central/rev/45f3f1fe7b39#l2.63 (`ret.CrashID` is the server-side ID)

In bug 1024672, we converted the old submissions stored as separate crash records to be stored within the metadata of a particular crash. For the app crash submissions, the extracted ID[0] was the local ID so the submission will have been added into the metadata of the corrosponding crash. However, for plugin/content crashes, the extracted ID is the server-side ID, so addSubmission{Request,Response} will fail as they won't find a crash record with that ID.

After talking with bsmedberg on IRC about this, this is what we should do:
- Store the server-side ID going forward.
- For app crashes, we won't have to do anything as the conversion code will do the right thing. We won't have the server-side ID, but we never had it in the first place.
- For plugin/content crashes, we will salvage the fact that a crash submission was made on a particular date. We will have lost that data for Nightly users running a recent release, but we should have something to salvage on Aurora at least.

gps, to fix the last point, we cannot associate the submission with an existing crash since we don't have the local ID. We could instead introduce dummy crash records (e.g. with an ID of "plugin-crash-old") and add the submissions there. Or we could store them outside of the crash records and make providers.jsm use that data as well. Any preference?

[0]: http://hg.mozilla.org/mozilla-central/file/31919b319db8/toolkit/components/crashes/CrashManager.jsm#l740
Flags: needinfo?(gps)
Comment on attachment 8480113 [details] [diff] [review]
Part 1: Allow the remote Breakpad ID to be stored in crash metadata

Review of attachment 8480113 [details] [diff] [review]:
-----------------------------------------------------------------

If I wanted to fish for nits, I could say that previously stored crash records would have undefined instead of null for their remote id. When doing data "schema" changes it's common to upgrade the data to the current version on next write. That would mean adding a remoteID=null property on load so it automatically updates on save. But undefined is pretty much null is JS, so we should be fine.
Attachment #8480113 - Flags: review?(gps) → review+
Attachment #8480114 - Flags: review?(gps) → review+
If we don't have local crash IDs, could we add a new element to the JSON being saved by CrashStore? How about .remoteCrashes = {"remoteID": {...}, ... }

I'd strongly prefer we retain some local ID so we have consistency though. Could we invent a UUID for crashes that have no local UUID? The local UUID has no relevance beyond the possibility that it may correspond to a file on the local filesystem. If we have no file, I don't see much harm from inventing a UUID to be used in its place. Can you think of anything that would go wrong?
Flags: needinfo?(gps)
Assignee

Comment 6

5 years ago
(In reply to Birunthan Mohanathas [:poiru] from comment #0)
> For plugin/content crashes (bug 994708), we recorded the server-side ID. See
> http://hg.mozilla.org/mozilla-central/rev/45f3f1fe7b39#l2.63 (`ret.CrashID`
> is the server-side ID)

I forgot to mention that this was already "fixed"[0] a week ago when we switched to using the local ID (bug, again, the server-side ID was not recorded).

However, that fix broke crash submission event file processing, which sadly did not have test coverage. The patch in comment 5 resolves that issue.

[0]: https://hg.mozilla.org/integration/mozilla-inbound/rev/31919b319db
Assignee

Comment 7

5 years ago
(In reply to Gregory Szorc [:gps] (away Sep 10 through 27) from comment #4)
> I'd strongly prefer we retain some local ID so we have consistency though.
> Could we invent a UUID for crashes that have no local UUID? The local UUID
> has no relevance beyond the possibility that it may correspond to a file on
> the local filesystem. If we have no file, I don't see much harm from
> inventing a UUID to be used in its place. Can you think of anything that
> would go wrong?

It should be fine, but we will probably want to somehow differentiate those records so that MozSelfSupport can ignore them (for the purposes bug 1024677 comment 8). Would it be OK to e.g. suffix the ID with "-fake" or something similar?
Flags: needinfo?(gps)
Assignee

Comment 8

5 years ago
This should be a safe patch for Aurora. I'll request uplift after review.
Attachment #8480247 - Flags: review?(gps)
Assignee

Updated

5 years ago
Attachment #8480250 - Flags: review?(gps) → review?(benjamin)
https://hg.mozilla.org/mozilla-central/rev/28176a2e9861
https://hg.mozilla.org/mozilla-central/rev/47551e5606d5
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 34
Assignee

Comment 12

5 years ago
Reopening, forgot to add leave-open earlier.
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Target Milestone: Firefox 34 → ---
Comment on attachment 8480169 [details] [diff] [review]
Part 3: Fix crash submission event file handling and add test

Review of attachment 8480169 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/crashes/CrashManager.jsm
@@ +497,5 @@
> +          // processed, the crash record has already been created.
> +          if (!store._data.crashes.get(crashID)) {
> +            store.addCrash(this.PROCESS_TYPE_MAIN, this.CRASH_TYPE_CRASH,
> +                           crashID, date);
> +          }

Why the |if| here? When I first wrote this code, processing of the same event file should have resulted in idempotent behavior. i.e. you wouldn't need guards like this.

What breaks if the |if| is removed?

@@ +513,5 @@
> +            }
> +
> +            let submissionID = Cc["@mozilla.org/uuid-generator;1"]
> +                                 .getService(Ci.nsIUUIDGenerator)
> +                                 .generateUUID().toString().slice(1, -1);

A future improvement is to have the crash reporter create its own UUID and record it in the event file. This would be crash.submission.2.

@@ +519,5 @@
> +
> +            store.addSubmissionAttempt(crashID, submissionID, date);
> +            store.addSubmissionResult(crashID, submissionID, date,
> +                                      succeeded ? this.SUBMISSION_RESULT_OK :
> +                                                  this.SUBMISSION_RESULT_FAILED);

Oh, hmm.

This is kinda unfortunate. We should really get the crash reporter saving 2 events: 1 for attempt and 1 for completion.

When I initially worked on this code, that's the path I went down. It doesn't look like the final implementation does it this way :/

I prefer 2 events because if you wait until submission has completed, you may lose the fact that submission attempted at all! This especially holds true in the crash reporter case: the system could be in an unstable state and the crash reporter itself may crash before submission result, losing submission state.

I would strongly encourage us to fix the crash reporter to emit 2 events rather than go with this approach. Is that doable?
Attachment #8480169 - Flags: review?(gps) → feedback+
Attachment #8480250 - Flags: review?(benjamin) → review+
Assignee

Comment 14

5 years ago
Comment on attachment 8480250 [details] [diff] [review]
Store local ID rather than remote ID when submitting crashes through CrashSubmit

Approval Request Comment
[Feature/regressing bug #]: Bug 994707
[User impact if declined]: No direct impact, but our FHR data will not be as useful.
[Describe test coverage new/current, TBPL]: Tests hit the changed code path, but don't actually verify that it does the right thing. I performed manual testing to ensure that it works as expected.
[Risks and why]: Low risk as the change is simple and self-contained.
[String/UUID change made/needed]: No.
Attachment #8480250 - Flags: approval-mozilla-aurora?
Attachment #8480250 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee

Updated

5 years ago
Duplicate of this bug: 1060408
Assignee

Comment 16

5 years ago
(In reply to Gregory Szorc [:gps] (away Sep 10 through 27) from comment #13)
> ::: toolkit/components/crashes/CrashManager.jsm
> @@ +497,5 @@
> > +          // processed, the crash record has already been created.
> > +          if (!store._data.crashes.get(crashID)) {
> > +            store.addCrash(this.PROCESS_TYPE_MAIN, this.CRASH_TYPE_CRASH,
> > +                           crashID, date);
> > +          }
> 
> Why the |if| here? When I first wrote this code, processing of the same
> event file should have resulted in idempotent behavior. i.e. you wouldn't
> need guards like this.
> 
> What breaks if the |if| is removed?

As dicussed on IRC, it was because CrashStore_ensureCrashRecord performs count limiting even when updating an existing crash record. This patch fixes that.
Attachment #8481373 - Flags: review?(gps)
Assignee

Comment 17

5 years ago
(In reply to Gregory Szorc [:gps] (away Sep 10 through 27) from comment #13)
> I would strongly encourage us to fix the crash reporter to emit 2 events
> rather than go with this approach. Is that doable?

Is it OK if I do that in a follow-up later today? This is broken in Nightly and I'd like to get it fixed ASAP before further improvements.
Attachment #8480169 - Attachment is obsolete: true
Attachment #8481374 - Flags: review?(gps)

Updated

5 years ago
Attachment #8481373 - Flags: review?(gps) → review+
Comment on attachment 8481374 [details] [diff] [review]
Part 4: Fix crash submission event file handling and add test

submissionID confused me for a bit: I now understand that per-crash we may be recording multiple submission (in case one failed and we retried). In that case, I'd like it not to look so much like a UUID: can you use something like http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/downloads/tests/chrome/utils.js#52 to generate a random string instead?

r=me with that change
Attachment #8481374 - Flags: review?(gps) → review+
Assignee

Comment 20

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/26061d23f040
https://hg.mozilla.org/integration/mozilla-inbound/rev/e24d5e087d36

(In reply to Benjamin Smedberg  [:bsmedberg] from comment #18)
> Comment on attachment 8481374 [details] [diff] [review]
> Part 4: Fix crash submission event file handling and add test
> 
> submissionID confused me for a bit: I now understand that per-crash we may
> be recording multiple submission (in case one failed and we retried). In
> that case, I'd like it not to look so much like a UUID: can you use
> something like
> http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/downloads/
> tests/chrome/utils.js#52 to generate a random string instead?

After discussion on IRC, we decided to go with a UUID and a "sub-" prefix.
Status: REOPENED → ASSIGNED
Flags: needinfo?(gps)
Attachment #8481653 - Flags: review?(gps) → review+
Assignee

Updated

5 years ago
Status: ASSIGNED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open

Updated

9 months ago
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.