Closed Bug 845966 Opened 7 years ago Closed 7 years ago

Detect hung startup and/or shutdown

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set

Tracking

(firefox20 unaffected, firefox21 fixed)

RESOLVED FIXED
Firefox 22
Tracking Status
firefox20 --- unaffected
firefox21 --- fixed

People

(Reporter: gps, Assigned: gps)

Details

Attachments

(1 file, 1 obsolete file)

We don't have timeouts for startup and shutdown events. This could bite us with shutdown hangs (as evidenced in bug 845842).
There were gaps in the existing tests. We were simulating exceptions after the promise had already been resolved. I'm kinda surprised we didn't know about this until now!

Anyway, I applied some nice test-driven development with this patch. I was able to produce the hang in test. Then, I fixed it.

We still have a gap in test coverage for storage init. But, that one is a little bit more difficult to monkeypatch. The bug with collector was definitely there with storage. And, it is patched as well - just untested.

We should definitely uplift this to Aurora.
Assignee: nobody → gps
Status: NEW → ASSIGNED
Attachment #719237 - Flags: review?(rnewman)
Irving: Please apply the patch, |./mach build services browser|, then verify the shutdown hangs go away. I expect the stack error to still be there and for FHR to not init. But at least we are a bit more graceful about things.
Flags: needinfo?(irving)
I should qref before submitting patches.
Attachment #719237 - Attachment is obsolete: true
Attachment #719237 - Flags: review?(rnewman)
Attachment #719241 - Flags: review?(rnewman)
Comment on attachment 719241 [details] [diff] [review]
Be more robust about errors during startup, v2

Review of attachment 719241 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/healthreport/healthreporter.jsm
@@ +240,1 @@
>        this._log.warn("Collector is in progress of initializing. Waiting to finish.");

Let's log before this `if` if initializeHadError, just so we get a message like "Shutting down immediately: initialize failed."
Attachment #719241 - Flags: review?(rnewman) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/8087b8019bf6

With slightly restructured if block so we log initializeHadError.
Target Milestone: --- → mozilla22
Comment on attachment 719241 [details] [diff] [review]
Be more robust about errors during startup, v2

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Firefox Health Report core code
User impact if declined: Possible shutdown hangs.
Testing completed (on m-c, etc.): Landed on inbound a few hours ago. I'm fine with a day of testing on Nightly before uplift. There is a new xpcshell test for the failure. Manual spot testing reveals FHR still works with this patch.
Risk to taking this patch (and alternatives if risky): I think it's riskier to have FHR enabled on Aurora without this patch.
String or UUID changes made by this patch: None
Attachment #719241 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/8087b8019bf6
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
I tried this out with the logging patch gps gave me for bug 845842, and my debug build no longer hangs on shutdown. Still a lot of failures in the log, but I'm pretty sure those are all because of 845842 related issues.
Flags: needinfo?(irving)
Comment on attachment 719241 [details] [diff] [review]
Be more robust about errors during startup, v2

Given the user impact & risk & considering we will be enabling FHR on aurora by tomorrow per 844181, approving for uplift.
Attachment #719241 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Gregory, do you know of a way I could simulate a startup/shutdown hang so I can verify this?
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #11)
> Gregory, do you know of a way I could simulate a startup/shutdown hang so I
> can verify this?

You'd have to trigger an error in FHR's internals. The only way you will reliably be able to do that is to monkeypatch a function to always throw. The class at https://hg.mozilla.org/mozilla-central/file/ec68e56cafa3/services/healthreport/modules-testing/utils.jsm#l214 essentially implements a derived type that overrides some functions to always throw. You'll want to do something similar. The trick will be to install this code on the built-in health reporter instance at run-time. That will be difficult if not impossible.
Thanks Gregory. Given the time constraints I don't think we'll reasonably be able to test this. If testing this is a Beta blocker then I'll need to re-evaluate that decision.
Component: Metrics and Firefox Health Report → Client: Desktop
Product: Mozilla Services → Firefox Health Report
Target Milestone: mozilla22 → Firefox 22
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.