Closed
Bug 1035290
Opened 11 years ago
Closed 11 years ago
AsyncShutdownTimeout "FHR: Flushing storage shutdown"
Categories
(Firefox Health Report Graveyard :: Client: Desktop, defect)
Tracking
(firefox34 wontfix, firefox35+ wontfix, firefox36+ wontfix, firefox37+ wontfix)
People
(Reporter: Yoric, Unassigned)
Details
(Keywords: topcrash)
According to Socorro, we have 25 of these on Nightly (1/8 of AsyncShutdownTimeout crashes) and 97 on Aurora (1/3 of AsyncShutdownTimeout crashes). Apparently, bug 1017706 helped but did not suffice.
From the 25 Nightly crashes, we have
* 15 with {shutdownInitiated:true, initialized:false, shutdownRequested:true, initializeHadError:false, providerManagerInProgress:false, storageInProgress:false, hasProviderManager:false, hasStorage:true, shutdownComplete:false}
* 8 with {shutdownInitiated:false, initialized:false, shutdownRequested:true, initializeHadError:false, providerManagerInProgress:true, storageInProgress:false, hasProviderManager:true, hasStorage:true, shutdownComplete:false}
From the 93 Aurora crashes, we have
* 87 with {shutdownInitiated:false, initialized:true, shutdownRequested:false, initializeHadError:false, providerManagerInProgress:false, storageInProgress:false, hasProviderManager:true, hasStorage:true}
We really should make sure that we fix the problem before it gets to Beta.
| Reporter | ||
Comment 2•11 years ago
Forgot to mention: this is essentially the same as bug 944873, except that bug has been hijacked by other issues.
Comment 3•11 years ago
Unless I'm missing something, the running theory is that bug 1030266 is the sole cause and fixing it will make this go away. All the other solutions (timeouts on tasks, etc.) are nice, but they can probably wait. They hopefully amount to a benign itch.
Flags: needinfo?(gps)
| Reporter | ||
Comment 4•11 years ago
Well, we can certainly try and fix it.
| Reporter | ||
Updated•11 years ago
| Reporter | ||
Comment 5•11 years ago
I just checked, and we still have some of these: http://yoric.github.io/are-we-shutting-down-yet-/?signature=%7EFHR%3A+Flushing+storage+shutdown
Comment 7•11 years ago
This is a topcrash right now on Firefox 34.
This crash signature has come back into the top 10 for 34 with 683/47826 crashes in the last 3 days, and 1255/91832 crashes for 34.0.5.
Some comments mention Flash, plugin container issues, and an inability to shut down.
There isn't a crash signature field for this component, so people may file duplicate bugs by accident (as this shows up on the lists of top crash signatures that don't have bugs associated yet). Yoric, is it worth keeping the one I filed open to avoid that confusion?
Flags: needinfo?(dteller)
Keywords: topcrash
| Reporter | ||
Comment 8•11 years ago
I thought we had at least one open already with that crash signature? If that's not the case, yeah, you can keep it open. Just make it a meta bug, because all it means is "some component fails to shut down properly", and you have to look at the metadata to find out which component it is.
Georg, I think that's in your lap, isn't it?
Flags: needinfo?(dteller) → needinfo?(georg.fritzsche)
Updated•11 years ago
Flags: needinfo?(georg.fritzsche)
Comment 9•11 years ago
(In reply to David Rajchenbach-Teller [:Yoric] (hard to reach until December 10th - use "needinfo") from comment #8)
> Georg, I think that's in your lap, isn't it?
I'm not sure what exactly you are asking me about?
| Reporter | ||
Comment 10•11 years ago
I'm asking whether you are investigating the issue.
Comment 11•11 years ago
(In reply to David Rajchenbach-Teller [:Yoric] (hard to reach until December 10th - use "needinfo") from comment #10)
> I'm asking whether you are investigating the issue.
Ah, yes, I'm checking into this one right now. I don't think that I can take on all the data collection issues personally, though.
Comment 12•11 years ago
I did a quick manual check of the "0 days ago" reports for Fx 34.* here:
http://yoric.github.io/are-we-shutting-down-yet-/?signature=~FHR%3A+Flushing+storage+shutdown&version=Firefox+34.0.5&version=Firefox+34.0#
For 18 of 19 of these reports, the state matches bug 1110681.
The single outlier is bp-6f645945-0102-4135-8bfa-f08392141212, which has this data:
{
"phase":"Metrics Storage Backend",
"conditions":[
{
"name":"FHR: Flushing storage shutdown",
"state":{
"shutdownInitiated":true,
"initialized":false,
"shutdownRequested":true,
"initializeHadError":false,
"providerManagerInProgress":false,
"storageInProgress":false,
"hasProviderManager":false,
"hasStorage":true,
"shutdownComplete":false
},
"filename":"resource://gre/modules/HealthReport.jsm",
"lineNumber":4335,
"stack":[
"resource://gre/modules/HealthReport.jsm:AbstractHealthReporter.prototype<.init/<:4335",
""
]
}
]
}
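As a rough illustration of how a report shaped like the one above can come about, here is a minimal sketch. It is not Firefox's AsyncShutdown module: `buildTimeoutReport`, `isComplete`, and `reporterFlags` are hypothetical names invented for this example, and only the output shape (`phase`, `conditions`, `name`, `state`) is taken from the crash data quoted above.

```javascript
// Hypothetical helper, NOT Firefox's AsyncShutdown module: given the blockers
// registered for a shutdown phase, report the ones that have not completed.
function buildTimeoutReport(phase, blockers) {
  const conditions = blockers
    .filter((blocker) => !blocker.isComplete())
    .map((blocker) => ({ name: blocker.name, state: blocker.fetchState() }));
  return { phase, conditions };
}

// Stand-in for a reporter's internal flags (a subset of the fields shown above).
const reporterFlags = {
  shutdownInitiated: true,
  initialized: false,
  hasStorage: true,
  shutdownComplete: false,
};

const blockers = [
  {
    name: "FHR: Flushing storage shutdown",
    isComplete: () => reporterFlags.shutdownComplete,
    // Snapshot the flags at the time the report is built.
    fetchState: () => ({ ...reporterFlags }),
  },
  {
    name: "some other blocker (hypothetical)",
    isComplete: () => true,
    fetchState: () => ({}),
  },
];

const report = buildTimeoutReport("Metrics Storage Backend", blockers);
console.log(JSON.stringify(report, null, 2));
```

Only the blocker that is still pending appears under `conditions`, which is why the state attached to it is the main clue about which component failed to shut down.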
Comment 13•11 years ago
(In reply to Georg Fritzsche [:gfritzsche] from comment #12)
> The single outlier is bp-6f645945-0102-4135-8bfa-f08392141212, which has
> this data:
And going by that state, this is waiting on the storage closing around here:
http://hg.mozilla.org/mozilla-central/annotate/0cf461e62ce5/services/healthreport/healthreporter.jsm#l625
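To make the reading of these state objects a bit more concrete, the sketch below labels a state by the flags discussed in this thread. The labels are heuristics inferred from the flag names and from comments 12 and 13, not from the healthreporter.jsm source; `describeState` is a hypothetical helper, and the second sample state is the one that comment 15 attributes to bug 1110681.

```javascript
// Heuristic labels only: the flag semantics are inferred from their names and
// from the discussion in this bug, not from the actual FHR implementation.
function describeState(state) {
  if (state.shutdownComplete) {
    return "shutdown complete";
  }
  if (state.providerManagerInProgress) {
    return "provider manager initialization in progress";
  }
  if (state.storageInProgress) {
    return "storage initialization in progress";
  }
  if (state.shutdownInitiated && state.hasStorage && !state.hasProviderManager) {
    return "waiting on storage close";
  }
  return "unclassified";
}

// State of the outlier report bp-6f645945-0102-4135-8bfa-f08392141212.
const outlierState = {
  shutdownInitiated: true,
  initialized: false,
  shutdownRequested: true,
  initializeHadError: false,
  providerManagerInProgress: false,
  storageInProgress: false,
  hasProviderManager: false,
  hasStorage: true,
  shutdownComplete: false,
};

// State that comment 15 reports as matching bug 1110681.
const bug1110681State = {
  shutdownRequested: true,
  shutdownInitiated: false,
  providerManagerInProgress: true,
  hasProviderManager: true,
  hasStorage: true,
  initializeHadError: false,
  initialized: false,
  shutdownComplete: false,
  storageInProgress: false,
};

console.log(describeState(outlierState));
console.log(describeState(bug1110681State));
```

States that match none of the patterns fall through to "unclassified" rather than being guessed at.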
Comment 14•11 years ago
Tracking top crash.
status-firefox34: --- → wontfix
status-firefox35: --- → affected
status-firefox36: --- → affected
status-firefox37: --- → affected
tracking-firefox35: --- → +
tracking-firefox36: --- → +
tracking-firefox37: --- → +
Comment 15•11 years ago
So, I ran a proper analysis now based on the full release data we have for this.
Of 6212 AsyncShutdownTimeouts for "FHR: Flushing storage shutdown" [0], 3810 are on Fx 34.
Of those, 3579 have the same state as bug 1110681, so we need to push that bug.
The detailed breakdown:
{
'{"shutdownRequested": true, "shutdownInitiated": false, "providerManagerInProgress": true, "hasProviderManager": true, "hasStorage": true, "initializeHadError": false, "initialized": false, "shutdownComplete": false, "storageInProgress": false}': 3579,
'{"shutdownRequested": true, "shutdownInitiated": true, "providerManagerInProgress": false, "hasProviderManager": false, "hasStorage": true, "initializeHadError": false, "initialized": false, "shutdownComplete": false, "storageInProgress": false}': 219,
'{"shutdownRequested": true, "shutdownInitiated": false, "providerManagerInProgress": false, "hasProviderManager": false, "hasStorage": false, "initializeHadError": false, "initialized": false, "shutdownComplete": false, "storageInProgress": true}': 7,
'{"shutdownRequested": true, "shutdownInitiated": true, "providerManagerInProgress": false, "hasProviderManager": true, "hasStorage": true, "initializeHadError": false, "initialized": false, "shutdownComplete": false, "storageInProgress": false}': 5,
}
[0] http://bsmedberg.github.io/crash-stats-api-magic/analyze-crash.html?url=https%3A%2F%2Fdl.dropboxusercontent.com%2Fu%2F15124579%2Fasync_shutdown_timeout_crashes.json&rulecount=3&rule0_action=filter&rule0_fn=function%28d%29%20{%0A%20%20return%20d.version.indexOf%28%2234%22%29%20%3D%3D%200%20%26%26%0A%20%20%20%20%20%20%20%20%20d.async_shutdown_timeout.indexOf%28%22FHR%3A%20Flushing%20storage%20shutdown%22%29%20%3E%20-1%3B%0A}&rule1_action=map&rule1_fn=function%28d%29%20{%0A%20%20return%20JSON.parse%28d.async_shutdown_timeout%29.conditions%5B0%5D.state%3B%0A}&rule2_action=counter&rule2_fn=
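For readers who do not want to decode the URL in [0]: its three rules amount to a filter, a map, and a counter. The sketch below restates them as plain JavaScript. The field names `version` and `async_shutdown_timeout` and the two rule bodies come from that URL; `countStates`, `makeCrash`, the key-sorting step, and the sample records are assumptions made for this example and are not real crash data.

```javascript
// rule0 (filter): Firefox 34.x reports whose annotation mentions the FHR blocker.
function isFhrFlushOn34(d) {
  return d.version.indexOf("34") === 0 &&
    d.async_shutdown_timeout.indexOf("FHR: Flushing storage shutdown") > -1;
}

// rule1 (map): extract the state of the first condition in the annotation.
function extractState(d) {
  return JSON.parse(d.async_shutdown_timeout).conditions[0].state;
}

// rule2 (counter): tally identical states. Keys are sorted so that property
// order does not split otherwise identical states (an assumption; the original
// tool's keying is not shown in this bug).
function countStates(crashes) {
  const counts = new Map();
  for (const state of crashes.filter(isFhrFlushOn34).map(extractState)) {
    const key = JSON.stringify(state, Object.keys(state).sort());
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Hypothetical sample records, shaped like the fields used by the rules above.
function makeCrash(version, name, state) {
  return {
    version,
    async_shutdown_timeout: JSON.stringify({
      phase: "Metrics Storage Backend",
      conditions: [{ name, state }],
    }),
  };
}

const fhr = "FHR: Flushing storage shutdown";
const sampleCrashes = [
  makeCrash("34.0.5", fhr, { shutdownInitiated: false, providerManagerInProgress: true }),
  makeCrash("34.0", fhr, { providerManagerInProgress: true, shutdownInitiated: false }),
  makeCrash("34.0.5", fhr, { shutdownInitiated: true, providerManagerInProgress: false }),
  makeCrash("35.0", fhr, { shutdownInitiated: true, providerManagerInProgress: false }),
  makeCrash("34.0.5", "some other blocker (hypothetical)", {}),
];

const stateCounts = countStates(sampleCrashes);
for (const [state, count] of stateCounts) {
  console.log(count, state);
}
```

In the sample, the Firefox 35 record and the unrelated blocker are filtered out, and the first two records collapse into one bucket despite their different property order.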
Comment 16•11 years ago
If this regressed in Firefox 34, what changed in Firefox 34 that is causing this code path to get exercised more often?
Comment 17•11 years ago
(In reply to Gregory Szorc [:gps] from comment #16)
> If this regressed in Firefox 34, what changed in Firefox 34 that is causing
> this code path to get exercised more often?
We don't know at this point - there are more diagnostics that went into Firefox 35+ that should hopefully help us pin down the offending provider.
See also bug 1110681 for more context.
Comment 18•11 years ago
kairo, we don't have great short-term options right now because we probably won't get enough diagnostic data back in time for the 35 release.
Can you tell whether the specific issue filed here (AsyncShutdownTimeout with state "FHR: Flushing storage shutdown") has a big impact?
We might consider silencing this specific issue for release if we have to, but would lose diagnostic data in the process.
Flags: needinfo?(kairo)
Comment 19•11 years ago
(Note that we only pulled 3810 crashes for this issue per comment 15, but I'm not sure about the effect of throttling or other factors here.)
Comment 20•11 years ago
http://yoric.github.io/are-we-shutting-down-yet-/# usually gives a good picture of how many of the shutdown crashes belong to which issue, but its main page doesn't seem to load right now.
That said, http://yoric.github.io/are-we-shutting-down-yet-/?version=Firefox+34.0.5# shows that this is the major part of those signatures, other than bug 1114567, which just came up in the last few days. Unfortunately, http://yoric.github.io/are-we-shutting-down-yet-/?version=Firefox+35.0# also doesn't seem to load. Looks like Yoric needs to fix it.
Flags: needinfo?(kairo)
Comment 21•11 years ago
KaiRo confirmed that AsyncShutdownTimeout crashes are ~1% of release crashes, with most of them presumably being this issue.
Given that volume and that it is a shutdown crash, we're not jumping to emergency measures for 35 here.
Instead we can wait on the diagnostic data, which should hopefully get us to a fix in 36 beta.
Comment 22•11 years ago
Wontfix based on comment 21 and the timing of 35.
Comment 23•11 years ago
Georg, can we have an assignee on this for 36? Thanks
Flags: needinfo?(gfritzsche)
Comment 24•11 years ago
The next actionable step is adding additional forensics on the SearchProvider in bug 1110681.
I hope that Yoric or I can get to that soon.
Flags: needinfo?(gfritzsche)
Comment 26•11 years ago
Guys, have you been able to work on this? Thanks
Flags: needinfo?(gfritzsche)
Flags: needinfo?(dteller)
Comment 27•11 years ago
We haven't been able to get bug 1110681 done yet due to other prioritized work.
Flags: needinfo?(gfritzsche)
Flags: needinfo?(dteller)
Comment 28•11 years ago
OK. So, wontfix for 36 too.
Comment 29•11 years ago
The additional forensics from bug 1110681 look to have landed about 1.5 months ago on 35+. Do you have the data that you need to proceed with the investigation for this bug?
Flags: needinfo?(gfritzsche)
Flags: needinfo?(dteller)
Comment 30•11 years ago
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #29)
> The additional forensics from bug 1110681 look to have landed about 1.5
> months ago on 35+. Do you have the data that you need to proceed with the
> investigation for this bug?
The first part landed, pointed to the SearchService issue and required further investigation.
Bug 1110681 is still open for the further forensics on what is broken in the search service, hence comment 27.
We are currently pushing for FHR & Telemetry unification, which would obsolete these issues on desktop, which is why this hasn't made the top of the list.
Flags: needinfo?(gfritzsche)
Flags: needinfo?(dteller)
Comment 31•11 years ago
We're not moving very quickly on this bug, which is still marked as a top crash.
Kairo - Can you confirm that this is still a top crash?
Flags: needinfo?(kairo)
Comment 32•11 years ago
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #31)
> Kairo - Can you confirm that this is still a top crash?
http://yoric.github.io/are-we-shutting-down-yet-/?version=Firefox+35.0# says it's still 22% of all 35 AsyncShutdownTimeouts.
According to http://yoric.github.io/are-we-shutting-down-yet-/?version=Firefox+36.0# it's only 4% of them in 36.
The overall signature that we usually see with AsyncShutdownTimeouts is right now #9 with 1.1% of 36.0b10 crashes.
Flags: needinfo?(kairo)
Comment 33•11 years ago
Georg - This bug is still very relevant as it accounts for a rather substantial portion of the shutdown crashes. (See comment 32.) FHR/Telemetry unification is happening in 38. If you intend to investigate further in 38 after the unification is complete (in the next couple of weeks), perhaps we can hold off. If you're suggesting that we'll look at this again to ship a fix in 39, that seems pretty late. Do we have any other options to get better data and get back to making progress on this bug?
Flags: needinfo?(gfritzsche)
Comment 34•11 years ago
With the unification done, we will have a decision about whether we can turn off FHR in 38, which would make the problem go away anyway.
We may still want to solve things here if this is critical enough to fix on 37, or if we want to have this ready in case disabling FHR is a no-go.
Flags: needinfo?(gfritzsche)
Comment 35•11 years ago
This bug is pretty old at this point but it is still flagged as a topcrash and still something that we want to fix if we have a way to make progress. Assuming that we don't want to wait for FHR/Telemetry unification or that that doesn't happen in 38, what are the next steps toward resolving this bug?
Flags: needinfo?(gfritzsche)
| Reporter | ||
Comment 36•11 years ago
I believe that the next step is extracting data collected from bug 1110681.
| Reporter | ||
Comment 37•11 years ago
(once it has landed, that is)
Comment 38•11 years ago
Indeed, note that David picked that one up again (thanks!).
Flags: needinfo?(gfritzsche)
Comment 39•11 years ago
I can confidently say that we aren't going to have time for this bug. The only sane way to fix this is by removing the code in question, which will either be 38 or 39 depending on the status of unification.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
Comment 40•11 years ago
I think comment 34 and comment 39 make sense. This issue will be eliminated in 38 or 39. I'm going to mark this as wontfix for 37+.
Updated•7 years ago
Product: Firefox Health Report → Firefox Health Report Graveyard