Closed
Bug 1303369
Opened 8 years ago
Closed 8 years ago
Don't fill the gfx critical log if the compositor is blocked forever.
Categories
(Core :: Graphics: Layers, defect)
Core
Graphics: Layers
Tracking
()
RESOLVED
FIXED
mozilla52
People
(Reporter: nical, Assigned: nical)
Details
Attachments
(2 files, 1 obsolete file)
3.29 KB,
patch
|
milan
:
review+
|
Details | Diff | Splinter Review |
1.23 KB,
patch
|
milan
:
review+
gchang
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
We should limit the amount of logging coming from nsRefreshDriver if the compositor is blocked for indefinitely, because the log could contain important information about why the problem happened.
Assignee | ||
Comment 1•8 years ago
|
||
Attachment #8792024 -
Flags: review?(milan)
Comment on attachment 8792024 [details] [diff] [review] Stop logging after 8 seconds Review of attachment 8792024 [details] [diff] [review]: ----------------------------------------------------------------- Correct me if I'm wrong, but this is going to limit us to getting a few messages at a time, but nothing is stopping multiple refresh problems from adding up, three or so entries each, and hiding the other messages? If we know that there was a refresh lag for 8+ seconds, what's the value in knowing there were 5 of such events, vs. just one, and have those other four*3 wipe out the rest of the messages?
For example, that same crash - https://crash-stats.mozilla.com/report/index/d366d0e8-9b36-44c3-bafc-b48872160914
Here's one where I just crashed - https://crash-stats.mozilla.com/report/index/5520bb85-38ab-4501-8417-f37a72160916. 15 out of 16 messages are with the refresh driver < 8 seconds. The 16th (max size by default) was a refresh driver > 8 seconds. Would the patch above have 15 out of those 16 show up in the log?
We already have >3k reports with the refresh messages in the error log. Can we get rid of this message while we're figuring out how to limit it? Because it's making the critical log useless at this point.
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Comment 6•8 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #3) > For example, that same crash - > https://crash-stats.mozilla.com/report/index/d366d0e8-9b36-44c3-bafc- > b48872160914 (In reply to Milan Sreckovic [:milan] from comment #4) > Here's one where I just crashed - > https://crash-stats.mozilla.com/report/index/5520bb85-38ab-4501-8417- > f37a72160916. Both of these are with builds prior to the patch that fixed the log spam. (In reply to Milan Sreckovic [:milan] from comment #5) > We already have >3k reports with the refresh messages in the error log. Can > we get rid of this message while we're figuring out how to limit it? > Because it's making the critical log useless at this point. The bulk of the issue is already figured out, there's still some adjustments to make, but the vast majority of the reports with this in the log are from nightly builds prior to the fix. (In reply to Milan Sreckovic [:milan] from comment #2) > Comment on attachment 8792024 [details] [diff] [review] > Stop logging after 8 seconds > > Review of attachment 8792024 [details] [diff] [review]: > ----------------------------------------------------------------- > > Correct me if I'm wrong, but this is going to limit us to getting a few > messages at a time, but nothing is stopping multiple refresh problems from > adding up, three or so entries each, and hiding the other messages? Sure, that's true with almost all of the gfxCriticalError/Note messages in the code. If it fires too much it means either the logging logic is incorrect or someone needs to stop what he is doing and fix a critical bug asap. In this case the logging code was bogus and spammed the log, which should be addressed now. > If we know that there was a refresh lag for 8+ seconds, what's the value in > knowing there were 5 of such events, vs. just one, and have those other > four*3 wipe out the rest of the messages? I'd rather avoid having the log fire only once because of cases where a bad tab switch takes a second or some addon messes up, and then the log doesn't fire again later when the compositor actually froze for a very long time. Knowing that the problem happened several time is useful info too. (In reply to Milan Sreckovic [:milan] from comment #5) > We already have >3k reports with the refresh messages in the error log. Can > we get rid of this message while we're figuring out how to limit it? > Because it's making the critical log useless at this point. Looking at https://crash-stats.mozilla.com/search/?graphics_critical_error=~Refresh%20driver%20waiting&build_id=%3E20160916030204&product=Firefox&_sort=build_id&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=graphics_critical_error#crash-reports There has been around 450 crashes with a 1s or more freeze since the logging fix landed which is still too much, but for most of them the log is not filling up all the gfxCriticalNote slots (not sure how to build a query that shows crashes where the same note filled all slots), so we are not in the emergency situation we were in a few days ago. These 450 crashes most likely did freeze badly for long periods of time, so I would not try to minimize this number by tweaking the log. We just want to make sure it doesn't prevent us from seeing the other logging items. We could do the same thing without resetting the timeout threshold next time we paint. This would mean we only log a freeze if it held longer than the previous one up to 16sec and never log more than 4 times total. As long as we don't do gfxCriticalNoteOnce which will only record the first 1sec freeze and not show potential worse ones after that.
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Comment 7•8 years ago
|
||
With this patch we log only the first time we hit each threshold (1s, 2s, 4s and 8s).
Attachment #8792024 -
Attachment is obsolete: true
Attachment #8792024 -
Flags: review?(milan)
Attachment #8792453 -
Flags: review?(milan)
Comment on attachment 8792453 [details] [diff] [review] Stop logging after 8 seconds & only log if the freeze is worse than the previous one Review of attachment 8792453 [details] [diff] [review]: ----------------------------------------------------------------- Still feels like this belongs in telemetry, rather than crash reports. It looks like a normal thing to happen (sad, but not unexpected), doesn't feel like a critical error.
Attachment #8792453 -
Flags: review?(milan) → review+
Assignee | ||
Comment 9•8 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #8) > Still feels like this belongs in telemetry, rather than crash reports. It > looks like a normal thing to happen (sad, but not unexpected), doesn't feel > like a critical error. You have a point about telemetry. It should not be a normal thing to happen, though. 1 second is not so far from "forever" in terms of compositor responsiveness. It's scary and I'd like to have reliable numbers. Someone just reported to me that he was still seeing the log sometimes when the browser didn't look like it was frozen, So I guess I'll disable the log entirely in the aurora train that starts today/tomorrow and see if there's a way to get the logging to be more reliable.
Assignee | ||
Comment 10•8 years ago
|
||
Attachment #8792536 -
Flags: review?(milan)
Updated•8 years ago
|
Attachment #8792536 -
Flags: review?(milan) → review+
Comment 11•8 years ago
|
||
Pushed by nsilva@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/aeef09f7560c Disable the nsRefreshDriver warnings while false positives are investigated. r=milan
Comment 12•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/aeef09f7560c
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Assignee | ||
Comment 13•8 years ago
|
||
Comment on attachment 8792536 [details] [diff] [review] Disable the warning (aurora) Approval Request Comment [Feature/regressing bug #]: [User impact if declined]: None, we would have a harder time collecting the gfxCriticalNote info in crash-stats. This disables a log which has false positives so we can't trust enough to draw conclusions from it and it occupies space in the limited list of logged items. [Describe test coverage new/current, TreeHerder]: None. [Risks and why]: None, this just comments out a log. [String/UUID change made/needed]: None.
Attachment #8792536 -
Flags: approval-mozilla-aurora?
Updated•8 years ago
|
status-firefox51:
--- → affected
Comment 14•8 years ago
|
||
Comment on attachment 8792536 [details] [diff] [review] Disable the warning (aurora) The patch disables logs to better collect information. Take it in 51 aurora.
Attachment #8792536 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 15•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/0b711e7b0f85
You need to log in
before you can comment on or make changes to this bug.
Description
•