Closed Bug 981817 Opened 11 years ago Closed 9 years ago

missing refresh driver ticks with tiling/APZ enabled

Categories

(Core :: Graphics, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

RESOLVED INCOMPLETE

People

(Reporter: bkelly, Unassigned)

References

Details

(Keywords: perf)

I was doing some testing today with the tiling work that just landed in bug 963073. It appears to work most of the time, but periodically I see large periods of checkerboarding. Taking a profile shows that we are not getting refresh driver ticks in the client process.

For example, here are two profiles on my buri:

http://people.mozilla.org/~bgirard/cleopatra/#report=059af3091d0be8c40d334c0757e620816ca1cd80
http://people.mozilla.org/~bgirard/cleopatra/#report=ee2ae38e2b6a8be106c85276015b97e498561402

Look at the beginning of the first profile and the end of the second profile.

If I disable tiling then I can no longer provoke this condition. I also saw this on my tarako device. Both of these devices are ICS based.

I was scrolling the virtual-list-demo from my branch here:

https://github.com/wanderview/gaia/tree/virtual-list-demo

Which is based on:

gaia: a351fe62c11737c722ad33aaff438f6ccd00bd4a

Also using:

gecko: 923f1411f42f
v1.2-device.cfg
OEM firmware
I cannot seem to trigger this problem on nexus-4.
I was going to make a video, but I'm having trouble re-triggering on my buri now too.
It seems hard to pin this on tiling when it's so intermittent. It's possible that, with tiling disabled, I simply failed to reproduce it because of natural variability. Vlad had suggested that it might be APZ related.
No longer blocks: b2g-tiling
Summary: missing refresh driver ticks with tiling enabled → missing refresh driver ticks with tiling/APZ enabled
If you're not getting refresh driver ticks then my first guess would be that the paint throttler in the APZC is getting stuck because it doesn't receive a notification. The APZC code waits to send a second paint request [1] until it knows the first one has been processed [2]; this is done by the paint throttler code. It could be that the code at [2] doesn't get run in some cases if the tiling code has some sort of early-exit condition and doesn't send a layers update to the compositor.

I know that when I was investigating APZC on Fennec (where tiling is enabled) the code that used to be at [1] would get hit for every tile that got uploaded and it was causing all sorts of problems. The code that's there now should be better behaved but it's still possible that it doesn't interact well with tiling.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/AsyncPanZoomController.cpp?rev=3b1371ee7744#1452
[2] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/AsyncPanZoomController.cpp?rev=3b1371ee7744#1657
Another one I happened to catch while trying to profile something else: http://people.mozilla.org/~bgirard/cleopatra/#report=dc85d1215ef8193ee4a5f22a72d8f078201d0465
I think I've seen this on my nexus-4 as well, but I have not been lucky enough to catch it in the profiler so far.
When this occurs it pretty much looks just like other checkerboarding.
Blocks: 942750
So it feels like it's easier to reproduce this the higher the compositor thread priority. This suggests that maybe we have some kind of race condition. Kats, would this behavior make sense if the notification came too early? Is it possible for it to occur before APZC is ready for it? I understand we use more async messaging with the tiling code enabled.
Flags: needinfo?(bugmail.mozilla)
(In reply to Ben Kelly [:bkelly] from comment #8)
> Kats, would this behavior make sense if the notification came too early? Is
> it possible for it to occur before APZC is ready for it?

I don't think that makes sense; the notification is a response to a paint request by the APZC and so should never precede the paint request. Even if we got some other unrelated notification before our paint request, that should just trigger more paints and shouldn't result in the "stuck" behaviour.

If you attach a log of the output with the APZ logging enabled (comment out the no-op APZC_LOG and uncomment the printf_stderr version at the top of AsyncPanZoomController.cpp) when this reproduces I should be able to help isolate the problem.
Flags: needinfo?(bugmail.mozilla)
Since this bug was filed, silk landed and overhauled the refresh driver ticking conditions, and we also removed the paint throttler code. So I'm gonna go ahead and close this bug on the assumption that it doesn't happen any more.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE