Created attachment 8584711 [details] gecko_profile.out In the attached profile, the rasterization pass at 2741 - 2977 spends 20% of it's time under PaintThebesSingleBufferAlloc, which is a section in ClientTiledLayerBuffer::PaintThebes() that simply allocates a DrawTarget. Seems a little much.
These profiles would be more useful with stack walking, and symbolication.
Yeah. The profiler addon doesn't work for me (and hasn't in forever). I can try to get it going I guess.
I don't think these are really going to be actionable without that.
Still no call stacks but it's a little more actionable. Inside of validateTile we're spending about 64ms copying memory. Doing math using the memory bandwidth numbers from a flame suggests this is approximately 4kx4k of 32 bit pixels. That seems like a lot. snorp, can you figure out how many tiles we'r validating?
This was run on a Nexus 10, which has a screen size of 2560x1600. 4kx4k of tiles sounds like it could be reasonable?
It looks like most of the time we're only validating a handful of tiles, but in some cases it's over 30. Occassionally it's also over 100. So that seems somewhat unexpected.
AFAICT, tiles should always be 256x256 on Android, so that does end up being quite a bit.
Oops, the ValidateTile stuff is bug 1148131. This one is about the DrawTarget allocation in PaintThebes().
Created attachment 8586182 [details] [diff] [review] DrawTargetTiled on android You may want try to profile with this patch applied. With this, we paint directly into tiles rather than the single buffer copy. On mac, this made things faster (and that's what we currently ship). Last time I pushed this to try on fennec there was a few reftest failures so this can't land as is, but if it turns out to be a big perf it might be a good motivation to enable it on fennec (I am currently trying to get it enabled on b2g).
Things do look substantially better with this. I'll attach a profile.