This is a follow-up to the crash issue in , and could also be related to another slowness bug .

The issue here is to investigate where we are spending time, and what can be done to improve the framerate.

Gecko profile:

Looks like the Renderer thread is the bottleneck here, spending hundreds of milliseconds per frame on compositing. In particular, 1/3 of the thread time is spent in the init_fbos() call. This is unexpected: we should preserve the FBOs across frames and only create new ones when necessary.
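To illustrate what "preserve the FBOs across frames" could look like, here is a minimal sketch of a cache keyed by the texture and layer an FBO is attached to. The names `FboKey`, `FboId` and `FboCache` are hypothetical and not taken from the renderer; the GL call that would create the framebuffer is replaced by a counter so the sketch stays self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical key: the texture and layer an FBO is attached to.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct FboKey {
    pub texture_id: u32,
    pub layer: u32,
}

/// Stand-in for a GL framebuffer object name.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct FboId(pub u32);

/// Keeps FBOs alive across frames instead of rebuilding them every frame.
#[derive(Default)]
pub struct FboCache {
    fbos: HashMap<FboKey, FboId>,
    next_id: u32,
    /// Number of FBOs actually created; handy for checking reuse.
    pub created: usize,
}

impl FboCache {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns the cached FBO for `key`, creating one only on a miss.
    pub fn get_or_create(&mut self, key: FboKey) -> FboId {
        if let Some(&id) = self.fbos.get(&key) {
            return id;
        }
        // Assumption: a real renderer would ask the GL device for a
        // framebuffer here and attach the texture layer to it.
        self.next_id += 1;
        self.created += 1;
        let id = FboId(self.next_id);
        self.fbos.insert(key, id);
        id
    }

    /// Drops the FBOs of a texture that was freed or resized.
    /// Returns how many entries were removed.
    pub fn invalidate_texture(&mut self, texture_id: u32) -> usize {
        let before = self.fbos.len();
        self.fbos.retain(|key, _| key.texture_id != texture_id);
        before - self.fbos.len()
    }
}
```

With something like this, a frame that binds the same render targets as the previous one would not create any FBOs at all; creation only happens when a texture is new, or after `invalidate_texture` has dropped its entries.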

A GPU capture shows us making 355 draw calls, which is more than we should aim for (up to 100, ideally). A few things contribute to this:

  • all the blits necessary for picture caching. If caching is unfeasible here, we should improve the heuristic that disables it, so that the blits can be avoided.
  • there are a lot of draws of transformed primitives into scissored areas. There are ways to limit the transformed rendering to specific areas without breaking batches, and we should look more into them.
  • the frame is split across 7 passes, which seems a bit high. We need to investigate whether the task graph is deeper than it needs to be.
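One way to check whether 7 passes is more than the graph needs is to compare it with the longest dependency chain among the render tasks, since that chain sets the minimum number of passes. A rough sketch, assuming a simplified `Task` type (hypothetical, not the actual task graph representation) where each task lists the earlier tasks it reads from:

```rust
/// Hypothetical render task: `deps` holds the indices of the tasks
/// whose output this task reads.
pub struct Task {
    pub deps: Vec<usize>,
}

/// Returns the minimum number of passes, i.e. the length of the longest
/// dependency chain. Assumes tasks are listed so that every dependency
/// has a smaller index than the task that uses it.
pub fn pass_count(tasks: &[Task]) -> usize {
    let mut pass_of = vec![0usize; tasks.len()];
    for (i, task) in tasks.iter().enumerate() {
        // A task runs one pass after its deepest dependency.
        pass_of[i] = task
            .deps
            .iter()
            .map(|&d| {
                assert!(d < i, "deps must reference earlier tasks");
                pass_of[d] + 1
            })
            .max()
            .unwrap_or(0);
    }
    // Pass indices are zero-based, so the count is the largest index plus one.
    pass_of.iter().map(|&p| p + 1).max().unwrap_or(0)
}
```

If the number reported here is well below what the capture shows, the extra passes come from how tasks are assigned to passes rather than from real dependencies.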

Apparently, the blits are not due to picture caching, as Glenn noted. They are due to our slow handling of mix-blend modes. See also -

I tested this again with the WIP mix-blend rewrite: it feels a bit smoother, but it's far from fixed. I believe the issues I listed in Comment 1 are still causing the slowness.

Use natively supported mix-blend modes, where appropriate.
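The idea can be sketched as a per-primitive decision between a native blend and the slow backdrop copy. This is only an illustration: `MixBlendMode`, `BlendPath` and `choose_blend_path` are hypothetical names, the enum lists just a few modes, and the set of natively supported modes is assumed to be supplied by the caller (for example from a device capability check such as the GL_KHR_blend_equation_advanced extension).

```rust
/// A few CSS mix-blend modes; the real list is longer.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum MixBlendMode {
    Normal,
    Multiply,
    Screen,
    Overlay,
    Difference,
    Hue,
}

/// How a primitive with a given blend mode gets drawn.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum BlendPath {
    /// Plain alpha blending, no special handling.
    Alpha,
    /// Blend in hardware, without copying the backdrop.
    Native,
    /// Copy the backdrop and blend in a shader (the slow path).
    Readback,
}

/// Picks the cheapest path for `mode`. `native_modes` is an assumption:
/// the caller decides which modes the device can blend natively.
pub fn choose_blend_path(mode: MixBlendMode, native_modes: &[MixBlendMode]) -> BlendPath {
    if mode == MixBlendMode::Normal {
        BlendPath::Alpha
    } else if native_modes.contains(&mode) {
        BlendPath::Native
    } else {
        BlendPath::Readback
    }
}
```

Only the modes that end up on the `Readback` path still need the extra blits.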

