610727 - Panning around canvas on crash-stats.mozilla.org is laggy without GL compositing

Reporter

Description

•

14 years ago

Go to a product summary page on crash-stats.  Notice that when panning around while the canvas element is in view, the panning is laggy.  Notice that when it is out of view, the panning is smooth as butter.  I've seen this on both my n900 and Galaxy S.

Martijn Wargers (dead)

Updated

•

14 years ago

tracking-fennec: --- → ?

Keywords: perf

OS: Linux → All

Hardware: x86 → All

Doug Turner (:dougt)

Comment 1

•

14 years ago

jdm, any idea what is going on?

Alon Zakai (:azakai)

Assignee

Updated

•

14 years ago

Assignee: nobody → azakai

Alon Zakai (:azakai)

Assignee

Comment 2

•

14 years ago

Some data, comparing 3 seconds of panning over the canvas to not panning, on a fast laptop:

Profiling IPC messages shows

PBrowserChild|AM:Content:SetCacheViewpor               total latency: 0.0000    total processing: 0.1079    total: 0:137:137
                 PLayersChild|Msg_Update [SYNC/RPC]    total latency: 0.2097    total processing: 0.0276    total: 70:75:75

Neither message is noticeable in profiles from without panning.

oprofile shows:

2335      1.7176  libxul.so                pixman_blt_sse2
2189      1.6102  libxul.so                js::Interpret(JSContext*, JSStackFrame*, unsigned int, JSInterpMode)

The former is not noticeable without panning; the latter is half as big without panning.

Taking the IPC profiling and oprofile stats together, and considering that this is on a powerful laptop and not a device, I'd say they might explain the issue. We apparently

1. Spend a significant amount of time in synchronous PLayersChild|Msg_Update messages that cause latency
2. Send dozens of Content:SetCacheViewport messages per second, that take a significant amount of time to process

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 3

•

14 years ago

It looks like the canvas there has an alpha channel, which is expensive to composite on the CPU.  GL should make this go away, but another possibility is fixing the TODO here

  virtual LayerState GetLayerState(nsDisplayListBuilder* aBuilder,
                                   LayerManager* aManager)
  {
    // XXX we should have some kind of activity timeout here so that
    // inactive canvases can be composited into the background
    return mozilla::LAYER_ACTIVE;
  }

Assignee: azakai → nobody

Component: General → Canvas: 2D

Product: Fennec → Core

QA Contact: general → canvas.2d

Brad Lassey [:blassey] (use needinfo?)

Updated

•

14 years ago

Assignee: nobody → azakai

tracking-fennec: ? → 2.0+

Alon Zakai (:azakai)

Assignee

Comment 4

•

14 years ago

So, what should we do here - fix that TODO, or wait for GL?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 5

•

14 years ago

If you've got the time, it'd be great to apply the GL patches and check if they indeed speed up compositing here.

Mark Finkle (:mfinkle) (use needinfo?)

Comment 6

•

14 years ago

(In reply to comment #4)
> So, what should we do here - fix that TODO, or wait for GL?

Alon - Front-end could explore reducing the Content:SetCacheViewport messages too

Alon Zakai (:azakai)

Assignee

Comment 7

•

14 years ago

(In reply to comment #5)
> If you've got the time, it'd be great to apply the GL patches and check if they
> indeed speed up compositing here.

I tried to test with hardware acceleration, however it crashes (filed bug 617498), so I cannot tell if that will help here or not.

Alon Zakai (:azakai)

Assignee

Comment 8

•

14 years ago

Crashes and lockups are frequent with GL turned on on that page, but sometimes it works long enough to see that it is much, much smoother with GL (on a Galaxy S).

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 9

•

14 years ago

\o/

Doug Turner (:dougt)

Comment 10

•

14 years ago

has this gotten better?  WFM?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 11

•

14 years ago

GL compositing makes it go away.  Probably worth leaving open to explore alternatives if we end up disabling GL on important HW.

Brad Lassey [:blassey] (use needinfo?)

Updated

•

14 years ago

Summary: Panning around canvas on crash-stats.mozilla.org is laggy → Panning around canvas on crash-stats.mozilla.org is laggy without GL compositing

Alon Zakai (:azakai)

Assignee

Comment 12

•

14 years ago

Removing assignee as there is currently no work to be done in this bug.

Assignee: azakai → nobody

Brad Lassey [:blassey] (use needinfo?)

Comment 13

•

13 years ago

Its looking like open gl compositing will not be enabled on at least a subset of devices. Alon, do you want to pick this back up?

Alon Zakai (:azakai)

Assignee

Comment 14

•

13 years ago

Hmm, I am not clear on what should be done here. I assume comment #3 is intended? cjones, can you please elaborate?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 15

•

13 years ago

Yes, when we use CPU compositing in chrome our easiest bet is probably to do comment 3 in the content process.  In fennec, that will slow down content re-rendering a bit at the expense of faster panning in the common case, which seems like a win.

Alon Zakai (:azakai)

Assignee

Comment 16

•

13 years ago

Are we sure that that would improve things here? Just switching mozilla::LAYER_ACTIVE to mozilla::LAYER_INACTIVE as a test doesn't seem to have any effect.

Matt Woodrow (:mattwoodrow)

Comment 17

•

13 years ago

You can check the layer tree output by setting gDumpPaintList = 1 in a debug build. With layers set to inactive, the CanvasLayer entry should disappear.

Note that this will break reftests (functionally, rather than a visible failure) that rely on canvas being an always active layer.

Alon Zakai (:azakai)

Assignee

Comment 18

•

13 years ago

Thanks for the info!

Ok, I verified that indeed all CanvasLayers vanish when I return INACTIVE (but do appear when the original ACTIVE is returned). As mentioned before, no noticeable performance difference (or any other visual difference), but I'll do some more tests.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 19

•

13 years ago

(In reply to comment #17)
> Note that this will break reftests (functionally, rather than a visible
> failure) that rely on canvas being an always active layer.

Hmm, how so?  Example?

(In reply to comment #18)
> Ok, I verified that indeed all CanvasLayers vanish when I return INACTIVE (but
> do appear when the original ACTIVE is returned). As mentioned before, no
> noticeable performance difference (or any other visual difference), but I'll do
> some more tests.

Your patch (probably not quite what we want; prolly want to make INACTIVE off a timeout) would move the <canvas>->page compositing into the content process, where it would happen exactly once per PLayers:Update.  Currently compositing happens on each chrome-process repaint.  If the number of repaints are >> PLayers:Update, e.g. if you're panning slowly up/down around the canvas, your patch ought to be a clear qualitative win.  If the ratio is closer to 1:1, we won't win anything.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 20

•

13 years ago

(And in fact could lose if we would have been able to apply a clip in the chrome process, to skip some compositing.)

Alon Zakai (:azakai)

Assignee

Comment 21

•

13 years ago

Ok, did some more testing. Using INACTIVE definitely has a positive effect in many cases, but it might not be very noticeable depending on zoom etc. (which is why I didn't notice it before).

Assignee: nobody → azakai

Alon Zakai (:azakai)

Assignee

Comment 22

•

13 years ago

cjones, can you please clarify some things for me?

Other classes that decide between ACTIVE/INACTIVE - nsDisplayVideo and nsDisplayOpacity - appear to have natural ways to find out if their content is active or not. Not clear to me what makes sense with canvas. My initial guess is something to do with when a reflow/invalidation occurs, but I'm not sure if that makes sense - won't the canvas be already inactive when the reflow completes (so no need to do a timeout for later)? Or is the idea to wait in anticipation of possible additional reflows happening soon after? Or are my guesses completely off the mark? ;)

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 23

•

13 years ago

Vlad would have the best feel for heuristics, so he should have the final say.

For GL we don't need to worry about this, it would be a perf hit. (Might be a win for battery life, but we don't really have the tools to test that yet.) If

http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ShadowLayers.h#318

says anything other than LAYERS_BASIC, we don't need to bother with this ACTIVE/INACTIVE code. The same is true with single-process rendering.

There's not really a clean way to determine ACTIVE/INACTIVE for <canvas>, AFAIK. A heuristic that might be good enough for 4.0 could be
- ACTIVE unless chrome-process layer manager is LAYERS_BASIC, in which case
- ACTIVE if the <canvas> is opaque (and so compositing is cheap)
- INACTIVE otherwise

I'm starting to reconsider comment 19; with the above, we'd trade off possibly unnecessary compositing ops in the content process (say if the <canvas> layer ended up off-screen in chrome) for *not* doing an unbounded number of unnecessary compositing ops in the chrome process if the <canvas> layer were on-screen. This keeps the possibly-wasted work in the content process.

We could get slightly fancier by amending "INACTIVE otherwise" above to
- ACTIVE unless the canvas has been invalidated within the last k milliseconds*
- INACTIVE otherwise

* probably determined near here http://mxr.mozilla.org/mozilla-central/source/content/canvas/src/nsCanvasRenderingContext2D.cpp#4061

That's similar to what we do for scrollable subframes, and could be a win on desktop too. Not sure what the timeout would be ... 100ms? 200ms? The important thing is not to waste composite-into-background ops for active animations.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 24

•

13 years ago

If we were doing to make canvases INACTIVE, I would use a heuristic of the form "make the canvas active if it's been updated within the last NNN seconds".

If someone's updating the canvas a lot (and not changing other content on the page), then making it an active layer should be a performance improvement because it avoids the cost of redrawing the content in the ThebesLayer(s) around it.

Alon Zakai (:azakai)

Assignee

Comment 25

•

13 years ago

>  - ACTIVE unless chrome-process layer manager is ...
                   ~~~~~~~~~~~~~~

A question about that: Even when running a remote page in Fennec, the call to GetLayerState (where we need to decide active/inactive) happens in the parent process during panning (it happens in the child during initial load). So I do not understand what kind of check you are suggesting to do, regarding which process is being run (as the check during panning always happens in the parent).

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 26

•

13 years ago

Which GetLayerState call?  For <canvas> frames?

Alon Zakai (:azakai)

Assignee

Comment 27

•

13 years ago

Yes, in nsDisplayCanvas.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 28

•

13 years ago

Those are probably the UI's <browser> thumbnails, which we don't really care about since they're small and opaque.  But comment 23 and comment 24 apply equally well to those.

"chrome-process layer manager" should really have been, "*compositing* layer manager".  For the content process, the eventual compositor will be the chrome-process's manager.

Alon Zakai (:azakai)

Assignee

Comment 29

•

13 years ago

Attached patch wip patch (obsolete) — Details — Splinter Review

This is what I have so far. Works nicely on device.

cjones, I don't understand your last comment, and still not clear to me what kind of process-related check you are suggesting here - what exactly should be checked for, and where?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 30

•

13 years ago

Content processes always use basic layer managers.  "Shadow layers" of the layers created by the content-process managers are hooked into the chrome-process layer manager.  The chrome process layer manager can by any type.  The chrome-process layer manager is the one that eventually composites content-process <canvas> shadow layers onto fennec's window.

In your patch, the check for basic managers is always going to be true in the content process, even if we're using GL layers in chrome.  In that case, composite-into-background doesn't necessarily help perf, and can hurt it.

Instead, we'd want to check a layer-manager interface like MaybeCompositeCanvasLayersIntoBackground().  In content processes, that interface would use

http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ShadowLayers.h#318 

, i.e. the compositing manager's type, instead of the content-process manager's type, which is always LAYERS_BASIC.

(You didn't request feedback on this patch, but I think you want the inactivity check to be near

http://mxr.mozilla.org/mozilla-central/source/content/canvas/src/nsCanvasRenderingContext2D.cpp#4061

rather the checking the frame's reflow time.  To my knowledge, <canvas> reflow and web content drawing to the canvas are entirely separate beasts, and we only care about the latter for composite-into-background.  Vlad would know more.)

Alon Zakai (:azakai)

Assignee

Comment 31

•

13 years ago

Attached patch wip patch (obsolete) — Details — Splinter Review

Thanks for the explanation!

Vlad, I marked as asking for feedback, specifically for the issue cjones mentioned you would know about, in the previous comment.

Unrelated question (for anyone): I have

  layers.acceleration.disabled = true
  layers-acceleration-force-enable = false

but I do get some layers as being LAYERS_OPENGL for their parent backend type. Is this expected?

Attachment #505272 - Attachment is obsolete: true

Attachment #505481 - Flags: feedback?(vladimir)

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Comment 32

•

13 years ago

Indeed, canvas reflow is totally unrelated to the canvas layer or drawing into the canvas.  Only drawing into the canvas should trigger the inactive/active stuff; using the last reflow time for this isn't useful.  (Think of video -- video is marked as "active" while it's playing, that is while the content is changing, and inactive when paused.)  Instead, you want to use something like the last update time on the canvas layer, if possible.

You can't quite use last invalidation time, because if we ever invalidate the entire canvas, we'll stop calling invalidate again until we actually paint.  So you could get into the scenario of someone doing some really expensive canvas drawing that takes a long time; at some point we'd invalidate the entire canvas and stop calling invalidate again, so the "time since last invalidate" would be very large by the time we got to layers, even though we're actively painting.  But maybe that's a pathological case anyhow, since you can get into the same situation if you use last updated time...

Note that as far as I know, the inactivity bits are an optimization primarily for basic layers.  For other layer types, the relevant composite should be quite fast, and the only optimization we'd gain is somewhat reduced memory usage.

Matt Woodrow (:mattwoodrow)

Comment 33

•

13 years ago

Examples of reftests relying on canvases being active is layout/bugs/602200-**.html tests.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 34

•

13 years ago

I think those tests will be OK as long as canvases become inactive a few seconds after the last drawing activity.

Alon Zakai (:azakai)

Assignee

Comment 35

•

13 years ago

Is AreLayersMarkedActive relevant here?

http://mxr.mozilla.org/mozilla-central/source/layout/base/nsDisplayList.cpp#1567

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 36

•

13 years ago

Yes, that's the right thing. You can call nsIFrame::MarkLayersActive to mark the canvas frame as active after it's been painted. It will automatically be marked inactive soon after the last MarkActive (currently 300-400ms, contrary to comments, although we might reduce that to 75-100ms).

Alon Zakai (:azakai)

Assignee

Comment 37

•

13 years ago

Attached patch wip patch (obsolete) — Details — Splinter Review

Hmm, I still don't understand the big picture here. I can't seem, for example, to find a way to access the relevant Layer in GetLayerState() (to find out if it was recently updated), nor to access the relevant Frame from ShadowLayers (to call MarkLayersActive).

So, this patch is the best I can think of so far. It adds a lastUpdatedTime to ShadowLayer, and caches that when calling BuildLayer, then reads it in GetLayerState.

Is there a better way?

Attachment #505481 - Attachment is obsolete: true

Attachment #507657 - Flags: feedback?

Attachment #505481 - Flags: feedback?(vladimir)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 38

•

13 years ago

See nsDisplayOpacity::GetLayerState, which just does
  if (mFrame->AreLayersMarkedActive())
    return LAYER_ACTIVE;

Instead of accessing the frame from ShadowLayers, mark the frame active from the canvas code (nsCanvasRenderingContext2D::GetCanvasLayer and WebGLContext::GetCanvasLayer).

Alon Zakai (:azakai)

Assignee

Comment 39

•

13 years ago

Attached patch patch (obsolete) — Details — Splinter Review

Thanks, here is a much nicer and shorter patch.

Attachment #507657 - Attachment is obsolete: true

Attachment #507998 - Flags: review?(roc)

Attachment #507657 - Flags: feedback?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 40

•

13 years ago

-    // XXX we should have some kind of activity timeout here so that
-    // inactive canvases can be composited into the background
-    return mozilla::LAYER_ACTIVE;
+    if (XRE_GetProcessType() == GeckoProcessType_Default ||
+        static_cast<BasicShadowLayerManager*>(aManager)
+          ->GetParentBackendType() != LayerManager::LAYERS_BASIC)
+      return mozilla::LAYER_ACTIVE;
+
+    return mFrame->AreLayersMarkedActive() ? LAYER_ACTIVE : LAYER_INACTIVE;

I would like to be consistent here across products and processes. I don't think this should depend on the process type.

Also, I don't think we should just check for LAYERS_BASIC here. It could be a ShadowLayerManager where the shadows get accelerated rendering, in which case we probably want to always be active. How about adding a method LayerManager::IsCompositingCheap() which we call here and which is true for everything except BasicLayerManager? Then we can easily change it to return the right thing when we have accelerated shadows.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 41

•

13 years ago

See comment 30 for a way to implement that.

Alon Zakai (:azakai)

Assignee

Comment 42

•

13 years ago

Not sure I understand. So, LayerManager::IsCompositingCheap would return true for everything but BasicLayerManager and BasicShadowLayerManager? BasicLayerManager will return false, and BasicShadowLayerManager will need to do an IPC call to call IsCompositingCheap in the parent?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 43

•

13 years ago

LayerManager::IsCompositingCheap() {
  return LAYERS_BASIC != GetBackendType();
}

overridden for basic layers by

BasicShadowLayerManager::IsCompositingCheap() {
  return !mShadowManager ? LayerManager::IsCompositingCheap() :
                           LAYERS_BASIC != mShadowManager->GetParentBackendType();
}

Alon Zakai (:azakai)

Assignee

Comment 44

•

13 years ago

Attached patch patch (obsolete) — Details — Splinter Review

Updated patch with |IsCompositingCheap|.

Attachment #507998 - Attachment is obsolete: true

Attachment #508541 - Flags: review?(roc)

Attachment #507998 - Flags: review?(roc)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 45

•

13 years ago

+  virtual PRBool IsCompositingCheap()
+  {
+    return IsCompositingCheap(GetBackendType());
+  }

Make this just return true, and override in BasicLayerManager to return false. No need for the virtual call of GetBackendType.

+  // Whether compositing is cheap depends on the parent backend.
+  return LayerManager::IsCompositingCheap(mShadowManager ?
+    GetParentBackendType() : GetBackendType());

So this would just be return mShadowManager && IsCompositingCheap(GetParentBackendType());

Alon Zakai (:azakai)

Assignee

Comment 46

•

13 years ago

Attached patch patch (obsolete) — Details — Splinter Review

Patch with those fixes.

Attachment #508541 - Attachment is obsolete: true

Attachment #508642 - Flags: review?(roc)

Attachment #508541 - Flags: review?(roc)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 47

•

13 years ago

Comment on attachment 508642 [details] [diff] [review]
patch

We'd better land this for the beta, it may affect desktop Firefox performance.

Attachment #508642 - Flags: review?(roc) → review+

Alon Zakai (:azakai)

Assignee

Comment 48

•

13 years ago

Sending to try, will push to m-c if all is well.

Alon Zakai (:azakai)

Assignee

Comment 49

•

13 years ago

(In reply to comment #33)
> Examples of reftests relying on canvases being active is
> layout/bugs/602200-**.html tests.

Early results from try indeed show a failure on 602200-4.html.

Do we want to keep the old behavior (that is, limit the new behavior only to child processes), or do we want to investigate the failure and possibly change the test?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 50

•

13 years ago

I think what's happening is that we fill in the canvas before the canvas frame is created, so we never get around to marking the layer active.

The conservative way around this would be to mark the canvas frame active when it's created.

Alon Zakai (:azakai)

Assignee

Comment 51

•

13 years ago

Attached patch patch — Details — Splinter Review

Great, looks like that fixes the problems on try.

Did I put the call to MarkLayersActive() in the right place?

Attachment #508642 - Attachment is obsolete: true

Attachment #509137 - Flags: review?(roc)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 52

•

13 years ago

Comment on attachment 509137 [details] [diff] [review]
patch

+  // We can fill in the canvas before the canvas frame is created, in
+  // which case we never get around to marking the layer active. Therefore,
+  // we mark it active here when we create the frame.
+  MarkLayersActive();

That's probably OK in the constructor but doing it in nsHTMLCanvasFrame::Init would be better. r+ with that.

Attachment #509137 - Flags: review?(roc) → review+

Alon Zakai (:azakai)

Assignee

Comment 53

•

13 years ago

Pushed: http://hg.mozilla.org/mozilla-central/rev/dc85ecc5f1dc

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Boris Zbarsky [:bzbarsky]

Updated

•

11 years ago

Depends on: 852803

Timothy Nikkel (:tnikkel)

Updated

•

11 years ago

No longer depends on: 852803

wip patch 13 years ago Alon Zakai (:azakai) 3.42 KB, patch		Details \| Diff \| Splinter Review
wip patch 13 years ago Alon Zakai (:azakai) 3.68 KB, patch		Details \| Diff \| Splinter Review
wip patch 13 years ago Alon Zakai (:azakai) 4.27 KB, patch		Details \| Diff \| Splinter Review
patch 13 years ago Alon Zakai (:azakai) 3.30 KB, patch		Details \| Diff \| Splinter Review
patch 13 years ago Alon Zakai (:azakai) 6.41 KB, patch		Details \| Diff \| Splinter Review
patch 13 years ago Alon Zakai (:azakai) 6.84 KB, patch	roc : review+	Details \| Diff \| Splinter Review
patch 13 years ago Alon Zakai (:azakai) 8.25 KB, patch	roc : review+	Details \| Diff \| Splinter Review