Closed Bug 1123196 Opened 9 years ago Closed 9 years ago

[Task Manager] enter card view animation not smoothly sometimes

Categories

(Firefox OS Graveyard :: Gaia::System::Window Mgmt, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(blocking-b2g:2.2+, firefox36 unaffected, firefox37 wontfix, firefox38 fixed, b2g-v2.2 verified, b2g-master verified)

VERIFIED FIXED
2.2 S5 (6feb)
blocking-b2g 2.2+
Tracking Status
firefox36 --- unaffected
firefox37 --- wontfix
firefox38 --- fixed
b2g-v2.2 --- verified
b2g-master --- verified

People

(Reporter: hcheng, Assigned: birtles)

References

Details

(Keywords: regression, Whiteboard: [systemsfe])

Attachments

(4 files, 1 obsolete file)

*Description:
When I enter the card view, I want to see the animation is smooth.

*Expected results:
The current using app should shrink downscale smoothly.

*Actual results:
The current using app shrinks downscale twice.

*Reproduce rate: 50% (5/10)

*Video link: https://www.youtube.com/watch?v=YMJkQAhk8K8
 The 1st time is not smooth, and the 2nd time is smooth.
blocking-b2g: --- → 2.2?
Flags: needinfo?(sfoster)
Whiteboard: [systemsfe]
degradation in performance, definitely a blocker
blocking-b2g: 2.2? → 2.2+
It seems to me something changed recently - perhaps since or over the holidays - that has hit performance. The card view itself has not changed much since it landed in nightly back in September (1) I'd like to narrow this down - can we do a regression window? (no point in looking at 2.1 or before) I know the smoothness is a bit subjective but it definitely *feels* slower recently. 

1. https://github.com/mozilla-b2g/gaia/commit/7d988b4f7de00317c4fe95949e5f19b943bbc783
Flags: needinfo?(sfoster)
QA Contact: ychung
Mozilla-inbound Regression Window:

Last Working Environmental Variables:
Device: Flame 2.2
BuildID: 20150108144000
Gaia: 82b534791a72ad59c78363df80459e18d35b48f8
Gecko: 20a7f674d668
Version: 37.0a1 (2.2) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0

First Broken Environmental Variables:
Device: Flame 2.2
BuildID: 20150108145857
Gaia: 82b534791a72ad59c78363df80459e18d35b48f8
Gecko: 95267af607c3
Version: 37.0a1 (2.2) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0

Last Working Gaia First Broken Gecko: Issue DOES reproduce 
Gaia: 82b534791a72ad59c78363df80459e18d35b48f8
Gecko: 95267af607c3

First Broken Gaia Last Working Gecko: Issue DOES NOT reproduce
Gaia: 82b534791a72ad59c78363df80459e18d35b48f8
Gecko: 20a7f674d668

http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=20a7f674d668&tochange=95267af607c3

Possibly caused by bug 1112480
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Contact: ychung
Brian, can you take a look at this please? This may be a result of the work that was done on bug 1112480
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker) → needinfo?(bbirtles)
Sorry for the delay, I'm looking into this but currently having trouble getting a build together to reproduce it.
Flags: needinfo?(bbirtles)
Triage: Brian can we assign this to you while you investigate?
Flags: needinfo?(bbirtles)
(In reply to Ben Francis [:benfrancis] from comment #6)
> Triage: Brian can we assign this to you while you investigate?

I'll take it for now, but I won't be able to look at it properly for a few days. Currently I can't reproduce it on a trunk build.

Hermes, can you provide detailed STR?

For what it's worth, I'm using trunk (mozilla-central) gecko with trunk gaia running on a flame device.
Assignee: nobody → bbirtles
Status: NEW → ASSIGNED
Flags: needinfo?(bbirtles) → needinfo?(hcheng)
If I load a lot of apps then I see a slight flicker when entering the cards view. It's not quite the same as the video but probably the same cause. It seems the browser app and settings app display this more often. This might be easier to reproduce by reducing the memory on the flame device.
Flags: needinfo?(hcheng)
For my own notes in case I don't get back to this for a while:

The to-cardview class, sets up the clsoeAppTowardsCardView animation defined here:

https://github.com/mozilla-b2g/gaia/blob/cf23d578f68c351aee5110320f1e19b334733177/apps/system/style/window.css#L192

(One very preliminary observation is that I'm not sure why we have a 0% keyframe here. A 'to' keyframe would appear to be sufficient.)

And appears to be applied here:

https://github.com/mozilla-b2g/gaia/blob/06efefee07b484ce6968002ef2e9ffc4cce12715/apps/system/js/app_window.js#L2080

Next step is to add some logging on both the JS and C++ side to see why we appear to be restarting the animation.
Etienne, can you add anything here to help Brian? Have you seen the flicker he describes in Comment #8

Brian, we often run the Flame configured with 319MB memory when testing this kind of stuff, to emulate 256MB on a lower-resolution device
Flags: needinfo?(etienne)
For my own notes, one theory is we do something like:

1. Create the animation with null start time
2. Add it to the layer
3. Resolve the start time on the layer
4. Set the start time back in content as the pending start time
5. Complete the layer transaction and start the animation on the compositor
6. Before the next tick, re-trigger display list building and re-add the animation to the layer
7. Re-resolve the animation start time since we don't apply the new pending start time until the next tick  (this is the change in bug 1112480) so it still appears to be null.

That would explain why bug 1112480 and suggests a fairly straightforward tick where we expose the pending ready time and use that when adding animations to layers.

*However* so far my logging doesn't support this theory. It seems like instead we do 4 before 3. It may be because we have a nested iframe situation where we incorrectly assume that the target layer has been painted when it hasn't.

The next step is probably to work out why we end up initially generating 2 layer animations for the one content animation, and then why we generate a 3rd shortly after. Unfortunately I won't be able to look into that until at least next week.
(In reply to Sam Foster [:sfoster] from comment #10)
> Etienne, can you add anything here to help Brian? Have you seen the flicker
> he describes in Comment #8
> 
> Brian, we often run the Flame configured with 319MB memory when testing this
> kind of stuff, to emulate 256MB on a lower-resolution device

Thanks Sam. Setting the memory to 256Mb helps reproduce this.
(In reply to Sam Foster [:sfoster] from comment #10)
> Etienne, can you add anything here to help Brian? Have you seen the flicker
> he describes in Comment #8

Hello Brian, big fan of bug 927349 :)

I couldn't find anything horribly wrong on the gaia-side. That said 2 details could cause a glitch similar to this bug
- If, during a frame or so, the #cards-view is displayed but not the app in the process of shrinking, we will see the corresponding card element. So it looks like the card finished shrinking for an instant.
- If, during a frame or so, the #cards-view background is not displayed we will see all the apps open below (we transform:none all the app windows in card view mode to make them moz-elementable).
Note that this patch doesn't fix our glitch here (and breaks the moz-elements), but it makes it less likely to trigger a similar glitch while tweaking the cardview.


We also change the appWindow's z-index around the same time we add the animation [1], I wonder if this could cause trouble.

[1] https://github.com/mozilla-b2g/gaia/blob/d2d79194fd815b693b22dafedeb60a924ce19dd7/apps/system/style/zindex.css#L70-79
Flags: needinfo?(etienne)
Random note: I also tried using a transition instead and had the same glitch.
Thanks for the help Etienne! I'll look into this some more tomorrow.
Ok, I have a rough idea what is going on but I know very little about our layers setup so I'm going to need some help.

1. We do display list building and add the transform animation to the layer for the appWindow element

2. Inside nsDisplayList::PaintRoot we call EndTransaction on the ClientLayerManager (we're in a widget transaction at this point)

  3. Somewhere inside ClientLayerManager::EndTransactionInternal we end up triggering further display list building. I'm not sure exactly what triggers this but according to comment 13 we're using -moz-element, so perhaps that's it

  4. We end up creating another transform animation from the same source as (1) and applying that a different layer (but still one associated with the ClientLayerManager). This is probably fine, but I didn't know we did this. Perhaps -moz-element means we have multiple frames referring to the same element (and animations are stored on elements on the content side).

  5. We end up in a nested call to nsDisplayList::PaintRoot and make a new BasicLayerManager since it's not a widget transaction in this case

  6. We finish the transaction on the BasicLayerManager from 5 and hit the call to StartPendingAnimations here:
     https://dxr.mozilla.org/mozilla-central/source/layout/base/nsDisplayList.cpp?from=nsDisplayList.cpp#1680
     Basically, we think, "Great, painting has finished, let's trigger animations on the content side too"

  7. Over on the content side we resolve the start time of each pending animation using the animation ready time from the BasicLayerManager created in step 5.

  8. We do number of these calls to nsDisplayList::PaintRoot but without creating any further animations (perhaps we're doing one per card in the task manager but only one card is animated?)

9. Finally we get towards the end of ClientLayerManager::EndTransactionInternal and we trigger any pending animations. We have two: the one created in step 1 and the one created in step 4.

10. We resolve the start time of those two animations to the current time and update the animation ready time on the ClientLayerManager

11. We unwind the stack back to the original call to nsDisplayList::PaintRoot and go to resolve the start times of pending animations on the content side. Of course, the are no pending animations anymore, they got resolved back in step 7.


As a result we end up with different start times on the layer animations and the content animation. The next time we do a layer transaction we'll recreate the layer animations with the start time used on the content animation. In some cases this will be close enough that it's not noticeable. But in other cases you'll get a jump backwards because the content animation start time will be behind the original layer animation start time.

I'm not entirely sure why bug 1112480 triggered this. It's possible that something like comment 11 is also happening that is exacerbating it, however.

For example, if we are running this process multiple times per refresh driver tick, then as a result of bug 1112480, we'd update the layer animation start time each time and replace it with a much earlier content animation start time on the next tick.

I'm not sure, however, if we trigger layer transactions multiple times per tick though.

Matt, does the above make sense?

If it does then I think we need to do two things here:

a. Not starting pending animations on the content side until we finish the transaction on the ClientLayerManager. Matt, can you help me understand when we do these nested calls to nsDisplayList::PaintRoot?

  For example, would something like this make sense?

  if (document &&
      (widgetTransaction || document->IsBeingUsedAsImage()) {
    StartPendingAnimations(document, layerManager->GetAnimationReadyTime());
  }

  I'm adding the condition about IsBeingUsedAsImage() simply because I see we do that for SVG-as-image and I guess we want to trigger animations in that case.

  Or is there a better way of determining we're in a nested call to PaintRoot?

b. Expose the content animation start time we're going to apply on the next tick and use that when building layer animations.

  i.e. add something like AnimationPlayer::GetCurrentOrPendingStartTime()

That would avoid the problem in comment 11, if indeed that happens.


Adding Chris Lord and jwatt too in case they know how this setup works.
Flags: needinfo?(matt.woodrow)
Attached patch WIP patch (obsolete) — Splinter Review
After a little testing, fix a alone from comment 16 doesn't seem to fix this issue.

This patch implements both a + b (although not very efficiently).

With this patch applied I can't reproduce the restarting bug but I do occasionally see some flicker. I think the flicker may be a different issue however since I see it on other occasions too.
I'm really not sure if this patch is right. It fixes a problem, but maybe not in the right way.

I think part 1 alone, however, should fix the regression.
Attachment #8556855 - Flags: review?(matt.woodrow)
With parts 1 and 2 I think the regression is fixed. I still see some flicker when the animation start sometimes but I *think* that's a separate bug and not a regression from bug 1112480. With these patches applied I haven't seen the restarting behaviour described in comment 0.
Comment on attachment 8556855 [details] [diff] [review]
Part 2 - Don't resolve pending animations during nested calls to PaintRoot

Review of attachment 8556855 [details] [diff] [review]:
-----------------------------------------------------------------

-moz-element makes sense to be causing this. Looks like it's only SVG paint servers (which -moz-element uses), and SVG foreign object that can cause nested calls to PaintFrame.

widgetTransaction could be false when painting the root document with a drawWindow call, but I don't think that should affect this much.

::: layout/base/nsDisplayList.cpp
@@ +1677,5 @@
>    aBuilder->SetIsCompositingCheap(temp);
>    layerBuilder->DidEndTransaction();
>  
> +  if (document &&
> +      (widgetTransaction || document->IsBeingUsedAsImage())) {

I'm not sure why we need to check IsBeingUsedAsImage here. Rendering an image document should still be nested within the chrome document, and will have a widget transaction there.
Attachment #8556855 - Flags: review?(matt.woodrow) → review+
Attachment #8556854 - Flags: review?(jwatt) → review+
Hi Brian, could you please land to v2.2? Thanks.
Flags: needinfo?(bbirtles)
Comment on attachment 8556854 [details] [diff] [review]
Part 1 - While player is waiting to start, return its pending start time

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 1112480
User impact if declined: Some animations, particularly the task switching animation, can restart midway
Testing completed: Has had about 1 day bake time on m-i, 10 hours on m-c.
Risk to taking this patch (and alternatives if risky): No known side effects at this time.
String or UUID changes made by this patch: None.
Attachment #8556854 - Flags: approval-mozilla-b2g37?
Comment on attachment 8556855 [details] [diff] [review]
Part 2 - Don't resolve pending animations during nested calls to PaintRoot

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 1112480
User impact if declined: Some animations, particularly the task switching animation, can restart midway.
Testing completed: Has had about 1 day bake time on m-i, 10 hours on m-c.
Risk to taking this patch (and alternatives if risky): No known side effects at this time. We might only need part 1 in order to fix this bug, but part 2 is a minor change that probably helps.
String or UUID changes made by this patch: None.
Flags: needinfo?(bbirtles)
Attachment #8556855 - Flags: approval-mozilla-b2g37?
Attachment #8556854 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Attachment #8556855 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
This issue is verified fixed on Flame Master.

Result: The shrinking animation does not occur twice when entering card view. Repro rate: 0/30  
 
Device: Flame Master (319mb, full flash)
Build ID: 20150209010211
Gaia: 0d7b35f23402c4cb29bca6b98280fec48a196dec
Gecko: 3436787a82d0
Gonk: e7c90613521145db090dd24147afd5ceb5703190
Version: 38.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:38.0) Gecko/38.0 Firefox/38.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Adding verifyme for 2.2 verification.
Keywords: verifyme
Attached video Verify_video.mp4
The problem is verified not happen on latest Flame 2.2 build.

Steps:
1. Flash latest Flame 2.2 build.
2. Launch settings.
3. Long tap home button.
4. Tap settings card.
5. Repeat 3~4.

Actual Result:
5. Enter card view animation always be dispalyed smoothly.

Fail rate:0/50
See attachment:Verify_video.MP4

Flame 2.2 version:
Build ID               20150209002504
Gaia Revision          e827781324cbde91d2434b388f5dead3303a85ee
Gaia Date              2015-02-06 20:54:14
Gecko Revision         https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/0552759956d3
Gecko Version          37.0a2
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20150209.040038
Firmware Date          Mon Feb  9 04:00:51 EST 2015
Bootloader             L1TC000118D0
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][MGSEI-Triage+]
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: