Compositor will sleep for a long time during composing if we align composing and refresh driver tick with vsync

RESOLVED WORKSFORME

Status

()

RESOLVED WORKSFORME
4 years ago
4 years ago

People

(Reporter: jerry, Assigned: jerry)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(7 attachments, 2 obsolete attachments)

Created attachment 8518104 [details]
Systrace result. Use Chrome to open(keyboard w,a,s,d to move and zoom the window)

If we align refresh driver and compositor with vsync event and have one frame composing more than 16.6ms(ex. system is busy for kernel or driver stuff), we will see compositor sleep for a long time in composing after that frame.
Created attachment 8518107 [details]
The screen capture for profiler result

This picture is captured from attachment 8518104 [details]. It shows the composing time for compositor.
I might be way off here but AIUI the SwapBuffers call [1] waits for vsync, which is why you're seeing this behaviour. In your screenshot - the first composite might be a no-op, and so it doesn't do that block. But every frame after that does need to block in SwapBuffers. I also don't think this is a problem really. As long as the compositor doesn't fall behind with compositing frames it doesn't matter if it's sleeping or not.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/opengl/CompositorOGL.cpp?rev=5e5b3c89df16#1207
GLContextEGL::SwapBuffers() could block or not is implementation dependent problem. mchang analyzed about it some in Bug 1076166.
Created attachment 8518203 [details]
Zoom out for bigger range of profile data.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #2)
> I might be way off here but AIUI the SwapBuffers call [1] waits for vsync,
> which is why you're seeing this behaviour. In your screenshot - the first
> composite might be a no-op, and so it doesn't do that block. But every frame
> after that does need to block in SwapBuffers. I also don't think this is a
> problem really. As long as the compositor doesn't fall behind with
> compositing frames it doesn't matter if it's sleeping or not.
> 

If compositor spend for a long time(even though it is almost in sleep), it will make refresh driver enter throttling state. In this case, the refresh driver will skip all tick and only do the tick when it receives a "DidComposite" message from compositor at the end of compositing.

"The first composite" is not a no-op, we can see the time for each frame before this are also very short.
Assignee: nobody → mchang
Status: NEW → ASSIGNED
Created attachment 8522320 [details] [diff] [review]
change vsync-aligned compositor task scheduling timing.
If one frame costs more than 16ms and we still send another task to gpu, gpu will be blocked until next vsync(attachment 8518107 [details]). In this case, the compositor can't handle any layer update. This is not a good pattern. If we can pending the task to next sync when busy, we can reduce the compositing time for the later frame.
Created attachment 8522333 [details]
Pending compositing task to nexe vsync
Comment on attachment 8522320 [details] [diff] [review]
change vsync-aligned compositor task scheduling timing.

Please compare the attachment 8522333 [details] and attachment 8518107 [details].
If one frame is more than 16ms, we might need to defer the task to next vsync instead of composing immediately.
Attachment #8522320 - Flags: feedback?(mchang)
Attachment #8522320 - Flags: feedback?(bgirard)
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #8)
> Comment on attachment 8522320 [details] [diff] [review]
> change vsync-aligned compositor task scheduling timing.
> 
> Please compare the attachment 8522333 [details] and attachment 8518107 [details]
> [details].
> If one frame is more than 16ms, we might need to defer the task to next
                                                        ~~~~~~~~~~~
                                                      the next frame                              
> vsync instead of composing immediately.
I'm not sure we want to do this. There are cases where the composite takes longer than 16 ms for the current frame, then if we composite immediately, we composite in time to finish before the next frame. With this patch, we'd miss the next frame. I'd also like to see how often this occurs once we have the refresh driver ticking off vsync before we starting adding heuristics.
Comment on attachment 8522320 [details] [diff] [review]
change vsync-aligned compositor task scheduling timing.

Review of attachment 8522320 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/CompositorParent.cpp
@@ +279,5 @@
> +  {
> +    // We already compose one frame. Rest the mCurrentCompositeTaskMonitor to
> +    // nullptr and we can post Composite() task at next vsync.
> +    MonitorAutoLock lock(mCurrentCompositeTaskMonitor);
> +    mCurrentCompositeTask = nullptr;

We can schedule a new composite from within the composition. This patch would break that =\.
Attachment #8522320 - Flags: feedback?(bgirard)
Attachment #8522320 - Flags: feedback?(mchang)
Blocks: 1123762
I will update current profiling result soon.
Assignee: mchang → hshih
With flame v18D-1, vsync-compositor only, I can see compositor spend approximately more than 16.6 ms for one frame sometimes. In systrace, the profiler shows that compositor spends a lot of time in "hwc_prepare_primary()" and CPU is in "uninterruptible sleep" in this time. I will check if this is the same situation as https://bugzilla.mozilla.org/show_bug.cgi?id=1076166#c7 .
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #13)
> With flame v18D-1, vsync-compositor only, I can see compositor spend
> approximately more than 16.6 ms for one frame sometimes. In systrace, the
> profiler shows that compositor spends a lot of time in
> "hwc_prepare_primary()" and CPU is in "uninterruptible sleep" in this time.
> I will check if this is the same situation as
> https://bugzilla.mozilla.org/show_bug.cgi?id=1076166#c7 .

Does this still happen if you setup the triple buffers for framebuffer?
Flags: needinfo?(hshih)
(In reply to peter chang[:pchang][:peter] from comment #14)
> (In reply to Jerry Shih[:jerry] (UTC+8) from comment #13)
> > With flame v18D-1, vsync-compositor only, I can see compositor spend
> > approximately more than 16.6 ms for one frame sometimes. In systrace, the
> > profiler shows that compositor spends a lot of time in
> > "hwc_prepare_primary()" and CPU is in "uninterruptible sleep" in this time.
> > I will check if this is the same situation as
> > https://bugzilla.mozilla.org/show_bug.cgi?id=1076166#c7 .
> 
> Does this still happen if you setup the triple buffers for framebuffer?

This test is already with triple-buffer setting at [1] and [2].

[1]
http://mxr.mozilla.org/mozilla-central/source/widget/gonk/libdisplay/FramebufferSurface.cpp#41
[2]
http://mxr.mozilla.org/mozilla-central/source/widget/gonk/libdisplay/GonkDisplayJB.cpp#132
Flags: needinfo?(hshih)
Comment hidden (obsolete)
We will spend a lot of time to wait a lock.
The backtrace is just like https://bug1076166.bugzilla.mozilla.org/attachment.cgi?id=8499157

In kernel/drivers/video/msm/mdss/mdp3_ctrl.c
static int mdp3_overlay_set(struct msm_fb_data_type *mfd, struct mdp_overlay *req)
{
  ....
  mutex_lock(&mdp3_session->lock);  //<<====
  ....
  mutex_unlock(&mdp3_session->lock);

  return rc;
}

Do we have another thread to access the video device at the same time?
Flags: needinfo?(dwilson)
Created attachment 8560188 [details] [diff] [review]
Hold a composite until the next vsync if a composite takes > 15ms
Created attachment 8560189 [details]
Profile with patch applied
Created attachment 8560191 [details]
Profile on master

Profiles with this patch vs master while continuously scrolling the homescreen. Essentially, if any frame ever takes more than ~10 ms to composite on a Flame, all other composites following will have the lock contention issue as described in bug 1076166. If we hold back 1 composite until the next vblank, it's enough to recover so that our composite times go back down. I'm still not convinced we want this until we test more on devices and get a better idea of composite times. Bring this topic up in Taipei.
Flags: needinfo?(mchang)
Subjectively, while swiping up and down on the homescreen on a Flame, we get large janks that we can't recover from without this patch, or at least takes much longer to recover from.
Comment on attachment 8522320 [details] [diff] [review]
change vsync-aligned compositor task scheduling timing.

Review of attachment 8522320 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/CompositorParent.cpp
@@ +280,5 @@
> +    // We already compose one frame. Rest the mCurrentCompositeTaskMonitor to
> +    // nullptr and we can post Composite() task at next vsync.
> +    MonitorAutoLock lock(mCurrentCompositeTaskMonitor);
> +    mCurrentCompositeTask = nullptr;
> +  }

Actually I don't think it does. We schedule a new composition by setting the mNeedsComposite flag. When the next vsync occurs, it sets mCurrentCompositeTask. As long as mNeedsComposite is set to true, at the next vsync, it will post the composite task. This just won't allow a composite task to be posted until after we finish compositing.
(In reply to Mason Chang [:mchang] from comment #22)
> Comment on attachment 8522320 [details] [diff] [review]
> change vsync-aligned compositor task scheduling timing.
> 
> Review of attachment 8522320 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/ipc/CompositorParent.cpp
> @@ +280,5 @@
> > +    // We already compose one frame. Rest the mCurrentCompositeTaskMonitor to
> > +    // nullptr and we can post Composite() task at next vsync.
> > +    MonitorAutoLock lock(mCurrentCompositeTaskMonitor);
> > +    mCurrentCompositeTask = nullptr;
> > +  }
> 
> Actually I don't think it does. We schedule a new composition by setting the
> mNeedsComposite flag. When the next vsync occurs, it sets
> mCurrentCompositeTask. As long as mNeedsComposite is set to true, at the
> next vsync, it will post the composite task. This just won't allow a
> composite task to be posted until after we finish compositing.

I think attachment 8522320 [details] [diff] [review] is enough.
When we receive a vsync notification, while compositing, it will just skip the tick at [1], because mCurrentCompositeTask is not null(it implies that we have one frame more than vsync interval). This patch doesn't change mNeedsComposite flag. In next vsync, if we already finished one frame, we can post a new compositing task for next frame.

So if frame A is more than one vsync interval, the next frame B will be composed with these two following conditions.
1)frame A is finished
2)compositor receive a vsync notification
We will not have a magic number(e.g. 15ms) in patch as attachment 8560188 [details] [diff] [review].

[1]
https://hg.mozilla.org/mozilla-central/annotate/7c5f187b65bf/gfx/layers/ipc/CompositorParent.cpp#l303
(In reply to Benoit Girard (:BenWa) from comment #11)
> Comment on attachment 8522320 [details] [diff] [review]
> change vsync-aligned compositor task scheduling timing.
> 
> Review of attachment 8522320 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/ipc/CompositorParent.cpp
> @@ +279,5 @@
> > +  {
> > +    // We already compose one frame. Rest the mCurrentCompositeTaskMonitor to
> > +    // nullptr and we can post Composite() task at next vsync.
> > +    MonitorAutoLock lock(mCurrentCompositeTaskMonitor);
> > +    mCurrentCompositeTask = nullptr;
> 
> We can schedule a new composite from within the composition. This patch
> would break that =\.

Do you mean this line[1]? It will call [2] and request next composition. Is it not work?

[1]
https://hg.mozilla.org/mozilla-central/annotate/7c5f187b65bf/gfx/layers/ipc/CompositorParent.cpp#l970
[2]
https://hg.mozilla.org/mozilla-central/annotate/7c5f187b65bf/gfx/layers/ipc/CompositorParent.cpp#l274
Flags: needinfo?(bgirard)
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #23)
> (In reply to Mason Chang [:mchang] from comment #22)
> > Comment on attachment 8522320 [details] [diff] [review]
> > change vsync-aligned compositor task scheduling timing.
> > 
> > Review of attachment 8522320 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: gfx/layers/ipc/CompositorParent.cpp
> > @@ +280,5 @@
> > > +    // We already compose one frame. Rest the mCurrentCompositeTaskMonitor to
> > > +    // nullptr and we can post Composite() task at next vsync.
> > > +    MonitorAutoLock lock(mCurrentCompositeTaskMonitor);
> > > +    mCurrentCompositeTask = nullptr;
> > > +  }
> > 
> > Actually I don't think it does. We schedule a new composition by setting the
> > mNeedsComposite flag. When the next vsync occurs, it sets
> > mCurrentCompositeTask. As long as mNeedsComposite is set to true, at the
> > next vsync, it will post the composite task. This just won't allow a
> > composite task to be posted until after we finish compositing.
> 
> I think attachment 8522320 [details] [diff] [review] is enough.
> When we receive a vsync notification, while compositing, it will just skip
> the tick at [1], because mCurrentCompositeTask is not null(it implies that
> we have one frame more than vsync interval). This patch doesn't change
> mNeedsComposite flag. In next vsync, if we already finished one frame, we
> can post a new compositing task for next frame.
> 
> So if frame A is more than one vsync interval, the next frame B will be
> composed with these two following conditions.
> 1)frame A is finished
> 2)compositor receive a vsync notification
> We will not have a magic number(e.g. 15ms) in patch as attachment 8560188 [details] [diff] [review]
> [details] [diff] [review].
> 
> [1]
> https://hg.mozilla.org/mozilla-central/annotate/7c5f187b65bf/gfx/layers/ipc/
> CompositorParent.cpp#l303

Yes, I think Jerry is right here. I also think the proper thing should be to just wait until the next vsync to composite rather than try to catch up and drop the frame rate. Jerry is also right in that it is better than having a magic number.
Attachment #8560188 - Attachment is obsolete: true
Comment on attachment 8522320 [details] [diff] [review]
change vsync-aligned compositor task scheduling timing.

Review of attachment 8522320 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/CompositorParent.cpp
@@ +277,5 @@
>    }
>  
> +  {
> +    // We already compose one frame. Rest the mCurrentCompositeTaskMonitor to
> +    // nullptr and we can post Composite() task at next vsync.

Please update the comment to something more like:

"Only composite when we have finished a frame and only composite on vsync boundaries rather than try to catch up"
Hmm with the current patch, attachment 8522320 [details] [diff] [review], it doesn't seem like it's enough to just not schedule a next composite. Here's a profile while scolling the homescreen:

https://people.mozilla.org/~bgirard/cleopatra/#report=bfea3a1243668d7f338a9c7f6b946546973e0731

Once a composite becomes long, it doesn't throttle correctly, weirdo. Here's a profile with the 15 ms pause:

https://people.mozilla.org/~bgirard/cleopatra/#report=439e68e3131641a09bd5207ca518d92bf8cae2be
Flags: needinfo?(mchang)
Disregard comment 27, had a bad flash.
From looking at a couple or profiles, I'm not really sure we still want to do this. Taking a sample of ~1500 ms from each profile. Without throttling, we still hit every frame, the composite times just take longer due to sleeping. For example, looking at the frame view, we still hit every frame within ~15.5 - 17 ms. But each frame is still using a uniform vsync timestamp so APZ can still generate the correct frame animation.

With throttling, for ~1500 ms, we skip 10 frames but have much nicer composite times / composite start times. However, while scrolling the homescreen, it is much easier to notice these skipped frames, especially while a fling animation comes to a stop. It is noticeably jankier than without throttling. Which means we're still hitting the frames in time without throttling, just spending a lot of time sleeping. It also means, with throttling, we recover but only for a few frames. Within the profiles, we see that we can recover for ~10-20 frames before we spike again. 

It'd be sad if we couldn't hold onto the composite time gains, but I'm not sure the jankiness is worth it.
Created attachment 8560843 [details] [diff] [review]
Only composite when finished previous frame and on vsync tick. v1
Attachment #8522320 - Attachment is obsolete: true
I wonder if we can recover composite times with the power hint (bug1102200). Any idea Jerry?
Flags: needinfo?(hshih)
From our discussion in Taipei. Compositing ASAP lets us potentially composite an additional frame on time whereas waiting for vsync would force us to miss a frame. This already occurs on master and is noticeable to the user without heavy load. In addition, the behavior in this bug only occurs on a Flame device and we don't want to add a new heuristic specific to one device. For now, we will leave the behavior as is, where the Compositor will composite ASAP in an effort to catch up.
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WORKSFORME
I want us to have a look into this further before the work week is over.

I'll see if I can write a testcase for this.
Looks like for now we're going to go for the composite ASAP but I believe later we will revisit this decision.
Flags: needinfo?(bgirard)
Flags: needinfo?(dwilson)
Flags: needinfo?(hshih)
You need to log in before you can comment on or make changes to this bug.