Closed Bug 1219145 Opened 5 years ago Closed 4 years ago

Detect whether we are currently blocking an animation

Categories

(Toolkit :: Performance Monitoring, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla45
Tracking Status
firefox45 --- fixed

People

(Reporter: Yoric, Assigned: Yoric)

References

Details

Attachments

(2 files)

No description provided.
This, along with bug 1219150, would let us considerably reduce false-positives for slow add-on detection. Nical, is there a way to be informed when we are preventing a refresh?
Flags: needinfo?(nical.bugzilla)
Redirecting to jwatt.
Flags: needinfo?(nical.bugzilla) → needinfo?(jwatt)
See Also: → 1178972
Bug 1219145 - nsRefreshDriver::IsJankCritical();r?jwatt

To refine its alerts, Performance Stats API needs to be able to know whether a long-running operation is actually causing user-visible jank in the current process. This patch introduces a trivial API that lets clients ask the refresh driver whether any kind of animation is ongoing.
Attachment #8681259 - Flags: review?(jwatt)
If I understand correctly the refresh driver, the above implementation could work.
Assignee: nobody → dteller
Attachment #8681259 - Flags: feedback?(dbaron)
I think you need a counter for "number of refresh drivers currently listening for vsync", otherwise refresh drivers will stomp on each other - one will set sIsJankCritical to false even if there's another that's still listening.
@mstange It's updated by StartTimer()/StopTimer(), which themselves are only fired when we switch between 0 and 1 refresh drivers, so I think we should be ok on that.
I see, ok.
Comment on attachment 8681259 [details]
MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro

I think it would be better for Brian to review this since I haven't really touched the refresh driver code. If David gets to the f? beforehand then maybe he can r+ too.
Flags: needinfo?(jwatt)
Attachment #8681259 - Flags: review?(jwatt) → review?(bbirtles)
This patch considers animations running on compositor 'jank' too because animations on compositor need refresh driver's tick. Is this what you expected?

If you want to detect animations which are consuming the main thread, I think the right place is after mViewManagerFlushIsPending check[1] inside nsRefreshDriver::Tick.

[1] http://hg.mozilla.org/mozilla-central/file/1596ab6985cb/layout/base/nsRefreshDriver.cpp#l1708

mViewManagerFlushIsPending is set to true only for root refresh driver, i.e. chrome refresh driver on non-E10S, chrome and content refresh drivers on E10S. So on E10S we need to tweak (pass the jank state to chrome refresh driver?) to know the 'Jank' state at a time.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #9)
> This patch considers animations running on compositor 'jank' too because
> animations on compositor need refresh driver's tick. Is this what you
> expected?

No, you're right, I'm only interested in the main thread of the current process.

> If you want to detect animations which are consuming the main thread, I
> think the right place is after mViewManagerFlushIsPending check[1] inside
> nsRefreshDriver::Tick.
> 
> [1]
> http://hg.mozilla.org/mozilla-central/file/1596ab6985cb/layout/base/
> nsRefreshDriver.cpp#l1708
> 
> mViewManagerFlushIsPending is set to true only for root refresh driver, i.e.
> chrome refresh driver on non-E10S, chrome and content refresh drivers on
> E10S. So on E10S we need to tweak (pass the jank state to chrome refresh
> driver?) to know the 'Jank' state at a time.

Well, I'm not interested in the Tick() itself, but rather in the fact that there will be a Tick consumed by the main thread. Is there a way to determine from the timer that there is at least one main thread consumer?
Or should I set `IsJankCritical()` to true iff at least one refresh driver has `mViewManagerFlushIsPending` set to true?
Flags: needinfo?(hiikezoe)
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #10)

> Well, I'm not interested in the Tick() itself, but rather in the fact that
> there will be a Tick consumed by the main thread. Is there a way to
> determine from the timer that there is at least one main thread consumer?

An important thing is that animations on compositor need a refresh driver on *main-thread*.
So there is no way to detect such stats from *the timer*.

(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #11)
> Or should I set `IsJankCritical()` to true iff at least one refresh driver
> has `mViewManagerFlushIsPending` set to true?

I don't think it works as expected.  IIRC, the mViewManagerFlushIsPending is set to true only during nsRefreshDriver::Tick(). 

Another thing I am concerned is that many of callback functions in JS are processed regardless of the refresh driver.  For example:

setTimeout(function() {
  while (true) {} //
}, 100);

In above code, refresh driver does not work at all.

As far as I can tell, Frame Timing API[1] is very similar to what you want to do, but it's a per-document.

[1] http://w3c.github.io/frame-timing/
Flags: needinfo?(hiikezoe)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #12)
> An important thing is that animations on compositor need a refresh driver on
> *main-thread*.
> So there is no way to detect such stats from *the timer*.

Fair enough. For a v1, I'm willing to go with an over-approximation that says that we are in a jank-critical section even if the main thread is actually not involved in the animation. I'll try and improve in a followup.

> (In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment
> Another thing I am concerned is that many of callback functions in JS are
> processed regardless of the refresh driver.  For example:
> 
> setTimeout(function() {
>   while (true) {} //
> }, 100);
> 
> In above code, refresh driver does not work at all.

Well, if an animation is in progress, the timer must be on, right?


> As far as I can tell, Frame Timing API[1] is very similar to what you want
> to do, but it's a per-document.
> 
> [1] http://w3c.github.io/frame-timing/

Indeed, this looks very similar. We'll see how the proposal evolves.
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #13)
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #12)
> > An important thing is that animations on compositor need a refresh driver on
> > *main-thread*.
> > So there is no way to detect such stats from *the timer*.
> 
> Fair enough. For a v1, I'm willing to go with an over-approximation that
> says that we are in a jank-critical section even if the main thread is
> actually not involved in the animation. I'll try and improve in a followup.

Then, I can provide a simple solution that reports the jank section from refresh driver tick..

> > (In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment
> > Another thing I am concerned is that many of callback functions in JS are
> > processed regardless of the refresh driver.  For example:
> > 
> > setTimeout(function() {
> >   while (true) {} //
> > }, 100);
> > 
> > In above code, refresh driver does not work at all.
> 
> Well, if an animation is in progress, the timer must be on, right?

Yes, right.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #14)
> (In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment
> #13)
> > (In reply to Hiroyuki Ikezoe (:hiro) from comment #12)
> > > An important thing is that animations on compositor need a refresh driver on
> > > *main-thread*.
> > > So there is no way to detect such stats from *the timer*.
> > 
> > Fair enough. For a v1, I'm willing to go with an over-approximation that
> > says that we are in a jank-critical section even if the main thread is
> > actually not involved in the animation. I'll try and improve in a followup.
> 
> Then, I can provide a simple solution that reports the jank section from
> refresh driver tick..

Ah, you can report the jank from RefreshDriverTimer.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #15)
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #14)
> > (In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment
> > #13)
> > > (In reply to Hiroyuki Ikezoe (:hiro) from comment #12)
> > > > An important thing is that animations on compositor need a refresh driver on
> > > > *main-thread*.
> > > > So there is no way to detect such stats from *the timer*.
> > > 
> > > Fair enough. For a v1, I'm willing to go with an over-approximation that
> > > says that we are in a jank-critical section even if the main thread is
> > > actually not involved in the animation. I'll try and improve in a followup.
> > 
> > Then, I can provide a simple solution that reports the jank section from
> > refresh driver tick..
> 
> Ah, you can report the jank from RefreshDriverTimer.

I might be wrong. On E10S if the main-thread of content process is busy and the main-thread of chrome process is NOT busy, the RefreshDriverTimer may not aware the busyness of the content process....
(In reply to Hiroyuki Ikezoe (:hiro) from comment #16)
> I might be wrong. On E10S if the main-thread of content process is busy and
> the main-thread of chrome process is NOT busy, the RefreshDriverTimer may
> not aware the busyness of the content process....

The relationship between content and parent timers are not clear to me. Since I am measuring and reporting each process independently anyway, I am only ever interested in the current process.
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #17)
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #16)
> > I might be wrong. On E10S if the main-thread of content process is busy and
> > the main-thread of chrome process is NOT busy, the RefreshDriverTimer may
> > not aware the busyness of the content process....
> 
> The relationship between content and parent timers are not clear to me.
> Since I am measuring and reporting each process independently anyway, I am
> only ever interested in the current process.

On E10S, as you may already know, RefreshDriverTimer on chrome != RefreshDriverTimer on content. 
It should not be a problem if you do not need to get the jank stats on both of chrome and content processes at once.
Comment on attachment 8681259 [details]
MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro

Giving this to hiro since he has been looking into this.
Attachment #8681259 - Flags: review?(bbirtles) → review?(hiikezoe)
Attachment #8681259 - Flags: review?(hiikezoe)
Comment on attachment 8681259 [details]
MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro

https://reviewboard.mozilla.org/r/23789/#review21325

As I commented in bug 1219145 comment 9, animations on compositor also need refresh driver. You should distinguish the case.
For reference, here is a rough sketch what I had in mind in comment 14.
We can do similar thing in RefreshDriverTimer. But after I commented, I realized this is not what :Yoric wants to do because this patch sends a jank *after* the jank state happened.
Attachment #8681726 - Attachment is patch: true
Attachment #8681726 - Attachment mime type: text/x-patch → text/plain
Re attachment 8681259 [details]:  could you explain what guarantees that there exists only one VsyncRefreshDriverTimer per process?  This seems to depend on that assumption, and it's not at all clear to me that it's true, although I'm not really familiar with that code.

I also tend to agree that this might be a little too broad in interpreting any refresh observer as being critical.  Hiro is correct that this will report true for animations that are really running on the compositor, since we have to do some main thread work for them.  (Bug 1207014 is also vaguely related, since it might fix/improve that, and might also give us the ability to detect when we really need to hit ~16ms.)  There are also other things that are likely to be refresh observers.


I'm also a little unsure about the motivation for the change.  Why is it bad for an addon to block the main thread for a long time only when there's actually an animation running?  The reports you describe don't seem like false positives to me; it seems better to report the problem whenever it happens, rather than only when it's particularly noticeable.  (That said, things like GC improvements (such as bug 1214961) might improve cases where the report is really reporting a Gecko problem.)
Flags: needinfo?(dteller)
(In reply to David Baron [:dbaron] ⌚UTC+8 from comment #22)
> Re attachment 8681259 [details]:  could you explain what guarantees that
> there exists only one VsyncRefreshDriverTimer per process?  This seems to
> depend on that assumption, and it's not at all clear to me that it's true,
> although I'm not really familiar with that code.

Good point. Easy to fix, fortunately.

> I also tend to agree that this might be a little too broad in interpreting
> any refresh observer as being critical.  Hiro is correct that this will
> report true for animations that are really running on the compositor, since
> we have to do some main thread work for them.  (Bug 1207014 is also vaguely
> related, since it might fix/improve that, and might also give us the ability
> to detect when we really need to hit ~16ms.)  There are also other things
> that are likely to be refresh observers.

Theoretically, the code of PerfStats can adapt to a dynamic framerate goal, so I'll be happy to modify once bug 1207014 has landed.

I am definitely interested in refining to only handle main thread animations. This will be useful both for PerfStats and for bug 1178972. On the other hand, I'm willing to land something rougher as v1, it will be an improvement over what we have right now.

> I'm also a little unsure about the motivation for the change.  Why is it bad
> for an addon to block the main thread for a long time only when there's
> actually an animation running?  The reports you describe don't seem like
> false positives to me; it seems better to report the problem whenever it
> happens, rather than only when it's particularly noticeable.  (That said,
> things like GC improvements (such as bug 1214961) might improve cases where
> the report is really reporting a Gecko problem.)

We have feedback from users:
1. Add-on Foo does something that takes more than 64ms in the chrome while user is actually looking at content, or vice-versa, or that is somehow invisible for any reason;
2. The add-on watcher shows "Hey, add-on Foo may be slowing down Firefox";
3. User complains because of a false alert.

So we want to be able to distinguish between user-visible and non-user-visible jank. The latter will still be reported through Telemetry but may only be visible if a pref is switched on.
Flags: needinfo?(dteller)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #21)
> Created attachment 8681726 [details] [diff] [review]
> Rought sketch what I had in mind in comment 14
> 
> For reference, here is a rough sketch what I had in mind in comment 14.
> We can do similar thing in RefreshDriverTimer. But after I commented, I
> realized this is not what :Yoric wants to do because this patch sends a jank
> *after* the jank state happened.

Well, your patch implements an additional jank detector. This has its uses, but not for my scenario, as PerfStats already has a specialized probe that not only detects a subset of jank, but can also pinpoint the culprit(s). Right now, I just want to be able to tell whether the jank we have detected is part of an animation.
Comment on attachment 8681259 [details]
MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro

Bug 1219145 - nsRefreshDriver::IsJankCritical();r?hiro

To refine its alerts, Performance Stats API needs to be able to know whether a long-running operation is actually causing user-visible jank in the current process. This patch introduces a trivial API that lets clients ask the refresh driver whether any kind of animation is ongoing.
Attachment #8681259 - Attachment description: MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r?jwatt → MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r?hiro
Attachment #8681259 - Flags: feedback?(dbaron) → review?(hiikezoe)
The latest commit just takes into account dbaron's remark that there may be several vsync drivers in the same process.
Filed refinements as bug 1220582.
Comment on attachment 8681259 [details]
MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro

https://reviewboard.mozilla.org/r/23789/#review21417

I think mouse movement on content window also causes janks (on chrome process I guess) with this patch and I have no idea how to eliminate compositor animations yet. But this patch is basically harmless, I'd expect we can eliminate at least compositor animations in bug 1220582.

We could also use this jank flag to test that finished (or cancelled, etc.) animations do not activate any timers. I've been using NSPR_LOG_MODULES=nsRefreshDriver:5 for the purpose, but if the jank flag is exposed in test frameworks we could use the flag in tets. I don't know the Performance Stats API is exposed in test frameworks though.
Attachment #8681259 - Flags: review?(hiikezoe) → review+
Yes, I plan to expose this through the Performance Stats API.
this failed to apply:

applying 85b7b48d2736
patching file layout/base/nsRefreshDriver.cpp
Hunk #2 FAILED at 413
1 out of 3 hunks FAILED -- saving rejects to file layout/base/nsRefreshDriver.cpp.rej
patch failed to apply


could you take a look, thanks!
Flags: needinfo?(dteller)
Keywords: checkin-needed
Comment on attachment 8681259 [details]
MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/23789/diff/2-3/
Attachment #8681259 - Attachment description: MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r?hiro → MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro
Comment on attachment 8681259 [details]
MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/23789/diff/3-4/
Rebased.
Flags: needinfo?(dteller)
Keywords: checkin-needed
seems this has problems in applying a5069ca09a00
patching file layout/base/nsRefreshDriver.cpp
Hunk #1 succeeded at 80 with fuzz 2 (offset 0 lines).
Hunk #2 FAILED at 418
1 out of 3 hunks FAILED -- saving rejects to file layout/base/nsRefreshDriver.cpp.rej
patch failed to apply
abort: fix up the merge and run hg transplant --continue
Flags: needinfo?(dteller)
Keywords: checkin-needed
Comment on attachment 8681259 [details]
MozReview Request: Bug 1219145 - nsRefreshDriver::IsJankCritical();r=hiro

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/23789/diff/3-4/
Rebased.
Flags: needinfo?(dteller)
https://hg.mozilla.org/mozilla-central/rev/8fa6b8ebc6f1
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
See Also: → 1232622
You need to log in before you can comment on or make changes to this bug.