Closed Bug 1019101 Opened 7 years ago Closed 7 years ago

Add a way to get the max CC pause during a given period of time

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: mccr8, Assigned: mccr8)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

It would be nice to know how much mainthread animations are interrupted by the CC.  This has to be callable from JS.  Avi and I discussed this, and I think an interface that would work would be to have two methods (on Component.utils or something), something like startRecordingPauses() and getMaxPause().

startRecordingPauses() could just clear a maxRecordingPause variable in nsJSEnvironment.  Whenever we compute a slice pause, just set maxRecordingPause to the new value if it is larger.  Then stopRecording() would return that value.
So we'd need to check whether CC slice causes RefreshDriver tick to happen later than expected.

It is of course possible that other stuff between RD ticks actually cause the slowness.
Like, in a normal case there is 15ms random stuff (say, some JS is running), and when a CC slice 
runs, RD tick happens too late and the issue is possibly that other stuff...
But sure, getting more information about CC runs would be useful.
(In reply to Andrew McCreight [:mccr8] from comment #0)
> an interface that would work would be to have two methods (on
> Component.utils or something), something like startRecordingPauses() and
> getMaxPause().
> 
> startRecordingPauses() could just clear a maxRecordingPause variable in
> nsJSEnvironment.

Sounds good to me. Thanks for taking it. It will have some limitation of not being reentrant so wouldn't be able to work independently from overlapping animations, but I think it's ok for this stage. If we end up with an indication that there's a real problem, we could implement more robust approaches.

(In reply to Olli Pettay [:smaug] from comment #1)
> So we'd need to check whether CC slice causes RefreshDriver tick to happen
> later than expected.
> 
> It is of course possible that other stuff between RD ticks actually cause
> the slowness.

True and true. But to compartmentalize the problem, this bug is focused on CC.

Also while measuring RD delays would have some correlation to UI animation disruptions due to CC, IMO this might be over engineering for this stage on one hand, and not necessarily more accurate than the suggested approach on the other hand (but do correct me if I'm wrong). Right now we don't even know if it happens at all.

Makes sense?
Sure, better to start with something simple.
Sorry for not getting to this sooner, I got carried away working on another bug.  Does this interface look okay to you?  It should be like what we discussed before, except the names are slightly different.
Attachment #8438762 - Flags: review?(bugs)
Attachment #8438762 - Flags: feedback?(avihpit)
I did a little bit of local testing and it produced sensible looking results.
Comment on attachment 8438762 [details] [diff] [review]
Add a way to get the max CC pause during a given period of time.

Review of attachment 8438762 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks. Looks good to me. Hope to start experimenting with it soon.

::: js/xpconnect/idl/xpccomponents.idl
@@ +293,5 @@
>       *
> +     * Return the longest cycle collector slice time since the last
> +     * time clearMaxCCTime() was called.
> +     */
> +    long getMaxCCSliceTime();

I'm guessing it's not getMaxCCSliceTimeSinceClear() to keep the length sane?
Attachment #8438762 - Flags: feedback?(avihpit) → feedback+
Yeah, maybe I should put something about clear in there in the name...
Comment on attachment 8438762 [details] [diff] [review]
Add a way to get the max CC pause during a given period of time.

GetMaxCCSliceTime could indeed have SinceClear() in it.

update uuid in xpccomponents.idl
Attachment #8438762 - Flags: review?(bugs) → review+
try run: https://tbpl.mozilla.org/?tree=Try&rev=3b55b51fe964

This took me a little longer than expected to investigate that dt timeout, but it turns out as expected that is another patch in the push.
I added SinceClear to the various names and updated the UUID.

https://hg.mozilla.org/integration/mozilla-inbound/rev/057973613987
Thanks. Just to clarify, a CC slice is the thing which hangs the main thread, and typically (always?) there will _not_ be a loop of few slices in a row, right?

Because if we have N slices of M ms each, one right after the other, then getMaxCCSliceSinceClear() will return M, while in practice, the lag which the user experiences is N * M.

In other words, how frequent can slices be? or what's a typically minimal duration to expect between slices?
Apologies for not catching this in my earlier review. I just started implementing a simple JS wrapper for this which will allow few concurrent users of this system (without one's clear destroying data for the other).

I noticed that this API leaves one (probably extremely rare) case where we could "miss" a max slice, and that's if we do:

//--- demonstration of a sequence, not real code.

// First caller starts here
clearMaxCCTime();

// Some time passes

// Second caller starts here
max1 = getMaxCCSliceTimeSinceClear();
// **
clearMaxCCTime();

// Some time passes ...

// First caller wants to read the maxSlice
max2 = getMaxCCSliceTimeSinceClear();
return max(max1, max2);

//---------

** Any slice which happens here will be invisible to us

A trivial solution would be to modify clearMaxCCTime() to also return the max just before the clear, such that we can avoid a sequence of clear() followed by getMax() where there may be an invisible max slice in between, and instead we'll use max1 = clearMaxCCTime();.

However, if you know that a CC slice can never happen at **, then there's no need for this change.

However #2, I want to use the same wrapper also for GC (bug 1019611), so If such invisible slice can happen at either CC/GC, I'd prefer both to have the same API which can avoid missing such slices.
Flags: needinfo?(continuation)
Yeah, I guess that's true, but I assume that the period you want to measure actually starts after the clear.  For instance, you would do the clear, then start the animation.  Does that not work for the wrapper you are implementing?
Flags: needinfo?(continuation)
(In reply to Avi Halachmi (:avih) from comment #11)
> Thanks. Just to clarify, a CC slice is the thing which hangs the main
> thread, and typically (always?) there will _not_ be a loop of few slices in
> a row, right?
> 
> Because if we have N slices of M ms each, one right after the other, then
> getMaxCCSliceSinceClear() will return M, while in practice, the lag which
> the user experiences is N * M.
> 
> In other words, how frequent can slices be? or what's a typically minimal
> duration to expect between slices?

When we're actively CCing, when we finish a slice, we start a new 32ms timer, so it will be at least that long before we run the next slice.  So they aren't going to happen immediately.
https://hg.mozilla.org/mozilla-central/rev/057973613987
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
(In reply to Andrew McCreight [:mccr8] from comment #13)
> Yeah, I guess that's true, but I assume that the period you want to measure
> actually starts after the clear.  For instance, you would do the clear, then
> start the animation.  Does that not work for the wrapper you are
> implementing?

Just to clarify, at the sequence above, this code executes inside the wrapper. The individual callers don't use the API directly.

If there was only one caller, then it would start after the clear and there would be no further clear until its getMax and it would be fine.

But to support multiple callers (each with its own different start/end points), I have to issue getMax followed by clear when one wants to read its max and the other still measures - to be able to give each caller the max which happened during its individual duration.

At the example sequence above, there's a clear which happens inside the duration which the first caller wants to measure. I have to issue this clear since otherwise the second caller could get a max which happened before it started measuring (but after the first caller started measuring).

So this getMax/clear sequence is mandatory for such scenario. An example of such real-world scenario is the history menu of Australis. Once you click the History icon, the menu slides leftwards (opens) and the list itself independently grows downwards, and it's not deterministic which will end first, since it depends on the number of items at the list.

On this specific case I could avoid this issue by using only a single caller which is aware of this relation, and wait with getMax until both of them have finished, but as a generic wrapper which supports individual and unrelated call sites, then the getMax/clear sequence will happen whenever two call sites measure durations which are only partially overlapping, and it will inevitably happen inside the duration of at least one of these call sites (the one which keeps measuring after the other has stopped).


(In reply to Andrew McCreight [:mccr8] from comment #14)
> When we're actively CCing, when we finish a slice, we start a new 32ms
> timer, so it will be at least that long before we run the next slice.  So
> they aren't going to happen immediately.

Sounds good.
(In reply to Avi Halachmi (:avih) from comment #16)
> it will inevitably happen inside the duration of at least one of these call
> sites (the one which keeps measuring after the other has stopped).

Sorry, for the one which was already measuring when the second caller started measuring.
Here's a sample wrapper which allows such multiple partially overlapping call sites to use this facility without interrupting each other. I intend to use this kind of wrapper for both the CC and the (upcoming) GC API at bug 1019611.

(I only wrote it to assess issues, haven't tried it yet, so it might still be buggy)
See Also: → GC.60fps
Oops. Wrong bug. This is GC related.
See Also: GC.60fps
You need to log in before you can comment on or make changes to this bug.