Closed Bug 1510853 Opened 5 years ago Closed 5 years ago

Add CONTENT_FRAME_TIME variant that records why we missed a frame.

Categories

(Core :: Graphics: WebRender, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: mattwoodrow, Assigned: mattwoodrow)

References

Details

Attachments

(5 files, 1 obsolete file)

CONTENT_FRAME_TIME is a useful metric, but it's hard to know exactly why we ended up being slow.

It would be nice to have an alternative metric that records when we missed a frame, and the reason for that.

I want to focus on only recording things that we concretely know (like the vsync that we started compositing not be sequentially after the one we started content), and avoid guesses.
MozReview-Commit-ID: 8wBDg39R4nZ

Depends on D13349
MozReview-Commit-ID: 9RV9ZkHXZTR

Depends on D13350
I've attached WIP patches that seem to work fairly well. The first two are just plumbing, the last has the only interesting changes.

I've included fairly big comments in WebRenderBridgeChild and LayerTransactionParent explaining some of the caveats of this method of recording, including work that will show up in a different bucket to what we would expect from our existing metrics (CONTENT_FULL_PAINT_TIME, COMPOSITE_TIME).

One thing that I realised in the process, is that WebRender forces the render_backend and Renderer to run serially, as we only allow a single frame to be in progress at a time (RenderThread::TooManyPendingFrames == 1). If we already have a frame compositing at the next vsync, then we skip and wait for a future vsync tick.

This behaviour is somewhat different to non-WebRender, where we still have serial compositing, but it blocks the compositor thread, and the next vsync message is delivered when compositing completes (arbitrarily late).

In the case where we have a single long composite, non-WebRender will just start the next composite late, and probably catch up, only recording one dropped frame. WebRender will delay the next composite too, and record two frames. This behaviour change isn't necessarily bad, but it is different, and is worth at least understanding.


I'd really like to add explicit recording of this case, but as per my comment in WebRenderBridgeParent, I couldn't figure out how to get the right data into the right places.

Jeff and Sotaro, could you please look through the patch, see what you think, and see if you have any ideas to improve the data it collects?
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(jmuizelaar)
(In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> 
> In the case where we have a single long composite, non-WebRender will just
> start the next composite late, and probably catch up, only recording one
> dropped frame. WebRender will delay the next composite too, and record two
> frames. This behaviour change isn't necessarily bad, but it is different,
> and is worth at least understanding.

It's not obvious to me which behaviour is better. I wonder if it's worth changing WebRender to have the same behaviour as current Gecko to make the comparison easier. i.e. without a strong reason to have different behaviour we might as well keep the behaviour the same.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #5)
> (In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> > 
> > In the case where we have a single long composite, non-WebRender will just
> > start the next composite late, and probably catch up, only recording one
> > dropped frame. WebRender will delay the next composite too, and record two
> > frames. This behaviour change isn't necessarily bad, but it is different,
> > and is worth at least understanding.
> 
> It's not obvious to me which behaviour is better. I wonder if it's worth
> changing WebRender to have the same behaviour as current Gecko to make the
> comparison easier. i.e. without a strong reason to have different behaviour
> we might as well keep the behaviour the same.

Yeah, this change seems to be independent of WebRender (apart from just being easier to implement this way with WR), and minimizing those sort of changes while we're actively trying to compare the options seems valuable.
(In reply to Matt Woodrow (:mattwoodrow) from comment #6)
> > It's not obvious to me which behaviour is better. I wonder if it's worth
> > changing WebRender to have the same behaviour as current Gecko to make the
> > comparison easier. i.e. without a strong reason to have different behaviour
> > we might as well keep the behaviour the same.
> 
> Yeah, this change seems to be independent of WebRender (apart from just
> being easier to implement this way with WR), and minimizing those sort of
> changes while we're actively trying to compare the options seems valuable.

Filed bug 1510899 for this.
Priority: -- → P2
(In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> 
> One thing that I realised in the process, is that WebRender forces the
> render_backend and Renderer to run serially, as we only allow a single frame
> to be in progress at a time (RenderThread::TooManyPendingFrames == 1). If we
> already have a frame compositing at the next vsync, then we skip and wait
> for a future vsync tick.

Actually RenderThread::TooManyPendingFrames() permits at most 2 pending frame. One for rendering frame and one for frame build.

> This behaviour is somewhat different to non-WebRender, where we still have
> serial compositing, but it blocks the compositor thread, and the next vsync
> message is delivered when compositing completes (arbitrarily late).
> 
> In the case where we have a single long composite, non-WebRender will just
> start the next composite late, and probably catch up, only recording one
> dropped frame. WebRender will delay the next composite too, and record two
> frames. This behaviour change isn't necessarily bad, but it is different,
> and is worth at least understanding.

One reason of current implementation is that DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL usage on Win10. It seems that presented frame is not dropped, instead, the frame is presented later vsync interval.

It seems better to change WebRender as similar to compositor. It seems better for content.
Flags: needinfo?(sotaro.ikeda.g)
Attachment #9028508 - Attachment description: Bug 1510853 - Make TransactionId usable for other 64bit identifiers. → Bug 1510853 - Make TransactionId usable for other 64bit identifiers. r?jrmuizel
Attachment #9028509 - Attachment description: Bug 1510853 - Introduce VSyncId for every vsync, and pass it to all consumers. → Bug 1510853 - Make VsyncId available to compositor. r?jrmuizel
Attachment #9028510 - Attachment description: Bug 1510853 - Add CONTENT_FRAME_TIME_REASON. → Bug 1510853 - Add CONTENT_FRAME_TIME_REASON. r?jrmuizel
Attached file request.md (obsolete) —
Attachment #9029603 - Flags: review?(chutten)
Comment on attachment 9029603 [details]
request.md

The review request asks for permanent collection but the collection code has it expiring in 75. I'd recommend taking the "expires in six months" approach and have it expire in Firefox 68 or 69. That way we have a built-in checkin time to make sure it's collecting what we want and it's still helpful.

Of course, if you're certain of its ongoing relevance and usefulness, permanent collection is acceptable for this collection since it satisfies the other conditions.

Once the patch and the Data Review request agree, please feel free to r?me again.
Attachment #9029603 - Flags: review?(chutten)
Flags: needinfo?(jmuizelaar)
Attached file request.md
Attachment #9029603 - Attachment is obsolete: true
Attachment #9029838 - Flags: review?(chutten)
Comment on attachment 9029838 [details]
request.md

DATA COLLECTION REVIEW RESPONSE:

    Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate? 

Yes. This collection is Telemetry so it is documented in its definition file (Histograms.json), on the Probe Dictionary, and on the telemetry.mozilla.org Measurement dashboards.

    Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so it can be controlled in Firefox's Preferences.

    If the request is for permanent data collection, is there someone who will monitor the data over time?

N/A, this collection will expire in Firefox 73 (about a year).

    Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under? **

Category 1, Technical.

    Is the data collection request for default-on or default-off?

Default on, pre-release channels only (and, at the moment, WR-enabled users only, which further restricts the population to Windows 10 Nightly users with nVidia gpus)

    Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No.

    Is the data collection covered by the existing Firefox privacy notice? 

Yes.

    Does there need to be a check-in in the future to determine whether to renew the data? 

Yes. Matt Woodrow is responsible for renewing or removing the collection before it expires in Firefox 73.

---
Result: datareview+
Attachment #9029838 - Flags: review?(chutten) → review+
Attachment #9028510 - Attachment description: Bug 1510853 - Add CONTENT_FRAME_TIME_REASON. r?jrmuizel → Bug 1510853 - Add CONTENT_FRAME_TIME_REASON. r=jrmuizel, data-review=chutten
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0ade0aa77b2f
Make TransactionId usable for other 64bit identifiers. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/ae190948ad73
Introduce VsyncId and VsyncEvent for identifying vsyncs without timestamp comparisons. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/d1ef6db7fc28
Make VsyncId available to compositor. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/80baa7b09930
Add CONTENT_FRAME_TIME_REASON. r=jrmuizel, data-review=chutten
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2dd3277b6508
Make TransactionId usable for other 64bit identifiers. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/be8b92970850
Introduce VsyncId and VsyncEvent for identifying vsyncs without timestamp comparisons. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/eda7bfb669da
Make VsyncId available to compositor. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/336f58aeb663
Add CONTENT_FRAME_TIME_REASON. r=jrmuizel, data-review=chutten
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6915aee134cb
Followup to restore code that accidentally got removed during a rebase.
Depends on: 1513657
Flags: needinfo?(matt.woodrow)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: