Closed Bug 1485591 Opened 6 years ago Closed 6 years ago

Understand checkerboarding

Categories

(Core :: Graphics: WebRender, enhancement, P2)

Tracking

()

RESOLVED FIXED

People

(Reporter: sotaro, Assigned: sotaro)

References

(Blocks 1 open bug)

Details

(Whiteboard: [needs-investigation])

Attachments

(2 files, 1 obsolete file)

This bug is created based on :mattwoodrow's mail.

--------------------------------------------------------------------

I've attached a recording and checkerboarding log. Using 'e' to single step frames through the video you should be able to match it up to the events in the log (sorry for the bad quality).

This was captured with a local patch to only call sampler.sample() (results in the 'viewport' entry in the log) when we're generating a frame, not for a render.

We can see consecutive video frames showing the presentation of the rects 1794.47, 5303.65 and then 8075.11, except that the last one is supposed to be blank but isn't. The consecutive frames suggest that we didn't just fail to present and drop a frame.
See Also: → 1485314
another :mattwoodrow's mail.

-------------------------------------------------------

Thanks for doing that Ryan!

My reading of those numbers is that WR checkerboards more often (219 
events vs 132), but usually for a shorter duration (most 1 or 2 frames, 
gecko spreads out to much higher numbers) which leads to lower severity 
(severity is sqrt of pixels checkerboarded times duration).
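
A minimal sketch of that severity arithmetic, under one possible reading of the sentence above (summing sqrt(pixels * duration) over the frames of an event). The struct, the function name and the per-frame accumulation are assumptions made for illustration and are not taken from the Gecko source.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical per-frame sample: how many CSS pixels were blank and how
// long that frame stayed on screen, in milliseconds.
struct FrameSample {
  double pixelsCheckerboarded;
  double durationMs;
};

// Assumed reading of "sqrt of pixels checkerboarded times duration":
// add sqrt(pixels * duration) for every frame that belongs to the event.
double EventSeverity(const std::vector<FrameSample>& frames) {
  double severity = 0.0;
  for (const FrameSample& f : frames) {
    // Negative inputs would make the square root meaningless.
    assert(f.pixelsCheckerboarded >= 0.0 && f.durationMs >= 0.0);
    severity += std::sqrt(f.pixelsCheckerboarded * f.durationMs);
  }
  return severity;
}
```

Under this reading, an event that lasts 1 or 2 frames scores far lower than a 9-frame event with the same blank area per frame, which is consistent with more events but lower severity.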

All the other metrics look pretty comparable, we're not missing frames ever.

I suspect the WR checkerboard numbers might be a bit wrong, but I'm 
struggling to make progress.

Doing a fast scroll on the html5 spec looks visually fine (and doing a 
slow-mo video with my iphone shows maybe 1 frame max blank - will try 
again with shorter exposure to be more sure), but about:checkerboard 
shows an event with 9 missed frames (and a severity that corresponds to 
us being entirely blank for all 9).

It does look like WR is updating the content metrics on 
RecvSetDisplayList, even though we might not composite with that content 
next if the display list is still within async scene building. That 
seems like a cause of under reporting, not over though, and WR is so 
fast here that it shouldn't be happening.

We're also sampling APZ twice per frame (once when we finish async scene 
building, and again when we generate a frame to composite). It seems 
like that could lead to invalid results, but removing it doesn't change 
the results for this test.

I'm not sure what else to look at, maybe WR is presenting something 
different to what the APZ checkerboarding code thinks is being 
presented? I don't see how though, but my knowledge of WR internals is 
slim. I saw that the 'render on scroll' flag is false usually, but 
making it true has no effect either.

Ideas appreciated (or please forward to anyone that might have any)!

- Matt
:kats' mail.

--------------------------------------------

Hm, interesting. I don't have an explanation for your observation either. From looking at the checkerboard reporting code I don't see any problems. One thing to note is that the APZ only gets updated content metrics after the async scene build is done, as opposed to right on the RecvSetDisplayList ipdl message. (The APZUpdater keeps a queue and only passes the info on to APZ after the scene swap, but before the APZ sampling that results from the scene build).

Sounds like you eliminated the "double APZ sampling" as the problem as well which would have been my first guess. If we were hitting the case where the post-scene-build sample was checkerboarding but the for-reals-composite sample was not, then we'd get recorded checkerboard events with no visual checkerboarding. But also in that case the checkerboard durations would be smaller than one vsync interval so the pattern of events it would generate would be quite distinctive - you wouldn't see a 9-frame checkerboard from that.

One possibility might be if the displaylist items are not getting clipped to the displayport - then APZ would think it is checkerboarding but WR might still be able to draw stuff?

It might be worth constructing a test page that visually indicates the scroll position (kind of like https://staktrace.github.io/moz-pages/grid.html), do a slo-mo recording, and see if the about:checkerboard recording matches the visual recording. The about:checkerboard data should give you specific scroll positions that you can map to the video using the page content.
:mattwoodrow's mail.

----------------------------------------

Results are interesting thus far..

I did a recording using the in-built Xbox screen recording tool (at 60fps).

The first frame that I try to scroll, the video shows the scrollbar move and the page content doesn't.

Later on in the log there's an expected frame of full checkerboard, but that frame doesn't exist in the video. Unsure if the capturing missed it, or if we just didn't actually present it like we said we would.

Going to try coordinate filming with an external camera to reduce the chances of missing a frame from the recording.
Comment 4 is followed by comment 0.
(In reply to Sotaro Ikeda [:sotaro] from comment #4)
> :mattwoodrow's mail.
> 
> I did a recording using the in-built Xbox screen recording tool (at 60fps).
> 
> The first frame that I try to scroll, the video shows the scrollbar move and
> the page content doesn't.
> 
> Later on in the log there's an expected frame of full checkerboard, but that
> frame doesn't exist in the video. Unsure if the capturing missed it, or if
> we just didn't actually present it like we said we would.

Yea, Win10 Game DVR seemed to drop frames even when 60fps was set.
(In reply to Sotaro Ikeda [:sotaro] from comment #2)
> another :mattwoodrow's mail.
> 
> We're also sampling APZ twice per frame (once when we finish async scene 
> building, and again when we generate a frame to composite). It seems 
> like that could lead to invalid results, but removing it doesn't change 
> the results for this test.

We sample APZ twice per frame, but AsyncPanZoomController::ReportCheckerboard() checks for checkerboarding only during the sample taken for frame generation.

ReportCheckerboard() performs the check only if the sample time has been updated.
https://dxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/AsyncPanZoomController.cpp#4043

The sample time is updated only before generating a frame, in WebRenderBridgeParent::MaybeGenerateFrame().
https://dxr.mozilla.org/mozilla-central/source/gfx/layers/wr/WebRenderBridgeParent.cpp#1566
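
The sample-time check described above can be sketched roughly as follows. This is a simplified model, not the Gecko implementation: the class, its members and the use of integer milliseconds are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the controller's checkerboard bookkeeping.
class CheckerboardGate {
 public:
  // Returns true when the check ran, i.e. when aSampleTimeMs differs
  // from the sample time seen on the previous call.
  bool MaybeReport(int64_t aSampleTimeMs) {
    // Sample times are assumed never to go backwards.
    assert(!mHasSample || aSampleTimeMs >= mLastSampleTimeMs);
    if (mHasSample && aSampleTimeMs == mLastSampleTimeMs) {
      return false;  // Same sample time as last call: skip the check.
    }
    mHasSample = true;
    mLastSampleTimeMs = aSampleTimeMs;
    ++mChecksRun;
    return true;
  }

  int ChecksRun() const { return mChecksRun; }

 private:
  bool mHasSample = false;
  int64_t mLastSampleTimeMs = 0;
  int mChecksRun = 0;
};
```

In this model, the sample taken after async scene building reuses the sample time set for the previous frame generation, so it is skipped and only one check runs per generated frame.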
I tested checkerboarding with a modified https://staktrace.github.io/moz-pages/grid.html and captured videos with an external camera. The checkerboarding symptoms were very different.

- With Direct3D11 (Advanced Layers): the video showed white content for several frames during checkerboarding. The number of frames with white content was similar to the number of checkerboarded frames in about:checkerboard.

- With WebRender: the video showed 1-2 consecutive frames with white content during checkerboarding. The number of frames with white content was lower than the number of checkerboarded frames in about:checkerboard.
(In reply to Sotaro Ikeda [:sotaro PTO 31/Aug-7/Sep] from comment #9)
> I tested checkerboarding with a modified
> https://staktrace.github.io/moz-pages/grid.html and captured videos with an
> external camera. The checkerboarding symptoms were very different.
> 
> - With WebRender: the video showed 1-2 consecutive frames with white content
> during checkerboarding. The number of frames with white content was lower
> than the number of checkerboarded frames in about:checkerboard.

For me, when the pref apz.frame_delay.enabled is set to false, I did not see the symptom. The number of frames with white content was similar to the number of checkerboarded frames in about:checkerboard.

The pref delays async scrolling by 1 frame and was added by Bug 1375949. But the checkerboard code does not seem to take the async scrolling delay into account.
It might be better to check for actual checkerboarding using AsyncPanZoomController::GetEffectiveScrollOffset() rather than Metrics().GetScrollOffset().

https://dxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/AsyncPanZoomController.cpp#3948
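
To illustrate why the choice of scroll offset matters, here is a rough one-dimensional sketch. It assumes the composited frame uses a scroll offset one frame behind the latest one; the types, names and numbers are hypothetical and do not come from the APZ code.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical 1D (vertical) model, all values in CSS pixels.
struct Span {
  double top;
  double bottom;
};

// Height of the viewport [scrollOffset, scrollOffset + viewportHeight)
// that is not covered by the painted span.
double CheckerboardedHeight(double scrollOffset, double viewportHeight,
                            const Span& painted) {
  assert(viewportHeight >= 0.0);
  const double viewTop = scrollOffset;
  const double viewBottom = scrollOffset + viewportHeight;
  const double coveredTop = std::max(viewTop, painted.top);
  const double coveredBottom = std::min(viewBottom, painted.bottom);
  // A negative overlap means the viewport lies entirely outside the span.
  const double covered = std::max(0.0, coveredBottom - coveredTop);
  return viewportHeight - covered;
}
```

With a painted span of 0 to 2000 and a 1000 pixel viewport, the latest offset of 1500 yields 500 blank pixels, while a delayed offset of 1000 yields none. Measuring with the latest offset would therefore report checkerboarding on a frame that shows no blank area on screen.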
Assignee: nobody → sotaro.ikeda.g
Attachment #9004145 - Attachment is obsolete: true
Attachment #9004146 - Attachment is patch: false
Assignee: sotaro.ikeda.g → nobody
Attachment #9004146 - Attachment mime type: text/plain → application/pdf
Assignee: nobody → sotaro.ikeda.g
Priority: -- → P1
See Also: → 1375949
Depends on: 1487001
Priority: P1 → P2
Whiteboard: [needs-investigation]
Depends on: 1485314
See Also: 1485314
:mattwoodrow, the problems with about:checkerboard have been addressed. Do you have any other concerns about this?
Flags: needinfo?(matt.woodrow)
I don't think so, the reported numbers now seem to match what we visually see, so that's great! Thanks for fixing this Sotaro!
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(matt.woodrow)
Resolution: --- → FIXED