Open Bug 1616326 Opened 2 years ago Updated 1 year ago

skewed image masks create junk on their edges

Categories

(Core :: Graphics: WebRender, defect, P3)

defect

Tracking

()

People

(Reporter: Gankra, Unassigned)

References

Details

(Keywords: regression)

Attachments

(4 files, 1 obsolete file)

Attached image bad-render.png

Noticed this while looking into failures from enabling Bug 1555356. It looks like we're incorrectly sampling the mask along its boundary, and showing what should be masked content.

simple yaml that demonstrates the issue (when dropped in wrench/reftests/mask/)

Hey emilio, it looks like this is caused by https://github.com/servo/webrender/pull/3220

Specifically, https://searchfox.org/mozilla-central/source/gfx/wr/webrender/res/cs_clip_image.glsl#62-65

Do you recall what problem that discard was solving?

Flags: needinfo?(emilio)
Keywords: regression
Priority: -- → P3

I don't recall off-hand. Probably was tryng to avoid incorrectly clipping pixels at the boundary. I suspect it was necessary at the time, but if it doesn't break existing tests added there and such it seems fine to remove to me.

Flags: needinfo?(emilio)
Assignee: nobody → a.beingessner

It's unclear what this was accomplishing, but it
prevents us from correctly processing the pixels
on the edge of the mask, causing masked content to
peek through. No tests seem to rely on this
discarding behaviour.

Also added a reftest that's fairly fuzzy but should
suffice as a canary for a regression here.

ah, wonderful "qr" in try-fuzzy gets everything except wrench. whelp, time to investigate

Flags: needinfo?(a.beingessner)

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:Gankra, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(a.beingessner)

Still working on this, much more involved than we hoped.

Flags: needinfo?(a.beingessner)
Attachment #9127593 - Attachment description: Bug 1616326 - Don't discard mask pixels. r?kvark → Bug 1616326 - Use a scissor rect instead of discard for image masks. r?gw

Ok so here's the situation with this bug:

The image mask shader (cs_clip_image.glsl) has a difficult task: it takes in an image
(instanced for each tile if the input image was tiled) that is in local space as
well as a target rectangle in screen space (subrect / actual_rect).

This is a problem because we may have two rectangles in different spaces that we need to render the intersection of. In particular, the subrect is always axis-aligned, but the local rect may not be. The current implementation approaches the problem by having the vertex shader generate geometry for the subrect while using discard in the fragment shader to mask out the shape of the input image (tile). The bug is ultimately that this doesn't properly perform AA on the boundaries of the image (tile), so we can end up rendering garbage there when the edges are non-trivial (when the image is not axis aligned).

An interesting side-effect of this approach is that the image-mask shader does not initialize the pixels of the subrect that aren't touched by the image (tile). This is actually necessary for tiling to work, because each tile instance is unaware of the other tiles, so no instance has the information necessary for us to initialize the pixels outside our own tile. It is conjectured that this is "fine" because anything that uses this image-mask will naturally clip itself to the image's rect, and therefore not source the out-of-bounds pixels (which contain whatever garbage was in the texture beforehand).

I was working on changing our approach so that the vertex shader generates geometry for the local image (tile), and use a scissor rect to clip to the subrect.

The benefit of this is that we could use the same edge AA logic we use for other prims, and the GPU should theoretically be able to do less work because it can clearly see the geometry and avoid fragment shading, instead of relying on discard

The downside of this is that changing the scissor rect breaks batching. This wouldn't hurt the tiled case, as each tile shares the same subrect, but it would hurt batching of other alpha tasks together. A potential optimization we can perform is to identify when the image rect is axis-aligned, in which case we can avoid using a scissor and instead make the vertices represent the intersection of the subrect and the image rect.

My patch is incomplete in a few regards:

  • I was still just testing out the approach, so my implementation is a bit messy
  • I hadn't integrated my approach with the rest of the cs_clip_shared family of shaders, but I'm not sure
    that's actually necessary/desirable. cs_clip_image is the only one that handles tiling, so it's genuinely a bit different.
  • I hadn't exactly finished figuring out how to properly apply the prim_transform and clip_transform; there is some suggestion that this is legacy over-engineered for things gecko won't ever emit.
  • I hadn't yet piped through the logic for the scissor rect, which I had been intending to ask gw to handle (since he seemed to think he could do it relatively easily, while it would have taken me a while to learn all the relevant code)
  • I hadn't implemented the axis-aligned optimization, although I was intending to only seriously consider that as a follow-up if the theoretical batching issues show up for real.
Blocks: 1555356

needs reassignment

Assignee: a.beingessner → nobody
Flags: needinfo?(jbonisteel)
Blocks: gfx-triage
Flags: needinfo?(jbonisteel)
No longer blocks: gfx-triage
Attached image skewtest_p400.png

I'm a bit confused as to the status of this bug. I cannot repro the problem (attached Quadro P400, same with HD630 and GTX1060), and then I noticed the patch has landed, which I suppose explains it? nope it was backed out.

Is this bug platform specific? Did we fix it some other way? Am I just lucky :)

Sorry if I miss something obvious, I'm still catching up.

(Edit2: was going to test in a VM on Ubuntu but running into build issues)

Flags: needinfo?(jbonisteel)

Checking with Alexis to see if we can get clarity on the status

Attached image skewtest_vm.png

Alexis still sees the problem on Mac, but it's more subtle than before.

I tested in a VM, no repro (attached - same in release build). That's all the platforms I can test. Since Mac is not officially a target for WR, I'll rebase 1555356 and try it and see if it's landable.
It's possible we just get lucky in wrench with the driver clearing allocated rendertargets to a value that happens to work.

bpeers@ubuntu:~/src/gecko/gfx/wr/wrench$ cargo run -- -p1.5 show ~/skew.yaml
    Finished dev [unoptimized + debuginfo] target(s) in 0.11s
     Running `/home/bpeers/src/gecko/gfx/wr/target/debug/wrench -p1.5 show /home/bpeers/skew.yaml`
OpenGL version 3.3 (Core Profile) Mesa 19.2.8, SVGA3D; build: RELEASE;  LLVM;
hidpi factor: 1.5 (native 1)
Flags: needinfo?(jbonisteel)
Attachment #9127593 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.