Open Bug 1603783 Opened 4 years ago Updated 2 years ago

PBO Texture uploads with offset > 0 are broken on AMD Mac Catalina

Categories

(Core :: Graphics: WebRender, defect, P3)

People

(Reporter: jnicol, Unassigned)

References

(Blocks 1 open bug)

Attachments

(1 file)

This was noticed as bug 1603026 when trying to land bug 1598380 (which was backed out as a result).

There appears to be a driver bug on AMD Macs running Catalina, where texture uploads may fail depending on the offset passed to glTex(Sub)Image*. (The parameter is nominally the pixels pointer, but it is interpreted as a byte offset into the buffer because a PBO is bound.)
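To make the role of that offset concrete, here is a minimal sketch of which bytes of the bound PBO an upload reads, following the usual OpenGL pixel-unpack rules. It assumes 8-bit RGBA data (4 bytes per pixel); the function name `pbo_source_rows` and its parameters are made up for illustration and are not WebRender code.

```python
def pbo_source_rows(offset, width, height, bytes_per_pixel=4,
                    row_length=0, skip_pixels=0, skip_rows=0, alignment=4):
    """Return one (start, end) byte range per uploaded row, relative to the PBO.

    `offset` is the value passed as the `pixels` argument while a buffer is
    bound to GL_PIXEL_UNPACK_BUFFER. Assumes 8-bit components.
    """
    # GL_UNPACK_ROW_LENGTH == 0 means "use the width of the upload".
    pixels_per_row = row_length if row_length > 0 else width
    row_bytes = pixels_per_row * bytes_per_pixel
    # Source rows start on multiples of GL_UNPACK_ALIGNMENT (round up).
    stride = (row_bytes + alignment - 1) // alignment * alignment
    first = offset + skip_rows * stride + skip_pixels * bytes_per_pixel
    return [
        (first + row * stride, first + row * stride + width * bytes_per_pixel)
        for row in range(height)
    ]


# Example: a 64x2 upload read from byte offset 2048*55+1024, 64 pixels per source row.
for start, end in pbo_source_rows(2048 * 55 + 1024, width=64, height=2, row_length=64):
    print(start, end)
```

A correct driver should read exactly these ranges regardless of where in the texture the data ends up, which is what makes the destination-dependent failures here look like a driver bug.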

What were the offsets used? We have a known bug about non-aligned offsets on AMD on Catalina:
https://github.com/servo/webrender/wiki/Driver-issues#bug-1558167---texture-uploads-require-the-byte-stride-of-256-on-amd-gpus-with-macos-1015

If your offsets weren't aligned to 256, could you make sure they are, and test again?
Otherwise, please add an entry to that page. It would be interesting to understand why it works in some cases and not in others.

Attached image texture cache.png

Here's a screenshot of the texture cache. I pre-filled the PBO entirely with red to make it easy to see which uploads worked, and always uploaded from a fixed offset.

Some things to note:

  • Uploads aligned to 128px failed, but the ones at 64px, 192px, etc. succeeded. My hunch is that the latter are not aligned well enough to be DMAed, and so avoid this bug. That might be a performance issue in itself, though, so we should look into it.
  • Uploads to x=0 always seem to succeed.
  • The uploads at the end of the texture succeeded. This was with offset=2048*55+1024. Removing the +1024 makes the entire 14th row fail instead of half of it. Increasing the offset to 2048*64 makes them fail all the way down to the bottom. Decreasing the offset towards 0 makes earlier and earlier rows start to succeed.
  • So the offset and the dest x,y both influence whether the upload will succeed, but I haven't figured out quite what the equation is.
  • If I set optimal_pbo_stride to 2048, so that GL_UNPACK_ROW_LENGTH is 512, which is (not sure if coincidentally) the width of the texture, then all the uploads succeed. This would require an unacceptable amount of memory though, I imagine.
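For reference, the arithmetic behind the last point can be sketched as follows. This assumes 8-bit RGBA (4 bytes per pixel) and treats optimal_pbo_stride as a byte alignment for each staged row; the helper names and the 64x64 region are hypothetical, chosen only to show the scale of the memory cost.

```python
BYTES_PER_PIXEL = 4  # assumption: 8-bit RGBA


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def unpack_row_length(stride_bytes, bytes_per_pixel=BYTES_PER_PIXEL):
    """GL_UNPACK_ROW_LENGTH (in pixels) that corresponds to a byte stride."""
    if stride_bytes % bytes_per_pixel != 0:
        raise ValueError("stride must be a whole number of pixels")
    return stride_bytes // bytes_per_pixel


def staging_bytes(width_px, height_px, stride_alignment,
                  bytes_per_pixel=BYTES_PER_PIXEL):
    """PBO bytes needed to stage one region when every row is padded to the stride."""
    stride = align_up(width_px * bytes_per_pixel, stride_alignment)
    return stride * height_px


print(unpack_row_length(2048))                           # 512
print(staging_bytes(64, 64, stride_alignment=256))       # 16384
print(staging_bytes(64, 64, stride_alignment=2048))      # 131072
```

Under these assumptions, a small 64-pixel-wide region costs eight times as much staging memory with a 2048-byte stride as with a 256-byte one, and narrower regions fare worse.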

Seems like a right mess. I'd be interested to hear any ideas on what to try.

The offsets were all aligned to 256. (A requirement for the fast path on Adreno, but also happens automatically due to having aligned strides).
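A quick check of the offsets mentioned above, as a sketch. The second part illustrates why aligned strides give aligned offsets for free, assuming each upload starts at a whole number of rows into the PBO; `row_start` is a hypothetical helper.

```python
def is_aligned(offset, alignment=256):
    """True if offset is a multiple of alignment."""
    return offset % alignment == 0


# Offsets quoted in the earlier comment.
for label, offset in [("2048*55+1024", 2048 * 55 + 1024),
                      ("2048*64", 2048 * 64),
                      ("0", 0)]:
    print(label, offset, is_aligned(offset))


def row_start(row, stride_bytes):
    """Byte offset of a row when rows are packed back to back with a fixed stride."""
    return row * stride_bytes


# If the stride is a multiple of 256, every row start is too.
assert all(is_aligned(row_start(row, 256 * 3)) for row in range(100))
```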

The priority flag is not set for this bug.
:jbonisteel, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jbonisteel)
Priority: -- → P3

I've been trying to reproduce the AMD issue in a standalone case without any luck so far. All the combinations of (X texel offset, Y texel offset, PBO offset) for a 64-texel copy in an RGBA texture of size 256x16 appear to work fine on AMD with macOS Catalina.
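A sweep of that kind can be sketched as below. This is not the actual test code: the step sizes, the range of PBO offsets, and the assumption that a "64-texel copy" is a single 64x1 row are all guesses made for illustration.

```python
from itertools import product

TEX_WIDTH, TEX_HEIGHT = 256, 16   # texture size from the comment above
COPY_WIDTH = 64                   # assumption: one 64x1 row per copy


def upload_cases(x_step=64, pbo_offsets=range(0, 4096, 256)):
    """Yield (x, y, pbo_offset) for every copy that fits inside the texture."""
    xs = range(0, TEX_WIDTH - COPY_WIDTH + 1, x_step)
    ys = range(TEX_HEIGHT)
    yield from product(xs, ys, pbo_offsets)


cases = list(upload_cases())
print(len(cases), cases[0], cases[-1])
```

Each tuple would then drive one upload followed by a readback that compares the destination texels with the expected source data.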

After going back to Wrench and replicating the problem there, I was finally able to adjust the standalone setup so that the issue is reproducible. It's a win! Now we need to reach out to AMD and ask what they think about it. It's all in https://github.com/kvark/gl-buster:

Init with renderer: AMD Radeon Pro 460 OpenGL Engine
Test: swizzle
	Relevant extensions: GL_ARB_texture_swizzle
	textureSize: PASS
Test: PBO uploads
	sanity copy at the origin: PASS
	copy at (128, 0, 0) by offset 16384 with stride 4: FAIL [0, 0, 0, 0]
Done
Severity: normal → S3