Closed Bug 1779800 Opened 2 years ago Closed 2 years ago

WebGL2 instanced rendering fails on Firefox 102 on Mac OS 12.2

Categories

(Core :: Graphics: CanvasWebGL, defect, P2)

Firefox 102
Unspecified
macOS
defect

Tracking

()

VERIFIED FIXED
106 Branch
Tracking Status
firefox-esr102 --- verified
firefox106 --- fixed
firefox108 --- verified
firefox109 --- verified

People

(Reporter: karlsims, Assigned: jgilbert)

References

(Regression)

Details

(Keywords: regression)

Attachments

(4 files, 2 obsolete files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0

Steps to reproduce:

Use the built-in variable gl_InstanceID in a WebGL2 vertex shader.

Actual results:

The value of gl_InstanceID is not correct. It seems to always be zero.
Note that this bug is not present and the test provided works fine on Firefox 102 on Linux, and it also works fine on Safari on Mac, but it fails on Firefox 102 on Mac.

Expected results:

The value of gl_InstanceID should be set properly so it can be accessed by the vertex shader.

The Bugbug bot thinks this bug should belong to the 'Core::Graphics: WebRender' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Graphics: WebRender
Product: Firefox → Core
OS: Unspecified → iOS
OS: iOS → macOS

I can repro on macOS 13 beta.

Status: UNCONFIRMED → NEW
Ever confirmed: true

I can reproduce this on macOS 12.4 on an Intel MacBook Pro, but can not reproduce it on an Apple Silicon Mac running macOS 13 Beta (22A5295i).

On Intel, mozgression didn't help because I couldn't find a good build. Going back to builds before August 2020 the test reports "Your browser does not support WebGL 2" and just renders a black square.

On any build I tested after August 2020, the test fails. I ran mozregression with mach (below) and via the pip package and both seemed to work for this range.

$ ./mach mozregression --bad 2022-07-25 --good 2020-07-01 -a "https://bug1779800.bmoattachments.org/attachment.cgi?id=9285664"

It seems this bug does not occur if you use a vertex array buffer to provide the 3 triangle coordinates instead of calculating them in the vertex shader. So perhaps the absence of a VAO is somehow disabling the correct setting of gl_InstanceID in the vertex shader?

The severity field is not set for this bug.
:aosmond, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(aosmond)
Component: Graphics: WebRender → Graphics: CanvasWebGL
Flags: needinfo?(aosmond) → needinfo?(jgilbert)
Attached file WebGL2-instance-test.html (obsolete) —

Looks like Mac needs fake-vertex-attrib for an instance attrib. :S

Assignee: nobody → jgilbert
Flags: needinfo?(jgilbert)

Minimal need seems to be that there needs to be a vertex attrib array enabled, with a non-zero instancing divisor, and with a vertex buffer of at least one byte attached.
I think this is a buggy optimization where the driver thinks it can truncate instanceCount to 1 if there's no instance attrib buffer.
Fortunately, it seems like we just need to do minimal setup for this case, and probably only if there's static use of gl_InstanceID but no instance buffer.
Honestly I don't think this optimization the driver is doing is valuable, so we might as well no check for gl_InstanceID at all.

Actually it looks like it doesn't even need any vertexAttribDivisor set at all. It's enough to have literally any vertexAttribArray enabled. Maybe this is an old fastpath for all-disabled vertexAttribArray drawing?
This means we can leverage our fake-vertex-attrib-0 then.

Attached file WebGL2-instance-test.html (obsolete) —
Attachment #9289732 - Attachment is obsolete: true
//TEST.enableAnyAttribArray = true; // Uncomment to fix.

[...]

  if (TEST.enableAnyAttribArray) {
    gl.enableVertexAttribArray(0);
  }
  gl.vertexAttribPointer(0, 1, gl.FLOAT, false, 0, 0);
Attachment #9289733 - Attachment is obsolete: true

The severity field is not set for this bug.
:jgilbert, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jgilbert)
Severity: -- → S3
Flags: needinfo?(jgilbert)
Priority: -- → P2

MacOS GL driver really just has some serious bugs with gl_VertexID and gl_InstanceID:
https://github.com/KhronosGroup/WebGL/pull/3484

We just need workarounds for these.

The main workaround for website authors is the same as we will be integrating into Firefox:
Polyfill to replace gl_VertexID and gl_InstanceID with our own IDs. E.g.:

let vdata = new Int32Array(Math.max(MAX_USED_VERTEX_ID, MAX_USED_INSTANCE_ID));
vdata = vdata.map((v,i) => i);
const vbuf = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, vbuf);
gl.bufferData(gl.ARRAY_BUFFER, vdata, gl.STATIC_DRAW);

let loc = gl.getAttribLocation(prog, "user_VertexID");
if (loc != -1) {
  gl.enableVertexAttribArray(loc);
  gl.bindBuffer(gl.ARRAY_BUFFER, vbuf);
  gl.vertexAttribIPointer(loc, 1, gl.INT, 0, 0);
}

loc = gl.getAttribLocation(prog, "user_InstanceID");
if (loc != -1) {
  gl.enableVertexAttribArray(loc);
  gl.bindBuffer(gl.ARRAY_BUFFER, vbuf);
  gl.vertexAttribIPointer(loc, 1, gl.INT, 0, 0);
  gl.vertexAttribDivisor(loc, 1);
}

And in the vertex shader:

in highp int user_VertexID;
in highp int user_InstanceID;

This reverts the code that would collapse DrawElements to DrawArrays in some cases,
because it was a correct transform in some cases.

Instead, we need to always keep around a cpu copy of the index buffer,
so that we can figure out how many fakeAttrib0 verts we need.

Finally, in order to correct for the bug where Mac offsets the base instance by first in DrawArrays,
we need to (temporarily) patch vertexAttribPointer offsets for any call with non-zero first.
(This can be disabled via MOZ_WEBGL_WORKAROUND_FIRST_AFFECTS_INSTANCE_ID=0)

No longer depends on: 1781111
See Also: → 1781111
Pushed by jgilbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/47db0dc58cf1
Fix WebGL instancing in some cases on Mac. r=gfx-reviewers,lsalzman
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 106 Branch

This is reasonably-likely to cause a minor perf regression on affected systems, so we might see Talos alert here.

Comment on attachment 9294347 [details]
Bug 1779800 - Fix WebGL instancing in some cases on Mac.

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: This fixes correctness for a category of real webgl content, and we don't want to wait on the next whole esr cycle for it.
  • User impact if declined: Certain misbehaving pages due to webgl correctness issues.
  • Fix Landed on Version: 106
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It's on Release106 already, and we do have pretty good tests in this area, that visibly got less-fail-y after fixing this.
Attachment #9294347 - Flags: approval-mozilla-esr102?
Regressed by: 1476327

Kelsey, this patch doesn't graft cleanly to esr102. Could you please attach a patch based on esr102?

Flags: needinfo?(jgilbert)

Set release status flags based on info from the regressing bug 1476327

Blocks: 1755806
Attachment #9294347 - Attachment description: Bug 1779800 - Fix WebGL instancing in some cases on Mac. → Bug 1779800 - [esr102] Fix WebGL instancing in some cases on Mac.

(In reply to Ryan VanderMeulen [:RyanVM] from comment #25)

Hi Kelsey, it appears that the rebased ESR102 patch doesn't build.
https://treeherder.mozilla.org/jobs?repo=try&revision=95d63a0eba91abd5db4c68b7e5af4251652d4f39

Yes I'm working on it. I just want to show progress.

Flags: needinfo?(jgilbert)

(In reply to Kelsey Gilbert [:jgilbert] from comment #26)

Yes I'm working on it. I just want to show progress.

Ah, thanks!

This reverts the code that would collapse DrawElements to DrawArrays in some cases,
because it was not a correct transform in some cases.

Instead, we need to always keep around a cpu copy of the index buffer,
so that we can figure out how many fakeAttrib0 verts we need.

Finally, in order to correct for the bug where Mac offsets the base instance by first in DrawArrays,
we need to (temporarily) patch vertexAttribPointer offsets for any call with non-zero first.
(This can be disabled via MOZ_WEBGL_WORKAROUND_FIRST_AFFECTS_INSTANCE_ID=0)

Attachment #9294347 - Attachment description: Bug 1779800 - [esr102] Fix WebGL instancing in some cases on Mac. → Bug 1779800 - Fix WebGL instancing in some cases on Mac.

Looks likely green!

Flags: needinfo?(ryanvm)

I'm seeing test_2_conformance2__rendering__builtin-vert-attribs.html timeouts across the board?

Flags: needinfo?(ryanvm) → needinfo?(jgilbert)
Flags: needinfo?(jgilbert)
Attachment #9294347 - Flags: approval-mozilla-esr102?
Attachment #9306124 - Flags: approval-mozilla-esr102?

Comment on attachment 9306124 [details]
Bug 1779800 - [esr102] Fix WebGL instancing in some cases on Mac.

Approved for 102.6esr, thanks for the rebased patch!

Attachment #9306124 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+
Flags: qe-verify+
QA Whiteboard: [qa-triaged]

Reproduced with Fx 102 on macOS 12.
Verified fixed with Fx 109.0a1 (2022-12-06), Fx 108.0 and Fx 102.6.0esr on macOS 12.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: