Quoting Kvark:
```
Here is a rough plan:
wgpu-hal expects platform-dependent draw/dispatch buffer. the only thing exposed to the outside is the size of that buffer (for each kind of draw/draw indexed/dispatch)
wgpu-hal command encoder gets a new fn prepare_indirect(src_buffer, src_offset, dst_buffer, dst_offset, kind, count). Under the hood it will be a compute operation that does clamping as well as copies of first instance/vertex.
wgpu-core spawns a separate encoder to prepare indirect calls whenever it sees one, and orders it accordingly. It also manages the destination buffer as a temporary thing.
```

To expand, if I understand correctly, what I'm planning to do is:
- Have wgpu-core store some proxy indirect buffers (somewhere, maybe on the device). These will be used as input for the draw indirect commands instead of what is in the command.
- In `command_encoder_run_render_pass_impl`, when running into indirect commands, we allocate ranges in the proxy buffers and use them in the render pass instead of the original buffer/range, and record that we need to sanitize the content of the original buffer at its requested offset into the new range of the proxy buffer.
- After we are done with the render pass, we check whether we have recorded any ranges to sanitize. If so, we create a compute pass that will be submitted just before the render pass. It will dispatch a simple compute shader that reads from the indirect buffer/ranges and writes the sanitized output into the proxy buffer.
Bug 1786566 Comment 1 Edit History
Quoting Kvark:
```
Here is a rough plan:
wgpu-hal expects platform-dependent draw/dispatch buffer. the only thing exposed to the outside is the size of that buffer (for each kind of draw/draw indexed/dispatch)
wgpu-hal command encoder gets a new fn prepare_indirect(src_buffer, src_offset, dst_buffer, dst_offset, kind, count). Under the hood it will be a compute operation that does clamping as well as copies of first instance/vertex.
wgpu-core spawns a separate encoder to prepare indirect calls whenever it sees one, and orders it accordingly. It also manages the destination buffer as a temporary thing.
```

To expand, if I understand correctly, what I'm planning to do is:
- Have wgpu-core store some proxy indirect buffers (somewhere, maybe on the device). These will be used as input for the draw indirect commands instead of what is in the command.
- In `command_encoder_run_render_pass_impl`, when running into indirect commands, we allocate ranges in the proxy buffers and use them in the render pass instead of the original buffer/range, and record that we need to sanitize the content of the original buffer at its requested offset into the new range of the proxy buffer.
- After we are done with the render pass, we check whether we have recorded any ranges to sanitize. If so, we create a compute pass that will be submitted just before the render pass. It will dispatch a simple compute shader that reads from the indirect buffer/ranges and writes the sanitized output into the proxy buffer.

Validating into a separate buffer instead of in place ensures that the validation cannot be observed by any means other than the indirect draw itself.
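
The bookkeeping described in the plan above could be sketched roughly as follows. This is only an illustration, not wgpu-core code: `BufferId`, `IndirectKind`, `SanitizeRequest`, `ProxyAllocator`, `redirect`, and `take_pending` are all hypothetical names, and the argument sizes assume the tightly packed `u32` layouts (4, 5, and 3 words) rather than the platform-dependent sizes that wgpu-hal would expose.

```rust
/// Hypothetical stand-in for a wgpu-core buffer id.
type BufferId = u32;

/// The kinds of indirect command mentioned in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IndirectKind {
    Draw,
    DrawIndexed,
    Dispatch,
}

impl IndirectKind {
    /// Assumed size in bytes of one argument struct (packed u32 words).
    /// In the real plan this size would come from wgpu-hal per platform.
    fn size(self) -> u64 {
        match self {
            IndirectKind::Draw => 16,        // 4 x u32
            IndirectKind::DrawIndexed => 20, // 5 x u32
            IndirectKind::Dispatch => 12,    // 3 x u32
        }
    }
}

/// One recorded "sanitize this range" entry: read from the original
/// buffer at `src_offset`, write the sanitized args at `dst_offset`
/// in the proxy buffer.
#[derive(Debug, PartialEq)]
struct SanitizeRequest {
    src_buffer: BufferId,
    src_offset: u64,
    dst_offset: u64,
    kind: IndirectKind,
}

/// Hands out ranges in a proxy buffer and remembers what must be sanitized.
#[derive(Default)]
struct ProxyAllocator {
    next_offset: u64,
    pending: Vec<SanitizeRequest>,
}

impl ProxyAllocator {
    /// Called when render pass encoding runs into an indirect command.
    /// Returns the proxy buffer offset to use instead of the original range.
    fn redirect(&mut self, src_buffer: BufferId, src_offset: u64, kind: IndirectKind) -> u64 {
        let dst_offset = self.next_offset;
        self.next_offset += kind.size();
        self.pending.push(SanitizeRequest { src_buffer, src_offset, dst_offset, kind });
        dst_offset
    }

    /// Called after the render pass is recorded. A non-empty result means
    /// a compute pass has to be submitted before the render pass.
    fn take_pending(&mut self) -> Vec<SanitizeRequest> {
        std::mem::take(&mut self.pending)
    }
}
```

With something like this, the render pass would only ever reference offsets returned by `redirect`, and the list returned by `take_pending` would drive the sanitizing compute dispatch. Buffer growth, reuse across submissions, and alignment requirements are deliberately left out.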