Texture upload is inefficient on Android

RESOLVED FIXED in mozilla67

Status

enhancement
P1
normal
RESOLVED FIXED
9 months ago
5 months ago

People

(Reporter: mstange, Assigned: jnicol)

Tracking

(Blocks 2 bugs)

Trunk
mozilla67

Firefox Tracking Flags

(firefox64 disabled, firefox65 disabled, firefox66 disabled, firefox67 disabled)


Here's a profile from scrolling https://www.cnn.com/ampstories/ in a Fennec build that had WebRender enabled: https://perfht.ml/2Ejv8rQ

This build is from https://treeherder.mozilla.org/#/jobs?repo=try&revision=f07be174b4a02f783ef3e6882b02bfe696e6a427&selectedJob=195831601 .

In the profile, you can see glTexSubImage3D blocking on the GPU during texture upload. We should find a way of uploading textures that does not block.
Flags: needinfo?(jnicol)
We've hit this before.
One hack around it is to use TexImage instead of TexSubImage if we're uploading the whole buffer.
MapBufferRange(Write, Invalidate)+PBO+TexSubImage is probably what we want.
(In reply to Jeff Gilbert [:jgilbert] from comment #1)
> We've hit this before.
> One hack around it is to use TexImage instead of TexSubImage if we're
> uploading the whole buffer.
> MapBufferRange(Write, Invalidate)+PBO+TexSubImage is probably what we want.

Note that, with immutable storage (landing in bug 1496168), I don't think TexImage is allowed.
That's correct.
Priority: -- → P3
Assignee: nobody → jnicol
Blocks: wr-android-mvp
No longer blocks: stage-wr-next
Priority: P3 → P1

Here's another profile showing texture upload problems, from bug 1511731: https://perfht.ml/2R3ylkb

log_gpu_snapshot also shows up there (not sure if that's expected).

WebRender won't be in Android until after 67

We're already using a PBO, but we're hitting a path in the driver that cannot do the upload asynchronously. In the profile, under glTexSubImage3D, you can see function names indicating software copies and lots of waiting.

This is because the stride of the data in the buffer is not a multiple of 256 bytes. If we ensure that the stride (and the offset) are suitably aligned, then the driver can do the upload asynchronously.
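
To make the stride requirement concrete, here is a minimal sketch of the rounding involved. The function names and the ASSUMED_ALIGNMENT constant are hypothetical and not taken from the webrender patch; the 256-byte figure is the one reported in this comment, and applying the same rounding to the offset is an assumption.

```rust
/// Alignment reported in this bug for Adreno; treated as an assumption here,
/// not as a universal constant.
const ASSUMED_ALIGNMENT: usize = 256;

/// Rounds `value` up to the next multiple of `alignment` (must be non-zero).
fn round_up(value: usize, alignment: usize) -> usize {
    assert!(alignment > 0);
    (value + alignment - 1) / alignment * alignment
}

/// Stride in bytes for one row of `width_px` pixels, padded to `alignment`.
fn aligned_stride(width_px: usize, bytes_per_pixel: usize, alignment: usize) -> usize {
    round_up(width_px * bytes_per_pixel, alignment)
}

/// Offset into the buffer at which the next upload could start,
/// assuming the offset needs the same alignment as the stride.
fn aligned_offset(current_offset: usize) -> usize {
    round_up(current_offset, ASSUMED_ALIGNMENT)
}
```

For example, a 100-pixel-wide RGBA8 row is 400 bytes when tightly packed, so aligned_stride(100, 4, 256) pads it to 512, while a 64-pixel-wide row is already 256 bytes and is left unchanged.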

I have a naive implementation of this working, using glBufferSubData individually for each row to ensure the buffer is packed correctly. My profiles show that the time spent in glTexSubImage3D pretty much disappears, and it is now internally a quick hardware copy. However, the multiple calls to glBufferSubData are now slow. I'm working on using glMapBufferRange to eliminate that cost; hopefully all should be good after that.

Flags: needinfo?(jnicol)

Okay, got MapBufferRange working. Here's a profile: https://perfht.ml/2Xcykut

All in all, it looks much better than before. The bulk of the time is now the memcpy into the buffer, and it's tiny compared to before.

I've made a pull request to add the required functions to gleam (https://github.com/servo/gleam/pull/183) and will tidy up my webrender patch and put it up for review.

The gleam changes have been published to crates.io as version 0.6.9.

This provides the functions glMapBufferRange and glUnmapBuffer.

Currently on Android we upload texture data to the webrender texture
cache using a PBO. On Adreno GPUs, however, this upload is still being
done synchronously, and profiles show a lot of time spent waiting in
glTexSubImage3D.

The problem is that the stride of the data in the PBO is not a
multiple of 256 bytes, so the driver is not able to DMA the upload.

This patch ensures that data is laid out optimally in the PBO, using
glMapBufferRange then copying the data line-by-line if required. This
allows the driver to perform the upload asynchronously as intended.

Depends on D20491
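
As a rough illustration of the line-by-line copy described above, the sketch below copies tightly packed rows into a destination whose rows start at multiples of a padded stride. It is not the code from the patch: copy_rows_padded is a hypothetical name, and the plain `dst` slice stands in for the memory that glMapBufferRange would return, so no GL calls are made here.

```rust
/// Copies `rows` rows of `row_bytes` bytes each from a tightly packed `src`
/// into `dst`, placing row `i` at byte offset `i * dst_stride`.
/// `dst` is a stand-in for the mapped PBO memory (an assumption of this sketch).
fn copy_rows_padded(
    src: &[u8],
    row_bytes: usize,
    rows: usize,
    dst: &mut [u8],
    dst_stride: usize,
) {
    assert!(dst_stride >= row_bytes);
    assert!(src.len() >= row_bytes * rows);
    if rows == 0 || row_bytes == 0 {
        return;
    }
    // The final row does not need trailing padding.
    assert!(dst.len() >= dst_stride * (rows - 1) + row_bytes);

    if dst_stride == row_bytes {
        // Source layout already matches: a single bulk copy is enough.
        dst[..row_bytes * rows].copy_from_slice(&src[..row_bytes * rows]);
        return;
    }

    // Otherwise copy row by row, leaving the padding bytes untouched.
    for (src_row, dst_row) in src
        .chunks_exact(row_bytes)
        .zip(dst.chunks_mut(dst_stride))
        .take(rows)
    {
        dst_row[..row_bytes].copy_from_slice(src_row);
    }
}
```

The bulk-copy branch reflects the "if required" wording in the description: when the source stride already satisfies the alignment, the per-row loop is skipped.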

I tested this on a Galaxy S6 (Mali) and found that texture upload is fine even without my patch. So I've made the patch Adreno-specific for now, but it will be easy to add other GPUs/platforms if required.
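
One way such a GPU-specific switch could look is sketched below. This is an assumption for illustration only: pbo_stride_alignment is a hypothetical helper, and matching on the substring "Adreno" in the GL renderer string is a guess at how the check might be done, not a description of the landed patch.

```rust
/// Returns the stride alignment (in bytes) to apply to PBO uploads, or `None`
/// if rows can be left tightly packed.
/// `renderer` is assumed to be the string reported by the GL driver.
fn pbo_stride_alignment(renderer: &str) -> Option<usize> {
    if renderer.contains("Adreno") {
        // 256 bytes is the figure reported earlier in this bug.
        Some(256)
    } else {
        // Other GPUs (e.g. the Mali device tested above) are left unchanged.
        None
    }
}
```

Returning an Option keeps the default path untouched, and supporting another GPU would only mean adding another branch.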

Pushed by jnicol@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dd8f4d598a43
Update gleam to 0.6.9. r=kats
https://hg.mozilla.org/integration/autoland/rev/a9eee2d6d9b8
Ensure PBO texture upload is performed asynchronously on webrender on Adrenos. r=kvark
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67