Closed Bug 1598380 Opened 2 years ago Closed 2 years ago

Driver stalls with PBO texture uploads on Adreno


(Core :: Graphics: WebRender, defect, P3)




Tracking Status
firefox73 --- fixed


(Reporter: jnicol, Assigned: jnicol)


(Regressed 1 open bug)



(1 file)

Here is a profile of zooming in and out of a text-heavy page:

20% of time is spent in TextureUploader::upload(), most of which is in ioctls talking to the driver, and very little copying the actual data.

I believe that the driver is stalling when we attempt to write data to/reallocate the buffer, until the texture upload of its previous data has completed. Our calls to glBufferData() should be convincing the driver to allocate a new buffer rather than reuse the previous one, but that doesn't seem to work on Adreno.

If I force the driver to create a new PBO rather than reusing the old one, by using glGenBuffers() for each upload, then most of the ioctl time goes away:

See Also: → 1599248

I'm not sure my description of what's happening in comment 0 is entirely correct. It certainly does help, but on really bad sites (such as the daily mail, in desktop mode, as mentioned in bug 1599248) the upload time is still massive.

While experimenting, I tried using a single TextureUploader for all uploads to each texture, rather than a separate one per upload. This effectively means we call glBufferData once per frame, with a large size, rather than tens or even hundreds of times. The driver seems to handle this much better! Like, even if I disable the raster-space rounding while zooming, meaning we rerasterize and reupload glyphs every frame, it's still snappy!

Uploading texture data is showing up frequently in profiles on Adreno devices,
especially when zooming on a text-heavy page. Specifically, the time is spent in
glMapBufferRange and glBufferSubData, most likely when internally allocating the
buffer before transferring data in to it.

Currently, we are creating a new PBO, by calling glBufferData(), for each
individual upload region. This change makes it so that we calculate the required
size for all upload regions to a texture, then create single a PBO of the
required size. The entire buffer is then mapped only once, and each individual
upload chunk is written to it. This can require the driver to allocate a large
buffer, sometimes multiple megabytes in size. However, it handles this case much
better than allocating tens or even hundreds of smaller buffers.

An upload chunk may require more space in a PBO than the original CPU-side
buffer, so that the data is aligned correctly for performance or correctness
reasons. Therefore it is the caller of Device.upload_texture()'s responsibility
to call a new function, Device.required_upload_size(), to calculate the required
size beforehand.

Pushed by
Use single PBO per texture for uploads. r=kvark
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73
Regressions: 1603026
Flags: needinfo?(jnicol)
Resolution: FIXED → ---
Target Milestone: mozilla73 → ---

Jamie, can we re-land it in a way that is run-time configurable? I.e. on Android use one big PBO, on other platforms just keep creating a new PBO per upload.

Flags: needinfo?(jnicol)

That is my plan for the short-term, yes

See Also: → 1602550
Attachment #9114577 - Attachment description: Bug 1598380 - Use single PBO per texture for uploads. r?kvark → Bug 1598380 - Use single PBO per texture for uploads. r=kvark
Pushed by
Use single PBO per texture for uploads. r=kvark
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73
Regressions: 1631353
You need to log in before you can comment on or make changes to this bug.