Closed Bug 1685276 Opened 5 years ago Closed 5 years ago

Fix and re-enable partial present on Mali-G77

Categories

(Core :: Graphics: WebRender, defect, P3)

Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
86 Branch
Tracking Status
firefox86 --- fixed

People

(Reporter: jnicol, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

We had reports on github about display corruption on a Mali-G77 device. This was filed on bugzilla as bug 1676474. I didn't have a device which could reproduce, but with a lot of help from the reporter we managed to establish that it was due to partial present, and therefore disabled partial present on Mali-Gxx devices. (As far as we're aware it only affects G77, but we disabled on all Gxx just to be safe.)

This bug tracks figuring out the problem and fixing/working around it, then re-enabling the feature.

The user reported the bug appeared, then went away, then re-appeared. Using mozregression I have found the following ranges:

Bug first appeared - Bug 1675159. We suspected this at the time based on the dates. It is a refactoring of related code. It shouldn't in theory have caused this, but here we are. As we said at the time, it's likely a subtle driver bug which this change uncovered by affecting the orderings of GL calls.

Bug went away - Likely bug 1661528. Again, this doesn't really make much sense, although it is consistent with...

Bug reappeared - Bug 1676390.

The bug went away when I landed a change which caused us to use persistently mapped buffers, and I think most importantly, call glFenceSync midway through the frame (after texture uploads). The bug then reappeared when I moved the glFenceSync call to the end of the frame.

I've been looking at what practical implications the first refactoring had and playing around with the code. I've discovered:

  • The original patch effectively moved the eglSetDamageRegion call to much earlier in the frame, prior to rendering any picture cache tiles. Previously it occured after all offscreen passes, at the very start of composite_simple().
  • Bug 1661528 effectively introduced a call to glFenceSync after uploading the vertex data textures, which is immediately prior to the new location of eglSetDamageRegion. This must force the driver to do some sort of internal flush, which makes the bug disappear. Bug 1676390 removed this, so the bug reappeared.
  • Switching the bound FBO immediately before eglSetDamageRegion also appears to trigger the required flush. glFlush does not, but glFinish does.
  • Rendering to offscreen targets is what seems to put the driver in the bad state:
    • If I move the eglSetDamageRegion call to before the first time we render to any offscreen target, it works again. Currently this is prior to update_texture_cache() in render_impl(), because we use batched texture uploads which glBlitFramebuffers to the texture cache. If I remove this blit (by using glCopyImageSubData, or disabling batched uploads) we can successfully call eglSetDamageRegion up until the GPU cache update. If we disable the scatter update method, we can successfully call eglSetDamageRegion up until the first picture cache tile is rendered.
  • Thankfully, calling eglSetDamageRegion after all of the offscreen passes, immediately before we composite to the main framebuffer, works. This is how we should proceed.

So the plan is to move the eglSetDamageRegion out of calculate_dirty_rects and back to the start of composite_simple. I'll also try to follow up with ARM and see if I can find out any more information.

In bug 1676474 an issue was reported regarding partial present on
Mali-G77 devices. This was introduced in bug 1675159, which refactored
some partial present logic and shifted the order of some OpenGL calls
around. As a precaution, we disabled the feature on all Mali-Gxx
devices.

The bug seems to occur when eglSetDamageRegion is called after
rendering to an offscreen render target (in this case due to texture
cache or GPU cache updates), but without the driver being flushed in
some way. This appears to be a bug in the Mali driver.

This patch moves the eglSetDamageRegion call back to its original
location -- after all offscreen render targets have been rendered,
immediately before rendering to the main framebuffer -- which fixes
the issue. It also re-enables the feature on all Mali-Gxx devices.

Pushed by jnicol@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/2386225aace8 Fix and re-enable partial present on Mali-Gxx devices. r=kvark
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 86 Branch
Regressions: 1709548
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: