Closed Bug 1773128 Opened 3 years ago Closed 3 years ago

Crash in [@ libGLESv2_POWERVR_ROGUE.so@0x2252e]

Categories

(Core :: Graphics: WebRender, defect)

Unspecified
Android
defect

Tracking

RESOLVED FIXED
103 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox101 --- wontfix
firefox102 --- fixed
firefox103 --- fixed

People

(Reporter: RyanVM, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Attachments

(1 file)

Long-running Fenix topcrash that seems to have started spiking in 98, all on Kindle Fire HD devices from what I can tell. Something we can fix with a driver blocklist maybe?

Crash report: https://crash-stats.mozilla.org/report/index/44fa5a77-4da5-4da5-86b6-e2b2f0220607

Reason: SIGSEGV / SEGV_MAPERR

Top 10 frames of crashing thread:

0 libGLESv2_POWERVR_ROGUE.so libGLESv2_POWERVR_ROGUE.so@0x0002252e 
1 libGLESv2_POWERVR_ROGUE.so libGLESv2_POWERVR_ROGUE.so@0x000225b3 
2 dalvik-non moving space (deleted) dalvik-non moving space @0x00167002 
3 libxul.so __clear_cache 
4 libxul.so __clear_cache 
5 libxul.so __clear_cache 
6 libxul.so __clear_cache 
7 libxul.so webrender::device::gl::UploadPBOPool::end_frame gfx/wr/webrender/src/device/gl.rs:4229
8 libxul.so webrender::renderer::Renderer::render_impl gfx/wr/webrender/src/renderer/mod.rs:2179
9 libxul.so webrender::renderer::Renderer::update gfx/wr/webrender/src/renderer/mod.rs:1594
Component: Graphics → Graphics: WebRender

It looks like this got much more frequent in version 98.

The common proto signatures are:

libGLESv2_POWERVR_ROGUE.so@0x2252e | libGLESv2_POWERVR_ROGUE.so@0x225b3 | arena_t::Malloc | BaseAllocator::malloc | malloc | <webrender::profiler::ProfilerFrame as core::convert::From<webrender::renderer::FullFrameStats>>::from | webrender::renderer::Renderer::render_impl | webrender::renderer::Renderer::update

and

libGLESv2_POWERVR_ROGUE.so@0x2252e | libGLESv2_POWERVR_ROGUE.so@0x225b3 | __clear_cache | __clear_cache | __clear_cache | __clear_cache | webrender::device::gl::UploadPBOPool::end_frame | webrender::renderer::Renderer::render_impl | webrender::renderer::Renderer::update

It's always hard to know how much to trust these on Android. The former (which is the most common) certainly seems like nonsense. But perhaps we are indeed crashing in the glFenceSync call in UploadPBOPool::end_frame().

We can shift these users to software webrender, but I'm very curious why the crash numbers picked up in 98.
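
For readers unfamiliar with why a fence is involved at all, here is a minimal sketch of fence-guarded buffer recycling. It is a simplified model and not WebRender's actual UploadPBOPool: the `Pbo`, `Fence` and `PboPool` types are hypothetical, no GL calls are made, and the `gpu_completed_frame` argument stands in for asking the driver whether a sync object has signalled. In real code the marked step in `end_frame` is where glFenceSync would be called.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a pixel buffer object handle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pbo(pub u32);

/// Hypothetical stand-in for a GL sync object: it only records the frame
/// on which it was inserted into the command stream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Fence {
    pub frame: u64,
}

#[derive(Default)]
pub struct PboPool {
    /// Buffers that are safe for the CPU to write into.
    available: Vec<Pbo>,
    /// Buffers used during the current frame, not yet guarded by a fence.
    returned: Vec<Pbo>,
    /// Buffers the GPU may still be reading, grouped by the guarding fence.
    waiting: VecDeque<(Fence, Vec<Pbo>)>,
    next_id: u32,
}

impl PboPool {
    /// Reuse a free buffer if there is one, otherwise create a new handle.
    pub fn acquire(&mut self) -> Pbo {
        if let Some(pbo) = self.available.pop() {
            return pbo;
        }
        let pbo = Pbo(self.next_id);
        self.next_id += 1;
        pbo
    }

    /// Hand a buffer back once its upload has been issued.
    pub fn release(&mut self, pbo: Pbo) {
        self.returned.push(pbo);
    }

    /// End of frame: guard everything returned this frame with one fence.
    /// (A real implementation would call glFenceSync at this point.)
    pub fn end_frame(&mut self, frame: u64) {
        if !self.returned.is_empty() {
            let pbos = std::mem::take(&mut self.returned);
            self.waiting.push_back((Fence { frame }, pbos));
        }
    }

    /// Start of frame: recycle buffers whose fence has signalled.
    /// `gpu_completed_frame` models the result of querying the fences.
    pub fn begin_frame(&mut self, gpu_completed_frame: u64) {
        loop {
            let ready = match self.waiting.front() {
                Some((fence, _)) => fence.frame <= gpu_completed_frame,
                None => false,
            };
            if !ready {
                break;
            }
            if let Some((_, pbos)) = self.waiting.pop_front() {
                self.available.extend(pbos);
            }
        }
    }

    pub fn available_len(&self) -> usize {
        self.available.len()
    }
}
```

The point of the fence is that a buffer released in frame N is not handed out again until the GPU has finished the commands of frame N, so a driver bug in fence creation sits directly on the upload path.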

Blocks: wr-powervr

In the super search for libGLESv2_POWERVR_ROGUE.so there are some other signatures with much lower numbers, but they either seem to come from UploadPBOPool::end_frame or have (somewhat) spiked around the same time, so I think they are related.

If we include these we see some non-Amazon devices too; I'm guessing the Fire is just the most common device with this GPU.

The affected GPUs appear to be:

  • PowerVR Rogue GX6250
  • PowerVR Rogue G6430
  • PowerVR Rogue G6200

I'd rather avoid blocking all users with these GPUs if we don't have to.

If we limit the super search to crashes that contain UploadPBOPool in the proto signature, we get this

It's mostly old Android versions, but there are a couple of crashes on SDK 28. We could block only old Android versions, and just accept a few crashes for the devices on newer versions. Or we could block by driver version: we have 3830101, 3283119, 3443629, and 3573678 in that list.
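
As an illustration of the proto-signature filter mentioned above, here is a small sketch of doing the same check offline on exported signatures. It assumes frames are separated by " | ", as in the signatures quoted in this bug; the function names `frames` and `mentions_symbol` are hypothetical and not part of any crash-stats tooling.

```rust
/// Split a proto signature into its individual frames.
/// Assumes frames are separated by " | ", as in the signatures quoted in this bug.
pub fn frames(proto_signature: &str) -> Vec<&str> {
    proto_signature.split(" | ").map(str::trim).collect()
}

/// Return true if any frame of the proto signature mentions `symbol`.
pub fn mentions_symbol(proto_signature: &str, symbol: &str) -> bool {
    frames(proto_signature)
        .iter()
        .any(|frame| frame.contains(symbol))
}
```

Applied to the two common signatures quoted earlier, `mentions_symbol(sig, "UploadPBOPool")` would keep only the second one, which matches the suspicion that the malloc-based signature is unreliable.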

There is a crash in glFenceSync (used to recycle PBOs when uploading
textures) affecting several PowerVR GPUs. Block webrender on the known
bad combinations of GPU and driver versions.
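
To make the idea of blocking on GPU and driver combinations concrete, here is a minimal sketch of such a check. It is not the landed patch: `choose_backend` and the `Backend` enum are hypothetical names, and pairing every GPU listed in this bug with every driver build listed in this bug is an assumption, since the bug does not say which builds go with which GPU. Treating an unknown driver build as not blocked is also an assumption.

```rust
/// GPU names reported in this bug.
const AFFECTED_GPUS: [&str; 3] = [
    "PowerVR Rogue GX6250",
    "PowerVR Rogue G6430",
    "PowerVR Rogue G6200",
];

/// Driver build numbers seen in the crash reports for this bug.
const BAD_DRIVER_BUILDS: [u32; 4] = [3830101, 3283119, 3443629, 3573678];

#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    HardwareWebRender,
    SoftwareWebRender,
}

/// Hypothetical decision function: fall back to software rendering only when
/// both the GPU and the driver build are on the known-bad lists.
/// Assumption: any listed GPU combined with any listed build counts as bad.
pub fn choose_backend(gl_renderer: &str, driver_build: Option<u32>) -> Backend {
    let gpu_affected = AFFECTED_GPUS
        .iter()
        .any(|gpu| gl_renderer.contains(*gpu));
    // Assumption: an unknown driver build is not blocked.
    let driver_bad = driver_build.map_or(false, |build| BAD_DRIVER_BUILDS.contains(&build));

    if gpu_affected && driver_bad {
        Backend::SoftwareWebRender
    } else {
        Backend::HardwareWebRender
    }
}
```

Requiring both conditions keeps users on the same GPUs with other driver builds on hardware rendering, which is the trade-off discussed in the comments above.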

Assignee: nobody → jnicol
Status: NEW → ASSIGNED
Pushed by jnicol@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5329d1b2f08b Block webrender on several PowerVR GPUs due to crash. r=gfx-reviewers,nical
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 103 Branch

The patch landed in nightly and beta is affected.
:jnicol, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(jnicol)

Comment on attachment 9280235 [details]
Bug 1773128 - Block webrender on several PowerVR GPUs due to crash. r?#gfx-reviewers

Beta/Release Uplift Approval Request

  • User impact if declined: Frequent crashes for users on affected devices
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Moves users on buggy devices to software webrender. Only a small population is affected.
  • String changes made/needed:
  • Is Android affected?: Yes
Flags: needinfo?(jnicol)
Attachment #9280235 - Flags: approval-mozilla-beta?

FWIW it looks like Chrome has also encountered issues with EGL fences on PowerVR: https://source.chromium.org/chromium/chromium/src/+/main:gpu/config/gpu_driver_bug_list.json;l=703-725;drc=56143ba502572de50c99373a79220f9a83b209e4

I don't think we have the resources to work around the broken fences for such a low population, so we'll just use software webrender. But if we one day need to implement such a workaround for a more common device, then we may be able to use it here too.

Comment on attachment 9280235 [details]
Bug 1773128 - Block webrender on several PowerVR GPUs due to crash. r?#gfx-reviewers

Approved for 102 beta 8, thanks.

Attachment #9280235 - Flags: approval-mozilla-beta? → approval-mozilla-beta+