Crash in [@ mozilla::wr::RenderMacIOSurfaceTextureHost::GetSize]
Categories
(Core :: Graphics: WebRender, defect)
Tracking
Release | Tracking | Status |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox-esr102 | --- | disabled |
firefox100 | --- | unaffected |
firefox101 | --- | unaffected |
firefox102 | - | disabled |
firefox103 | + | disabled |
firefox104 | --- | disabled |
firefox105 | + | verified |
People
(Reporter: aryx, Assigned: sotaro)
References
(Blocks 2 open bugs, Regression)
Details
(Keywords: crash, regression, topcrash)
Crash Data
Attachments
(3 files)
11 crashes from 6 installations, all with Firefox 102.0a1 (on macOS obviously), first reported build ID is 20220505185614.
Crash report: https://crash-stats.mozilla.org/report/index/4b88927e-489d-4eb6-be12-5024c0220512
Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Top 10 frames of crashing thread:
0 XUL mozilla::wr::RenderMacIOSurfaceTextureHost::GetSize const gfx/webrender_bindings/RenderMacIOSurfaceTextureHost.cpp:68
1 XUL mozilla::layers::NativeLayerCA::AttachExternalImage gfx/layers/NativeLayerCA.mm:779
2 XUL mozilla::wr::RenderCompositorNative::AttachExternalImage gfx/webrender_bindings/RenderCompositorNative.cpp:321
3 XUL webrender::renderer::Renderer::update_native_surfaces gfx/wr/webrender/src/renderer/mod.rs:4688
4 XUL webrender::renderer::Renderer::render_impl gfx/wr/webrender/src/renderer/mod.rs:1976
5 XUL webrender::renderer::Renderer::render gfx/wr/webrender/src/renderer/mod.rs:1737
6 XUL wr_renderer_render gfx/webrender_bindings/src/bindings.rs:616
7 XUL mozilla::wr::RenderThread::UpdateAndRender gfx/webrender_bindings/RenderThread.cpp:537
8 XUL mozilla::wr::RenderThread::HandleFrameOneDoc gfx/webrender_bindings/RenderThread.cpp:387
9 XUL mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void xpcom/threads/nsThreadUtils.h:1200
Comment 1•3 years ago
I can reproduce this crash with SWGL enabled after accelerated Canvas2D bug 1773712 landed. However, this crash bug is not a regression from that Canvas2D bug because the Canvas2D change landed today and this crash bug was filed a month ago.
I bisected the crash with my STR to this pushlog with Canvas2D bug 1773712:
Steps to reproduce
- Enable SWGL (gfx.webrender.software = true) (UPDATED: and gfx.canvas.accelerated = true)
- Load https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary
- Scroll down to the "GOVERNOR" section.
- Click on the "Show Chart" button to the right of the "GOVERNOR" section title.
Result
Crash bp-842adff9-37bc-4858-b749-f9cd30220612 in [@ mozilla::wr::RenderMacIOSurfaceTextureHost::GetSize ]
Comment 2•3 years ago
[Tracking Requested - why for this release]:
Kelsey, this crash is a regression from canvas color space bug 1703654.
Since accelerated Canvas2D bug 1773712 just enabled the gfx.canvas.accelerated pref, I bisected my crash STR again with that pref force-enabled and landed on this earlier pushlog for canvas color space bug 1703654:
Comment 3•3 years ago
The crashes are gated to Nightly builds, so there is no need to track this for 102 or ESR 102. We should track it for Nightly, since this is a top crasher on macOS, and we should decide quickly whether to back out bug 1773712 on Nightly.
Comment 4•3 years ago
Requesting a backout of bug 1773712 until the crash is investigated.
Comment 5•3 years ago
Setting 103 to disabled; bug 1773712 was backed out of central.
Comment 6•3 years ago
The STR I found required the accelerated canvas be enabled (bug 1773712).
But this crash signature started in Nightly 102 before the accelerated canvas was enabled and we still have some crash reports from Beta (and DevEdition) 102. Example: bp-cf91cbe1-d3c6-4c54-ba60-59fc70220613
So there may be other STR that hit this same crash without the accelerated canvas. I bisected the original crash to bug 1703654, which landed in Nightly 102.
Comment 7•3 years ago
I believe this is from:
wr::RenderMacIOSurfaceTextureHost* texture = aExternalImage->AsRenderMacIOSurfaceTextureHost();
MOZ_ASSERT(texture);  // <- I bet this is the issue, and that we would crash here if we did MOZ_RELEASE_ASSERT
mTextureHost = texture;
gfx::IntSize oldSize = mSize;
mSize = texture->GetSize(0);  // <- crashes here
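To make the suspicion above concrete, here is a minimal C++ sketch of an explicit null check that still runs in release builds, where a debug-only assertion such as MOZ_ASSERT is compiled out. The types below (IntSize, RenderTextureHost, RenderMacIOSurfaceTextureHost) and the helper TryGetExternalImageSize are simplified, hypothetical stand-ins rather than the real Gecko classes, and this is not claimed to be the patch that landed.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the Gecko types named in the comment.
struct IntSize {
  int width;
  int height;
};

struct RenderMacIOSurfaceTextureHost {
  IntSize size;
  IntSize GetSize(int /* plane */) const { return size; }
};

struct RenderTextureHost {
  // Null when the host is not backed by an IOSurface (e.g. a shmem host).
  RenderMacIOSurfaceTextureHost* ioSurfaceHost;
  RenderMacIOSurfaceTextureHost* AsRenderMacIOSurfaceTextureHost() const {
    return ioSurfaceHost;
  }
};

// Writes the image size to *outSize and returns true only when the external
// image really is an IOSurface host. Unlike a debug-only assertion, this
// check is still present in release builds.
bool TryGetExternalImageSize(const RenderTextureHost* image, IntSize* outSize) {
  if (!image || !outSize) {
    return false;
  }
  RenderMacIOSurfaceTextureHost* texture =
      image->AsRenderMacIOSurfaceTextureHost();
  if (!texture) {
    return false;  // Wrong host type: skip instead of dereferencing null.
  }
  *outSize = texture->GetSize(0);
  return true;
}

// Usage example: a non-IOSurface host is rejected instead of crashing.
void ExampleRejectsNonIOSurfaceHost() {
  RenderTextureHost shmemLike = {nullptr};
  IntSize size = {0, 0};
  assert(!TryGetExternalImageSize(&shmemLike, &size));
}
```

A check like this only avoids the null dereference; it does not explain why a non-IOSurface host reaches this code path in the first place.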
Comment 8•3 years ago
(In reply to Chris Peterson [:cpeterson] from comment #1)
I can reproduce this crash with SWGL enabled after accelerated Canvas2D bug 1773712 landed. However, this crash bug is not a regression from that Canvas2D bug because the Canvas2D change landed today and this crash bug was filed a month ago.
I bisected the crash with my STR to this pushlog with Canvas2D bug 1773712:
Steps to reproduce
- Enable SWGL (gfx.webrender.software = true).
- Load https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary
- Scroll down to the "GOVERNOR" section.
- Click on the "Show Chart" button to the right of the "GOVERNOR" section title.
Result
Crash bp-842adff9-37bc-4858-b749-f9cd30220612 in [@ mozilla::wr::RenderMacIOSurfaceTextureHost::GetSize ]
I tried this STR, setting gfx.webrender.software and gfx.canvas.accelerated to true, but I don't get a crash. Is there anything else I am missing to repro this?
Comment 9•3 years ago
My hunch is that the SharedSurfaceIO is not keeping the MacIOSurface alive long enough with out-of-process WebGL. It exports a SurfaceDescriptor from the GPU process to the content process, then back to the GPU process for WebRender. In that gap, for some reason, the SharedSurfaceIO goes away, so when the MacIOSurfaceTextureHostOGL goes to create the surface from the SurfaceDescriptor for WebRender, it is already gone. We end up with a null IOSurface that causes these crashes further down the line.
I would need a more reliable repro to test this, but Sotaro is working on something to fix a similar problem in D3D11 that might fix the issue here as well if this is actually the problem.
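The lifetime gap described in this hunch can be illustrated with a small, self-contained C++ sketch. Everything here is hypothetical: Surface, SurfaceDescriptor, and SurfaceRegistry are invented stand-ins, and the weak reference in the registry is an assumption chosen only to model "the descriptor does not keep the surface alive". It is not how Gecko or IOSurface is actually implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-ins for illustration only.
struct Surface {
  int width;
  int height;
};

struct SurfaceDescriptor {
  uint32_t id;  // Identifies a surface but holds no reference to it.
};

class SurfaceRegistry {
 public:
  // Hands out a descriptor; the registry stores only a weak reference,
  // so the producer's shared_ptr is the sole owner of the surface.
  SurfaceDescriptor Export(const std::shared_ptr<Surface>& surface) {
    uint32_t id = mNextId++;
    mSurfaces[id] = surface;
    return SurfaceDescriptor{id};
  }

  // Returns null if the id is unknown or the surface was already released.
  std::shared_ptr<Surface> Lookup(const SurfaceDescriptor& desc) const {
    auto it = mSurfaces.find(desc.id);
    if (it == mSurfaces.end()) {
      return nullptr;
    }
    return it->second.lock();
  }

 private:
  uint32_t mNextId = 1;
  std::map<uint32_t, std::weak_ptr<Surface>> mSurfaces;
};

// Usage example: the descriptor outlives the surface it refers to.
void ExampleDescriptorOutlivesSurface() {
  SurfaceRegistry registry;
  std::shared_ptr<Surface> producer =
      std::make_shared<Surface>(Surface{640, 480});
  SurfaceDescriptor desc = registry.Export(producer);
  producer.reset();  // Producer goes away while the descriptor is in flight.
  assert(registry.Lookup(desc) == nullptr);
}
```

Under these assumptions, a consumer that looks up the descriptor after the producer has released the surface gets null back and must handle that case.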
Comment 10•3 years ago
(In reply to Lee Salzman [:lsalzman] from comment #8)
I tried this STR, setting gfx.webrender.software and gfx.canvas.accelerated to true, but I don't get a crash. Is there anything else I am missing to repro this?
I don't know of any other steps or settings missing from my STR. I'll attach my about:support info. Maybe there is something peculiar about my hardware (a 2015 MacBook Pro).
Summarizing my findings:
My STR crashes on this build with a clean profile when gfx.webrender.software and gfx.canvas.accelerated are true:
mach mozregression --launch 5c16ac03eca7a366057cb2ac7a6f376f98dd8bf0 --pref "gfx.webrender.software:true" "gfx.canvas.accelerated:true" -a "https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary"
But it doesn't crash on the same build if either gfx.webrender.software or gfx.canvas.accelerated is false:
mach mozregression --launch 5c16ac03eca7a366057cb2ac7a6f376f98dd8bf0 --pref "gfx.webrender.software:true" "gfx.canvas.accelerated:false" -a "https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary"
mach mozregression --launch 5c16ac03eca7a366057cb2ac7a6f376f98dd8bf0 --pref "gfx.webrender.software:false" "gfx.canvas.accelerated:true" -a "https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary"
Comment 11•3 years ago
Set release status flags based on info from the regressing bug 1703654
Comment 12•3 years ago
[Tracking Requested - why for this release]:
Lee, I can reproduce this SWGL crash again (using the STR in comment #1) now that gfx.canvas.accelerated has been re-enabled in bug 1773712.
Comment 13•3 years ago
What's the target release for shipping accelerated canvas, Lee?
Comment 14•3 years ago
I hit this saving an edited (pdfjs.annotationEditorMode=0) PDF to PDF in yesterday(?)'s nightly on macOS.
Comment 15•3 years ago
I've hit this four times while using Google Maps over the last two days.
Comment 16•3 years ago
Sotaro, any ideas here?
Comment 17•3 years ago
100% repro visiting starlink.sx
Comment 18•3 years ago
Comment 19•3 years ago
Comment 20•3 years ago
bugherder
Comment 21•3 years ago
This patch should at least avoid the crashes while we're investigating this bug.
Comment 22•3 years ago
We mark every external image in an async image pipeline as preferring a compositor surface:
https://searchfox.org/mozilla-central/rev/6a37a2ab9328bec6a29f688d1b2fba6974d34905/gfx/layers/wr/AsyncImagePipelineManager.cpp#453
There are additional requirements that must be met, so I am guessing that is why or part of why it doesn't always trip:
https://searchfox.org/mozilla-central/rev/43ba67391e71c57a14420e554e9d381543292611/gfx/wr/webrender/src/picture.rs#2520
As such, I don't think there is any guarantee we will get the expected type:
https://searchfox.org/mozilla-central/rev/43ba67391e71c57a14420e554e9d381543292611/gfx/layers/NativeLayerCA.mm#805
Should we be performing more checks before we set the flag? Or is it the responsibility of the compositing code to do the check?
Comment 23•3 years ago (Assignee)
The problem seems to be in the canUpdate check in AsyncImagePipelineManager::UpdateImageKeys(). It could not detect a TextureHost change from MacIOSurfaceTextureHostOGL to ShmemTextureHost when the format and size stayed the same.
This happened when the accelerated canvas fell back to the software canvas.
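A minimal C++ sketch of the kind of check described here, under stated assumptions: TextureInfo, HostKind, Format, and both functions are hypothetical names invented for illustration, and the second function is only one possible shape for a stricter check, not the patch that actually landed.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real TextureHost classes carry far more state.
enum class HostKind { MacIOSurface, Shmem };
enum class Format { B8G8R8A8, R8G8B8A8 };

struct TextureInfo {
  HostKind kind;
  Format format;
  int width;
  int height;
};

// Compares only format and size, so a MacIOSurface -> Shmem switch with an
// identical format and size is treated as an in-place update.
bool CanUpdateByFormatAndSize(const TextureInfo& prev, const TextureInfo& next) {
  return prev.format == next.format && prev.width == next.width &&
         prev.height == next.height;
}

// Additionally requires the backing host kind to match, so a change of host
// type is not treated as an in-place update.
bool CanUpdateByKindFormatAndSize(const TextureInfo& prev,
                                  const TextureInfo& next) {
  return prev.kind == next.kind && CanUpdateByFormatAndSize(prev, next);
}

// Usage example: accelerated canvas falls back to software canvas with the
// same format and size.
void ExampleFallbackIsDetected() {
  TextureInfo accelerated = {HostKind::MacIOSurface, Format::B8G8R8A8, 300, 150};
  TextureInfo software = {HostKind::Shmem, Format::B8G8R8A8, 300, 150};
  assert(CanUpdateByFormatAndSize(accelerated, software));       // change missed
  assert(!CanUpdateByKindFormatAndSize(accelerated, software));  // change seen
}
```

With a format-and-size-only comparison, downstream code can keep assuming the old host type, which matches the null result from AsRenderMacIOSurfaceTextureHost() discussed in comment 7.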
Comment 24•3 years ago (Assignee)
Comment 25•3 years ago
Comment 26•3 years ago
bugherder
Comment 27•3 years ago
Reproduced this issue on an affected Nightly build from 2022-05-13 using the STR from Comment 1, on macOS 10.15.
Verified as fixed on Firefox 105.0b5 (20220830185924) on the above OS.