Crash in [@ mozilla::wr::RenderMacIOSurfaceTextureHost::GetSize]
Categories
(Core :: Graphics: WebRender, defect)
Tracking
Release | Tracking | Status |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox-esr102 | --- | disabled |
firefox100 | --- | unaffected |
firefox101 | --- | unaffected |
firefox102 | - | disabled |
firefox103 | + | disabled |
firefox104 | --- | disabled |
firefox105 | + | verified |
People
(Reporter: aryx, Assigned: sotaro)
References
(Blocks 2 open bugs, Regression)
Details
(Keywords: crash, regression, topcrash)
Crash Data
Attachments
(3 files)
11 crashes from 6 installations, all with Firefox 102.0a1 (on macOS obviously), first reported build ID is 20220505185614.
Crash report: https://crash-stats.mozilla.org/report/index/4b88927e-489d-4eb6-be12-5024c0220512
Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Top 10 frames of crashing thread:
0 XUL mozilla::wr::RenderMacIOSurfaceTextureHost::GetSize const gfx/webrender_bindings/RenderMacIOSurfaceTextureHost.cpp:68
1 XUL mozilla::layers::NativeLayerCA::AttachExternalImage gfx/layers/NativeLayerCA.mm:779
2 XUL mozilla::wr::RenderCompositorNative::AttachExternalImage gfx/webrender_bindings/RenderCompositorNative.cpp:321
3 XUL webrender::renderer::Renderer::update_native_surfaces gfx/wr/webrender/src/renderer/mod.rs:4688
4 XUL webrender::renderer::Renderer::render_impl gfx/wr/webrender/src/renderer/mod.rs:1976
5 XUL webrender::renderer::Renderer::render gfx/wr/webrender/src/renderer/mod.rs:1737
6 XUL wr_renderer_render gfx/webrender_bindings/src/bindings.rs:616
7 XUL mozilla::wr::RenderThread::UpdateAndRender gfx/webrender_bindings/RenderThread.cpp:537
8 XUL mozilla::wr::RenderThread::HandleFrameOneDoc gfx/webrender_bindings/RenderThread.cpp:387
9 XUL mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void xpcom/threads/nsThreadUtils.h:1200
Comment 1•8 months ago
I can reproduce this crash with SWGL enabled after accelerated Canvas2D bug 1773712 landed. However, this crash bug is not a regression from that Canvas2D bug because the Canvas2D change landed today and this crash bug was filed a month ago.
I bisected the crash with my STR to this pushlog with Canvas2D bug 1773712:
Steps to reproduce
- Enable SWGL (gfx.webrender.software = true) (UPDATED: and gfx.canvas.accelerated = true)
- Load https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary
- Scroll down to the "GOVERNOR" section.
- Click on the "Show Chart" button to the right of the "GOVERNOR" section title.
Result
Crash bp-842adff9-37bc-4858-b749-f9cd30220612 in [@ mozilla::wr::RenderMacIOSurfaceTextureHost::GetSize ]
Comment 2•8 months ago
[Tracking Requested - why for this release]:
Kelsey, this crash is a regression from canvas color space bug 1703654.
Since accelerated Canvas2D bug 1773712 just enabled the gfx.canvas.accelerated pref, I bisected my crash STR again with that pref force-enabled and landed on this earlier pushlog for canvas color space bug 1703654:
Comment 3•8 months ago
The crashes are limited to Nightly builds, so there is no need to track for 102 and ESR 102. We should track for Nightly, though, because this is a top crasher on macOS, and we should decide quickly whether to back out bug 1773712 on Nightly.
Comment 4•8 months ago
Requesting a backout of Bug 1773712 until the crash is investigated
Comment 5•8 months ago
Setting 103 to disabled; bug 1773712 was backed out of central.
Comment 6•8 months ago
The STR I found required the accelerated canvas be enabled (bug 1773712).
But this crash signature started in Nightly 102 before the accelerated canvas was enabled and we still have some crash reports from Beta (and DevEdition) 102. Example: bp-cf91cbe1-d3c6-4c54-ba60-59fc70220613
So there may be other STR that hit this same crash without the accelerated canvas. I bisected the original crash to bug 1703654, which landed in Nightly 102.
Comment 7•8 months ago
I believe this is from:
wr::RenderMacIOSurfaceTextureHost* texture = aExternalImage->AsRenderMacIOSurfaceTextureHost();
MOZ_ASSERT(texture);  // I bet this is the issue, and that we would crash here if we did MOZ_RELEASE_ASSERT
mTextureHost = texture;
gfx::IntSize oldSize = mSize;
mSize = texture->GetSize(0);  // crashes here
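To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the Gecko code: `RenderTextureHost`, `RenderIOSurfaceHost`, `RenderShmemHost`, and this `AttachExternalImage` signature are hypothetical stand-ins. The assumption is that the downcast helper returns null for a host of a different type and that a debug-only assertion is compiled out of release builds, so only an explicit runtime check prevents the null dereference.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real Gecko types; names are illustrative only.
struct IntSize {
  int width;
  int height;
};

struct RenderIOSurfaceHost;

struct RenderTextureHost {
  virtual ~RenderTextureHost() = default;
  // Returns nullptr unless the host is IOSurface-backed, mirroring the
  // As...TextureHost() downcast pattern in the snippet above.
  virtual RenderIOSurfaceHost* AsIOSurfaceHost() { return nullptr; }
};

struct RenderIOSurfaceHost : RenderTextureHost {
  explicit RenderIOSurfaceHost(IntSize aSize) : mSize(aSize) {}
  RenderIOSurfaceHost* AsIOSurfaceHost() override { return this; }
  IntSize GetSize() const { return mSize; }

 private:
  IntSize mSize;
};

// A host of a different kind: the downcast helper returns nullptr for it.
struct RenderShmemHost : RenderTextureHost {};

// Writes the texture size to aOutSize and returns true when the host has the
// expected type; returns false instead of dereferencing a null pointer.
bool AttachExternalImage(RenderTextureHost* aExternalImage, IntSize* aOutSize) {
  RenderIOSurfaceHost* texture =
      aExternalImage ? aExternalImage->AsIOSurfaceHost() : nullptr;
  if (!texture) {
    // A debug-only assertion would not fire in a release build, so the
    // caller has to handle this case at runtime.
    return false;
  }
  *aOutSize = texture->GetSize();
  return true;
}
```

With this shape, passing a `RenderShmemHost` makes the function return false rather than crash. What the caller should do next (skip the compositor surface, fall back to another path) is a separate design question that this sketch does not answer.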
Comment 8•8 months ago
(In reply to Chris Peterson [:cpeterson] from comment #1)
I tried this STR, setting gfx.webrender.software and gfx.canvas.accelerated to true, but I don't get a crash. Is there anything else I am missing to repro this?
Comment 9•8 months ago
My hunch is that the SharedSurfaceIO is not keeping the MacIOSurface alive long enough with out-of-process WebGL. It exports a SurfaceDescriptor from the GPU process to the content process, and then back to the GPU process for WebRender. If the SharedSurfaceIO goes away in that gap for some reason, then by the time the MacIOSurfaceTextureHostOGL tries to create the surface from the SurfaceDescriptor for WebRender, it is already gone, and we end up with a null IOSurface that causes these crashes further down the line.
I would need a more reliable repro to test this, but Sotaro is working on something to fix a similar problem in D3D11 that might fix the issue here as well if this is actually the problem.
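The lifetime hypothesis above can be sketched in isolation. This is only an illustration of the suspected pattern, not the actual SharedSurfaceIO or MacIOSurface code: `Surface`, `SurfaceDescriptor`, and `SurfaceRegistry` are hypothetical names, and the assumption is that the descriptor carries an ID without ownership, so a lookup can come back null once the producer has released the surface.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for an IOSurface-like buffer.
struct Surface {
  int width;
  int height;
};

// Hypothetical descriptor: it carries only an ID, not ownership.
struct SurfaceDescriptor {
  uint32_t id;
};

// Hypothetical registry that resolves IDs without keeping surfaces alive.
class SurfaceRegistry {
 public:
  SurfaceDescriptor Export(const std::shared_ptr<Surface>& aSurface) {
    uint32_t id = mNextId++;
    mSurfaces[id] = aSurface;  // stored as weak_ptr: no ownership is taken
    return SurfaceDescriptor{id};
  }

  // Returns nullptr if the ID is unknown or the surface was already released.
  std::shared_ptr<Surface> Lookup(const SurfaceDescriptor& aDesc) const {
    auto it = mSurfaces.find(aDesc.id);
    if (it == mSurfaces.end()) {
      return nullptr;
    }
    return it->second.lock();
  }

 private:
  uint32_t mNextId = 1;
  std::unordered_map<uint32_t, std::weak_ptr<Surface>> mSurfaces;
};
```

In this sketch, a consumer that calls `Lookup` after the producer dropped its last reference gets a null pointer. Holding a strong reference for as long as the descriptor is in flight is one way to close that gap, assuming the hypothesis is correct.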
Comment 10•8 months ago
(In reply to Lee Salzman [:lsalzman] from comment #8)
I don't know of any other steps or settings missing from my STR. I'll attach my about:support info. Maybe there is something peculiar about my hardware (a 2015 MacBook Pro).
Summarizing my findings:
My STR crashes on this build with a clean profile when gfx.webrender.software and gfx.canvas.accelerated are true:
mach mozregression --launch 5c16ac03eca7a366057cb2ac7a6f376f98dd8bf0 --pref "gfx.webrender.software:true" "gfx.canvas.accelerated:true" -a "https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary"
But it doesn't crash on the same build if either gfx.webrender.software or gfx.canvas.accelerated is false:
mach mozregression --launch 5c16ac03eca7a366057cb2ac7a6f376f98dd8bf0 --pref "gfx.webrender.software:true" "gfx.canvas.accelerated:false" -a "https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary"
mach mozregression --launch 5c16ac03eca7a366057cb2ac7a6f376f98dd8bf0 --pref "gfx.webrender.software:false" "gfx.canvas.accelerated:true" -a "https://results.enr.clarityelections.com/CA/Contra_Costa/114138/web.285569/#/summary"
Comment 11•7 months ago
Set release status flags based on info from the regressing bug 1703654
Comment 12•6 months ago
[Tracking Requested - why for this release]:
Lee, I can reproduce this SWGL crash again (using the STR in comment #1) now that gfx.canvas.accelerated has been re-enabled in bug 1773712.
Comment 13•6 months ago
What's the target release for shipping accelerated canvas, Lee?
Comment 14•6 months ago
I hit this saving an edited (pdfjs.annotationEditorMode=0) PDF to PDF in yesterday(?)'s nightly on macOS.
Comment 15•6 months ago
I've hit this four times while using Google Maps over the last two days.
Comment 16•6 months ago
Sotaro, any ideas here?
Comment 17•6 months ago
100% repro visiting starlink.sx
Comment 18•6 months ago
Comment 19•6 months ago
Pushed by lsalzman@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0f1fe5d29884 Check for null texture host. r=aosmond
Comment 20•6 months ago
bugherder
Comment 21•6 months ago
This patch should at least avoid the crashes while we're investigating this bug.
Comment 22•6 months ago
We mark every external image in an async image pipeline as preferring a compositor surface:
https://searchfox.org/mozilla-central/rev/6a37a2ab9328bec6a29f688d1b2fba6974d34905/gfx/layers/wr/AsyncImagePipelineManager.cpp#453
There are additional requirements that must be met, so I am guessing that is why or part of why it doesn't always trip:
https://searchfox.org/mozilla-central/rev/43ba67391e71c57a14420e554e9d381543292611/gfx/wr/webrender/src/picture.rs#2520
As such, I don't think there is any guarantee we will get the expected type:
https://searchfox.org/mozilla-central/rev/43ba67391e71c57a14420e554e9d381543292611/gfx/layers/NativeLayerCA.mm#805
Should we be performing more checks before we set the flag? Or is it the responsibility of the compositing code to do the check?
Comment 23•6 months ago (Assignee)
The problem seems to be in the canUpdate check in AsyncImagePipelineManager::UpdateImageKeys(). It could not detect a TextureHost change from MacIOSurfaceTextureHostOGL to ShmemTextureHost when both have the same format and size.
This happened when accelerated canvas fell back to software canvas.
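The gap described above can be shown with a small sketch. It does not reproduce the real AsyncImagePipelineManager::UpdateImageKeys() logic: `HostType`, `Format`, `TextureInfo`, and both check functions are hypothetical. The assumption is that an in-place update was allowed whenever format and size matched, so a change of backing type with identical format and size went unnoticed.

```cpp
#include <cassert>

// Hypothetical descriptors for the previous and next texture of a pipeline.
enum class HostType { MacIOSurface, Shmem };
enum class Format { B8G8R8A8, R8G8B8A8 };

struct TextureInfo {
  HostType type;
  Format format;
  int width;
  int height;
};

// Compares only format and size, so it cannot see a change of backing type.
bool CanUpdateBySizeAndFormat(const TextureInfo& aPrev,
                              const TextureInfo& aNext) {
  return aPrev.format == aNext.format && aPrev.width == aNext.width &&
         aPrev.height == aNext.height;
}

// Additionally requires the same backing type before allowing an in-place
// update; otherwise the caller would need to register the image afresh.
bool CanUpdateWithTypeCheck(const TextureInfo& aPrev,
                            const TextureInfo& aNext) {
  return aPrev.type == aNext.type && CanUpdateBySizeAndFormat(aPrev, aNext);
}
```

For an accelerated canvas texture followed by a software fallback of the same format and size, the first check returns true and the second returns false, which is the distinction a TextureHost type check is meant to capture.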
Comment 24•6 months ago (Assignee)
Comment 25•6 months ago
Pushed by sikeda.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/74587348428c Add TextureHost type check for using update in AsyncImagePipelineManager::UpdateImageKeys() r=gfx-reviewers,lsalzman
Comment 26•6 months ago
bugherder
Comment 27•5 months ago
Reproduced this issue on an affected Nightly build from 2022-05-13 using the STR from Comment 1, on macOS 10.15.
Verified as fixed on Firefox 105.0b5 (20220830185924) on the above OS.