Closed Bug 1468713 Opened Last year Closed Last year

blob-invalidation: Crash in mozalloc_abort | abort | webrender::texture_cache::TextureCache::get

Categories

(Core :: Graphics: WebRender, defect, critical)

x86_64
All
defect
Not set
critical

Tracking

RESOLVED FIXED
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox60 --- unaffected
firefox61 --- unaffected
firefox62 --- disabled

People

(Reporter: darkspirit, Assigned: kvark)

References

(Blocks 1 open bug)

Details

(Keywords: crash, nightly-community, regression)

Attachments

(2 files)

I get this when I view the SVG generated by systemd-analyze plot every time.

I also got one "mozalloc_abort | abort | core::option::expect_failed | webrender::texture_cache::TextureCache::get" (https://crash-stats.mozilla.com/report/index/86450b1e-0084-4b8d-a758-0aa220180615) the first time I viewed the SVG.
(In reply to lilydjwg from comment #1)
> I get this when I view the SVG generated by systemd-analyze plot every time.

Do you have a URL for this? Or in general more detailed STR that I can use to reproduce? I have no idea what the "systemd-analyze plot" is.
I'm attaching an SVG file that reproduces this crash. Scroll around to view the image, and Firefox will soon crash.

(I've edited the SVG a bit due to privacy concerns. I've checked that it still causes crashes.)
mozregression --good 2018-05-15 --bad 2018-06-15 --pref gfx.webrender.all:true general.autoScroll:true startup.homepage_welcome_url:'https://bug1468713.bmoattachments.org/attachment.cgi?id=8985750'
> 14:26.57 INFO: Last good revision: a25b2c7238f46770d612f2a2cb2f8731e31261ee
> 14:26.57 INFO: First bad revision: 133a13c44abedac2e448d315a32068ce1a5568f4
> 14:26.57 INFO: Pushlog:
> https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=a25b2c7238f46770d612f2a2cb2f8731e31261ee&tochange=133a13c44abedac2e448d315a32068ce1a5568f4

> 133a13c44abe	Jeff Muizelaar — Bug 1458968. Adjust fuzz for webrender tests r=mstange
> 0f55d0ffe494	Markus Stange — Bug 1458968 - Create the nsDisplaySVGWrapper item in nsSVGOuterSVGAnonChildFrame, not in nsSVGOuterSVGFrame. r=mattwoodrow
> 6629dc2614ed	Markus Stange — Bug 1458968 - Disable blend-difference-stacking.html because it fails now. r=mattwoodrow
> 30d54bb4cc27	Markus Stange — Bug 1458968 - Make the nsSVGOuterSVGAnonChildFrame a reference frame by always returning true from IsSVGTransformed. r=mattwoodrow
> 88d41ddd11be	Markus Stange — Bug 1165185 - Add a test for not invalidating transformed elements inside SVG during scrolling. r=roc


mozregression --launch 2018-05-31 -B debug --pref gfx.webrender.all:true general.autoScroll:true startup.homepage_welcome_url:'https://bug1468713.bmoattachments.org/attachment.cgi?id=8985750'
> 0:46.08 INFO: thread 'WRRenderBackend#1' panicked at 'Invalid vector image key', gfx/webrender/src/resource_cache.rs:951:29
= crash reason from bug 1455490


mozregression --launch 2018-06-15 -B debug --pref gfx.webrender.all:true general.autoScroll:true startup.homepage_welcome_url:'https://bug1468713.bmoattachments.org/attachment.cgi?id=8985750'
> 0:49.31 INFO: thread 'WRRenderBackend#1' panicked at 'BUG: handle not requested earlier in frame', gfx/webrender/src/texture_cache.rs:539:21


mozregression --good 2018-05-31 --bad 2018-06-15 -B debug --pref gfx.webrender.all:true general.autoScroll:true startup.homepage_welcome_url:'https://bug1468713.bmoattachments.org/attachment.cgi?id=8985750'
> 10:23.20 INFO: Last good revision: d08944831e26fc9ec613e1fa2dc35d00332586ac
> 10:23.20 INFO: First bad revision: 5b15326286d466b5cf4889160cc09b59bdde08fd
> 10:23.20 INFO: Pushlog:
> https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=d08944831e26fc9ec613e1fa2dc35d00332586ac&tochange=5b15326286d466b5cf4889160cc09b59bdde08fd

> 5b15326286d4	Martin Robinson — Bug 1465058 - Update for API change in WR PR 2756. r=kats
> ccda83410bf4	Kartikaya Gupta — Bug 1465058 - Update reftest annotations for WR PR 2784. r=Gankro
> a3f7b4a6d494	Kartikaya Gupta — Bug 1465058 - Update webrender to commit 8e697f8cb1f1aab2e5f6b9b903eb7191340b10c5. r=Gankro

> WR @ 5e4f257d9f3cdb5691ac62f3affe2e9189d66447
mozregression --repo try --launch fdf3cf023b736228649111edde66c1e5c12441fb -B debug --pref gfx.webrender.all:true general.autoScroll:true startup.homepage_welcome_url:'https://bug1468713.bmoattachments.org/attachment.cgi?id=8985750'
> thread 'WRRenderBackend#1' panicked at 'Invalid vector image key', gfx/webrender/src/resource_cache.rs:951:29

> WR @ 8e697f8cb1f1aab2e5f6b9b903eb7191340b10c5
mozregression --repo try --launch 346d49dfd90475e4877314545dff636d084154d4 -B debug --pref gfx.webrender.all:true general.autoScroll:true startup.homepage_welcome_url:'https://bug1468713.bmoattachments.org/attachment.cgi?id=8985750'
> thread 'WRRenderBackend#1' panicked at 'BUG: handle not requested earlier in frame', gfx/webrender/src/texture_cache.rs:539:21

https://github.com/servo/webrender/compare/5e4f257d9f3cdb5691ac62f3affe2e9189d66447...8e697f8cb1f1aab2e5f6b9b903eb7191340b10c5

It looks like bug 1458968 introduced a crash, but bug 1455490 (servo/webrender#2796) changed the crash reason.
Summary: Crash in mozalloc_abort | abort | webrender::texture_cache::TextureCache::get → blob-invalidation: Crash in mozalloc_abort | abort | webrender::texture_cache::TextureCache::get
Has STR: --- → yes
Keywords: regression
Crash Signature: [@ mozalloc_abort | abort | webrender::texture_cache::TextureCache::get ] → [@ mozalloc_abort | abort | webrender::texture_cache::TextureCache::get ] [@ static struct webrender::resource_cache::CacheItem webrender::texture_cache::TextureCache::get ]
I can also reproduce it. I'll investigate some time this week. Anybody who wants to look at it sooner, please feel free to steal.
Assignee: nobody → bugmail
Ok, I chased this down a bit. The image tile is requested from the blob image renderer just fine. During block_until_all_resources_added, the resource_cache cycles through all the pending image requests, and resolves them at [1]. The image is correctly returned at that point, but then further down we hit the continue statement at [2] so the image is thrown away before it gets inserted into the texture_cache at [3]. I'm not too familiar with how the dirty rect stuff here is handled, but it seems like a WR issue given that it's throwing away the result from the blob renderer.

Dzmitry, can you take a look at this? The root cause might be more obvious to you.

[1] https://searchfox.org/mozilla-central/rev/285da1fd7dcf67448b9175741fa330158edcff73/gfx/webrender/src/resource_cache.rs#942
[2] https://searchfox.org/mozilla-central/rev/285da1fd7dcf67448b9175741fa330158edcff73/gfx/webrender/src/resource_cache.rs#974
[3] https://searchfox.org/mozilla-central/rev/285da1fd7dcf67448b9175741fa330158edcff73/gfx/webrender/src/resource_cache.rs#1035
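
To make the failing path above easier to picture, here is a minimal toy model in Rust. It is not the WebRender code: `ToyTextureCache`, `RasterizedTile`, `resolve_pending` and the `hits_early_continue` flag are all hypothetical names, and the real condition guarding the `continue` at [2] is not described in this comment. The sketch only shows how a correctly rasterized tile can be dropped before insertion, so that a later lookup finds nothing (the point where the real `TextureCache::get` aborts).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a tile returned by the blob image renderer.
struct RasterizedTile {
    bytes: Vec<u8>,
    /// Hypothetical flag: true when the resolve loop would take the early
    /// `continue`. The real condition is not stated in the thread.
    hits_early_continue: bool,
}

/// Hypothetical stand-in for the texture cache: image key -> uploaded bytes.
#[derive(Default)]
struct ToyTextureCache {
    entries: HashMap<u32, Vec<u8>>,
}

impl ToyTextureCache {
    fn insert(&mut self, key: u32, bytes: Vec<u8>) {
        self.entries.insert(key, bytes);
    }

    /// Returns `None` where the real code would panic on a missing entry.
    fn get(&self, key: u32) -> Option<&[u8]> {
        self.entries.get(&key).map(|v| v.as_slice())
    }
}

/// Resolve every pending request, mimicking the suspected bug: a tile that
/// takes the early `continue` is discarded before it reaches the cache.
fn resolve_pending(cache: &mut ToyTextureCache, pending: Vec<(u32, RasterizedTile)>) {
    for (key, tile) in pending {
        if tile.hits_early_continue {
            continue; // rasterized result is thrown away here
        }
        cache.insert(key, tile.bytes);
    }
}

fn main() {
    let mut cache = ToyTextureCache::default();
    let pending = vec![(
        7,
        RasterizedTile { bytes: vec![0; 16], hits_early_continue: true },
    )];
    resolve_pending(&mut cache, pending);

    match cache.get(7) {
        Some(bytes) => println!("tile 7 cached ({} bytes)", bytes.len()),
        None => println!("tile 7 was rasterized but never inserted"),
    }
}
```

In this model the lookup returns `None` instead of aborting, which makes the dropped insert visible without crashing the process.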
Flags: needinfo?(kvark)
Figured out the failing path after talking to kats, upstream fix is coming.
Flags: needinfo?(kvark)
Attached video 2018-06-20_12-45-39.mp4
> https://github.com/servo/webrender/pull/2829#issuecomment-398610474
mozregression --repo try --launch 7b480ad64ed6306a9987fb8be3883f5f6e5c772e -B debug --pref gfx.webrender.all:true general.autoScroll:true startup.homepage_welcome_url:'https://bug1468713.bmoattachments.org/attachment.cgi?id=8985750'

Debian Testing, GTX 1060.
The crash is gone, but some areas stay hidden until I, for example, click on them (which does not work reliably). Setting gfx.webrender.blob.invalidation:false works around it.
The fix in servo/webrender#2829 fixes one crash. There's another one that I see while scrolling around as well:

thread '<unnamed>' panicked at 'slice index starts at 4695884 but ends at 1048576'

The stack is below; it seems to indicate that the slice indexing at https://searchfox.org/mozilla-central/rev/d0a41d2e7770fc00df7844d5f840067cc35ba26f/gfx/webrender/src/renderer.rs#2558 is bad.

* thread #24, name = 'Renderer', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001000b2435 libmozglue.dylib`::mozalloc_abort(msg=<unavailable>) at mozalloc_abort.cpp:34 [opt]
    frame #1: 0x00000001000b2460 libmozglue.dylib`::abort() at mozalloc_abort.cpp:81 [opt]
    frame #2: 0x00000001062b1139 XUL`__rust_start_panic [inlined] panic_abort::__rust_start_panic::abort::h087b7bb47b0c4b1e at lib.rs:59 [opt]
    frame #3: 0x00000001062b1134 XUL`__rust_start_panic at lib.rs:54 [opt]
    frame #4: 0x000000010627046d XUL`rust_panic at panicking.rs:608 [opt]
    frame #5: 0x00000001062703e1 XUL`std::panicking::rust_panic_with_hook::h6e16f84ca8c4a724 at panicking.rs:593 [opt]
    frame #6: 0x000000010627019f XUL`std::panicking::begin_panic::h950ea52f07a191e2 at panicking.rs:538 [opt]
    frame #7: 0x00000001062700f4 XUL`std::panicking::begin_panic_fmt::he83be180ce68059a at panicking.rs:522 [opt]
    frame #8: 0x0000000106270063 XUL`rust_begin_unwind at panicking.rs:498 [opt]
    frame #9: 0x00000001062c48d4 XUL`core::panicking::panic_fmt::h5f860eec3f515a80 at panicking.rs:71 [opt]
    frame #10: 0x00000001062ce965 XUL`core::slice::slice_index_order_fail::ha58442e555e1696b at mod.rs:756 [opt]
    frame #11: 0x000000010612f60d XUL`webrender::renderer::Renderer::update_texture_cache::he6af31d81611ba39 [inlined] _$LT$core..ops..range..Range$LT$usize$GT$$u20$as$u20$core..slice..SliceIndex$LT$$u5b$T$u5d$$GT$$GT$::index::h2bd1fc5f9d5af618(self=<unavailable>, slice=<unavailable>) at mod.rs:879 [opt]
    frame #12: 0x000000010612f605 XUL`webrender::renderer::Renderer::update_texture_cache::he6af31d81611ba39 [inlined] _$LT$core..ops..range..RangeFrom$LT$usize$GT$$u20$as$u20$core..slice..SliceIndex$LT$$u5b$T$u5d$$GT$$GT$::index::h47a8ba3ec2e3b4ef(slice=<unavailable>) at mod.rs:962 [opt]
    frame #13: 0x000000010612f605 XUL`webrender::renderer::Renderer::update_texture_cache::he6af31d81611ba39 [inlined] core::slice::_$LT$impl$u20$core..ops..index..Index$LT$I$GT$$u20$for$u20$$u5b$T$u5d$$GT$::index::hce3014a87419eea3(self=<unavailable>) at mod.rs:732 [opt]
    frame #14: 0x000000010612f605 XUL`webrender::renderer::Renderer::update_texture_cache::he6af31d81611ba39 [inlined] _$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..index..Index$LT$core..ops..range..RangeFrom$LT$usize$GT$$GT$$GT$::index::hde04433000a06d4d(index=<unavailable>) at vec.rs:1594 [opt]
    frame #15: 0x000000010612f605 XUL`webrender::renderer::Renderer::update_texture_cache::he6af31d81611ba39(self=<unavailable>) at renderer.rs:2558 [opt]
    frame #16: 0x0000000106127049 XUL`webrender::renderer::Renderer::render_impl::h66c007d7826d7dd4 at renderer.rs:2287 [opt]
    frame #17: 0x0000000106126f18 XUL`webrender::renderer::Renderer::render_impl::h66c007d7826d7dd4 at profiler.rs:204 [opt]
    frame #18: 0x0000000106126ee0 XUL`webrender::renderer::Renderer::render_impl::h66c007d7826d7dd4(self=<unavailable>, framebuffer_size=<unavailable>) at renderer.rs:2277 [opt]
    frame #19: 0x0000000106126940 XUL`webrender::renderer::Renderer::render::h0b68dc64fe414fac(self=<unavailable>, framebuffer_size=<unavailable>) at renderer.rs:2237 [opt]
    frame #20: 0x0000000105acdebf XUL`wr_renderer_render(renderer=<unavailable>, width=<unavailable>, height=<unavailable>) at bindings.rs:559 [opt]
    frame #21: 0x0000000102883dcf XUL`mozilla::wr::RendererOGL::UpdateAndRender(this=0x000000011a9fd480, aReadback=<unavailable>) at RendererOGL.cpp:136 [opt]
    frame #22: 0x00000001028837a5 XUL`mozilla::wr::RenderThread::UpdateAndRender(this=<unavailable>, aWindowId=<unavailable>, aReadback=false) at RenderThread.cpp:308 [opt]
    frame #23: 0x0000000102883652 XUL`mozilla::wr::RenderThread::NewFrameReady(this=0x000000010061bbd0, aWindowId=<unavailable>) at RenderThread.cpp:223 [opt]
The problem seems to be that we enter the dirty rect codepath at [1] and the dirty rect is enormous:

Dirty rect is TypedRect(130766×5678 at (6611,2280)), descriptor size is 512×512

And so the computed offset that gets put into the TextureUpdateOp::Update structure is past the end of the byte data in the TextureUpdateSource::Bytes.

[1] https://searchfox.org/mozilla-central/rev/d0a41d2e7770fc00df7844d5f840067cc35ba26f/gfx/webrender/src/texture_cache.rs#1240
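
The numbers in the panic message can be reproduced from the dirty rect origin and the descriptor size. The sketch below assumes 4 bytes per pixel and a tightly packed row stride of width × 4; neither is stated in this comment, but under those assumptions a 512×512 tile holds 1048576 bytes and the origin (6611, 2280) maps to byte offset 4695884, matching "slice index starts at 4695884 but ends at 1048576". The helper names `byte_offset` and `dirty_tail` are hypothetical, and the checked lookup is only an illustration of detecting the bad offset, not the fix that landed upstream.

```rust
/// Assumed bytes per pixel (for example a 4-channel, 8-bit format).
/// This value is an assumption; it is not stated in the thread.
const BYTES_PER_PIXEL: usize = 4;

/// Byte offset of pixel (x, y) in a tightly packed image that is
/// `width` pixels wide.
fn byte_offset(x: usize, y: usize, width: usize) -> usize {
    y * width * BYTES_PER_PIXEL + x * BYTES_PER_PIXEL
}

/// Returns the bytes from the dirty rect origin to the end of the buffer,
/// or `None` when the origin lies past the end of the data.
fn dirty_tail(bytes: &[u8], x: usize, y: usize, width: usize) -> Option<&[u8]> {
    bytes.get(byte_offset(x, y, width)..)
}

fn main() {
    let (width, height) = (512, 512);
    let bytes = vec![0u8; width * height * BYTES_PER_PIXEL];

    let offset = byte_offset(6611, 2280, width);
    println!("buffer length = {}, offset = {}", bytes.len(), offset);

    // Indexing with `&bytes[offset..]` would panic for this origin;
    // the checked `get` reports the out-of-range offset instead.
    assert!(dirty_tail(&bytes, 6611, 2280, width).is_none());
    assert!(dirty_tail(&bytes, 10, 10, width).is_some());
}
```

The origin of the dirty rect is already outside the 512×512 descriptor, so any offset derived from it without first clipping the rect to the descriptor bounds will point past the end of the byte data.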
I tried turning the attachment into a crashtest but it seems like the crash is dependent on the specific window size/scroll positions/async scroll offsets and I couldn't get it to work reliably.
The WR upstream PR is now merged that should fix both issues.
(In reply to Dzmitry Malyshau [:kvark] from comment #13)
> The WR upstream PR is now merged that should fix both issues.

I tested on a local build with that PR applied and confirmed it seems to fix both of the crashes. Thanks!
(In reply to Emilio Cobos Álvarez [:emilio] from comment #5)
> FWIW, https://bug1294843.bmoattachments.org/attachment.cgi?id=8780694 repros
> for me as well.

This URL gives me a crash still, but it looks like bug 1469528. We can track it over here.
This should be fixed in the next nightly.
Assignee: bugmail → kvark
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED