Closed Bug 1454187 Opened Last year Closed Last year

Crash: Moz2D replay problem

Categories

(Core :: Graphics: WebRender, defect, P2, critical)

Unspecified
All
defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox59 --- disabled
firefox60 --- disabled
firefox61 --- disabled
firefox62 --- disabled

People

(Reporter: sphilp, Assigned: kats)

References

(Blocks 3 open bugs)

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is
report bp-02a9b23c-e860-4a50-8084-0a8d00180414.
=============================================================

Top 10 frames of crashing thread:

0 libmozglue.dylib mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:34
1 libmozglue.dylib abort memory/mozalloc/mozalloc_abort.cpp:81
2 XUL panic_abort::__rust_start_panic::abort::h091e61b1e9ef8f82 src/libpanic_abort/lib.rs:62
3 XUL panic_abort::__rust_start_panic src/libpanic_abort/lib.rs:57
4 XUL std::panicking::rust_panic_with_hook::hfb431ab23831437f src/libstd/panicking.rs:607
5 XUL std::panicking::begin_panic::h5011c55181109e2c src/libstd/panicking.rs:537
6 XUL _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::haecbd23b4e6d2434 gfx/webrender_bindings/src/moz2d_renderer.rs:453
7 XUL rayon_core::registry::WorkerThread::wait_until_cold::he0d731a3ada7a599 third_party/rust/rayon-core/src/job.rs:60
8 XUL std::sys_common::backtrace::__rust_begin_short_backtrace::hf56cc90692bdeac5 third_party/rust/rayon-core/src/registry.rs:543
9 XUL _$LT$F$u20$as$u20$alloc..boxed..FnBox$LT$A$GT$$GT$::call_box::h1ee5cb2907999af5 src/libstd/thread/mod.rs:406

=============================================================
Crash Signature: std::panicking::rust_panic_with_hook ] → std::panicking::rust_panic_with_hook ] [@ mozalloc_abort | abort | libxul.so@0x3d08258 | libxul.so@0x3d08248 | libxul.so@0x3cf7da0 | rayon_core::job::{{impl}}::execute<T> ]
Depends on: 1440088
Summary: Crash in mozalloc_abort | abort | panic_abort::__rust_start_panic::abort::h091e61b1e9ef8f82 | panic_abort::__rust_start_panic | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::haecbd23b4e6d2434 → Crash: Moz2D replay problem
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #1)
> No crashes between build 2018-03-26_220108 Win (bug 1446286) and
> 2018-04-07_220122 Linux.

Not sure where you got this from. When I look at [1] (crash reports with "Moz2D replay problem" sorted by buildid), I see that the incidence of crashes drops off significantly after 2018-03-30, which is when bug 1447076 landed. Before that we had a variety of crash signatures and a large number of crashes; now we have just a few crashes and only a few signatures, most of which seem to have the rayon_core stuff in them. I'm not convinced this is the same issue as bug 1440088 comment 0.

[1] https://crash-stats.mozilla.com/search/?moz_crash_reason=~Moz2D%20replay%20problem&product=Firefox&version=61.0a1&date=%3E%3D2018-03-28T05%3A51%3A00.000Z&date=%3C2018-04-15T05%3A51%3A49.000Z&_sort=build_id&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=build_id&_columns=platform#crash-reports
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #2)
> > No crashes between build 2018-03-26_220108 Win (bug 1446286) and 2018-04-07_220122 Linux.

That was the "partly-wrong" part of comment 1 because the dates on Socorro still mean "report date" even if I configure it to sort by build date. When I increased the date range I realized my mistake and tagged the comment. Sorry.

> I'm not convinced this is the same issue as bug 1440088 comment 0.
bp-007e7c63-6f18-414b-8772-24f850180221 from bug 1440088 comment 0 is the same as
bp-5e7875ff-8e57-43a7-9d9d-2179b0180306 from bug 1440088 comment 2.
"Vector image error Unknown error" (bug 1391255) became "Moz2D replay problem" (bug 1446286).

Because the very short signature [@ mozalloc_abort | abort | rayon_core::job::{{impl}}::execute<T> ] from bug 1446286 didn't happen again, I didn't dare to reopen bug 1446286. This bug looked like an opportunity to declare it the new cumulative bug for "Moz2D replay problem". Or would you prefer to reopen bug 1446286?
Let's leave that one closed, and use this bug to track the new accumulation of "Moz2D replay problem". Thanks!
Crash Signature: std::panicking::rust_panic_with_hook ] [@ mozalloc_abort | abort | libxul.so@0x3d08258 | libxul.so@0x3d08248 | libxul.so@0x3cf7da0 | rayon_core::job::{{impl}}::execute<T> ] → std::panicking::rust_panic_with_hook ] [@ mozalloc_abort | abort | libxul.so@0x3d08258 | libxul.so@0x3d08248 | libxul.so@0x3cf7da0 | rayon_core::job::{{impl}}::execute<T> ] [@ static void rayon_core::job::{{impl}}::execute<T> ]
Crash Signature: std::panicking::rust_panic_with_hook ] [@ mozalloc_abort | abort | libxul.so@0x3d08258 | libxul.so@0x3d08248 | libxul.so@0x3cf7da0 | rayon_core::job::{{impl}}::execute<T> ] [@ static void rayon_core::job::{{impl}}::execute<T> ] → std::panicking::rust_panic_with_hook ] [@ mozalloc_abort | abort | libxul.so@0x3d08258 | libxul.so@0x3d08248 | libxul.so@0x3cf7da0 | rayon_core::job::{{impl}}::execute<T> ] [@ mozalloc_abort | abort | libxul.so@0x4943878 | libxul.so@0x4943868 | libxul.…
Crash Signature: libxul.so@0x49333e0 | rayon_core::job::{{impl}}::execute<T> ] [@ static void rayon_core::job::{{impl}}::execute<T> ] → libxul.so@0x49333e0 | rayon_core::job::{{impl}}::execute<T> ] [@ static void rayon_core::job::{{impl}}::execute<T> ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::ha94daf2bcef4141e ]
Crash Signature: ] → ] [@ mozalloc_abort | abort | rayon_core::job::{{impl}}::execute<T> ]
Crash Signature: ] [@ mozalloc_abort | abort | rayon_core::job::{{impl}}::execute<T> ] → ] [@ mozalloc_abort | abort | rayon_core::job::{{impl}}::execute<T> ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::hec678696f06b28f7 ]
Crash Signature: ] [@ mozalloc_abort | abort | rayon_core::job::{{impl}}::execute<T> ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::hec678696f06b28f7 ] → ] [@ mozalloc_abort | abort | rayon_core::job::{{impl}}::execute<T> ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::hec678696f06b28f7 ] [@ mozalloc_abort | abort | _$LT$rayon_core..…
Crash Signature: mozalloc_abort | abort | libxul.so@0x4817df8 | libxul.so@0x4817de8 | libxul.so@0x4807990 | rayon_core::job::{{impl}}::execute<T> ] → mozalloc_abort | abort | libxul.so@0x4817df8 | libxul.so@0x4817de8 | libxul.so@0x4807990 | rayon_core::job::{{impl}}::execute<T> ] [@ mozalloc_abort | abort | libxul.so@0x47e81c8 | libxul.so@0x47e81b8 | libxul.so@0x47d7d60 | rayon_core::job::{{impl}}::e…
I got this when loading https://doodle.com/poll/vyhi6547fd99gcic but I haven't been able to reproduce it since then.
Crash Signature: rayon_core::job::{{impl}}::execute<T> ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::h0fa6bff0331f0381 ] → rayon_core::job::{{impl}}::execute<T> ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::h0fa6bff0331f0381 ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as…
Crash Signature: _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::hff35929eb3ea4554 ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::hd937568a7a7c5508 ] → _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::hff35929eb3ea4554 ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::hd937568a7a7c5508 ] [@ mozal…
Crash Signature: mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::h133db474c4e73653 ] → mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::h133db474c4e73653 ] [@ mozalloc_abort | abort | _$LT$rayon_core..job..HeapJob$LT$BODY$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute ]
I can reproduce the problem on https://www.gatsbyjs.org/, thanks!

Looks like we're trying to create a drawtarget that's 12983x12983 and it's failing at [1]. Should we be tiling this thing? Is this the same as bug 1435896?

[1] https://searchfox.org/mozilla-central/rev/40577076a6e7a14d143725d0288353c250ea2b16/gfx/2d/Factory.cpp#483
Flags: needinfo?(a.beingessner)
Hm. Looks like there is code to try to force tiling at [1], but the max_texture_size there is 16384, so it leaves the image untiled. Moz2D, however, seems to not like any sizes larger than 8192 [2], so anything that falls between these two limits doesn't get tiled and then fails to create the draw target.

[1] https://searchfox.org/mozilla-central/rev/40577076a6e7a14d143725d0288353c250ea2b16/gfx/webrender/src/resource_cache.rs#498
[2] https://searchfox.org/mozilla-central/rev/40577076a6e7a14d143725d0288353c250ea2b16/gfx/2d/2D.h#1570
Forcing WR to cap the texture size at 8192 via the RendererOptions fixes the crash for me.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a154b09eeb9fd25631e860638d41e52a81fe4280
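
The gap described above, and the effect of the cap, can be sketched roughly as follows. This is only an illustration, not WebRender's actual code: `needs_tiling` and `effective_max_texture_size` are hypothetical names, and the only values taken from this bug are the 16384 limit, the 8192 cap, and the 12983x12983 image.

```rust
/// Hypothetical stand-in for the tiling decision: an image is tiled only
/// when one of its dimensions exceeds the maximum texture size in effect.
fn needs_tiling(width: u32, height: u32, max_texture_size: u32) -> bool {
    width > max_texture_size || height > max_texture_size
}

/// Hypothetical stand-in for applying an optional cap from the renderer
/// options on top of the limit reported by the device.
fn effective_max_texture_size(device_max: u32, cap: Option<u32>) -> u32 {
    match cap {
        // A cap can only lower the limit, never raise it.
        Some(c) => device_max.min(c),
        None => device_max,
    }
}
```

With no cap, `needs_tiling(12983, 12983, effective_max_texture_size(16384, None))` is false, so the image would be requested in one piece. With `Some(8192)` as the cap, the same call returns true and the image would be tiled instead.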
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> Hm. Looks like there is code to try to force tiling at [1] but the
> max_texture_size there is 16384. So it lets the thing be not-tiled. Moz2D
> however seems to not like any sizes larger than 8192 [2]

Actually, to be precise, it's not the texture size that Moz2D has a problem with. It's the allocation size. The max texture size in Moz2D is actually 32767, coming from the pref [1]. The max allocation size also comes from a pref [2] and is not big enough to fit a 16384x16384 texture at 4bpp, but is big enough to fit an 8192x8192 texture at 4bpp.

I really dislike all these magic numbers hard-coded throughout the codebase.

[1] https://searchfox.org/mozilla-central/rev/40577076a6e7a14d143725d0288353c250ea2b16/gfx/thebes/gfxPrefs.h#495
[2] https://searchfox.org/mozilla-central/rev/40577076a6e7a14d143725d0288353c250ea2b16/gfx/thebes/gfxPrefs.h#493
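
For reference, the arithmetic behind the allocation-size point can be sketched like this. The limit used here, `ASSUMED_MAX_ALLOC_BYTES`, is an assumption picked only so that it satisfies the two constraints stated above (an 8192x8192 surface fits, a 16384x16384 one does not); the real value is whatever the pref at [2] says. The helper names are hypothetical, and Moz2D's actual check may account for stride or other details.

```rust
/// 4 bytes per pixel, as in the comment above.
const BYTES_PER_PIXEL: u64 = 4;

/// ASSUMPTION: illustrative limit only; the real one comes from the pref at [2].
const ASSUMED_MAX_ALLOC_BYTES: u64 = 500_000_000;

/// Bytes needed for a tightly packed width x height surface.
fn surface_bytes(width: u64, height: u64) -> u64 {
    width * height * BYTES_PER_PIXEL
}

/// Hypothetical check: does the surface fit under the allocation limit?
fn fits_in_alloc_limit(width: u64, height: u64, max_alloc_bytes: u64) -> bool {
    surface_bytes(width, height) <= max_alloc_bytes
}
```

Under that assumption, 8192x8192 needs 268,435,456 bytes and fits, while 12983x12983 needs 674,233,156 bytes and 16384x16384 needs 1,073,741,824 bytes, and neither fits.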
Looks like you're on top of this.
Flags: needinfo?(a.beingessner)
Comment on attachment 8985347 [details]
Bug 1454187 - Don't let webrender try to request blob images larger than Moz2D's max texture size.

https://reviewboard.mozilla.org/r/250968/#review257250
Attachment #8985347 - Flags: review?(nical.bugzilla) → review+
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/efeb9170c201
Don't let webrender try to request blob images larger than Moz2D's max texture size. r=nical
Assignee: nobody → bugmail
https://hg.mozilla.org/mozilla-central/rev/efeb9170c201
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
No longer depends on: 1440088
Duplicate of this bug: 1440088
See Also: → 1515932