Crash in [@ mozilla::layers::CompositorBridgeParent::GetCompositorBridgeParentFromWindowId]
Categories
(Core :: Graphics: WebRender, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox66 | --- | unaffected |
firefox67 | --- | unaffected |
firefox68 | --- | fixed |
People
(Reporter: calixte, Assigned: alexical)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
This bug is for crash report bp-95969093-75f1-42c7-9026-02f140190324.
Top 10 frames of crashing thread:
0 xul.dll mozilla::layers::CompositorBridgeParent::GetCompositorBridgeParentFromWindowId gfx/layers/ipc/CompositorBridgeParent.cpp:1194
1 xul.dll wr_finished_scene_build gfx/webrender_bindings/RenderThread.cpp:910
2 xul.dll static void webrender_bindings::bindings::{{impl}}::post_scene_swap gfx/webrender_bindings/src/bindings.rs:874
3 xul.dll static void webrender::scene_builder::SceneBuilder::run gfx/wr/webrender/src/scene_builder.rs:326
4 xul.dll static void std::sys_common::backtrace::__rust_begin_short_backtrace<closure, src/libstd/sys_common/backtrace.rs:136
5 xul.dll static void alloc::boxed::{{impl}}::call_box< src/liballoc/boxed.rs:673
6 xul.dll static void std::sys::windows::thread::{{impl}}::new::thread_start src/libstd/sys/windows/thread.rs:56
7 kernel32.dll BaseThreadInitThunk
8 mozglue.dll static void patched_BaseThreadInitThunk mozglue/build/WindowsDllBlocklist.cpp:712
9 ntdll.dll RtlUserThreadStart
There is 1 crash in nightly 68 with buildid 20190324094708. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1441308.
[1] https://hg.mozilla.org/mozilla-central/rev?node=96da9d241051
Updated•5 years ago
|
Assignee | ||
Comment 1•5 years ago
|
||
Leaving the needinfo for now since I want to look closer at this, but at the outset, despite having its fingerprints on the backtrace, I don't see how my patch would lead to this crash.
Assignee | ||
Comment 2•5 years ago
|
||
Kats, is there anything preventing ClearResources executing concurrently with GetCompositorBridgeParentFromWindowId? It doesn't look like it to me. It looks like both CompositorBridgeParent::StopAndClearResources and CompositorBridgeParent::RecvAdoptChild are not locking on sIndirectLayerTreesLock when they call into ClearResources, and I don't know of any reason why the scene builder thread would not be running. Thoughts?
Comment 3•5 years ago
|
||
I believe the way this code was supposed to work was that this Destroy call which invokes ClearResources would block until all the WR-side structures for that window were shut down. So there would be no calls to GetCompositorBridgeParentFromWindowId for that window id happening after that function returned.
But even so, it's not required that sIndirectLayerTreesLock be held when ClearResources is called (in fact we explicitly make sure not to hold that, or it can result in deadlock). The only requirement is that sIndirectLayerTreesLock be held while modifying the sIndirectLayerTrees strucutre, which is the case in StopAndClearResources and RectAdoptChild
Is there a specific interleaving of these threads that you can think of that would cause this crash?
Comment 4•5 years ago
|
||
I looked at the other threads in the crash reports; the compositor thread is handling a RecvShutdown message on PWebRenderBridge, so presumably the Destroy call is the problem. The only thing I can think of is that mApis is being touched on two different threads without any synchronization. mApis gets emptied here on the compositor thread, and it's concurrently read from here at the crash point on the scene builder thread.
Previously mApi was a single RefPtr so it might not have had this problem, or at least it might have manifested differently.
Assignee | ||
Comment 5•5 years ago
|
||
Yeah, since WebRenderAPI implements thread-safe ref counting I think this was correct before, but in our case when we check mApis.Length(), there's no guarantee that it won't be cleared (and subsequently have its memory given to something else which, say, zeroes it), by the time we return mApis[(int)aRenderRoot].
I'd like to solve this without locking on a mutex in GetWebRenderAPI. We could do so by changing mApis to a RenderRootArray and just nulling out all the entries in ClearResources. Thoughts?
Comment 6•5 years ago
|
||
That sounds reasonable to me.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 7•5 years ago
|
||
This will allow us to simply do null checks when trying to get
an API, which will prevent problems with concurrently accessing
an nsTArray.
Pushed by dothayer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/1d3fca2f0721 Replace mApis nsTArray with RenderRootArray r=sotaro
Comment 9•5 years ago
|
||
bugherder |
Updated•5 years ago
|
Updated•5 years ago
|
Updated•2 years ago
|
Description
•