[Linux/Wayland/HDR] Thread lock detection eats 100% of cpu and blocks UI
Categories
(Core :: XPCOM, defect, P3)
People
(Reporter: stransky, Unassigned)
References
(Blocks 1 open bug)
Details
The Linux HDR debug build is unusable. Thread lock detection eats 100% of the CPU and blocks the UI, and it prints a lot of messages like this on the terminal:
###!!! ERROR: Potential deadlock detected:
=== Cyclical dependency starts at
--- Mutex : WaylandSurface calling context
[stack trace unavailable]
--- Next dependency:
--- Mutex : WaylandSurface (currently acquired)
calling context
[stack trace unavailable]
=== Cycle completed at
--- Mutex : WaylandSurface calling context
[stack trace unavailable]
Deadlock may happen for some other execution
Non-debug build works fine.
Reproduction steps:
- Download the latest nightly / debug build, run on Linux/Wayland
- Set gfx.webrender.compositor.force-enabled / gfx.webrender.compositor to true
- Restart browser, try to scroll any page
Comment 1•1 month ago (Reporter)
When I stop Firefox in gdb, I see the main thread cycling in the thread lock detection code, and the stack there is very deep.
Comment 2•1 month ago (Reporter)
Is there any way to disable the detector? I'd like to run it under TSAN, but AFAIK that uses a debug build, which is unusable.
Comment 3•1 month ago
IIRC it's possible to do a non-debug TSAN build, which might be what you want here. That said, this does suggest there are lock-inversion issues with the "WaylandSurface" mutex (https://searchfox.org/mozilla-central/rev/4ce36232b265b53de4fb7eb754430f94e262bbbe/widget/gtk/WaylandSurface.h#386), which should be fixed to avoid potential deadlocks. My guess is that code sometimes holds the locks of multiple WaylandSurface objects at the same time, acquired in an order that is not globally consistent. That both produces a giant dependency tree (slowing down the deadlock detector) and could lead to a real deadlock in some cases.
Unfortunately, I don't think we have a separate flag for the deadlock detector; it's just gated on #ifdef DEBUG. So if you want the detector disabled, you'll need to use a non-debug build or refactor the code in BlockingResourceBase.{h,cpp}.
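To illustrate what a globally consistent order could look like, here is a minimal standalone sketch. It is not the WaylandSurface code: Surface, CommitBoth and RunDemo are hypothetical names, and std::mutex stands in for the Mozilla mutex type. The idea is simply that whenever two per-object locks are needed together, they are always taken in address order, so two threads can never hold them in opposite orders.

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for an object that owns its own mutex.
struct Surface {
  std::mutex mMutex;
  int mCommits = 0;
};

// Takes both locks in a fixed global order (lowest address first),
// regardless of the order in which the caller passed the surfaces.
void CommitBoth(Surface& aA, Surface& aB) {
  if (&aA == &aB) {
    // Same object: lock once, never twice.
    std::lock_guard<std::mutex> lock(aA.mMutex);
    aA.mCommits += 2;
    return;
  }
  Surface* first = &aA;
  Surface* second = &aB;
  // std::less gives a total order over pointers to unrelated objects.
  if (std::less<Surface*>{}(second, first)) {
    std::swap(first, second);
  }
  std::lock_guard<std::mutex> lockFirst(first->mMutex);
  std::lock_guard<std::mutex> lockSecond(second->mMutex);
  ++aA.mCommits;
  ++aB.mCommits;
}

// Two threads pass the same pair in opposite argument order. Because
// CommitBoth normalizes the locking order, this cannot deadlock.
int RunDemo() {
  Surface parent;
  Surface child;
  std::thread t1([&] {
    for (int i = 0; i < 10000; ++i) CommitBoth(parent, child);
  });
  std::thread t2([&] {
    for (int i = 0; i < 10000; ++i) CommitBoth(child, parent);
  });
  t1.join();
  t2.join();
  return parent.mCommits + child.mCommits;
}
```

Calling RunDemo() from a small main() exercises both argument orders concurrently. Whether address ordering is the right fix for WaylandSurface depends on how the surfaces relate to each other; a parent-before-child rule would be another consistent order.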