On 64-bit Windows (x86_64, aarch64), stack walking relies on
RtlLookupFunctionEntry to navigate from one frame to the next. This
function acquires up to two ntdll internal locks when it is called.
The profiler and the background hang monitor both need to walk the
stacks of suspended threads. This can lead to deadlock situations,
which so far we have avoided with stack walk suppressions. We guard some
critical paths to mark them as suppressing stack walk, and we forbid
stack walking when any thread is currently on such path.
While stack walk suppression has helped remove most deadlock situations,
some can remain because it is hard to detect and manually annotate all
the paths that could lead to a deadlock situation. Another drawback is
that stack walk suppression disables stack walking for much larger
portions of code than required. For example, we disable stack walking
for LdrLoadDll, so we cannot collect stacks while we are loading a DLL.
Yet, the lock that could lead to a deadlock situation is only held
during a very small portion of the whole time spent in LdrLoadDll.
This patch addresses these two issues by implementing a finer-grained
strategy to avoid deadlock situations. We hook ntdll lock acquisition
and release functions in order to collect pointers to the two
problematic locks. This allows us to play with these locks, such that
we can guarantee safe stack walking with no deadlock.
If we fail to collect pointers to the locks, we fall back to using stack
walk suppressions like before. This way we get the best of both worlds:
if we are confident that the situation is under control, we will use the
new strategy and get better profiler accuracy and no deadlock; in case
of doubt, we can still use the profiler thanks to stack walk
Depends on D182553