Startup profiling deadlocks on Windows
Categories
(Core :: Gecko Profiler, defect, P2)
People
(Reporter: florian, Unassigned)
I tried to reproduce a test failure with MOZ_PROFILER_STARTUP enabled to see what was happening. Out of 16 runs, 4 failed with "TEST-UNEXPECTED-TIMEOUT | automation.py | application timed out after 370 seconds with no output":
https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=94084601bba55fb449e4136d649e3c48b384e251
These jobs have crash dumps, and stacks in their logs.
In three cases, the main thread is blocked in mozglue.dll!mozilla::WindowsDpiInitialization() at https://searchfox.org/mozilla-central/rev/1061fae5e225a99ef5e43dbdf560a91a0c0d00d1/mozglue/misc/WindowsDpiInitialization.cpp#35-37 while the base profiler is trying to sample.
In the fourth case, the main thread is blocked at https://searchfox.org/mozilla-central/rev/1061fae5e225a99ef5e43dbdf560a91a0c0d00d1/widget/windows/WindowsUIUtils.cpp#399-400 (this system function triggers loading a DLL) while the Gecko profiler is trying to sample.
We have code at https://searchfox.org/mozilla-central/rev/1061fae5e225a99ef5e43dbdf560a91a0c0d00d1/mozglue/misc/StackWalk.cpp#300 trying to avoid these deadlocks. Maybe something has changed, or maybe there was a bug in it.
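For illustration only, here is a minimal sketch of one general way to guard against this class of deadlock: a thread raises a counter while it is in a region that may hold a lock the stack walker also needs, and the sampler skips the walk when the counter is nonzero. This is not the code at the StackWalk.cpp link above. The names gWalkSuppressions, AutoSuppressWalk, TrySampleStack and WalkStackStub are hypothetical, and the stub stands in for a real unwinder.

```cpp
#include <atomic>

// Hypothetical counter (not a Gecko API): nonzero while some thread is inside
// a region that may hold a lock the stack walker also needs, for example the
// loader lock taken during a DLL load.
static std::atomic<int> gWalkSuppressions{0};

// RAII guard: raise the counter on entry to a risky region, lower it on exit.
class AutoSuppressWalk {
 public:
  AutoSuppressWalk() { gWalkSuppressions.fetch_add(1); }
  ~AutoSuppressWalk() { gWalkSuppressions.fetch_sub(1); }
  AutoSuppressWalk(const AutoSuppressWalk&) = delete;
  AutoSuppressWalk& operator=(const AutoSuppressWalk&) = delete;
};

// Stand-in for a real stack walk; returns a fixed frame count so the sketch
// can run anywhere.
static int WalkStackStub() { return 3; }

// Sampler side. In a real sampler this check would run after the target
// thread has been suspended, so that the thread cannot enter a risky region
// between the check and the walk. Returns false when the sample is skipped,
// in which case *aFramesOut is left untouched.
bool TrySampleStack(int* aFramesOut) {
  if (gWalkSuppressions.load() != 0) {
    // Walking now could block on a lock held by the suspended thread,
    // so drop this sample instead.
    return false;
  }
  *aFramesOut = WalkStackStub();
  return true;
}
```

The trade-off in this sketch is that a skipped sample loses its native stack, but the sampler never waits on a lock owned by a thread it has suspended. It only helps if every code path that can take such a lock is covered by a guard.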
Comment 1 • 3 years ago (Reporter)
I also see fairly often Windows profiles where one process is missing stack samples while the other processes have good stacks. Most recent example: https://share.firefox.dev/3bhteJu The parent process main thread has 4 samples with full stacks from the Gecko profiler at the very beginning, and the rest of the profile has no native stack frames for that process.