Open
Bug 1382943
Opened 7 years ago
Updated 2 years ago
No stack frames during LdrLoadDll
Categories
(Core :: Gecko Profiler, enhancement, P2)
Core
Gecko Profiler
Tracking
()
REOPENED
Tracking | Status | |
---|---|---|
firefox57 | --- | fix-optional |
People
(Reporter: cyu, Unassigned)
References
Details
(Whiteboard: inj+)
Attachments
(1 file, 1 obsolete file)
This is forked from https://bugzilla.mozilla.org/show_bug.cgi?id=1382150#c4
From this profile: https://perfht.ml/2ugAfjI we can see a weird call stack that js::RunScript() calls directly into patched_LdrLoadDll(), where important stack frames from mozilla::dom::HTMLCanvasElementBinding::getContext() to `anonymous namespace'::LoadLibrarySystem32() are missing. This is bad in profiling problems with DLL loads because the missing stack frames make us underestimate the cost of mozilla::dom::HTMLCanvasElementBinding::getContext(): if we click the getContext() function, we only see it taking 25 ms, but actually it takes much longer.
I captured this in VS debugger when the profiler walks the stack. The main thread stack:
mozglue.dll!patched_LdrLoadDll Line 628 C++
[External Code]
libGLESv2.dll!rx::Renderer11::initializeD3DDevice Line 745 C++
libGLESv2.dll!rx::Renderer11::initialize Line 543 C++
libGLESv2.dll!rx::CreateRendererD3D Line 120 C++
libGLESv2.dll!rx::DisplayD3D::initialize Line 233 C++
libGLESv2.dll!egl::Display::initialize Line 402 C++
libGLESv2.dll!egl::Initialize Line 66 C++
xul.dll!mozilla::gl::GetAndInitDisplay Line 187 C++
xul.dll!mozilla::gl::GetAndInitDisplayForAccelANGLE Line 273 C++
xul.dll!mozilla::gl::GLLibraryEGL::EnsureInitialized Line 498 C++
xul.dll!mozilla::gl::GLContextProviderEGL::CreateOffscreen Line 973 C++
xul.dll!mozilla::CreateGLWithANGLE Line 625 C++
xul.dll!mozilla::WebGLContext::CreateAndInitGLWith Line 681 C++
xul.dll!mozilla::WebGLContext::CreateAndInitGL Line 793 C++
xul.dll!mozilla::WebGLContext::SetDimensions Line 1012 C++
xul.dll!mozilla::dom::CanvasRenderingContextHelper::UpdateContext Line 249 C++
xul.dll!mozilla::dom::CanvasRenderingContextHelper::GetContext Line 204 C++
xul.dll!mozilla::dom::HTMLCanvasElement::GetContext Line 984 C++
xul.dll!mozilla::dom::HTMLCanvasElementBinding::getContext Line 253 C++
xul.dll!mozilla::dom::GenericBindingMethod Line 3066 C++
xul.dll!js::InternalCallOrConstruct Line 470 C++
xul.dll!Interpret Line 3060 C++
xul.dll!js::RunScript Line 410 C++
xul.dll!js::InternalCallOrConstruct Line 491 C++
xul.dll!js::jit::DoCallFallback Line 2552 C++
[External Code]
xul.dll!js::jit::GetNativeDataProperty<1> Line 1630 C++
[External Code]
Sampled by the sampler thread:
> mozglue.dll!WalkStackMain64 Line 542 C++
mozglue.dll!MozStackWalk Line 689 C++
xul.dll!DoNativeBacktrace Line 992 C++
xul.dll!DoSharedSample Line 1225 C++
xul.dll!DoPeriodicSample Line 1264 C++
xul.dll!Sampler::SuspendAndSampleAndResumeThread<<lambda_255f21e0b8ee7fb9b22d3728616c4153> > Line 169 C++
xul.dll!SamplerThread::Run Line 1806 C++
xul.dll!ThreadEntry Line 187 C++
And manually checked the result of RtlVirtualUnwind(), which dutifully returns the correct stack:
0x00007ffed6e6aae8 {Inside dxgi.dll!CDXGIBaseAdapter::LoadUMD(enum _KMTUMDVERSION,struct HINSTANCE__ * *)}
0x00007ffed49bdeef {Inside d3d11.dll!CCreateDeviceCache::CUMDAdapterCache::Load(struct IDXGIAdapterInternal2 *,enum _KMTUMDVERSION,unsigned int,struct FailureData::UMDFailureData &)}
0x00007ffed49bbd13 {Inside d3d11.dll!CCreateDeviceCache::CAdapterCache::ResolveUMDAndVersion(class std::vector<unsigned __int64,class std::allocator<unsigned __int64> > &,enum D3D_FEATURE_LEVEL,enum D3D_DRIVER_TYPE,unsigned int,bool,class NDXGI::CUMDAdapter * *,unsigned __int64 *)}
0x00007ffed49ba8fe {Inside d3d11.dll!D3D11CoreCreateDevice()}
0x00007ffed49e54ee {Inside d3d11.dll!D3D11CreateDeviceAndSwapChainImpl(struct IDXGIAdapter *,enum D3D_DRIVER_TYPE,struct HINSTANCE__ *,unsigned int,enum D3D_FEATURE_LEVEL const *,unsigned int,unsigned int,struct DXGI_SWAP_CHAIN_DESC const *,struct IDXGISwapChain * *,struct ID3D11Device * *,enum D3D_FEATURE_LEVEL *,struct ID3D11DeviceContext * *)}
0x00007ffed49e5259 {Inside d3d11.dll!D3D11CreateDeviceAndSwapChain()}
0x00007ffed49e5c1c {Inside d3d11.dll!D3D11CreateDeviceImpl(struct IDXGIAdapter *,enum D3D_DRIVER_TYPE,struct HINSTANCE__ *,unsigned int,enum D3D_FEATURE_LEVEL const *,unsigned int,unsigned int,struct ID3D11Device * *,enum D3D_FEATURE_LEVEL *,struct ID3D11DeviceContext * *)}
0x00007ffed49e514c {Inside d3d11.dll!D3D11CreateDevice()}
0x00007ffe8994d013 {libGLESv2.dll!rx::Renderer11::initializeD3DDevice(void), Line 745}
0x00007ffe8994c89e {libGLESv2.dll!rx::Renderer11::initialize(void), Line 543}
0x00007ffe898ef223 {libGLESv2.dll!rx::CreateRendererD3D(egl::Display * display, rx::RendererD3D * * outRenderer), Line 120}
0x00007ffe898ef7e9 {libGLESv2.dll!rx::DisplayD3D::initialize(egl::Display * display), Line 233}
0x00007ffe898ee0ed {libGLESv2.dll!egl::Display::initialize(void), Line 402}
0x00007ffe898d4f4d {libGLESv2.dll!egl::Initialize(void * dpy, int * major, int * minor), Line 66}
0x00007ffe80372985 {xul.dll!mozilla::gl::GetAndInitDisplay(mozilla::gl::GLLibraryEGL & egl, void * displayType), Line 187}
0x00007ffe80372a8c {xul.dll!mozilla::gl::GetAndInitDisplayForAccelANGLE(mozilla::gl::GLLibraryEGL & egl, nsACString * const out_failureId), Line 273}
0x00007ffe80371239 {xul.dll!mozilla::gl::GLLibraryEGL::EnsureInitialized(bool forceAccel, nsACString * const out_failureId), Line 498}
0x00007ffe8036e7ae {xul.dll!mozilla::gl::GLContextProviderEGL::CreateOffscreen(const mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> & size, const mozilla::gl::SurfaceCaps & minCaps, mozilla::gl::CreateContextFlags flags, nsACString * const out_failureId), Line 973}
0x00007ffe80b5d8ad {xul.dll!mozilla::CreateGLWithANGLE(const mozilla::gl::SurfaceCaps & caps, mozilla::gl::CreateContextFlags flags, mozilla::WebGLContext * webgl, std::vector<mozilla::WebGLContext::FailureReason,std::allocator<mozilla::WebGLContext::FailureReason> > * const out_failReasons), Line 625}
0x00007ffe80b5d5b6 {xul.dll!mozilla::WebGLContext::CreateAndInitGLWith(already_AddRefed<mozilla::gl::GLContext>(*)(const mozilla::gl::SurfaceCaps &, mozilla::gl::CreateContextFlags, mozilla::WebGLContext *, std::vector<mozilla::WebGLContext::FailureReason,std::allocator<mozilla::WebGLContext::FailureReason> > * const) fnCreateGL, const mozilla::gl::SurfaceCaps & baseCaps, mozilla::gl::CreateContextFlags flags, std::vector<mozilla::WebGLContext::FailureReason,std::allocator<mozilla::WebGLContext::FailureReason> > * const out_failReasons), Line 681}
0x00007ffe80b5d372 {xul.dll!mozilla::WebGLContext::CreateAndInitGL(bool forceEnabled, std::vector<mozilla::WebGLContext::FailureReason,std::allocator<mozilla::WebGLContext::FailureReason> > * const out_failReasons), Line 793}
0x00007ffe80b62eca {xul.dll!mozilla::WebGLContext::SetDimensions(int signedWidth, int signedHeight), Line 1012}
So somehow we dropped the frames in the final output.
Reporter | ||
Comment 1•7 years ago
|
||
Unfortunately, this is intentional. Stack walking is suppressed before calling stub_LdrLoadDll() for x64 so the stackwalker returns nothing when these DLLs are loaded. There is only the pseudo stack available from the profiler. Resolve as wontfix.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Comment 2•7 years ago
|
||
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #1)
> Unfortunately, this is intentional. Stack walking is suppressed before
> calling stub_LdrLoadDll() for x64
Is it? How?
Comment 3•7 years ago
|
||
Oh, I see: http://searchfox.org/mozilla-central/rev/3a3af33f513071ea829debdfbc628caebcdf6996/mozglue/build/WindowsDllBlocklist.cpp#776
Of course, the function table lock... we can't stackwalk a thread while it's modifying the function table. So instead of skipping sampling altogether, we can still get pseudo stacks for the samples during that time.
Comment 4•7 years ago
|
||
We could make AutoSuppressStackWalking get a synchronous backtrace and use that stored backtrace for all samples of that thread that occur while stack walking is suppressed.
We still wouldn't get C++ stacks for other threads during that time, though. But it would be an improvement.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
As a baby step, would it be sufficient for profile accuracy if we just did something like?
void DummyEntry_StackWalkingWasSuppressed() { }
if (sStackWalkSuppressions) {
[Set up a single frame with addr = &DummyEntry_StackWalkingWasSuppressed]
return;
}
Comment 6•7 years ago
|
||
How would that help? I think Cervantes is interested in the complete C++ stack up to the call to LoadDll.
Reporter | ||
Comment 7•7 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #4)
> We could make AutoSuppressStackWalking get a synchronous backtrace and use
> that stored backtrace for all samples of that thread that occur while stack
> walking is suppressed.
>
Yes, this is doable, but we can only catch the 1st LdrLoadDll call on a specific thread and skip indirectly loaded dlls, or we will enter a deadlock. If some other thread is loading a dll, we may also want to avoid a synchronous stackwalk, or it will also be blocked by that dll load. This can create a long wait chain among several threads: sampler thread waiting for the loader lock to be held by the sampled thread doing sync stackwalk, waiting for a thread loading DLL.
> We still wouldn't get C++ stacks for other threads during that time, though.
> But it would be an improvement.
Reporter | ||
Comment 8•7 years ago
|
||
I tried to make a on-thread stackwalk when we suppress stackwalk the first time and use the saved stack for the sampler when we are still loading DLL. The sampler still gets empty stack. We saved nothing in the MozStackWalk() in AutoSuppressStackWalking() because the stackwalker thread gets an error in GetThreadContext(), which I still don't know why. I think this is better handled using a kernel stacakwalker (bug 1379286).
Reporter | ||
Comment 9•7 years ago
|
||
A sample profile:
https://perf-html.io/public/93bf39bc02fed62f82f36ba0cf306fe7503e706f/calltree/?hiddenThreads=&range=1.6588_1.8707&search=loaddll&thread=5&threadOrder=0-2-3-5-1-4
Assignee: nobody → cyu
Attachment #8891258 -
Attachment is obsolete: true
Attachment #8891974 -
Flags: review?(mstange)
Comment 10•7 years ago
|
||
Comment on attachment 8891974 [details] [diff] [review]
Pre-stackwalk and use the saved stack during stackwalk suppression
Review of attachment 8891974 [details] [diff] [review]:
-----------------------------------------------------------------
Some drive-by comments from the perspective of a semi-outsider.
I have a feeling that this code is going to be non-obvious to someone who first encounters it. There's an increase in cognitive load from AutoSuppressStackWalking going from a one-liner to a beefy function with synchronization etc. I think it would be helpful to sprinkle comments around liberally.
Also, out of a concern for regressions (especially in some of the rarer, weirder uses of AutoSuppressStackWalk) I'd like to ask that you please not land this until after merge day, and please hang out in #flow to watch for complaints. Maybe even ask ehsan or overholt to test-drive a build for a bit.
::: mozglue/misc/StackWalk.cpp
@@ +241,5 @@
> // solution would be to maintain a counter per thread, but then it would be
> // more difficult for WalkStackMain64 to read the suspended thread's counter.
> static Atomic<size_t> sStackWalkSuppressions;
>
> +static const size_t kMaxFrames = 1024;
For example, this would be a great place to add something like:
// The following helpers enable us to grab a stack trace at the point of AutoSuppressStackWalk, which will be substituted for any stacks that we can't grab while the suppression is active.
@@ +250,5 @@
> + DWORD mThreadId;
> + size_t mFrameCount;
> +};
> +static SuppressionStack sSuppressionStack;
> +static ProfilerSamplingCurrentThreadCB sProfilerSamplingCurrentThreadCB;
It took me several attempts to understand "CB". It might be better to just spell out Callback everywhere.
@@ +325,5 @@
> + return;
> + }
> +
> + // First suppression of stack walking. It's safe to save the current stack
> + // for the Gecko Profiler.
Maybe elaborate on this and mention that we'll use the saved stack during the suppression.
@@ +764,5 @@
> data.walkCallingThread = (targetThread == currentThread);
> + data.saveSuppressionStack = aSaveSuppressionStack;
> +
> + // Return saved stack during stackwalk suppression.
> + if (aCallback != StackWalkCallback) {
This seems fragile, what happens if other people add more callers in the future with different callbacks?
::: mozglue/misc/StackWalk_windows.h
@@ +10,5 @@
> *
> * See comment in StackWalk.cpp
> */
> +
> +using ProfilerSamplingCurrentThreadCB = bool (*)();
I would recommend moving these down a few lines, below AutoSuppressStackWalking, so that the /***/ comment stays next to AutoSuppressStackWalk (the thing it's describing).
::: tools/profiler/core/platform.cpp
@@ +2096,5 @@
> + return false;
> + }
> +
> + if (NS_IsMainThread()) {
> + return true;
Why is the condition always true when we are on the main thread? There must be some logic here that is non-obvious (to me as an outsider at least). I'd recommend a comment.
Comment 11•7 years ago
|
||
Also, the documentation for CreateSemaphore [1] says that the failure value for semaphore handles is NULL rather than INVALID_HANDLE_VALUE, so I'd recommend changing to null-checks throughout. (I know, these rules are messy! [2])
[1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms682438(v=vs.85).aspx
[2] https://blogs.msdn.microsoft.com/oldnewthing/20040302-00/?p=40443/
Reporter | ||
Updated•7 years ago
|
Attachment #8891974 -
Flags: review?(mstange)
Updated•7 years ago
|
status-firefox57:
--- → fix-optional
Priority: -- → P2
Updated•7 years ago
|
Summary: Some stack frames are skipped in the profile → No stack frames during LdrLoadDll
Updated•7 years ago
|
Whiteboard: inj+
Updated•6 years ago
|
Assignee: cervantes.yu → nobody
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•