Closed Bug 1788859 Opened 3 years ago Closed 3 years ago

Stack-walking may cause deadlocks during calls to patched LdrLoadDll's

Categories

(Core :: Gecko Profiler, defect, P2)

Unspecified
Windows
defect

Tracking

()

RESOLVED DUPLICATE of bug 1836225
106 Branch
Tracking Status
firefox106 --- fixed

People

(Reporter: mozbugz, Assigned: mozbugz)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Spawned from bug 1687510 comment 7:

I think I just got the deadlock (win10/x64 local build). I attached a debugger, the main thread was stuck in the interceptor while patched_LdrLoadDll was calling stub_LdrLoadDll().
We could add AutoSuppressStackWalking in these patched_... functions. It may not fix the suppression-check race, but it would help remove places where it could happen.

I'm leaving bug 1687510 open, as it shows some potential race (see bug 1687510 comment 5) that should probably be handled there.

Stack-walking during some Windows API calls result in deadlocks, see details in StackWalk.cpp.

Attachment #9292789 - Attachment description: Bug 1788859 - Suppress stack-walking while calling LdrLoadDll - r?florian → Bug 1788859 - Suppress stack-walking while calling LdrLoadDll in WindowsDllBlocklist.cpp - r?florian
Summary: Stack-walking may cause deadlocks during calles to patched LdrLoadDll's → Stack-walking may cause deadlocks during calls to patched LdrLoadDll's
Pushed by gsquelart@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/2676d5d34684 Suppress stack-walking while calling LdrLoadDll in WindowsDllBlocklist.cpp - r=florian
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 106 Branch

This changeset from bug 1788859 was an attempt at mitigating bug
1687510. With the proper fix coming in bug 1836225, this changeset is no
longer required. The stack walk suppressions that this changeset
adds are already on a stack walk suppressed path, as guarded by the
scope of the surrounding glue::ModuleLoadFrame. Indeed:

  • the constructor for glue::ModuleLoadFrame calls
    sLoaderAPI->ConstructAndNotifyBeginDllLoad;

  • LoaderPrivateAPIImp::ConstructAndNotifyBeginDllLoad and
    FallbackLoaderAPI::ConstructAndNotifyBeginDllLoad both end up calling
    gLoaderObserver->OnBeginDllLoad or mLoaderObserver->OnBeginDllLoad;

  • in both cases, the LoaderObserver is gMozglueLoaderObserver, as is
    guaranteed by the call to glue::ModuleLoadFrame::StaticInit in
    DllBlocklist_Initialize (which is the function that sets up this
    fallback patched_LdrLoadDll hook in case the launcher process failed
    to set up the freestanding patched_LdrLoadDll hook).

Depends on D181437

A patch has been attached on this bug, which was already closed. Filing a separate bug will ensure better tracking. If this was not by mistake and further action is needed, please alert the appropriate party. (Or: if the patch doesn't change behavior -- e.g. landing a test case, or fixing a typo -- then feel free to disregard this message)

Backout by yjuglaret@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f6a41bd11965 Backed out changeset 2676d5d34684 because it is no longer required. r=handyman,florian

Backout merged to central: https://hg.mozilla.org/mozilla-central/rev/f6a41bd11965

Normally we reopen backed out bugs but given this was fixed in 106 branch maybe you got other plans and will leave this for you to decide.

Flags: needinfo?(yjuglaret)

Maybe I should not have used a backout to remove that code, but the situation is under control, the patches in bug 1836225 fix the issue that was observed here. Resolving as duplicate which is maybe the best way to express that.

Duplicate of bug: 1836225
Flags: needinfo?(yjuglaret)
Resolution: FIXED → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: