Closed Bug 1535704 Opened 6 years ago Closed 6 years ago

win32k lockdown causes mscom::ProcessRuntime::InitializeSecurity to fail

Categories

(Core :: IPC: MSCOM, defect, P1)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox66 --- disabled
firefox67 --- disabled
firefox68 + fixed

People

(Reporter: mjf, Assigned: bugzilla)

References

Details

Attachments

(6 files)

Summarizing an email thread:
When trying to enable Vorbis decoding on RDD, the Windows call CoInitializeEx was returning E_OUTOFMEMORY. Aaron suggested that it was because mscom::ProcessRuntime initialization was missing. That was removed during the win32k lockdown for RDD sandboxing.

After re-adding the mscom::ProcessRuntime initialization, mscom::ProcessRuntime::InitializeSecurity fails calling SetEntriesInAclW with error code 127. Disabling the win32k lockdown allows this call to succeed. Aaron suggested that I open a bug at this point.

I'll attach a patch that has Vorbis decoding on RDD enabled along with the debug printfs that shows the error.

The test I'm running is:
./mach mochitest browser/base/content/test/general/browser_contextmenu_childprocess.js

Aaron,

As requested, here is the bug and info I scraped from the email thread between you, Bob, and myself.

Flags: needinfo?(aklotz)

Thanks Michael! I'll take a peek at this today.

Flags: needinfo?(aklotz)

The main suspect that I had was the presence of a DelayLoaded import. Indeed, it turns out that advapi32!SetEntriesInAclW redirects to a DelayLoaded import from api-ms-win-security-provider-l1-1-0.dll, which resolves to ntmarta.dll.

My current theory is that the sandbox is blocking this DelayLoad, so the implementation of SetEntriesInAclW cannot be resolved.

OTOH, the placement of mscom::ProcessRuntime in the patch in this bug seems to be implying that the ProcessRuntime is being instantiated before the token is dropped.

Anyway, this is still my best lead at the moment.

I can confirm the same behaviour if I try the following on a content process in a debugger:

  • When the content process starts up, I make the debugger break when ntmarta is loaded via delayload from SetEntriesInAclW;
  • In the debugger, I changed the NTSTATUS return code from STATUS_SUCCESS to STATUS_ACCESS_DENIED;
  • Walking up the call stack, I end up with error code 127 being returned from SetEntriesInAclW.

So I guess the next question is, what is it about the RDD process sandbox policy that is blocking this delayloaded DLL?

For ease of debugging, I modified the GPU process to use the RDD sandbox policy so that I could quickly observe this issue.

In addition to the information that I included in Comment 3, there is this:

At one point during delayload resolution, the loader checks to see whether the library we're trying to load is a KnownDLL. As part of that check, the loader ends up calling NtOpenSection in search of a section named ntmarta.dll with a non-null RootDirectory handle. Here's the handle information that I dumped from the RootDirectory:

  Type         	Directory
  Attributes   	0x10
  GrantedAccess	0x3:
         None
         Query,Traverse
  HandleCount  	262
  PointerCount 	8499676
  Name         	\KnownDlls

This call to NtOpenSection fails with STATUS_ACCESS_DENIED. Amusingly enough, an object named ntmarta.dll doesn't actually exist, so if NtOpenSection had returned a different error code indicating as such, we wouldn't have failed. But since we do fail with an unexpected error code, it is propagated up the stack and the delayload resolver sets SetEntriesInAclW to point to a stub that returns the error code that we're seeing here.

I'm going to move this over to sandboxing, since at this point it's clear that we need to fix how the RDD sandbox deals with delayloads. TL;DR we need these NtOpenSection calls in search of sections named \KnownDlls\* to work, or at least fail without STATUS_ACCESS_DENIED

Bob, thoughts?

Component: IPC: MSCOM → Security: Process Sandboxing
Flags: needinfo?(bobowencode)
Blocks: 1533985
No longer blocks: 1533985
See Also: → 1533985
Blocks: 1524049
Attached patch rddcom.patchSplinter Review

(In reply to Aaron Klotz [:aklotz] from comment #5)
...

This call to NtOpenSection fails with STATUS_ACCESS_DENIED. Amusingly enough, an object named ntmarta.dll doesn't actually exist, so if NtOpenSection had returned a different error code indicating as such, we wouldn't have failed. But since we do fail with an unexpected error code, it is propagated up the stack and the delayload resolver sets SetEntriesInAclW to point to a stub that returns the error code that we're seeing here.

I'm going to move this over to sandboxing, since at this point it's clear that we need to fix how the RDD sandbox deals with delayloads. TL;DR we need these NtOpenSection calls in search of sections named \KnownDlls\* to work, or at least fail without STATUS_ACCESS_DENIED

Bob, thoughts?

Sorry about the delay, I knew this was going to take a little while to investigate, so the quick answer would have been "No useful ones.". :-)

First thing I did was add a |if (XRE_IsRDDProcess()) { return true; }| to the start of ProcessRuntime::IsWin32kLockedDown, so I could test this without win32k lockdown enabled.

The initial problem here is that this is all being done on a separate thread.
From a token point of view, the raised initial privileges is done using an impersonation token set on the main thread before the suspended process is resumed.
So, while this is done before the sandbox is lowered, it's not on the main thread so things fail.

Initially I tried to warm things up on the main thread by loading rpcss.dll (which CoInitializeEx needed) and then a dummy call to SetEntriesInAcl.
This worked to get me as far as CoInitializeSecurity, but that failed as well.

So, I tried a different tack. In the attached patch it tries to open the thread token on the main thread and if there is one sets this on the MTA thread before doing anything else.
When the sandbox is lowered it also dispatches a task to the MTA thread to RevertToSelf.
You have to do this after the main thread, because the sandbox hooks NtSetInformationThread (which is what RevertToSelf uses), to make sure no reverts happen before the sandbox one on the main thread.

This all worked without win32k lockdown, but failed when I enabled the lockdown. :-(

Tracing through the calls for CoInitializeSecurity, it uses win32 API to get some WinStation and Desktop information:

    5     0 [  4]         win32u!NtUserGetProcessWindowStation
    7     0 [  3]       ntdll!KiUserCallbackDispatch
    8     0 [  4]         ntdll!KiUserCallForwarder
   11     0 [  5]           ntdll!LdrpValidateUserCallTarget rax = 00000fff`83057cb0
   15    11 [  4]         ntdll!KiUserCallForwarder
    2     0 [  4]         user32!_ClientThreadSetup
   11     0 [  5]           user32!ClientThreadSetup
   10     0 [  6]             ntdll!RtlEnterCriticalSection rax = 0
   16    10 [  5]           user32!ClientThreadSetup
   19     0 [  6]             ntdll!RtlLeaveCriticalSection rax = 0
   40    29 [  5]           user32!ClientThreadSetup rax = 1
   10    69 [  4]         user32!_ClientThreadSetup
    5     0 [  4]         ntdll!NtCallbackReturn
    1     0 [  3]       win32u!NtUserGetProcessWindowStation rax = cc
>> More than one level popped 3 -> 3
   28  8661 [  3]       combase!CRpcResolver::GetThreadWinstaDesktop
    3     0 [  4]         KERNEL32!GetCurrentThreadId rax = 57e8
   30  8664 [  3]       combase!CRpcResolver::GetThreadWinstaDesktop
    1     0 [  4]         user32!NtUserGetThreadDesktop
    6     0 [  4]         win32u!NtUserGetThreadDesktop rax = d0
>> More than one level popped 3 -> 3
   44  8671 [  3]       combase!CRpcResolver::GetThreadWinstaDesktop
    1     0 [  4]         user32!NtUserGetObjectInformation
    6     0 [  4]         win32u!NtUserGetObjectInformation rax = 1
>> More than one level popped 3 -> 3
   54  8678 [  3]       combase!CRpcResolver::GetThreadWinstaDesktop
    1     0 [  4]         user32!NtUserGetObjectInformation
    6     0 [  4]         win32u!NtUserGetObjectInformation rax = 1
>> More than one level popped 3 -> 3
  211  8685 [  3]       combase!CRpcResolver::GetThreadWinstaDesktop

aklotz - any ideas if we can call CoInitializeSecurity in a way that might not require this?

It's possible we could hook those and return dummy / known values or broker to the main process.
I think that it might be possible to enable win32k lockdown as a delayed mitigation, when we lower the sandbox.
I'm not sure if not having it enabled from the start, causes any security problems, but I guess it would be better than not having it at all.

We might also find that we need to do other initialization before we lower the sandbox as well.

Flags: needinfo?(bobowencode) → needinfo?(aklotz)

Ewwww... this is so gross!

CoInitializeSecurity executes some COM goop to resolve and open a connection to the DCOM SCM. Part of that connection info includes winsta and desktop information. There are two code paths that this code follows, depending on whether or not user32 can be resolved. Note that this test ignores the state of Win32k lockdown.

If user32 cannot be resolved, the code branches to a fallback that can resolve that info without requiring user32. But that doesn't happen unless user32 is not loadable!

  • I think this is probably a bug that I should report to the COM team; they really should be checking for Win32k lockdown as well as whether user32 is accessible.

  • As a workaround, we need to somehow ensure that user32 is not resolvable by COM.

Flags: needinfo?(aklotz)

Here are some notes on this:

  • We need to get our act together with xul.dll dependencies. We're wasting startup time loading user32, gdi32 and friends in sandboxed processes with win32k lockdown. Plus they're causing problems like the ones in this bug.

  • The function involved in all of this is combase!IsGetProcessWindowStationPresent, which calls ApiSetQueryApiSetPresence to determine whether it can invoke user32.

  • The parameter passed to ApiSetQueryApiSetPresence is a UNICODE_STRING containing "ext-ms-win-ntuser-windowstation-l1-1-0"

  • I don't know how hard it is to fix our XUL dependencies. Trivially delayloading gdi32 and user32 is insufficient due to being loaded transitively. Maybe we can clean this up and add enough assertions such that nobody can add a breaking entry to OS_LIBS without being backed out.

  • This explicit user32 load should go away if we're locked down. If we need DLL blocking via CreateRemoteThread to ride the trains to make this happen (it is currently only on Nightly), then so be it.

  • Short-term it might be easiest to just take care of the ApiSetQueryApiSetPresence issue when locked down.

A rough patch to handle ApiSetQueryApiSetPresence does work, however I need to make it work selectively depending on Win32k lockdown.

  • I have emailed our Microsoft discussion list to inform them of the bug.
  • I'll move this back to MSCOM and assign myself to put together an interim solution.
Assignee: nobody → aklotz
Status: NEW → ASSIGNED
Component: Security: Process Sandboxing → IPC: MSCOM
OS: Unspecified → Windows 10
Priority: -- → P1
Summary: win32k lockdown causes ProcessRuntime::InitializeSecurity for RDD process → win32k lockdown causes mscom::ProcessRuntime::InitializeSecurity to fail
Depends on: 1546158

This allows us to loosen the coupling between the sandbox any code that needs
to run as soon as the token has been lowered.

We use std::list here because the observer service is not yet initialized.

Depends on D27833

Attachment #9058812 - Attachment description: Bug 1535704: Part 3 - When sandboxed, mscom::ProcessRuntime must make the MTA thread impersonate using the main thread's token; r=bobowen! → Bug 1535704: Part 4 - When sandboxed, mscom::ProcessRuntime must make the MTA thread impersonate using the main thread's token; r=bobowen!
Pushed by aklotz@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/74eb8e5f7143 Part 1 - Move IsWin32kLockedDown into mozglue; r=bobowen
Depends on: 1546545
Pushed by aklotz@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d49e716056f8 Part 2 - Add a hook to sandbox target initialization that catches mscom's attempts to resolve user32 when Win32k lockdown is enabled; r=bobowen https://hg.mozilla.org/integration/autoland/rev/d303f2ce387e Part 3 - Add an observer to sandboxTarget that fires once the main thread's token has been lowered; r=bobowen https://hg.mozilla.org/integration/autoland/rev/496213852ce6 Part 4 - When sandboxed, mscom::ProcessRuntime must make the MTA thread impersonate using the main thread's token; r=bobowen
See Also: → 1547791
See Also: → 1553249
See Also: → 1751367
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: