Closed Bug 1530245 Opened 8 months ago Closed 7 months ago

Launch sandboxed x86 child processes on WoA from an x86 sandboxer process

Categories

(Core :: Security: Process Sandboxing, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
firefox67 + fixed

People

(Reporter: cpearce, Assigned: cpearce)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 4 obsolete files)

On Windows on ARM64, rather than trying to sandbox the x86 GMP process from the aarch64 Firefox parent process (which requires fixing bug 1523765, which is tricky), an alternate approach would be to spawn an x86 child process which launches and sandboxes the GMP process.

Attached file WIP bundle (obsolete) —

When applied, the patches in this bundle almost works.

If I load http://cpearce.github.io/mse-eme/ I can see the GMP process spawn, and I see the EME key exchange occur - so the GMP process is being spawned and talking to the parent and content process. But then we hit a fatal assertion.

The stack frame of the first failed assertion is:

xul.dll!mozilla::ipc::GeckoChildProcessHost::OpenPrivilegedHandle(unsigned long aPid) Line 1284 C++
xul.dll!mozilla::ipc::GeckoChildProcessHost::PerformAsyncLaunch(std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > aExtraOpts) Line 1244  C++
xul.dll!mozilla::ipc::GeckoChildProcessHost::RunPerformAsyncLaunch::<unnamed-tag>::operator()() Line 589  C++
xul.dll!mozilla::detail::RunnableFunction<`lambda at c:/Users/chris/src/firefox/ipc/glue/GeckoChildProcessHost.cpp:588:24'>::Run() Line 559 C++
xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 1164 C++
xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 474  C++

The logging I added (at sandboxBroker.cpp:288) also shows that after we spawned the GMP process it exited with code 103, which corresponds to ERROR_TOO_MANY_SEM_REQUESTS. I don't understand why we get that yet.

Also, in case it helps, I'm running an x86 parent process with an x86 child process. I'm building with this bash script, which puts the x86 binaries in the right place:

MOZCONFIG=i686.dbg.mozconfig ./mach build binaries || exit

mkdir -p obj-i686-dbg/dist/bin/i686/gmp-clearkey/0.1/

cp obj-i686-dbg/dist/bin/plugin-container.exe obj-i686-dbg/dist/bin/i686/
cp obj-i686-dbg/dist/bin/xul.dll obj-i686-dbg/dist/bin/i686/
cp obj-i686-dbg/dist/bin/gmp-clearkey/0.1/clearkey.dll obj-i686-dbg/dist/bin/i686/gmp-clearkey/0.1/
cp obj-i686-dbg/dist/bin/gmp-clearkey/0.1/manifest.json obj-i686-dbg/dist/bin/i686/gmp-clearkey/0.1/
cp obj-i686-dbg/dist/bin/nss3.dll obj-i686-dbg/dist/bin/i686/
cp obj-i686-dbg/dist/bin/mozglue.dll obj-i686-dbg/dist/bin/i686/
cp obj-i686-dbg/dist/bin/lgpllibs.dll obj-i686-dbg/dist/bin/i686/
cp obj-i686-dbg/dist/bin/msvcp140.dll obj-i686-dbg/dist/bin/i686/
cp obj-i686-dbg/dist/bin/vcruntime140.dll obj-i686-dbg/dist/bin/i686/

(In reply to Chris Pearce [:cpearce (GMT+13)] from comment #1)
...

The logging I added (at sandboxBroker.cpp:288) also shows that after we spawned the GMP process it exited with code 103, which corresponds to ERROR_TOO_MANY_SEM_REQUESTS. I don't understand why we get that yet.

This appears to be from the launch of the content processes in the normal SandboxBroker.

I seem to getting the same issue, but I've been having trouble getting windbg working on the x86 process.

One thing I thought might be causing a problem is that because we duplicate to the GMP process and it is no longer launched by the parent SandboxBroker, we need to call AddTargetPeer on the parent SandboxBroker with the new child GMP process handle.
Otherwise it doesn't know it is a valid target for the content process to duplicate to.

I added a static version of AddTargetPeer to SandboxBroker (I don't know why I didn't make this static to start with) and called it after GMP remote launch, but it doesn't fix the current problem.

This is slightly related to the Unix fork server project, in that they both involve remoting the launch operation to another process, although I don't know if it will end up making sense to share any code.

See Also: → 1470591

I had to edit the patch to remove some things, but I think it should apply.

Attached file WIP bundle v2 (obsolete) —

Updated bundle; doesn't shutdown sandboxer process explicitly. Part of the issues I was seeing was caused by the explicit shutdown of the sandboxer process after starting up the GMP process. This caused the GMP process to terminate, since I guess it's not started detached or somesuch?

We're still hitting the ERROR_TOO_MANY_SEM_REQUESTS error when trying to open a handle to one of our processess. I don't know why that is.

This bundle includes Bob's wibble patch.

Attachment #9046260 - Attachment is obsolete: true
Attached patch DuplicateHandlesForLauncher (obsolete) — Splinter Review

(In reply to Chris Pearce [:cpearce (GMT+13)] from comment #5)

Created attachment 9046589 [details]
WIP bundle v2

Updated bundle; doesn't shutdown sandboxer process explicitly. Part of the issues I was seeing was caused by the explicit shutdown of the sandboxer process after starting up the GMP process. This caused the GMP process to terminate, since I guess it's not started detached or somesuch?

We're still hitting the ERROR_TOO_MANY_SEM_REQUESTS error when trying to open a handle to one of our processess. I don't know why that is.

I actually looked where you were logging this today and the exit code is actually STILL_ACTIVE (259) (0x103), so this is a red herring.

The OpenPrivilegedHandle failure was because we were trying to use a handle from the launcher process. We need to duplicate it to our process.
The reverse applies for the handles to share.
In theory we would need to do the same for AddTargetPeer, but I don't think we need that on the launcher interface. We only ever need to call that in the parent I think.

We're probably leaking HANDLEs with this patch I think and probably need to add DUPLICATE_CLOSE_SOURCE to the DuplicateHandle options as well. Need to make sure that means we won't close them twice.

I had to start the BackgroundHangMonitor to get this working for me as well.

I also changed the launcher process to use firefox.exe, because I think that makes more sense.

Anyway with this patch on top of your latest bundle this works for x86 locally and on my arm64 device and I get playback for the clearkey demo. \o/

Definitely some shutdown issues with either the GMP or launcher, because if I refresh then it crashes the browser.
It probably makes sense to have a single launcher process for launching all GMP ones.
Then shut it down after a timeout once any GMP processes have finished and no new ones start.

Attachment #9046428 - Attachment is obsolete: true
Priority: -- → P1
Assignee: nobody → cpearce
Attachment #9046589 - Attachment is obsolete: true
Attachment #9046754 - Attachment is obsolete: true

If this fix doesn't land this week, we will need to uplift it to 67. This is our workaround for 67+ until a native aarch64 build for widevine is available.

Hi Chris, Anthony, Chuck, can you confirm that landing this change (in m-c this week during soft code freeze) will not break media playback (Netflix, Youtube, Amazon Prime,etc )in non-EME builds when 67 goes from Nightly to Beta (on Monday, Marc 18th)?

The lowest risk option would be to land this fix in m-c 68 and give it a week to stick before uplifting this in 67. Pascal, FYI.

Flags: needinfo?(pascalc)
Flags: needinfo?(cpearce)
Flags: needinfo?(charmston)
Flags: needinfo?(ajones)

(In reply to Ritu Kothari (:ritu) from comment #10)

Hi Chris, Anthony, Chuck, can you confirm that landing this change (in m-c this week during soft code freeze) will not break media playback (Netflix, Youtube, Amazon Prime,etc )in non-EME builds when 67 goes from Nightly to Beta (on Monday, Marc 18th)?

I expect that landing this will not break media playback on other platforms. Most of the behaviour changes are behind "if on Windows on ARM64" guards, so this should only affect the platform in question.

The lowest risk option would be to land this fix in m-c 68 and give it a week to stick before uplifting this in 67. Pascal, FYI.

I agree this is the lowest risk option. I would like to try to get it landed this week so we can uncover any issues sooner, but that depends on how jed's review goes, and how comfortable relman are with that. :)

Flags: needinfo?(cpearce)
Flags: needinfo?(charmston)
Flags: needinfo?(ajones)

Note that on its own, this won't do much. This will need an uplift of bug 1529194 and bug 1527463 and anything in between those two (bug 1533659 is one of them that I just landed now, and there will be more before bug 1527463 lands)

Pushed by cpearce@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/00fe102f6c06
Make GeckoChildProcessHost::mSandboxBroker an abstract pointer. r=bobowen
https://hg.mozilla.org/integration/autoland/rev/64abd0254f73
Launch sandbox from new remote sandbox broker process. r=jld,bobowen
Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
Flags: needinfo?(pascalc)
You need to log in before you can comment on or make changes to this bug.