Closed Bug 1297740 Opened 3 years ago Closed 2 years ago

ASAN UAF in PerformAsyncLaunch (if MOZ_LOG/etc are set)

Categories

(Core :: IPC, defect, P2, critical)

defect

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox51 --- wontfix
firefox-esr52 --- wontfix
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 --- fixed

People

(Reporter: jesup, Assigned: gcp)

References

Details

(Keywords: csectype-uaf, sec-moderate, Whiteboard: [post-critsmash-triage][adv-main59+])

Attachments

(2 files, 3 obsolete files)

I modified testing/mochitest/runtests.py to add a MOZ_LOG setting for Try, and got multiple ASAN UAFs.

It appears that the ownership model for the GeckoChildProcessHost is borked - when PerformAsyncLaunch runs (in a nonowningrunnablemethod!) the 'this' object has been deleted already.  PerformAsyncLaunch seems to only touch 'this' if MOZ_LOG or NSPR_LOG_MODULES is set.

https://treeherder.mozilla.org/logviewer.html#?job_id=26272983&repo=try#L1943
https://treeherder.mozilla.org/logviewer.html#?job_id=26272995&repo=try#L2069

Marking as sec-high instead of critical since it requires MOZ_LOG/NSPR_LOG_MODULES, which is unusual for non-mozilla-devs.  Note that this may also mean that the value doesn't get into the child process; I'm not sure.
Jesup what kind of process was being launched?
Keywords: sec-highsec-moderate
Priority: P1 → P2
Flags: needinfo?(rjesup)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #1)
> Jesup what kind of process was being launched?

Content Process I'm 99% sure; but I suspect it happens to any of them
Flags: needinfo?(rjesup)
Group: core-security → dom-core-security
This can't happen with NPAPI plugin processes because of the synclaunch behavior. I guess it can happen with the async-launched forms if the client decides to kill off the process before we actually get around to launching it. I'd be interested *why* that's happening, but we should fix this regardless.
Attached file asan_crash
This happened again restarting a profile after a crash - session restore came up for a second, then ASAN crash.
It repros under ASAN+rr :-)

It appears the problem is that if the Host that called PR_SetEnv goes away, the string use for PR_SetEnv() is destroyed, and subsequent PR_GetEnv's return the destroyed string (see prenv.h for why this is).  So this comment is incorrect.  I suggest for this case we simply leak the (short) string to avoid violating NSPR requirements.  I'm sure there are more complex solutions, but any string passed to PR_SetEnv() must have a lifetime past the last time it's fetched by PR_GetEnv (and past where the value fetched is used).  I'm not certain that doing it in ClearOnShutdown would even be sufficient (and that's a pain since it requires MainThread).

  // Passing temporary to PR_SetEnv is ok here if we keep the temporary
  // for the time we launch the sub-process.  It's copied to the new
  // environment.
  PR_SetEnv(buffer.BeginReading());

(Note: some process is shutting down in the SessionRestore case):
console.error: PushService: 
  stateChangeProcessEnqueue: Error transitioning state
  UnknownError

###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

[New Thread 8992.9364]
...
<spawns a new Content process and crashes with ASAN error>
Flags: needinfo?(wmccloskey)
Flags: needinfo?(jmathies)
Gabor, could you please take a look at this issue, see if there's a fix.
Flags: needinfo?(wmccloskey)
Flags: needinfo?(jmathies)
Flags: needinfo?(gkrizsanits)
For what it's worth, the changes I'm proposing in bug 1401786 will make it easy to do this kind of thing without side-effecting the parent process's environment.
Cool - though until that lands, I'd love to see a fix in 58 to avoid weird stuff happening - and to avoid ASAN bitching at me after a crash (and maybe other times)
Seems like :gcp is trying to land something that removes the problematic bits mentioned in Comment 6.

Bug 1386404.
Flags: needinfo?(gkrizsanits) → needinfo?(gpascutto)
Depends on: 1386404
For reference: the unsafe use of PR_SetEnv seems to have originated in bug 534764.  Bug 1248565 made it worse (always a use-after-free rather than conditional/racy) but then bug 1270752 undid that part.  There's also another instance of this anti-pattern in the profiler (AutoSetProfilerEnvVarsForChildProcess); that one might be safe but we should clean it up at some point.


But… is this really the same problem as in comment #0?  The logs have expired by now, but if that's about the entire GeckoChildProcessHost object being deleted then it's a different bug from the env var problems.
I reported the original; I think this is likely the same issue.  It was easier to investigate with rr...
Bug 1386404 is somewhat blocked on audioipc using locations on the filesystem, so I will extract the relevant parts and land them here first.
Flags: needinfo?(gpascutto)
Attached patch fix-asan-env (obsolete) — Splinter Review
MozReview-Commit-ID: KTdvzYIFYzv
Attachment #8927387 - Flags: review?(jld)
Attachment #8927387 - Flags: review?(jld) → review+
Welp, base::environment_map is only supported on POSIX, but the Windows code also wants to get those log variables set.
Blocks: 1386404
Depends on: 1257276
No longer depends on: 1386404
Attached patch env_patch.diff (obsolete) — Splinter Review
WIP on environment handling change for Unix and Windows
Assignee: nobody → gpascutto
Attachment #8927387 - Attachment is obsolete: true
Attached patch env-set-directly (obsolete) — Splinter Review
MozReview-Commit-ID: IL1MMoy8iH8
Attachment #8934609 - Flags: review?(jld)
Comment on attachment 8934609 [details] [diff] [review]
env-set-directly

Review of attachment 8934609 [details] [diff] [review]:
-----------------------------------------------------------------

Mostly looks good, but make sure this doesn't completely break --disable-sandbox, and warn about env_map not working yet in the non-sandboxed Windows case.

::: ipc/chromium/src/base/process_util.h
@@ +112,3 @@
>    // Environment variables to be applied in addition to the current
>    // process's environment, replacing them where necessary.
> +  EnvironmentMap env_map;

This isn't used by process_util_win.cc, only by the sandbox path in GeckoChildProcessHost, so these environment changes won't apply to non-sandboxed processes.

That may not need to be fixed in this bug, but there should at least be a FIXME comment pointing to bug 1257276.

::: ipc/glue/GeckoChildProcessHost.cpp
@@ +22,5 @@
>  #include "mozilla/Sprintf.h"
>  #include "prenv.h"
>  #include "nsXPCOMPrivate.h"
>  
> +#include "mozilla/EnvironmentMap.h"

Won't this break on --disable-sandbox builds, because the header is in security/sandbox?

::: security/sandbox/chromium/sandbox/win/src/target_process.cc
@@ +115,5 @@
> +  // Ignore return value? What can we do?
> +  FreeEnvironmentStrings(original_environment);
> +  // Cast away const and type. It's an input param only
> +  // so casting away const should be safe.
> +  LPVOID new_env_ptr = (void*)new_environment.data();

Comment nit: I agree that it's safe to cast away const here, but I'm not seeing why the pointer target would be const in the first place — new_environment isn't const and there's a non-const basic_string::data.

::: security/sandbox/win/src/sandboxbroker/sandboxBroker.h
@@ +8,5 @@
>  #define __SECURITY_SANDBOX_SANDBOXBROKER_H__
>  
>  #include <stdint.h>
> +#include <map>
> +#include <string>

Nit: these includes might have been needed in an earlier version of this patch but it doesn't look like that's the case now.
Attachment #8934609 - Flags: review?(jld) → review+
Attached patch env.diffSplinter Review
Addresses review comments. I ended up including just the code needed in the old non-sandboxed callpath. I think it's the least bad solution.
Attachment #8934213 - Attachment is obsolete: true
Attachment #8934609 - Attachment is obsolete: true
Attachment #8935806 - Flags: review?(jld)
Comment on attachment 8935806 [details] [diff] [review]
env.diff

Review of attachment 8935806 [details] [diff] [review]:
-----------------------------------------------------------------

r=me if the missing files are unchanged from the previous diff.

::: ipc/chromium/src/base/process_util_win.cc
@@ +287,5 @@
> +    cur++;
> +  return cur + 1;
> +}
> +
> +std::wstring AlterEnvironment(const wchar_t* env,

A comment that mentions where these two functions are copied from would be nice.  Also, can they be static?

::: ipc/glue/moz.build
@@ +20,5 @@
>      'CrashReporterHost.h',
>      'CrashReporterMetadataShmem.h',
>      'CrossProcessMutex.h',
>      'CrossProcessSemaphore.h',
> +    'EnvironmentMap.h',

This file seems to have been omitted from the diff.

::: security/sandbox/moz.build
@@ +43,5 @@
>          'chromium/base/callback_internal.cc',
>          'chromium/base/cpu.cc',
>          'chromium/base/debug/alias.cc',
>          'chromium/base/debug/profiler.cc',
> +        'chromium/base/environment.cc',

…and this file.
Attachment #8935806 - Flags: review?(jld) → review+
This put the mingw builds on fire. The fix is probably easy but I need to verify it doesn't break MSVC or clang again (sigh).
Flags: needinfo?(gpascutto)
Flags: needinfo?(gpascutto)
Duplicate of this bug: 1257276
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Depends on: 1428950
Depends on: 1430118
Group: dom-core-security → core-security-release
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main59+]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.