Closed Bug 1779312 Opened 2 years ago Closed 2 years ago

Sandbox code uses strerror in contexts that require async signal safety

Categories

(Core :: Security: Process Sandboxing, defect)

Unspecified
Linux
defect

Tracking

()

RESOLVED FIXED
105 Branch
Tracking Status
firefox105 --- fixed

People

(Reporter: jld, Assigned: jld)

References

Details

Attachments

(4 files)

This is a very old bug in the Linux sandboxing: using strerror when logging error messages in contexts that are or may be async signal unsafe, like the SIGSYS handler (because the syscall being intercepted could have been in async signal context) or, more importantly, after cloneing the child process.

I've known about this for a while but hadn't gotten around to fixing it, and it wasn't obviously causing problems… until it did, in bug 1778052, where a patch was backed out because it caused strerror to always be called after clone and lost the race often enough to make the mochitests ~100% orange.

Conveniently, glibc provides strerrorname_np and strerrordesc_np (the first is the symbol name like EINVAL, the second is the text description in “the default locale”) which are explicitly documented to be async signal safe (and thread safe). Inconveniently, those were added in glibc 2.32 and Mozilla's official builds use an older version.

What I'm planning to do here is use strerrorname_np if available, otherwise a switch statement to handle most of the common errors, and fall back to error %d with the numeric value if the name lookup (either version) fails.

Two minor things I noticed while converting the existing sandbox logging:

  1. One call site was using %u, but that doesn't exist in this printf
    dialect, only %d; signedness is determined by the actual argument
    type via template magic.

  2. POSIX functions that return an error number just return the number;
    there was one place that was negating it before use, as if it had
    come from the Linux syscall ABI.

Originally this was written for B2G and used the Android logging
facility, which (like syslog) includes a severity level. However, all
current usage is on desktop where we just write to stderr, and there was
never much demand to add support for any log levels besides "error".

More importantly for the current situation, renaming the macro to
SANDBOX_LOG avoids confusion between SANDBOX_LOG_ERROR and
SANDBOX_LOG_ERRNO (or SANDBOX_LOG_ERROR_ERRNO or whatever).

This adds two new logging macros, which are intended to be async signal
safe:

  • SANDBOX_LOG_ERRNO, which appends the error similarly to perror but
    uses the error identifier (e.g., EINVAL instead of Invalid argument).
    Unlike perror, formatting directives are available as for SANDBOX_LOG.

  • SANDBOX_LOG_WITH_ERROR is the same thing but the error number is the
    first argument instead of using errno; this is useful for newer POSIX
    APIs which return an error number.

This will be used in the next patch to replace the existing use of
strerror, which is not async signal safe (or thread-safe).

strerror is async signal unsafe, and we're using it in contexts where
that's a problem: in particular in the child process after clone()ing,
where it can deadlock if it takes locks the parents' other threads had
held (or cause other undefined behavior), but also in the SIGSYS handler
if it's nested inside an async signal. It's also thread-unsafe.

This is mostly a mechanical replacement with the new SANDBOX_LOG_ERRNO
or SANDBOX_LOG_WITH_ERROR; two messages had the error string in the
middle and have been adjusted.

Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2413d3f6d41c
Preliminary fixes to some misuses of SANDBOX_LOG_ERROR. r=glandium
https://hg.mozilla.org/integration/autoland/rev/4ed8e078268a
Rename `SANDBOX_LOG_ERROR` to just `SANDBOX_LOG`. r=glandium
https://hg.mozilla.org/integration/autoland/rev/de2eccf44572
Add macros for Linux sandbox logging with an error code. r=glandium
https://hg.mozilla.org/integration/autoland/rev/c84108ada24c
Replace uses of strerror in Linux sandbox code. r=glandium

(Belatedly adjust summary to reflect the actual thing that's being fixed.)

Summary: strerror is async signal unsafe → Sandbox code uses strerror in contexts that require async signal safety
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: