Closed Bug 1286865 Opened 3 years ago Closed 3 years ago

Reporting for non-fatal seccomp violations

Categories

(Core :: Security: Process Sandboxing, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox53 --- affected
firefox54 --- fixed

People

(Reporter: gcp, Assigned: jld)

References

(Blocks 1 open bug)

Details

(Whiteboard: sblc2)

Attachments

(6 files)

If at some point we make seccomp violations returns failure to the caller, instead of killing the process with SIGSYS, we lose the ability to trace the cause and underlying callers of the syscall.

This could cause silent, but hard to debug breakage for Firefox users.

This requires some way to collect call stacks from seccomp and/or Firefox itself.
Whiteboard: sblc2
Depends on: 1280469
One thing we could do that's relatively easy: gather telemetry for the number of calls to each syscall.  The easy part of that is having an atomic counter that the SIGSYS handler increments; the harder part is getting them injected into telemetry from a safe context.  That doesn't tell us the underlying cause, but could at least give us data on which ones might need a closer look.

More difficult is doing reporting like that while still allowing the syscall.  In principle this can be done by making the filter check the instruction pointer value and allow it from a particular instance of the syscall instruction that's not normally used, which the SIGSYS handler uses to reissue the call (which malicious code could just call directly, so this isn't for “load-bearing” security use).  Chromium sort of has support for this technique but it's a debug-only thing that applies to the entire filter and removes all security; last I saw, it'd need patches to the BPF compiler to allow testing the IP the same way as an argument.  This isn't what comment #0 asks for, but it could be useful for syscalls that we're currently allowing and would like to disallow.

And then there's the problem of stack walking.  Calling _Unwind_Backtrace out of the SIGSYS handler might work, but if the offending syscall was in an async signal handler, or in a context holding locks related to malloc or the dynamic linker (see dl_iterate_phdr(3)), it could deadlock or worse.  Nightly builds have --enable-profiling and therefore frame pointers, but that doesn't help us, because it applies only to Gecko, not to the system libraries at the top of the stack.  So that's where bug 1280469 comes in.

(Incidentally, the Gecko profiler used to do something like what we'd need here — the SIGPROF handler would copy the stack contents (up to some limit) into a buffer, and a background thread would run the breakpad unwinder asynchronously.  But it was removed — if I recall correctly both because the resource usage was impractical for profiler use and due to general technical debt / divergence from upstream breakpad.)
Whiteboard: sblc2 → sb?
This came up in the sandboxing meeting this morning.  :gcp suggested having a ring buffer of the last N denied syscalls and exposing that in about:support (including arguments) so that users could paste it into a bug if needed, in addition to collecting telemetry.  We probably also want to be able to include some of this info in crash reports, in case denied syscalls might have caused a crash indirectly (by dereferencing a null pointer, release asserting, etc.).  And we're going to defer the question of stack unwinding by assuming we can't have it.

At first I was thinking the SIGSYS handler would write directly to the ring buffer, using some kind of lockless atomic thing, but that gets complicated — multiple threads can write concurrently, and writing needs to be nonblocking (and thus lossy/fallible) because a writer might be preempting another writer and/or the reader. 

I think the right tool to get the data out of the signal handler is a socketpair: writing and reading are thread safe and async signal safe, either or both can be made nonblocking or not, and it doesn't need a lot of code to use.

Sockets also allow having a single reader in the parent process collecting all the messages, instead of doing it per-process in the child and using other forms of IPC (and more code) to get the info out.  This is also useful for processes that crash, which don't need to do any extra work to get the info out of the process.

Also, the socket writes can be blocking, as long as the reader can't indirectly block on a child process.  (This might rule out running it from the parent's main thread event loop, if we still have parent->child CPOWs or anything like that, but running this from a dedicated thread might be a good idea anyway.)  That means there's no need to keep a tally of dropped messages about unexpected syscalls, which is an edge case inside an edge case.

And now, some notes on the use cases.

Telemetry:

* I'm not sure how thread safety works here; we might wind up needing to send the main thread a runnable holding an array of syscall number or something like that.

* We need to decide on a number of buckets and can't change it without switching to a new histogram identifier.  I'm not sure how much these buckets cost / how much we should try to avoid allocating more than we need.  (If they're too expensive to use one per syscall number we could maybe just have a count of events, which is better than nothing.)  Alternately, maybe there's something else we can use besides histograms that's a better fit for (hopefully) sparse data like this.

* We probably won't need more than 1k syscall numbers, but some arches don't start at 0: x32 starts at 0x4000_0000, ARM OABI starts at 0x90_0000 but we don't care about that, the __ARM_NR space starts at 0xf_0000 and we sort of care about that, IA64 starts at 1024 and we probably don't care, and MIPS starts at some number of decimal thousands based on the subarch and we might care (MIPS has some presence in embedded, and Chromium supports sandboxing on it).

* Do we want to try separating this by process type?  Or limiting it to content only?

about:support:

* The hard part will probably be getting the table of numbers into JavaScript, but maybe XPCOM already has something for this.

* This will be reading the ring buffer on the main thread, but it can just use a mutex to coordinate with the socket reader thread (if that's how this winds up working).

Crash reporting:

* If there's one table for everything then we'd need to select out the entries for the crashing process.

* Not sure how much info to embed in a crash.  Just the number of recent rejected syscalls?  Also the syscall numbers?  Are there length limits or other reasons not to throw in all the arguments?

* The socket reader may not have finished (or started) reading the crashed child's last writes; should force a poll from the crash reporter to ensure that.

Testing:

* The plan is to keep crashing on Nightly, but this means landing code that won't be used until it gets to later stages of the release cycle.  That's kind of bad.  At a minimum we need unit tests that flip the pref for this, spin off a new content process, and do something bad.
Assignee: nobody → jld
Whiteboard: sb? → sblc2
Comments on the patches:

1. I mentioned the possibility of unit tests in a few comments, but have not (yet) written any; I've tested manually only.  It would be good to fix that, but I didn't want to delay getting more eyes on this any longer.

2. About the Telemetry being opt-out and nonexpiring: this would fall under “monitoring and solving product quality issues” (from https://wiki.mozilla.org/Firefox/Data_Collection#Requirements).  Any events in this histogram indicate a potential user-perceptible failure, which may or may not be directly visible as broken functionality (e.g., if there's a fallback for the blocked system call but it has lower performance).

3. System calls where we return an error directly from the seccomp-bpf policy (or return an error other than ENOSYS from a trap callback) are not reported; it's only the ones where we'd crash on Nightly.  There may be cases where we'd want to monitor possible problems but not crash on Nightly, but that would be a followup.

4. We expect a small number of distinct keys for the Telemetry histogram to actually be reported, but the set of possible keys is large and the exact keys that might show up are something we won't know until it happens; this is why it's a keyed count histogram instead of an enumeration.  I've tried to avoid doing things that might cause key count explosion (assuming non-malicious use, but someone who wants to spam bad telemetry can do that directly) and left comments to try to warn anyone who'd modify the code in the future.
Here's a screenshot of the current state of the about:support interface, looking more populated than will hopefully be typical, and demonstrating the fixed size of the log.
Comment on attachment 8832767 [details]
Bug 1286865 - Step 4: Report rejected syscall info in Telemetry.

https://reviewboard.mozilla.org/r/109002/#review110200

::: toolkit/components/telemetry/Histograms.json:10405
(Diff revision 1)
>      "description": "Whether the sandbox is enabled for the content process"
>    },
> +  "SANDBOX_REJECTED_SYSCALLS": {
> +    "alert_emails": ["jld@mozilla.com", "gcp@mozilla.com"],
> +    "bug_numbers": [1286865],
> +    "expires_in_version": "never",

The requirements for a never expiring probe are higher than for one that expires.

We generally prefer to see telemetry which expires after 6 releases (so `60` in the present case). You can always renew it (you'll get an email before it expires) if it's still useful, but we prefer to do it that way because it goes away if people stop paying attention to it.

It's even more important since this is an opt-out probe, which also have higher requirements.

So, would an expiring opt-out probe work for you or would you like me to consult with the other data stewards to discuss it?

::: toolkit/components/telemetry/Histograms.json:10410
(Diff revision 1)
> +    "expires_in_version": "never",
> +    "releaseChannelCollection": "opt-out",
> +    "kind": "count",
> +    "keyed": true,
> +    "cpp_guard": "XP_LINUX",
> +    "description": "System calls rejected by a seccomp-bpf sandbox policy"

Two questions so we can make the description a little bit clearer:

By "system call", do you mean an ID / integer of some kind, or the name / string of a system call?

How often is it triggered? Everytime a user does something which triggers a system call which is blocked by the sandbox, we get a telemetry ping for that syscall?
Attachment #8832767 - Flags: review?(francois)
(In reply to Jed Davis [:jld] {⏰UTC-7} from comment #8)
> 2. About the Telemetry being opt-out and nonexpiring: this would fall under
> “monitoring and solving product quality issues” (from
> https://wiki.mozilla.org/Firefox/Data_Collection#Requirements).  Any events
> in this histogram indicate a potential user-perceptible failure, which may
> or may not be directly visible as broken functionality (e.g., if there's a
> fallback for the blocked system call but it has lower performance).

Sorry, I should have read this before I reviewed your patch.

I do agree this is a strong case for non-expiring. I'm happy to bring it up with bsmedberg if you prefer to keep it that way.
Comment on attachment 8832765 [details]
Bug 1286865 - Step 2: Add XPCOM bindings for sandbox syscall reporter.

https://reviewboard.mozilla.org/r/108998/#review110500

r+ on the build system parts.
Attachment #8832765 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8832763 [details]
Bug 1286865 - Step 0: Turn off crash-on-seccomp-fail by default on non-nightly.

https://reviewboard.mozilla.org/r/108994/#review110544
Attachment #8832763 - Flags: review?(gpascutto) → review+
Comment on attachment 8832766 [details]
Bug 1286865 - Step 3: Expose rejected syscall log in about:support.

https://reviewboard.mozilla.org/r/109000/#review110838

LGTM
Attachment #8832766 - Flags: review?(adw) → review+
Comment on attachment 8832764 [details]
Bug 1286865 - Step 1: Gather syscall info from SIGSYS handlers into the parent process.

https://reviewboard.mozilla.org/r/108996/#review110632

::: security/sandbox/linux/Sandbox.cpp:142
(Diff revision 1)
>    gChromiumSigSysHandler(nr, info, ctx);
>    if (!ContextIsError(ctx, ENOSYS)) {
>      return;
>    }
>  
> -  pid_t pid = getpid();
> +  SandboxReport report = gSandboxReporterClient->MakeReportAndSend(&savedCtx);

Although this is a Maybe, I guess you don't need a check here because you abort elsewhere if constructing it fails. But I'd consider it good practice to add a debug assert here anyway, in case we ever start moving things around.

::: security/sandbox/linux/SandboxFilter.cpp:172
(Diff revision 1)
>        // Timekeeping
>      case __NR_clock_gettime: {
>        Arg<clockid_t> clk_id(0);
>        return If(clk_id == CLOCK_MONOTONIC, Allow())
>  #ifdef CLOCK_MONOTONIC_COARSE
> +        // Used by SandboxReporter.

Presumably by other things too, or we wouldn't whitelist it? :-)

::: security/sandbox/linux/SandboxReporterClient.h:27
(Diff revision 1)
> +  explicit SandboxReporterClient(SandboxReport::ProcType aProcType,
> +                                 int aFd = kSandboxReporterFileDesc);
> +
> +  // Constructs a report from a signal context (the ucontext_t* passed
> +  // as void* to an sa_sigaction handler); uses the caller's pid and tid.
> +  SandboxReport MakeReport(const void* aContext);

Can this be private?

::: security/sandbox/linux/SandboxReporterClient.h:29
(Diff revision 1)
> +
> +  // Constructs a report from a signal context (the ucontext_t* passed
> +  // as void* to an sa_sigaction handler); uses the caller's pid and tid.
> +  SandboxReport MakeReport(const void* aContext);
> +
> +  void SendReport(const SandboxReport& aReport);

Same: can probably be private?

::: security/sandbox/linux/SandboxReporterClient.cpp:34
(Diff revision 1)
> +					     int aFd)
> +  : mProcType(aProcType)
> +  , mFd(aFd)
> +{
> +  // TODO, maybe: assert that the fd is a socket of the expected type.
> +  // (Or, see if the crash reporter does anything like that & follow its lead.)

Might as well file an associated bug and nom it :)

::: security/sandbox/linux/SandboxReporterClient.cpp:49
(Diff revision 1)
> +  // sending uninitialized alignment padding to another process.
> +  PodZero(&report);
> +
> +  clock_gettime(CLOCK_MONOTONIC_COARSE, &report.mTime);
> +  report.mPid = getpid();
> +  MOZ_RELEASE_ASSERT(report.mPid > 0);

What are you hoping to accomplish by this assert?

::: security/sandbox/linux/SandboxReporterClient.cpp:72
(Diff revision 1)
> +  // The "common" seccomp-bpf policy allows sendmsg but not send(to),
> +  // so just use sendmsg even though send would suffice for this.
> +  struct iovec iov;
> +  struct msghdr msg;
> +
> +  iov.iov_base = const_cast<void*>(static_cast<const void*>(&aReport));

Does a direct const_cast not work? Surprising!

::: security/sandbox/linux/reporter/SandboxReporter.h:33
(Diff revision 1)
> +  // For normal use, don't construct this directly; use the
> +  // Singleton() method.
> +  //
> +  // For unit testing, use this constructor followed by the Init
> +  // method; the object isn't usable unless Init returns true.
> +  SandboxReporter();

can be explicit

::: security/sandbox/linux/reporter/SandboxReporter.h:46
(Diff revision 1)
> +  void GetClientFileDescriptorMapping(int* aSrcFd, int* aDstFd) const;
> +
> +  // A snapshot of the report ring buffer; element 0 of `mReports` is
> +  // the `mOffset`th report to be received, and so on.
> +  struct Snapshot {
> +    uint64_t mOffset;

Why not size_t? Especially as kSandboxReporterBufferSize is size_t.

::: security/sandbox/linux/reporter/SandboxReporterCommon.h:20
(Diff revision 1)
> +// Note: this is also used in libmozsandbox, so dependencies on
> +// symbols from libxul probably won't work.
> +
> +namespace mozilla {
> +static const size_t kSandboxSyscallArguments = 6;
> +static const int kSandboxReporterFileDesc = 5;

What's the magic behind this value?
Attachment #8832764 - Flags: review?(gpascutto) → review+
Comment on attachment 8832765 [details]
Bug 1286865 - Step 2: Add XPCOM bindings for sandbox syscall reporter.

https://reviewboard.mozilla.org/r/108998/#review111104

Do we need an extra reviewer for the IDL parts?

What's the rationale for exposing that it is a ringbuffer through IDL? Consistency of numbering? (seeing the total amount of failed calls?)
Attachment #8832765 - Flags: review?(gpascutto) → review+
Comment on attachment 8832766 [details]
Bug 1286865 - Step 3: Expose rejected syscall log in about:support.

https://reviewboard.mozilla.org/r/109000/#review110622

::: toolkit/content/aboutSupport.xhtml:550
(Diff revision 1)
>        <table>
>  	<tbody id="sandbox-tbody">
>  	</tbody>
>        </table>
> +
> +#if defined(XP_LINUX)

I guess you could add MOZ_SANDBOX here.
Attachment #8832766 - Flags: review?(gpascutto) → review+
Comment on attachment 8832767 [details]
Bug 1286865 - Step 4: Report rejected syscall info in Telemetry.

https://reviewboard.mozilla.org/r/109002/#review111106
Attachment #8832767 - Flags: review?(gpascutto) → review+
Comment on attachment 8832766 [details]
Bug 1286865 - Step 3: Expose rejected syscall log in about:support.

https://reviewboard.mozilla.org/r/109000/#review111264
Comment on attachment 8832767 [details]
Bug 1286865 - Step 4: Report rejected syscall info in Telemetry.

https://reviewboard.mozilla.org/r/109002/#review111274
Comment on attachment 8832764 [details]
Bug 1286865 - Step 1: Gather syscall info from SIGSYS handlers into the parent process.

https://reviewboard.mozilla.org/r/108996/#review111292

::: security/sandbox/linux/Sandbox.cpp:142
(Diff revision 1)
>    gChromiumSigSysHandler(nr, info, ctx);
>    if (!ContextIsError(ctx, ENOSYS)) {
>      return;
>    }
>  
> -  pid_t pid = getpid();
> +  SandboxReport report = gSandboxReporterClient->MakeReportAndSend(&savedCtx);

There's already a debug assertion in `Maybe`'s `operator->`.  (And a release assertion at the top of `SetCurrentProcessSandbox`.)

::: security/sandbox/linux/SandboxFilter.cpp:172
(Diff revision 1)
>        // Timekeeping
>      case __NR_clock_gettime: {
>        Arg<clockid_t> clk_id(0);
>        return If(clk_id == CLOCK_MONOTONIC, Allow())
>  #ifdef CLOCK_MONOTONIC_COARSE
> +        // Used by SandboxReporter.

Good point; fixed.

::: security/sandbox/linux/SandboxReporterClient.h:27
(Diff revision 1)
> +  explicit SandboxReporterClient(SandboxReport::ProcType aProcType,
> +                                 int aFd = kSandboxReporterFileDesc);
> +
> +  // Constructs a report from a signal context (the ucontext_t* passed
> +  // as void* to an sa_sigaction handler); uses the caller's pid and tid.
> +  SandboxReport MakeReport(const void* aContext);

Originally I wasn't sure if I wanted to use `MakeReportAndSend` at all, or print to stderr before sending, or what.  But it still might make sense to leave this public for the unit tests I might eventually write.  And I don't think there's any real cost to leaving it public.

::: security/sandbox/linux/SandboxReporterClient.h:29
(Diff revision 1)
> +
> +  // Constructs a report from a signal context (the ucontext_t* passed
> +  // as void* to an sa_sigaction handler); uses the caller's pid and tid.
> +  SandboxReport MakeReport(const void* aContext);
> +
> +  void SendReport(const SandboxReport& aReport);

As above; might be useful at least to be able to test this separately, and I don't think anything is gained by making it private.

::: security/sandbox/linux/SandboxReporterClient.cpp:34
(Diff revision 1)
> +					     int aFd)
> +  : mProcType(aProcType)
> +  , mFd(aFd)
> +{
> +  // TODO, maybe: assert that the fd is a socket of the expected type.
> +  // (Or, see if the crash reporter does anything like that & follow its lead.)

Thanks for catching that; I checked for FIXME because I thought I'd used that consistently, but not quite.

I checked, and the crash reporter doesn't do any kind of verification, so I'm going to follow its lead and change this to a comment explaining it.

::: security/sandbox/linux/SandboxReporterClient.cpp:49
(Diff revision 1)
> +  // sending uninitialized alignment padding to another process.
> +  PodZero(&report);
> +
> +  clock_gettime(CLOCK_MONOTONIC_COARSE, &report.mTime);
> +  report.mPid = getpid();
> +  MOZ_RELEASE_ASSERT(report.mPid > 0);

It seemed like a good idea to make sure the result would always be “valid”, but now that you mention it, we can probably trust that we didn't break `getpid`.

::: security/sandbox/linux/SandboxReporterClient.cpp:72
(Diff revision 1)
> +  // The "common" seccomp-bpf policy allows sendmsg but not send(to),
> +  // so just use sendmsg even though send would suffice for this.
> +  struct iovec iov;
> +  struct msghdr msg;
> +
> +  iov.iov_base = const_cast<void*>(static_cast<const void*>(&aReport));

No, `const_cast` doesn't allow implicit conversion along with the `const` removal.

::: security/sandbox/linux/reporter/SandboxReporter.h:46
(Diff revision 1)
> +  void GetClientFileDescriptorMapping(int* aSrcFd, int* aDstFd) const;
> +
> +  // A snapshot of the report ring buffer; element 0 of `mReports` is
> +  // the `mOffset`th report to be received, and so on.
> +  struct Snapshot {
> +    uint64_t mOffset;

The buffer has to fit in memory, but the total number ever received could potentially overflow a `uint32_t` even on 32-bit.  I'll add a comment.

::: security/sandbox/linux/reporter/SandboxReporterCommon.h:20
(Diff revision 1)
> +// Note: this is also used in libmozsandbox, so dependencies on
> +// symbols from libxul probably won't work.
> +
> +namespace mozilla {
> +static const size_t kSandboxSyscallArguments = 6;
> +static const int kSandboxReporterFileDesc = 5;

fds 0-2: stdio; fd 3: IPC; fd 4: crash reporter.  (And IPC will check that we don't accidentally use the same descriptor twice.)  I'll add a comment.
Comment on attachment 8832767 [details]
Bug 1286865 - Step 4: Report rejected syscall info in Telemetry.

https://reviewboard.mozilla.org/r/109002/#review111336
How should I get data-review on this?

Also, https://reviewboard.mozilla.org/r/109002/#review111274 has a comment about the telemetry data that I thought would be posted as part of comment #20, but ReviewBoard is confusing.
…and this needinfo was supposed to go with the last comment.
Flags: needinfo?(francois)
(In reply to Jed Davis [:jld] {⏰UTC-7} from comment #28)
> How should I get data-review on this?

I will discuss the permanent opt-out with Benjamin.

> Also, https://reviewboard.mozilla.org/r/109002/#review111274 has a comment
> about the telemetry data that I thought would be posted as part of comment
> #20, but ReviewBoard is confusing.

I'm not sure what you're referring to, but comment 10 also had two questions about the description, which haven't been addressed yet.
Flags: needinfo?(francois)
(In reply to Jed Davis [:jld] {⏰UTC-7} from comment #21)
> ::: security/sandbox/linux/reporter/SandboxReporter.h:46
> (Diff revision 1)
> > +  void GetClientFileDescriptorMapping(int* aSrcFd, int* aDstFd) const;
> > +
> > +  // A snapshot of the report ring buffer; element 0 of `mReports` is
> > +  // the `mOffset`th report to be received, and so on.
> > +  struct Snapshot {
> > +    uint64_t mOffset;
> 
> The buffer has to fit in memory, but the total number ever received could
> potentially overflow a `uint32_t` even on 32-bit.  I'll add a comment.

It's actually a count (mCount), not an actual offset, right? (It becomes an offset after the modulo).
(In reply to Gian-Carlo Pascutto [:gcp] from comment #31)
> (In reply to Jed Davis [:jld] {⏰UTC-7} from comment #21)
> > ::: security/sandbox/linux/reporter/SandboxReporter.h:46
> > (Diff revision 1)
> > > +  void GetClientFileDescriptorMapping(int* aSrcFd, int* aDstFd) const;
> > > +
> > > +  // A snapshot of the report ring buffer; element 0 of `mReports` is
> > > +  // the `mOffset`th report to be received, and so on.
> > > +  struct Snapshot {
> > > +    uint64_t mOffset;
> > 
> > The buffer has to fit in memory, but the total number ever received could
> > potentially overflow a `uint32_t` even on 32-bit.  I'll add a comment.
> 
> It's actually a count (mCount), not an actual offset, right? (It becomes an
> offset after the modulo).

It's the offset into the sequence of reports where the buffer starts.  Or, put another way, the offset to add to an array index to get the sequence index.
(In reply to François Marier [:francois] from comment #10)
> So, would an expiring opt-out probe work for you or would you like me to
> consult with the other data stewards to discuss it?

Answering this question a little more thoroughly: an expiring probe could make sense if there were some bounded set of bugs we were trying to discover, such that the histogram staying empty for a while would mean that our work was done.

But, one of the failure modes here is libraries we depend on releasing new versions that use syscalls that we're blocking, and the kernel will keep adding new syscalls (and new sub-syscalls for multiplexers like prctl) for them to use, so the influx of bugs will never end.

> ::: toolkit/components/telemetry/Histograms.json:10410
> (Diff revision 1)
> > +    "expires_in_version": "never",
> > +    "releaseChannelCollection": "opt-out",
> > +    "kind": "count",
> > +    "keyed": true,
> > +    "cpp_guard": "XP_LINUX",
> > +    "description": "System calls rejected by a seccomp-bpf sandbox policy"
> 
> Two questions so we can make the description a little bit clearer:
> 
> By "system call", do you mean an ID / integer of some kind, or the name /
> string of a system call?

[copying and pasting this out of ReviewBoard, so it's someplace people can actually find it]

In the general case it's the (architecture, integer) pair, because the syscall numbers vary by architecture.  It would be possible to generate and ship a table of system call names to map the numbers back into something more human-appropriate (and not generate separate keys for 32-bit and 64-bit x86), but I'm not sure it's worth the effort.  For syscalls that are handled specially in order to include some data from arguments, the name is used.

Is there are good place to document all of this?  The "description" key in the JSON seems to want something more concise than I can make this if I want to explain exactly what's transmitted, and comments in the source code (like the one I added) also aren't ideal.

> How often is it triggered? Everytime a user does something which triggers a
> system call which is blocked by the sandbox, we get a telemetry ping for
> that syscall?

Yes, every time a system call is blocked by the sandbox.  Currently this is limited to syscalls where we'd crash on Nightly: in some cases (e.g., fork()) we're returning an error instead, because we can't fix the code to stop trying them, and those aren't included.
(In reply to Jed Davis [:jld] {⏰UTC-7} from comment #33)
> Answering this question a little more thoroughly: an expiring probe could
> make sense if there were some bounded set of bugs we were trying to
> discover, such that the histogram staying empty for a while would mean that
> our work was done.
> 
> But, one of the failure modes here is libraries we depend on releasing new
> versions that use syscalls that we're blocking, and the kernel will keep
> adding new syscalls (and new sub-syscalls for multiplexers like prctl) for
> them to use, so the influx of bugs will never end.

Ok, thanks for expanding on this. This sounds like a strong case for a non-expiring probe indeed.

> > By "system call", do you mean an ID / integer of some kind, or the name /
> > string of a system call?
> 
> In the general case it's the (architecture, integer) pair, because the
> syscall numbers vary by architecture.  It would be possible to generate and
> ship a table of system call names to map the numbers back into something
> more human-appropriate (and not generate separate keys for 32-bit and 64-bit
> x86), but I'm not sure it's worth the effort.  For syscalls that are handled
> specially in order to include some data from arguments, the name is used.
> 
> Is there are good place to document all of this?  The "description" key in
> the JSON seems to want something more concise than I can make this if I want
> to explain exactly what's transmitted, and comments in the source code (like
> the one I added) also aren't ideal.

The description key doesn't have to be concise. In fact, it needs to contain enough information to make it clear what is actually collected and in what format. You can also refer to an in-tree file for more information if it's relevant (e.g. SECURITY_UI: http://searchfox.org/mozilla-central/rev/848c29538ab007fb95dc6cff194f0e3d3809613d/toolkit/components/telemetry/Histograms.json#6377)

> > How often is it triggered? Everytime a user does something which triggers a
> > system call which is blocked by the sandbox, we get a telemetry ping for
> > that syscall?
> 
> Yes, every time a system call is blocked by the sandbox.  Currently this is
> limited to syscalls where we'd crash on Nightly: in some cases (e.g.,
> fork()) we're returning an error instead, because we can't fix the code to
> stop trying them, and those aren't included.

Putting all of these details together, how about this as a description?

    Record an (architecture, syscall ID) pair every time a system call is blocked by a seccomp-bpf sandbox policy. Limited to syscalls where we would crash on Nightly.
(In reply to François Marier [:francois] from comment #34)
> Putting all of these details together, how about this as a description?
> 
>     Record an (architecture, syscall ID) pair every time a system call is
> blocked by a seccomp-bpf sandbox policy. Limited to syscalls where we would
> crash on Nightly.

The problem there is that in some cases this *is* collecting more information than just the syscall ID — just, not much information, and nothing that would be personally identifying.

So I wrote a longer description to try to say that (and I see that Histograms.json does have a handful of longer descriptions already, even if most of them seem to be fairly short), and added a pointer to the source with the comments, and added a comment specifically about privacy because I noticed I hadn't said anything about it directly, there.  Once this build is done I'll re-push the review.
I have discussed this with Benjamin and the opt-out and non-expiring nature of the probe isn't a problem. The only remaining thing is to check whether or not the data we would get via the key is covered by our privacy policy.

(In reply to Jed Davis [:jld] {⏰UTC-7} from comment #35)
> The problem there is that in some cases this *is* collecting more
> information than just the syscall ID — just, not much information, and
> nothing that would be personally identifying.

I tried to figure out what kind of information we could receive from the patch but it wasn't that obvious to me. Can you give me examples of the kind of information we'd get as part of the key?
Flags: needinfo?(jld)
It can potentially include some arguments (registers) supplied to the syscall, if those are relevant to us (i.e. we filter certain flags or settings for a given syscall only) and this is specifically enabled. This is included in the Telemetry description in the patch, which also links to a more detailed explanation in the source:

    // Syscalls that are filtered by arguments in one or more of the
    // policies in SandboxFilter.cpp should generally have those
    // arguments included here, but don't include irrelevant
    // information that would cause large numbers of distinct keys for
    // the same issue -- for example, pids or pointers.  When in
    // doubt, include arguments only if they would typically be
    // constants (or asm immediates) in the code making the syscall.
    //
    // Also, keep in mind that this is opt-out data collection and
    // privacy is critical.  While it's unlikely that information in
    // the register values alone could personally identify a user
    // (see also crash reports, where register contents are public),
    // and the guidelines in the previous paragraph should rule out
    // any value that's capable of holding PII, please be careful.

Example is also in the screenshot: https://bug1286865.bmoattachments.org/attachment.cgi?id=8832775

Specific examples:

clone (flags?)
prctl (options?)
socketcall (multiplexed syscall with actual operation in a register - we need to know the actual operation)
Flags: needinfo?(jld)
Comment on attachment 8832767 [details]
Bug 1286865 - Step 4: Report rejected syscall info in Telemetry.

https://reviewboard.mozilla.org/r/109002/#review115114

(In reply to Gian-Carlo Pascutto [:gcp] from comment #43)
> It can potentially include some arguments (registers) supplied to the
> syscall, if those are relevant to us (i.e. we filter certain flags or
> settings for a given syscall only) and this is specifically enabled. This is
> included in the Telemetry description in the patch, which also links to a
> more detailed explanation in the source:

Thanks for the extra details gcp. I'm satisfied that at the moment this doesn't leak user data.

To ensure that any additions are reviewed carefully, I've suggested an extra warning in the code.

::: security/sandbox/linux/reporter/SandboxReporter.cpp:163
(Diff revision 4)
> +    // privacy is critical.  While it's unlikely that information in
> +    // the register values alone could personally identify a user
> +    // (see also crash reports, where register contents are public),
> +    // and the guidelines in the previous paragraph should rule out
> +    // any value that's capable of holding PII, please be careful.
> +

Let's add a new paragraph here:

"When making changes here, please consult with a data steward (https://wiki.mozilla.org/Firefox/Data_Collection) and ask for a review if you are unsure about anything."
Attachment #8832767 - Flags: review?(francois) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/7781de08a1c6d84a92e9d54a78ac9f54f8c4c240
Bug 1286865 - Step 0: Turn off crash-on-seccomp-fail by default on non-nightly. r=gcp

https://hg.mozilla.org/integration/mozilla-inbound/rev/f73368ed36cf12bf18f7d66f370d5cd6b8a5e8db
Bug 1286865 - Step 1: Gather syscall info from SIGSYS handlers into the parent process. r=gcp

https://hg.mozilla.org/integration/mozilla-inbound/rev/30d3b74aaa013e89e7189d9544f42357b8b2ab44
Bug 1286865 - Step 2: Add XPCOM bindings for sandbox syscall reporter. r=gcp r=glandium

https://hg.mozilla.org/integration/mozilla-inbound/rev/75ce9ac0623fc345ba4c0c77e79bb430947ed8d6
Bug 1286865 - Step 3: Expose rejected syscall log in about:support. r=adw r=gcp

https://hg.mozilla.org/integration/mozilla-inbound/rev/edfe5b5d3b8b6288b7908e22a3abf1b04d00eff2
Bug 1286865 - Step 4: Report rejected syscall info in Telemetry. r=gcp r=francois
See Also: → 1381653
You need to log in before you can comment on or make changes to this bug.