In bug 1322554, we silence a crash by blocking threads whose start addresses are not authorized. While this may prevent crashes, it's probably a good idea to annotate when this happens in a ping.
Putting at P3 due to lack of resources, per aklotz.
Priority: -- → P3
Assigning to Carl. Please prioritize bug 1435827 above this one whenever the former is unblocked.
Assignee: nobody → ccorcoran
Priority: P3 → P1
Summary: Add telemetry about BaseThreadInitThunk gatekeeping blocks a thread → Add telemetry for when BaseThreadInitThunk gatekeeping blocks a thread
Whiteboard: inj? → inj+
I'm looking for advice on the best way to submit this data via telemetry. We're (tentatively) looking to record the following data every time we block a potentially-crashing / malicious thread:

> - Reason why blocked
> - Is it a known blocked entry point (bug 1435816)
> - Process uptime
> - Thread entry point address
> - Other data about that address, for example whether it is near a module, virtual memory flags, etc.

sunahsuh, can you help advise which telemetry delivery method would be most appropriate for this? It's unclear to me whether we should ride the main ping, make a custom ping, use telemetry Events, or something I haven't considered yet.
At first glance, telemetry events seem like the easiest path forward, but there are a few limitations that might keep us from using them: the format allows up to 10 key-value extra metadata fields, with keys limited to 15 bytes and values limited to 80 bytes (full limit documentation here: https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/collection/events.html#limits).

If the data you'd like to collect fits within those parameters, events are sent within an hour of collection and become queryable within a day of receipt (and hopefully soon, within minutes of receipt). The event ping also has the full environment attached, which would be very useful for this data. I assume this is a relatively rare event, given my reading of the original bug? (By which I mean, happens once per 1000s or 10,000s of browser sessions, if even that.)

If the limitations won't allow use of Events, I suspect a custom ping would be the next best option, since we can make a direct-to-parquet dataset that will be queryable in STMO within an hour or so of receipt merely by adding a schema to the pipeline (and a corresponding parquet schema). The documentation for doing this on the pipeline side is here: https://docs.telemetry.mozilla.org/cookbooks/new_ping.html
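To make the Events limits quoted above concrete, here is a minimal sketch that checks whether one blocked-thread record would fit in an event's extra fields. The three limits come from the comment; the function name, the field names, and the sample values are all hypothetical, and this is not the Telemetry API itself.

```python
# Limits as quoted in the comment: at most 10 extra fields,
# keys up to 15 bytes, values up to 80 bytes.
MAX_EXTRA_FIELDS = 10
MAX_KEY_BYTES = 15
MAX_VALUE_BYTES = 80


def event_limit_violations(extra: dict[str, str]) -> list[str]:
    """Return one message per limit that the extra fields exceed."""
    problems = []
    if len(extra) > MAX_EXTRA_FIELDS:
        problems.append(f"{len(extra)} fields exceeds {MAX_EXTRA_FIELDS}")
    for key, value in extra.items():
        # Limits are in bytes, so measure the UTF-8 encoding, not characters.
        if len(key.encode("utf-8")) > MAX_KEY_BYTES:
            problems.append(f"key {key!r} exceeds {MAX_KEY_BYTES} bytes")
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            problems.append(f"value for {key!r} exceeds {MAX_VALUE_BYTES} bytes")
    return problems


# Hypothetical field names and values for one blocked-thread record.
sample_extra = {
    "reason": "bad_memory",
    "known_entry": "false",
    "uptime_ms": "48213",
    "entry_addr": "0x00007ff6a1b20000",
    "mem_flags": "MEM_COMMIT|PAGE_READWRITE",
}
```

Under these assumed field names the proposed data fits comfortably; the "other data about that address" item is the one most likely to need trimming if it grows.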
(In reply to Sunah Suh (she/her) [:sunahsuh] from comment #4)
> I assume this is a relatively rare
> event, given my reading of the original bug? (By which I mean, happens once
> per 1000s or 10,000s of browser sessions, if even that.)

Was this cleared up somewhere? What's the expected volume?
(In reply to Georg Fritzsche [:gfritzsche] from comment #5)
> What's the expected volume?

The expected volume is extremely low. I don't have any numbers, but we currently only block threads when:
- A thread is started with an entry point located in bad memory
- Or when a thread is started with an entry point in a LoadLibrary variant. This is an indication of a 3rd party app trying to inject a DLL.
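The two blocking conditions above can be sketched as a small classifier. This is only an illustration, not the actual gatekeeping code: "bad memory" is modeled here as "outside a caller-supplied list of executable address ranges", which is an assumption, and the names `BlockReason`, `block_reason`, `executable_ranges`, and `load_library_addrs` are hypothetical.

```python
from enum import Enum
from typing import Optional


class BlockReason(Enum):
    BAD_MEMORY = "bad_memory"      # entry point located in bad memory
    LOAD_LIBRARY = "load_library"  # entry point is a LoadLibrary variant


def block_reason(entry_point: int,
                 executable_ranges: list[tuple[int, int]],
                 load_library_addrs: set[int]) -> Optional[BlockReason]:
    """Return why a thread start would be blocked, or None to allow it."""
    # A thread that starts directly in a LoadLibrary variant suggests
    # a third-party app trying to inject a DLL.
    if entry_point in load_library_addrs:
        return BlockReason.LOAD_LIBRARY
    # Assumption: "bad memory" means the address is not inside any
    # known executable range (half-open intervals [start, end)).
    if not any(start <= entry_point < end for start, end in executable_ranges):
        return BlockReason.BAD_MEMORY
    return None
```

Returning the reason as an enum value keeps the set of outcomes small and fixed, which suits either an event field or a per-reason counter.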
I think preliminary telemetry might be useful here to measure the size of the problem. A categorical histogram [1] (example: [2]) of counts by reason would do the trick, and be rather quick to implement and easy to uplift.

[1]: https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/collection/histograms.html#categorical
[2]: https://mzl.la/2ApXRHx
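As a rough illustration of what "counts by reason" means, the sketch below models a categorical histogram as a fixed set of labels with one counter each. It is a toy stand-in, not the Telemetry histogram implementation, and the class name and the labels `bad_memory` and `load_library` are hypothetical.

```python
class CategoricalHistogram:
    """Toy categorical histogram: one counter per predeclared label."""

    def __init__(self, labels):
        # The label set is fixed up front, as in a declared histogram.
        self._counts = {label: 0 for label in labels}

    def accumulate(self, label):
        """Add one sample to the bucket for the given label."""
        if label not in self._counts:
            raise ValueError(f"unknown label: {label!r}")
        self._counts[label] += 1

    def snapshot(self):
        """Return a copy of the current counts."""
        return dict(self._counts)


# Hypothetical labels, one per blocking reason.
blocked_threads = CategoricalHistogram(["bad_memory", "load_library"])
blocked_threads.accumulate("load_library")
blocked_threads.accumulate("load_library")
blocked_threads.accumulate("bad_memory")
```

This captures only how often each reason fires; per-thread details such as the entry point address would still need events or a custom ping.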
Attaching a WIP which performs the event data gathering and dispatching. What remains is to send the event data via telemetry events, once bug 1313327 has landed.