Closed Bug 1542830 Opened 8 months ago Closed 2 months ago

Move collection of Untrusted Modules telemetry into the launcher process

Categories

(External Software Affecting Firefox :: Telemetry, enhancement, P1)

Unspecified
Windows
enhancement

Tracking

(firefox68 wontfix, firefox71 fixed)

RESOLVED FIXED
Tracking Status
firefox68 --- wontfix
firefox71 --- fixed

People

(Reporter: aklotz, Assigned: aklotz)

References

(Depends on 1 open bug, Blocks 4 open bugs)

Details

(Whiteboard: inj+)

Attachments

(9 files, 1 obsolete file)

47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
7.49 KB, text/plain
chutten
: data-review+
Details

Currently the untrusted modules ping is hooked in via the legacy blocklist code in mozglue. This means that the ping currently misses any DLL loads that take place between process creation and legacy blocklist initialization.

We should move the gathering of untrusted modules telemetry into hooks set by launcher process, so that we can collect data during process startup.

  • Currently the launcher process hooks NtMapViewOfSection for DLL blocking, as this allows us to give the loader a chance to resolve the effective path of the binary being loaded. This gives us a perf optimization over the legacy implementation, where we were effectively doing 2 path searches for each DLL load (our own, plus the loader's). We want to leave this as-is.

  • OTOH, we need to use LdrLoadDll as the hook if we want to be able to take any timing measurements for untrusted modules telemetry, so we should add this as a second hook that is set via the launcher process.

A couple of things to keep in mind:

  • We need a way for Gecko to retrieve the untrusted module info from the new hook without opening up any avenues for third-party tampering;
  • I eventually want to remove the legacy blocklist code, so at the very least we'll want to rough-in the capability to do all of the profiler stuff that the existing LdrLoadDll hook can do, such that the profiler can register itself somehow during startup and allow us to suspend stackwalking, record markers, and so forth.

One other question that I've been thinking about is wondering how much of this needs to be bare-metal C++ with NTDLL, and how much of this can be more complicated. eg should we add tracking code in our loader hooks to determine when kernel32 and the CRT are loaded, such that we can then delegate to higher-level code once we have the capability? Or do we just create a profiler interface that can be registered from elsewhere and, once registered (presumably late enough in the process lifetime that we have a full runtime), we just fire off events into the ether?

Anyway, lots to think about here, and I'll leave it in your capable hands!

I guess this should be P1 since I assigned it. Or something.

Priority: P3 → P1
Depends on: 1435827
Attached patch WIP Patch (obsolete) — Splinter Review

This patch takes the approach of moving the existing blocklist code into the launcher process, rather than trying to start over developing a new blocklist right away. That can be done later over time.

This (unfinished, work-in-progress) patch primarily does these things:

  1. Physically move the DLL blocklist code out of mozglue and into a new subdirectory of browser so that it gets built as part of firefox.exe and not mozglue.dll, and thus is usable by the launcher process. This also means moving the blocklist tests and changing namespaces to not include the word "glue" anymore.
  2. Patch the blocklist and telemetry code (and a few of their dependencies) to be able to work before any DLLs are loaded. This mostly means things like swapping out kernel32 calls for ntdll calls, especially replacing explicit or implicit dynamic allocations with calls to the Rtl* heap functions, manually implementing things like string functions that we couldn't get elsewhere, and also removing a few dynamic allocations that just weren't necessary.
  3. Have the launcher process install the LdrLoadDll detour into the browser process.

This leaves a lot of stuff still to do. With this patch I think we should be getting the desired telemetry on static/early DLL loads, but I haven't done anything to verify that it actually goes anywhere useful. The launcher process doesn't attempt to do any equivalent of DllBlocklist_Initialize(), and I don't know what effects that's causing but it's probably not anything good. The launcher's NtMapViewOfSection-based blocklist is left entirely untouched, so this patch leaves both running. There's also lots of general cleanup and formatting to do in the new code.

Blocks: 1238735

Per discussion with jimm:
Given Matt's current workload, since this bug is on the critical path and other work is now blocked on it, I'm going to take this and push it across the finish line.

Assignee: mhowell → aklotz
Depends on: 1568634
Depends on: 1573264
Depends on: 1573273
Depends on: 1573274
Depends on: 1573275
Attachment #9075832 - Attachment is obsolete: true
Depends on: 1577061

Fixing bug 1577061 solves the X2 failures on Win64, but I'm still consistently able to reproduce timeouts on 32-bit Win7. This is going to require more investigation, I'm afraid. :-(

Blocks: 1577217
Depends on: 1577594
Blocks: 1522830
Depends on: 1578786
Component: General → Telemetry
Product: Firefox → External Software Affecting Firefox
Version: Trunk → unspecified
Attachment #9087569 - Attachment description: Bug 1542830: Part 1 - NativeNt → Bug 1542830: Part 1 - Updates to NativeNt.h; r=mhowell
Attachment #9087570 - Attachment description: Bug 1542830: Part 2 - winlauncher and winlauncher/freestanding → Bug 1542830: Part 2 - Modify launcher process blocklist to collect information about untrusted module loads; r=mhowell
Attachment #9087571 - Attachment description: Bug 1542830: Part 3 - Add ntdll_freestanding.lib to freestanding → Bug 1542830: Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell, r=#firefox-build-system-reviewers
Attachment #9087572 - Attachment description: Bug 1542830: Part 4 - mozglue → Bug 1542830: Part 4 - Modify mozglue to use new untrusted modules interfaces; r=mhowell
Attachment #9087573 - Attachment description: Bug 1542830: Part 5 - WinHeaderOnlyUtils → Bug 1542830: Part 5 - Make ModuleVersion copyable and movable; r=mhowell
Attachment #9087574 - Attachment description: Bug 1542830: Part 6 - xre → Bug 1542830: Part 6 - Rewrite the untrusted modules processor in toolkit/xre; r=mhowell
Attachment #9087575 - Attachment description: Bug 1542830: Part 7 - ProcessedStack and CombinedStacks → Bug 1542830: Part 7 - Support MFBT Vector in ProcessedStack and add Swap operation to CombinedStacks; r=janerik
Attachment #9087576 - Attachment description: Bug 1542830: Part 8 - Ping changes → Bug 1542830: Part 8 - Update the untrusted modules ping format to support the new Untrusted Modules 2.0 interface and data format; r=janerik
Attachment #9092202 - Flags: data-review?(chutten)
Attachment #9087569 - Attachment description: Bug 1542830: Part 1 - Updates to NativeNt.h; r=mhowell → Bug 1542830: Part 1 - Updates to NativeNt.h; r=mhowell!
Attachment #9087570 - Attachment description: Bug 1542830: Part 2 - Modify launcher process blocklist to collect information about untrusted module loads; r=mhowell → Bug 1542830: Part 2 - Modify launcher process blocklist to collect information about untrusted module loads; r=mhowell!
Attachment #9087571 - Attachment description: Bug 1542830: Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell, r=#firefox-build-system-reviewers → Bug 1542830: Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell!,#firefox-build-system-reviewers!
Attachment #9087572 - Attachment description: Bug 1542830: Part 4 - Modify mozglue to use new untrusted modules interfaces; r=mhowell → Bug 1542830: Part 4 - Modify mozglue to use new untrusted modules interfaces; r=mhowell!
Attachment #9087573 - Attachment description: Bug 1542830: Part 5 - Make ModuleVersion copyable and movable; r=mhowell → Bug 1542830: Part 5 - Make ModuleVersion copyable and movable; r=mhowell!
Attachment #9087574 - Attachment description: Bug 1542830: Part 6 - Rewrite the untrusted modules processor in toolkit/xre; r=mhowell → Bug 1542830: Part 6 - Rewrite the untrusted modules processor in toolkit/xre; r=mhowell!
Attachment #9087575 - Attachment description: Bug 1542830: Part 7 - Support MFBT Vector in ProcessedStack and add Swap operation to CombinedStacks; r=janerik → Bug 1542830: Part 7 - Support MFBT Vector in ProcessedStack and add Swap operation to CombinedStacks; r=janerik!
Attachment #9087576 - Attachment description: Bug 1542830: Part 8 - Update the untrusted modules ping format to support the new Untrusted Modules 2.0 interface and data format; r=janerik → Bug 1542830: Part 8 - Update the untrusted modules ping format to support the new Untrusted Modules 2.0 interface and data format; r=janerik!
Comment on attachment 9092202 [details]
untrusted_modules_revised_data_review.txt

DATA COLLECTION REVIEW RESPONSE:

    Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes. This collection is documented in [its in-tree documentation](https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/data/index.html).

    Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

    If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, jimm is responsible.

    Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical.

    Is the data collection request for default-on or default-off?

Default on for all channels.

    Does the instrumentation include the addition of any new identifiers?

No.

    Is the data collection covered by the existing Firefox privacy notice?

Yes.

    Does there need to be a check-in in the future to determine whether to renew the data?

No. This collection is permanent.

---
Result: datareview+
Attachment #9092202 - Flags: data-review?(chutten) → data-review+
Attachment #9087576 - Attachment description: Bug 1542830: Part 8 - Update the untrusted modules ping format to support the new Untrusted Modules 2.0 interface and data format; r=janerik! → Bug 1542830: Part 8 - Rename the "untrustedModules" ping to "third-party-modules" and change the schema to support multiprocess; r=janerik!
Attachment #9087571 - Attachment description: Bug 1542830: Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell!,#firefox-build-system-reviewers! → Bug 1542830: Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell!,froydnj!
Pushed by aklotz@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9f5dbd1b2959
Part 1 - Updates to NativeNt.h; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/d156dc595b69
Part 2 - Modify launcher process blocklist to collect information about untrusted module loads; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/c25b7bf354d9
Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell,froydnj
https://hg.mozilla.org/integration/autoland/rev/70ffff30551a
Part 4 - Modify mozglue to use new untrusted modules interfaces; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/363039c98534
Part 5 - Make ModuleVersion copyable and movable; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/9e90ee4981c6
Part 6 - Rewrite the untrusted modules processor in toolkit/xre; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/7ee12138946d
Part 7 - Support MFBT Vector in ProcessedStack and add Swap operation to CombinedStacks; r=janerik
https://hg.mozilla.org/integration/autoland/rev/4f72161be496
Part 8 - Rename the "untrustedModules" ping to "third-party-modules" and change the schema to support multiprocess; r=janerik
Flags: needinfo?(aklotz)
Pushed by aklotz@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e7bef6486a38
Part 1 - Updates to NativeNt.h; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/df658ffb8599
Part 2 - Modify launcher process blocklist to collect information about untrusted module loads; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/268530552281
Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell,froydnj
https://hg.mozilla.org/integration/autoland/rev/7a3cbd2f59f0
Part 4 - Modify mozglue to use new untrusted modules interfaces; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/39a7c05d54ef
Part 5 - Make ModuleVersion copyable and movable; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/0a9169ab2623
Part 6 - Rewrite the untrusted modules processor in toolkit/xre; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/b85f58fd5bbd
Part 7 - Support MFBT Vector in ProcessedStack and add Swap operation to CombinedStacks; r=janerik
https://hg.mozilla.org/integration/autoland/rev/b9f7fc8d0172
Part 8 - Rename the "untrustedModules" ping to "third-party-modules" and change the schema to support multiprocess; r=janerik
Flags: needinfo?(aklotz)
Pushed by aklotz@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/da7e775c4051
Part 1 - Updates to NativeNt.h; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/919f1af7c135
Part 2 - Modify launcher process blocklist to collect information about untrusted module loads; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/c4e547a6a039
Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell,froydnj
https://hg.mozilla.org/integration/autoland/rev/8e2da9ff5f5a
Part 4 - Modify mozglue to use new untrusted modules interfaces; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/84b903e60dc9
Part 5 - Make ModuleVersion copyable and movable; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/73ec288886cd
Part 6 - Rewrite the untrusted modules processor in toolkit/xre; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/1aa253e6604a
Part 7 - Support MFBT Vector in ProcessedStack and add Swap operation to CombinedStacks; r=janerik
https://hg.mozilla.org/integration/autoland/rev/6fcb417f7ff4
Part 8 - Rename the "untrustedModules" ping to "third-party-modules" and change the schema to support multiprocess; r=janerik
Regressions: 1582937
Regressions: 1582939

Maybe you want to merge bug 1582937 into here.

For the record,
this
(In reply to Pulsebot from comment #39)

Pushed by aklotz@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/da7e775c4051
Part 1 - Updates to NativeNt.h; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/919f1af7c135
Part 2 - Modify launcher process blocklist to collect information about
untrusted module loads; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/c4e547a6a039
Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell,froydnj
https://hg.mozilla.org/integration/autoland/rev/8e2da9ff5f5a
Part 4 - Modify mozglue to use new untrusted modules interfaces; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/84b903e60dc9
Part 5 - Make ModuleVersion copyable and movable; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/73ec288886cd
Part 6 - Rewrite the untrusted modules processor in toolkit/xre; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/1aa253e6604a
Part 7 - Support MFBT Vector in ProcessedStack and add Swap operation to
CombinedStacks; r=janerik
https://hg.mozilla.org/integration/autoland/rev/6fcb417f7ff4
Part 8 - Rename the "untrustedModules" ping to "third-party-modules" and
change the schema to support multiprocess; r=janerik

cause this regression:
== Change summary for alert #23177 (as of Sat, 21 Sep 2019 13:31:43 GMT) ==

Regressions:

46% tp5n nonmain_normal_fileio windows10-64-shippable opt e10s stylo 352,415,047.58 -> 513,380,308.75
38% tp5n nonmain_normal_fileio windows10-64-shippable-qr opt e10s stylo 376,501,507.08 -> 520,844,282.08
13% tp5n nonmain_normal_fileio windows7-32-shippable opt e10s stylo 376,185,697.08 -> 425,389,093.33

Improvements:

26% tp5n main_startup_fileio windows10-64-shippable-qr opt e10s stylo 1,000,273.67 -> 744,551.00

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=23177

And this:
(In reply to Cristian Brindusan [:cbrindusan] from comment #41)

Backed out 8 changesets (Bug 1542830) for causing Nightly bustages.

Backout: https://hg.mozilla.org/mozilla-central/rev/2b745a11bb29bd2e4b56f13ab7e9ea211ffed3fb

caused this:
== Change summary for alert #23189 (as of Mon, 23 Sep 2019 04:57:21 GMT) ==

Regressions:

36% tp5n main_startup_fileio windows10-64-shippable-qr opt e10s stylo 743,832.23 -> 1,013,649.67

Improvements:

30% tp5n nonmain_normal_fileio windows10-64-shippable-qr opt e10s stylo 521,883,205.67 -> 365,219,723.50
28% tp5n nonmain_normal_fileio windows10-64-shippable opt e10s stylo 513,743,420.08 -> 367,859,744.75
12% tp5n nonmain_normal_fileio windows7-32-shippable opt e10s stylo 422,292,007.17 -> 369,942,939.92

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=23189

Flags: needinfo?(aklotz)
Pushed by aklotz@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e972b0955a82
Part 1 - Updates to NativeNt.h; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/dd2cd8192026
Part 2 - Modify launcher process blocklist to collect information about untrusted module loads; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/69a6bb189968
Part 3 - Add ntdll_freestanding.lib to freestanding; r=mhowell,froydnj
https://hg.mozilla.org/integration/autoland/rev/717047a824a8
Part 4 - Modify mozglue to use new untrusted modules interfaces; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/d9e10c6117c1
Part 5 - Make ModuleVersion copyable and movable; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/19e2cf05ef88
Part 6 - Rewrite the untrusted modules processor in toolkit/xre; r=mhowell
https://hg.mozilla.org/integration/autoland/rev/864954ad4270
Part 7 - Support MFBT Vector in ProcessedStack and add Swap operation to CombinedStacks; r=janerik
https://hg.mozilla.org/integration/autoland/rev/e0fbad31951f
Part 8 - Rename the "untrustedModules" ping to "third-party-modules" and change the schema to support multiprocess; r=janerik

Thunderbird still compiles with these changesets, thanks!

Regressions: 1583756
No longer blocks: 1238735
No longer regressions: 1583756
Regressions: 1583768
Depends on: 1583982
Regressions: 1584377
Regressions: 1590430
You need to log in before you can comment on or make changes to this bug.