Integrate the Glean SDK into mozilla-central
Categories
(Toolkit :: Telemetry, task, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox74 | --- | fixed |
People
(Reporter: chutten, Assigned: chutten)
References
Details
Attachments
(12 files, 6 obsolete files)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
2.10 KB,
text/plain
|
tdsmith
:
data-review+
|
Details |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review |
With the Glean SDK crates available on crates.io, dependencies resolved, integration layer technologies sorted, and build system adapted we can finally vendor the crates in and init them. We can prove that it builds when we want it, and only when we want it. We can write some more tests.
Don't go so far as to send the "prototype" ping yet. That's for later.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 1•5 years ago
|
||
Poking through mozilla-central, I found that CertStorage registers itself as an nsIObserver for its own prefs. This might be helpful in ensuring we have and keep the current value of the data upload pref, though it would mean a certain amount of unsafe
code.
Assignee | ||
Comment 2•5 years ago
|
||
Updated•5 years ago
|
Assignee | ||
Comment 3•5 years ago
|
||
Depends on D56468
Assignee | ||
Comment 4•5 years ago
|
||
Doesn't compile because I'm trying to send an nsIObserver across threads.
Need to spend some time thinking about the design of this state...
Depends on D56469
Assignee | ||
Comment 5•5 years ago
|
||
Depends on D56470
Assignee | ||
Comment 6•5 years ago
|
||
Assignee | ||
Comment 7•5 years ago
|
||
Depends on D57105
Assignee | ||
Comment 8•5 years ago
|
||
Since we're the only one sending data, and we're doing so infrequently, let's
get the pref value before each ping send instead of building a pref observer
right this second.
Depends on D57106
Assignee | ||
Comment 9•5 years ago
|
||
Depends on D57107
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 10•5 years ago
|
||
Depends on D57108
Assignee | ||
Comment 11•5 years ago
|
||
This is my first time writing Rust in m-c. Jan-Erik and I recognize that maybe we should have a practised eye take a look at these changes before we push them in, even if it is gated to just Nightly.
Lina, would you be or know of someone who could provide this sort of feedback?
Assignee | ||
Comment 12•5 years ago
|
||
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 14•5 years ago
|
||
(In reply to Tim Smith 👨🔬 [:tdsmith] from comment #13)
How will you document the prototype ping?
Good question! I will put a pings.yaml
in toolkit/components/telemetry/fog/
that will contain the necessary information.
Assignee | ||
Comment 15•5 years ago
|
||
Assignee | ||
Comment 16•5 years ago
|
||
Updated•5 years ago
|
Assignee | ||
Comment 17•5 years ago
|
||
You can find the draft here: https://phabricator.services.mozilla.com/D57463
It will live here: https://hg.mozilla.org/mozilla-central/file/tip/toolkit/components/telemetry/fog/pings.yaml
Comment 18•5 years ago
|
||
Assignee | ||
Comment 19•5 years ago
|
||
Jan-Erik tells me Emilio might have some input on how I'm doing Rust in mozilla-central. I would love to have input on where I've gone wrong : )
Comment 20•5 years ago
|
||
I took a look. It looks mostly fine with some nits, but the prefs usage definitely needs to get fixed.
Comment 21•5 years ago
|
||
(In reply to Tim Smith 👨🔬 [:tdsmith] from comment #18)
Comment on attachment 9116199 [details]
data collection reviewAlicia, this is a data-review request for a new ping type ("prototype ping")
containing a new identifier. The identifier is named "client_id" but it does
not have the same value as the normal telemetry client_id. The new value
("prototype
client-id") and the telemetry client_id will not be sent in
the same document. The prototype ping will only contain Category 1 technical
data. The prototype ping is a time-limited experimental collection in
Nightly only in order to shake out the Glean telemetry library on desktop.
Please confirm that you don't have any concerns.Sounds great otherwise; thanks chutten for answering my questions about the
content of the ping and documentation!--
- Is there or will there be documentation that describes the schema for
the ultimate data set in a public, complete, and accurate way?toolkit/components/telemetry/fog/pings.yaml will describe the ping.
https://mozilla.github.io/glean/book/user/pings/index.html#ping-sections
describes the ping_info and client_info content they will contain.In the future, glean pings will be described at
https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/index.
html and collections will be described by an enhanced probe dictionary.
- Is there a control mechanism that allows the user to turn the data
collection on and off?Yes, the Firefox telemetry opt-out.
- If the request is for permanent data collection, is there someone who
will monitor the data over time?n/a.
- Using the category system of data
types on the Mozilla
wiki, what collection type of data do the requested measurements fall under?Category 1, technical data.
- Is the data collection request for default-on or default-off?
Default-on.
- Does the instrumentation include the addition of any new
identifiers?Yes, discussed above.
- Is the data collection covered by the existing Firefox privacy notice?
Yes, but ni agray to confirm.
- Does there need to be a check-in in the future to determine whether to
renew the data?No, but chutten's responsible for removing it eventually.
- Does the data collection use a third-party collection tool?
No.
Trust approved. Nightly privacy notice will cover this as part of its "experimental nature" language.
Assignee | ||
Comment 22•5 years ago
|
||
Depends on D57359
Updated•5 years ago
|
Assignee | ||
Comment 23•5 years ago
|
||
nsIProcess is the tried-and-true method for launching utility subprocesses on
Firefox Desktop's supported platforms. Use that.
Depends on D57977
Comment 24•5 years ago
|
||
I read through your patches, and they look great to me too, :chutten!
Comment 25•5 years ago
|
||
Comment 26•5 years ago
|
||
Comment 27•5 years ago
|
||
Backed out 7 changesets (bug 1591564) for leaks and tier2 build bustages.
https://hg.mozilla.org/integration/autoland/rev/006b24f2618def928d1b4ada66322da9a06aa190
Failure log leaks:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=283845214&repo=autoland&lineNumber=5516
Failure log bustages:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=283842210&repo=autoland&lineNumber=93915
Assignee | ||
Comment 28•5 years ago
|
||
The Tier2 build problem appears to be libbacktrace
being included in mingw windows builds where it maybe should not be. (( The linked discussion points to libbacktrace
skipping msvc, but maybe it should also be skipping clang on mingw? (I have no idea). ))
The leak is almost certainly due to me mishandling an nsString
in the integration layer somewhere.
Comment 29•5 years ago
|
||
Updated•5 years ago
|
Comment 30•5 years ago
|
||
I spent a lot of time today tracking down the libbacktrace issue and why it didn't occur before.
Except for glean-preview we didn't include or update any other Rust crates in this patch set.
glean-preview uses glean-core, which uses failure, which includes backtrace
by defualt.
failure was already included before and compiled with default features, and so was backtrace.
glean doesn't really need failure's backtrace feature, so I tried disabling its default features.
Due to how cargo currently works this needs to be done across all transitive dependencies (the final feature set a crate is compiled with it the union across it all).
- Disabling it in ffi-support in application-services
- Disabling it in Glean (and depending on that changed ffi-support
I then changed m-c to use the crates from git directly.
Turns out that still fails with the same issue on the x86_64-pc-windows-gnu target.
So something is re-enabling that failure "std" feature in tree.
With a combination of cargo-feature-set and cargo tree I tracked it down further.
This is the full feature set for the gkrust-shared crate as compiled:
cargo feature-set --features bitsdownload --features cranelift_x86 --features cubeb-remoting --features fogotype --features gecko_debug --features gecko_profiler --features gecko_refcount_logging --features moz_memory --features moz_places --features new_cert_storage --features new_xulstore --features quantum_render
Turns out the quantum_render
feature enables the webrender_bindings crate,
which then somehow pulls in failure through transitive dependencies again.
More trial-and-error revealed its a dependency of the dirs crate,
only used on Windows.
Except dirs doesn't need failure for the target we're building for (x86_64-pc-windows-gnu
or Mac or Linux).
It's again a transitive dependency for a crate called redox_users
, which is only pulled in when compiled for redox
Except that's not how cargo works.
cargo always pulls in all dependencies, merges all features and only later ignores the crates it doesn't actually need.
That's a long standing issue:
Why did it all work before just fine?
Nothing of the above dependency graph changed with the patch set from :chutten.
So the only thing that changed is us actually using glean_preview.
Which means we're suddenly calling functions that return types that wrap other types that come from the backtrace crate, so the linker is now looking for them and can't find them.
I guess we've just been lucky so far that this didn't blow up.
So what's the solution?
- Short term: fork failure, remove backtrace there and vendor that fork
- Already did a try run
- Mid term: get rid of failure across the Glean code base (ffi-support might want that anyway and we don't benefit much from it either)
- That only helps us, but it won't be long before someone else runs into this problem again.
- Long term: fix cargo once and for all.
Comment 32•5 years ago
|
||
Forking crates to fix problems is fine, though obviously long-term undesirable. We could try disallowing failure
from ever appearing in our build graph, though that's probably hard, especially when it appears in crates we'll never compile.
I guess it doesn't work to force the failure
crate's features in https://searchfox.org/mozilla-central/source/build/workspace-hack/Cargo.toml, because cargo would still union in the std
(and therefore the backtrace
) feature from other crates, correct?
Updated•5 years ago
|
Comment 33•5 years ago
|
||
Correct, cargo always takes the union.
I'm going to land the patch as a short-term fix and I will work on removing failure from crates that don't really need it.
Assignee | ||
Comment 34•5 years ago
|
||
The end of the std::thread at process end didn't seem to release the owned
nsStringBuffer in a way that refcounting liked. So let's copy the nsAString
into an owned String, move it into the thread's closure, and convert it as
necessary to an nsAString when we invoke pingsender.
Not the most efficient, but it doesn't have to be. This is prototype code
that will be removed.
Depends on D58809
Assignee | ||
Comment 35•5 years ago
|
||
Depends on D59531
Comment 36•5 years ago
|
||
Comment 37•5 years ago
|
||
Backed out 9 changesets (Bug 1591564) for talos and browser chrome failures at appdata\local\temp\db\data.mdb
Failure log:
- browser chrome: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=284452233&repo=autoland&lineNumber=1505
- talos-xperf: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=284460626&repo=autoland&lineNumber=1730
[task 2020-01-10T20:31:35.603Z] 20:31:35 INFO - TEST-START | browser/base/content/test/performance/browser_startup_mainthreadio.js
[task 2020-01-10T20:31:35.662Z] 20:31:35 INFO - TEST-INFO | started process screenshot
[task 2020-01-10T20:31:35.883Z] 20:31:35 INFO - TEST-INFO | screenshot: exit 0
[task 2020-01-10T20:31:35.884Z] 20:31:35 INFO - Buffered messages logged at 20:31:35
[task 2020-01-10T20:31:35.884Z] 20:31:35 INFO - Entering test bound
[task 2020-01-10T20:31:35.885Z] 20:31:35 INFO - TEST-PASS | browser/base/content/test/performance/browser_startup_mainthreadio.js | The IO interposer should be enabled in builds that are not RELEASE_OR_BETA -
[task 2020-01-10T20:31:35.885Z] 20:31:35 INFO - whitelisted paths before profile selection:
[task 2020-01-10T20:31:35.885Z] 20:31:35 INFO - Z:\task_1578687885\AppData\Roaming\Mozilla\Firefox\profiles.ini - condition: true, ignoreIfUnused: true, read: 1, stat: 1, listedPath: UAppData:profiles.ini
[task 2020-01-10T20:31:35.886Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner - condition: true, stat: 1, listedPath: ProfD:
[task 2020-01-10T20:31:35.886Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\parent.lock - condition: true, stat: 1, listedPath: ProfD:parent.lock
[task 2020-01-10T20:31:35.886Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\minidumps - condition: true, stat: 1, listedPath: ProfD:minidumps
[task 2020-01-10T20:31:35.886Z] 20:31:35 INFO - Z:\task_1578687885\build\application\firefox\browser\defaults\preferences - condition: true, stat: 1, listedPath: XCurProcD:defaults/preferences
[task 2020-01-10T20:31:35.886Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\startupCache\scriptCache-child-current.bin - condition: true, stat: 1, listedPath: ProfLDS:startupCache/scriptCache-child-current.bin
[task 2020-01-10T20:31:35.886Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\startupCache\scriptCache-child.bin - condition: true, stat: 1, listedPath: ProfLDS:startupCache/scriptCache-child.bin
[task 2020-01-10T20:31:35.886Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\startupCache\scriptCache-current.bin - condition: true, stat: 1, listedPath: ProfLDS:startupCache/scriptCache-current.bin
[task 2020-01-10T20:31:35.887Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\startupCache\scriptCache.bin - condition: true, stat: 1, listedPath: ProfLDS:startupCache/scriptCache.bin
[task 2020-01-10T20:31:35.887Z] 20:31:35 INFO - Z:\task_1578687885\build\application\firefox\defaults\pref\channel-prefs.js - stat: 1, read: 1, close: 1, listedPath: PrfDef:channel-prefs.js
[task 2020-01-10T20:31:35.887Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\prefs.js - stat: 1, read: 1, close: 1, listedPath: PrefD:prefs.js
[task 2020-01-10T20:31:35.887Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\user.js - stat: 1, read: 1, close: 1, listedPath: PrefD:user.js
[task 2020-01-10T20:31:35.887Z] 20:31:35 INFO - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\xulstore\data.mdb - condition: true, write: 1, fsync: 1, listedPath: ProfD:xulstore/data.mdb
[task 2020-01-10T20:31:35.888Z] 20:31:35 INFO - (PoisonIOInterposer) stat - Z:\task_1578687885\AppData\Roaming\Mozilla\Firefox\profiles.ini
[task 2020-01-10T20:31:35.888Z] 20:31:35 INFO - (PoisonIOInterposer) stat - c:\users\task_1578687885\appdata\local\temp\tmporpc9c.mozrunner
[task 2020-01-10T20:31:35.889Z] 20:31:35 INFO - (PoisonIOInterposer) stat - c:\users\task_1578687885\appdata\local\temp\tmporpc9c.mozrunner\parent.lock
[task 2020-01-10T20:31:35.889Z] 20:31:35 INFO - (PoisonIOInterposer) stat - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\minidumps
[task 2020-01-10T20:31:35.889Z] 20:31:35 INFO - (PoisonIOInterposer) read - Z:\task_1578687885\build\application\firefox\defaults\pref\channel-prefs.js
[task 2020-01-10T20:31:35.890Z] 20:31:35 INFO - (PoisonIOInterposer) stat - Z:\task_1578687885\build\application\firefox\browser\defaults\preferences
[task 2020-01-10T20:31:35.890Z] 20:31:35 INFO - Buffered messages finished
[task 2020-01-10T20:31:35.890Z] 20:31:35 INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/performance/browser_startup_mainthreadio.js | unexpected read on C:\Users\task_1578687885\AppData\Local\Temp\db\data.mdb before profile selection -
[task 2020-01-10T20:31:35.891Z] 20:31:35 INFO - Stack trace:
[task 2020-01-10T20:31:35.891Z] 20:31:35 INFO - XREMain::XRE_main
[task 2020-01-10T20:31:35.891Z] 20:31:35 INFO - (PoisonIOInterposer) read - C:\Users\task_1578687885\AppData\Local\Temp\db\data.mdb
[task 2020-01-10T20:31:35.892Z] 20:31:35 INFO - (PoisonIOInterposer) read - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\prefs.js
[task 2020-01-10T20:31:35.892Z] 20:31:35 INFO - (PoisonIOInterposer) read - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\user.js
[task 2020-01-10T20:31:35.893Z] 20:31:35 INFO - (PoisonIOInterposer) stat - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\startupCache\scriptCache-child.bin
[task 2020-01-10T20:31:35.893Z] 20:31:35 INFO - (PoisonIOInterposer) stat - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\startupCache\scriptCache-child-current.bin
[task 2020-01-10T20:31:35.894Z] 20:31:35 INFO - (PoisonIOInterposer) stat - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\startupCache\scriptCache.bin
[task 2020-01-10T20:31:35.894Z] 20:31:35 INFO - (PoisonIOInterposer) stat - C:\Users\task_1578687885\AppData\Local\Temp\tmporpc9c.mozrunner\startupCache\scriptCache-current.bin
Assignee | ||
Comment 38•5 years ago
|
||
It appears as though Glean's init opens the db on the thread it's called on. Lemme wrap the entire init block in a spawned std::thread
and see if that solves it.
Trypush here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0267643b182b6f781fe42ee1a67fc6ed97154501
(looking to see if the Bc tests come back green)
Assignee | ||
Comment 39•5 years ago
|
||
Glean init does some I/O which is forbidden on the main thread (especially
during startup). Though not recommended for mobile platforms, let's init it
off the main thread.
Later versions of the Glean SDK might take care of off-thread I/O during init.
But for now, let's do it ourselves.
Depends on D59532
Comment 40•5 years ago
|
||
Comment 41•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/b18f5ee12494
https://hg.mozilla.org/mozilla-central/rev/6dbc2325d1ed
https://hg.mozilla.org/mozilla-central/rev/43d749cff07e
https://hg.mozilla.org/mozilla-central/rev/5edf28e3c13b
https://hg.mozilla.org/mozilla-central/rev/fc040effba1c
https://hg.mozilla.org/mozilla-central/rev/427faa0aef87
https://hg.mozilla.org/mozilla-central/rev/fb49d44ffbb2
https://hg.mozilla.org/mozilla-central/rev/1cf9c1e5f7bb
https://hg.mozilla.org/mozilla-central/rev/52dbbdef7811
https://hg.mozilla.org/mozilla-central/rev/7de9590b9e81
Description
•