Open Bug 1670259 Opened 4 years ago Updated 4 months ago

FOG User Testing Ergonomics - Design, Documentation, Implementation

Categories

(Toolkit :: Telemetry, task, P3)

task

Tracking

()

People

(Reporter: chutten, Unassigned)

References

(Depends on 6 open bugs)

Details

(Whiteboard: [telemetry:fog:m7])

FOG's users (data implementors) will need a way to easily write tests for their instrumentation code. We should design, implement, and document testing strategies for our users, tying together instrumentation with testing.

We might provide custom test utilities to check for instrumentation errors (invalid_state and friends). We should provide examples. We must support more than just JavaScript test suites.

Metrics already have testGetValue, this is for establishing the layer of norms above that.

Depends on: 1681742

A test ergo thing I noticed: Testing a Memory Distribution is kinda irritating because the buckets and sums returned in testGetValue are in bytes, but the samples could be kilobytes or megabytes or whatever. Makes me wish for a units of measure system.

Another test ergo thing I noticed: test_get_num_errors requires that you know the errors that a metric is likely to have. Maybe we should throw an exception or something on test_get_value if there are errors?

And another: There's no way to clear data between test runs in xpcshell or GTest because the init happens when the suite starts. Maybe we can fix this in GTest with a custom fixture.

Yet another test ergo thing: for IPC tests we only have a JS API for awaiting the flush of all child data to the parent (await FOG.testFlushAllChildren()). We should totally expose a similar method on Rust and C++.

And: The tests in t/c/g/xpcshell cannot ever succeed in parallel because they both write metrics to the same profile simultaneously. We need to have a better xpcshell testing story here, perhaps using the new capability to specify a data path on FOG init to point each file to its own tempdir?

With a user/dev split in the Docs we could really benefit from user-facing Debugging information. A lot of it will be pointing folks at the Debug Ping Viewer, but we also have about:glean that might make people happy. Also, though it might be a bit too much behind-the-curtain, a reference to how to turn on logging might be beneficial.

Depends on: 1685402

We should be specific that instrumentation is expected to be tested. No more testing-exception-unchanged!

Depends on: 1683171

We should link to the Testing Strategy document in the developer testing docs.

Depends on: 1709147

We may wish to replace the existing link on about:glean or add a new one to our new user-centric documentation when it's ready.

Depends on: 1720491

User-facing test docs should include fundamentals on the test suites involved. Do we supply a gtest test fixture to extend? Explain what that is and how to use it. Do we need xpcshell tests to get a profile? Explain why and how that works.

Depends on: 1737157

TestResetFOG isn't implemented on Android, meaning any tests that need to clear state are gonna explode on Android. (cf the backout on bug 1751241)

This isn't ideal, as resetting Glean (via resetting FOG) is a central pillar to how we're going to make instrumentation tests more pleasant.

Jan-Erik: Can Glean be reset in Android from xpcshell, mochi, gtest, and friends? Was this NS_ERROR_NOT_IMPLEMENTED_DUE_TO_TIME or NS_ERROR_NOT_IMPLEMENTED_BECAUSE_IMPOSSIBLE?

Flags: needinfo?(jrediger)

I think it was a NS_ERROR_NOT_NEEDED_AT_THE_TIME.
And my initial implementation didn't even have a functioning fog_init, and so implementing fog_test_reset was even less necessary.
Now that fog_init does something in an Android build and is only used in tests we should be able to also provide fog_test_reset there.

Flags: needinfo?(jrediger)
Depends on: 1752139
Depends on: 1752357
Depends on: 1756055
Depends on: 1756057
Depends on: 1762281
Depends on: 1766515
Depends on: 1778363

Would it be possible to ergonomically offer a timestamp-insensitive comparison operation for EventRecords returned via test_get_value? I mean, we could certainly write one in Rust and expose it (or mirror it) in C++. But JS? Do we write this once per language, or add more to the FFI?

Or do we document clever ways to split the timestamp off in the docs, like for JS:

let eventRecords = Glean.myCat.myEvent.testGetValue();
let {timestamp, ...record} = eventRecords[0];
Assert.deepEqual(record, {category: "my_cat", name: "my_event"});
Depends on: 1799977

via jlink on https://phabricator.services.mozilla.com/D168915

For what it's worth I have one small suggestion - it would be nice if there was some kind of feedback when you you press the "Apply setting and submit ping" button to let you know that you clicked it.

Depends on: 1818352
Depends on: 1818772
Depends on: 1827926
Depends on: 1872874
You need to log in before you can comment on or make changes to this bug.