Closed Bug 1448945 Opened 6 years ago Closed 6 years ago

Make artifact builds pick up changes in Events.yaml

Categories

(Toolkit :: Telemetry, enhancement, P1)

enhancement
Points:
3

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: miker, Assigned: janerik)

References

Details

Attachments

(8 files, 6 obsolete files)

59 bytes, text/x-review-board-request
chutten
: review+
Details
59 bytes, text/x-review-board-request
chutten
: review+
Details
59 bytes, text/x-review-board-request
chutten
: review+
froydnj
: review+
Details
59 bytes, text/x-review-board-request
chutten
: review+
Details
59 bytes, text/x-review-board-request
chutten
: review+
Details
59 bytes, text/x-review-board-request
chutten
: review+
Details
59 bytes, text/x-review-board-request
chutten
: review+
Details
59 bytes, text/x-review-board-request
chutten
: review+
Details
A full build can take a long time... especially when devs need to switch between artifact and non-artifact builds numerous times daily.

Can we add the plumbing to make artifact builds pick up changes in Events.yaml?
See Also: → 1425909
Assignee: nobody → alessio.placitelli
Priority: P2 → P1
Moving this to Jan-Erik, as he'll be pushing this to the finishing line. The status so far:

- The registration code is in place (and doesn't crash anymore);
- The code that generates the JSON definitions out of the YAML file is pretty much there;
- The snapshotting logic is implemented (events can be seen in about:telemetry)
- There's initial test coverage.

What's missing:

- The JSON file is not loaded. We should load it (as we do with the scalar JSON file) on local developer builds when Firefox starts;
- We need test coverage for the part above;
- We need doc updates to mention that artifact builds support events (we also need to update the relevant MDN page);
- We need the review from nfroydnj on the moz.build changes;
- We should take this for a manual test ride in addition to xpcshell tests

For context and technical background on the design, see bug 1425909 comment 12 (and next comments). The patch there can inform the missing changes in this patch as well ;)

@Jan-Erik, of course let me know if you need more details about this madness!
Assignee: alessio.placitelli → jrediger
Points: --- → 3
Flags: needinfo?(jrediger)
Comment on attachment 8966928 [details]
Bug 1448945 - Generate JSON definitions for dynamic builtin events.

https://reviewboard.mozilla.org/r/235584/#review241370

r=me on the moz.build parts.
Attachment #8966928 - Flags: review?(nfroyd) → review+
Attachment #8965705 - Attachment is obsolete: true
Flags: needinfo?(jrediger)
Attachment #8965706 - Attachment is obsolete: true
Attachment #8965707 - Attachment is obsolete: true
Attachment #8965708 - Attachment is obsolete: true
Attachment #8966926 - Flags: review?(alessio.placitelli) → review?(chutten)
Attachment #8966927 - Flags: review?(alessio.placitelli) → review?(chutten)
Attachment #8966928 - Flags: review?(alessio.placitelli) → review?(chutten)
Attachment #8966929 - Flags: review?(alessio.placitelli) → review?(chutten)
Comment on attachment 8966926 [details]
Bug 1448945 - Enable recording events in artifact builds without rebuilding Firefox.

https://reviewboard.mozilla.org/r/235580/#review241508

::: toolkit/components/telemetry/TelemetryEvent.cpp:469
(Diff revision 1)
>      return RecordEventResult::UnknownEvent;
>    }
>  
> -  if (eventKey->dynamic) {
> +  // Fixup the process id only for non-builtin (e.g. supporting build faster)
> +  // dynamic events.
> +  if (eventKey->dynamic &&

Don't I remember us needing a !isExpired before the array dereference here?

::: toolkit/components/telemetry/nsITelemetry.idl:523
(Diff revision 1)
>    /**
> +   * Parent process only. Register dynamic builtin events. The parameters
> +   * have the same meaning as the usual |registerEvents| function.
> +   *
> +   * This function is only meant to be used to support the "artifact build"/
> +   * "built faster" developers by allowing to add new events without rebuilding

typo "built" should be "build"
Attachment #8966926 - Flags: review?(chutten)
Comment on attachment 8966927 [details]
Bug 1448945 - Enable snapshotting dynamic builtin events.

https://reviewboard.mozilla.org/r/235582/#review241510

::: toolkit/components/telemetry/TelemetryEvent.cpp:1141
(Diff revision 1)
>      }
>  
> -    for (auto iter = gEventRecords.Iter(); !iter.Done(); iter.Next()) {
> +    // The snapshotting function is the same for both static and dynamic builtin events.
> +    // We can use the same function and store the events in the same output storage.
> +    auto snapshotter = [aDataset, &locker, &processEvents]
> +                       (EventRecordsMapType& aProcessStorage, bool aIsBuiltinDynamic)

aIsBuiltinDynamic isn't read in this lambda.
Attachment #8966927 - Flags: review?(chutten)
Comment on attachment 8966928 [details]
Bug 1448945 - Generate JSON definitions for dynamic builtin events.

https://reviewboard.mozilla.org/r/235584/#review241514

Seems about right
Attachment #8966928 - Flags: review?(chutten) → review+
Comment on attachment 8966929 [details]
Bug 1448945 - Add test coverage for dynamic builtin events.

https://reviewboard.mozilla.org/r/235586/#review241518

::: toolkit/components/telemetry/tests/unit/test_TelemetryEvents_buildFaster.js:28
(Diff revision 1)
> +    },
> +  });
> +
> +  // Record some events.
> +  Telemetry.recordEvent(TEST_EVENT_NAME, "test1", "object1");
> +  Telemetry.recordEvent(TEST_EVENT_NAME, "test2", "object1", null,

After the problems we had with non-first method+object pairs, I'd like to see us record test2b on object1 as well, including one or more of key1 and key2.
Attachment #8966929 - Flags: review?(chutten)
Attachment #8966930 - Flags: review?(alessio.placitelli) → review?(chutten)
Attachment #8966931 - Flags: review?(alessio.placitelli) → review?(chutten)
Attachment #8966932 - Flags: review?(alessio.placitelli) → review?(chutten)
Attachment #8966933 - Flags: review?(alessio.placitelli) → review?(chutten)
Attachment #8966935 - Flags: review?(alessio.placitelli) → review?(chutten)
Attachment #8966934 - Flags: review?(alessio.placitelli) → review?(chutten)
Here's a plan to test these changes manually: https://gist.github.com/badboy/789108bb048826e461facea8366238c7
Comment on attachment 8966930 [details]
Bug 1448945 - Load dynamic inbuilt event probes on startup.

https://reviewboard.mozilla.org/r/235588/#review241864

::: toolkit/components/telemetry/TelemetryController.jsm:1060
(Diff revision 1)
>              break;
>          }
>          return newValue;
>        });
>      } catch (ex) {
> -      this._log.error(`registerJsProbes - there was an error loading {$scalarProbeFilename}`,
> +      this._log.error(`registerScalarProbes - there was an error loading ${scalarProbeFilename}`,

nice catch

::: toolkit/components/telemetry/TelemetryController.jsm:1086
(Diff revision 1)
> +
> +    // Load the file off the disk.
> +    let eventJSProbes = {};
> +    try {
> +      let fileContent = await OS.File.read(eventProbeFile.path, { encoding: "utf-8" });
> +      eventJSProbes = JSON.parse(fileContent);

Given how similar the preambles in these functions are, did you give some thought to factoring out the common load from disk behaviours?
Attachment #8966930 - Flags: review?(chutten)
Comment on attachment 8966931 [details]
Bug 1448945 - Allow underscore in category, method and extra keys of event.

https://reviewboard.mozilla.org/r/235590/#review241868

::: toolkit/components/telemetry/TelemetryEvent.cpp:1088
(Diff revision 1)
>      if (extra_keys.Length() > kMaxExtraKeyCount) {
>        JS_ReportErrorASCII(cx, "No more than 10 extra keys can be registered.");
>        return NS_ERROR_INVALID_ARG;
>      }
>      for (auto& key : extra_keys) {
> -      if (!IsValidIdentifierString(key, kMaxExtraKeyNameByteLength, false, false)) {
> +      if (!IsValidIdentifierString(key, kMaxExtraKeyNameByteLength, false, true)) {

The documentation doesn't say the extra key names are identifiers. (we can change the document to say so)
Attachment #8966931 - Flags: review?(chutten)
Comment on attachment 8966932 [details]
Bug 1448945 - Add test coverage for loading dynamic inbuilt events on startup.

https://reviewboard.mozilla.org/r/235592/#review241870
Attachment #8966932 - Flags: review?(chutten) → review+
Comment on attachment 8966933 [details]
Bug 1448945 - Document support for adding events in artifact builds.

https://reviewboard.mozilla.org/r/235594/#review241872

::: toolkit/components/telemetry/docs/collection/events.rst:9
(Diff revision 1)
>  Events
>  ======
>  
>  Across the different Firefox initiatives, there is a common need for a mechanism for recording, storing, sending & analysing application usage in an event-oriented format.
>  *Event Telemetry* specifies a common events data format, which allows for broader, shared usage of data processing tools.
> +Adding scalars is supported in artifact builds and build faster workflows.

s/scalars/events/
Attachment #8966933 - Flags: review?(chutten)
Comment on attachment 8966934 [details]
Bug 1448945 - Check for expired event first.

https://reviewboard.mozilla.org/r/235596/#review241876

::: toolkit/components/telemetry/TelemetryEvent.cpp:471
(Diff revision 1)
>  
> +  // If the event is expired or not enabled for this process, we silently drop this call.
> +  // We don't want recording for expired probes to be an error so code doesn't
> +  // have to be removed at a specific time or version.
> +  // Even logging warnings would become very noisy.
> +  if (IsExpired(*eventKey)) {

Ah, so this is where this ended up. I recommend fixing this up into the commit that adds the array index.
Attachment #8966934 - Flags: review?(chutten) → review+
Comment on attachment 8966935 [details]
Bug 1448945 - Only update expiration if not a builtin event.

https://reviewboard.mozilla.org/r/235598/#review241878

Yup, we can fix this up into the earlier commit that introduced it.
Attachment #8966935 - Flags: review?(chutten)
Attachment #8966934 - Attachment is obsolete: true
Attachment #8966935 - Attachment is obsolete: true
Comment on attachment 8966926 [details]
Bug 1448945 - Enable recording events in artifact builds without rebuilding Firefox.

https://reviewboard.mozilla.org/r/235580/#review242654

More or less there. One thing I'd like an explanation on, and one nit.

::: toolkit/components/telemetry/TelemetryEvent.cpp:482
(Diff revision 2)
> +  if (eventKey->dynamic &&
> +      !(*gDynamicEventInfo)[eventKey->id].builtin) {
>      processType = ProcessID::Dynamic;
>    }
>  
> -  EventRecordArray* eventRecords = GetEventRecordsForProcess(lock, processType, *eventKey);
> +  bool isDynamicBuiltin = eventKey->dynamic && (*gDynamicEventInfo)[eventKey->id].builtin;

Looks like this could be used in the above conditional to make it more readable?

::: toolkit/components/telemetry/TelemetryEvent.cpp:558
(Diff revision 2)
>  
>      // Re-registering events can happen when add-ons update, so we don't print warnings.
>      // We don't support changing their definition, but the expiry might have changed.
>      EventKey* existing = nullptr;
>      if (gEventNameIDMap.Get(eventName, &existing)) {
> -      if (eventExpired[i]) {
> +      if (!eventInfos[i].builtin && eventExpired[i]) {

So... if someone tries to register a dynamic builtin event with the same category,method,object tuple as an already-registered event... we won't use its expired status to determine whether the existing event should be treated as expired?

Do I understand that correctly?

So this doesn't actually matter because for all i, eventExpired[i] is false, right? Is this change necessary?

::: toolkit/components/telemetry/nsITelemetry.idl:561
(Diff revision 2)
>    /**
>     * Parent process only. Register dynamic builtin scalars. The parameters
>     * have the same meaning as the usual |registerScalars| function.
>     *
>     * This function is only meant to be used to support the "artifact build"/
> -   * "built faster" developers by allowing to add new scalars without rebuilding
> +   * "build faster" developers by allowing to add new scalars without rebuilding

Nice.
Attachment #8966926 - Flags: review?(chutten) → review+
Comment on attachment 8966927 [details]
Bug 1448945 - Enable snapshotting dynamic builtin events.

https://reviewboard.mozilla.org/r/235582/#review242664
Attachment #8966927 - Flags: review?(chutten) → review+
Comment on attachment 8966929 [details]
Bug 1448945 - Add test coverage for dynamic builtin events.

https://reviewboard.mozilla.org/r/235586/#review242674

::: toolkit/components/telemetry/tests/unit/test_TelemetryEvents_buildFaster.js:40
(Diff revision 2)
> +  Assert.ok(("parent" in snapshot), "Should have parent events in the snapshot.");
> +
> +  let expected = [
> +    [TEST_EVENT_NAME, "test1", "object1"],
> +    [TEST_EVENT_NAME, "test2", "object1", null, {key1: "foo", key2: "bar"}],
> +    [TEST_EVENT_NAME, "test2", "object2", null, {key2: "bar"}],

Does this test pass? There is no object2 registered.
Attachment #8966929 - Flags: review?(chutten)
Comment on attachment 8966930 [details]
Bug 1448945 - Load dynamic inbuilt event probes on startup.

https://reviewboard.mozilla.org/r/235588/#review242694
Attachment #8966930 - Flags: review?(chutten) → review+
Comment on attachment 8966931 [details]
Bug 1448945 - Allow underscore in category, method and extra keys of event.

https://reviewboard.mozilla.org/r/235590/#review242696
Attachment #8966931 - Flags: review?(chutten) → review+
Comment on attachment 8966933 [details]
Bug 1448945 - Document support for adding events in artifact builds.

https://reviewboard.mozilla.org/r/235594/#review242698
Attachment #8966933 - Flags: review?(chutten) → review+
Comment on attachment 8966929 [details]
Bug 1448945 - Add test coverage for dynamic builtin events.

https://reviewboard.mozilla.org/r/235586/#review243128
Attachment #8966929 - Flags: review?(chutten) → review+
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/03ec35311b19
Enable recording events in artifact builds without rebuilding Firefox. r=chutten
https://hg.mozilla.org/integration/autoland/rev/4badb094b348
Enable snapshotting dynamic builtin events. r=chutten
https://hg.mozilla.org/integration/autoland/rev/d7199d45cfd9
Generate JSON definitions for dynamic builtin events. r=chutten,froydnj
https://hg.mozilla.org/integration/autoland/rev/441de7a08d1c
Add test coverage for dynamic builtin events. r=chutten
https://hg.mozilla.org/integration/autoland/rev/c8fb850a6d66
Load dynamic inbuilt event probes on startup. r=chutten
https://hg.mozilla.org/integration/autoland/rev/350a522001fa
Allow underscore in category, method and extra keys of event. r=chutten
https://hg.mozilla.org/integration/autoland/rev/efb850288b07
Add test coverage for loading dynamic inbuilt events on startup. r=chutten
https://hg.mozilla.org/integration/autoland/rev/4557fca4b7e2
Document support for adding events in artifact builds. r=chutten
Keywords: checkin-needed
It landed. MDN page edited.
Depends on: 1460400
Depends on: 1573588

I added an item to the list of extra_keys, and xpcshell does not seem to pick it up with artifacts build:

$ ./mach build
$ ./mach xpcshell-test services/settings/test/unit/test_remote_settings.js
...
...
 0:03.10 INFO "CONSOLE_MESSAGE: (warn) [JavaScript Warning: "Invalid extra key for event ["uptake.remotecontent.result", "uptake", "remotesettings"]."]"
 0:03.19 TEST_END: Test FAIL, expected PASS. Subtests passed 561/562. Unexpected 1

Do I have to do something specific or is this a regression?

Flags: needinfo?(MattN+bmo)

Sorry, no idea.

Flags: needinfo?(MattN+bmo) → needinfo?(jrediger)

I looked into it again.
Seems like Telemetry is not in test mode and thus not loading the files.

With the following patch, ensuring the telemetry controller gets initialized, changes are picked up in artifact builds:

diff --git services/settings/test/unit/test_remote_settings.js services/settings/test/unit/test_remote_settings.js
index 8b0a989942c74..cf6d1f1e74b2a 100644
--- services/settings/test/unit/test_remote_settings.js
+++ services/settings/test/unit/test_remote_settings.js
@@ -6,6 +6,7 @@ const { Services } = ChromeUtils.import("resource://gre/modules/Services.jsm");
 const { AppConstants } = ChromeUtils.import(
   "resource://gre/modules/AppConstants.jsm"
 );
+const { TelemetryController } = ChromeUtils.import("resource://gre/modules/TelemetryController.jsm");
 
 const IS_ANDROID = AppConstants.platform == "android";
 
@@ -103,6 +104,9 @@ function run_test() {
   });
 }
 add_task(clear_state);
+add_task(async function asyncSetup() {
+  TelemetryController.observe(null, "profile-after-change", null);
+});
 
 add_task(async function test_records_obtained_from_server_are_stored_in_db() {
   // Test an empty db populates

Now that is obviously not the solution, as it relies on implementation details. What also works is forcing test mode: await TelemetryController.testSetup(), but that's also not good.

:chutten, any idea why Telemetry is not initialized in time and what way we can provide others to ensure it is?

Flags: needinfo?(jrediger) → needinfo?(chutten)

Ah, so they're available to artifact builds after all, they're just not loaded in tests because no one told Telemetry to do anything. Hm.

Telemetry's as initialized as it expects to be in tests: no controller, session, custom pings, send, scheduling, heavy init... nothing in JSMs, basically. The parts that are init'd are from the IDL inwards. This is on purpose, if I intuit our history properly. For all non-Telemetry-internal tests, we don't want/need the Telemetry JSMs to do anything... except apparently now we do : )

We could cut a hole down to registerJSProbes (call it testRegisterJSProbes()) but that's about as helpful as testSetup (though external tests could at least await the right things this way). Not very ergonomic to ask all tests running in artefact or non-artefact mode to remember to write a magic incantation in each of their tests.

Maybe there's a topic to observe or a basic-level head.js we could tie into to call await testRegisterJSProbes() for us. It'll probably break a couple of the buildfaster tests, but it should have surprisingly little effect on init. What do you think, Jan-Erik?

Flags: needinfo?(chutten)

We could have a testRegisterJSProbes function and insert ourselves into testing/xpcshell/head.js to call that for all xpcshell tests, so that no one else needs to do that.
(That won't solve the same problem for mochitests, but I'm wondering why they don't have the telemetry stuff around to begin with, that should be a nearly-full-fledged browser already)

wdyt, chutten?

Flags: needinfo?(chutten)

Sounds like a solid plan of action to me!

Flags: needinfo?(chutten)
Blocks: 1619990
You need to log in before you can comment on or make changes to this bug.