Consider if `recordEvent` should ever throw

RESOLVED FIXED in Firefox 68

Status

()

task
P1
normal
RESOLVED FIXED
4 months ago
2 months ago

People

(Reporter: chutten, Assigned: chutten)

Tracking

(Blocks 2 bugs)

Trunk
mozilla68
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox67 wontfix, firefox68 fixed)

Details

Attachments

(1 attachment)

Assignee

Description

4 months ago

Histograms' accumulate functions don't throw.

Scalars' add/set/setmax functions don't throw.

Events' recordEvent throws, but only if the event is unknown.

Should we bring recordEvent into consistency with Histograms and Scalars? This bug is about evaluating users of recordEvent, specifically the ones that handle the exception thrown in the case that the event is unregistered.

Assignee

Comment 1

4 months ago

Of note might be how often recordEvent is called with an unknown event. We have data on that thanks to TELEMETRY_EVENT_RECORDING_ERROR. Here are the counts for beta and nightly (pre-release-only probe).

In short: it's happened maybe a hundred times ever on beta. And less than that on nightly.

One use-case we should consider when changing things here:
There might be tests that want to know if recording an event succeeded (i.e. if it conformed to the schema of registered events, like bug 1443560).

Flags: needinfo?(gijskruitbosch+bugs)

Hey Gijs, this is the bug where we're tracking the issue you mentioned.

Comment 4

4 months ago

(In reply to Georg Fritzsche [:gfritzsche] from comment #3)

Hey Gijs, this is the bug where we're tracking the issue you mentioned.

Thanks, I noticed because I got bugmail as I filed 1311100. :-)

Flags: needinfo?(gijskruitbosch+bugs)

My 2 cents: no, it shouldn't throw, and it definitely doesn't make sense for it to throw sometimes and silently fail other times (as when extra values are present that shouldn't be).

I think there are two use cases: one is during "real operation", where maybe we should never throw, and another is during testing, where I think we should throw. One way that we could maybe satisfy both is to for Telemetry to offer some kind of mock that calling code can use as a substitute for Telemetry in tests, and this mock should validate that events (and histograms, and scalars, etc.) are of the correct form, and this mock should throw when that assumption is violated. This would also let us write tests that don't rely on the current shared global state of the Telemetry object (i.e. getting a snapshot of events, having to clear events from other tests, etc.).

Comment 6

4 months ago

We can use Cu.isInAutomation (and/or the C++ equivalent) to make the error behavior different between automation and what users see. We already make use of this e.g. in fluent's parsing/usage code.

We're tracking this as P1 for now and will sort through the different event work & use-cases next week.

Priority: -- → P1
Assignee

Comment 8

4 months ago

To clarify:

  • recordEvent should not throw in production, to match the Histogram and Scalar recording functions. We can and should log, though.
  • recordEvent must throw in automation, to ensure tests as written can continue to be useful.

The current condition (recordEvent throwing even in production for the one type of failure that's under test) is acceptable for now, so let's push this to P2.

(( And a note for whoever picks this up: This might be complicated by some recordEvent validation not happening until it gets to the parent process, and by the C++ API which cannot throw. ))

Priority: P1 → P2

(In reply to Chris H-C :chutten from comment #8)

To clarify:

  • recordEvent should not throw in production, to match the Histogram and Scalar recording functions. We can and should log, though.
  • recordEvent must throw in automation, to ensure tests as written can continue to be useful.

+1 on the first point.

The automation problem might have other solutions, we should talk through that (make test harness fail on specific error logs, test specific mock, ...).

Having the same API throw only under some conditions is probably not a good pattern as it can be surprising.

Assignee

Updated

2 months ago
Assignee: nobody → chutten
Status: NEW → ASSIGNED
Type: enhancement → task
Priority: P2 → P1

Comment 11

2 months ago
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e0b954c335af
Telemetry.recordEvent should not throw r=janerik
Assignee

Updated

2 months ago
Blocks: 1542891
Assignee

Comment 12

2 months ago

Taking the follow-up of what to do in automation to bug 1542891. This bug is now only concerned with bringing recordEvent into semantic compliance with the rest of Telemetry's recording APIs.

Comment 13

2 months ago
bugherder
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
You need to log in before you can comment on or make changes to this bug.