1629950 - Glean SDK might benefit from a "struct" metric type

Chris H-C :chutten

Assignee

Description

•

5 years ago

Proposal for changing an existing or adding a new Glean metric type

Who is the individual/team requesting this change?

:chutten, Firefox Telemetry

Is this about changing an existing metric type or creating a new one?

Creating a new one

Can you describe the data that needs to be recorded?

Data that comes in a structured format. For example: the customized layout state of Firefox Desktop's UI, the size and speed of the fastest GC run.

Can you provide a raw sample of the data that needs to be recorded (this is in the abstract, and not any particular implementation details about its representation in the payload or the database)

library_button: { state: "pinned", location: "right", parent_item: "none" },
fastest_gc: { speed: 4250ns, size: 412KB }

What is the business question/use-case that requires the data to be recorded?

General structured data ingestion reasons.

How would the data be consumed?

The data would be available for analysis in the usual ad hoc tooling. GLAM's aggregation could aggregate the sub-metrics individually (like showing a distribution of the fastest GCs' speeds and a distribution of the fastest GCs' sizes), but the relationship between the structured datapoints would be lost without a lot of work.

Why existing metric types are not enough?

We could do this today by flattening the structs like so:

library_button_state: "pinned",
library_button_location: "right",
library_button_parent_item: "none",
fastest_gc_speed: 4250ns,
fastest_gc_size: 412KB,

This is suboptimal for organizational reasons and may make instrumentation mistakes easier.

Also, Events could be adopted to report this via its extra dictionary, but I have other issues with that.

What is the timeline by which the data needs to be collected?

No strict timeline.

Chris H-C :chutten

Assignee

Updated

•

5 years ago

Comment 1

•

5 years ago

Further requirement:

In discussions with the team we highlighted a potential need to use this metric type to support a library of "Named Structs". For example, perhaps a named struct of struct_search_metrics would have the list of search engines, the current default engine, the number of searches per engine, and assorted other search-y things in it.

Chris H-C :chutten

Assignee

Updated

•

4 years ago

Comment 2

•

3 years ago

For tackling first thing in the new year.

Assignee: nobody → chutten

Status: NEW → ASSIGNED

Priority: -- → P2

Jan-Erik Rediger [:janerik]

Comment 3

•

3 years ago

chutten took the time and prepared the initial proposal: https://docs.google.com/document/d/1JNyJ_WFexoQAA8TTT9xR8X_yrBzp5uUdSO_38I7jYS4/edit#

I selected the relevant stakeholders. Please take a look at the proposal and help shape the design, especially with respect to your designated area of expertise.
As per the process I hope this will take no more than 2 weeks (2022-02-14).

Flags: needinfo?(tlong)

Flags: needinfo?(rmiller)

Flags: needinfo?(jklukas)

Flags: needinfo?(brizental)

Jan-Erik Rediger [:janerik]

Updated

•

3 years ago

See Also: → https://mozilla-hub.atlassian.net/browse/DS-1880

Chris H-C :chutten

Assignee

Comment 4

•

3 years ago

Note to those being consulted: thanks to the very helpful input thus far provided, I've had to rejigger things slightly but comprehensively.

Specifically, :klukas pointed out an excellent reason that the client should pay the complexity cost of assembling the struct into the appropriate structure that makes it easier to ingest. In retrospect this seems obvious, but at the time I couldn't see it, so thank you Jeff!

You may wish to re-review the doc as this has ramifications and there's some stuff that wasn't there when the doc went out:

There's a fully-fledged Purpose section explaining why this complexity is worth the effort.
The fully-qualified field names will no longer be limited to the metric name length. Instead each field can have the full use of the 30B of characters that a metric name has. Yes, my_category.my_metric.my_field.my_nested_field.my_double_nested_field.my_triple_nested_field.my_quadruple_nested_field is valid. (I think)
Payload assembly has gotten complicated enough that it has its own section separate from Storage. Give it a gander.
Max nesting limit has been reduced to 5, for everyone's sanity
Duplication of struct metrics and struct_list metrics and their fields is far less likely because field position within the structure is now an identifying quality. a.b_c.d will no longer collide with a.b.c_d
There's an optional API improvement to make commits a little easier. Still not happy with it, so please poke holes in it at "Option: Commit Full Subtrees"
There's a new section on Threading and IPC showing that, by wrapping a metric in a struct_list, you can make a previously-forbidden-from-using-IPC metric into one that can be used across processes. Which is a neat trick.
There's a new section on Documentation, Use, and Education. Learn some Latin!

Unresolved/ongoing discussions include

Should storage pay the complexity price instead of payload assembly?
Should struct even have a transaction API since there can only be one of them?
Do we have to drop support for event metrics because of a quirk of the Glean ping payload schema?

Things unchanged (for now):

APIs, SQL, how it looks in metrics.yaml... basically all the touch-points where a Instrumentor, Consumer, or Analyst would interact with these remain unchanged from the initial version.

If any of this is meaningful to your review, you may wish to go through again and weigh in again. Sorry for the shift in bearing, but it was a necessary course correction.

Beatriz Rizental [:brizental]

Comment 5

•

3 years ago

•

Edited

Left my decision summary on the proposal. Although discussions around implementation details are still ongoing, I don't feel they are blocking that.

I will be active on the discussions while they are unresolved and I can contribute to them. Once that is over then I will add my sign-off to the proposal.

Flags: needinfo?(brizental)

GitHub Bugzilla PR Linker

Comment 6

•

3 years ago

Attached file Oops, linked to the wrong bug (obsolete) — Details

Beatriz Rizental [:brizental]

Updated

•

3 years ago

Attachment #9262856 - Attachment description: Link to GitHub pull-request: https://github.com/mozilla/glean.js/pull/1174 → Oops, linked to the wrong bug

Attachment #9262856 - Attachment is obsolete: true

Jeff Klukas [:klukas] (UTC-4)

Comment 7

•

3 years ago

Thanks :chutten for absorbing suggestions and updating the design. I've left feedback and signed off. I'm tagging in :relud, though, for checking on what complexity this might add to probe-scraper.

Flags: needinfo?(jklukas) → needinfo?(dthorn)

Daniel Thorn [:relud]

Comment 8

•

3 years ago

•

Edited

(In reply to Jeff Klukas [:klukas] (UTC-4) from comment #7)

I'm tagging in :relud, though, for checking on what complexity this might add to probe-scraper.

I agree with :chutten's assessment that "it will largely be some new abstractions that call into all the existing functionality"

Flags: needinfo?(dthorn)

Rob Miller [:rmiller]

Comment 9

•

3 years ago

Reviewed.

Flags: needinfo?(rmiller)

Travis Long [:travis_]

Updated

•

3 years ago

Flags: needinfo?(tlong)

Chris H-C :chutten

Assignee

Comment 10

•

3 years ago

Tagging Jan-Erik for questions in comments on Storage, and for where we go from here. We've got a lot of sign-offs, so we might be close to accepting this and starting to think about scheduling.

Flags: needinfo?(jrediger)

Jan-Erik Rediger [:janerik]

Updated

•

3 years ago

Flags: needinfo?(jrediger)

Jan-Erik Rediger [:janerik]

Comment 11

•

3 years ago

This proposal has been accepted.
Published proposal doc: https://docs.google.com/document/d/e/2PACX-1vTL4YVxNkUQx-yJGjc_DTc9Dpeyf2LcJx0z2W31JkBCOHIuS9dB3kEA57DjkFbuDFztzRggT_O2m3IF/pub

I'm going to file bugs for implementation.

Status: ASSIGNED → RESOLVED

Closed: 3 years ago

Resolution: --- → FIXED

Jan-Erik Rediger [:janerik]

Updated

•

3 years ago

Blocks: 1758967

Beatriz Rizental [:brizental]

Updated

•

3 years ago

Blocks: 1761247

Chris H-C :chutten

Assignee

Updated

•

2 years ago

Blocks: 1831905