Open Bug 1945220 Opened 6 months ago Updated 2 months ago

Consider a glean_parser lint that'll fail if a metric name or category, or a label or extra key, or a part of the structure of an object metric is a reserved word in one of the SDK's supported languages

Categories

(Data Platform and Tools :: Glean: SDK, enhancement, P4)

enhancement

Tracking

(Not tracked)

People

(Reporter: chutten, Unassigned)

References

Details

We've managed so far to avoid having to deal with "the reserved word problem" through luck. Developers have been very good at naming their metric categories and names specifically, and the compiler has been good at ensuring that they don't miss something.

For labels and event extras we've allowed reserved words, supporting this capability through prefixes. The rationale was that though type is an unimaginative metric name or category, inside the context of a labeled metric's labels or an event's extras, any qualification would be duplicative (e.g. if the event is search.performed then forcing folks to call extra keys search_type and search_class is unnecessarily strict (and may fall afoul of our 'no common prefix' lint, if applied to metric names).

Rust and Swift have affordances for identifiers that shadow reserved words. Swift, though, gets verbose if you escape an identifier and it isn't a reserved word, so applying it universally would get us into trouble with the iOS devs whose builds suddenly get filled with warnings on code they didn't write.

After discussion in the Glean SDK Meeting, we've decided to add a glean_parser glint for tokens that use reserved words. We'll have to collect and maintain a list of all reserved words from all programming languages we support (including languages like C++ which we don't really support in the SDK, but kinda-sorta do anyway). This will be a lint so that it can be suppressed if 1) instrumentation engineers know that their metrics will never be used in a given language (e.g. Python-only projects can use e.g. type which is reserved in Rust), or 2) migrating existing tokens that are reserved words and currently work because they implicitly rely on this assumption.

We have not 100% settled on the full list of names that will or will not be subject to this lint. Definitely metric categories and metric names. Maybe also event extra keys and object metric fields. Maybe labels. (It follows that we haven't settled on whether we want to unprefix any of these fields that currently get prefixed as a result. This is complicated by C++'s use of unscoped preprocessor defines (see bug 1672273))

We have not decided whether to put the C++ reserved keyword list inside glean_parser or merely make it easy for applications to extend the list when running glean_parser (like we do for version-based expiry).

We have not decided on how specific to make the glint message, but I propose we should mention which programming language(s) reserve the word that we'd like to forbid so that devs understand what's going on.

Duplicate of this bug: 1944067

This seems useful and somewhat lower effort, keeping this out of the backlog in P3 for now.

Priority: -- → P3
See Also: 1944067
Priority: P3 → P4
See Also: → 1968533
You need to log in before you can comment on or make changes to this bug.