Closed Bug 1968533 Opened 7 months ago Closed 6 months ago

InvalidLabel errors for `widget.ime_name_on_windows`

Categories

(Data Platform and Tools :: Glean: SDK, task, P1)

Tracking

(firefox141 fixed)

RESOLVED FIXED

People

(Reporter: chutten|PTO, Assigned: chutten|PTO)

References

(Blocks 1 open bug)

Attachments

(1 file)

We're seeing invalid_label metric errors for the widget.ime_name_on_windows metric. This hits about 4% of all clients across channels.

Error: invalid_label
Channel: Nightly
Date range: 2025-05-19 to 2025-05-26
Graph: https://sql.telemetry.mozilla.org/queries/105105/source?p_appid=firefox_desktop&p_channel=nightly&p_date_range=2025-05-19--2025-05-26&p_error_type=invalid_label&p_metric=widget.ime_name_on_windows#258708

  • This is usually due to labels being too long. The label limit is 111 characters, so these would have to be pretty egregiously long. The limit was recently raised from 71, so I'd have expected the number of affected clients to decrease (which it didn't).
  • This could also happen if there's non-printable ASCII or non-ASCII characters in the label. Which I guess is very possible for IMEs?

See also the error reporting docs.

I will take a look at the labels reported by the Legacy Telemetry mirror to see if the differences in the two systems can reveal some hints.

Top IME name on Windows from "main" pings received from Firefox Desktop Nightly clients on May 24? 0x0804|微软拼音 (query)

That won't fit in "111 characters of printable ASCII", as it's not ASCII at all.
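
To make the failure concrete, here is a minimal sketch of the check as described in this bug: at most 111 characters, printable ASCII only. It is not Glean's actual implementation. The function name `label_error`, the returned strings, and the English-language sample label are made up for illustration.

```python
from typing import Optional

MAX_LABEL_LENGTH = 111  # limit quoted in this bug (previously 71)


def label_error(label: str) -> Optional[str]:
    """Return why a label would be rejected, or None if it passes.

    Hypothetical helper: it only mirrors the two rules described in this bug.
    """
    if len(label) > MAX_LABEL_LENGTH:
        return "too long"
    # Printable ASCII runs from 0x20 (space) through 0x7E (tilde).
    if any(not (0x20 <= ord(ch) <= 0x7E) for ch in label):
        return "not printable ASCII"
    return None


print(label_error("0x0409|Microsoft IME"))  # made-up ASCII label -> None
print(label_error("0x0804|微软拼音"))        # label from this bug -> not printable ASCII
```

The reported label is only 11 characters long, so the length limit is not what rejects it. The character-set rule is.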

...guess we're going to need to either

  1. Change this to something that isn't a labeled_* metric
    • This might be a good change for the instrumentation as it's compound data: it's a |-delimited pair of "locale id" and "ime name". It could be nicer to analyze if those two fields were split out (maybe as event metric extra keys, or parts of an object metric's structure)
  2. Accept non-ASCII characters in labels
    • The original label regex was abandoned in bug 1672273 with a proposal to widen acceptable characters to "anything printable". We obviously didn't take it that wide, but we could've then, and the second best time might be now.

In the fullness of time, doing both would probably be best. For now, I think #2 is the way we'll have to go, as we need to maintain the Scalar mirroring until Legacy Telemetry is removed.
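
For reference, the split suggested under option 1 could look roughly like the sketch below. It assumes the label is always a hexadecimal locale id, a `|`, and then the IME name, which is inferred from the single example in this bug. `ImeLabel`, its field names, and `parse_ime_label` are hypothetical and not part of any existing API.

```python
from typing import NamedTuple


class ImeLabel(NamedTuple):
    """Hypothetical structured form of the compound label."""
    locale_id: int
    ime_name: str


def parse_ime_label(label: str) -> ImeLabel:
    # Split on the first "|" only, in case an IME name itself contains one.
    locale_hex, _, name = label.partition("|")
    # int(..., 16) accepts the "0x" prefix used in the reported label.
    return ImeLabel(locale_id=int(locale_hex, 16), ime_name=name)


parsed = parse_ime_label("0x0804|微软拼音")
print(hex(parsed.locale_id), parsed.ime_name)
```

With the two fields separated, the locale id stays ASCII-safe and only the IME name needs to carry non-ASCII text.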

See Also: → 1954805, 1945220

The ping's notifications go to Masayuki.

Flags: needinfo?(masayuki)

Oh, aren't non-ASCII characters allowed? I didn't know that because the Telemetry API didn't cause any errors. As you see, we need to collect non-ASCII IME names because we cannot know the English name from the localized IME name. So, it'd be really helpful to make it accept non-ASCII characters if that's possible.

Flags: needinfo?(masayuki)

The Telemetry Scalar API for keys just accepts bytes which it interprets as UTF-16. Glean's labeled_* metrics have specific limits on labels and we thought we were in the clear when we switched it over, but we didn't account for this.

So, yeah, we're definitely going to support UTF-8 here. The only reasons we didn't do this in the past were 1) an erroneous assumption that labels had to adhere to BigQuery's column identifier format, and 2) keeping the rules consistent between dynamic labels like these and static labels. Static labels are specified ahead of time in a list in a metrics.yaml file and generate enums, which means trying to figure out what to do with non-identifier-safe characters.

chutten merged PR [mozilla/glean]: bug 1968533 - Permit dynamic labels to have non-ASCII characters (#3145) in 16ea839.

We're likely to cut a new Glean SDK release and get it into m-c real soon, but until we do we can't expect these errors to decrease in volume. I don't think there's any rush to get this in faster as the data being sent through Legacy is unaffected, so analyses are uninterrupted.

I'll leave this bug open until we get confirmation that the error rates go down.

Severity: -- → S4

The PR was included in Glean SDK v64.4.0, which was vendored in bug 1968902 and shipped in nightlies from 20250602211130 onwards. According to https://sql.telemetry.mozilla.org/queries/108507/source#266576 absolutely all invalid_label errors ceased in nightlies from then onwards.

I'd say this has now been taken care of : )

Status: ASSIGNED → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Component: Widget: Win32 → Glean: SDK
Product: Core → Data Platform and Tools