Closed Bug 1653046 Opened 4 years ago Closed 4 years ago

"Invalid UTF-8 byte sequence in string" when setting a string to 'None' in Python

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: mhentges, Assigned: mdroettboom)

References

Details

(Whiteboard: [telemetry:glean-rs:m13])

Attachments

(2 files)

Link to GitHub pull-request: https://github.com/mozilla/glean/pull/1080 4 years ago GitHub Bugzilla PR Linker 42 bytes, text/x-github-pull-request		Details \| Review
Link to GitHub pull-request: https://github.com/mozilla/glean/pull/1081 4 years ago GitHub Bugzilla PR Linker 42 bytes, text/x-github-pull-request		Details \| Review

Mitchell Hentges [:mhentges] 🦀

Reporter

Description

•

4 years ago

•

Edited

I'm unsure what caused glean to go into this state, but when I submit a ping, I'm seeing the following error text:

ERROR glean_ffi::handlemap_ext] Glean failed (ErrorCode(42)): Invalid UTF-8 byte sequence in string

If there's a way I can get more useful information, let me know!

Mitchell Hentges [:mhentges] 🦀

Reporter

Comment 1

•

4 years ago

Ah, it's because I was setting None for a string value, e.g.:

# yaml
mhentges:
  test:
    type: string
    # ...

Glean.initialize(
    "testing",
    "0.1",
    True,
)
metrics = load_metrics("metrics.yaml")
pings = load_pings("pings.yaml")
metrics.mhentges.test.set(None)
pings.usage.submit()

Alessio Placitelli [:Dexter]

Comment 2

•

4 years ago

(In reply to Mitchell Hentges [:mhentges] 🦀 from comment #1)

Ah, it's because I was setting None for a string value, e.g.:

Ah! Congratulations, you found a bug in the Python bindings of the Glean SDK :-) We should catch this and report errors using Glean. We will fix this!

Fixing here probably means:

prioritize bug 1650781;
whenever None is attempted to be reported, record an invalid_value error.

Priority: -- → P3

Updated

•

4 years ago

Whiteboard: [telemetry:glean-rs:m?] → [telemetry:glean-rs:m13]

Michael Droettboom [:mdroettboom]

Assignee

Updated

•

4 years ago

Assignee: nobody → mdroettboom

Michael Droettboom [:mdroettboom]

Assignee

Comment 3

•

4 years ago

I think we should limit the scope of this bugfix somewhat (or at least do it in stages).

Type checking in Python is inherently a runtime thing, and is hard to do 100% without also being more exclusive than users will expect. This problem affects every one of Glean's API's (if you pass a non-integer to CounterMetric.add, for example) and we should probably step back and decide what guarantees we want to make before embarking on a large undertaking like that.
In this specific case, the code that translates Python strings to char * to go over FFI supports converting None to NULL, which is why the error rather obscure. There are only two FFI functions where that behavior is actually desirable / safe, so we should fix that. That will change the error to be a Python exception (but still a runtime exception, not a Glean error), but one that should be easier for Python developers to understand.
We don't currently support reporting errors from the language binding side. We'd have to catch the type mismatch on the Python side and then report the error from there over FFI.
Glean is pretty heavily mypy annotated, which would also have caught this class of thing. We should encourage our users to use mypy in our docs.

I'm going to propose doing (2) and (4) now, but deferring (1) and (3) as having much higher cost:benefit ratios.

GitHub Bugzilla PR Linker

Comment 4

•

4 years ago

Attached file Link to GitHub pull-request: https://github.com/mozilla/glean/pull/1080 — Details

GitHub Bugzilla PR Linker

Comment 5

•

4 years ago

Attached file Link to GitHub pull-request: https://github.com/mozilla/glean/pull/1081 — Details

Alessio Placitelli [:Dexter]

Comment 6

•

4 years ago

(In reply to Michael Droettboom [:mdroettboom] from comment #3)

I'm going to propose doing (2) and (4) now, but deferring (1) and (3) as having much higher cost:benefit ratios.

Yes, I like and support this approach :)

Michael Droettboom [:mdroettboom]

Assignee

Comment 7

•

4 years ago

I filed bug 1654404 for the follow-up with (1) and (3).

Status: NEW → RESOLVED

Closed: 4 years ago

Resolution: --- → FIXED

Alessio Placitelli [:Dexter]

Updated

•

3 years ago

Bugzilla

Quick Search

"Invalid UTF-8 byte sequence in string" when setting a string to 'None' in Python

Categories

(Data Platform and Tools :: Glean: SDK, defect, P3)

Tracking

(Not tracked)

People

(Reporter: mhentges, Assigned: mdroettboom)

References

Details

(Whiteboard: [telemetry:glean-rs:m13])

Crash Data

Security

(public)

User Story

Attachments

(2 files)

Description

Comment 1

Comment 2

Updated

Updated

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Updated

Attachment

General

Description

File Name

Content Type