Closed Bug 1986022 Opened 3 months ago Closed 2 months ago

Encoding of non-ASCII characters in response bodies is broken

Categories

(Remote Protocol :: WebDriver BiDi, defect, P3)

defect
Points:
2

Tracking

(firefox145 fixed)

RESOLVED FIXED
145 Branch
Tracking Status
firefox145 --- fixed

People

(Reporter: hbenl, Assigned: jdescottes)

References

Details

(Whiteboard: [webdriver:m17][webdriver:relnote])

Attachments

(2 files)

This Playwright test fails when I try it with this PR and when I look at the raw BiDi messages I see that the character Ü is represented by the bytes C3 83 C2 9C. This looks like a double encoding (in node.js Buffer.from(Buffer.from("Ü", "utf8").toString("latin1"), "utf8") gives me <Buffer c3 83 c2 9c>).

Severity: -- → S3
Points: --- → 2
Priority: -- → P2
Whiteboard: [webdriver:m18]
Priority: P2 → P3

Had a quick look, it seems we actually don't get the charset for the response, and we fail to decode the stream properly.
The channel's contentCharset is not set for this response. It seems in most spots DevTools falls back to UTF-8 when the charset is not set, so maybe we should do the same here.

Assignee: nobody → jdescottes
Status: NEW → ASSIGNED

This can be entirely fixed in devtools, but I'll try to add a wdspec test for it anyway.

Whiteboard: [webdriver:m18] → [webdriver:m17]

I wrote a wdspec test, but I actually see that Chrome has the same behavior as current Firefox so we might need a spec discussion about what to do for responses which don't define a charset.

Here maybe you could propose a PR to fix the test with

  server.setRoute('/empty.html', (req, res) => {
    res.statusCode = 200;
    res.setHeader('content-type', 'text/plain; charset=utf-8');
    res.end(Buffer.from('Ü (lowercase ü)', 'utf-8'));
  });

?

Flags: needinfo?(hbenl)

In the meantime I filed https://github.com/w3c/webdriver-bidi/issues/1005
The wdspec test is up at https://github.com/web-platform-tests/wpt/pull/54928 for discussion.

I will remove the patches from the review queue until we have an agreement.

Here maybe you could propose a PR to fix the test

I'd prefer to keep the test failing since it shows a real issue.

Flags: needinfo?(hbenl)
See Also: → 1989404
Pushed by jdescottes@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/6490a19899f5 https://hg.mozilla.org/integration/autoland/rev/e447db312c3a [devtools] Default to UTF-8 when charset is missing for NetworkUtils.decodeResponseChunks r=bomsy,devtools-reviewers https://github.com/mozilla-firefox/firefox/commit/ae453557a5d0 https://hg.mozilla.org/integration/autoland/rev/da770b22c2f1 [bidi] Always use UTF-8 to decode bytes for network.getData r=webdriver-reviewers,whimboo
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 145 Branch
Whiteboard: [webdriver:m17] → [webdriver:m17][webdriver:relnote]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: