Handle unicode headers for tentative setExtraHeaders test
Categories
(Remote Protocol :: WebDriver BiDi, defect, P2)
Tracking
(firefox150 fixed)
| Tracking | Status | |
|---|---|---|
| firefox150 | --- | fixed |
People
(Reporter: jdescottes, Assigned: jdescottes)
References
Details
(Whiteboard: [webdriver:m19], [wptsync upstream][webdriver:relnote])
Attachments
(3 files)
At the moment we only fail two test cases from the setExtraHeaders test suite, when setting the Unicode header value '你好世界'.
Comment 2•4 months ago (Assignee)
Hi Kershaw,
In BiDi we are using nsIHttpChannel.setRequestHeader to modify request headers, and there is a web-platform test which asserts that we can set a header with the value 你好世界. That doesn't seem to work with setRequestHeader (see attached test): the value stored is "`}\u0016L".
At this point I'm not sure if this is a bug in setRequestHeader, if I should encode the header value before calling setRequestHeader, or if this value is simply not acceptable for a header. Let me know what you think!
Comment 4•4 months ago
(In reply to Julian Descottes [:jdescottes] from comment #2)
The issue is in how XPConnect converts a JavaScript string to ACString (the IDL type of setRequestHeader's value parameter). ACString is a raw byte string with no encoding. When XPConnect converts a JS string to ACString, it calls JS_EncodeStringToBuffer, which truncates each UTF-16 code unit to its low byte (buffer[i] = char(src[i])). So "你好世界" (U+4F60, U+597D, U+4E16, U+754C) becomes the four garbage bytes "`}\x16L" (0x60 0x7D 0x16 0x4C), which are just the low byte of each code unit.
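The truncation described above can be reproduced in plain JavaScript. This is only an illustrative sketch: `truncateToLowBytes` is a hypothetical helper that mimics the described behavior, not the actual XPConnect code.

```javascript
// Hypothetical helper: keep only the low byte of each UTF-16 code unit,
// mimicking the lossy JS string -> ACString conversion described above.
function truncateToLowBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  return bytes;
}

const lowBytes = truncateToLowBytes("你好世界");

// U+4F60 -> 0x60, U+597D -> 0x7D, U+4E16 -> 0x16, U+754C -> 0x4C
const asString = String.fromCharCode(...lowBytes);
console.log(asString === "`}\u0016L"); // true
```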
On the HTTP spec side, RFC 9110 §5.5 allows bytes in the range %x80-FF in header values, and only forbids CR/LF/NUL.
I assume what the WPT expects is the UTF-8 encoding of "你好世界" as raw bytes?
If so, I think the fix could be to add another API that uses AUTF8String instead of ACString for the value parameter.
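For comparison, here is a sketch of the other option raised in comment 2, encoding the value before calling setRequestHeader. It is not the fix proposed here, and `toUtf8ByteString` is a hypothetical helper. It assumes the low-byte truncation described above, so that a string with one code unit per UTF-8 byte would survive the ACString conversion intact.

```javascript
// Hypothetical helper: encode a JS string as UTF-8 and return a "byte string"
// in which every code unit is in the range 0x00-0xFF (one per UTF-8 byte).
function toUtf8ByteString(str) {
  const bytes = new TextEncoder().encode(str);
  let out = "";
  for (const b of bytes) {
    out += String.fromCharCode(b);
  }
  return out;
}

const byteString = toUtf8ByteString("你好世界");

// UTF-8 bytes: E4 BD A0, E5 A5 BD, E4 B8 96, E7 95 8C
const hex = Array.from(byteString, (c) =>
  c.charCodeAt(0).toString(16).padStart(2, "0")
).join(" ");
console.log(hex); // e4 bd a0 e5 a5 bd e4 b8 96 e7 95 8c
```

A dedicated AUTF8String API, as suggested above, would avoid asking every caller to pre-encode values this way.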
Comment 7•4 months ago (Assignee)
Reopening the bug since Bug 2020207 will revert the fix. We need to find a solution for this which doesn't regress other consumers setting or reading headers.