xhr with range header gets unreliable Content-Length (and different from other browsers) for 'text/plain' files
Categories
(Core :: DOM: Networking, defect, P2)
Tracking
()
People
(Reporter: piovesan.carlo, Unassigned)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [necko-triaged][necko-priority-next])
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Steps to reproduce:
Go to the browser console, run:
var xhr = new XMLHttpRequest();
xhr.open("HEAD", "https://raw.githubusercontent.com/duckdb/duckdb_spatial/main/test/data/nyc_taxi/taxi_zones/taxi_zones.prj", false);
xhr.setRequestHeader('Range', 'bytes=0-');
xhr.onload = () => { console.log(xhr.getResponseHeader('Content-Length')); };
xhr.send(null);
This performs a synchronous XHR request (the third argument to open() is false) while providing the Range header.
Trying the same for a different
Alternatively I set up this test website:
https://carlopi.github.io/content-length-test/
that performs the same computations on a bunch of combinations (HEAD/GET, ranges or not).
Originally reported here: https://github.com/duckdb/duckdb-wasm/issues/1580
Actual results:
For the console test:
Firefox returns 347 (the size of the compressed file) while Chromium/Safari return 562 (the size of the file after decompression).
Using the test website, on Firefox the first few lines will be like:
347 using HEAD + RANGE
347 using GET + RANGE
562 using GET's arraybuffer's byteLength
while on Chrome/Safari they will be:
562 using GET's arraybuffer's byteLength
562 using HEAD + RANGE
562 using GET + RANGE
Expected results:
I would expect the results to match between browsers, and in particular that HEAD requests performed with a Range header attached return the actual file length (when decompressed).
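To illustrate why the two numbers can differ, here is a minimal Node.js sketch (not the reporter's test page). It assumes the server answers with gzip content coding; the body text is a made-up stand-in, not the real .prj file. It only shows that a Content-Length for a gzip-encoded response counts compressed bytes, while the decoded body the page sees is larger.

```javascript
// Node.js sketch: compare the size of a body before and after gzip.
const zlib = require('zlib');

// Hypothetical stand-in for a small text file; the content is made up.
const body = Buffer.from('PROJCS["example",GEOGCS["example"]]\n'.repeat(16));

// What a server would put on the wire with Content-Encoding: gzip.
const compressed = zlib.gzipSync(body);

// Content-Length of the gzip response counts the compressed bytes,
// while the page sees the decoded bytes (e.g. an arraybuffer's byteLength).
const contentLengthGzip = compressed.length;
const decodedLength = zlib.gunzipSync(compressed).length;

console.log({ contentLengthGzip, decodedLength, identityLength: body.length });
```

With identity encoding the two lengths coincide, which is what the HEAD + RANGE case is expected to report.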
Reporter | ||
Comment 1•4 months ago
A user of duckdb-wasm reported that it works for them ("I'm running Firefox 115 ESR on my Mac"), so this might be a regression, but I haven't reproduced that.
Comment 2•4 months ago
The Bugbug bot thinks this bug should belong to the 'Core::DOM: Networking' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Updated•4 months ago
Comment 3•4 months ago
Firefox requests gzip Content-Encoding (via Accept-Encoding request header) while Chrome uses identity encoding. Chrome is right according to the spec.
https://fetch.spec.whatwg.org/#http-network-or-cache-fetch (Step 8.19.)
If httpRequest’s header list contains Range, then append (Accept-Encoding, identity) to httpRequest’s header list.
[Note] This avoids a failure when handling content codings with a part of an encoded response. Additionally, many servers mistakenly ignore Range headers if a non-identity encoding is accepted.
Apparently we fail to handle the case where XHR or fetch adds a Range request header.
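As a rough sketch of the quoted spec step (not Firefox's or Chrome's actual code), the header list can be modelled as an array of [name, value] pairs. The helper names containsHeader and applyRangeStep are hypothetical, chosen only for this illustration.

```javascript
// Header list modelled as an array of [name, value] pairs (an assumption
// for this sketch; real implementations use their own structures).
function containsHeader(list, name) {
  const wanted = name.toLowerCase();
  return list.some(([n]) => n.toLowerCase() === wanted);
}

// Sketch of the quoted step: if the list contains Range,
// append (Accept-Encoding, identity) to it.
function applyRangeStep(list) {
  const result = list.slice(); // do not mutate the caller's list
  if (containsHeader(result, 'Range')) {
    result.push(['Accept-Encoding', 'identity']);
  }
  return result;
}

const withRange = applyRangeStep([['Range', 'bytes=0-']]);
const withoutRange = applyRangeStep([['Accept', '*/*']]);
console.log(withRange, withoutRange);
```

Only the request that carries a Range header gains the identity entry; the other list is returned unchanged.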
Comment 4•4 months ago
By the way, I got this result with reporter's testcase on Chrome.
562 using HEAD + RANGE
562 using GET's arraybuffer's byteLength
347 using GET + RANGE
That is, Chrome did not handle GET + RANGE case correctly. I don't know the reason why I got a different result from reporter's one.
Reporter | ||
Comment 5•4 months ago
By the way, I got this result with reporter's testcase on Chrome.
562 using HEAD + RANGE
562 using GET's arraybuffer's byteLength
347 using GET + RANGE
That is, Chrome did not handle GET + RANGE case correctly. I don't know the reason why I got a different result from reporter's one.
Surprisingly, Chrome also behaves oddly here: with the cache disabled or cleared, the results are:
562 using HEAD + RANGE
562 using GET's arraybuffer's byteLength
562 using GET + RANGE
while once the cache kicks in they are (as you posted):
562 using HEAD + RANGE
562 using GET's arraybuffer's byteLength
347 using GET + RANGE
Safari appears to behave sensibly.
Comment 6•3 months ago
(In reply to Masatoshi Kimura [:emk] from comment #3)
Firefox requests gzip Content-Encoding (via Accept-Encoding request header) while Chrome uses identity encoding. Chrome is right according to the spec.
https://fetch.spec.whatwg.org/#http-network-or-cache-fetch (Step 8.19.)
If httpRequest’s header list contains Range, then append (Accept-Encoding, identity) to httpRequest’s header list.
[Note] This avoids a failure when handling content codings with a part of an encoded response. Additionally, many servers mistakenly ignore Range headers if a non-identity encoding is accepted.
Apparently we fail to handle the case where XHR or fetch adds a Range request header.
I see that we are sending the identity value along with gzip: when a Range header is present, we do append the identity value.
The spec says to "append" the value to the header list, so I think we are behaving as per the spec?
We will discuss this internally during our bug review meeting to decide on a further course of action.
The fix should be straightforward: we just need to set the header instead of merging.
However, I see that Chrome and Safari both send just identity for the Accept-Encoding request header.
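A small sketch of the difference between merging and setting, assuming headers are kept in a plain object keyed by lowercase name. The helpers mergeHeader and setHeader and the default Accept-Encoding value are hypothetical and do not come from the Necko source.

```javascript
// Headers as a plain object keyed by lowercase name (a simplification).
function mergeHeader(headers, name, value) {
  const key = name.toLowerCase();
  const existing = headers[key];
  // Merge: keep the existing value and add the new one after a comma.
  return { ...headers, [key]: existing ? `${existing}, ${value}` : value };
}

function setHeader(headers, name, value) {
  // Set: replace whatever value was there before.
  return { ...headers, [name.toLowerCase()]: value };
}

// Assumed default for illustration; the real default may differ.
const defaults = { 'accept-encoding': 'gzip, deflate, br' };

const merged = mergeHeader(defaults, 'Accept-Encoding', 'identity');
const replaced = setHeader(defaults, 'Accept-Encoding', 'identity');

console.log(merged['accept-encoding']);   // gzip, deflate, br, identity
console.log(replaced['accept-encoding']); // identity
```

With the merged value a server may still pick gzip, whereas the replaced value leaves identity as the only listed coding.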
Comment 7•3 months ago
(In reply to Sunil Mayya from comment #6)
The spec mentions to "append" the value into the header list and hence I think we are behaving as per the spec?
IMO it is a spec bug because the spec has the following note right after the text:
This avoids a failure when handling content codings with a part of an encoded response. Additionally, many servers mistakenly ignore Range headers if a non-identity encoding is accepted.
Yes, I know this is a non-normative note, but the current spec text does not resolve the problem that this note is concerned with.
Comment 8•3 months ago
Also, I fail to understand the definition of "append" in the spec:
To append a header (name, value) to a header list list:
1. If list contains name, then set name to the first such header’s name.
Note: This reuses the casing of the name of the header already in list, if any. If there are multiple matched headers their names will all be identical.
2. Append (name, value) to list.
If list contains name, then name is already a byte-case-insensitive match for that header’s name, so step 1 effectively looks like a no-op to me.
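One possible reading of the quoted steps, sketched under the assumption that "contains" is a byte-case-insensitive match as described above: step 1 changes only the casing of name, so it has a visible effect only when the new name is cased differently from the stored one. The function appendHeader is a hypothetical illustration, not code from the spec or from any browser.

```javascript
// Sketch of the quoted "append" steps on a list of [name, value] pairs.
// Header names are ASCII, so toLowerCase() stands in for a
// byte-case-insensitive comparison here.
function appendHeader(list, name, value) {
  const lower = name.toLowerCase();
  const existing = list.find(([n]) => n.toLowerCase() === lower);
  if (existing) {
    name = existing[0]; // step 1: reuse the casing already in the list
  }
  list.push([name, value]); // step 2: append (name, value)
  return list;
}

const headerList = [['Accept-Encoding', 'gzip']];
appendHeader(headerList, 'accept-encoding', 'identity');
console.log(headerList);
```

Under this reading the list ends up with two Accept-Encoding entries, gzip and identity, both using the original casing, so "append" alone does not remove the gzip value.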
Updated•3 months ago
Updated•3 days ago