Closed Bug 1182869 Opened 9 years ago Closed 5 years ago

When the server sends only part of the file with a Content-Range header, the rest of the file is not fetched

Categories

(Core :: Audio/Video: Playback, defect, P3)

x86
Windows 7
defect

RESOLVED WORKSFORME

People

(Reporter: madis.parn, Assigned: jya)

Details

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36

Steps to reproduce:

Go to https://evening-stream-7636.herokuapp.com/



Actual results:

Only the first part of the audio is fetched, and 0:31 is shown as the total duration.


Expected results:

Audio length should show as 3:34
Chrome and IE11 are able to play the whole file.

Firefox request:
GET https://evening-stream-7636.herokuapp.com/audio.mp3
Host: evening-stream-7636.herokuapp.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:41.0) Gecko/20100101 Firefox/41.0
Accept: audio/webm,audio/ogg,audio/wav,audio/*;q=0.9,application/ogg;q=0.7,video/*;q=0.6,*/*;q=0.5
Accept-Language: en-US,en;q=0.5
Referer: https://evening-stream-7636.herokuapp.com/
Range: bytes=0-
Connection: keep-alive

Response (206):
Accept-Ranges: bytes
Cache-Control: no-cache, no-store, must-revalidate
Connection: keep-alive
Content-Length: 500000
Content-Range: bytes 0-499999/3425149
Content-Type: application/mpeg
Date: Sun, 12 Jul 2015 12:16:56 GMT
Server: Cowboy
Via: 1.1 vegur

Although the entity length 3425149 is larger than the response length 500000, no further requests are made.
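To make the mismatch concrete, here is a minimal Python sketch that reads the Content-Range value from the response above and reports how many bytes were not delivered. The helper names `parse_content_range` and `bytes_remaining` are hypothetical and only illustrate the arithmetic; they are not Firefox code.

```python
import re

# Hypothetical helpers for illustration only (not Firefox code).
# Matches values such as "bytes 0-499999/3425149" or "bytes 0-499999/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value):
    """Return (first_byte, last_byte, total_length or None)."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Content-Range value: %r" % value)
    first = int(match.group(1))
    last = int(match.group(2))
    # "*" means the server did not state the complete length.
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


def bytes_remaining(value):
    """Bytes of the resource that lie after this response, or None if unknown."""
    _first, last, total = parse_content_range(value)
    if total is None:
        return None
    return total - (last + 1)


if __name__ == "__main__":
    # Header value taken from the 206 response quoted in this report.
    print(bytes_remaining("bytes 0-499999/3425149"))
```

For the response in this report the result is 3425149 - 500000 = 2925149 bytes that were never requested.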
OS: Unspecified → Windows 7
Hardware: Unspecified → x86
Component: Untriaged → Video/Audio
Product: Firefox → Core
Component: Audio/Video → Audio/Video: Playback
So if I understand correctly, the server only sends half the bytes, but the author wants us to use the declared size in the Content-Range header, not the duration of the code segment which is actually downloadable.

I'm not sure what makes sense to do in a case like this. Patches welcome, or maybe get input from the network team?
Priority: -- → P5
The problem is that the browser does not download the whole content, but only the first chunk.
Jason - is this a networking issue?
Flags: needinfo?(jduell.mcbugs)
So the GET request in comment 1 includes

   Range: bytes=0-

which makes it legal for a server to reply with a 206 (and that response may be only part of the content, as it is here, IIUC: RFC 7233 says the server SHOULD send the exact range asked for, which would be all of the file, but it's clearly not doing that).

I'm wondering why we're using the Range 0- thing here.  If we omitted it, I suspect we'd get the whole file.  From a glance over nsHttpChannel.cpp, it looks like the most likely scenarios are that the media code has called channel.resumeAt(0), or possibly just manually set the Range header (which would be hacky indeed), or.. perhaps there's some other way of getting a Range header added automatically to a channel (possibly if we had a partial cache entry already, but I suspect that's not what's going on here).

Daniel, do you have any ideas here? Can you poke around and see if you can reproduce?
Flags: needinfo?(jduell.mcbugs) → needinfo?(daniel)
It seems to be slightly more complicated than that, even. This server UNCONDITIONALLY responds with that 206 to a normal GET request too, even without a "Range: bytes=0-" header. Even without any Range: request at all.

A plain curl command line like "curl -v https://evening-stream-7636.herokuapp.com/audio.mp3 -O" shows it, and it too will only get the 500000 initial bytes.

Not that it helps, but this is an HTTP protocol violation. RFC 7233 section 4.1: "The 206 (Partial Content) status code indicates that the server is successfully fulfilling a range request"

Hence, the problem is not the request as it will get that 206 back no matter what. The problem is how to act on the response and that feels perhaps more related to how the <audio> tag is supposed to work?
Flags: needinfo?(daniel)
How strange.  Daniel, can you verify that Chrome and IE11 also see a partial 206 response, and then (I assume) automatically issue new requests with Range headers to get the rest of the file?

This looks like a misbehaving web server to me, and I hate to bake in logic that works around that unless we've got proof that other browsers are already doing it.

(And I suspect we'd want to handle the reissue at the media code layer, not the necko layer: coalescing multiple channels into a "single" one sounds like a bad idea).
Flags: needinfo?(daniel)
I've now tried the same URL with Internet Explorer 11 (on Windows 7) and Chrome 47 (on Linux) and this is the outcome:

They both do almost the identical requests (and get the same 500K response) as Firefox does.

Chrome then displays 3:34 in the little player bar (but without loading more initial data than Firefox did). IE displays 0:00 and if I hover the mouse over the time display it shows "-30s" which I take as a hint that it has 30 seconds of data downloaded/buffered - this thus matches roughly what Firefox has.

The problem seems to be what the <audio> tag displays or what it doesn't figure out from the data Firefox has downloaded for it. It does not seem to be an actual network related issue as far as I can see.
Flags: needinfo?(daniel)
I would also like to note that IE 11 plays the whole 3:34 of the chunked mp3. The "-30s" displayed when hovering over the duration is the amount the player skips back after clicking on the area.

Also, when using JavaScript
document.querySelector('audio').duration
the results are
Firefox: 31.233296
Chrome: 214.055375
IE 11:  214.055375
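As a rough sanity check on these numbers, the following Python sketch scales the full duration by the fraction of bytes delivered in the first response. It assumes a constant bitrate and ignores any tag or header bytes in the file, neither of which is confirmed in this report; `estimated_duration` is a hypothetical name used only for illustration.

```python
def estimated_duration(bytes_available, total_bytes, full_duration_s):
    """Duration covered by bytes_available, assuming a constant bitrate."""
    return full_duration_s * bytes_available / total_bytes


if __name__ == "__main__":
    # 500000 bytes received, 3425149 bytes declared in Content-Range,
    # 214.055375 s reported by Chrome and IE 11.
    print(round(estimated_duration(500000, 3425149, 214.055375), 2))
```

Under those assumptions the estimate comes out at a little over 31 seconds, which is in line with the 31.233296 that Firefox reports, so the Firefox value appears to reflect only the first 500000 bytes.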
Right, I should add that I've only investigated and compared what the three browsers load when "going to" that page. I've not actually played anything. Clearly the other two browsers make further requests to get the rest of the data if the audio is actually played.
Anthony,

So it looks like you've got a misbehaving website (sends only part of the data even though we're not using Range), yet IE/Chrome's media layers are doing a better job of determining that part of the file is missing and then fetching the remainder (I assume using Range requests: I'll ask Daniel to verify that part).   So yes, you have a networking issue, but the fix is going to need to be that the media code gets smarter about detecting that only part of the file was downloaded and asks for the rest of it until it gets the whole thing.
Flags: needinfo?(ajones)
Daniel: one more thing. Can you fire off a cURL request and verify that using Range: 500K- gets the rest of the file?  I mean, I assume that's what's happening for IE/Chrome, but I want to verify it if possible.  Thanks!
Flags: needinfo?(daniel)
If I ask for 500K and onwards, I get the next 500K:

$ curl -v https://evening-stream-7636.herokuapp.com/audio.mp3 -O -r 500000-

Sends this:

> GET /audio.mp3 HTTP/1.1
> Host: evening-stream-7636.herokuapp.com
> Range: bytes=500000-

... and it gets this back:

< HTTP/1.1 206 Partial Content
< Server: Cowboy
< Content-Type: application/mpeg
< Accept-Ranges: bytes
< Content-Range: bytes 500000-999999/3425149
< Content-Length: 500000

(I trimmed the headers a bit for brevity)

and on it goes like that. If I bump the range request another 500K I get the subsequent 500K of data etc.
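For illustration, the sequence of open-ended Range requests a client would need can be planned like this in Python. This is a sketch that assumes the server keeps capping every response at 500000 bytes, which is only what the responses quoted here suggest; `follow_up_ranges` is a hypothetical helper, and no network access is involved.

```python
def follow_up_ranges(total_size, chunk_size, start=0):
    """List the Range header values needed to cover [start, total_size).

    Assumes (hypothetically) that the server answers every open-ended
    range with at most chunk_size bytes, as observed in this bug.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    offset = start
    while offset < total_size:
        ranges.append("bytes=%d-" % offset)
        offset += chunk_size
    return ranges


if __name__ == "__main__":
    # Sizes taken from the Content-Range headers quoted in this bug.
    for header in follow_up_ranges(3425149, 500000):
        print("Range:", header)
```

With the sizes from this bug that gives seven requests, the last one starting at byte 3000000.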
Flags: needinfo?(daniel)
Robert - what do you make of this?
Flags: needinfo?(roc)
Priority: P5 → P2
I see that we get "Content-Range: bytes 0-499999/3425149". I guess we need to use that as a cue that more data can be loaded.

Currently when we get OnStopRequest we treat that as a definitive statement that the resource has ended. I think in this case, when Content-Range tells us the request will not read all the resource, we should remember that in MediaChannelResource and then when it gets OnStopRequest, pass a flag to MediaCacheStream::NotifyDataEnded saying that this isn't really the end. Then MediaCacheStream::NotifyDataEnded in that case should not set stream->mStreamLength and should not do any of the stuff with stream->mDidNotifyDataEnded.
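A rough Python model of that idea, for discussion only. This is not Gecko code: the classes below are hypothetical stand-ins for MediaChannelResource and MediaCacheStream, and the flag name `really_ended` is an assumption. Only the behaviour described in this comment is modelled.

```python
class CacheStreamSketch:
    """Hypothetical stand-in for MediaCacheStream."""

    def __init__(self):
        self.bytes_received = 0
        self.stream_length = None           # models mStreamLength
        self.did_notify_data_ended = False  # models mDidNotifyDataEnded

    def notify_data_received(self, count):
        self.bytes_received += count

    def notify_data_ended(self, really_ended):
        # When the channel stopped but Content-Range said more data exists,
        # do not fix the stream length or mark the data as ended.
        if not really_ended:
            return
        self.stream_length = self.bytes_received
        self.did_notify_data_ended = True


class ChannelResourceSketch:
    """Hypothetical stand-in for MediaChannelResource."""

    def __init__(self, stream):
        self.stream = stream
        self.more_data_expected = False

    def on_start_request(self, content_range):
        # content_range is (first, last, total) or None when the header is absent.
        self.more_data_expected = False
        if content_range is not None:
            _first, last, total = content_range
            self.more_data_expected = total is not None and last + 1 < total

    def on_data_available(self, count):
        self.stream.notify_data_received(count)

    def on_stop_request(self):
        self.stream.notify_data_ended(really_ended=not self.more_data_expected)


if __name__ == "__main__":
    stream = CacheStreamSketch()
    resource = ChannelResourceSketch(stream)
    resource.on_start_request((0, 499999, 3425149))
    resource.on_data_available(500000)
    resource.on_stop_request()
    print(stream.stream_length, stream.did_notify_data_ended)
```

In this model the partial 206 response leaves the stream length unset, so a later range request could still extend the stream.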
Flags: needinfo?(roc)
Flags: needinfo?(ajones)
Flags: needinfo?(ajones)
Assignee: nobody → jyavenard
My first reaction is to close it as invalid... but hey, why not!?
Flags: needinfo?(ajones)
Is the server behaving like this to prevent some utilities from downloading the full content?
curl only gets the first 500k bytes.
wget just fails.

Is there any particular reason it's configured to do what it does? Which web server is it?

interesting...
Flags: needinfo?(madis.parn)
Mass change P2 -> P3
Priority: P2 → P3
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

Reporter has gone silent; do we want to close this or not?

Flags: needinfo?(madis.parn) → needinfo?(jyavenard)

WFM, proper size is returned. The code handling that has changed a lot since reported.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: needinfo?(jyavenard)
Resolution: --- → WORKSFORME