Closed
Bug 1378228
Opened 7 years ago
Closed 3 months ago
XHR Range Requests on LARGE local files (via file://) takes forever to return and sometimes freeze up the browser
Categories
(Core :: DOM: Networking, defect, P2)
RESOLVED
INVALID
People
(Reporter: sharun.msgs, Unassigned)
Details
(Whiteboard: [domcore-bugbash-triaged])
Attachments
(2 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36
Steps to reproduce:
Doing an XHR Range Request on large local files (I am using 50GB Wikipedia dumps) takes forever to return. Firefox appears to search for the end of the file before returning a response. A couple of weeks back, it would try to read the entire file into memory and the browser would hang. It no longer seems to do that, but given how long the request takes to return, it is probably still searching for the end of the file.
My code just needs the file metadata in the header (first 80 bytes) of these files, and it uses the basic XHR function below to get it. This works instantly in Chrome.
function read(bytestart, byteend, oncomplete){
    var req = new XMLHttpRequest();
    // oncomplete runs once the response has loaded
    req.addEventListener("load", oncomplete);
    req.open('GET', params["archive"], true);
    req.overrideMimeType('text\/plain; charset=x-user-defined');
    req.responseType = "arraybuffer";
    // Ask only for the bytes from bytestart to byteend
    req.setRequestHeader('Range', 'bytes='+bytestart+'-'+byteend);
    try {
        req.send(null);
    } catch (ex) {
        console.log(ex);
    }
}
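For reference, the byte positions in a Range header are inclusive, so "the first 80 bytes" corresponds to bytes=0-79. A minimal sketch of building that value follows; rangeHeaderValue and headerRange are hypothetical helper names, not part of the reported code.

```javascript
// Hypothetical helper (not from the report): builds a Range header value
// for an inclusive span of byte positions.
function rangeHeaderValue(firstByte, lastByte) {
  if (!Number.isInteger(firstByte) || !Number.isInteger(lastByte)) {
    throw new TypeError("byte positions must be integers");
  }
  if (firstByte < 0 || lastByte < firstByte) {
    throw new RangeError("expected 0 <= firstByte <= lastByte");
  }
  return "bytes=" + firstByte + "-" + lastByte;
}

// The first `length` bytes of a file occupy positions 0 .. length - 1.
function headerRange(length) {
  return rangeHeaderValue(0, length - 1);
}
```

With these, read(0, 79, oncomplete) and headerRange(80) describe the same 80-byte span.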
Updated•7 years ago
Component: Untriaged → DOM
Product: Firefox → Core
Comment 1•7 years ago
baku, do you know what's up here? Something to do with file:// sandboxing?
Flags: needinfo?(amarchesini)
Comment 2•7 years ago
Hi, thanks for this bug report. But I cannot reproduce it. Can you please send me a testcase? Plus, I would like to know a couple of things: 1. are you using Firefox 54? Or nightly? 2. In e10s mode or not? 3. Is the file a zip file?
Flags: needinfo?(amarchesini) → needinfo?(sharun.msgs)
Reporter
Comment 3•7 years ago
Flags: needinfo?(sharun.msgs)
Reporter
Comment 4•7 years ago
1. I am using Firefox 54.0.1 32-bit on Windows 10.
2. Not using nightly. I don't know what e10s mode is.
3. The file I am using is a ZIM file, parts of which are compressed [ http://www.openzim.org/wiki/ZIM_file_format ]. It is the format used by Kiwix, the offline Wikipedia reader.
I have attached a loadZIM.html. Place a ZIM file in the same directory as the HTML file. Here's a test file: https://github.com/kiwix/kiwix-js/blob/master/tests/wikipedia_en_ray_charles_2015-06.zim
To load: file:///C:/loadZIM.html?archive=<filename>
It should print out the file header in the console.
If you want to test with larger files: http://wiki.kiwix.org/wiki/Special:MyLanguage/Content_in_all_languages
Current behaviour:
Firefox doesn't get to the XHR load handler.
Chrome and Edge print out the header, but Chrome has to be started with the --allow-file-access-from-files flag.
Reporter
Comment 5•7 years ago
Thanks for looking into this and please let me know if you need any further details.
Comment 6•7 years ago
The issue here is that Firefox ignores req.setRequestHeader('Range', 'bytes='+bytestart+'-'+byteend) for file:// URLs. If you inspect the response, the ArrayBuffer contains all the bytes of the file.
Comment 7•7 years ago
The fastest way to fix it on your side is to set responseType to 'blob'. When you have the response, slice it: var blob = this.response.slice(bytestart, byteend + 1); (the second argument of slice is an exclusive end offset, not a length). After this, read the sliced blob with a FileReader by calling readAsArrayBuffer(the_sliced_blob).
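As a rough illustration of the slicing step: Blob.slice(start, end) takes an exclusive end offset, while the last byte position in a Range header is inclusive, so the two differ by one. The sketch below uses hypothetical helper names (sliceInclusive, readBytes) and assumes Blob.prototype.arrayBuffer() is available, which was not the case in the Firefox versions discussed in this bug; FileReader.readAsArrayBuffer would be the equivalent there.

```javascript
// Hypothetical helper: map an inclusive Range-style span onto Blob.slice,
// whose second argument is an exclusive end offset.
function sliceInclusive(blob, firstByte, lastByte) {
  return blob.slice(firstByte, lastByte + 1);
}

// Read the sliced bytes into a Uint8Array.
// Assumes Blob.prototype.arrayBuffer() exists in the running environment.
async function readBytes(blob, firstByte, lastByte) {
  const buffer = await sliceInclusive(blob, firstByte, lastByte).arrayBuffer();
  return new Uint8Array(buffer);
}
```

Slicing creates a new Blob that refers to part of the original, so only the bytes that are actually read need to be materialised as an ArrayBuffer.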
Updated•7 years ago
Flags: needinfo?(sharun.msgs)
Reporter
Comment 8•7 years ago
If it ignores the Range request for file://, isn't it going to read the entire file as a blob? My problem is large files. Just some context: wiki dumps are 50-60 GB. It's a similar story for most other offline web content: StackExchange, Khan Academy, zealdocs etc. The problem with the File object is that it can't easily be created or stored across sessions, so we have to keep asking users to reselect files via the fileselector. XHR allows us to bypass this, but as these files grow in size, not having Range requests makes access through XHR unusable too. Fixing this issue would greatly benefit offline web content access. PS: I did try the blob approach with my large files, and the requests don't return.
Flags: needinfo?(sharun.msgs)
Comment 9•7 years ago
> The problem with File object is it can't be easily created or stored across
> sessions. So we have to keep asking users to reselect files via the
> fileselector.
I'll probably work on this issue. But in the meantime, you have a workaround.
Here is an example.
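The attached example is not reproduced in this thread. Purely as a sketch of the approach described in comment 7 (request the file as a blob, then slice it), something along these lines could be used. readRangeViaBlob and its XhrCtor parameter are hypothetical names introduced here, and this is not the attachment's code.

```javascript
// Sketch only: request the file as a Blob, then slice out the wanted bytes.
// `XhrCtor` is a hypothetical parameter so the function can be exercised
// outside a browser; in a page it falls back to the global XMLHttpRequest.
function readRangeViaBlob(url, firstByte, lastByte, XhrCtor) {
  const Ctor = XhrCtor || XMLHttpRequest;
  return new Promise(function (resolve, reject) {
    const req = new Ctor();
    req.open("GET", url, true);
    req.responseType = "blob";
    req.addEventListener("load", function () {
      // Blob.slice takes an exclusive end; the requested span is inclusive.
      resolve(req.response.slice(firstByte, lastByte + 1));
    });
    req.addEventListener("error", function () {
      reject(new Error("request failed for " + url));
    });
    req.send(null);
  });
}
```

The resolved value is a sliced Blob; it can then be read with FileReader.readAsArrayBuffer as suggested in comment 7.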
Updated•7 years ago
Flags: needinfo?(sharun.msgs)
Reporter
Comment 10•7 years ago
This just made my day :) Thanks a lot for the workaround! We have been struggling with this and its consequences for a while. And great to hear you might work on the File object issue!
Flags: needinfo?(sharun.msgs)
Reporter
Comment 11•7 years ago
OK, I got a bit carried away there, seeing the results return immediately for a single request. Retrieving and rendering a Wikipedia page takes hundreds of range requests, and it gets slow very fast, I guess because the big blob keeps getting reloaded. Anyway, the workaround is still useful for tests and use cases requiring a small number of requests, so thanks for it. Ideally, the ultimate fix for this issue should make the performance of multiple XHR file:// range requests match that of http:// range requests. In theory it should be much faster, as everything is on disk.
Related comment, for the sake of completeness: a fix for this issue will help with reloading a file from session to session without the need for a File object. But since this is XHR, (range) requests will work on a file found in the local directory where the initiating code resides. Specifying a relative path or a file://full-path-to-diff-directory will require the FileSelector approach.
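If the slowdown does come from requesting the blob again for every range, one possible mitigation is to request it once and slice the cached Blob for each read. This is only a sketch under that assumption; createRangeReader and loadBlob are hypothetical names, and loadBlob stands for whatever function returns a Promise of the file's Blob.

```javascript
// Hypothetical wrapper: `loadBlob` is any function returning a Promise<Blob>.
// The Blob is requested on first use and then reused for every later read.
function createRangeReader(loadBlob) {
  let blobPromise = null; // cached across calls
  return function readRange(firstByte, lastByte) {
    if (blobPromise === null) {
      blobPromise = loadBlob().catch(function (err) {
        blobPromise = null; // allow a retry after a failed load
        throw err;
      });
    }
    return blobPromise.then(function (blob) {
      // Inclusive span mapped onto Blob.slice's exclusive end offset.
      return blob.slice(firstByte, lastByte + 1);
    });
  };
}
```

Concurrent callers share the same pending promise, so hundreds of reads issued while rendering one page would trigger a single load.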
Reporter
Comment 12•7 years ago
Please ignore my performance comment. Sorry! Bug in my code. Workaround seems to work.
Updated•7 years ago
Priority: -- → P2
Comment 13•7 years ago
I think we can move it to P3 or P4.
Reporter
Comment 14•7 years ago
I am still seeing performance issues with the workaround. It's not clear to me what is going on or how to pin it down, but there is a noticeable difference between multiple XHR range requests over "http://" and over "file://" against a single large file. "file://" is, for some reason, very slow.
I will just add: supporting range-based access on "file://" (in addition to fixing the fileselector dependency mentioned above) is crucial to supporting offline web content access. It's the simplest route compared to webframes/apps/addons/extensions/FirefoxOS etc. There is no reason to be connected to the increasingly noisy and distracting internet all the time if one can store KhanAcademy, StackExchange, Wikipedia and a growing number of other high-quality web archives on a little SD card. It's unbelievable to me that, even though all this great web content now fits on my disk with tons of space to spare, the platforms make it more efficient and easier to access the content online! Mozilla can really do something about this.
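One way to start pinning such a difference down is to time the same list of ranges against each source. The harness below is a minimal sketch: timeRangeReads is a hypothetical name, and readRange is assumed to be any function that takes inclusive byte positions and returns a Promise of a Blob.

```javascript
// Hypothetical timing harness: runs the reads one after another and reports
// how many were made, how many bytes came back, and the elapsed time.
async function timeRangeReads(readRange, ranges) {
  const started = performance.now();
  let totalBytes = 0;
  for (const [firstByte, lastByte] of ranges) {
    const blob = await readRange(firstByte, lastByte);
    totalBytes += blob.size;
  }
  return {
    reads: ranges.length,
    totalBytes: totalBytes,
    elapsedMs: performance.now() - started
  };
}
```

Running it once with an http:// backed reader and once with a file:// backed reader over identical ranges would give comparable numbers for a new test case.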
Reporter
Comment 15•7 years ago
The workaround doesn't seem to be working in 55.0.3. It's the same issue as before: range requests on large files never seem to finish loading. Do I need to change anything in the workaround code? baku, any suggestions?
Flags: needinfo?(amarchesini)
Comment 16•7 years ago
> ... loading. Do I need to change anything in the workaround code? baku any
> suggestions?
Can you share your code with the workaround again? Thanks.
Flags: needinfo?(amarchesini)
Reporter
Comment 17•7 years ago
This is where I am using it - https://github.com/sharun-s/kiwix-html5/blob/dev/www/js/lib/util.js#L241 Does that help? Let me know if you need something else. Thanks for looking into it!
Reporter
Updated•7 years ago
Flags: needinfo?(amarchesini)
Comment 18•7 years ago
sharun, can you please check it again? I made some improvements for FF57. Let me know if this issue is already fixed in nightly. Thanks!
Flags: needinfo?(amarchesini) → needinfo?(sharun.msgs)
Reporter
Comment 19•7 years ago
I tried nightly (https://archive.mozilla.org/pub/firefox/nightly/2017/10/2017-10-20-22-11-29-mozilla-central/firefox-58.0a1.en-US.win64.zip) and 57. It still causes my machine to freeze. I tried with both arraybuffer and blob as the response type. As mentioned above, the key here is file size. With a 600MB file, the XHR returns (I used the loadZIM.html attached above). With a 50-60GB file (Wikipedia/StackOverflow dumps), it causes the freeze.
Assignee
Updated•6 years ago
Component: DOM → DOM: Core & HTML
Updated•2 years ago
Severity: normal → S3
Comment 20•3 months ago
[domcore-bugbash-triaged] Doing domcore random bug triage: if this is still valid, please file a new bug. Providing new test cases would be a huge help for us.
Status: UNCONFIRMED → RESOLVED
Closed: 3 months ago
Component: DOM: Core & HTML → DOM: Networking
Flags: needinfo?(sharun.msgs)
Resolution: --- → INVALID
Whiteboard: [domcore-bugbash-triaged]
Comment 21•3 months ago
The demo now fails in both Chrome and Firefox with a CORS request for the XHR.
If I set security.fileuri.strict_origin_policy to false it still happens.
The problem with this is that it expects the Range request header to do something for a file channel, and that's not the case.
Andrea's suggestion with the blob works much better because it doesn't actually copy the entire file into memory.
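A script can get a rough idea of whether a Range header had any effect by comparing the size of the response with the size of the span it asked for. The sketch below assumes inclusive byte positions; classifyRangeResponse and its return labels are made up for illustration and are not part of any browser API.

```javascript
// Hypothetical check: compare the number of bytes received with the size of
// the requested inclusive span.
function classifyRangeResponse(receivedBytes, firstByte, lastByte) {
  const requested = lastByte - firstByte + 1;
  if (receivedBytes === requested) {
    return "range-honored"; // also what a file of exactly this size yields
  }
  if (receivedBytes > requested) {
    return "range-ignored"; // more data than asked for, e.g. the whole file
  }
  return "short-read"; // fewer bytes, e.g. the span runs past the end of file
}
```

For the 80-byte header request in the original report, receiving the full file length instead of 80 bytes would be classified as "range-ignored".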