Bug 945281 - indexedDB and/or Blob constructor - Blob incremental storage seems very slow for large files
Status: NEW
Keywords: perf
Product: Core
Classification: Components
Component: DOM: IndexedDB
Version: 25 Branch
Platform: x86 Windows Vista
Importance: normal, 1 vote
Assigned To: Nobody; OK to take it and work on it
QA Contact: Hsin-Yi Tsai [:hsinyi]
Reported: 2013-12-02 07:59 PST by Aymeric Vitte
Modified: 2014-10-31 09:48 PDT (History)

Description Aymeric Vitte 2013-12-02 07:59:08 PST
User Agent: Mozilla/5.0 (Windows NT 6.0; rv:25.0) Gecko/20100101 Firefox/25.0 (Beta/Release)
Build ID: 20131112160018

Steps to reproduce:

See #944918 for the use case. The files are about 250 MB in size; after roughly 100 "put" operations, each appending 2 MB to the Blob, we open a transaction to get the record and have to wait about 260 seconds for onsuccess to fire (probably because all the pending put operations must be processed first).

The following code seems to reproduce the same behavior:

var db;
var DB = indexedDB.open('test', 1);
DB.onupgradeneeded = function (evt) {
	var db = evt.target.result;
	db.createObjectStore('test', { keyPath: 'id' });
};
DB.onsuccess = function (evt) {
	db = evt.target.result;
	var open_db = function () {
		return db.transaction(['test'], 'readwrite').objectStore('test');
	};
	var a = new Blob();
	// Each iteration wraps the previous blob in a new one that is 2 MB
	// larger, then overwrites the same record.
	for (var i = 0; i < 100; i++) {
		var t = open_db();
		var b = new Uint8Array(2097152); // 2 MB chunk
		a = new Blob([a, b]);
		t.put({ id: 0, data: a });
	}
	console.log("get");
	var c = open_db();
	var t0 = Date.now();
	var d = c.get(0);
	d.onsuccess = function (evt) {
		console.log(Date.now() - t0); // ~260 s observed
	};
};
Comment 1 Aymeric Vitte 2014-01-23 03:46:38 PST
Incremental blob storage for large files, as shown above, is definitely slow; it's easy to reproduce.

Or is the method incorrect, and should the blob not be grown incrementally but reconstituted once from all the chunks (new Blob([chunk1,...,chunk100000]))?
Comment 2 Jan Varga [:janv] 2014-04-17 06:16:09 PDT
It seems that stored files are slow, especially on Windows.
Comment 3 guy paskar 2014-04-17 08:14:22 PDT
I ran some benchmarks with indexedDB and FileHandle:

Mac:
1) number of appends: 750, append size: 1200*1000 → took ~20 seconds
2) number of appends: 750*4, append size: 1200*1000/4 → took ~28 seconds

Windows:

1) number of appends: 750*4, append size: 1200*1000/4 → took ~404 seconds
2) number of appends: 750, append size: 1200*1000 → took ~136 seconds
3) number of appends: 750/2, append size: 1200*1000*2 → took ~96 seconds

So it is clear that Windows is much slower than Mac, and also that on Windows, increasing the size of each append while decreasing the number of appends significantly improves performance.
Comment 4 Jan Varga [:janv] 2014-04-17 09:25:12 PDT
Did you run the benchmark on the same hardware?

I suspect that fsync is much slower on windows.
Comment 5 Jan Varga [:janv] 2014-04-17 10:16:20 PDT
A discussion that confirms my suspicion:
http://stackoverflow.com/questions/18276554/windows-fsync-flushfilebuffers-performance-with-large-files

I'll investigate the FILE_FLAG_NO_BUFFERING flag.
Comment 6 guy paskar 2014-04-21 23:55:52 PDT
(In reply to Jan Varga [:janv] from comment #4)
> Did you run the benchmark on the same hardware?
> 
> I suspect that fsync is much slower on windows.

No, that was on two different machines with different hardware. I'll try to benchmark on the same machine.
Comment 7 Peter 2014-10-29 06:46:16 PDT
I can confirm that the same problem exists on Ubuntu 14.04 (with Firefox 33.0).

The cause is obvious: data is not appended but somehow copied into the new blob.

This means that if you save a blob incrementally by appending a 1 MB chunk each time, it gets slower the larger the file grows. For example, take a 512 MB file of which 500 MB have already been downloaded and persisted: appending the next 1 MB chunk actually writes 501 MB to disk (I guess the first 500 MB are copied from the old blob and the new chunk is appended at the end), and the chunk after that writes 502 MB. Those two chunks alone cause 1003 MB to be written to disk instead of only 2 MB, and it gets even worse for the "last" 10 MB of the file.
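A back-of-the-envelope calculation of the write amplification described above (an editorial sketch; `totalWrittenMB` is an illustrative name, not from this bug): appending a 1 MB chunk to an n MB blob rewrites all n+1 MB, so persisting an f MB file in 1 MB increments writes 1 + 2 + ... + f = f(f+1)/2 MB in total.

```javascript
// Total megabytes written to disk when an f MB file is persisted in
// 1 MB increments and each increment rewrites the whole blob so far.
function totalWrittenMB(fileMB) {
  // Sum of 1 + 2 + ... + fileMB, in closed form.
  return fileMB * (fileMB + 1) / 2;
}

console.log(totalWrittenMB(512)); // 131328 MB (~128 GB) for a 512 MB file
```

So the incremental-blob pattern turns a 512 MB download into roughly 128 GB of disk writes, which matches the slowdown reported here.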
Comment 8 Ben Turner (not reading bugmail, use the needinfo flag!) 2014-10-31 09:06:14 PDT
(In reply to Peter from comment #7)
> The cause is obvious...data is not appended but somehow copied to the new
> blob.

Yeah, this is just how blobs work: they represent one indivisible blob of data once you create them (i.e. they lose the notion that they are somehow composed of multiple chunks).

An alternative strategy for your download case is to store each chunk as an individual blob and then, when you're finished, create a new blob that wraps all of those chunks.
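A minimal sketch of that chunk-per-record strategy (editorial addition; `assembleChunks` is an illustrative name, and a plain array stands in for chunks read back from the object store, where each chunk would live under its own key such as store.put({id: chunkIndex, data: chunk})):

```javascript
// Instead of rewriting one growing blob on every append, keep each
// chunk separate and do a single Blob construction at the end.
function assembleChunks(chunks) {
  // One final wrap over all parts; no per-append rewrite of prior data.
  return new Blob(chunks);
}

// Stand-in for chunks read back from IndexedDB:
var chunks = [];
for (var i = 0; i < 10; i++) {
  chunks.push(new Uint8Array(1024)); // 1 KB stand-in chunks
}
var file = assembleChunks(chunks);
console.log(file.size); // 10240 (10 × 1024 bytes)
```

With this pattern, each 1 MB of downloaded data causes roughly 1 MB of writes, instead of rewriting the entire accumulated blob on every append.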
Comment 9 Aymeric Vitte 2014-10-31 09:39:41 PDT
Then you spend your time concatenating and slicing... as I am doing for the Peersm project: save chunks, concat, load blob, slice, encrypt, decrypt, hash, concat, etc.

This is completely inefficient and inept.

Please take a look at http://lists.w3.org/Archives/Public/public-webapps/2014JulSep/0332.html (read the whole thread if you want), which demonstrates that really basic things cannot be done with File and indexedDB, and apparently will not be feasible with Streams either...

An older thread is http://lists.w3.org/Archives/Public/public-webapps/2013OctDec/0657.html; since then I have kept requesting that both APIs handle partial data, without success so far. I don't even think it has been considered for indexedDB v2.
Comment 10 Ben Turner (not reading bugmail, use the needinfo flag!) 2014-10-31 09:48:11 PDT
(In reply to Aymeric Vitte from comment #9)

Let's take this to another bug. This bug is about our Windows implementation flushing file data more slowly than other platforms.
