indexedDB and/or Blob constructor - Blob incremental storage seems very slow for large files

NEW
Unassigned

Core
DOM: IndexedDB
3 years ago
3 years ago

People

(Reporter: Aymeric Vitte, Unassigned)

Tracking

({perf})

25 Branch
x86
Windows Vista
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

3 years ago
User Agent: Mozilla/5.0 (Windows NT 6.0; rv:25.0) Gecko/20100101 Firefox/25.0 (Beta/Release)
Build ID: 20131112160018

Steps to reproduce:

See #944918 for the use case. The files are roughly 250 MB. After roughly 100 "put" operations, each appending 2 MB to the Blob, we open a transaction to get the record and have to wait about 260 seconds for onsuccess to fire (probably waiting until all put operations have been processed).

The following code seems to reproduce the same behavior:

var db;
var req=indexedDB.open('test',1);
req.onupgradeneeded=function(evt) {
	// Create the object store the first time the database is opened.
	evt.target.result.createObjectStore('test',{keyPath:'id'});
};
req.onsuccess=function(evt) {
	db=evt.target.result;
	// Each call opens a new readwrite transaction on the 'test' store.
	var open_db=function() {
		return db.transaction(['test'],'readwrite').objectStore('test');
	};
	var a=new Blob();
	// Grow the Blob by 2 MB per iteration and store it under the same key each time.
	for (var i=0;i<100;i++) {
		var t=open_db();
		var b=new Uint8Array(2097152);
		a=new Blob([a,b]);
		t.put({id:0,data:a});
	}
	console.log("get");
	var c=open_db();
	var t0=Date.now();
	var d=c.get(0);
	// Log how long (in ms) the get takes to fire onsuccess.
	d.onsuccess=function(evt) {
		console.log((Date.now()-t0));
	};
};
Component: Untriaged → DOM: IndexedDB
Product: Firefox → Core
(Reporter)

Comment 1

3 years ago
Incrementally building a blob for large files, as shown above, is definitely slow, and it's easy to reproduce.

Or is the method incorrect, and should the blob be built once from all the chunks rather than grown incrementally? (new Blob([chunk1,...,chunk100000]))
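
For reference, the two construction patterns being compared can be sketched as below. This is only an illustration: the chunk size is shrunk to 1 KB so it runs quickly, the names (CHUNK_SIZE, CHUNK_COUNT, parts) are made up for the example, and it says nothing about which pattern is faster once IndexedDB is involved.

```javascript
// Illustrative sizes only; the bug report uses 2 MB chunks.
const CHUNK_SIZE = 1024;
const CHUNK_COUNT = 100;

// Pattern 1: grow the Blob incrementally, wrapping the previous Blob each time.
let incremental = new Blob();
// Pattern 2: keep the chunks in an array and build the Blob once at the end.
const parts = [];

for (let i = 0; i < CHUNK_COUNT; i++) {
  const chunk = new Uint8Array(CHUNK_SIZE);
  incremental = new Blob([incremental, chunk]);
  parts.push(chunk);
}
const once = new Blob(parts);

// Both Blobs hold the same number of bytes.
console.log(incremental.size, once.size);
```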

Comment 2

3 years ago
It seems that stored files are slow, especially on Windows.

Comment 3

3 years ago
Made some benchmarks with IndexedDB and FileHandle:

Mac:
1) number of appends: 750, append size: 1200*1000 = took ~20 seconds
2) number of appends: 750*4, append size: 1200*1000/4 = took ~28 seconds

Windows:
1) number of appends: 750*4, append size: 1200*1000/4 = took ~404 seconds
2) number of appends: 750, append size: 1200*1000 = took ~136 seconds
3) number of appends: 750/2, append size: 1200*1000*2 = took ~96 seconds

So it is clear that Windows is much slower than Mac, and also that on Windows, increasing the size of each append while decreasing the number of appends significantly improves performance.
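
To make the comparison easier to read, the figures above can be turned into rough throughput numbers. This sketch assumes the append sizes are in bytes (the comment does not state the unit) and uses the approximate timings as given; the field names are invented for the example.

```javascript
// Benchmark figures copied from the comment above; timings are approximate.
// Assumption: append sizes are in bytes.
const runs = [
  { platform: 'Mac',     appends: 750,     appendBytes: 1200 * 1000,     seconds: 20 },
  { platform: 'Mac',     appends: 750 * 4, appendBytes: 1200 * 1000 / 4, seconds: 28 },
  { platform: 'Windows', appends: 750 * 4, appendBytes: 1200 * 1000 / 4, seconds: 404 },
  { platform: 'Windows', appends: 750,     appendBytes: 1200 * 1000,     seconds: 136 },
  { platform: 'Windows', appends: 750 / 2, appendBytes: 1200 * 1000 * 2, seconds: 96 },
];

// Every run appends the same total amount of data, so only the timing differs.
const results = runs.map((run) => {
  const totalBytes = run.appends * run.appendBytes;
  return {
    ...run,
    totalBytes,
    mbPerSecond: totalBytes / 1e6 / run.seconds,
  };
});

for (const r of results) {
  console.log(r.platform, r.appends, 'appends:', r.mbPerSecond.toFixed(1), 'MB/s');
}
```

Since the runs were on different machines, the Mac and Windows rows are not directly comparable; the Windows rows against each other are the more meaningful signal.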

Comment 4

3 years ago
Did you run the benchmark on the same hardware?

I suspect that fsync is much slower on Windows.

Updated

3 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true

Comment 5

3 years ago
A discussion that confirms my suspicion:
http://stackoverflow.com/questions/18276554/windows-fsync-flushfilebuffers-performance-with-large-files

I'll investigate the FILE_FLAG_NO_BUFFERING flag.

Comment 6

3 years ago
(In reply to Jan Varga [:janv] from comment #4)
> Did you run the benchmark on the same hardware ?
> 
> I suspect that fsync is much slower on windows.

No, that was on two different machines with different hardware. I will try to benchmark on the same machine.
Keywords: perf

Comment 7

3 years ago
I can confirm that the same problem exists on Ubuntu 14.04 (with Firefox 33.0).

The cause is obvious...data is not appended but somehow copied to the new blob.

This means that if you save a blob incrementally by appending a 1 MB chunk each time, it gets slower as the file grows. For example, if a file is 512 MB and you have already downloaded and persisted 500 MB, then when you get the next 1 MB chunk and append it to the file, 501 MB are actually written to disk (I guess the first 500 MB are copied from the old blob and the new chunk is appended at the end). For the next 1 MB chunk, 502 MB are written to disk. This results in 1003 MB being written to disk instead of only 2 MB, and it gets even worse when persisting the "last" 10 MB of the file.
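
The arithmetic above can be checked with a small sketch. It assumes the model guessed at in this comment, namely that every put rewrites the whole blob; the function name is hypothetical and nothing here measures what Firefox actually does.

```javascript
// Total MB written if every append rewrites the whole blob so far.
// Assumption: this is the copy-on-append model described in the comment,
// not a measurement of the real implementation.
function totalWrittenMB(fileMB, chunkMB) {
  let total = 0;
  for (let size = chunkMB; size <= fileMB; size += chunkMB) {
    total += size; // the blob of `size` MB is written out in full
  }
  return total;
}

// The two appends from the example: 501 MB, then 502 MB.
const twoAppends = 501 + 502;

// Whole 512 MB file in 1 MB chunks, versus 512 MB for a true append.
const wholeFile = totalWrittenMB(512, 1);

console.log(twoAppends, wholeFile);
```

Under this model the total grows quadratically with the number of chunks, which is consistent with the slowdown getting worse toward the end of the file.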

Comment 8

(In reply to Peter from comment #7)
> The cause is obvious...data is not appended but somehow copied to the new
> blob.

Yeah, this is just how blobs work: they represent one indivisible blob of data once you create them (i.e. they lose the notion that they are somehow composed of multiple chunks).

An alternative strategy for your download case is to store each chunk as an individual blob, and then when you're finished you create a new blob that wraps each of those chunks.
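
A rough sketch of that strategy, for illustration only. The store name 'chunks', the compound key [fileId, index], and all function names are assumptions made for this example, and it relies on getAll(), which is a newer IndexedDB method than the Firefox versions discussed in this bug.

```javascript
// Hypothetical schema: an object store named 'chunks' created with
// { keyPath: ['fileId', 'index'] }, so each chunk has its own record.

// Store one chunk under its own key; each put writes only that chunk.
function putChunk(db, fileId, index, bytes) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(['chunks'], 'readwrite');
    tx.objectStore('chunks').put({ fileId, index, data: new Blob([bytes]) });
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}

// Wrap the stored chunk records (already in key order) in a single Blob.
function wrapChunks(records) {
  return new Blob(records.map((record) => record.data));
}

// Once the download is finished, read all chunks of a file and wrap them.
function assembleFile(db, fileId) {
  return new Promise((resolve, reject) => {
    const store = db.transaction(['chunks'], 'readonly').objectStore('chunks');
    // All keys of the form [fileId, n]; getAll returns them in ascending key order.
    const range = IDBKeyRange.bound([fileId, 0], [fileId, Infinity]);
    const request = store.getAll(range);
    request.onsuccess = () => resolve(wrapChunks(request.result));
    request.onerror = () => reject(request.error);
  });
}
```

Usage would look like calling putChunk(db, 'file-1', i, bytes) as each chunk arrives and assembleFile(db, 'file-1') at the end, where 'file-1' is a placeholder identifier.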
(Reporter)

Comment 9

3 years ago
Then you spend your time concatenating and slicing, as I am doing for the Peersm project: save chunks, concat, load blob, slice, encrypt, decrypt, hash, concat, etc.

This is completely inefficient and inept.

Please take a look at http://lists.w3.org/Archives/Public/public-webapps/2014JulSep/0332.html (read the whole thread if you want), which demonstrates that really basic things cannot be done with File and indexedDB, and apparently will not be feasible with Streams either...

An older thread was http://lists.w3.org/Archives/Public/public-webapps/2013OctDec/0657.html. Since then I have kept requesting that both APIs handle partial data, without success so far. I don't even think it has been considered for indexedDB V2.

Comment 10

(In reply to Aymeric Vitte from comment #9)

Let's take this to another bug. This bug is about our Windows implementation flushing file data more slowly than other platforms.