Closed Bug 704933 Opened 13 years ago Closed 11 years ago

speed-up localStorage SQLite writes

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 807021

People

(Reporter: mak, Assigned: vladan)

References

Details

(Keywords: main-thread-io, perf, Whiteboard: [Snappy:P1])

Attachments

(1 file)

There are certain things we can do to speed up localStorage writes now:
1. PRAGMA synchronous = OFF
2. PRAGMA locking_mode = EXCLUSIVE
3. OpenUnsharedDatabase

1. may require sometimes we run a PRAGMA quick_check(1) to ensure the database is sane, and if not sane we should throw it away.
As well as on startup if the opening fails we should throw it away and make a new db (maybe it's already doing that?)
note quick_check(1) may be expensive, not as expensive as integrity_check, but still someone you don't want to run on mainthread while the user is browsing.
s/someone/something/ in comment 1
Assignee: nobody → vdjeric
Blocks: 701880
Whiteboard: [snappy]
Whiteboard: [snappy] → [Snappy]
This is blocking a Snappy P1, so I'm going to mark it P1, too, and from Comment 0 it doesn't sound like a long-term project.
Whiteboard: [Snappy] → [Snappy:P1]
To by sync in how well you know how DOM storage (localStorage - i.e. the persistent one) back-end works: I use an in-memory table to operate on, also using a transaction for all inserts, then every 5 seconds these in-memory tables get flushed to disk tables and then are deleted.


Did anyone test or measured what operation is actually the slow one?  There is an op of setting a key=value in the memory table and then the flushing operation of all the collected changes.  Both are execution of a respective SQL statement.

The former can be checked here, notice to measure only the call to Execute, not the preparation:
http://hg.mozilla.org/mozilla-central/annotate/fafaf614791f/dom/src/storage/nsDOMStoragePersistentDB.cpp#l743

The letter:
http://hg.mozilla.org/mozilla-central/annotate/fafaf614791f/dom/src/storage/nsDOMStoragePersistentDB.cpp#l556

Note that there are two instances of nsDOMStoragePersistentDB, one for content, one for chrome.

Also I would be interested in measuring the following:
http://hg.mozilla.org/mozilla-central/annotate/fafaf614791f/dom/src/storage/nsDOMStoragePersistentDB.cpp#l497

But all of this should be covered by measuring the times of all statements.  Looking forward to results!
This patch implements mak's optimizations from comment 0. It checks the DB integrity on every DB open, i.e. it checks the integrity of the webappsstore.sqlite and chromeappsstore.sqlite DBs on the main thread at every startup. I did a quick check of the performance impact of the integrity check and it takes 1 or 2 ms on my laptop with a webappsstore.sqlite of about 2MB containing 400 records (acquired by browsing HTML5 localStorage demos).

Honza: I could take a look into those timings.
Attachment #579902 - Flags: review?(mak77)
by the discussion in bug 627635 I'm not sure we can go the synchronous=off path, nor I think we can do integrity check on startup, while it's not expensive in your case it's still an expensive operation by itself and some users may suffer it more.

But we can take the other things immediately, so openUnshared and exclusive mode.

Considering comment 4 the number of fsyncs should already be low if the temp table is synchronized on a timer, so we may not gain that much from synchronous off.
For writes, could be interesting to do the db synchronization asynchronous, the problem is that temp tables, triggers and views are per connection, and we don't want to use mixed sync/async Storage API, to avoid thread locking.
An alternative may be to have 2 connection, one reads synchronously from the view and writes synchronous ONLY to the temp table, the other one asynchronously sends each write to the disk table (with WAL this would not fsync till a size limit we can set). Though, it's hard to tell if this would be faster than doing timed writes on mainthread without measurements.

I see other things that may be improved, based on my experience with partitioned tables, for example EnsureLoadTemporaryTableForStorage looks like an overkill, why not using a UNION ALL in SELECT and an UPDATE trigger on the view that copies only needed entries to the temp table? So, I have various ideas, and if you could make some sort of stress test for reads and writes we may evaluate them.

Another interesting improvement, not for writes but for startup, may be to use a StatementCache, so it doesn't have to create all the statements on startup, they would be created on demand.
Marco, your solution sounds like it moves ASYNCness to a higher (and overly complicated)level. Ie there is still dataloss until background thread syncs and reads *could* still block on db update.

I think we should experiment with Vlad's approach on the nightly channel before we try a more complicated solution.
(In reply to Marco Bonardo [:mak] from comment #6)
> by the discussion in bug 627635 I'm not sure we can go the synchronous=off
> path, nor I think we can do integrity check on startup, while it's not
> expensive in your case it's still an expensive operation by itself and some
> users may suffer it more.

Rather than guessing whether it is an expensive operation, we could try putting it in + a telemetry probe to time it.
(In reply to Nathan Froyd (:froydnj) from comment #9)
> Rather than guessing whether it is an expensive operation, we could try
> putting it in + a telemetry probe to time it.

Just as an aside, the current patch has probes that report on DB corruption and the time required to perform the integrity check.
(In reply to Taras Glek (:taras) from comment #8)
> Marco, your solution sounds like it moves ASYNCness to a higher (and overly
> complicated)level. Ie there is still dataloss until background thread syncs
> and reads *could* still block on db update.

It's not really more complicated than what happens already, I was not aware dom storage was already using a temp table buffer, that changes things a lot.
The dataloss we are caring about is old existing data, not data in the temp buffer that, with current solution, could already get lost at any time. 

Looking at the code, to me looks like the number of fsyncs is a minor issue, while we may concentrate on the size of data to fsync (write back only modified entries for example, not whole scopes) and moving those fsyncs to another thread, for UI responsiveness reasons.

> I think we should experiment with Vlad's approach on the nightly channel
> before we try a more complicated solution.

I don't think blind experiments bring anything useful, moving to synchronous=off is something that would reduce fsyncs number to 0, but there is already a temp buffer with a similar same scope.  So the gain is low (moving from a few fsyncs to 0) the loss is reduced data durability.

(In reply to Vladan Djeric (:vladan) from comment #10)
> Just as an aside, the current patch has probes that report on DB corruption
> and the time required to perform the integrity check.

While nightly testers are a precious resource, it's not a perfect audience, and once this hits a release and we find we just made startup worse, it's late.

My thought improvement is this:
- on read, hit the temp table and then the disk table, the temp table will always be up-to-date
- on insert or update do it on both the temp (sync) and the disk table (async), a trigger can easily and automatically copy data from disk to temp.
- use WAL, so reads are concurrent to writes
- stop flushing temp tables to disk, since disk is kept up-to-date.

Honza may evaluate this approach, and we could find a good compromise. Having a stress test to measure the impact would then be useful, I think Honza may have already made one in the past when he added the temp table.
> on read, hit the temp table and then the disk table, the temp table will always be up-to-date

Can you explain more in detail what you mean with this?


I think if you want to do something here, do it quickly.  After the new year, I'll start playing with LevelDB and I believe it 
will be an improvement it self and all your work might get threw away.

So I don't want to spend much time on deciding on what to with the current back end.


Few notes:
- first, wait for results of probes before guessing on changes you list here
- make sure WAL will work well on mobile (will not regress)
- try to figure out if SQLite correctly sets the page size for common file systems on mobile (should match the allocation size)


I only have made a very simple page with a code that inserts 1000 items in a loop, them modifies them and then I think deletes them all.  That is not a good test at all.  

To have a better real-life test I think usual apps will mostly update some state keys, relatively often, and then add new, larger, data keys.  Occasionally will delete or modify them.  Sometimes they will iterate them with the .key(n) API and read/modify/delete some keys they will find.  A test that simulates a behavior like this would be interesting.

However, having some probes how often keys are added, modified and deleted would also be interesting.
(In reply to Honza Bambas (:mayhemer) from comment #12)
> > on read, hit the temp table and then the disk table, the temp table will always be up-to-date
> 
> Can you explain more in detail what you mean with this?

you can use UNION ALL on a SELECT, so that if you have an entry in both the temp and disk table it will return the temp one, otherwise the disk one. The temp table is always up-to-date, the disk table may be behind for async writes, but if we are writing, the temp table has the up-to-date entry already. It's not different than what you do now, but now you always only select from temp.

> I think if you want to do something here, do it quickly.  After the new
> year, I'll start playing with LevelDB and I believe it 
> will be an improvement it self and all your work might get threw away.

Well, partially. According to latest levelDB discussions, it won't be usable on mainthread, so you'll still need some temp buffer and complicated code to sync with the other thread. May take more time than initially evaluated.

> Few notes:
> - first, wait for results of probes before guessing on changes you list here

Which probes?

> - make sure WAL will work well on mobile (will not regress)

we used it with good results in Places, and it's used in cookies. I don't see reasons for it to be slower than current, it's just another journal type.

> - try to figure out if SQLite correctly sets the page size for common file
> systems on mobile (should match the allocation size)

I'm not sure why this should change compared to the current one.
(In reply to Marco Bonardo [:mak] from comment #13)
> Which probes?

AFAIK, there has landed probes for measuring time spent by each statement execution.  Maybe check those first.


Do you plan to go incrementally?  E.g. first start with enabling WAL only and then compare results from probes.

Also you should have a test page utilizing localStorage in some real-life way and test with it your self.
(In reply to Honza Bambas (:mayhemer) from comment #14)
> (In reply to Marco Bonardo [:mak] from comment #13)
> > Which probes?
> 
> AFAIK, there has landed probes for measuring time spent by each statement
> execution.  Maybe check those first.

Ah yes, this discussion and bug came out from the fact probes show there's time spent in fsyncs on mainthread, that is hurting UI reponsiveness, that's why I was looking into a solution that would not throw away completely the durability of the data, while also moving those fsyncs out of mainthread. My solution may also remove some complication of the current approach and reduce some writes (I actually didn't have enough time to check if we write back the entire scope, included unchanged data, or not)
(In reply to Marco Bonardo [:mak] from comment #15)
> (I actually didn't have enough time to check if we write
> back the entire scope, included unchanged data, or not)

That's a good point!  We probably are according from how I can see the statements are built.
Marco, "few fsyncs" are costing us a lot. This is one of the biggest known sources of mainthread lag in firefox.
I am concerned that I haven't seen a good/immediate plan of attack yet(other than Vladan's). Lets start landing improvements as soon as we can, wal, async and/or data reduction should go in as they are ready so we can get data on any improvements.
Regarding nightly users, google is the biggest user of dom storage so I think we'll get good enough data that way.
In case I wasn't clear, "few fsyncs" are causing multisecond pauses. I think we should land async changes to alleviate that pain and look into other options in parallel.
I'll try to make a mockup patch to evaluate this.

A problem related to using WAL, is that it will make this database dependent on SQLite version (only SQLite 3.7.x+ can open WAL databases), this may be problematic if Mobile moves to using system SQLite (as I think they are investigating) since Android <3.0 is still on SQLite 3.5 or 3.6.
(In reply to Marco Bonardo [:mak] from comment #19)
> I'll try to make a mockup patch to evaluate this.
> 
> A problem related to using WAL, is that it will make this database dependent
> on SQLite version (only SQLite 3.7.x+ can open WAL databases), this may be
> problematic if Mobile moves to using system SQLite (as I think they are
> investigating) since Android <3.0 is still on SQLite 3.5 or 3.6.

Can we make WAL conditional on sqlite version and use async mode if WAL is not available?
yes, the problem is that the approach is different, with wal we may avoid batching changes, without it we must keep doing that. Btw I'd probably leave the async part as the second step and first concentrate on reducing reads/writes.
Marco, on IRC you said you found that dom storage is writing way too much. Do you have an ETA for your prototype patch?
I'll look into it today, I almost cleared my reviews queue and the tree's metered, so looks like a good time.
Comment on attachment 579902 [details] [diff] [review]
DOMStorage speedups + check DB integrity on every DB open

this component needs some cleanup, so I'm going to do some parts, reviews should then be easy enough:
- take the safe part of vladan's patch (unshared and exclusive, the latter may be removed if we'll later move to multiple connections)
- use a StatementCache to improve startup
- remove the Binder class overhead, since it's useless
- sync back to disk only changed entries
- evaluate some async approach

There are various unresolved problems regarding the async approach:
- can't use WAL on mobile if they use system sqlite (<3.7.0 on Android <3.0).
- can't mixup sync/async Storage (thread-locking)
- can't easily use multiple connection, since temp tables are per connection.
Attachment #579902 - Flags: review?(mak77)
Would be interesting to have a DOMStorage stress test that we could run across different browsers, and see how we perform compared to other implementations.
Would be useful for future changes too.
Vladan, would you have any time to make one, taking into account the hints in comment 12?
So I had an idea using 2 connections, a read-only synchronous one and an asynchronous one. The main issue is that the current system is more complex than needed, so I have to pay some attention to changes.

The synchronous connection has a temp table, where all keys for a scope are copied, any change is then applied to this table so it's always up-to-date. We always read and write synchronously to this table, but it's never synced to disk. This connection will only generate synchronous read IO.

The asynchronous connection has a temp table, where only modified keys are asynchronously copied. Changes are applied to this table as well, so they are applied twice, but always to memory tables. When it's flushed, it asynchronously copies all its entries to the disk table. This will generate asynchronous read/write/fsync IO, automatically batched though.

While it looks complicated, it can be simpler than the current approach, doesn't require transactions management and should be more performant.
The part I still have to figure out, is when to discard the read buffer, otherwise all the database would slowly be copied to memory and stick there forever. This is probably something that may be easy to find when I'm at a later stage.
I think we should try an async-only approach and measure dataloss(if any) before we over-complicate the design.
how do you try an async-only approach on a synchronous API?
(In reply to Marco Bonardo [:mak] from comment #28)
> how do you try an async-only approach on a synchronous API?

I meant PRAGMA synchronous = 0
I'm splitting parts to new bugs, so we can proceed without overloading anyone.
My new implementation is in a decent shape, I have figured out an acceptable way to invalidate the memory cache, just have to clean it up, btw this is the last step in the chain and we can have simpler middle-steps just to reduce io.

(In reply to Taras Glek (:taras) from comment #29)
> I meant PRAGMA synchronous = 0

this is the first approach that was discarded, so imo there's no point in going back discussing it.
Depends on: 711970
Depends on: 711972
Depends on: 712006
Depends on: 712009
Depends on: 714964
Depends on: 728197
The issue of slow main-thread SQL writes was addressed in bug 807021 and bug 600307.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.