Last Comment Bug 686049 - LocalStorage length getter performance is low (Firefox 5.0)
: LocalStorage length getter performance is low (Firefox 5.0)
Status: RESOLVED DUPLICATE of bug 600307
:
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: x86_64 Windows 7
: P2 normal (vote)
: ---
Assigned To: Boris Zbarsky [:bz]
:
Mentors:
Depends on: 687755
Blocks:
  Show dependency treegraph
 
Reported: 2011-09-09 16:20 PDT by Alex
Modified: 2013-04-04 13:53 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
part 1. Refactor some code to make it easier to add new nsDOMStorageDBWrapper methods that forward to the various storage implementations. (7.18 KB, patch)
2011-09-09 21:20 PDT, Boris Zbarsky [:bz]
honzab.moz: review+
Details | Diff | Review
part 2. Add a GetKeyCount method on storage backends. (11.39 KB, patch)
2011-09-09 21:22 PDT, Boris Zbarsky [:bz]
no flags Details | Diff | Review

Description Alex 2011-09-09 16:20:49 PDT
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.220 Safari/535.1

Steps to reproduce:

Have been tested local storage performance in different browsers under Windows. Used IE8, IE9, Firefox 5.0, Chrome 13.XX


Actual results:

My application loads 1500 customer records from server, which I put into local storage and load them from this cache on the next page load. I measured the time browser needs to load the data from the server and then the time for loading the data from the local storage. IE and Chrome improved that value ~30%. Firefox 5 - increased the time in approximately the same 30%.
So,  FF has the worst performance on comparatively large amounts of data.


Expected results:

I expected reduction of time in comparison with time needed to load the same amount of data from server.

Following the example at the https://support.mozilla.com/en-US/questions/750266 I made another test script. I put it to the html page which must be loaded from the web-server (for example, http://localhost:8888/index.html), not from the file system (for example, file:///C:/Users/Me/index.html). Otherwise there will be no real access (read/write) to local storage.

    <script type="text/javascript">
        window.localStorage.clear();
        var TestValue = "1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890";
        for (var i = 0; i < 2000; i++) {
            window.localStorage.setItem(i, TestValue);
        }

        var start = new Date();
        for (var i = 0; i < window.localStorage.length; i++) {
            window.localStorage.getItem(i);
        }
        var finish = new Date();

        document.write("Cache loading time: <br>" + (finish.getTime() - start.getTime()));
    </script>

Sample figures from different browsers are (local server requested):
IE8: 20, 23, 20 ms
Chrome: 692, 705, 728 ms
FF5: 15111, 14883, 15326 ms

In the real world (particular case - my application) with real various data loaded from the remote server I got the following (it's an average time after lots of measurements).
Time in milliseconds. Sample loading data consists of 1500 records.

           |   From server   | From local storage
---------------------------------------------
IE8       |     16501        |   10252
---------------------------------------------
IE9       |     24629        |   15142
---------------------------------------------
Chrome |     8285          |   5585
---------------------------------------------
FF5      |     9279          |   12937
---------------------------------------------

The last table data is just for illustration.
Comment 1 Boris Zbarsky [:bz] 2011-09-09 19:41:55 PDT
Thanks for filing this!

What that testcase shows, mostly, is that window.localStorage.length is O(N) in the number of elements in the storage.  That gives the loop being timed quadratic behavior in the loop limit of the first loop (because you get N calls to the length getter, each one taking time O(N)).

Unless your actual application is checking the length a lot, that testcase really doesn't match your application's workload.  Does your application do that?

If I make a single length get before the loop and then loop to that index, then I get times in Firefox that are about 9x faster than Chrome on my machine.  But I'm on Mac; behavior on Windows may differ.

Honza, the way the length getter works in the persistent db case seems crazy to me.  Much better would be to have a "count accessible keys" getter that the persistent db would implement via an sql query, no?
Comment 2 Boris Zbarsky [:bz] 2011-09-09 19:45:24 PDT
And Alex, if your app is _not_ doing .length, I'd love to see a testcase that actually matches your workload....
Comment 3 Boris Zbarsky [:bz] 2011-09-09 21:20:16 PDT
Created attachment 559655 [details] [diff] [review]
part 1.  Refactor some code to make it easier to add new nsDOMStorageDBWrapper methods that forward to the various storage implementations.
Comment 4 Boris Zbarsky [:bz] 2011-09-09 21:22:00 PDT
Created attachment 559656 [details] [diff] [review]
part 2.  Add a GetKeyCount method on storage backends.
Comment 5 Boris Zbarsky [:bz] 2011-09-09 21:23:46 PDT
Comment on attachment 559656 [details] [diff] [review]
part 2.  Add a GetKeyCount method on storage backends.

Honza, what do you think of this general approach?

If you think it's a good idea, I can try to see what we can do for the memory db as well....

That said, I'm still seeing O(N^2) behavior because the count() is still O(N) even with the index.  The constant seems to be about 35 times smaller, though, which is a bit of an improvement.
Comment 6 Boris Zbarsky [:bz] 2011-09-09 21:32:36 PDT
An obvious question: would it make sense to create views that only show the insecure stuff instead of querying the full table every time?  Or would that not help?
Comment 7 Honza Bambas (:mayhemer) 2011-09-11 10:59:37 PDT
Comment on attachment 559655 [details] [diff] [review]
part 1.  Refactor some code to make it easier to add new nsDOMStorageDBWrapper methods that forward to the various storage implementations.

Review of attachment 559655 [details] [diff] [review]:
-----------------------------------------------------------------

The best would be to:

- have an abstract class covering all common methods
- have a method DOMStorageImpl::DB() that would choose and return the correct end-class
- change usage of gStorageDB in that class to use the DB() method
- leave all methods that are not able to be abstracted this way in the wrapper (and potentially rename that class)

I started to think of moving DOM storage away from sqlite and move it to LevelDB if possible, so no major changes right now as the code might completely go away (I would like to refactor it one day completely).

So, your patch is good right now.
Comment 8 Honza Bambas (:mayhemer) 2011-09-11 11:34:47 PDT
Comment on attachment 559656 [details] [diff] [review]
part 2.  Add a GetKeyCount method on storage backends.

Review of attachment 559656 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/src/storage/nsDOMStorage.cpp
@@ -1074,5 @@
>  DOMStorageImpl::GetLength(bool aCallerSecure, PRUint32* aLength)
>  {
> -  // Force reload of items from database.  This ensures sync localStorages for
> -  // same origins among different windows.
> -  mItemsCached = PR_FALSE;

This is what cause the problem in the first place, and not just this one.  We recache all keys unnecessarily too often.

In bug 683316 I plan (in next few days) to never drop this flag until the storage has been changed (by both chrome or content).

I believe that could fix this problem as well.

However, we may combine the two patches.

Therefor for now I just drop the feedback flag.

::: dom/src/storage/nsDOMStorageMemoryDB.cpp
@@ +491,5 @@
> +    return NS_OK;
> +  }
> +
> +  // XXXbz there's _got_ to be a faster way to do this!  Can we just
> +  // keep track of this count on the nsInMemoryStorage?

Yes, in AllKeyEnum at nsDOMStorageMemoryDB.cpp.

::: dom/src/storage/nsDOMStoragePersistentDB.cpp
@@ +247,5 @@
> +
> +  rv = mConnection->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> +        "CREATE INDEX scope_index_temp"
> +        " ON webappsstore2_temp(scope)"));
> +  NS_ENSURE_SUCCESS(rv, rv);

You don't need this index (scope_index_temp).  The two previous indexes covers your new query (at least the 3.7.7.1 sqlite version).

@@ +1116,5 @@
> +  rv = EnsureLoadTemporaryTableForStorage(aStorage);
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  mozIStorageStatement* statement =
> +    aInsecureOnly ? mCountInsecureKeysStatement : mCountInsecureKeysStatement;

mCountAllKeysStatement in the negative response?  

(Tests would show this, did you run them? plain-mochi : dom/tests/localstorage, I think)

@@ +1140,5 @@
> +    NS_ENSURE_SUCCESS(rv, rv);
> +
> +    *aCount = count;
> +  } else {
> +    // Can this even happen?  Should we throw if it does?

You may ignore the |exists| value.  If it were false, statement->GetInt32 would throw NS_ERROR_UNEXPECTED for you.  It is also better approach because |exists| can be uninitialized after some errors from sqlite.
Comment 9 Boris Zbarsky [:bz] 2011-09-11 12:01:03 PDT
> I believe that could fix this problem as well.

Sort of.  Testing length across removals would still be slow.  And we'd still want a fast path for testing length instead of enumerating the whole hashtable.... doing that for the in-memory tables is slower than the new SQL queries I'm adding, on my machine.  But we could cache the count if we've cached the keyset, of course.

> Yes, in AllKeyEnum at nsDOMStorageMemoryDB.cpp.

With my patch we're no longer calling that to get length, right?

> You don't need this index (scope_index_temp)

OK.  That wasn't obvious to me.  :)

> mCountAllKeysStatement in the negative response?  

Yep.

> Tests would show this, did you run them?

Not yet; I wanted to check whether the general approach was OK first.  I'd fix the in-memory stuff to before running tests.

_Is_ the general approach something we want?

> You may ignore the |exists| value.

OK.

> If it were false, statement->GetInt32 would throw NS_ERROR_UNEXPECTED for you.

Do I want that in this case, or do I want 0?  I guess for count() I better always have a row if things worked....
Comment 10 Honza Bambas (:mayhemer) 2011-09-12 13:13:53 PDT
(In reply to Boris Zbarsky (:bz) from comment #9)
> > I believe that could fix this problem as well.
> 
> Sort of.  ...

Agree with all.

Enhance nsInMemoryStorage of tracking the secure items is doable.  But better in a separate bug pending on this one?

> 
> > Yes, in AllKeyEnum at nsDOMStorageMemoryDB.cpp.
> 
> With my patch we're no longer calling that to get length, right?

Yes, overlook.  That is also called _only_ when preloading the mem-db from a persistent-db (session only cookies mode).

> _Is_ the general approach something we want?

I think yes.  As I said, I want to try to rewrite this completely with LevelDB, but in the mean time, let's go this way.

> I guess for count() I better
> always have a row if things worked....

I didn't find a prove, but from a logical point of view that function could fail only on DB fault, but that would be indicated by failure of GetInt32 anyway.
Comment 11 Alex 2011-09-12 14:52:45 PDT
(In reply to Boris Zbarsky (:bz) from comment #1)
> Unless your actual application is checking the length a lot, that testcase
> really doesn't match your application's workload.  Does your application do
> that?

Hi Boris, thanks for your comment. I was investigating my case today after checking of your suggestion on the test script from the top post. Really, moving getLength() out of the loop helped a lot. 

However, it didn't help my application (it checks length only once before loop starts). And as it's built on GWT framework I checked its sources and found that core GWT class StorageImpl.java has the following method:

  public native String key(String storage, int index) /*-{
    // few browsers implement retrieval correctly when index is out of range.
    // compensate to preserve API expectation. According to W3C Web Storage spec
    // <a href="http://www.w3.org/TR/webstorage/#dom-storage-key">
    // "If n is greater than or equal to the number of key/value pairs in the
    // object, then this method must return null."
    return (index >= 0 && index < $wnd[storage].length) ?
      $wnd[storage].key(index) : null;
  }-*/;

I.e. it's checking length on every request to key(index), and it's built in in current GWT implementation. To solve my problem I added my own method to GWT having length as a given parameter: 
   public native String fastKey(String storage, int index, int length) {...};
specially for using inside loops. 

And now FF is the fastest browser within those participating in tests.
I am going to file this in GWT issues tracker referencing this discussion if you don't mind.
Comment 12 Boris Zbarsky [:bz] 2011-09-12 16:50:01 PDT
Alex, I have no problem with you referencing this bug report in the GWT issue.

Honza, it looks like we in fact throw on key() with an out-of-range index...  Do we have a bug on that?

Also, key() is still O(N), just with smaller overhead.  :(

In any case, I'll work on getting the patches above updated.
Comment 13 Honza Bambas (:mayhemer) 2011-09-13 11:31:07 PDT
(In reply to Boris Zbarsky (:bz) from comment #12)
> Honza, it looks like we in fact throw on key() with an out-of-range index...
> Do we have a bug on that?

Bug 509241.

> Also, key() is still O(N), just with smaller overhead.  :(

Bug 683316?

> In any case, I'll work on getting the patches above updated.

Thank you for them, BTW!  From my experience it will take time before (if any) we move to something better like LevelDB for localStorage.
Comment 14 Boris Zbarsky [:bz] 2011-09-19 19:50:02 PDT
I spun off part 1 into bug 687755.
Comment 15 Honza Bambas (:mayhemer) 2011-10-17 08:33:19 PDT
Boris, should I take this bug and finish your patch?
Comment 16 Boris Zbarsky [:bz] 2011-10-17 08:37:46 PDT
If you have time, that would be great!
Comment 17 Marco Castelluccio [:marco] 2013-02-11 07:22:16 PST
Does the rewrite in bug 600307 take care of this too?
Comment 18 Honza Bambas (:mayhemer) 2013-02-11 07:37:54 PST
(In reply to Marco Castelluccio [:marco] from comment #17)
> Does the rewrite in bug 600307 take care of this too?

Very much so :)

*** This bug has been marked as a duplicate of bug 600307 ***

Note You need to log in before you can comment on or make changes to this bug.