Last Comment Bug 807021 - Move LocalStorage writes off the main thread
: Move LocalStorage writes off the main thread
Status: VERIFIED FIXED
[Snappy:P1]
:
Product: Core
Classification: Components
Component: DOM (show other bugs)
: 16 Branch
: All All
: -- normal with 2 votes (vote)
: mozilla20
Assigned To: Vladan Djeric (:vladan)
:
Mentors:
: 599988 704933 (view as bug list)
Depends on:
Blocks: 767944
  Show dependency treegraph
 
Reported: 2012-10-30 11:00 PDT by Vladan Djeric (:vladan)
Modified: 2013-11-12 08:55 PST (History)
24 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
LocalStorage cache data structures (Part 1) (16.51 KB, patch)
2012-11-23 10:00 PST, Vladan Djeric (:vladan)
honzab.moz: review+
Details | Diff | Splinter Review
nsDOMStoragePersistentDB now flushes off the main thread (Part 2) (66.29 KB, patch)
2012-11-23 10:03 PST, Vladan Djeric (:vladan)
honzab.moz: review+
Details | Diff | Splinter Review
Interdiff for Try build (3.29 KB, patch)
2012-11-26 19:02 PST, Vladan Djeric (:vladan)
no flags Details | Diff | Splinter Review
Interdiff after Honza's review (Part 1) (16.98 KB, patch)
2012-12-07 04:27 PST, Vladan Djeric (:vladan)
honzab.moz: feedback+
Details | Diff | Splinter Review
Interdiff after Honza's review (Part 2) (29.17 KB, patch)
2012-12-07 04:29 PST, Vladan Djeric (:vladan)
honzab.moz: feedback+
Details | Diff | Splinter Review
LocalStorage cache data structures (Part 1) (16.38 KB, patch)
2012-12-07 04:32 PST, Vladan Djeric (:vladan)
honzab.moz: review-
Details | Diff | Splinter Review
nsDOMStoragePersistentDB now flushes off the main thread (Part 2) (66.30 KB, patch)
2012-12-07 04:34 PST, Vladan Djeric (:vladan)
no flags Details | Diff | Splinter Review
LocalStorage cache data structures (Part 1) - includes Honza's LATEST reviews (15.97 KB, patch)
2012-12-12 00:00 PST, Vladan Djeric (:vladan)
bugs: review-
Details | Diff | Splinter Review
nsDOMStoragePersistentDB now flushes off the main thread (Part 2) - includes Honza's LATEST reviews (63.11 KB, patch)
2012-12-12 00:03 PST, Vladan Djeric (:vladan)
bugs: review-
Details | Diff | Splinter Review
Interdiff after Honza's latest reviews (20.11 KB, patch)
2012-12-12 00:05 PST, Vladan Djeric (:vladan)
honzab.moz: feedback+
Details | Diff | Splinter Review
Interdiff after Olli's review (44.70 KB, patch)
2012-12-14 00:22 PST, Vladan Djeric (:vladan)
bugs: review+
vladan.bugzilla: feedback+
Details | Diff | Splinter Review
LocalStorage cache data structures (Part 1) - after Olli's review (16.42 KB, patch)
2012-12-14 00:23 PST, Vladan Djeric (:vladan)
no flags Details | Diff | Splinter Review
691236: nsDOMStoragePersistentDB now flushes off the main thread (Part 2) - after Olli's review (64.73 KB, patch)
2012-12-14 00:24 PST, Vladan Djeric (:vladan)
no flags Details | Diff | Splinter Review

Description Vladan Djeric (:vladan) 2012-10-30 11:00:50 PDT
LocalStorage is a major performance issue in Firefox; its fetch, flush and delete operations are in the top 10 sources of main-thread slow SQL in the browser.

We need to make flush, fetch and delete operations asynchronous
Comment 1 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-10-30 13:17:42 PDT
Async or off-main-thread?
Comment 2 David Rajchenbach-Teller [:Yoric] (please use "needinfo") 2012-10-30 13:24:54 PDT
There is a language abuse among Mozilla developers that confuses both, but I am pretty sure we want it off main thread.
Comment 3 :Ehsan Akhgari 2012-11-01 11:24:08 PDT
(In reply to David Rajchenbach Teller [:Yoric] from comment #2)
> There is a language abuse among Mozilla developers that confuses both, but I
> am pretty sure we want it off main thread.

I hope that we're not planning on introducing an I/O thread for this.  :-)
Comment 4 David Rajchenbach-Teller [:Yoric] (please use "needinfo") 2012-11-05 23:29:22 PST
If there is nobody on this, I could take a look once I have solved my sessionstore/talos issues. Vladan, do you have details about the necessary steps required to make this happen?
Comment 5 (dormant account) 2012-11-05 23:31:48 PST
(In reply to David Rajchenbach Teller [:Yoric] from comment #4)
> If there is nobody on this, I could take a look once I have solved my
> sessionstore/talos issues. Vladan, do you have details about the necessary
> steps required to make this happen?

Thanks for the offer. Vladan is on this.
Comment 6 Vladan Djeric (:vladan) 2012-11-23 10:00:56 PST
Created attachment 684732 [details] [diff] [review]
LocalStorage cache data structures (Part 1)

This is a refactor of DOM LocalStorage to use an explicit hash-table cache and to write to the DB exclusively on a background thread

A few words about the new design:

- The LocalStorage DB now uses Write-Ahead Logging. If WAL is not available, the database falls back to TRUNCATE journaling and synchronous (main thread) operation. This fallback should be very rare according to the documentation, e.g. custom VFSes that don't support memory mapped files.
- The main thread only reads, and the helper thread only writes. Thanks to WAL, they can do this concurrently
- Scopes are fetched on the main thread and then continue to live in the hash-table cache until they are evicted
- Scopes are evicted from the cache if they are clean and have not been read or written in the last 10 seconds
- A site's quota use is calculated with an SQL statement on the main thread the first time the quota is required. Afterwards, this value is cached and updated as scopes move in and out of the cache
- The helper thread flushes dirty items every 10 seconds
- There is no synchronization needed between the main thread and helper thread except on shutdown when the main thread waits for the helper thread to finish its work
- Comments in the patch headers have additional implementation details and are a good starting point for reviewing the patch

Next Steps:

- Get rid of extra caching layer in nsDOMStorage, additional testing, adding regression tests

I would like to get this version reviewed and landed as a first step, and then add prefetch as the next milestone.
Comment 7 Vladan Djeric (:vladan) 2012-11-23 10:03:11 PST
Created attachment 684733 [details] [diff] [review]
nsDOMStoragePersistentDB now flushes off the main thread (Part 2)

Second half of patch
Comment 8 (dormant account) 2012-11-25 23:22:31 PST
Are these WIP patches or did you forget to set r?
Comment 9 Vladan Djeric (:vladan) 2012-11-26 07:24:22 PST
(In reply to Taras Glek (:taras) from comment #8)
> Are these WIP patches or did you forget to set r?

These patches are ready for review but I don't know who should review them. I sent you a mail on Friday
Comment 10 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-11-26 07:37:53 PST
Mayhemer should review. And also some DOM peer. Could be me or jst or sicking or bz, I think.
Comment 11 Honza Bambas (:mayhemer) 2012-11-26 09:49:01 PST
Hmm.. first look at the first patch and it resembles what I do for the complete overhaul in bug 600307.
Comment 12 Vladan Djeric (:vladan) 2012-11-26 10:30:20 PST
(In reply to Honza Bambas (:mayhemer) from comment #11)
> Hmm.. first look at the first patch and it resembles what I do for the
> complete overhaul in bug 600307.

Yeah, I think we should merge our efforts so we don't duplicate work.

By the way, there's a bug in setting the aAppId value in nsDOMStoragePersistentDB::RemoveAllForApp, I wasn't taking into account the value of aOnlyBrowserElement. I have the fix but it's tiny change and I was thinking of uploading it after I finish ripping out the extra caching layer in DOMStorage.

Also, I ran afoul of the "expected fail" in the "test_brokenUTF-16" mochitest (i.e. bug 505708):

http://mxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/localstorage/test_brokenUTF-16.html?force=1#43

Do you know why this issue still exists in the current LocalStorage if the bug was fixed in SQLite a long time ago?
Comment 13 Honza Bambas (:mayhemer) 2012-11-26 10:42:29 PST
(In reply to Vladan Djeric (:vladan) from comment #12)
> (In reply to Honza Bambas (:mayhemer) from comment #11)
> > Hmm.. first look at the first patch and it resembles what I do for the
> > complete overhaul in bug 600307.
> 
> Yeah, I think we should merge our efforts so we don't duplicate work.

I will completely throw your work away.  I have my own caching code, sorry.  If you want this to uplift to aurora, then I'm OK to have your code in one release cycle.  But then, when my patch stabilizes, I want to completely get rid of all dom storage code as it is now including your patch (I already did that, actually).

> 
> By the way, there's a bug in setting the aAppId value in
> nsDOMStoragePersistentDB::RemoveAllForApp, I wasn't taking into account the
> value of aOnlyBrowserElement. I have the fix but it's tiny change and I was
> thinking of uploading it after I finish ripping out the extra caching layer
> in DOMStorage.

I will adopt this change in my code.

> 
> Also, I ran afoul of the "expected fail" in the "test_brokenUTF-16"
> mochitest (i.e. bug 505708):
> 
> http://mxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/
> localstorage/test_brokenUTF-16.html?force=1#43

My patch fixes any UTF-16 issues.

> 
> Do you know why this issue still exists in the current LocalStorage if the
> bug was fixed in SQLite a long time ago?

I believe it is (was!) some misuse of nsString/nsAString/nsSubstring/whatever.  My patch is using strictly nsAString and nsString for keys and values and all works well with it.
Comment 14 Honza Bambas (:mayhemer) 2012-11-26 10:43:19 PST
Vladan, did you pass your patches to try?
Comment 15 (dormant account) 2012-11-26 11:42:09 PST
(In reply to Honza Bambas (:mayhemer) from comment #13)
> (In reply to Vladan Djeric (:vladan) from comment #12)
> > (In reply to Honza Bambas (:mayhemer) from comment #11)
> > > Hmm.. first look at the first patch and it resembles what I do for the
> > > complete overhaul in bug 600307.
> > 
> > Yeah, I think we should merge our efforts so we don't duplicate work.
> 
> I will completely throw your work away.  I have my own caching code, sorry. 
> If you want this to uplift to aurora, then I'm OK to have your code in one
> release cycle.  But then, when my patch stabilizes, I want to completely get
> rid of all dom storage code as it is now including your patch (I already did
> that, actually).

I think that's fine. I believe this patch addresses a critical performance issue and we should get a good fix in soon. If there is a better patch coming in a future release, that's even better.
Comment 16 Vladan Djeric (:vladan) 2012-11-26 19:02:35 PST
Created attachment 685443 [details] [diff] [review]
Interdiff for Try build

(In reply to Honza Bambas (:mayhemer) from comment #14)
> Vladan, did you pass your patches to try?

Yes, you can see the try results here: https://tbpl.mozilla.org/?tree=Try&rev=e8f8ea08aa11
I attached an inter-diff patch of the small changes I made since the initial patch.

The B2G Try failures are infrastructure exceptions, the other 2 MochiTest failures seem unrelated to LocalStorage (intermittent failures?).

Note that my code also passes the broken UTF16 tests when the broken UTF16 values are cached in memory, but flushing to storage and then reading the same data from storage brings up the same old problem as the current LocalStorage
Comment 17 Honza Bambas (:mayhemer) 2012-11-28 10:55:25 PST
Another question: what is the motivation to uplift this to aurora and why you believe the patches are simpler to uplift rather then having a completely new impl?  (I didn't look deeper at the patches - going to do it now, so I might find answer there..)
Comment 18 (dormant account) 2012-11-28 11:50:36 PST
(In reply to Honza Bambas (:mayhemer) from comment #17)
> Another question: what is the motivation to uplift this to aurora and why
> you believe the patches are simpler to uplift rather then having a
> completely new impl?  (I didn't look deeper at the patches - going to do it
> now, so I might find answer there..)

Motivation is that this likely the single biggest source of long Firefox pauses. Uplifting this will spare our users a few weeks of suffering. 
I imagine that a targeted fix is safer to uplift than a complete rewrite.
Comment 19 Honza Bambas (:mayhemer) 2012-11-29 14:16:51 PST
I'm going to start review of these patches tomorrow.
Comment 20 Jonas Sicking (:sicking) No longer reading bugmail consistently 2012-11-30 00:10:31 PST
For what it's worth, I see no reason to uplift the patches to the aurora branch. We have the release trains for a reason. Features and fixes that don't make a train should wait for the next train unless there are special reasons such as regression fixes or fixes for new code.

That said, I don't personally have any opinion on if we should land the patches here first, or go for Honza's bigger rewrite.
Comment 21 Honza Bambas (:mayhemer) 2012-12-01 16:36:03 PST
Comment on attachment 684732 [details] [diff] [review]
LocalStorage cache data structures (Part 1)

Review of attachment 684732 [details] [diff] [review]:
-----------------------------------------------------------------

This really looks good.  Please just address the comments or object against.

Please add more detailed comments mainly on the flags and also on how the flushing operation works: that it copies (actually moves) the data about changes in a single main thread loop and builds a sqlite statement array of them that is executed on the background thread.

::: dom/src/storage/nsLocalStorageCache.cpp
@@ +19,5 @@
> +{
> +  nsScopeCache* scopeCache = nullptr;
> +  mScopeCaches.Get(aScopeName, &scopeCache);
> +  if (scopeCache)
> +    scopeCache->mAccessTime = TimeStamp::Now();

Maybe use PR_IntervalNow()?  This doesn't need to be hi-precision.  Up to you.

@@ +58,5 @@
> +  GetUsageEnumData* data = (GetUsageEnumData*)aParams;
> +  if (StringBeginsWith(aScopeName, *(data->mQuotaKey)))
> +    data->mUsage += aScopeCache->GetQuotaUsage();
> +  return PL_DHASH_NEXT;
> +}

Put into anon namespace.

@@ +82,5 @@
> +
> +  aScopeCache->DeleteScope();
> +
> +  return PL_DHASH_NEXT;
> +}

As well here and on all other places of your new code for static functions and local classes.

@@ +85,5 @@
> +  return PL_DHASH_NEXT;
> +}
> +
> +void
> +nsLocalStorageCache::DeleteMatchingScopes(const nsACString& aPattern)

MarkForDelete* ?

@@ +91,5 @@
> +  mScopeCaches.Enumerate(DeleteMatchingEnum, (void*)&aPattern);
> +}
> +
> +void
> +nsLocalStorageCache::RemoveAllScopes()

Maybe ForgetAllScopes?  Because this doesn't actually remove them from the database.

@@ +173,5 @@
> +    // If the flush failed, recreate the entire scope on disk on the next flush.
> +    // Alternatively, we could un-apply the data used in the failed flush,
> +    // but that would require backing up all the keys & values being flushed.
> +    bool isDirty = true;
> +    aScopeCache->mWasScopeDeleted = true;

Ups, won't this delete the scope from disk on the next flush?

@@ +250,5 @@
> +  mTable.Init();
> +}
> +
> +nsresult
> +nsScopeCache::AddKey(const nsAString& aKey,

Some name that doesn't imply this is used by DOM API would be good.  This is used by the fetch code to load the key into the cache.  Hard to say what is the proper name.

@@ +256,5 @@
> +                     bool aSecure)
> +{
> +  nsCacheEntry* entry = new nsCacheEntry();
> +  if (!entry)
> +    return NS_ERROR_OUT_OF_MEMORY;

You don't need this check?

@@ +334,5 @@
> +  nsCacheEntry* entry = nullptr;
> +  if (!mTable.Get(aKey, &entry)) {
> +    entry = new nsCacheEntry();
> +    if (!entry)
> +      return NS_ERROR_OUT_OF_MEMORY;

Null checks are obsolete.

@@ +340,5 @@
> +  }
> +
> +  entry->mValue = aValue;
> +  entry->mIsSecure = aSecure;
> +  entry->mIsDirty = true;

only when the value has actually changed?

@@ +344,5 @@
> +  entry->mIsDirty = true;
> +
> +  nsAutoString key(aKey);
> +  if (mDeletedKeys.find(key) != mDeletedKeys.end())
> +    mDeletedKeys.erase(key);

Nit: doesn't erase() make the check silently for you?

::: dom/src/storage/nsLocalStorageCache.h
@@ +55,5 @@
> +class nsScopeCache
> +{
> +public:
> +  nsScopeCache();
> +  nsresult AddKey(const nsAString& aKey, const nsAString& aValue, bool aSecure);

secure flag on keys should not be worked with any more, we distinct scopes by origin now, so there cannot be items in one scope that would either secure and non-secure

@@ +68,5 @@
> +
> +  /**
> +   * A single key/value pair in a scope.
> +   */
> +  class nsCacheEntry

CacheEntry or KeyEntry

@@ +118,5 @@
> +    struct ChangeSet {
> +      bool mWasDeleted;
> +      std::set<nsString>* mDeletedKeys;
> +      std::vector<nsString> mDirtyKeys;
> +      std::vector<nsScopeCache::nsCacheEntry*> mDirtyValues;

Why not to use nsTHashtable<nsStringHashkey> and nsTArray<nsString> etc?
Comment 22 Honza Bambas (:mayhemer) 2012-12-01 16:45:09 PST
Comment on attachment 684733 [details] [diff] [review]
nsDOMStoragePersistentDB now flushes off the main thread (Part 2)

Review of attachment 684733 [details] [diff] [review]:
-----------------------------------------------------------------

This is really is good!

Please add more comments, this looks complex, but after few hours studding the code it is actually relatively simple.  However, better comments would save me some of my Saturday evening...  

I really like the coalescing idea that comes out of the concept.  I will adopt that approach in my patch as well.


After this careful review I believe that after few days baking on m-c we really can uplift.

However, first I want to ask for prove this really helps performance (writes) and it doesn't cause any regressions.  I don't expect them since number of reads on the main thread is decreased.


Vladan, if you want to do a preload of a scope cache, I think you can just copy what I do here (if you want to do it then let me review a followup patch):
https://bugzilla.mozilla.org/attachment.cgi?id=686683&action=diff#a/dom/base/nsGlobalWindow.cpp_sec3

::: dom/src/storage/nsDOMStorageDBWrapper.cpp
@@ +378,2 @@
>  
>      if (!NS_SUCCEEDED(rv)) {

Please change this to NS_FAILED, thanks.

@@ +381,4 @@
>        return;
>      }
>  
> +    // TODO vdjeric

What is TODO here?

::: dom/src/storage/nsDOMStoragePersistentDB.cpp
@@ +98,5 @@
> +  // Get any remaining uncommitted data and flush it synchronously
> +  rv = PrepareForFlush();
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  // The helper thread is guaranteed to not be using the DB at this point

Because of the mFinalFlushState != FLUSH_IN_PROGRESS and fact that it can be again started only on this (main) thread?

@@ +146,5 @@
> +    mCache.MarkFlushFailed();
> +    mWasRemoveAllCalled |= mIsRemoveAllPending;
> +    mIsRemoveAllPending = false;
> +    mIsFlushPending = false;
> +  }

mIsRemoveAllPending = false; and mIsFlushPending = false; can probably be moved out of the conditions code.

@@ +752,4 @@
>  
> +  // This resets the state of the async thread
> +  // Since mIsFlushPending = false, no synchronization is necessary here
> +  mFinalFlushState = FLUSH_IN_PROGRESS;

// Since mIsFlushPending = false

...but you set it to true just a line above...


The comment may need some update.

@@ +757,5 @@
> +  return NS_OK;
> +}
> +
> +nsresult
> +nsDOMStoragePersistentDB::FlushAndEvictFromCache(bool aIsShuttingDown)

Add comment that this is the 'top-level' method called from the timer and shutdown observers.

@@ +761,5 @@
> +nsDOMStoragePersistentDB::FlushAndEvictFromCache(bool aIsShuttingDown)
> +{
> +  nsresult rv;
> +
> +  // Evict any scopes that were idle for more than TEMP_TABLE_MAX_AGE seconds

You've renamed that to MAX_IDLE_TIME.  Also, I think 10 seconds is too short.  You then have to refetch again.  But up to you.  The goal of this patch is to speedup writes and not reads.

@@ +979,5 @@
> +  }
> +}
> +
> +nsresult
> +nsDOMStoragePersistentDB::RemoveMatching(const nsACString& aPattern)

Please try to find a different name for this method, it implies it synchronously removes something but actually just creates empty scopes that will on flush make DB content for those scope to be deleted.

@@ +987,3 @@
>    nsCOMPtr<mozIStorageStatement> stmt = mStatements.GetCachedStatement(
> +    "SELECT DISTINCT(scope) FROM webappsstore2"
> +    " WHERE scope LIKE :scope"

Use either GLOB or LIKE for all your statements for consistency when possible.

@@ +1007,5 @@
> +      if (!scopeCache)
> +        return NS_ERROR_OUT_OF_MEMORY;
> +      mCache.AddScope(scopeName, scopeCache);
> +    }
> +    scopeCache->DeleteScope();

This is duplicated by consumers.  You always call mCache.DeleteMatchingScopes() after this method.  It is a bit misleading to have DeleteScope() call as well here.

@@ +1059,5 @@
> +  // Remove any matching scopes from the cache that were not on disk
> +  nsAutoCString appPrefix;
> +  appPrefix.AppendInt(aAppId);
> +  appPrefix.AppendLiteral(":");
> +  mCache.DeleteMatchingScopes(appPrefix);

Ah, now I see the missing aOnlyBrowserElement.

::: dom/src/storage/nsDOMStoragePersistentDB.h
@@ +110,5 @@
>  
>    /**
>     * Removes all keys from storage. Used when clearing storage.
>     */
> +  nsresult RemoveAll();

File style is:

nsresult
RemovaAll();

@@ +165,3 @@
>     */
> +  nsresult
> +  FetchScope(DOMStorageImpl* aStorage, nsScopeCache* aScopeCache);

The Fetch* methods should be private.

@@ +208,5 @@
> +  BindInsertKey(mozIStorageStatement* aStmt,
> +                mozIStorageBindingParamsArray** aParamArray,
> +                const nsACString& aScopeName,
> +                const nsAString& aKey,
> +                const nsCacheEntry* aEntry);

Make these anon-namespaced functions in cpp, please.

@@ +268,5 @@
> +    NS_IMETHOD Run();
> +  private:
> +    nsDOMStoragePersistentDB* mDB;
> +    nsRefPtr<FlushCallbackTask> mCallback;
> +  };

Could this class be in anon namespace in the cpp file?

@@ +284,5 @@
> +    FINAL_FLUSH_FAILED
> +  };
> +  FinalFlushState mFinalFlushState;
> +  PRLock* mShutdownLock;
> +  PRCondVar* mShutdownCondVar;

Switch to using monitor please.

@@ +301,5 @@
> +  nsLocalStorageCache mCache;
> +  nsDataHashtable<nsCStringHashKey, int32_t> mQuotaUseByUncached;
> +  bool mWasRemoveAllCalled;
> +  bool mIsRemoveAllPending;
> +  bool mIsFlushPending;

Add proper comments to all of these scary flags.
Comment 23 Vladan Djeric (:vladan) 2012-12-07 04:27:48 PST
Created attachment 689676 [details] [diff] [review]
Interdiff after Honza's review (Part 1)

(In reply to Honza Bambas (:mayhemer) from comment #21)

Applied your review comments with a few exceptions below

> @@ +173,5 @@
> > +    // If the flush failed, recreate the entire scope on disk on the next flush.
> > +    // Alternatively, we could un-apply the data used in the failed flush,
> > +    // but that would require backing up all the keys & values being flushed.
> > +    bool isDirty = true;
> > +    aScopeCache->mWasScopeDeleted = true;
> 
> Ups, won't this delete the scope from disk on the next flush?

That's right, but this is by design. After a flush fails, we don't know what exactly was being flushed as the cache might have changed, so we want to recreate the scope on disk from scratch using the current state. Consider the case where a key is deleted or an entire scope is deleted but the flush fails. The scope's cache will have no information about the items which needed to be deleted during the failed flush, so on the next flush attempt, we need to err on the side of caution and remove all the scope data from disk and then add the keys still in the scope's cache.

The delete after the failed flush is a wasteful if nothing was actually deleted, but failed flushes should be *very* rare. The alternative to this model is to back up entries before a flush but it wastes memory in common case + it's a bit more complicated.

> > +nsresult
> > +nsScopeCache::AddKey(const nsAString& aKey,
> 
> Some name that doesn't imply this is used by DOM API would be good.  This is
> used by the fetch code to load the key into the cache.  Hard to say what is
> the proper name.

Renamed it to AddEntry

> @@ +256,5 @@
> > +  nsCacheEntry* entry = new nsCacheEntry();
> > +  if (!entry)
> > +    return NS_ERROR_OUT_OF_MEMORY;
> 
> You don't need this check?
...
> Null checks are obsolete.

Isn't it potentially dangerous to forego checking for NULL pointers? e.g. bug 635705

> ::: dom/src/storage/nsLocalStorageCache.h
> @@ +55,5 @@
> > +class nsScopeCache
> > +{
> > +public:
> > +  nsScopeCache();
> > +  nsresult AddKey(const nsAString& aKey, const nsAString& aValue, bool aSecure);
> 
> secure flag on keys should not be worked with any more, we distinct scopes
> by origin now, so there cannot be items in one scope that would either
> secure and non-secure

I did notice this while writing the patch but I didn't want to create any changes in the format of the data in the DB with this patch. If I were to always write a constant value in the "secure" column, then older versions of Firefox wouldn't be able to overwrite those rows' values since "secure" fields wouldn't match. This might be important if my patch were to get backed out or if an older version of Firefox tried to use a profile modified by a newer version.

> Why not to use nsTHashtable<nsStringHashkey> and nsTArray<nsString> etc?

nsTHashtable is not iterable in a for-loop so I'd rather not use it for sets in nsLocalStorageCache, and I decided to stick with STL containers once I decided to use the iterable std::set
Comment 24 Vladan Djeric (:vladan) 2012-12-07 04:29:53 PST
Created attachment 689677 [details] [diff] [review]
Interdiff after Honza's review (Part 2)
Comment 25 Vladan Djeric (:vladan) 2012-12-07 04:32:42 PST
Created attachment 689678 [details] [diff] [review]
LocalStorage cache data structures (Part 1)
Comment 26 Vladan Djeric (:vladan) 2012-12-07 04:34:01 PST
Created attachment 689679 [details] [diff] [review]
nsDOMStoragePersistentDB now flushes off the main thread (Part 2)
Comment 27 Vladan Djeric (:vladan) 2012-12-07 04:56:53 PST
(In reply to Honza Bambas (:mayhemer) from comment #22)
> Comment on attachment 684733 [details] [diff] [review]
> nsDOMStoragePersistentDB now flushes off the main thread (Part 2)

> After this careful review I believe that after few days baking on m-c we
> really can uplift.
> 
> However, first I want to ask for prove this really helps performance
> (writes) and it doesn't cause any regressions.  I don't expect them since
> number of reads on the main thread is decreased.

The flush operation now happens off the main thread, so the benefit to writes is straightforward -- the UI "pauses" from flushing are now gone. This patch does the same amount of scope key reads (fetching scope data) + it does fewer DB reads for quota enforcement thanks to more comprehensive quota caching.

> Vladan, if you want to do a preload of a scope cache, I think you can just
> copy what I do here (if you want to do it then let me review a followup
> patch):
> https://bugzilla.mozilla.org/attachment.cgi?id=686683&action=diff#a/dom/base/
> nsGlobalWindow.cpp_sec3

I think we should use this patch as a short-term solution for improving writes so we can eliminate a major source of user pain. Pre-fetching would add complexity + it would require deeper testing across diverse environments to confirm the benefit of eliminating blocking reads is greater than the costs of additional I/O (e.g. when LocalStorage is prefetched for a page but the page never uses it).


> > +    // TODO vdjeric
> 
> What is TODO here?

Nothing, I forgot to remove this comment

> ::: dom/src/storage/nsDOMStoragePersistentDB.cpp
> @@ +98,5 @@
> > +  // Get any remaining uncommitted data and flush it synchronously
> > +  rv = PrepareForFlush();
> > +  NS_ENSURE_SUCCESS(rv, rv);
> > +
> > +  // The helper thread is guaranteed to not be using the DB at this point
> 
> Because of the mFinalFlushState != FLUSH_IN_PROGRESS and fact that it can be
> again started only on this (main) thread?

That's right. I'll clarify the comment

> @@ +761,5 @@
> > +nsDOMStoragePersistentDB::FlushAndEvictFromCache(bool aIsShuttingDown)
> > +{
> > +  nsresult rv;
> > +
> > +  // Evict any scopes that were idle for more than TEMP_TABLE_MAX_AGE seconds
> 
> [...] I think 10 seconds is too short.  You then have to refetch again.  But up to you.  The goal of this
> patch is to speedup writes and not reads.

It's a trade-off between I/O and memory use for caching, so I chose to stick with the interval that's currently being used. Note that the current Firefox code evicts scopes *unconditionally* after 10 seconds, even if they've been recently used!

> @@ +979,5 @@
> > +  }
> > +}
> > +
> > +nsresult
> > +nsDOMStoragePersistentDB::RemoveMatching(const nsACString& aPattern)
> 
> Please try to find a different name for this method, it implies it
> synchronously removes something but actually just creates empty scopes that
> will on flush make DB content for those scope to be deleted.

Renamed it to "FetchMatchingScopeNames"


> @@ +268,5 @@
> > +    NS_IMETHOD Run();
> > +  private:
> > +    nsDOMStoragePersistentDB* mDB;
> > +    nsRefPtr<FlushCallbackTask> mCallback;
> > +  };
> 
> Could this class be in anon namespace in the cpp file?

I refactored it into an anonymous namespace, but I had to expose more nsDOMStoragePersistentDB functionality as public methods
Comment 28 Honza Bambas (:mayhemer) 2012-12-07 07:57:02 PST
(In reply to Vladan Djeric (:vladan) from comment #23)
> > Ups, won't this delete the scope from disk on the next flush?
> 
> That's right, but this is by design. 

I think I understand... but I'm not sure I agree.

It will delete data from disk, but not from the cache.  At least before we throw the cache (scope) away what we do very soon and actually a bit non-deterministically from the web content point of view.  So, the data will be after a very short time corrupted (deleted) when we refetch scope from disk on next localStorage access.  

I don't think it is a good idea at all.  I'd leave the data at the state it is rather or significantly prolonged the time scope is living in memory.  E.g. I will not throw a scope cache away until all windows using it are closed.  There is a proposal to let this behave like in session-only mode then, it is better then to open some unexpected behavior.

> Isn't it potentially dangerous to forego checking for NULL pointers? e.g.
> bug 635705

new operator is infallible.  That was memory clone using NS_Alloc that is not infallible.  Please remove null check after using new.

> > secure flag on keys should not be worked with any more, we distinct scopes
> > by origin now, so there cannot be items in one scope that would either
> > secure and non-secure
> 
> I did notice this while writing the patch but I didn't want to create any

I agree and I actually wanted to remove this comment before publishing the review.

(In reply to Vladan Djeric (:vladan) from comment #27)
> The flush operation now happens off the main thread, so the benefit to
> writes is straightforward -- the UI "pauses" from flushing are now gone.
> This patch does the same amount of scope key reads (fetching scope data) +
> it does fewer DB reads for quota enforcement thanks to more comprehensive
> quota caching.

These are words, not proves.

> I think we should use this patch as a short-term solution for improving
> writes so we can eliminate a major source of user pain. Pre-fetching would
> add complexity + it would require deeper testing across diverse environments
> to confirm the benefit of eliminating blocking reads is greater than the
> costs of additional I/O (e.g. when LocalStorage is prefetched for a page but
> the page never uses it).

Agree.  This is a lot of changes already and preloading is not that simple mainly in IPC environment.

> > [...] I think 10 seconds is too short.  You then have to refetch again.  But up to you.  The goal of this
> > patch is to speedup writes and not reads.
> 
> It's a trade-off between I/O and memory use for caching, so I chose to stick
> with the interval that's currently being used. Note that the current Firefox
> code evicts scopes *unconditionally* after 10 seconds, even if they've been
> recently used!

Fair.

> > Could this class be in anon namespace in the cpp file?
> 
> I refactored it into an anonymous namespace, but I had to expose more
> nsDOMStoragePersistentDB functionality as public methods

the "could" in my question was meant "if this is little work for good benefit".  If you had to change a lot of other stuff because of this request then you probably didn't have to do it.
Comment 29 Honza Bambas (:mayhemer) 2012-12-07 12:42:54 PST
Comment on attachment 689677 [details] [diff] [review]
Interdiff after Honza's review (Part 2)

Review of attachment 689677 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/src/storage/nsDOMStoragePersistentDB.cpp
@@ +272,5 @@
>      NS_ENSURE_SUCCESS(rv, rv);
>  
> +    // Create monitor for co-ordinating shutdown between threads
> +    mShutdownMonitor = PR_NewMonitor();
> +    NS_ENSURE_TRUE(mShutdownMonitor, NS_ERROR_FAILURE);

You have the Monitor class, but leave this as is already...

@@ +510,5 @@
>    stmt = mStatements.GetCachedStatement(
>      "SELECT scope, SUM(LENGTH(key) + LENGTH(value)) "
>      "FROM ( "
>        "SELECT scope, key, value FROM webappsstore2 "
> +      "WHERE scope LIKE :quotaKey"

I think LIKE is slower since it is case-insensitive, but that is a minor nit.

@@ +1095,2 @@
>    } else {
> +    scopePattern.AppendLiteral(":");

Why not ":_:" anymore?  I'd rather leave it.
Comment 30 Honza Bambas (:mayhemer) 2012-12-07 12:44:13 PST
Comment on attachment 689678 [details] [diff] [review]
LocalStorage cache data structures (Part 1)

Review of attachment 689678 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/src/storage/nsLocalStorageCache.cpp
@@ +269,5 @@
> +
> +  // We can't evict scopes currently getting flushed in case the flush fails
> +  bool shouldEvict = (!aScopeCache->mIsDirty &&
> +                      !aScopeCache->mIsFlushPending &&
> +                      (aScopeCache->mAccessTime <= data->mMinAccessTime));

Problem is that PR_IntervalNow() may wrap.  So this condition is wrong.  

mAccessTime is epoch of the time we last accessed the scope.  That is good.  However, you have to do:

evict when: (PR_IntervalNow() - aScopeCache->mAccessTime) > PR_SecondsToInterval(MAX_IDLE_TIME)
Comment 31 Vladan Djeric (:vladan) 2012-12-10 12:05:33 PST
Per smaug's request, this is another description of the patches:

These patches move all LocalStorage writes to disk to a helper thread. LocalStorage data is still read on the main thread, but the caching rules have been refined.

How setting a key works (DB::SetKey):

- First, we make sure the scope is already cached in memory in the DB::EnsureScopeLoaded method
- If the scope is not already cached in the scope cache (nsLocalStorageCache mCache), we fetch it from disk on the main thread (DB::FetchScope) and then add the scope data to the scope cache (nsLocalStorageCache::AddScope)
- When a scope is added to the scope cache, we initialize its "last accessed" timestamp to the current time. This timestamp is used when periodically evicting cached scopes that haven't been read from or written to for a while
- Since quota enforcement is done per site (e.g. google.com) and a site can have multiple scopes (e.g. mail.google.com, news.google.com), we also need to fetch information about the site's quota usage from disk (DB::GetUsageInternal). We fetch the size of the site's scopes on disk (EnsureQuotaUsageLoaded) and cache the size of the site's data on disk in a hash (DB::mQuotaUseByUncached). 
- After the on-disk quota use is cached for a site in this way, we no longer need to read from disk to do quota enforcement -- the current quota use can always be calculated by adding the size of the data in the cached scopes (computed dynamically) to the size of the uncached scopes (fetched from DB::mQuotaUseByUncached). The value in DB::mQuotaUseByUncached is updated whenever a scope is evicted from memory or a new scope is cached into memory
- Finally, set the new value in nsScopeCache::SetKey. This method will mark the new key entry dirty and also mark the scope dirty

How getting a key works (DB::GetKeyValue):

- First, make sure the scope is cached in memory in DB::EnsureScopeLoaded
- Get a pointer to the scope cache using nsLocalStorageCache::GetScope. Calling this method will update the mLastAccessed timestamp
- Do the hash lookup in nsScopeCache::GetKey

How deleting a key works (DB::RemoveKey):

- DB::EnsureScopeLoaded + nsLocalStorageCache::GetScope + nsScopeCache::RemoveKey
- nsScopeCache::RemoveKey removes the key from the key/value hash table and adds it to the mDeletedKeys vector and marks the scope dirty 

How flushing works in the common case (DB::FlushAndEvictFromCache):

- A timer fires every 10 seconds and calls DB::FlushAndEvictFromCache
- If there is a long-running flush operation already happening on the helper thread (DB::mIsFlushPending), nothing is done and we return
- Call nsScopeCache::GetFlushData which returns pointers to dirty data in cached scopes (dirty keys, removed keys, etc). The dirty data isn't copied, instead pointers are passed around
- The dirty data is converted into SQL parameter arrays in DB::PrepareFlushStatements
- All dirty scopes have their "dirty" flag reset but their "pending flush" flag gets set (in nsScopeCache::MarkScopesPending). Additionally, the dirty flags for the individual items within the scope are reset. This allows us to keep track of additional changes that occur after the flush begins
- Set the global DB::mIsFlushPending flag
- Enqueue a new FlushTask and dispatch it to the helper thread
- At this point, the main thread can continue modifying data in the caches with no regard for the helper thread. Note that scopes with the "dirty" or "pending flush" flags *cannot* be evicted from the cache (this is important for handling failed flushes)
- The FlushTask runs on the helper thread and calls DB::Flush which binds the previously-gathered SQL parameters representing dirty data and executes the flush statements within a single transaction
- Regardless of the flush failing or succeeding, the FlushTask enqueues a FlushCallbackTask to the main thread
- The FlushCallbackTask runs on the main thread and calls DB::HandleFlushComplete. The DB::mIsFlushPending flag is reset. If the flush succeeded, the "pending flush" flag is cleared within all scopes. See further below for description of handling failed flushes

How cache eviction works (DB::FlushAndEvictFromCache):

- Eviction begins in the same method as the flushing (DB::FlushAndEvictFromCache)
- Scopes that haven't been read or written in the last 10 seconds and that don't have dirty data & are not pending a flush are evicted. Upon eviction, the quota use in mQuotaUseByUncached is increased by the size of the scope's data

How removing scopes works (DB::ClearStorage, DB::RemoveOwner, DB::RemoveAll):

- Unlike the current implementation, scope deletion from the DB is *not* a synchronous operation
- When an individual scope is cleared via localStorage.clear(), the scope's key/value hash is cleared and nsScopeCache::mWasScopeDeleted is set and the scope's dirty flag is set. On the next flush, the scope gets deleted from disk. Note again that dirty scopes can't get evicted
- When scopes are deleted according to a pattern, e.g. "Forget about this site", we first fetch the scope names that match the pattern from disk (DB::FetchMatchingScopeNames) and add entries for them in the scope cache, then we mark all cached scopes matching the pattern deleted (DB::MarkMatchingScopesDeleted) and set each scope's dirty flag. The cached site quota use in DB::mQuotaUseByUncached is removed
- When all scopes are deleted (by clearing browsing history), we clear the data in all cached scopes + clear the mQuotaUseByUncached hash + set a single global DB::mWasRemoveAllCalled flag. This global flag prevents further fetching of data from disk until the "remove all" request is flushed.

How failed flushes are handled:

- As mentioned in previous comments, failed flushes should be very rare, e.g. a profile on a network share over a flaky network. When a flush fails, the cache *accurately* represents the current state of the cached scopes in LocalStorage, so only the scopes with the "flush pending" flag need special handling. The scopes that failed flush are marked dirty in entirety and we also set their mWasScopeDeleted flag. This ensures that we don't make any assumptions about the state of the scope's data on disk. Instead, we delete any old scope data and add back the cached keys on next flush. See my replies to Honza above for a bit more detail on the rationale.
- If a flush of a "clear browsing history" operation fails, we want to re-attempt it on the next flush and we want to prevent any reads from disk until the "remove all" is successfully written out, so we have the DB::mWasRemoveAllPending flag
Comment 32 Vladan Djeric (:vladan) 2012-12-10 12:09:36 PST
Two last bits:

- This model relies on WAL for reads from the main threads to be concurrent with writes on the helper thread. If WAL is not available, it falls back to rollback journalling and synchronous operation
- On shutdown, upon receiving the "profile-before-change" event, the main thread blocks on a monitor while waiting for the helper thread to finish any outstanding work. During regular operation, the main thread *does not* make use of the monitor.
Comment 33 Vladan Djeric (:vladan) 2012-12-11 23:21:01 PST
(In reply to Honza Bambas (:mayhemer) from comment #28)
> (In reply to Vladan Djeric (:vladan) from comment #23)
> > > Ups, won't this delete the scope from disk on the next flush?
> > 
> > That's right, but this is by design. 
> 
> I think I understand... but I'm not sure I agree.

I think we're still misunderstanding each other :( But I think I understand the source of confusion now. See my answers below

> It will delete data from disk, but not from the cache.  At least before we
> throw the cache (scope) away what we do very soon and actually a bit
> non-deterministically from the web content point of view.

This is the source of confusion. This code will not cause scope data to be deleted from disk after a flush fails.

After a flush fails, the disk state has not changed at all (thanks to the transaction rollback) and the scope cache contains a correct representation of the scope's current state. So to bring disk state back in sync with the cache state, on the next flush, we simply delete all on-disk data for the scope and write out **all** the keys cached for the scope. This behavior is triggered by setting mWasScopeDeleted = true in the scope cache and then marking *all* the keys in the scope cache *dirty*! Basically, we just write out all the data in the scope cache to disk on next flush.

> So, the data will be after a very short time corrupted (deleted) when we
> refetch scope from disk on next localStorage access.  

The scope's cache has the correct state at all times. As long as the scope is cached, we will not re-fetch it from disk. And as long as a scope cache has dirty data or is pending a flush, we will not evict it. After a scope flush fails, the scope is marked dirty so it can be re-flushed (previous comment), so it is not at risk of being evicted.

> @@ +1095,2 @@
> >    } else {
> > +    scopePattern.AppendLiteral(":");
> 
> Why not ":_:" anymore?  I'd rather leave it.

Because I use the same pattern for the DB deletes and cache deletes and cache deletes don't understand wildcards. I can use a different pattern for cache deletes if you prefer

> These are words, not proves.

I wrote a simple test that exercises various aspects of localStorage functionality against a DB with 4MB data stored in a single scope. I used about:telemetry to get timing information across 10 runs. All timings are averages. All timings are for main thread operations except for flush. The test machine is my Win7 laptop with a mechanical hard drive.

These are the results:

Time to fetch 4MB of scope data from disk (LOCALDOMSTORAGE_FETCH_DOMAIN_MS):
10.3ms (old average) -> 10.3 ms (new average)

"getItem" for a single cached 0.4MB value (LOCALDOMSTORAGE_GETVALUE_MS without disk fetch):
1.3ms (old) -> 0ms (new)

"getAllKeys" for cached scope (LOCALDOMSTORAGE_GETALLKEYS_MS):
10ms -> 0ms

Calculating on-disk quota use for a scope with 4MB data (LOCALDOMSTORAGE_FETCH_QUOTA_USE_MS):
5ms -> 5ms

"setItem" for cached scope (LOCALDOMSTORAGE_SETVALUE_MS minus LOCALDOMSTORAGE_FETCH_QUOTA_USE_MS):
1.7ms -> 0ms

"removeItem" of a cached 0.4MB key/value (LOCALDOMSTORAGE_REMOVEKEY_MS):
1.3ms -> 0ms

Clearing all 4MB of data in localStorage (LOCALDOMSTORAGE_REMOVEALL_MS):
283ms -> 0ms

Time spent flushing 4MB of new scope data (LOCALDOMSTORAGE_TIMER_FLUSH_MS for non-empty flushes):
OLD: 10 flushes on main thread with 118 ms/flush on average
NEW: 10 flushes on helper thread with 47.6ms/flush on average

P.S. I cleaned up shutdown code by taking advantage of the event-loop spinning that already happens inside nsThread::Shutdown
Comment 34 Vladan Djeric (:vladan) 2012-12-12 00:00:30 PST
Created attachment 691235 [details] [diff] [review]
LocalStorage cache data structures (Part 1) - includes Honza's LATEST reviews

Applied Honza's latest feedback + greatly simplified shutdown synchronization code
Comment 35 Vladan Djeric (:vladan) 2012-12-12 00:03:35 PST
Created attachment 691236 [details] [diff] [review]
nsDOMStoragePersistentDB now flushes off the main thread (Part 2) - includes Honza's LATEST reviews

These are minor changes from the previous version. Olli, you can just look at the interdiff patch
Comment 36 Vladan Djeric (:vladan) 2012-12-12 00:05:41 PST
Created attachment 691237 [details] [diff] [review]
Interdiff after Honza's latest reviews
Comment 37 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-12-12 08:45:26 PST
Comment on attachment 691235 [details] [diff] [review]
LocalStorage cache data structures (Part 1) - includes Honza's LATEST reviews


>+
>+struct GetUsageEnumData
>+{
>+  const nsACString* mQuotaKey;
const nsCString&


>+  int32_t mUsage;
>+};
>+
>+PLDHashOperator
>+GetUsageEnum(const nsACString& aScopeName,
>+             nsScopeCache* aScopeCache,
>+             void* aParams)
>+{
>+  GetUsageEnumData* data = (GetUsageEnumData*)aParams;
>+  if (StringBeginsWith(aScopeName, *(data->mQuotaKey)))
>+    data->mUsage += aScopeCache->GetQuotaUsage();
>+  return PL_DHASH_NEXT;
>+}
>+
>+struct EvictEnumData
>+{
>+  std::vector<nsCString>* mEvicted;
nsTArray<nsCString>&

>+  std::vector<int32_t>* mEvictedSize;
nsTArray<int32_t>

>+};
>+
>+struct GetAllKeysEnumData
>+{
>+  nsTHashtable<nsSessionStorageEntry>* mKeys;
nsTHashtable<nsSessionStorageEntry>&

>+  DOMStorageImpl* mStorage;
>+};
>+
>+PLDHashOperator
>+GetAllKeysEnum(const nsAString& aKey,
>+               nsScopeCache::KeyEntry* aEntry,
>+               void* aParams)
>+{
>+  GetAllKeysEnumData* data = (GetAllKeysEnumData*)aParams;
static_cast


>+PLDHashOperator
>+MarkMatchingDeletedEnum(const nsACString& aScopeName,
>+                        nsAutoPtr<nsScopeCache>& aScopeCache,
>+                        void* aPattern)
>+{
>+  const nsACString* pattern = (const nsACString*)aPattern;
static_cast 

>+  if (!StringBeginsWith(aScopeName, *pattern))
>+    return PL_DHASH_NEXT;
Always use curly brackets with 'if'


>+GetChangedKeysEnum(const nsAString& aKey,
>+                   nsAutoPtr<nsScopeCache::KeyEntry>& aEntry,
>+                   void* aParams)
>+{
>+  if (!aEntry->mIsDirty)
>+    return PL_DHASH_NEXT;
ditto. Seems to occur several times in the patch.


>+
>+  nsLocalStorageCache::FlushData::ChangeSet* changeSet =
>+    (nsLocalStorageCache::FlushData::ChangeSet*)aParams;
static_cast


>+GetEntrySize(const nsAString& aKey,
>+             nsScopeCache::KeyEntry* aEntry,
>+             void* aParam)
>+{
>+  int32_t* usage = (int32_t*)aParam;
static_cast ... apparently the same problem happens in many places

>+nsLocalStorageCache::GetScope(const nsACString& aScopeName)
>+{
>+  nsScopeCache* scopeCache = nullptr;
>+  mScopeCaches.Get(aScopeName, &scopeCache);
>+  if (scopeCache)
>+    scopeCache->mAccessTime = PR_IntervalNow(); 
  nsScopeCache* scopeCache = nullptr;
  if (mScopeCaches.Get(aScopeName, &scopeCache)) {
    scopeCache->mAccessTime = PR_IntervalNow(); 
  }
Do we really want to use PR_IntervalNode and not timestamp?
Same also elsewhere


>+#include <vector>
>+#include <set>
Please use our own data structures.
We can guarantee how they behave, but can't control stl

>+ * 3. Flush timer fires and dirty data is collected:
I think "dirty data is collected" is a bit vague
>+  // -- Keys to be deleted from disk on next flush
>+  std::set<nsString> mDeletedKeys;
nsTHashtable<nsStringHashKey>

>+    struct ChangeSet {
{ to next line

>+      bool mWasDeleted;
>+      std::set<nsString>* mDeletedKeys;
nsTHashtable<nsStringHashKey>

>+      std::vector<nsString> mDirtyKeys;
nsTArray<nsString>
>+      std::vector<nsScopeCache::KeyEntry*> mDirtyValues;
nsTArray<nsScopeCache::KeyEntry*>


>+    };
>+    // scope's name in mScopeNames --> scope's ChangeSet in mChanged
>+    std::vector<nsCString> mScopeNames;
nsTArray<nsCString>

>+    std::vector<ChangeSet> mChanged;
nsTArray<ChangeSet>


>+  SetFlushStateEnum(const nsACString&,
Please give some param name
Comment 38 Honza Bambas (:mayhemer) 2012-12-12 10:16:30 PST
(In reply to Vladan Djeric (:vladan) from comment #33)
> I think we're still misunderstanding each other :( But I think I understand
> the source of confusion now. See my answers below
> ...
> comment), so it is not at risk of being evicted.

Now I understand what it intends to do.  But sad is that it is so hard to recognize from looking (a long time) at the code and needs to be so complicatedly explained.

This needs to be well documented in the code as well if bug 600307 should delay and this patch would survive more release cycles.

> 
> > @@ +1095,2 @@
> > >    } else {
> > > +    scopePattern.AppendLiteral(":");
> > 
> > Why not ":_:" anymore?  I'd rather leave it.
> 
> Because I use the same pattern for the DB deletes and cache deletes and
> cache deletes don't understand wildcards. I can use a different pattern for
> cache deletes if you prefer

I just prefer fewer changes when you are changing existing code and want to uplift.  This works as well, but is a not-needed change IMO.

> 
> > These are words, not proves.
> 
> I wrote a simple test that exercises various aspects of localStorage

This is what I wanted to see.  As a precondition to land this seems to be enough.  However, real world usage is always different and we cannot get data until we land this patch.
Comment 39 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-12-12 10:23:32 PST
(In reply to Honza Bambas (:mayhemer) from comment #38)
> I just prefer fewer changes when you are changing existing code and want to
> uplift.  This works as well, but is a not-needed change IMO.
FYI, I'm rather strongly against uplift. This is a major change and must get
enough time for testing. We have too many regressions in each release and we should be
very careful what kinds of patches can be landed with less than normal testing.
Comment 40 Honza Bambas (:mayhemer) 2012-12-12 10:38:42 PST
(In reply to Olli Pettay [:smaug] from comment #39)
> (In reply to Honza Bambas (:mayhemer) from comment #38)
> > I just prefer fewer changes when you are changing existing code and want to
> > uplift.  This works as well, but is a not-needed change IMO.
> FYI, I'm rather strongly against uplift. This is a major change and must get
> enough time for testing. We have too many regressions in each release and we
> should be
> very careful what kinds of patches can be landed with less than normal
> testing.


Then I don't see a need to even finish this patch and would rather immediately close this bug as WONTFIX and let Vladan spent time on reviewing the completely new implementation in bug 600307.  There is fresh patch ready for the first review, he already knows that patch, so doing a complete review should not be that complicated now.  

I was hoping the work here will go faster, I spend on my r+ a Saturday night since I was told it was "extremely urgent"...  Many different opinions from many different people, as usual what makes prioritization a mess.

Conclusion: review and land on m-c and m-a ASAP or close right now.  Otherwise this is just wasting of time.
Comment 41 Honza Bambas (:mayhemer) 2012-12-12 11:02:16 PST
Comment on attachment 691237 [details] [diff] [review]
Interdiff after Honza's latest reviews

+  PRIntervalTime idleTime = PR_IntervalToSeconds(
+    PR_IntervalNow() - aScopeCache->mAccessTime);

Nit:

static const kMaxIdleTime = PR_SecondsToInterval(MAX_IDLE_TIME);
bool evict = (PR_IntervalNow() - aScopeCache->mAccessTime) > kMakIdleTime;


It saves conversion on every call, the *To* methods may not be cheap.
Comment 42 Honza Bambas (:mayhemer) 2012-12-12 11:03:57 PST
(In reply to Olli Pettay [:smaug] from comment #37)
> Do we really want to use PR_IntervalNode and not timestamp?
> Same also elsewhere

It was my suggestion.  TimeStamp should be used only when hi precision is needed.  This is not the case.  TimeStamp is not cheap to use, mainly on WinXP, while interval time is.
Comment 43 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-12-12 11:16:05 PST
Comment on attachment 691236 [details] [diff] [review]
nsDOMStoragePersistentDB now flushes off the main thread (Part 2) - includes Honza's LATEST reviews

I'd like to see some MOZ_ASSERTs in the methods which are supposed to run in the flush thread.
Something like MOZ_ASSERT(!flushThread || NS_GetCurrentThread() == flushThread).
Add also some asserts to those methods which should be run only in the main thread.
And if possible would be really good to add assertions also to those methods which may run in both threads.
MOZ_ASSERT((stateIsX && NS_GetCurrentThread() == mainThread) || (stateIsY && flushThread && NS_GetCurrentThread() == flushThread));


>+nsDOMStoragePersistentDB::Flush()
>+{
>+  nsresult rv;
>+  Telemetry::AutoTimer<Telemetry::LOCALDOMSTORAGE_TIMER_FLUSH_MS> timer;
>+
>+  // Make the flushes atomic
>+  mozStorageTransaction transaction(mConnection, false);
>+  for (uint32_t i = 0; i < mFlushStatements.Length(); ++i)
>+  {
{ hould be in the previous line
>+  nsAutoCString stmtString(
>+    MOZ_STORAGE_UNIQUIFY_QUERY_STR "PRAGMA journal_mode = ");
>+  if (aIsWal)
>+    stmtString.AppendLiteral("wal");
>+  else
>+    stmtString.AppendLiteral("truncate");

if (expr) {
  stmt1;
} else {
  stmt2;
}

>+nsDOMStoragePersistentDB::FetchScope(DOMStorageImpl* aStorage,
>+                                     nsScopeCache* aScopeCache)
>+{
>+  if (mWasRemoveAllCalled || mIsRemoveAllPending)
>+    return NS_OK;
brackets are missing. Same also elsewhere.

>+nsDOMStoragePersistentDB::EnsureScopeLoaded(DOMStorageImpl* aStorage)
>+{
>+  nsresult rv;
declare rv when you use it first time

>+nsDOMStoragePersistentDB::EnsureQuotaUsageLoaded(const nsACString& aQuotaKey)
>+{
>+  nsresult rv;
ditto

>+struct UpdateQuotaEnumData
>+{
>+  nsCString* mEvictedScope;
s/*/&/

>+UpdateQuotaEnum(const nsACString& aQuotaKey,
>+                int32_t& aUncachedSize,
>+                void* aParams)
>+{
>+  UpdateQuotaEnumData* data = (UpdateQuotaEnumData*)aParams;
static_cast

>+nsDOMStoragePersistentDB::EvictUnusedScopes()
>+{
>+  std::vector<nsCString> evictedScopes;
>+  std::vector<int32_t> evictedSize;
Use nsTArray


>+nsresult
>+BindScope(mozIStorageStatement* aStmt,
>+          mozIStorageBindingParamsArray** aParamArray,
>+          const nsACString& aScopeName)
Hmm, out param should be the last one

>+{
>+  nsresult rv;
declare when used


>+BindScopeAndKey(mozIStorageStatement* aStmt,
>+                mozIStorageBindingParamsArray** aParamArray,
>+                const nsACString& aScopeName,
>+                const nsAString& aKey)
>+{
>+  nsresult rv;
fix rv and outparam

>+BindInsertKey(mozIStorageStatement* aStmt,
>+              mozIStorageBindingParamsArray** aParamArray,
>+              const nsACString& aScopeName,
>+              const nsAString& aKey,
>+              const nsScopeCache::KeyEntry* aEntry)
>+{
>+  nsresult rv;
ditto


>+nsDOMStoragePersistentDB::PrepareFlushStatements(const FlushData& aDirtyData)
>+{
>+  nsresult rv;
declare when used

>+  nsCOMPtr<mozIStorageBindingParamsArray> tempArray;
>+  const int32_t dirtyScopeCount = aDirtyData.mScopeNames.size();
When you move to nsTArray for mScopeNames, you should change
int32_t to uint32_t. Check the types also elsewhere.


>+InvalidateMatchingQuotaEnum(const nsACString& aQuotaKey,
>+                            int32_t&,
>+                            void* aPattern)
>+{
>+  const nsACString* pattern = (nsACString*)aPattern;
static_cast



In general looks good, but I especially would like to see those assertions.
Comment 44 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-12-12 11:31:07 PST
(In reply to Honza Bambas (:mayhemer) from comment #40)
> Then I don't see a need to even finish this patch and would rather
> immediately close this bug as WONTFIX and let Vladan spent time on reviewing
> the completely new implementation in bug 600307.  There is fresh patch ready
> for the first review, he already knows that patch, so doing a complete
> review should not be that complicated now.  
How regression risky is bug 600307? The patch at least is huge.
Should that land very early in the FF21 cycle (Jan 7) and land the less risky patch in this bug
for FF20? Note, Christmas break will reduce the time we have for fixing regressions.
Of course if bug 600307 isn't significantly riskier than the patch in this bug, we should just land it.

However, I wouldn't be surprised if some people would really like to get this fixed in Aurora.
But this is not a small change. Far from the usual changes accepted to Aurora.
Drivers need to decide.
Comment 45 Honza Bambas (:mayhemer) 2012-12-12 11:45:09 PST
It's harder to be detached from my patch in bug 600307.  I'm not that worried about DOM storage functionality (however there is probably at least one mid-range issue we will find some time after pushing to the real world).  I'm more worried about some security check I might screwed up, but that should be caught during review.

The patch here doesn't change any security checks and doesn't touch IPC, so it is definitely less risky.  However I expect there will be an issue with the functionality of DOM storage we won't catch at all during the review (as you can see, even my careful review didn't make me understand what all implications a failed flush had, so there was probably more I didn't catch.)

As also Josh Aas told me, this should land for 20 and bug 600307 for 21, since I don't believe 600307 will get polished and landed in only few days (however I'm sad about it).
Comment 46 Vladan Djeric (:vladan) 2012-12-14 00:22:28 PST
Created attachment 692210 [details] [diff] [review]
Interdiff after Olli's review

(In reply to Olli Pettay [:smaug])
> >+struct GetUsageEnumData
> >+{
> >+  const nsACString* mQuotaKey;
> const nsCString&

Is there a benefit to using a concrete string type here? I would have to copy the nsACString passed in from the DOMStorage GetUsage interface into a new nsCString to make this change

> I'd like to see some MOZ_ASSERTs in the methods which are supposed to run in the flush thread.

I added MOZ_ASSERTS, but I didn't add them to any methods that could be called from both threads because the assert condition would be too complicated. We do main-thread flushing even when we have a helper thread in certain error cases and on shutdown during the final flush.
Comment 47 Vladan Djeric (:vladan) 2012-12-14 00:23:24 PST
Created attachment 692211 [details] [diff] [review]
LocalStorage cache data structures (Part 1) - after Olli's review
Comment 48 Vladan Djeric (:vladan) 2012-12-14 00:24:23 PST
Created attachment 692212 [details] [diff] [review]
691236: nsDOMStoragePersistentDB now flushes off the main thread (Part 2) - after Olli's review
Comment 49 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-12-14 05:26:06 PST
(In reply to Vladan Djeric (:vladan) from comment #46)
> Created attachment 692210 [details] [diff] [review]
> Interdiff after Olli's review
> 
> (In reply to Olli Pettay [:smaug])
> > >+struct GetUsageEnumData
> > >+{
> > >+  const nsACString* mQuotaKey;
> > const nsCString&
> 
> Is there a benefit to using a concrete string type here?
Oops, const nsACString& is fine.
Comment 50 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-12-14 07:37:09 PST
Comment on attachment 692210 [details] [diff] [review]
Interdiff after Olli's review

Only few nits


>+  void SetFlushResult(bool aSucceeded) {
{ should be in the next line


> nsresult
> nsDOMStoragePersistentDB::Flush()
> {
>-  nsresult rv;
>   Telemetry::AutoTimer<Telemetry::LOCALDOMSTORAGE_TIMER_FLUSH_MS> timer;
> 
>   // Make the flushes atomic
>   mozStorageTransaction transaction(mConnection, false);
>+
>+  nsresult rv;
>   for (uint32_t i = 0; i < mFlushStatements.Length(); ++i)
>   {
{ should be in the previous line


> nsDOMStoragePersistentDB::RemoveAllForApp(uint32_t aAppId, bool aOnlyBrowserElement)
> {
>-  nsresult rv;
>+  nsAutoCString appIdString;
>+  appIdString.AppendInt(aAppId);
> 
>-  nsAutoCString scopePattern;
>-  scopePattern.AppendInt(aAppId);
>+  nsAutoCString scopePattern = appIdString;
>   if (aOnlyBrowserElement) {
>-    scopePattern.AppendLiteral(":t:");
>+    scopePattern += ":t:";
>   } else {
>-    scopePattern.AppendLiteral(":");
>+    scopePattern += ":_:";
>   }
Use AppendLiteral to append literals :)
AppendLiteral is tiny bit faster than +


>-  flushData->mScopeNames.push_back(nsDependentCString(aScopeName.BeginReading()));
>-  flushData->mChanged.push_back(changeSet);
>+  flushData->mScopeNames.AppendElement(
>+    nsDependentCString(aScopeName.BeginReading()));
AppendElement(nsCString(aScopeName)) should work, and is probably even faster
since with the BeginReading hack the data isn't shared.


>-  mDeletedKeys.erase(nsAutoString(aKey));
>+  //TODO: mDeletedKeys.RemoveElement(nsAutoString(aKey));
>+  mDeletedKeys.RemoveElement(aKey);
What is the todo about?


>     struct ChangeSet {
{ goes to the next line
Comment 51 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-12-14 07:43:12 PST
Comment on attachment 692211 [details] [diff] [review]
LocalStorage cache data structures (Part 1) - after Olli's review


>+  if (evict) {
>+    data->mEvicted.AppendElement(nsDependentCString(aScopeName.BeginReading()));
>+    data->mEvictedSize.AppendElement(aScopeCache->GetQuotaUsage());
>+    return PL_DHASH_REMOVE;
>+  } else {
>+    return PL_DHASH_NEXT;
>+  }
else after return


>+nsScopeCache::GetKey(const nsAString& aKey,
>+                     nsAString& aValue,
>+                     bool* aSecure) const
>+{
>+  KeyEntry* entry = nullptr;
>+  if (mTable.Get(aKey, &entry)) {
>+    aValue = entry->mValue;
>+    *aSecure = entry->mIsSecure;
>+    return true;
>+  } else {
>+    return false;
>+  }
ditto
Comment 52 Honza Bambas (:mayhemer) 2012-12-22 06:42:51 PST
So, what is the status here?
Comment 55 Vladan Djeric (:vladan) 2012-12-25 01:35:41 PST
*** Bug 704933 has been marked as a duplicate of this bug. ***
Comment 56 Bogdan Maris, QA [:bogdan_maris] 2013-01-11 08:16:30 PST
Is there a way to test this?
Comment 57 Vladan Djeric (:vladan) 2013-01-11 09:02:10 PST
Hi Bogdan, I don't think we need more testing for this change. I have already done a fair bit of correctness & performance testing and the Telemetry data from Nightly and Aurora looks to be confirming the test results.
Comment 58 Bogdan Maris, QA [:bogdan_maris] 2013-03-05 07:00:22 PST
Please change the status of the bug if necessary, thanks.
Comment 59 Vladan Djeric (:vladan) 2013-11-12 08:55:40 PST
*** Bug 599988 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.