Last Comment Bug 721320 - Improve BrowserDB / LocalBrowserDB performance
: Improve BrowserDB / LocalBrowserDB performance
Status: RESOLVED FIXED
:
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: Trunk
: All Android
: -- normal (vote)
: ---
Assigned To: Gian-Carlo Pascutto [:gcp]
:
Mentors:
Depends on: 721352
Blocks:
  Show dependency treegraph
 
Reported: 2012-01-26 02:01 PST by Gian-Carlo Pascutto [:gcp]
Modified: 2012-02-06 07:10 PST (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
fixed


Attachments
Add performance measurements for Profile Migration (4.62 KB, patch)
2012-01-26 03:02 PST, Gian-Carlo Pascutto [:gcp]
no flags Details | Diff | Review
Patch 1. Improve LocalBrowserDB perf with SQLite tweak (1.46 KB, patch)
2012-01-26 03:03 PST, Gian-Carlo Pascutto [:gcp]
lucasr.at.mozilla: review+
mak77: feedback+
akeybl: approval‑mozilla‑aurora+
Details | Diff | Review

Description Gian-Carlo Pascutto [:gcp] 2012-01-26 02:01:11 PST
Continued from bug 713283:

On a Galaxy S2 2.3.4 writing to SD storage:

I/ProfMigr(20644): Bookmarks migration took 19308 ms
I/ProfMigr(20644): Get old history took 8 ms
I/ProfMigr(20644): History query 9894 ms
I/ProfMigr(20644): DB pushing took 71490 ms
I/ProfMigr(20644): history migration took 81392 ms
I/ProfMigr(20644): Profile migration finished
I/GeckoApp(20644): Profile migration took 100788 ms

Database writes are gruesomely slow. There are a max of 250 history entries imported, each history entry causes at most 3 writes (create entry, update entry, update favicon). This means that every single write takes about a *second*. (There is actually some other overhead in decoding favicons and whatnot included in that number, but I'm pretty confident IO is the limiting factor here)

Performance on a Tab 10.1 with Honeycomb 3.2 on internal storage is more than twice as fast (45s). I think this is because of:

        public void onOpen(SQLiteDatabase db) {
            Log.d(LOGTAG, "Opening browser.db: " + db.getPath());

            // From Honeycomb on, it's possible to run several db
            // commands in parallel using multiple connections.
            if (Build.VERSION.SDK_INT >= 11) {
                db.enableWriteAheadLogging();
            }

WAL is a no-downsides performance win which we also use in desktop. But it requires >3.7.0:
http://stackoverflow.com/questions/2421189/version-of-sqlite-used-in-android

So we have to make do with older versions.
I patched this code to issue SQLite PRAGMA's in the SDK < 11 case. Performance impact is dramatic, here are the two most relevant results:

synchronous = NORMAL
journal = MEMORY

I/ProfMigr(16652): Bookmarks migration took 6504 ms
I/ProfMigr(16652): Get old history took 7 ms
I/ProfMigr(16652): query took 9209 ms
I/ProfMigr(16652): DB pushing took 22906 ms
I/ProfMigr(16652): history migration took 32123 ms
I/ProfMigr(16652): Profile migration finished
I/GeckoApp(16652): Profile migration took 38712 ms

synchronous=OFF with journal=whatever has the same performance.

synchronous = NORMAL
journal = PERSIST

I/ProfMigr(20696): Bookmarks migration took 10487 ms
I/ProfMigr(20696): Get old history took 10 ms
I/ProfMigr(20696): query took 8706 ms
I/ProfMigr(20696): DB pushing took 46062 ms
I/ProfMigr(20696): history migration took 54778 ms
I/ProfMigr(20696): Profile migration finished
I/GeckoApp(20696): Profile migration took 65579 ms

http://www.sqlite.org/pragma.html
PRAGMA synchronous = NORMAL: "When synchronous is NORMAL (1), the SQLite database engine will still sync at the most critical moments, but less often than in FULL mode. There is a very small (though non-zero) chance that a power failure at just the wrong time could corrupt the database in NORMAL mode. But in practice, you are more likely to suffer a catastrophic disk failure or some other unrecoverable hardware fault."

Reading more documentation, getting corruption in NORMAL mode requires a collision in a 32-bit checksum. That's what the "very small" chance means.

PRAGMA journal = PERSIST:
Deleting a file is an expensive operation on many systems. So as an optimization, SQLite can be configured to avoid the delete operation of section 3.11. Instead of deleting the journal file in order to commit a transaction, the file is either truncated to zero bytes in length or its header is overwritten with zeros. Truncating the file to zero length saves having to make modifications to the directory containing the file since the file is not removed from the directory. Overwriting the header has the additional savings of not having to update the length of the file (in the "inode" on many systems) and not having to deal with newly freed disk sectors. Furthermore, at the next transaction the journal will be created by overwriting existing content rather than appending new content onto the end of a file, and overwriting is often much faster than appending.

SQLite can be configured to commit transactions by overwriting the journal header with zeros instead of deleting the journal file by setting the "PERSIST" journaling mode using the journal_mode PRAGMA.

On embedded systems with synchronous filesystems, TRUNCATE results in slower behavior than PERSIST. (I verified this to be the case on Android)

Based on this, I will make a patch to put our LocalBrowserDB to use NORMAL+PERSIST, for a 50% gain in performance.

I think we should also consider a "setPerformanceMode" switch, which toggles the database to synchronous=NONE, for use during migration operations. My rationale here is that when doing this operation, the destination database will tend to be empty or have very little data. We can afford to lose it. The source database is not written so it will be recoverable, meaning we will be able to simply retry the operation with minimal or no data loss.
Comment 1 Gian-Carlo Pascutto [:gcp] 2012-01-26 02:55:33 PST
Another thing that would probably help loads is inserting everything in 1 transaction. Starting from API level 5 there is ContentProviderOperation which seems to be made to do database operations in bulk.

So that'd need to extend BrowserDB to have a "bulkUpdate" API, which constructs a ContentProviderOperation, and does newXXX on that for every row, then calls applyBatch. In LocalBrowserDB, we then override that to properly use transactions.

By the way, LocalBrowserDB is full of:

        if (Build.VERSION.SDK_INT >= 11) {
            Log.d(LOGTAG, "Beginning update transaction: " + uri);
            db.beginTransaction();
            try {
                updated = updateInTransaction(uri, values, selection, selectionArgs);
                db.setTransactionSuccessful();
                Log.d(LOGTAG, "Successful update transaction: " + uri);
            } finally {
                db.endTransaction();
            }
        } else {
            updated = updateInTransaction(uri, values, selection, selectionArgs);
        }

However, as far as I can tell transactions are available starting from API level 1?
Comment 2 Gian-Carlo Pascutto [:gcp] 2012-01-26 03:02:14 PST
Created attachment 591734 [details] [diff] [review]
Add performance measurements for Profile Migration
Comment 3 Gian-Carlo Pascutto [:gcp] 2012-01-26 03:03:33 PST
Created attachment 591735 [details] [diff] [review]
Patch 1. Improve LocalBrowserDB perf with SQLite tweak
Comment 4 Lucas Rocha (:lucasr) 2012-01-26 04:23:34 PST
Comment on attachment 591735 [details] [diff] [review]
Patch 1. Improve LocalBrowserDB perf with SQLite tweak

Review of attachment 591735 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/db/BrowserProvider.java.in
@@ +387,5 @@
>                  db.enableWriteAheadLogging();
> +            } else {
> +                // Pre-Honeycomb, we can do some lesser optimizations.
> +                Cursor cursor = db.rawQuery("PRAGMA synchronous=NORMAL", null);
> +                cursor.close();

Maybe close the cursor on a finally block just to be safe?
Comment 5 Gian-Carlo Pascutto [:gcp] 2012-01-26 07:32:36 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/e7142decacd5
Comment 6 Marco Bonardo [::mak] 2012-01-26 16:19:47 PST
Comment on attachment 591735 [details] [diff] [review]
Patch 1. Improve LocalBrowserDB perf with SQLite tweak

Review of attachment 591735 [details] [diff] [review]:
-----------------------------------------------------------------

https://hg.mozilla.org/mozilla-central/rev/e7142decacd5

which synchronous mode are you using with wal? do you use a journal_size_limit? did you tweak wal_autocheckpoint? Or is this coming from the system and you can't tweak it?

Regarding synchronous normal, it's not much safe for data durability, but may work depending on your needs, surely it does much less fsyncs than FULL. Though, if your app exits abruptly may easily corrupt the db.

journal mode persist on a phone may be fine, on desktop we were using truncate, that had good performances especially on linux.
Other interesting performance wins are locking mode exclusive (though that locks the database to a connection, that I'm not sure if it's what you want, in some case may be worth it though since avoids checking if someone is using the database before using it) and avoiding shared cache mode (so on using the cache can avoid checking other users).
Comment 7 Marco Bonardo [::mak] 2012-01-26 16:20:29 PST
Can this bug be resolved or is there more work coming?
Comment 8 Gian-Carlo Pascutto [:gcp] 2012-01-26 23:39:16 PST
>which synchronous mode are you using with wal? do you use a journal_size_limit? 
>did you tweak wal_autocheckpoint? Or is this coming from the system and you can't 
>tweak it?

It's coming from the system. I think we can override with PRAGMA queries, though. Didn't attempt tweaking WAL yet as performance was 2x faster than non-WAL, and WAL is *not* available on the majority of devices yet.

We need to be decent in non-WAL mode.

>Regarding synchronous normal, it's not much safe for data durability, but may work 
>depending on your needs, surely it does much less fsyncs than FULL. Though, if 
>your app exits abruptly may easily corrupt the db.

The SQLite documentation says its checksumming in this mode and we'll only get corruption on a 32-bit checksum collision. That should be safe enough?

Corrupting the browser.db would be bad. On the other hand, we don't have to deal with power-loss situations on the phone. Only our own crashes are a concern.

>Other interesting performance wins are locking mode exclusive (though that 
>locks the database to a connection, that I'm not sure if it's what you want,

Prolly doesn't work with Sync running.

>and avoiding shared cache mode (so on using the cache can avoid checking other 
>users).

That might be worth looking into.
Comment 9 Gian-Carlo Pascutto [:gcp] 2012-01-26 23:41:45 PST
>Can this bug be resolved or is there more work coming?

Let's keep it open as a meta until we don't suck.
Comment 10 Gian-Carlo Pascutto [:gcp] 2012-01-27 00:01:19 PST
Comment on attachment 591735 [details] [diff] [review]
Patch 1. Improve LocalBrowserDB perf with SQLite tweak

[Approval Request Comment]
Wanted to get better import performance on pre-Honeycomb devices.
Comment 11 Alex Keybl [:akeybl] 2012-01-27 16:17:59 PST
Comment on attachment 591735 [details] [diff] [review]
Patch 1. Improve LocalBrowserDB perf with SQLite tweak

[Triage Comment]
Mobile only - approved for Aurora.
Comment 12 Cameron McCormack (:heycam) 2012-01-27 18:55:51 PST
*** Bug 721934 has been marked as a duplicate of this bug. ***
Comment 13 Brad Lassey [:blassey] (use needinfo?) 2012-01-29 21:22:30 PST
(In reply to Gian-Carlo Pascutto (:gcp) from comment #9)
> >Can this bug be resolved or is there more work coming?
> 
> Let's keep it open as a meta until we don't suck.

gcp, I don't think meta bugs should have patches on them. I'd suggest filing another bug for your meta bug and closing this one out.
Comment 14 Gian-Carlo Pascutto [:gcp] 2012-01-29 22:39:17 PST
https://hg.mozilla.org/releases/mozilla-aurora/rev/0895be30b9ba
Comment 15 Gian-Carlo Pascutto [:gcp] 2012-01-30 04:52:02 PST
I realized in bug 722055 that this patch is wrong: we only take the alternate path if we're using an old build SDK, but in fact we don't care about the build version (we use very recent SDK's whenever we can); we want to use the fallback path whenever the *runtime* Android version is < 3.0.
Comment 16 Gian-Carlo Pascutto [:gcp] 2012-01-30 05:32:44 PST
It seems that Build.VERSION.SDK_INT does a check at runtime, too.
Comment 17 Brad Lassey [:blassey] (use needinfo?) 2012-01-30 11:49:14 PST
https://hg.mozilla.org/releases/mozilla-aurora/rev/0895be30b9ba

Note You need to log in before you can comment on or make changes to this bug.