The default bug view has changed. See this FAQ.

Use WAL journaling for places.sqlite

RESOLVED FIXED in mozilla2.0b9

Status

()

Toolkit
Places
RESOLVED FIXED
7 years ago
3 years ago

People

(Reporter: mak, Assigned: sdwilsh)

Tracking

(Blocks: 1 bug, {perf})

Trunk
mozilla2.0b9
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking2.0 betaN+)

Details

(Whiteboard: [fixed-in-places])

Attachments

(1 attachment, 11 obsolete attachments)

(Reporter)

Description

7 years ago
the new journal mode could reduce fsyncs without large changes on our side.
(Reporter)

Updated

7 years ago
Depends on: 581000
(Reporter)

Updated

7 years ago
Summary: Investigate using WAL journal for places.sqlite → Use WAL journaling for places.sqlite
(Reporter)

Updated

7 years ago
Assignee: nobody → mak77
Status: NEW → ASSIGNED
(Reporter)

Comment 1

7 years ago
Created attachment 459835 [details] [diff] [review]
patch v1.0
(Reporter)

Comment 2

7 years ago
Created attachment 459920 [details] [diff] [review]
patch v1.1

minor change.

Notice this applies on top of bug 552023, landing this before will require moving the shutdown journal setup to PlacesDBFlush.
Attachment #459835 - Attachment is obsolete: true
(Reporter)

Comment 3

7 years ago
Comment on attachment 459920 [details] [diff] [review]
patch v1.1

asking feedback instead of review for the dependency. It should be ready to review apart that but you could need to unbitrot it if willing to push before I'm back.
Attachment #459920 - Flags: feedback?(sdwilsh)
(Reporter)

Comment 4

7 years ago
tentative blocking for I/O (fsyncs) win.
WAL is a new journaling mode introduced in sqlite 3.7.0 that reduces I/O and fragmentation.
blocking2.0: --- → ?
(Reporter)

Comment 5

7 years ago
Created attachment 460037 [details] [diff] [review]
patch v1.2
Attachment #459920 - Attachment is obsolete: true
Attachment #460037 - Flags: feedback?(sdwilsh)
Attachment #459920 - Flags: feedback?(sdwilsh)
(Assignee)

Comment 6

7 years ago
Comment on attachment 460037 [details] [diff] [review]
patch v1.2

This looks fine.  I'll do a proper review of it too.
Attachment #460037 - Flags: review?(sdwilsh)
Attachment #460037 - Flags: feedback?(sdwilsh)
Attachment #460037 - Flags: feedback+

Comment 7

7 years ago
OS/2 guys: maybe you want to see if this patch works on OS/2. As we don't have WAL support implemented in our SQLite port, it should fall-back, but might be good to check this before this lands.

> nsresult
>+nsNavHistory::SetJournaling(const nsCString& aJournalMode) {
Just curious: is Mozilla (or places) migrating to a Java-like end-of-the line brace style?

Comment 8

7 years ago
Note that while using WAL will result in less fsyncs (one instead of three), it'll still sync on every commit.

To take full advantage, the synchronous mode should be changed to normal ("PRAGMA synchronous = NORMAL"). I believe this setting is already mapped to a hidden pref ("toolkit.storage.synchronous").

With that setting, sqlite will only fsync the database when checkpointing the journal. This trades off durability (the last seconds of changes before a crash may be lost). If I read the documentation correctly, in WAL mode it's safe to use NORMAL (it doesn't increase the risk of corruption like it supposedly does for non-WAL databases).

Finally since checkpointing does sync and will therefore block, it should be done in some background thread periodically instead of relying on the automatic mechanism which triggers when the WAL file grows to 1000 pages and will cause an unexpected hitch. Fortunately the checkpointing doesn't block the database, that is, you can read and write to it while it's taking place.
(Assignee)

Comment 9

7 years ago
(In reply to comment #7)
> OS/2 guys: maybe you want to see if this patch works on OS/2. As we don't have
> WAL support implemented in our SQLite port, it should fall-back, but might be
> good to check this before this lands.
Please let us know if we'll need to ifdef this out or make special code for you guys.

> Just curious: is Mozilla (or places) migrating to a Java-like end-of-the line
> brace style?
No, but that's something I'd catch in review.

(In reply to comment #8)
> Note that while using WAL will result in less fsyncs (one instead of three),
> it'll still sync on every commit.
Right, we are OK with having fsyncs (we want durability), but reducing them is a win.

> To take full advantage, the synchronous mode should be changed to normal
> ("PRAGMA synchronous = NORMAL"). I believe this setting is already mapped to a
> hidden pref ("toolkit.storage.synchronous").
We will not be doing that though because we've seen an increase in database corruption in the past when we had it set to that.  Places is always at synchronous = FULL.

> Finally since checkpointing does sync and will therefore block, it should be
> done in some background thread periodically instead of relying on the automatic
> mechanism which triggers when the WAL file grows to 1000 pages and will cause
> an unexpected hitch. Fortunately the checkpointing doesn't block the database,
> that is, you can read and write to it while it's taking place.
A large portion of places SQLite I/O is done on a background thread already to avoid fsyncs, writes, and reads on the main thread.
(Assignee)

Updated

7 years ago
blocking2.0: ? → betaN+

Comment 10

7 years ago
(In reply to comment #6)
> Comment on attachment 460037 [details] [diff] [review]
> patch v1.2
(In reply to comment #7)
> OS/2 guys: maybe you want to see if this patch works on OS/2. As we don't have
> WAL support implemented in our SQLite port, it should fall-back, but might be
> good to check this before this lands.
After a slight correction of the patch for bit rot it built fine, the browser comes up and stays alive. Any stress test I can do to find out whether there might be a not so obvious regression?
(Assignee)

Comment 11

7 years ago
Ideally running the tests should do it.  Also making sure history is saved when you restart.
(Assignee)

Comment 12

7 years ago
Comment on attachment 460037 [details] [diff] [review]
patch v1.2

>+++ b/toolkit/components/places/src/nsNavHistory.cpp
>@@ -142,9 +143,12 @@ using namespace mozilla::places;
> // Filename used to backup corrupt databases.
> #define DATABASE_CORRUPT_FILENAME NS_LITERAL_STRING("places.sqlite.corrupt")
> 
>-// We use the TRUNCATE journal mode to reduce the number of fsyncs.  Without
>-// this setting we had a Ts hit on Linux.  See bug 460315 for details.
>-#define DATABASE_JOURNAL_MODE "TRUNCATE"
>+// We use the WAL journal mode to reduce the number of fsyncs.
>+#define DATABASE_DEFAULT_JOURNAL_MODE NS_LITERAL_CSTRING("wal")
>+// If the system does not support WAL, or we cannot use it, we will fallback
>+// to this setting.  Other journal modes caused a Ts hit on Linux.
>+// See bug 460315 for details.
>+#define DATABASE_FALLBACK_JOURNAL_MODE NS_LITERAL_CSTRING("truncate")
We should use an enum for this at this point.

> nsresult
>+nsNavHistory::SetJournaling(const nsCString& aJournalMode) {
>+  // Note: binding seems to be not working on pragma statements.
yeah, it does not work, and I'm not sure the comment is really needed here.

>+  // WARNING:
>+  //  Don't leave unfinalized statements around till journal mode has been set,
>+  //  some journaling mode does not like them and will fail to be set.
nit: wording is a bit awkward

>+  // Be sure to set journaling after cache_size and page_size, some mode will
>+  // prevent that kind of changes.
nit: journal mode

>-    rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
>-        "PRAGMA journal_mode = MEMORY"));
>-    NS_ENSURE_SUCCESS(rv, rv);
>+    (void)SetJournaling(NS_LITERAL_CSTRING("memory"));
We used to check the result here.  I think we still should.

>-    rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
>-        "PRAGMA journal_mode = " DATABASE_JOURNAL_MODE));
>-    NS_ENSURE_SUCCESS(rv, rv);
>+    if (NS_FAILED(SetJournaling(DATABASE_DEFAULT_JOURNAL_MODE))) {
>+      (void)SetJournaling(DATABASE_FALLBACK_JOURNAL_MODE);
>+    }
Ditto.

>+    // Set back journaling to a backwards compatible value.  Newer journal modes
>+    // like WAL make the database incomatible with old versions of the browser.
>+    (void)SetJournaling(NS_LITERAL_CSTRING("memory"));
We should link to the SQLite documentation about this too.

>@@ -5483,38 +5527,14 @@ nsNavHistory::VacuumDatabase()
>                                              nsnull);
>     }
> 
>-    // Actually vacuuming a database is a slow operation, since it could take
>-    // seconds.  Part of the time is spent in updating the journal file on disk
>-    // and this is particularly bad on devices with slow I/O.  Temporary
>-    // moving the journal to memory could increase a bit the possibility of
>-    // corruption if we crash during this time, but makes the process really
>-    // faster.
>-    nsCOMPtr<mozIStorageStatement> journalToMemory;
>-    rv = mDBConn->CreateStatement(NS_LITERAL_CSTRING(
>-        "PRAGMA journal_mode = MEMORY"),
>-      getter_AddRefs(journalToMemory));
>-    NS_ENSURE_SUCCESS(rv, rv);
>-
>-    nsCOMPtr<mozIStorageStatement> vacuum;
>-    rv = mDBConn->CreateStatement(NS_LITERAL_CSTRING("VACUUM"),
>-                                  getter_AddRefs(vacuum));
>-    NS_ENSURE_SUCCESS(rv, rv);
>-
>-    nsCOMPtr<mozIStorageStatement> journalToDefault;
>-    rv = mDBConn->CreateStatement(NS_LITERAL_CSTRING(
>-        "PRAGMA journal_mode = " DATABASE_JOURNAL_MODE),
>-      getter_AddRefs(journalToDefault));
>-    NS_ENSURE_SUCCESS(rv, rv);
>-
>-    mozIStorageBaseStatement *stmts[] = {
>-      journalToMemory,
>-      vacuum,
>-      journalToDefault
>-    };
So, why are we nuking all of this?

>+++ b/toolkit/components/places/src/nsNavHistory.h
>+  nsresult SetJournaling(const nsCString& aJournalMode);
>+  nsCString mCurrentJournalMode;
Both of these should take the previously mentioned enum I think.
Attachment #460037 - Flags: review?(sdwilsh) → review-

Comment 13

7 years ago
Just a comment about the memory-journal-during-vacuum thing. The only difference between a memory journal and no journal is that the former allows you to rollback a transaction, whereas the latter doesn't. In both cases, a interruption will corrupt the database almost every time (1). A memory journal doesn't protect the database integrity at all.

In the particular case of vacuum, there are two phases (creating the clean db and copying it back). A crash during the second phase without a disk journal will corrupt the database practically 100% of the time (the way everything is rearranged during a vacuum the chances of survival are almost zero).

Moreover vacuum can't be inside any other transactions, and should have no reason to rollback (at the point where writes to the actual database file start no more disk space is needed). So a memory journal gives you exactly nothing. Either you risk the database during that time (journal off) or you don't (disk journal). Currently you do, and I don't think playing that lottery once a month or so is a reasonable game.

Also, a memory journal during vacuum has an additional risk: a vacuum journal will reach the size of the database itself. If the db is large and the machine has little RAM and starts swapping, the time needed and therefore the risk window would be widened a lot (and the user will be more likely to end the process at that time since it stops responding).

SQLite could probably be improved to vacuum faster on filesystems that support atomic rename (right now it generates a defragmented db on a temp file, and then copies it back over the original file; it could just rename the temporary file over the original one). This would also help defragment the files on-disk.

(1) http://www.sqlite.org/pragma.html#pragma_journal_mode
(Reporter)

Comment 14

7 years ago
And also the fact setting journal to another mode, vacuum, and go back to WAL will fail, because we have unfinalized statements. This will be a problem because we need to do vacuum out of wal if the page size is not the expected one (page size cannot be changed after entering wal mode).
This issue should be moved to the global vacuum component bug, most likely when a vacuum+page_size change happens we will have to fallback to truncate mode till the browser is restarted to correctly do the page_size change.

Btw, regarding the crash->corruption problems we are aware of them, but the chance of having a crash while copying back the file were smaller than the perf penalty we were paying. Actually I redid the same tests and looks like vacuum is now faster also in truncate mode (and almost as fast in wal mode), so the memory thing can go away with recent sqlite versions.
(Reporter)

Comment 15

7 years ago
(In reply to comment #12)
> >-    rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> >-        "PRAGMA journal_mode = MEMORY"));
> >-    NS_ENSURE_SUCCESS(rv, rv);
> >+    (void)SetJournaling(NS_LITERAL_CSTRING("memory"));
> We used to check the result here.  I think we still should.
> 
> >-    rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> >-        "PRAGMA journal_mode = " DATABASE_JOURNAL_MODE));
> >-    NS_ENSURE_SUCCESS(rv, rv);
> >+    if (NS_FAILED(SetJournaling(DATABASE_DEFAULT_JOURNAL_MODE))) {
> >+      (void)SetJournaling(DATABASE_FALLBACK_JOURNAL_MODE);
> >+    }
> Ditto.


I'm not sure makes sense to check failures there, if we fail setting mode we will stay to the default DELETE mode, and browser can still work, the code will already warn in debug builds.
If we check failure the database could be marked as corrupt, when maybe it's not and is instead just a filesystem not supporting that mode. That will cause us to be completely not working.
Imo warnings are enough there.
(Assignee)

Comment 16

7 years ago
Alright, I'm fine with it then.
(Reporter)

Comment 17

7 years ago
Created attachment 462339 [details] [diff] [review]
patch v2.0

Addresses comments and can land before temp tables removal.
Attachment #460037 - Attachment is obsolete: true
Attachment #462339 - Flags: review?(sdwilsh)
(Reporter)

Comment 18

7 years ago
Probably the suggestion to run checkpointing in a separate thread makes sense... but it would be hard to detect each transaction without using sqlite3_wal_hook(), so that should probably be exposed in Storage, or storage should automatically do checkpointing in a separate thread for all users (that seems to make sense)
(Reporter)

Comment 19

7 years ago
btw, some checkpointing could already happen in a separate thread... if the query that causes wal to go over limit is an async query for example, checkpointing happens in that thread... since we are supposed to use async stmts more than sync ones, the problem could be fixed by itself.
(Assignee)

Comment 20

7 years ago
Comment on attachment 462339 [details] [diff] [review]
patch v2.0

For a review with more context with the comments, please see http://reviews.visophyte.org/r/462339/.

on file: toolkit/components/places/src/nsNavHistory.h line 126
>     JOURNAL_DELETE = 0
>     // Can reduce fsyncs on Linux when journal is deleted (See bug 460315).
>     // We fallback to this mode when WAL is unavailable.
>   , JOURNAL_TRUNCATE
>     // Journal is never deleted, can cause a privacy hit.
>   , JOURNAL_PERSIST
>     // Unsafe in case of crashes on database swap or low memory.
>   , JOURNAL_MEMORY
>     // Can globally reduce number of fsyncs.  We try to use this mode by default.
>   , JOURNAL_WAL
>     // Unsafe mode due to corruption on crashes.
>   , JOURNAL_OFF

Not sure it's worthwhile to have modes here that we do not use.


r=sdwilsh
Attachment #462339 - Flags: review?(sdwilsh) → review+
(Reporter)

Updated

7 years ago
Blocks: 563538
(Reporter)

Comment 21

7 years ago
Created attachment 464813 [details] [diff] [review]
patch v2.1
Attachment #462339 - Attachment is obsolete: true
(Reporter)

Comment 22

7 years ago
Created attachment 464825 [details] [diff] [review]
patch v2.2
Attachment #464813 - Attachment is obsolete: true
(Reporter)

Comment 23

7 years ago
Created attachment 464826 [details] [diff] [review]
patch v2.3

forgot to qrefresh.
Attachment #464825 - Attachment is obsolete: true
Looks like this regressed Ts on linux:

Regression: Ts increase 25.7% on Linux Firefox
   Previous results:
       450.368 from build 20100813121430 of revision da59cce7bfa2 at 2010-08-13 12:20:20 on talos-r3-fed-018 run # 0
   New results:
       566.211 from build 20100813121912 of revision dfd35987dd82 at 2010-08-13 12:50:31 on talos-r3-fed-004 run # 0
   http://mzl.la/benerL
   http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=da59cce7bfa2&tochange=dfd35987dd82
(Reporter)

Comment 25

7 years ago
(In reply to comment #24)
> Looks like this regressed Ts on linux:

noticed, we are waiting for numbers from all platforms to decide.
(Reporter)

Comment 26

7 years ago
forgot to mark
http://hg.mozilla.org/mozilla-central/rev/dfd35987dd82
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b4
(Reporter)

Comment 27

7 years ago
Just to annotate things down, since I think it's useful for everybody, while looking around for known perf issues on ext4, I found an article saying ext4 with barriers on can regress sqlite performances, and seeing how wal and barrries are working that could be a bad interaction.
I asked catlee if it would be possible to run a Ts test on a linux box with barrier=0 on ext4, he said should be possible. If the regression is linux only that test would be really interesting.
But I can't tell more till I see other filesystems results.
(Reporter)

Comment 28

7 years ago
the global effect was Ts regressions on Linux and Windows, some of them are worrying too like a 52% on Win7...

But a 10% Ts win on OS X.

Seeing the fact max dirty profile loses less than min dirty profile, sounds like there is some fixed "setup time" to create the wal that hurts more small profiles in percentage.
Btw, due to the fact this hurts 2 systems out of 3 i'm going to backout and we should ask SQLite team if they have evidence of these times.
(Reporter)

Comment 29

7 years ago
Backed out
http://hg.mozilla.org/mozilla-central/rev/41d486ae06b2
http://hg.mozilla.org/mozilla-central/rev/2fa5b630f3bc

Shawn, can we contact someone at SQLite and try to figure out if this expensiveness on creation is expected/fixable somehow?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 30

7 years ago
To summarize, we saw the following regressions (sorted by platform):
Ts increase 25.7% on Linux Firefox
Ts, MIN Dirty Profile increase 23.8% on Linux Firefox
Ts, MED Dirty Profile increase 24.7% on Linux Firefox
Ts, MAX Dirty Profile increase 20.7% on Linux Firefox
Ts Shutdown, Cold MIN Dirty Profile increase 9.3% on Linux Firefox

Ts increase 30.8% on Linux x64 Firefox
Ts, MIN Dirty Profile increase 21.8% on Linux x64 Firefox
Ts, MED Dirty Profile increase 20.6% on Linux x64 Firefox
Ts, MAX Dirty Profile increase 21.4% on Linux x64 Firefox

Ts increase 37.3% on XP Firefox
Ts, MIN Dirty Profile increase 27.4% on XP Firefox
Ts, MED Dirty Profile increase 29.4% on XP Firefox
Ts, MAX Dirty Profile increase 28.9% on XP Firefox
Ts, Cold MIN Dirty Profile increase 21.6% on XP Firefox
Ts, Cold MED Dirty Profile increase 30.6% on XP Firefox
Ts, Cold MAX Dirty Profile increase 28% on XP Firefox

Ts increase 52.7% on Win7 Firefox
Ts, MIN Dirty Profile increase 44.4% on Win7 Firefox
Ts, MED Dirty Profile increase 28% on Win7 Firefox
Ts, MAX Dirty Profile increase 9.97% on Win7 Firefox
Ts, Cold MIN Dirty Profile increase 48.4% on Win7 Firefox
Ts, Cold MED Dirty Profile increase 26.9% on Win7 Firefox
Ts, Cold MAX Dirty Profile increase 9.8% on Win7 Firefox
Ts Shutdown increase 19.7% on Win7 Firefox
Tp4 (Memset) increase 1.67% on Win7 Firefox
Tp4 (%CPU) increase 10.4% on Win7 Firefox


And the following improvements:
Ts, MIN Dirty Profile decrease 10% on MacOSX 10.6.2 Firefox
Ts, MAX Dirty Profile decrease 8.93% on MacOSX 10.6.2 Firefox
Dromaeo (V8) increase 9.92% on XP Firefox (weird...probably unrelated to this change)
(Assignee)

Comment 31

7 years ago
SQLite folks are looking into this.
(Assignee)

Comment 32

7 years ago
Created attachment 467040 [details]
Just enabling WAL

I pushed this minimal changeset to the places branch (http://hg.mozilla.org/projects/places//rev/b3c24cff8af1) and this is the perf data comparison with changesets on mozilla-central (116f2046b9ef, 007d994cac53, 2dbb1278a15c, 34737d9895f8, and 40fa4ebeacbb) that should be a good baseline.

It looks like Ts is basically even (small regressions at worst).  What's more interesting is the Ts_generated results that show a regression for min, but med and max are OK.  Cannot really explain that yet.
(Assignee)

Comment 33

7 years ago
Created attachment 467194 [details]
Just enabling WAL (better comparisons)

OK, this fixes some issues with the compare-talos stuff and displays more information.  I also had some more talos runs kicked off this morning (not sure if windows is reporting that yet though, but mac and linux should be).  Same changesets as before.

There's only one small possible regression here on windows for Ts_generated_max.
Attachment #467040 - Attachment is obsolete: true
(Assignee)

Updated

7 years ago
Attachment #467194 - Attachment is obsolete: true
(Assignee)

Comment 34

7 years ago
OK, just going to provide links at this point instead of doing HTML attachments.

Just enabling WAL (http://hg.mozilla.org/projects/places/rev/b3c24cff8af1):
http://tinyurl.com/2fugkpq

Just enabling WAL with SQLite 3.7.1 alpha build (http://hg.mozilla.org/projects/places/rev/a8a5613f5855):
http://tinyurl.com/23rvzot

Converting the database at shutdown back to TRUNCATE on SQLite 3.7.0.1 (http://hg.mozilla.org/projects/places/rev/3760a7e09c3b):
http://tinyurl.com/24ygezf (I will be filing a bug to get two more talos runs on this change shortly)
(Assignee)

Comment 35

7 years ago
Interesting thing to note so far is that we aren't seeing the regressions from comment 30 (although we are seeing different regressions - weird).
(Assignee)

Comment 36

7 years ago
Er...fun times.  I've been testing against SQLite 3.6.23.1...

Interesting that tests were not failing.  Anywho, I just pushed mak's original change, and now I'm going to reland 3.7.1, and we'll get numbers on that.
(Assignee)

Comment 37

7 years ago
All of these are with a snapshot of 3.7.1.

mak's original set of changes:
http://tinyurl.com/27zgkqd

Just enabling WAL:
http://tinyurl.com/23rvzot

Converting the database at shutdown back to TRUNCATE:
http://tinyurl.com/27tu36p
(Assignee)

Comment 38

7 years ago
Based on comment 37, it looks like the regression is caused by switching back to TRUNCATE on shutdown.
(Assignee)

Comment 39

7 years ago
Marco, I just talked to LegNeato about the possibility of getting 3.7.1 on 1.9.2 and he said that shipping it in beta 5 and letting people beat on it for a little bit would be enough for him to be comfortable.  However, he is concerned about folks downgrading from 4.0 to 3.6, but wants to talk to Cww to see how often people downgrade.  If push comes to shove, we can do a change to places2.sqlite as well, but I think we clearly cannot do the TRUNCATE switch on shutdown.

Comment 40

7 years ago
I have done some tests about switching out of WAL on an Win XP machine. Going to MEMORY or OFF instead of TRUNCATE roughly doubles the speed of the switch. I think this is safe.

Using PRAGMA synchronous = OFF and then switching to any non-WAL mode is instantaneous, but that's probably not safe for the data if there's a crash at that very moment.
(Assignee)

Comment 41

7 years ago
(In reply to comment #40)
> I have done some tests about switching out of WAL on an Win XP machine. Going
> to MEMORY or OFF instead of TRUNCATE roughly doubles the speed of the switch. I
> think this is safe.
If a crash were to happen, we'd be out of luck then though (at least for downgrades; might be OK for staying on the same version).

> Using PRAGMA synchronous = OFF and then switching to any non-WAL mode is
> instantaneous, but that's probably not safe for the data if there's a crash at
> that very moment.
Yeah, that's risky.
(Reporter)

Comment 42

7 years ago
I think the problem is slightly different. enabling WAL has a cost clearly, but since Talos discards the best and worst result, if you switch to TRUNCATE each startup will pay the WAL cost, if you don't do that, then only the first startup will pay that cost and that result will be discarded, so you won't see a regression (but it was there just once). If the problem would be setting TRUNCATE on shutdown you'd see a regression in Ts Shutdown but not in Ts.
So, having an insight in why creating a wal journal is so expensive for SQLite would still be good.

If there is no way to speed up the wal creation on SQLite side, the only solutions to the regression on our side are:
- push sqlite 3.7.1 to 3.5.x and 3.6.x and pay the wal cost just once (or ignore 3.5.x since it's unlilely someone will downgrade to that one).
- make places2.sqlite, manually migrate history from places.sqlite and delete it (expensive from a coding point of view)
(Assignee)

Comment 43

7 years ago
(In reply to comment #42)
> I think the problem is slightly different. enabling WAL has a cost clearly, but
> since Talos discards the best and worst result, if you switch to TRUNCATE each
> startup will pay the WAL cost, if you don't do that, then only the first
> startup will pay that cost and that result will be discarded, so you won't see
> a regression (but it was there just once). If the problem would be setting
> TRUNCATE on shutdown you'd see a regression in Ts Shutdown but not in Ts.
> So, having an insight in why creating a wal journal is so expensive for SQLite
> would still be good.
Right, this is what I was trying to say.  Sorry if it came across differently.

> - push sqlite 3.7.1 to 3.5.x and 3.6.x and pay the wal cost just once (or
> ignore 3.5.x since it's unlilely someone will downgrade to that one).
I personally prefer this due to costs (and not worry about 1.9.1).
(Reporter)

Comment 44

7 years ago
Shawn, so what's current plan for sqlite 3.7.1 on 1.9.2?
Changing my patch to avoid truncate change at shutdown is pretty trivial but this is blocked by the downgrade compatibility issue.
Whiteboard: [waiting for sqlite3.7.1 on 1.9.2 branch][breaks backwards compatibility]
(Reporter)

Updated

7 years ago
Depends on: 583611
No longer depends on: 581000
(Assignee)

Comment 45

7 years ago
(In reply to comment #44)
> Shawn, so what's current plan for sqlite 3.7.1 on 1.9.2?
> Changing my patch to avoid truncate change at shutdown is pretty trivial but
> this is blocked by the downgrade compatibility issue.
Needs to bake in a beta before we can land it there.  We should take this change though but need to make sure we backport before final.
(Assignee)

Comment 46

7 years ago
Stealing this.
Assignee: mak77 → sdwilsh
(Assignee)

Updated

7 years ago
Target Milestone: mozilla2.0b4 → ---
(Assignee)

Comment 47

7 years ago
Created attachment 474235 [details] [diff] [review]
v2.4

Removes the bit rot (this isn't ready to land yet)
Attachment #464826 - Attachment is obsolete: true
(Assignee)

Comment 48

7 years ago
Created attachment 474264 [details] [diff] [review]
v2.5

This just does the startup stuff and does not convert at shutdown.  Going to have mak review this just to make sure I didn't do anything stupid, but this passes our xpchshell tests with flying colors.
Attachment #474235 - Attachment is obsolete: true
Attachment #474264 - Flags: review?(mak77)
(Assignee)

Updated

7 years ago
Status: REOPENED → ASSIGNED
Whiteboard: [waiting for sqlite3.7.1 on 1.9.2 branch][breaks backwards compatibility] → [needs review mak]
Keywords: perf

Updated

7 years ago
Blocks: 595574
(Reporter)

Comment 49

7 years ago
Comment on attachment 474264 [details] [diff] [review]
v2.5

># HG changeset patch
># Parent a293b6b40cece8a2fed0121bdef9c2b8eba72878
># User Marco Bonardo <mbonardo@mozilla.com>, Shawn Wilsher <me@shawnwilsher.com>

didn't know setting multiple users was working this way. How does this appear in pushlog?


> nsresult
>+nsNavHistory::SetJournalMode(enum JournalMode aJournalMode) {
>+  nsCAutoString journalMode;
>+  switch (aJournalMode) {
>+    case JOURNAL_DELETE:
>+      journalMode.AssignLiteral("delete");
>+      break;
>+    case JOURNAL_TRUNCATE:
>+      journalMode.AssignLiteral("truncate");
>+      break;
>+    case JOURNAL_MEMORY:
>+      journalMode.AssignLiteral("memory");
>+      break;
>+    case JOURNAL_WAL:
>+      journalMode.AssignLiteral("wal");
>+      break;
>+    default:
>+      NS_ABORT_IF_FALSE(false, "Trying to set an unknown journal mode.");

should be NS_NOTREACHED
Also it's probably better to set default to a default mode, or we will try to ser journal to an empty string

>+  NS_WARN_IF_FALSE(succeeded,
>+                   nsPrintfCString(128, "Setting journal mode failed: %s",
>+                                   PromiseFlatCString(journalMode).get()).get());
>+  if (succeeded) {
>+    mCurrentJournalMode = aJournalMode;
>+  }
>+  else {
>+    return NS_ERROR_FAILURE;
>+  }

since there is already an if (succeeded) I'd convert the NS_WARN_IF_FALSE to a NS_WARNING inside the if

>+nsresult
> nsNavHistory::InitDB()
> {

>@@ -1832,9 +1887,7 @@ nsNavHistory::MigrateV9Up(mozIStorageCon
>     // reducing write times by a half, but will temporary consume more memory
>     // and increase risks of corruption if we should crash in the middle of this
>     // update.
>-    rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
>-        "PRAGMA journal_mode = MEMORY"));
>-    NS_ENSURE_SUCCESS(rv, rv);
>+    (void)SetJournalMode(JOURNAL_MEMORY);
> 
>     rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
>         "UPDATE moz_places SET last_visit_date = "
>@@ -1844,9 +1897,12 @@ nsNavHistory::MigrateV9Up(mozIStorageCon
>     NS_ENSURE_SUCCESS(rv, rv);
> 
>     // Restore the default journal mode.
>-    rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
>-        "PRAGMA journal_mode = " DATABASE_JOURNAL_MODE));
>-    NS_ENSURE_SUCCESS(rv, rv);
>+    if (NS_FAILED(SetJournalMode(JOURNAL_WAL))) {
>+      // Ignore errors, if we fail here the database could be considered corrupt
>+      // and we won't be able to go on, even if it's just matter of a bogus file
>+      // system.  The default mode (DELETE) will be fine in such a case.
>+      (void)SetJournalMode(JOURNAL_TRUNCATE);
>+    }

I'd remove these temporary journal changes in migrateV9Up:
- first because from my testing latest sqlite versions are almost as fast, so the gain is ignorable
- second because this code path is only touched by users upgrading from FX3.5.x, that is a small percentage
- third because they are risky both for not being able to restore WAL, but also for dataloss on crashes. Since now tha gain is minor, the risk does not pay for it.


>+    // If journal mode is WAL, a VACUUM cannot upgrade page_size value.
>+    // If current page_size is not the expected one, journal mode must be
>+    // changed to a rollback one.  Once done we won't be able to go back to WAL
>+    // mode though, since unfinalized statements exist.  Just keep using
>+    // compatible mode till next restart.

from what you said, unfinalized is wrong, should be not-resetted

>     nsCOMPtr<mozIStoragePendingStatement> ps;
>     nsRefPtr<VacuumDBListener> vacuumDBListener =
>       new VacuumDBListener(mPrefBranch);
>-    rv = mDBConn->ExecuteAsync(stmts, NS_ARRAY_LENGTH(stmts),
>-                               vacuumDBListener, getter_AddRefs(ps));
>+    rv = vacuum->ExecuteAsync(nsnull, getter_AddRefs(ps));

the callback should be vacuumDBListener, not nsnull

r=me with the above fixed/answered
Attachment #474264 - Flags: review?(mak77) → review+
(Assignee)

Comment 50

7 years ago
(In reply to comment #49)
> didn't know setting multiple users was working this way. How does this appear
> in pushlog?
Shows up as the first person on the web, but hg stores it properly so if that ever gets fixed, things show up right.  On the changeset page is also shows up correctly.
(Assignee)

Updated

7 years ago
Whiteboard: [needs review mak] → [needs updated patch]
(Assignee)

Comment 51

7 years ago
Perf comparisons will show up here: http://tinyurl.com/29tcotl
(Assignee)

Comment 52

7 years ago
Created attachment 476849 [details] [diff] [review]
v2.6

Addresses review comments
Attachment #474264 - Attachment is obsolete: true
(Assignee)

Updated

7 years ago
Whiteboard: [needs updated patch] → [needs perf data]
(Assignee)

Updated

7 years ago
Depends on: 596970
(Assignee)

Comment 53

7 years ago
Also includes Tp4 numbers: http://tinyurl.com/2632hze
(Assignee)

Updated

7 years ago
Depends on: 598153
(Assignee)

Updated

7 years ago
Depends on: 598179
(In reply to comment #49)
> Comment on attachment 474264 [details] [diff] [review]
> >+      NS_ABORT_IF_FALSE(false, "Trying to set an unknown journal mode.");
> 
> should be NS_NOTREACHED

We really should be using ABORTs instead of ASSERTIONs for new code.
(Reporter)

Comment 55

7 years ago
where is this documented? If this is what we want, we should blog about it and add some deprecate annotation to nsDebug.h

btw, using NS_ABORT here is fine.
(Assignee)

Comment 56

7 years ago
(In reply to comment #54)
> We really should be using ABORTs instead of ASSERTIONs for new code.
Why?
(Assignee)

Comment 57

7 years ago
It looks to me like we get the following performance wins with this patch:
Tp4 win on windows and linux (32 and 64).
ts_cold on linux (32 and 64).

...and the following regressions:
memory usage regressions
tp4 on osx 64-bit only (32-bit is fine)

Notes:
1) the Ts generated places database results are weird and don't make any sense.  This could be due to an artifact in how the databases are created (they aren't created with WAL).  Some of the smaller sized databases are reporting regressions, and the bigger ones are reporting wins, which is not what I would expect.  I've never really trusted these numbers because of their noise, and I'm not convinced they are real regressions (getting more data points on the before and after would likely help, but is difficult to do).
2) The increase in memory usage is expected since WAL uses shared memory to write to the files (mmap on unix, and I have no idea what it uses on windows).  These are expected, but give us wins (and let us use less memory when we use more than one connection to a database like we will in bug 582703).

Comment 53 has a link to the page I'm using to compare runs and.  I'm going to ask for explicit approval to land this and to accept these regressions.  If stuff shakes out that we didn't expect on mozilla-central, then I'll obviously need to backout and do more investigation.
(Assignee)

Updated

7 years ago
Whiteboard: [needs perf data] → [needs approval]
(Assignee)

Updated

7 years ago
Attachment #476849 - Flags: approval2.0?
(Assignee)

Comment 58

7 years ago
And for the sake of interesting numbers, this is bug 573492 (WAL) and bug 552023 (kill temp tables) perf numbers: http://tinyurl.com/2dodkwm
(Assignee)

Comment 59

7 years ago
See bug 583611 comment 19 on issues with upgrading to SQLite 3.7.1 on branch.  We are in a bit of a sticky situation...
(Assignee)

Updated

7 years ago
Blocks: 599980
(Assignee)

Comment 60

7 years ago
(In reply to comment #57)
> ...and the following regressions:
> memory usage regressions
> tp4 on osx 64-bit only (32-bit is fine)
Sorry, that's ts_cold.  tp4 looks fine.
(Assignee)

Updated

7 years ago
Blocks: 600980

Comment 61

7 years ago
I might be a bit late to the table here... but I've been reading about WAL at http://www.sqlite.org/draft/wal.html and it very specifically says:

"When a reader needs a page of content, it first checks the WAL to see if that page appears there, and if so it pulls in the last copy of the page that occurs in the WAL prior to the reader's end mark. If no copy of the page exists in the WAL prior to the reader's end mark, then the page is read from the original database file. Readers can exist in separate processes, so to avoid forcing every reader to scan the entire WAL looking for pages (the WAL file can grow to multiple megabytes, depending on how often checkpoints are run), a data structure called the "wal-index" is maintained in shared memory which helps readers locate pages in the WAL quickly and with a minimum of I/O. The wal-index greatly improves the performance of readers, but the use of shared memory means that all readers must exist on the same machine. This is why the write-ahead log implementation will not work on a network filesystem."

How do we cater for people who do actually use a network filesystem? I'm sure there are folks who are using NFS...
(Assignee)

Comment 62

7 years ago
(In reply to comment #61)
> How do we cater for people who do actually use a network filesystem? I'm sure
> there are folks who are using NFS...
Firefox has never supported using the same profile in more than one running instance of Firefox, so this isn't a problem here.

Comment 63

7 years ago
Somebody want to tell this guy?

https://blogs.cs.umbc.edu/willm1/2009/02/08/firefox-3-over-nfs-sucks/

But understood... though where is this in the documentation?
(Assignee)

Comment 64

7 years ago
(In reply to comment #63)
> Somebody want to tell this guy?
> 
> https://blogs.cs.umbc.edu/willm1/2009/02/08/firefox-3-over-nfs-sucks/
That's a completely different issue.

> But understood... though where is this in the documentation?
To be clear, you can use a profile over NFS, you just can't have more than one copy of Firefox using that profile.  This applies always, regardless of where your profile is located.

Comment 65

7 years ago
Ah! OK, that makes more sense now.
(Assignee)

Comment 66

7 years ago
http://hg.mozilla.org/projects/places/rev/31c960a57b46
Whiteboard: [needs approval] → [landed in places]
(Assignee)

Updated

7 years ago
Whiteboard: [landed in places] → [fixed-in-places]
(Assignee)

Updated

7 years ago
Depends on: 602729
(Reporter)

Comment 67

7 years ago
Shawn, regarding the Ts reg we see on Linux, could still make sense to investigate comment 27, it is still possible ext4 with barriers is hurting us, and that would explain why we suffer here an not on osx or win. Also if that would be the case, I'd not care much about perf when the user explicitly decided to drop perf for data coherence.
(Assignee)

Comment 68

7 years ago
I believe that when I asked someone from releng, they said we were running ext3, not ext4 on the talos boxes.  Should probably confirm that though...
(Reporter)

Comment 69

7 years ago
well they told me the exact opposite on August, i.e. all boxes are on ext4 :(
(Assignee)

Comment 70

7 years ago
(In reply to comment #69)
> well they told me the exact opposite on August, i.e. all boxes are on ext4 :(
Alright, let's check again and then report back here :)
(Assignee)

Comment 71

7 years ago
<jhford-buildduty>	sdwilsh: fedora talos slaves have barriers enabled
(Reporter)

Comment 72

7 years ago
that "could" explain why sometimes linux shows regressions not shown by other OS, slower writes could do it.
(Assignee)

Updated

7 years ago
Blocks: 609122
Comment on attachment 476849 [details] [diff] [review]
v2.6

<johnath> sdwilsh: do I want to approve the patch on bug 573492?
<sdwilsh> johnath: it's part of the places branch...when that's ready to go in, it's going to be merged.  By itself, it's a regression, with all of our stuff...well, see http://etherpad.mozilla.com:9000/places-branch
<sdwilsh> johnath: can probably cancel the request, fwiw
Attachment #476849 - Flags: approval2.0?
(Reporter)

Comment 74

6 years ago
http://hg.mozilla.org/mozilla-central/rev/31c960a57b46
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b9
You need to log in before you can comment on or make changes to this bug.