Last Comment Bug 767025 - Store activity flags on offline cache entries in memory for better performance
: Store activity flags on offline cache entries in memory for better performance
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Networking: Cache (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla17
Assigned To: Thinker Li [:sinker]
:
Mentors:
Depends on:
Blocks: 769184 769279
  Show dependency treegraph
 
Reported: 2012-06-21 09:55 PDT by Mike Hommey [:glandium]
Modified: 2012-07-24 13:39 PDT (History)
19 users (show)
ryanvm: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-


Attachments
v1 (15.25 KB, patch)
2012-06-26 16:00 PDT, Honza Bambas (:mayhemer)
no flags Details | Diff | Review
part 1: remove flags (10.63 KB, patch)
2012-06-27 01:31 PDT, Andreas Gal :gal
no flags Details | Diff | Review
part 2: don't write back record when unbinding (3.83 KB, patch)
2012-06-27 01:32 PDT, Andreas Gal :gal
no flags Details | Diff | Review
part 3: track active entries in memory and refuse to delete them with raise(ignore) (5.80 KB, patch)
2012-06-27 01:32 PDT, Andreas Gal :gal
no flags Details | Diff | Review
part 1: remove flags (15.14 KB, patch)
2012-06-27 10:05 PDT, Thinker Li [:sinker]
honzab.moz: review+
Details | Diff | Review
part 2: track active entries in memory and refuse to delete them with raise(ignore) (10.05 KB, patch)
2012-06-27 10:06 PDT, Thinker Li [:sinker]
honzab.moz: review-
Details | Diff | Review
mozStorage:5 log for template app with patches applied (7.13 KB, text/plain)
2012-06-27 23:35 PDT, Mike Hommey [:glandium]
no flags Details
mozStorage:5 log for dialer app when placing a call with patches applied (27.92 KB, text/plain)
2012-06-27 23:44 PDT, Mike Hommey [:glandium]
no flags Details
part 1: remove flags, v2 (16.36 KB, patch)
2012-06-28 18:58 PDT, Thinker Li [:sinker]
tlee: review+
Details | Diff | Review
part 2: track active entries in memory and refuse to delete them with raise(ignore), v2 (9.94 KB, patch)
2012-06-28 19:02 PDT, Thinker Li [:sinker]
no flags Details | Diff | Review
part 2: track active entries in memory and refuse to delete them with raise(ignore), v3 (10.31 KB, patch)
2012-06-29 04:01 PDT, Thinker Li [:sinker]
honzab.moz: review-
Details | Diff | Review
Part 3: install eviction observer before deleting a entry (1.28 KB, patch)
2012-06-29 04:05 PDT, Thinker Li [:sinker]
honzab.moz: review-
Details | Diff | Review
Part 4: testcase (8.92 KB, patch)
2012-06-29 04:13 PDT, Thinker Li [:sinker]
honzab.moz: review-
Details | Diff | Review
Part 2: track active entries in memory and refuse to d elete by ignore (10.37 KB, patch)
2012-06-29 13:16 PDT, Thinker Li [:sinker]
honzab.moz: review-
Details | Diff | Review
Part 2: track active entries in memory and refuse to d elete by ignore, v2 (10.41 KB, patch)
2012-07-10 03:49 PDT, Thinker Li [:sinker]
gal: review+
Details | Diff | Review
Part 3: testcase, v2 (9.10 KB, patch)
2012-07-10 03:52 PDT, Thinker Li [:sinker]
gal: review+
Details | Diff | Review
Part 2: track active entries in memory and refuse to d elete by ignore, v3 (11.02 KB, patch)
2012-07-10 23:37 PDT, Thinker Li [:sinker]
no flags Details | Diff | Review
Part 2: track active entries in memory and refuse to d elete by ignore, v4 (10.96 KB, patch)
2012-07-12 03:07 PDT, Thinker Li [:sinker]
gal: review+
honzab.moz: review+
Details | Diff | Review
part 1: remove flags, v3 (16.37 KB, patch)
2012-07-15 22:20 PDT, Thinker Li [:sinker]
tlee: review+
Details | Diff | Review
Part 2: track active entries in memory and refuse to d elete by ignore, v5 (10.98 KB, patch)
2012-07-15 22:22 PDT, Thinker Li [:sinker]
tlee: review+
Details | Diff | Review
640559: Part 3: testcase, v3 (9.14 KB, patch)
2012-07-15 22:25 PDT, Thinker Li [:sinker]
tlee: review+
Details | Diff | Review

Description Mike Hommey [:glandium] 2012-06-21 09:55:40 PDT
Accesses to the offlinecache by b2g triggers writes to the index.sqlite journal. This obviously shouldn't happen when the cache is populated and when what is being looked for is already there.
Comment 1 Honza Bambas (:mayhemer) 2012-06-21 10:07:36 PDT
We are marking active items (=being currently held by some channel) in the database.  I think this is the source of writes.  This way of marking items directly in the database has chosen to make queries easier to build and be more optimal from point of view of SQL. 

We could have an in-memory view or joined in-memory table that would hold this info and still could be efficiently used in queries.
Comment 2 Honza Bambas (:mayhemer) 2012-06-21 10:11:10 PDT
We do UPDATE moz_cache SET Flags = 1 when an entry is activated and drop this back to zero when it is no longer active.  On start we are also dropping this flag from the database.  Clearly a candidate to move to an in-memory table.
Comment 3 Mike Hommey [:glandium] 2012-06-21 11:49:56 PDT
Relatedly, the database is not in WAL mode. Switching it in WAL mode already does make things faster, although not as fast as it would be with an in-memory table.
Comment 4 Mike Hommey [:glandium] 2012-06-21 11:51:56 PDT
(In reply to Mike Hommey [:glandium] from comment #3)
> Relatedly, the database is not in WAL mode. Switching it in WAL mode already
> does make things faster, although not as fast as it would be with an
> in-memory table.

Mmmm apparently, switching OfflineCache/index.sqlite to WAL mode crashes b2g :-/
Comment 5 Mike Hommey [:glandium] 2012-06-21 22:22:07 PDT
This is the mozStorage log I get when launching the B2G Template app:

074324840[11950]: sqlite3_trace on 8b8920 for 'SELECT ClientID, ItemType FROM moz_cache WHERE Key = 'http://template.gaiamobile.org/' ORDER BY LastFetched DESC, LastModified DESC;'
980744[890860]: sqlite3_trace on 8b8920 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/';'
980744[890860]: sqlite3_trace on 8b8920 for 'UPDATE moz_cache SET Flags = 1 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/';'
980744[890860]: sqlite3_trace on 8b8920 for 'UPDATE moz_cache SET MetaData = x'726571756573742d6d6574686f640047455400726573706f6e73652d6865616400474554202f20485454502f312e310d0a00636861727365740049534f2d383835392d3100', Flags = 0, DataSize = 286, FetchCount = 2, LastFetched = 1340342338000000, LastModified = 1340342338000000, ExpirationTime = 0 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/';'
980744[890860]: sqlite3_trace on 8b8920 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
980744[890860]: sqlite3_trace on 8b8920 for 'UPDATE moz_cache SET Flags = 1 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
980744[890860]: sqlite3_trace on 8b8920 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340342338|15' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
074324840[11950]: sqlite3_trace on 8b8920 for 'UPDATE moz_cache SET MetaData = x'726571756573742d6d6574686f640047455400726573706f6e73652d6865616400474554202f20485454502f312e310d0a00', Flags = 0, DataSize = 224, FetchCount = 2, LastFetched = 1340342338000000, LastModified = 1340279803000000, ExpirationTime = 0 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
074324840[11950]: sqlite3_trace on 8b8920 for 'CREATE TEMP TRIGGER cache_on_delete AFTER DELETE ON moz_cache FOR EACH ROW BEGIN SELECT cache_eviction_observer(  OLD.key, OLD.generation); END;'
074324840[11950]: Initialized statement 'DELETE FROM moz_cache WHERE ClientID=? AND Flags = 0;' (0x814ee0)
074324840[11950]: sqlite3_trace on 8b8920 for 'DELETE FROM moz_cache WHERE ClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340342338|15' AND Flags = 0;'
074324840[11950]: Finalizing statement 'DELETE FROM moz_cache WHERE ClientID=? AND Flags = 0;'
074324840[11950]: Initialized statement 'DELETE FROM moz_cache_groups WHERE ActiveClientID=?;' (0x814ee0)
074324840[11950]: sqlite3_trace on 8b8920 for 'DELETE FROM moz_cache_groups WHERE ActiveClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340342338|15';'
074324840[11950]: Finalizing statement 'DELETE FROM moz_cache_groups WHERE ActiveClientID=?;'
074324840[11950]: Initialized statement 'DELETE FROM moz_cache_namespaces WHERE ClientID=?' (0x814ee0)
074324840[11950]: sqlite3_trace on 8b8920 for 'DELETE FROM moz_cache_namespaces WHERE ClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340342338|15''
074324840[11950]: Finalizing statement 'DELETE FROM moz_cache_namespaces WHERE ClientID=?'
074324840[11950]: sqlite3_trace on 8b8920 for 'DROP TRIGGER cache_on_delete;'
980744[890860]: sqlite3_trace on 8b8920 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
980744[890860]: sqlite3_trace on 8b8920 for 'UPDATE moz_cache SET Flags = 1 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
980744[890860]: sqlite3_trace on 8b8920 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340342338|16' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
074324840[11950]: sqlite3_trace on 8b8920 for 'UPDATE moz_cache SET MetaData = x'726571756573742d6d6574686f640047455400726573706f6e73652d6865616400474554202f20485454502f312e310d0a00', Flags = 0, DataSize = 224, FetchCount = 3, LastFetched = 1340342338000000, LastModified = 1340279803000000, ExpirationTime = 0 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340279803|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
074324840[11950]: sqlite3_trace on 8b8920 for 'CREATE TEMP TRIGGER cache_on_delete AFTER DELETE ON moz_cache FOR EACH ROW BEGIN SELECT cache_eviction_observer(  OLD.key, OLD.generation); END;'
074324840[11950]: Initialized statement 'DELETE FROM moz_cache WHERE ClientID=? AND Flags = 0;' (0x814ee0)
074324840[11950]: sqlite3_trace on 8b8920 for 'DELETE FROM moz_cache WHERE ClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340342338|16' AND Flags = 0;'
074324840[11950]: Finalizing statement 'DELETE FROM moz_cache WHERE ClientID=? AND Flags = 0;'
074324840[11950]: Initialized statement 'DELETE FROM moz_cache_groups WHERE ActiveClientID=?;' (0x814ee0)
074324840[11950]: sqlite3_trace on 8b8920 for 'DELETE FROM moz_cache_groups WHERE ActiveClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340342338|16';'
074324840[11950]: Finalizing statement 'DELETE FROM moz_cache_groups WHERE ActiveClientID=?;'
074324840[11950]: Initialized statement 'DELETE FROM moz_cache_namespaces WHERE ClientID=?' (0x814ee0)
074324840[11950]: sqlite3_trace on 8b8920 for 'DELETE FROM moz_cache_namespaces WHERE ClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340342338|16''
074324840[11950]: Finalizing statement 'DELETE FROM moz_cache_namespaces WHERE ClientID=?'
074324840[11950]: sqlite3_trace on 8b8920 for 'DROP TRIGGER cache_on_delete;'
Comment 6 Mike Hommey [:glandium] 2012-06-25 10:42:16 PDT
Here is a testcase showing how hurtful this can be:
- In B2G on an otoro phone, open the dialer
- Type a phone number, say, 555
- Tap the green call button
- The call info panel shows up after ~5s (no matter how many subsequent times you try).

Now, a little hack to make the database updates virtually free:
- adb shell
$ stop b2g
$ mv /data/local/OfflineCache /data/local/OfflineCache.orig
$ mkdir /data/local/OfflineCache
$ mount -t tmpfs none /data/local/OfflineCache
$ for i in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do ln -s /data/local/OfflineCache.orig/$i /data/local/OfflineCache/$i; done
$ cat /data/local/OfflineCache.orig/index.sqlite > /data/local/OfflineCache/index.sqlite
$ start b2g

(In practice, this puts the OfflineCache index.sqlite database on tmpfs, but not the data itself ; this also guarantees that sqlite temporary (journal) files are created on the tmpfs)

- Try again. The call info panel now shows up after ~1s.

Honza, do you think you can take this bug and improve the situation? I'm not familiar with the code, and it would make sense that someone that is takes it further.
Comment 7 Honza Bambas (:mayhemer) 2012-06-26 05:53:49 PDT
Thanks Mike for confirming my words.  How important is this (deadline) ?
Comment 8 Thinker Li [:sinker] 2012-06-26 07:16:23 PDT
if we go to multi-processing, is it still working? (in memory)  Or we should involve shared memory!?
Comment 9 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-06-26 07:18:57 PDT
Content processes don't know about offlinecache, AFAIK.  The requests should be routed through the master process.
Comment 10 Honza Bambas (:mayhemer) 2012-06-26 07:24:53 PDT
Yes, access to the sqlite DB is made only on the chrome process.  This will "just work".
Comment 11 Andreas Gal :gal 2012-06-26 11:01:23 PDT
This is of critical importance to fix ASAP. It makes our performance miserable. Can we parallelize reads and remove the writes? Do you have an ETA for a patch?
Comment 12 Thinker Li [:sinker] 2012-06-26 13:59:44 PDT
Is there any reason that we should query database for the cache entry every time?  Can we just hash the path and check directories for the file with the hashed name.  We can create a directory for each version of apps to collect all files.  So, just query for the directory of the latest version one time, and cache it on the memory.  I think it would enhance more.
Comment 13 Honza Bambas (:mayhemer) 2012-06-26 14:51:24 PDT
Thinker, I really don't understand what you mean.  To clarify: an entry is active when a channel(s) is(are) currently working with it (i.e. writes to it, or read it).  Entries are usually active for a short time.  The active indication is used to prevent race conditions and leave entries the platform currently works with alone when e.g. evicting entries.
Comment 14 Thinker Li [:sinker] 2012-06-26 15:08:02 PDT
Honza, I known it.  I am off the topic.  I am talking about if it is possible to avoid also the queries for every cache entry.  I mean mStatement_FindEntry.  Maybe I should create a new bug for it.
Comment 15 Honza Bambas (:mayhemer) 2012-06-26 15:18:56 PDT
(In reply to Thinker Li [:sinker] from comment #14)
> Honza, I known it.  I am off the topic.  I am talking about if it is
> possible to avoid also the queries for every cache entry.  I mean
> mStatement_FindEntry.  Maybe I should create a new bug for it.

Ah, sorry, I was confused.  That should be in a different bug, please CC me.
Comment 16 Honza Bambas (:mayhemer) 2012-06-26 16:00:14 PDT
Created attachment 636918 [details] [diff] [review]
v1

Try: https://tbpl.mozilla.org/?tree=Try&rev=b17167c240be
Comment 17 Dave Camp (:dcamp) 2012-06-26 17:24:55 PDT
Comment on attachment 636918 [details] [diff] [review]
v1

Review of attachment 636918 [details] [diff] [review]:
-----------------------------------------------------------------

If the process crashes while there are multiple versions of the cache, won't you lose which one is active?
Comment 18 Dave Camp (:dcamp) 2012-06-26 17:45:03 PDT
(In reply to Dave Camp (:dcamp) from comment #17)

> If the process crashes while there are multiple versions of the cache, won't
> you lose which one is active?

Ignore that question, I was misremembering what that flag was about.
Comment 19 Dave Camp (:dcamp) 2012-06-26 17:47:28 PDT
So this patch will reduce write churn, but we'll still be updating LastFetched and FetchedCount for each entry.  Are those actually useful?  Can we stop storing those entirely, or at least batch updates to them?
Comment 20 Mike Hommey [:glandium] 2012-06-26 23:38:23 PDT
It doesn't look like it does much difference on the dialer testcase.

With this patch, launching the template application looks like this:

I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'SELECT ClientID, ItemType FROM moz_cache WHERE Key = 'http://template.gaiamobile.org/' ORDER BY LastFetched DESC, LastModified DESC;'
I/PRLog   (  299): 39734488[25e4c60]: sqlite3_trace on 20c1400 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/' AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 39734488[25e4c60]: sqlite3_trace on 20c1400 for 'INSERT OR REPLACE INTO moz_cache_active VALUES ('http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25', 'http://template.gaiamobile.org/');'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'UPDATE moz_cache SET MetaData = x'726571756573742d6d6574686f640047455400726573706f6e73652d6865616400474554202f20485454502f312e310d0a00636861727365740049534f2d383835392d3100', Flags = 0, DataSize = 286, FetchCount = 2, LastFetched = 1340778814000000, LastModified = 1340778814000000, ExpirationTime = 0 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/';'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache_active WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/';'
I/PRLog   (  299): 39734488[25e4c60]: sqlite3_trace on 20c1400 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache' AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 39734488[25e4c60]: sqlite3_trace on 20c1400 for 'INSERT OR REPLACE INTO moz_cache_active VALUES ('http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25', 'http://template.gaiamobile.org/manifest.appcache');'
I/PRLog   (  299): 39734488[25e4c60]: sqlite3_trace on 20c1400 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340778814|11' AND Key = 'http://template.gaiamobile.org/manifest.appcache' AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'UPDATE moz_cache SET MetaData = x'726571756573742d6d6574686f640047455400726573706f6e73652d6865616400474554202f20485454502f312e310d0a00', Flags = 0, DataSize = 224, FetchCount = 2, LastFetched = 1340778815000000, LastModified = 1340712152000000, ExpirationTime = 0 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache_active WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'CREATE TEMP TRIGGER cache_on_delete AFTER DELETE ON moz_cache FOR EACH ROW BEGIN SELECT cache_eviction_observer(  OLD.key, OLD.generation); END;'
I/PRLog   (  299): 1074439528[1de4950]: Initialized statement 'DELETE FROM moz_cache WHERE ClientID = ? AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );' (0x275ebc0)
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340778814|11' AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 1074439528[1de4950]: Finalizing statement 'DELETE FROM moz_cache WHERE ClientID = ? AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 1074439528[1de4950]: Initialized statement 'DELETE FROM moz_cache_groups WHERE ActiveClientID=?;' (0x275ebc0)
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache_groups WHERE ActiveClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340778814|11';'
I/PRLog   (  299): 1074439528[1de4950]: Finalizing statement 'DELETE FROM moz_cache_groups WHERE ActiveClientID=?;'
I/PRLog   (  299): 1074439528[1de4950]: Initialized statement 'DELETE FROM moz_cache_namespaces WHERE ClientID=?' (0x275ebc0)
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache_namespaces WHERE ClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340778814|11''
I/PRLog   (  299): 1074439528[1de4950]: Finalizing statement 'DELETE FROM moz_cache_namespaces WHERE ClientID=?'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DROP TRIGGER cache_on_delete;'
I/PRLog   (  299): 39734488[25e4c60]: sqlite3_trace on 20c1400 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache' AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 39734488[25e4c60]: sqlite3_trace on 20c1400 for 'INSERT OR REPLACE INTO moz_cache_active VALUES ('http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25', 'http://template.gaiamobile.org/manifest.appcache');'
I/PRLog   (  299): 39734488[25e4c60]: sqlite3_trace on 20c1400 for 'SELECT MetaData, Generation, Flags, DataSize, FetchCount, LastFetched, LastModified, ExpirationTime, ItemType FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340778814|12' AND Key = 'http://template.gaiamobile.org/manifest.appcache' AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'UPDATE moz_cache SET MetaData = x'726571756573742d6d6574686f640047455400726573706f6e73652d6865616400474554202f20485454502f312e310d0a00', Flags = 0, DataSize = 224, FetchCount = 3, LastFetched = 1340778815000000, LastModified = 1340712152000000, ExpirationTime = 0 WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache_active WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340712152|25' AND Key = 'http://template.gaiamobile.org/manifest.appcache';'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'CREATE TEMP TRIGGER cache_on_delete AFTER DELETE ON moz_cache FOR EACH ROW BEGIN SELECT cache_eviction_observer(  OLD.key, OLD.generation); END;'
I/PRLog   (  299): 1074439528[1de4950]: Initialized statement 'DELETE FROM moz_cache WHERE ClientID = ? AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );' (0x275ebc0)
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache WHERE ClientID = 'http%3A//template.gaiamobile.org/manifest.appcache|0000001340778814|12' AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 1074439528[1de4950]: Finalizing statement 'DELETE FROM moz_cache WHERE ClientID = ? AND NOT EXISTS (SELECT * FROM moz_cache_active  WHERE moz_cache.ClientID = moz_cache_active.ClientID  AND moz_cache.Key = moz_cache_active.Key );'
I/PRLog   (  299): 1074439528[1de4950]: Initialized statement 'DELETE FROM moz_cache_groups WHERE ActiveClientID=?;' (0x275ebc0)
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache_groups WHERE ActiveClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340778814|12';'
I/PRLog   (  299): 1074439528[1de4950]: Finalizing statement 'DELETE FROM moz_cache_groups WHERE ActiveClientID=?;'
I/PRLog   (  299): 1074439528[1de4950]: Initialized statement 'DELETE FROM moz_cache_namespaces WHERE ClientID=?' (0x275ebc0)
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DELETE FROM moz_cache_namespaces WHERE ClientID='http%3A//template.gaiamobile.org/manifest.appcache|0000001340778814|12''
I/PRLog   (  299): 1074439528[1de4950]: Finalizing statement 'DELETE FROM moz_cache_namespaces WHERE ClientID=?'
I/PRLog   (  299): 1074439528[1de4950]: sqlite3_trace on 20c1400 for 'DROP TRIGGER cache_on_delete;'

I can also attach the log for the dialer testcase if that's helpful.
Comment 21 Andreas Gal :gal 2012-06-26 23:43:23 PDT
LastFetched and FetchedCount seem to be used for about:cache only. We should nuke that. Honza why do we need the active flag? With the dooming we should never nuke open entries anyway. We should not do any writes during fetching, ever.
Comment 22 Andreas Gal :gal 2012-06-27 00:25:52 PDT
Also while we are at it. Can we use nsDiskCacheDevice instead of nsDiskCacheDeviceSQL for the offline cache? That might eliminate all problems as well.
Comment 23 Andreas Gal :gal 2012-06-27 00:28:40 PDT
Note: one of the reasons I am keen on avoiding the writes here is that we are sitting on a software write-leveled flash filesystem with dog slow write performance. Best case its slow, worst case we will keep the flash storage pretty quickly.
Comment 24 Andreas Gal :gal 2012-06-27 00:45:27 PDT
The right thing to do here is to delete the flag and the write-on-access fields. Rows should be locked ("mark active") in an in memory list which is then checked in the eviction observer and the observer can do a RAISE(IGNORE) to skip deleting that row.
Comment 25 Andreas Gal :gal 2012-06-27 01:31:29 PDT
Created attachment 637026 [details] [diff] [review]
part 1: remove flags
Comment 26 Andreas Gal :gal 2012-06-27 01:32:05 PDT
Created attachment 637027 [details] [diff] [review]
part 2: don't write back record when unbinding
Comment 27 Andreas Gal :gal 2012-06-27 01:32:43 PDT
Created attachment 637028 [details] [diff] [review]
part 3: track active entries in memory and refuse to delete them with raise(ignore)
Comment 28 Andreas Gal :gal 2012-06-27 01:33:27 PDT
The attached patches are completely untested but this is how I would do it. Please feel free to take the patches and run with it if you agree.
Comment 29 Thinker Li [:sinker] 2012-06-27 01:40:56 PDT
(In reply to Andreas Gal :gal from comment #22)
> Also while we are at it. Can we use nsDiskCacheDevice instead of
> nsDiskCacheDeviceSQL for the offline cache? That might eliminate all
> problems as well.

Offline cache requires versionize updating.  A new version is active only after a full update.  We still need some additional work on nsDiskCacheDevice to catch the requirements of offline cache.
Comment 30 Andreas Gal :gal 2012-06-27 01:51:45 PDT
Thinker, can you review my patches and see if they make sense? I don't know much about this code.
Comment 31 Andreas Gal :gal 2012-06-27 01:54:53 PDT
On a related note, for pages with a lot of resources we completely serialize access by doing sync sqlite queries here. This is really slow. A simple fix might be to fetch all resources with a single query and cache that and if the next fetch is from the same origin skip the query and look into the cache, otherwise evict the cache and cache the new origin. This might trash badly in corner cases but will be in almost all cases superior to what we do right now. I will talk to Thinker about this.
Comment 32 Mike Hommey [:glandium] 2012-06-27 03:16:19 PDT
(In reply to Andreas Gal :gal from comment #25)
> Created attachment 637026 [details] [diff] [review]
> part 1: remove flags

I tried part 1 & 2, and now all I get is a neterror page because system.gaiamobile.org can't be found. Which may just be a problem with using an old database (since it's generated from an old xulrunner) with the new queries. This also suggests this would fail to work after upgrades on desktop.
Comment 33 Thinker Li [:sinker] 2012-06-27 10:05:05 PDT
Created attachment 637169 [details] [diff] [review]
part 1: remove flags

fix some minor errors.
Comment 34 Thinker Li [:sinker] 2012-06-27 10:06:41 PDT
Created attachment 637170 [details] [diff] [review]
part 2: track active entries in memory and refuse to delete them with raise(ignore)

Raise ignore error correctly.
Comment 35 Thinker Li [:sinker] 2012-06-27 10:08:11 PDT
(In reply to Mike Hommey [:glandium] from comment #32)
> (In reply to Andreas Gal :gal from comment #25)
> > Created attachment 637026 [details] [diff] [review]
> > part 1: remove flags
> 
> I tried part 1 & 2, and now all I get is a neterror page because
> system.gaiamobile.org can't be found. Which may just be a problem with using
> an old database (since it's generated from an old xulrunner) with the new
> queries. This also suggests this would fail to work after upgrades on
> desktop.

I have done some updates.  Please ignore part 2, but apply part 1 and part 3.  It seems work fine for me.
Comment 36 Honza Bambas (:mayhemer) 2012-06-27 11:15:53 PDT
As Thinker suggested we cannot easily use nsDiskCacheDevice.  Btw, it writes last fetched and fetch count data too.

General question: is the cache read slow because of general read and write slowness of the device or just because of reads and writes to sqlite are slow?

Other question: if you create a patch that simply removes all these writes while fetching from the app cache (actually a simple unstable patch) does it help dramatically with the performance to prove sqlite writes are cause here?

And finally: if anyone wants to work on this, then please take this bug by changing the assignee field.  I've spent a lot of time to produce the patch and what Andreas and Thinker made is more or less the same, so we shouldn't waste time this way.

I'll take a closer look at comment 20 that could be giving interesting information.  Thanks for it, btw.
Comment 37 Mike Hommey [:glandium] 2012-06-27 23:23:43 PDT
(In reply to Thinker Li [:sinker] from comment #35)
> (In reply to Mike Hommey [:glandium] from comment #32)
> > (In reply to Andreas Gal :gal from comment #25)
> > > Created attachment 637026 [details] [diff] [review]
> > > part 1: remove flags
> > 
> > I tried part 1 & 2, and now all I get is a neterror page because
> > system.gaiamobile.org can't be found. Which may just be a problem with using
> > an old database (since it's generated from an old xulrunner) with the new
> > queries. This also suggests this would fail to work after upgrades on
> > desktop.
> 
> I have done some updates.  Please ignore part 2, but apply part 1 and part
> 3.  It seems work fine for me.

It works better than it did in comment 32. However, the performance benefit is not entirely satisfying. It cuts the wait time of the dialer test case in about a half, but that's still > 2s.
I'll attach storage logs for the template app and the dialer when placing a call.
Comment 38 Mike Hommey [:glandium] 2012-06-27 23:35:37 PDT
Created attachment 637384 [details]
mozStorage:5 log for template app with patches applied

This is from doing nothing on the main screen and tapping the template icon to the template application being displayed.
Comment 39 Mike Hommey [:glandium] 2012-06-27 23:44:38 PDT
Created attachment 637387 [details]
mozStorage:5 log for dialer app when placing a call with patches applied

This is from having the dialer app opened, already having typed "555" and tapping the call button to the call information being displayed.

One interesting thing in both logs, which i was removing in my previous copy/paste because i was only grabbing logs according to their content, is that the first thing that happens is an update of the offline cache entry for something that was displayed and probably isn't relevant anymore.

That being said, there are still plenty of update and deletes, and in the dialer case, there's even a savepoint,
Comment 40 Andreas Gal :gal 2012-06-28 02:11:31 PDT
Thinker, please update the two patches that are ready to land right now and we will take the rest of the work to a new bug. Please do that first before anything else so we can get reviews for these in parallel to the additional follow-up work.
Comment 41 Andreas Gal :gal 2012-06-28 02:16:22 PDT
Actually patches look ready, I set the flag. Will clone the bug for follow-up work.
Comment 42 Honza Bambas (:mayhemer) 2012-06-28 09:59:28 PDT
The patches are not buildable (clobber, up to date m-c):

mozStorageConnection.obj : error LNK2019: unresolved external symbol _sqlite3_result_error_code referenced in function "void __cdecl mozilla::storage::`anonym
ous namespace'::basicFunctionHelper(struct sqlite3_context *,int,struct Mem * *)" (?basicFunctionHelper@?A0xea6702a0@storage@mozilla@@YAXPAUsqlite3_context@@H
PAPAUMem@@@Z)
xul.dll : fatal error LNK1120: 1 unresolved externals
Comment 43 Honza Bambas (:mayhemer) 2012-06-28 10:26:40 PDT
Comment on attachment 637169 [details] [diff] [review]
part 1: remove flags

Review of attachment 637169 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ -1107,2 @@
>      StatementSql ( mStatement_UpdateEntrySize,   "UPDATE moz_cache SET DataSize = ? WHERE ClientID = ? AND Key = ?;" ),
> -    StatementSql ( mStatement_UpdateEntryFlags,  "UPDATE moz_cache SET Flags = ? WHERE ClientID = ? AND Key = ?;" ),

Remove also the member it self.
Comment 44 Honza Bambas (:mayhemer) 2012-06-28 10:35:44 PDT
Comment on attachment 637170 [details] [diff] [review]
part 2: track active entries in memory and refuse to delete them with raise(ignore)

Review of attachment 637170 [details] [diff] [review]:
-----------------------------------------------------------------

I don't see readdition of the code removed in part1@CreateCacheEntry.

Add a test the deletion prevention during eviction works.  You can inspire with following tests: test_offlinecache_custom-directory.js and e.g. test_bug248970_cache.js.  Best, create 3 entries, hold one of them and call the eviction API (both evict all and evict for clientid) and then check the files are still present (maybe just open input stream on the cache entry and try to read from it).  You have to cover all places of the code you are changing.  It's unfortunate we have just tests that may not test this deeply if at all (mochitests under dom/) what is not enough to catch potential regression by your changes.


Also r+ for part1 is conditioned by a good part2 (or 3?) followup patch.  Both must be landed just together.

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +1346,5 @@
>        return nsnull;
>      }
> +
> +    // lock the entry
> +    Lock(key);

Map the locking by the whole cache key (i.e. ClientID + key).

::: netwerk/cache/nsDiskCacheDeviceSQL.h
@@ +257,5 @@
>    nsCOMPtr<mozIStorageStatement>  mStatement_FindClient;
>    nsCOMPtr<mozIStorageStatement>  mStatement_FindClientByNamespace;
>    nsCOMPtr<mozIStorageStatement>  mStatement_EnumerateGroups;
>    nsCOMPtr<mozIStorageStatement>  mStatement_EnumerateGroupsTimeOrder;
> +  nsCOMPtr<mozIStorageStatement>  mStatement_RaiseIgnore;

Unused.

::: storage/src/mozStorageConnection.cpp
@@ +213,5 @@
>    }
> +  int retcode = result ? variantToSQLiteT(aCtx, result) : SQLITE_OK;
> +  if (retcode == SQLITE_IGNORE) {
> +    ::sqlite3_result_error_code(aCtx, SQLITE_IGNORE);
> +  } else if (retcode != SQLITE_OK) {

I'm not a peer for this code.

You need get this reviewed from someone else.
Comment 45 Thinker Li [:sinker] 2012-06-28 18:58:08 PDT
Created attachment 637758 [details] [diff] [review]
part 1: remove flags, v2
Comment 46 Thinker Li [:sinker] 2012-06-28 19:02:29 PDT
Created attachment 637759 [details] [diff] [review]
part 2: track active entries in memory and refuse to delete them with raise(ignore), v2

Remove unused member and Lock() with clientID + key
Comment 47 Andreas Gal :gal 2012-06-29 03:31:49 PDT
Whats the status here? Can we get the reviews and land this?
Comment 48 Thinker Li [:sinker] 2012-06-29 04:01:40 PDT
Created attachment 637838 [details] [diff] [review]
part 2: track active entries in memory and refuse to delete them with raise(ignore), v3

A minor change with testcase proved.
Comment 49 Thinker Li [:sinker] 2012-06-29 04:05:40 PDT
Created attachment 637840 [details] [diff] [review]
Part 3: install eviction observer before deleting a entry

This is necessary to remove files in real for DOOM.
Comment 50 Thinker Li [:sinker] 2012-06-29 04:13:17 PDT
Created attachment 637844 [details] [diff] [review]
Part 4: testcase

make sure all files are removed as expected.  Only inactive entry would be removed.
Comment 51 Thinker Li [:sinker] 2012-06-29 04:14:09 PDT
(In reply to Andreas Gal :gal from comment #47)
> Whats the status here? Can we get the reviews and land this?

Let go~
Comment 52 Honza Bambas (:mayhemer) 2012-06-29 09:58:24 PDT
I'm taking a look quickly.
Comment 53 Honza Bambas (:mayhemer) 2012-06-29 10:27:02 PDT
Comment on attachment 637844 [details] [diff] [review]
Part 4: testcase

Review of attachment 637844 [details] [diff] [review]:
-----------------------------------------------------------------

Just an advice: the test can be hugely simplified.  You can just write cache entries directly (open for write, take the out stream and write to it).  You don't need the whole update mechanism to fill the offline cache.  It actually just affects the test run time.
Comment 54 Honza Bambas (:mayhemer) 2012-06-29 10:27:12 PDT
Comment on attachment 637840 [details] [diff] [review]
Part 3: install eviction observer before deleting a entry

Review of attachment 637840 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +956,5 @@
>    bool hasRows;
>    rv = statement->ExecuteStep(&hasRows);
>    NS_ENSURE_SUCCESS(rv, rv);
>  
> +  evictionObserver.Apply();

Doesn't DeleteData() call take care of the file deletion?  I don't see other reason to add the eviction observer, since otherwise this all is a no-op.

Explain why exactly (in detail) why you have added this code here.
Comment 55 Honza Bambas (:mayhemer) 2012-06-29 10:27:50 PDT
Comment on attachment 637838 [details] [diff] [review]
part 2: track active entries in memory and refuse to delete them with raise(ignore), v3

Review of attachment 637838 [details] [diff] [review]:
-----------------------------------------------------------------

I'm still missing an alternative to the check for an entry being active @ CreateCacheEntry.  Actually, FindEntry has to return nsnull immediately when the cache entry under the key is found active.

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +108,3 @@
>                               " ON moz_cache FOR EACH ROW BEGIN SELECT"
>                               " cache_eviction_observer("
> +                             "  OLD.ClientID || ':' || OLD.key, OLD.generation);"

Change the method's signature and pass both components separately.

@@ +191,5 @@
> +  nsCAutoString fullKey;
> +  rv = values->GetUTF8String(0, fullKey);
> +  if (NS_FAILED(rv)) {
> +      return nsnull;
> +  }

Just curious: why can't you any more use AsSharedUTF8String?

@@ +1139,5 @@
>                                                       " AND NameSpace <= ?2 AND ?2 GLOB NameSpace || '*'"
>                                                       " ORDER BY NameSpace DESC;"),
>      StatementSql ( mStatement_InsertNamespaceEntry,  "INSERT INTO moz_cache_namespaces (ClientID, NameSpace, Data, ItemType) VALUES(?, ?, ?, ?);"),
>      StatementSql ( mStatement_EnumerateGroups,       "SELECT GroupID, ActiveClientID FROM moz_cache_groups;"),
> +    StatementSql ( mStatement_EnumerateGroupsTimeOrder, "SELECT GroupID, ActiveClientID FROM moz_cache_groups ORDER BY ActivateTimeStamp;"),

I don't think you want that trailing colon.

@@ +1473,5 @@
>    LOG(("nsOfflineCacheDevice::DoomEntry [key=%s]\n", entry->Key()->get()));
>  
> +  if (!entry->IsActive()) {
> +      Unlock(entry->Key()->get());
> +  }

Why is this needed?  Why wasn't this needed before?  Please add an explanation comment to the code.
Comment 56 Thinker Li [:sinker] 2012-06-29 12:36:16 PDT
Comment on attachment 637840 [details] [diff] [review]
Part 3: install eviction observer before deleting a entry

Review of attachment 637840 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +956,5 @@
>    bool hasRows;
>    rv = statement->ExecuteStep(&hasRows);
>    NS_ENSURE_SUCCESS(rv, rv);
>  
> +  evictionObserver.Apply();

For DoomEntry(), DeleteEntry() is called with deleteData==false.  So, DeleteEntry() is never called for DoomEntry().  The testcase in part 4 show that nsCacheSession::DoomEntry() makes call nsOfflineCacheDevice::DoomEntry() with an inactive entry. That entry would never be deactivated, so the file is never be killed.  So, I apply the observer here, to make the file be deleted only while it is not in activated.

By the way, I found there are easier ways to fix it.  Just check if an entry is active, call DeleteEntry() with deleteData == false for active entry and deleteData == true for inactive entry.  Another way is always activate entries at nsCacheService.
Comment 57 Thinker Li [:sinker] 2012-06-29 13:16:20 PDT
Created attachment 637983 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore

Change the way of handling DoomEntry for inactive entries.
Comment 58 Thinker Li [:sinker] 2012-06-30 10:28:08 PDT
(In reply to Honza Bambas (:mayhemer) from comment #53)
> Comment on attachment 637844 [details] [diff] [review]
> Part 4: testcase
> 
> Review of attachment 637844 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Just an advice: the test can be hugely simplified.  You can just write cache
> entries directly (open for write, take the out stream and write to it).  You
> don't need the whole update mechanism to fill the offline cache.  It
> actually just affects the test run time.

I think it helps to figure interactive behavior between relative modules.
For example, with this testcase, I find nsCacheService will Doom an inactive entry.
Comment 59 Honza Bambas (:mayhemer) 2012-07-02 05:56:13 PDT
(In reply to Thinker Li [:sinker] from comment #58)
> I think it helps to figure interactive behavior between relative modules.
> For example, with this testcase, I find nsCacheService will Doom an inactive
> entry.

Actually, I'm against this argument.  Your test should be as simple and narrowed to check only your code changes as possible.  We have other tests (mochi) for interaction.  I don't see a reason why to involve the offline cache update service here.  But it is up to you to spent additional time on changing the test.  This is not a review requirement.
Comment 60 Andreas Gal :gal 2012-07-03 03:17:54 PDT
Is the patch good to go minus the test?
Comment 61 Honza Bambas (:mayhemer) 2012-07-03 08:31:34 PDT
(In reply to Andreas Gal :gal from comment #60)
> Is the patch good to go minus the test?

No, I need to finish review of the part2 patch.  Part1 cannot be landed w/o it, actually, I would like to land them merged as a single change set.

Since this is relatively urgent, the test is not needed to land with it, but I will watch for landing it ASAP after that.
Comment 62 Andreas Gal :gal 2012-07-04 16:41:50 PDT
Honza, how is that review coming along? We really need to make progress on this.
Comment 63 Honza Bambas (:mayhemer) 2012-07-05 12:33:30 PDT
Comment on attachment 637983 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore

Review of attachment 637983 [details] [diff] [review]:
-----------------------------------------------------------------

r-

Emphasizing again: *removal of code from CreateCacheEntry in part1 patch has not been readded in part2 with the new implementation.  FindEntry has to return null when there is an active entry under the demanded key.*

This may be either wrong (then please anybody comment with explanation) or simply has not been addressed.

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +190,5 @@
>    PRUint32 valueLen;
> +  const char *clientID = values->AsSharedUTF8String(0, &valueLen);
> +  const char *key = values->AsSharedUTF8String(1, &valueLen);
> +  nsCAutoString fullKey(clientID);
> +  fullKey.Append(key);

There is a colon (':') between clientID and the key.  Thus, this cannot work at all.  If you didn't catch this, then there is probably something wrong with the test.

@@ +1471,5 @@
>  
>    // We can go ahead and delete the corresponding row in our table,
>    // but we must not delete the file on disk until we are deactivated.
> +  // In another word, the file should be delete if the entry is
> +  // deactivated.

"In another words, the file should be deleted only if the entry had been deactivated."

(Still may not be 100% correct English, though.)

::: netwerk/cache/nsDiskCacheDeviceSQL.h
@@ +162,5 @@
>                                                 char ***keys);
>  
> +  bool                    IsLocked(const char *key);
> +  void                    Lock(const char *key);
> +  void                    Unlock(const char *key);

Let this take const nsACString&.  You always have it in hands when working with this API and inside you just convert to nsDepenedentCString.  I believe you are OK with it, see [1].

[1] http://hg.mozilla.org/mozilla-central/annotate/e0f64c714814/xpcom/glue/nsHashKeys.h#l142

::: storage/src/mozStorageConnection.cpp
@@ +213,5 @@
>    }
> +  int retcode = variantToSQLiteT(aCtx, result);
> +  if (retcode == SQLITE_IGNORE) {
> +    ::sqlite3_result_int(aCtx, SQLITE_IGNORE);
> +  } else if (retcode != SQLITE_OK) {

Again, I cannot review this.  Forwarding this piece to Shawn Wilsher.
Comment 64 Honza Bambas (:mayhemer) 2012-07-05 12:34:14 PDT
Comment on attachment 637844 [details] [diff] [review]
Part 4: testcase

Review of attachment 637844 [details] [diff] [review]:
-----------------------------------------------------------------

r- based on that this test is not able to find a serious bug in the part2 patch.
Comment 65 Thinker Li [:sinker] 2012-07-05 20:05:50 PDT
(In reply to Honza Bambas (:mayhemer) from comment #63)
> Comment on attachment 637983 [details] [diff] [review]
> Part 2: track active entries in memory and refuse to d elete by ignore
> 
> Review of attachment 637983 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r-
> 
> Emphasizing again: *removal of code from CreateCacheEntry in part1 patch has
> not been readded in part2 with the new implementation.  FindEntry has to
> return null when there is an active entry under the demanded key.*

I will check it at FindEntry or CreateCacheEntry.

> 
> This may be either wrong (then please anybody comment with explanation) or
> simply has not been addressed.
> 
> ::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
> @@ +190,5 @@
> >    PRUint32 valueLen;
> > +  const char *clientID = values->AsSharedUTF8String(0, &valueLen);
> > +  const char *key = values->AsSharedUTF8String(1, &valueLen);
> > +  nsCAutoString fullKey(clientID);
> > +  fullKey.Append(key);
> 
> There is a colon (':') between clientID and the key.  Thus, this cannot work
> at all.  If you didn't catch this, then there is probably something wrong
> with the test.

The colon between clientID and the key changes only the hash of the key and uniqueness of the key.  I don' think it is a wrong of the testcase in part4.  It just means we need another testcase for uniqueness of keys.
Comment 66 Thinker Li [:sinker] 2012-07-05 21:26:31 PDT
(In reply to Thinker Li [:sinker] from comment #65)
> (In reply to Honza Bambas (:mayhemer) from comment #63)
> > Comment on attachment 637983 [details] [diff] [review]
> > Part 2: track active entries in memory and refuse to d elete by ignore
> > 
> > Review of attachment 637983 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > r-
> > 
> > Emphasizing again: *removal of code from CreateCacheEntry in part1 patch has
> > not been readded in part2 with the new implementation.  FindEntry has to
> > return null when there is an active entry under the demanded key.*
> 
> I will check it at FindEntry or CreateCacheEntry.
> 
> > 
> > This may be either wrong (then please anybody comment with explanation) or
> > simply has not been addressed.
> > 
> > ::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
> > @@ +190,5 @@
> > >    PRUint32 valueLen;
> > > +  const char *clientID = values->AsSharedUTF8String(0, &valueLen);
> > > +  const char *key = values->AsSharedUTF8String(1, &valueLen);
> > > +  nsCAutoString fullKey(clientID);
> > > +  fullKey.Append(key);
> > 
> > There is a colon (':') between clientID and the key.  Thus, this cannot work
> > at all.  If you didn't catch this, then there is probably something wrong
> > with the test.
> 
> The colon between clientID and the key changes only the hash of the key and
> uniqueness of the key.  I don' think it is a wrong of the testcase in part4.
> It just means we need another testcase for uniqueness of keys.
By the way, since the hash code is changed, I will fix it in the testcase.
Comment 67 Thinker Li [:sinker] 2012-07-10 03:49:29 PDT
Created attachment 640558 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore, v2

Fix comment & format of Lock keys.
Comment 68 Thinker Li [:sinker] 2012-07-10 03:52:53 PDT
Created attachment 640559 [details] [diff] [review]
Part 3: testcase, v2

Hold file "foo3" to keep it active during running evictEntries().
Comment 69 Andreas Gal :gal 2012-07-10 07:20:27 PDT
sdwlish isn't around any more reliably. Give him a few hours to object, but in the meantime I will steal the review.
Comment 70 Andreas Gal :gal 2012-07-10 07:30:14 PDT
Comment on attachment 640558 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore, v2

Review of attachment 640558 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, lets get this in. App load performance is absolutely killing us right now.

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +103,5 @@
>                     nsOfflineCacheEvictionFunction *evictionFunction)
>      : mDB(db), mEvictionFunction(evictionFunction)
>      {
>        mDB->ExecuteSimpleSQL(
> +          NS_LITERAL_CSTRING("CREATE TEMP TRIGGER cache_on_delete BEFORE DELETE"

Note that this slightly changes the failure mode. We can now crash before removing the row from the DB and at that point the file will be already gone. The lookup code property handles this case and ignores the entry, so its still safe. We should rip this code out soon and replace it with something sane instead. I have measured setting up and destroying the trigger taking 15ms ON MY DESKTOP. On the phone it must be an order of magnitude slower. Thinker has a new bug open to avoid these evictions on every fetch.

@@ +1498,5 @@
>  
>    // We can go ahead and delete the corresponding row in our table,
>    // but we must not delete the file on disk until we are deactivated.
> +  // In another word, the file should be deleted if the entry had been
> +  // deactivated.

You mean "The file should be delete once the entry has been deactivated." ?

::: storage/src/mozStorageConnection.cpp
@@ +211,5 @@
>                             -1);
>      return;
>    }
> +  int retcode = variantToSQLiteT(aCtx, result);
> +  if (retcode == SQLITE_IGNORE) {

I verified that "SQLITE_IGNORE" isn't referenced in the code base before this patch (outside sqlite itself), so this change has zero risk.
Comment 71 Andreas Gal :gal 2012-07-10 07:31:29 PDT
Comment on attachment 640559 [details] [diff] [review]
Part 3: testcase, v2

I just hope we never change the hash format.
Comment 72 Nathan Froyd [:froydnj] 2012-07-10 08:12:19 PDT
Comment on attachment 640558 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore, v2

Review of attachment 640558 [details] [diff] [review]:
-----------------------------------------------------------------

You could probably ask mak for a quick sanity-check on the storage/ bits.

r+'ing a patch that you originally wrote and that honza hasn't yet r+'d seems somewhat unorthodox...
Comment 73 Andreas Gal :gal 2012-07-10 08:17:35 PDT
Cool. Thanks for the referral.
Comment 74 Honza Bambas (:mayhemer) 2012-07-10 15:19:48 PDT
(In reply to Thinker Li [:sinker] from comment #65)
> (In reply to Honza Bambas (:mayhemer) from comment #63)
> > Comment on attachment 637983 [details] [diff] [review]
> > Part 2: track active entries in memory and refuse to d elete by ignore
> > 
> > Review of attachment 637983 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > r-
> > 
> > Emphasizing again: *removal of code from CreateCacheEntry in part1 patch has
> > not been readded in part2 with the new implementation.  FindEntry has to
> > return null when there is an active entry under the demanded key.*
> 
> I will check it at FindEntry or CreateCacheEntry.

So, any updates on this particular piece?

> 
> > 
> > This may be either wrong (then please anybody comment with explanation) or
> > simply has not been addressed.
> > 
> > ::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
> > @@ +190,5 @@
> > >    PRUint32 valueLen;
> > > +  const char *clientID = values->AsSharedUTF8String(0, &valueLen);
> > > +  const char *key = values->AsSharedUTF8String(1, &valueLen);
> > > +  nsCAutoString fullKey(clientID);
> > > +  fullKey.Append(key);
> > 
> > There is a colon (':') between clientID and the key.  Thus, this cannot work
> > at all.  If you didn't catch this, then there is probably something wrong
> > with the test.
> 
> The colon between clientID and the key changes only the hash of the key and
> uniqueness of the key.  I don' think it is a wrong of the testcase in part4.
> It just means we need another testcase for uniqueness of keys.

I don't quit follow this argument, but maybe I'm missing something?  The problem is that you were marking an entry in the hash table using <clientid>:<url>, but here you were just searching it only as <clientid><url>.  So it could never be found in the hash table, right?
Comment 75 Thinker Li [:sinker] 2012-07-10 23:37:12 PDT
Created attachment 640942 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore, v3

Fix the issue what Honza had mentioned in comment 63.
Comment 76 Thinker Li [:sinker] 2012-07-10 23:41:15 PDT
(In reply to Honza Bambas (:mayhemer) from comment #74)
> I don't quit follow this argument, but maybe I'm missing something?  The
> problem is that you were marking an entry in the hash table using
> <clientid>:<url>, but here you were just searching it only as
> <clientid><url>.  So it could never be found in the hash table, right?

I didn't get the point at first instant.  But, I catch your point and fix the issue at the v2.
Comment 77 Andreas Gal :gal 2012-07-11 03:07:27 PDT
Honza found the bug so ideally he should r+. I will steal if he is overloaded with review requests.
Comment 78 Honza Bambas (:mayhemer) 2012-07-11 06:26:08 PDT
Comment on attachment 640942 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore, v3

Review of attachment 640942 [details] [diff] [review]:
-----------------------------------------------------------------

Thinker, please also address:

::: netwerk/cache/nsDiskCacheDeviceSQL.h
@@ +162,5 @@
>                                                 char ***keys);
>  
> +  bool                    IsLocked(const char *key);
> +  void                    Lock(const char *key);
> +  void                    Unlock(const char *key);

Let this take const nsACString&.  You always have it in hands when working with this API and inside you just convert to nsDepenedentCString.  I believe you are OK with it, see [1].

[1] http://hg.mozilla.org/mozilla-central/annotate/e0f64c714814/xpcom/glue/nsHashKeys.h#l142
Comment 79 Thinker Li [:sinker] 2012-07-12 03:07:06 PDT
Created attachment 641414 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore, v4

Change signatures of Lock(), Unlock() and IsLocked().
Comment 80 Honza Bambas (:mayhemer) 2012-07-12 13:25:31 PDT
Comment on attachment 641414 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore, v4

Review of attachment 641414 [details] [diff] [review]:
-----------------------------------------------------------------

r=honzab for the network parts.

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +190,5 @@
>    PRUint32 valueLen;
> +  const char *clientID = values->AsSharedUTF8String(0, &valueLen);
> +  const char *key = values->AsSharedUTF8String(1, &valueLen);
> +  nsCAutoString fullKey(clientID);
> +  fullKey.AppendLiteral(":");

Append(NS_LITERAL_CSTRING(":")) might be a little bit faster here.
Comment 81 Nathan Froyd [:froydnj] 2012-07-12 18:29:57 PDT
(In reply to Honza Bambas (:mayhemer) from comment #80)
> ::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
> > +  fullKey.AppendLiteral(":");
> 
> Append(NS_LITERAL_CSTRING(":")) might be a little bit faster here.

AppendLiteral is actually going to be faster, as there's no runtime string length calculation.  Less source code clutter, too.
Comment 82 Honza Bambas (:mayhemer) 2012-07-13 10:47:32 PDT
On IRC Nathan suggested to use even AppendLiteral(':') - taking just a char.
Comment 83 Andreas Gal :gal 2012-07-13 10:48:57 PDT
Any of these approaches works. A nano second doesn't matter here. Lets just land it.
Comment 84 Thinker Li [:sinker] 2012-07-15 20:45:02 PDT
(In reply to Andreas Gal :gal from comment #83)
> Any of these approaches works. A nano second doesn't matter here. Lets just
> land it.

Will mak77 review this?
Comment 85 Andreas Gal :gal 2012-07-15 20:58:24 PDT
Doesn't look like it. Storage is under-owned, and your change is trivial and I posted above an explanation why its extremely unlikely (or impossible) how that storage change will affect any other code in our code base, so r=me, and if anyone complains, I will take the fall (dear future complainer: please review, and post follow-up fix requests, we will take care of them).
Comment 86 Thinker Li [:sinker] 2012-07-15 22:20:35 PDT
Created attachment 642486 [details] [diff] [review]
part 1: remove flags, v3

r=honzab
Comment 87 Thinker Li [:sinker] 2012-07-15 22:22:45 PDT
Created attachment 642487 [details] [diff] [review]
Part 2: track active entries in memory and refuse to d elete by ignore, v5

r=honzab,gal
Comment 88 Thinker Li [:sinker] 2012-07-15 22:25:02 PDT
Created attachment 642488 [details] [diff] [review]
640559: Part 3: testcase, v3

r=gal
Comment 89 Mozilla RelEng Bot 2012-07-16 01:15:25 PDT
Try run for 3ddd552e89e8 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=3ddd552e89e8
Results (out of 215 total builds):
    exception: 1
    success: 194
    warnings: 15
    failure: 5
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tlee@mozilla.com-3ddd552e89e8
Comment 92 Dietrich Ayala (:dietrich) 2012-07-24 13:39:30 PDT
If this is a serious enough performance issue, please re-nom for blocking if this gets re-opened in the future.

Note You need to log in before you can comment on or make changes to this bug.