Last Comment Bug 769184 - don't write back cache entries into offline cache at every access
: don't write back cache entries into offline cache at every access
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Networking: Cache (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla17
Assigned To: Thinker Li [:sinker] PTO during Sep 22 ~ 30
:
:
Mentors:
Depends on: 767025
Blocks:
  Show dependency treegraph
 
Reported: 2012-06-28 02:19 PDT by Andreas Gal :gal
Modified: 2012-07-17 15:34 PDT (History)
16 users (show)
ryanvm: in‑testsuite-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Update cache entry only when data or meta-data is dirty (1.72 KB, patch)
2012-06-28 03:25 PDT, Thinker Li [:sinker] PTO during Sep 22 ~ 30
no flags Details | Diff | Splinter Review
mozStorage:5 log for dialer app when placing a call with patch applied (16.71 KB, text/plain)
2012-06-28 03:37 PDT, Mike Hommey [:glandium]
no flags Details
mozStorage:5 log for dialer app when placing a call with patch applied, with timestamps and duration (30.71 KB, text/plain)
2012-06-28 04:07 PDT, Mike Hommey [:glandium]
no flags Details
Update cache entry only when data or meta-data is dirty, v2 (1.86 KB, patch)
2012-06-28 15:34 PDT, Thinker Li [:sinker] PTO during Sep 22 ~ 30
michal.novotny: review-
Details | Diff | Splinter Review
Update cache entry only for new (4.25 KB, patch)
2012-07-02 22:21 PDT, Thinker Li [:sinker] PTO during Sep 22 ~ 30
honzab.moz: review+
Details | Diff | Splinter Review
logs of stuck/non-stuck dom/tests/mochitest/ajax/offline/test_identicalManifest.html (64.21 KB, application/zip)
2012-07-12 04:34 PDT, Honza Bambas (:mayhemer)
no flags Details
Update cache entry only for new, v2 (4.23 KB, patch)
2012-07-15 20:41 PDT, Thinker Li [:sinker] PTO during Sep 22 ~ 30
tlee: review+
Details | Diff | Splinter Review

Description Andreas Gal :gal 2012-06-28 02:19:18 PDT
- only write back the cache entry if its dirty, and only consider dirty meta data and dirty data, not dirty access time and count
- also check why we do the sqlite access from the main thread
Comment 1 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-06-28 03:25:55 PDT
Created attachment 637443 [details] [diff] [review]
Update cache entry only when data or meta-data is dirty
Comment 2 Mike Hommey [:glandium] 2012-06-28 03:37:46 PDT
Created attachment 637449 [details]
mozStorage:5 log for dialer app when placing a call with patch applied
Comment 3 Andreas Gal :gal 2012-06-28 03:55:00 PDT
Comment on attachment 637443 [details] [diff] [review]
Update cache entry only when data or meta-data is dirty

This removes the updates on deactivation.
Comment 4 Mike Hommey [:glandium] 2012-06-28 04:07:21 PDT
Created attachment 637452 [details]
mozStorage:5 log for dialer app when placing a call with patch applied, with timestamps and duration

Cumulative duration looks suspiciously low.
Comment 5 Honza Bambas (:mayhemer) 2012-06-28 11:59:38 PDT
Comment on attachment 637443 [details] [diff] [review]
Update cache entry only when data or meta-data is dirty

Review of attachment 637443 [details] [diff] [review]:
-----------------------------------------------------------------

From POW of offline cache I'm OK with the code, accept the comments.

But I don't know well enough how the dirty flags work and whether to update only when both are set is OK.

Reassigning to Michal.

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +1372,5 @@
>      // that happened.  otherwise, all we have to do here is delete the file
>      // on disk.
>      DeleteData(entry);
>    }
> +  else if (entry->IsDataDirty() && entry->IsMetaDataDirty()) {

|| ?

@@ +1378,5 @@
>  
>      // XXX Assumption: the row already exists because it was either created
>      // with a call to BindEntry or it was there when we called FindEntry.
>  
> +    LOG(("Data and meta-data of the cache entry is clean.\n"));

Not sure why you have chosen this log text.  Maybe "updating dirty entry?"

Mainly log the negative case like "skipping update since entry is not dirty".
Comment 6 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-06-28 15:34:19 PDT
Created attachment 637710 [details] [diff] [review]
Update cache entry only when data or meta-data is dirty, v2
Comment 7 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-06-28 17:13:38 PDT
(In reply to Honza Bambas (:mayhemer) from comment #5)
> Comment on attachment 637443 [details] [diff] [review]
> Update cache entry only when data or meta-data is dirty
> 
> Review of attachment 637443 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> From POW of offline cache I'm OK with the code, accept the comments.
> 
> But I don't know well enough how the dirty flags work and whether to update
> only when both are set is OK.
> 
> Reassigning to Michal.
> 
> ::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
> @@ +1372,5 @@
> >      // that happened.  otherwise, all we have to do here is delete the file
> >      // on disk.
> >      DeleteData(entry);
> >    }
> > +  else if (entry->IsDataDirty() && entry->IsMetaDataDirty()) {
> 
> || ?

Yes! You are right.  This is a mistake for refactoring code.
Comment 8 Andreas Gal :gal 2012-06-29 03:24:30 PDT
dcamp please steal if you can
Comment 9 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-06-29 04:32:24 PDT
A few things:

1. I am pretty sure that it is the case that not every place where we modify a cache entry sets the dirty/metadata-dirty bits. IMO, nsCacheEntry should be changed to set the dirty bit/bits whenever its mutators are called; however, when we create a cache entry, we call some of the mutators immediately after construction of a new object, so if we were to make that change, we'd need to add a "SetNotDirty()" call after those construction-time mutations.

Regardless, in general don't expect that the dirty bits are accurately set without auditing the code. But, also, see below.

2. It seems like when we're opening an offline cache entry *NOT* for updating it, then we should only be opening it with ACCESS_READ instead of read/write access--if for no other reason than that enables us to have concurrent readers most of the time. The implementation of the offline cache is a little bit of a mystery for me because I haven't worked with it, so I'm probably overlooking something here.

3. Do expiration time, last access time, etc. even matter for the offline cache as it is currently implemented (no garbage collection)? I know imglib and live bookmarks use the expiration time of cache entries for some reason, but I think we should be able to do something useful for what those things need without having to store and update such metadata.

In general, it seems like if we're writing to offline cache entries often at all, then we're doing something wrong, since the whole point of offline cache entries is that they're nearly always static.
Comment 10 Honza Bambas (:mayhemer) 2012-06-29 07:46:19 PDT
(In reply to Brian Smith (:bsmith) from comment #9)
> A few things:
> 
> 1. I am pretty sure that it is the case that not every place where we modify
> a cache entry sets the dirty/metadata-dirty bits. IMO, nsCacheEntry should
> be changed to set the dirty bit/bits whenever its mutators are called;
> however, when we create a cache entry, we call some of the mutators
> immediately after construction of a new object, so if we were to make that
> change, we'd need to add a "SetNotDirty()" call after those
> construction-time mutations.
> 
> Regardless, in general don't expect that the dirty bits are accurately set
> without auditing the code. But, also, see below.

Agree.  The getter for the dirty flags are not used in the whole platform, anywhere.  So it is very risky to use them here since nobody knows whether they are properly set.

> 
> 2. It seems like when we're opening an offline cache entry *NOT* for
> updating it, then we should only be opening it with ACCESS_READ instead of
> read/write access--if for no other reason than that enables us to have
> concurrent readers most of the time. The implementation of the offline cache
> is a little bit of a mystery for me because I haven't worked with it, so I'm
> probably overlooking something here.

We are opening offline cache entries for ACCESS_READ only, you should know this since you were touching the code that does it in nsHttpChannel a lot recently.  Offline cache entry is used the same way as a "normal" http cache entry.

> 
> 3. Do expiration time, last access time, etc. even matter for the offline
> cache as it is currently implemented (no garbage collection)? I know imglib
> and live bookmarks use the expiration time of cache entries for some reason,
> but I think we should be able to do something useful for what those things
> need without having to store and update such metadata.

Expiration time for offline cache entries doesn't make sense.  I personally don't know what the LastFetchedTime and FatchCount are for, but HTTP cache is using them.  That could just confuse about:cache, but performance is a priority here.

> 
> In general, it seems like if we're writing to offline cache entries often at
> all, then we're doing something wrong, since the whole point of offline
> cache entries is that they're nearly always static.

What means "writing to offline cache entries" ?  Updating metadata?  Http cache does it as well.  But I just may not understand well what you mean.
Comment 11 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-06-29 13:41:05 PDT
(In reply to Honza Bambas (:mayhemer) from comment #10)
> Expiration time for offline cache entries doesn't make sense.  I personally
> don't know what the LastFetchedTime and FatchCount are for, but HTTP cache
> is using them.  That could just confuse about:cache, but performance is a
> priority here.

I filed bug 769796 and bug 769797 to remove this unneeded information from cache. If we fix bug 769796 then would we need to do anything more here in this bug?

> What means "writing to offline cache entries" ?  Updating metadata?  Http
> cache does it as well.  But I just may not understand well what you mean.

I mean we should try to make it so we don't have to write any metadata or really make any writes to the offline cache files at all, except in the case where we're explicitly replacing the offline cache entry.
Comment 12 Michal Novotny (:michal) 2012-06-29 18:21:04 PDT
Comment on attachment 637710 [details] [diff] [review]
Update cache entry only when data or meta-data is dirty, v2

(In reply to Honza Bambas (:mayhemer) from comment #10)
> > 1. I am pretty sure that it is the case that not every place where we modify
> > a cache entry sets the dirty/metadata-dirty bits. IMO, nsCacheEntry should
> > be changed to set the dirty bit/bits whenever its mutators are called;
> > however, when we create a cache entry, we call some of the mutators
> > immediately after construction of a new object, so if we were to make that
> > change, we'd need to add a "SetNotDirty()" call after those
> > construction-time mutations.
> > 
> > Regardless, in general don't expect that the dirty bits are accurately set
> > without auditing the code. But, also, see below.
> 
> Agree.  The getter for the dirty flags are not used in the whole platform,
> anywhere.  So it is very risky to use them here since nobody knows whether
> they are properly set.

AFAICS, dirty flags for data and metadata are set properly. I'm not sure if this is also true for entry-dirty flag. It isn't used in this patch, which is wrong.


(In reply to Brian Smith (:bsmith) from comment #11)
> I filed bug 769796 and bug 769797 to remove this unneeded information from
> cache. If we fix bug 769796 then would we need to do anything more here in
> this bug?

Do we need lastModified for offline cache entries? If we remove also this information, then after fixing bugs #769796 and #769797 we would not need to check IsEntryDirty(), but we would still need to check IsMetadataDirty() and IsDataDirty().
Comment 13 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-06-30 10:06:26 PDT
(In reply to Michal Novotny (:michal) from comment #12)
> Comment on attachment 637710 [details] [diff] [review]
> Update cache entry only when data or meta-data is dirty, v2
> 
> (In reply to Honza Bambas (:mayhemer) from comment #10)
> > > 1. I am pretty sure that it is the case that not every place where we modify
> > > a cache entry sets the dirty/metadata-dirty bits. IMO, nsCacheEntry should
> > > be changed to set the dirty bit/bits whenever its mutators are called;
> > > however, when we create a cache entry, we call some of the mutators
> > > immediately after construction of a new object, so if we were to make that
> > > change, we'd need to add a "SetNotDirty()" call after those
> > > construction-time mutations.
> > > 
> > > Regardless, in general don't expect that the dirty bits are accurately set
> > > without auditing the code. But, also, see below.
> > 
> > Agree.  The getter for the dirty flags are not used in the whole platform,
> > anywhere.  So it is very risky to use them here since nobody knows whether
> > they are properly set.
> 
> AFAICS, dirty flags for data and metadata are set properly. I'm not sure if
> this is also true for entry-dirty flag. It isn't used in this patch, which
> is wrong.

After a restudying for this patch, I have a new idea.
AFAIK, all documents in offline cache device are updated in a transaction.  The updates are triggered by the changes of the associated cache manifest.  So, every later updates should not effect existing entries.  We don't need to update entries for later changes.  In this patch, now, I update entries if one of its IsMetadataDirty() and IsDataDirty()
is true.  I even think it is not necessary.  Since update of the pages should be in a transaction, there is no reason to update entry for later changes.

So, I suggest to update entries for new entries only.  What do you think?
Comment 14 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-07-02 22:21:57 PDT
Created attachment 638587 [details] [diff] [review]
Update cache entry only for new

This patch updates only entries that is returned by BindEntry(), a new entry.  I add a new flag |FLAG_NEW_ENTRY| for nsOfflineCacheBinding, it is marked by BindEntry().  DeactivateEntry() checks the flag and saves only new entries.
Comment 15 Andreas Gal :gal 2012-07-04 16:43:54 PDT
Michal, when do you expect to get to this review?
Comment 16 Michal Novotny (:michal) 2012-07-05 14:33:33 PDT
Comment on attachment 638587 [details] [diff] [review]
Update cache entry only for new

This would mean that we would not need to store fetchCount, lastFetched, lastModified and expirationTime at all, since they would never be updated. I'm not sure if we need this information for entries in the offline cache. I think that Honza is better reviewer for this change.
Comment 17 Honza Bambas (:mayhemer) 2012-07-05 14:36:46 PDT
I cannot do this sooner then on Monday.

I don't think that these metadata are used a different way then for HTTP cache.  Note that the offline cache entries are used as HTTP cache entries for read when we are performing an offline cache update.
Comment 18 Andreas Gal :gal 2012-07-10 01:28:37 PDT
review ping
Comment 19 Honza Bambas (:mayhemer) 2012-07-10 10:23:23 PDT
(In reply to Michal Novotny (:michal) from comment #16)
> Comment on attachment 638587 [details] [diff] [review]
> Update cache entry only for new
> 
> This would mean that we would not need to store fetchCount, lastFetched,
> lastModified and expirationTime at all, since they would never be updated.
> I'm not sure if we need this information for entries in the offline cache. 

First, we need lastModified and expirationTime because offline cache entries are used as normal cache entries when updating.  We could introduce server load bugs.  Since those are never updated later anyway, we probably don't need to update offline cache entries.

After Michal's comment 12, where he ensures flags are set properly, I would rather prefer to update just conditionally.  If that still impacts performance, we should first confirm it's because of offline cache entries unwanted updates and then figure out where the flags get set and see what we can do further.
Comment 20 Honza Bambas (:mayhemer) 2012-07-10 10:32:36 PDT
Err.. MarkEntryDirty() is also called when fetched.  So there is no optimization then...
Comment 21 Honza Bambas (:mayhemer) 2012-07-10 11:57:50 PDT
Comment on attachment 638587 [details] [diff] [review]
Update cache entry only for new

Review of attachment 638587 [details] [diff] [review]:
-----------------------------------------------------------------

I have no chance to test this since m-c is currently broken when building on Windows.  But the patch looks good.  If there are no problems on try, we can land land this and see.

Asking Michal on feedback about this patch, since he knows the binding mechanics better then me.

::: netwerk/cache/nsDiskCacheDeviceSQL.cpp
@@ +1383,5 @@
>      // the row should have been removed in DoomEntry... we could assert that
>      // that happened.  otherwise, all we have to do here is delete the file
>      // on disk.
>      DeleteData(entry);
> +  } else if (((nsOfflineCacheBinding *)entry->Data())->IsNewEntry()) {

I'd rather leave the original form:

}
else if ()
{

@@ +1390,5 @@
> +    // Only new entries are updated, since offline cache is updated in
> +    // transactions.  New entries are those who is returned from
> +    // BindEntry().
> +
> +    LOG(("nsOfflineCacheDevice::DeactivateEntry updating dirty entry\n"));

Maybe "updating new entry".
Comment 22 Honza Bambas (:mayhemer) 2012-07-10 13:56:58 PDT
Comment on attachment 638587 [details] [diff] [review]
Update cache entry only for new

r-'ing the patch.  test_identicalManifest.html hangs.
Comment 23 Andreas Gal :gal 2012-07-11 08:25:48 PDT
Even if this patch worked, we are still evicting right? so we need a bug to stop that too (the delete and trigger crap). Thinker, can you file that one as well?
Comment 24 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-07-11 18:51:24 PDT
(In reply to Andreas Gal :gal from comment #23)
> Even if this patch worked, we are still evicting right? so we need a bug to
> stop that too (the delete and trigger crap). Thinker, can you file that one
> as well?

bug 771420 can fix it.
Comment 25 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-07-12 00:20:43 PDT
(In reply to Honza Bambas (:mayhemer) from comment #22)
> Comment on attachment 638587 [details] [diff] [review]
> Update cache entry only for new
> 
> r-'ing the patch.  test_identicalManifest.html hangs.

It is not my fault.  It is still hung even remove the patch.
Comment 26 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-07-12 00:34:11 PDT
(In reply to Thinker Li [:sinker] from comment #25)
> (In reply to Honza Bambas (:mayhemer) from comment #22)
> > Comment on attachment 638587 [details] [diff] [review]
> > Update cache entry only for new
> > 
> > r-'ing the patch.  test_identicalManifest.html hangs.
> 
> It is not my fault.  It is still hung even remove the patch.
Forget the last message.  I tested it with a stupid wrong way that make all testcases hung.

test_identicalManifest.html does not hung on my box (Linux/debian).  Could you tell me what kind of environment are using for testing?
Comment 27 Honza Bambas (:mayhemer) 2012-07-12 04:34:10 PDT
Created attachment 641424 [details]
logs of stuck/non-stuck dom/tests/mochitest/ajax/offline/test_identicalManifest.html

This are logs, running dom/tests/mochitest/ajax/offline/test_identicalManifest.html alone.  m-c to this date, Windows 7, standard debug test-enabled and log-enabled build.  Only patch applied is the one here in this bug, just merged to apply cleanly (no logic changes).  I don't much believe this is platform dependent.

Files in the zip:
mozilla-NOT_STUCK-no-patch.log - just pure m-c
mozilla-STUCK-with-patch.log - m-c + patch for this bug

Check for differences.
Comment 28 Honza Bambas (:mayhemer) 2012-07-12 04:44:24 PDT
Curious is that I cannot reproduce when patch for bug 767025 is applied before this patch.  Thinker, I'm worried it just hides some problem, can you please check on that?
Comment 29 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-07-13 04:40:36 PDT
(In reply to Honza Bambas (:mayhemer) from comment #28)
> Curious is that I cannot reproduce when patch for bug 767025 is applied
> before this patch.  Thinker, I'm worried it just hides some problem, can you
> please check on that?

I have ran test_identicalManifest.html for hundreds times with a script.  The script assumes the testcase shall be finished in 30s, or it is a hanging.  But I don't hit any hanging.  The logic of this patch is very simple, but I don't see the symptom that can be the cause at this moment.
Comment 30 Honza Bambas (:mayhemer) 2012-07-13 10:28:55 PDT
(In reply to Thinker Li [:sinker] from comment #29)
> (In reply to Honza Bambas (:mayhemer) from comment #28)
> > Curious is that I cannot reproduce when patch for bug 767025 is applied
> > before this patch.  Thinker, I'm worried it just hides some problem, can you
> > please check on that?
> 
> I have ran test_identicalManifest.html for hundreds times with a script. 
> The script assumes the testcase shall be finished in 30s, or it is a
> hanging.  But I don't hit any hanging.  The logic of this patch is very
> simple, but I don't see the symptom that can be the cause at this moment.

OK, then I will go though the logs my self and see what I can figure out from them.
Comment 31 Honza Bambas (:mayhemer) 2012-07-13 11:54:16 PDT
OK, I understand now.  Problem is that when an entry is open, we write the flags indicating the entry is active.  But when the entry is "not dirty", what means that is not new, the flag is never drop back to 0 because we don't write it back.  And hence we cannot open the entry anymore.

This is fixed in bug 767025 completely.  So, this bug's patch can be landed only together with patch for bug 767025.
Comment 32 Honza Bambas (:mayhemer) 2012-07-13 11:54:44 PDT
Comment on attachment 638587 [details] [diff] [review]
Update cache entry only for new

Changing to r+ based on previous comment.
Comment 33 Andreas Gal :gal 2012-07-13 11:56:21 PDT
++honza. Lets get both ready and land together.
Comment 34 Thinker Li [:sinker] PTO during Sep 22 ~ 30 2012-07-15 20:41:07 PDT
Created attachment 642471 [details] [diff] [review]
Update cache entry only for new, v2

r=honzab
Comment 35 Mozilla RelEng Bot 2012-07-15 23:45:23 PDT
Try run for 4c72a859cb88 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=4c72a859cb88
Results (out of 198 total builds):
    exception: 3
    success: 166
    warnings: 18
    failure: 11
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tlee@mozilla.com-4c72a859cb88
Comment 36 Ryan VanderMeulen [:RyanVM] 2012-07-16 17:44:24 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/b7acf1662bae
Comment 37 Ed Morley [:emorley] 2012-07-17 02:10:50 PDT
https://hg.mozilla.org/mozilla-central/rev/b7acf1662bae

Note You need to log in before you can comment on or make changes to this bug.