Closed Bug 977766 Opened 10 years ago Closed 10 years ago

HTTP cache v2: selective deletion of data (by load context info)

Categories

(Core :: Networking: Cache, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla31

People

(Reporter: mayhemer, Assigned: michal)

References

Details

Attachments

(1 file, 7 obsolete files)

This is both about API and implementation.

We have nsICacheStorage interface with asyncEvictStorage method.  This:
1) removes all entries synchronously from the storage (akak context) hashtables, so next time we go down to IO manager to reload entries fro that context
2) dispatches EvictionRunnable [1] where we have some TODOs and HACKs

The API is IMO sufficient.

The implementation needs to change:
- remove EvictionRunnable altogether
- memory only entries should be removed from the main (disk+memory) hashtable synchronously by looping the memory-only hashtable (sGlobal[memoryStorageID], [3]) ; since we plan to shrink the memory pool, I'm not worried about performance much, but depends
- the special code at AddStorageEntry [2] may then be removed
- the removal operation should then be passed down to the IO manager

And here we are, we need some asynchronous API on the IO manager that will:
- mark *somehow* / see bellow / all entries with the given context key as "don't use"
- start the selective deletion according data in the index (doesn't need to happen immediately, mainly in case we're still updating the index)

Not sure how to implement this, here is one suggestion tho:

We may have a file whom name is a context key hash, and we touch it every time we evict a context.  This file's LastModified time may easily be read (and then cached) - let's call it LastUsableTimeFromContext['x'].  Any file that is open and belongs to the context 'x' and has LastModified before or at the LastUsableTimeFromContext['x'] will be threw away (ignored) if we try to open it sooner than the background deletion actually erases the file on disk.

To sum I can see 3 bugs here:
- have the new API on IO manager and manage the context-last-usable-time files (michal or me)
- remove the EvictionRunnable class and its usage, migrate to the IO man api (me, use this bug)
- have a selective background deletion mechanism (michal)

Note that this only uses the index to find the appropriate files belonging to the given context on disk and doesn't need to modify the index.

Makes sense?


[1] http://hg.mozilla.org/mozilla-central/annotate/4cfb6c61b137/netwerk/cache2/CacheStorageService.cpp#l165
[2] http://hg.mozilla.org/mozilla-central/annotate/4cfb6c61b137/netwerk/cache2/CacheStorageService.cpp#l1261
[3] http://hg.mozilla.org/mozilla-central/annotate/4cfb6c61b137/netwerk/cache2/CacheStorageService.cpp#l1463
Flags: needinfo?(michal.novotny)
(In reply to Honza Bambas (:mayhemer) from comment #0)
> And here we are, we need some asynchronous API on the IO manager that will:
> - mark *somehow* / see bellow / all entries with the given context key as
> "don't use"
> - start the selective deletion according data in the index (doesn't need to
> happen immediately, mainly in case we're still updating the index)

We also need to doom any active entry with the given context key.


> We may have a file whom name is a context key hash, and we touch it every
> time we evict a context.  This file's LastModified time may easily be read
> (and then cached) - let's call it LastUsableTimeFromContext['x'].  Any file
> that is open and belongs to the context 'x' and has LastModified before or
> at the LastUsableTimeFromContext['x'] will be threw away (ignored) if we try
> to open it sooner than the background deletion actually erases the file on
> disk.

This would mean that we cannot trust the result from CacheIndex::HasEntry() since we don't have this information at the time we call HasEntry(), i.e. CacheIndex::HasEntry() can return EXISTS while the subsequent CacheFileIOManager::OpenFile() will ignore the file. But it shouldn't be IMO a problem for the usage of in HashEntry() in CacheFile::Init().
Flags: needinfo?(michal.novotny)
(In reply to Michal Novotny (:michal) from comment #1)
> (In reply to Honza Bambas (:mayhemer) from comment #0)
> > And here we are, we need some asynchronous API on the IO manager that will:
> > - mark *somehow* / see bellow / all entries with the given context key as
> > "don't use"
> > - start the selective deletion according data in the index (doesn't need to
> > happen immediately, mainly in case we're still updating the index)
> 
> We also need to doom any active entry with the given context key.

I don't think it's necessary.  As I understand, you will go through the disk using the index data and delete file by file.  You can easily check if there is a handle for that file and doom the handle rather then delete the file.  On the level of service/cache entries we will throw the whole context hashtable away which effectively dooms the entries from point of view of the cache service consumers.

Or is there some reason?  Maybe there is..

> 
> 
> > We may have a file whom name is a context key hash, and we touch it every
> > time we evict a context.  This file's LastModified time may easily be read
> > (and then cached) - let's call it LastUsableTimeFromContext['x'].  Any file
> > that is open and belongs to the context 'x' and has LastModified before or
> > at the LastUsableTimeFromContext['x'] will be threw away (ignored) if we try
> > to open it sooner than the background deletion actually erases the file on
> > disk.
> 
> This would mean that we cannot trust the result from CacheIndex::HasEntry()
> since we don't have this information at the time we call HasEntry(), i.e.
> CacheIndex::HasEntry() can return EXISTS while the subsequent
> CacheFileIOManager::OpenFile() will ignore the file. But it shouldn't be IMO
> a problem for the usage of in HashEntry() in CacheFile::Init().

I think CacheIndex::HasEntry() cannot be considered (even w/o what I suggest) as nothing more then a hint.  If it lies, we still go to disk and discover the true.  Just a bit later...
Comment 2
Flags: needinfo?(michal.novotny)
(In reply to Honza Bambas (:mayhemer) from comment #2)
> I don't think it's necessary.  As I understand, you will go through the disk
> using the index data and delete file by file.  You can easily check if there
> is a handle for that file and doom the handle rather then delete the file. 
> On the level of service/cache entries we will throw the whole context
> hashtable away which effectively dooms the entries from point of view of the
> cache service consumers.
> 
> Or is there some reason?  Maybe there is..

No, removing the entry from hashtable in CacheServiceStorage doesn't effectively doom the entry since CacheFile would re-read the data from the disk. We have to doom the active entries since any later update changes the last modified time. Just an example:

- we have some entry E with context C that is being written
- we call EvictByContext(C), so we remember timestamp T of this moment
- we append some data to the entry E at a timestamp T2
- we iterate over all files and remove any file with last modified time smaller than T, but T2 is bigger so we won't remove this file


> > This would mean that we cannot trust the result from CacheIndex::HasEntry()
> > since we don't have this information at the time we call HasEntry(), i.e.
> > CacheIndex::HasEntry() can return EXISTS while the subsequent
> > CacheFileIOManager::OpenFile() will ignore the file. But it shouldn't be IMO
> > a problem for the usage of in HashEntry() in CacheFile::Init().
> 
> I think CacheIndex::HasEntry() cannot be considered (even w/o what I
> suggest) as nothing more then a hint.  If it lies, we still go to disk and
> discover the true.  Just a bit later...

Well, HasEntry() should return a trustful information and that's why the result can be DO_NOT_KNOW. As I wrote, it won't be a problem in CacheFileIOManager::OpenFile() since the change will generate false positives, but it would be a problem if it produced false negatives. We must keep this in mind if we ever use HasEntry() for some other task.
Flags: needinfo?(michal.novotny)
(In reply to Michal Novotny (:michal) from comment #4)
> (In reply to Honza Bambas (:mayhemer) from comment #2)
> > I don't think it's necessary.  As I understand, you will go through the disk
> > using the index data and delete file by file.  You can easily check if there
> > is a handle for that file and doom the handle rather then delete the file. 
> > On the level of service/cache entries we will throw the whole context
> > hashtable away which effectively dooms the entries from point of view of the
> > cache service consumers.
> > 
> > Or is there some reason?  Maybe there is..
> 
> No, removing the entry from hashtable in CacheServiceStorage doesn't
> effectively doom the entry since CacheFile would re-read the data from the
> disk. 

I know this very well and never said otherwise.  The point here was to doom only in IO man and not in both cache service and IO man, but read on.

> We have to doom the active entries since any later update changes the
> last modified time. Just an example:
> 
> - we have some entry E with context C that is being written
> - we call EvictByContext(C), so we remember timestamp T of this moment
> - we append some data to the entry E at a timestamp T2
> - we iterate over all files and remove any file with last modified time
> smaller than T, but T2 is bigger so we won't remove this file

That is a good argument.  I cannot think of a simple solution how we could only doom a context in the IO manager.  If you don't know how to do it then probably no one else does.

The EvictionRunnable will then be left in and will do mostly the same as it does now.  However, will be posted using this new method: https://bugzilla.mozilla.org/attachment.cgi?id=8386097&action=diff#a/netwerk/cache2/CacheIOThread.h_sec2

That is bad for performance.  But whatever the IO man would have to do to express that a context is about not to be used would have to happen here (on the first level) as well.  And I presume you would need to iterate all open handles and check one by one if belonging to the given context and doom them.  Still would love some way to do this w/o iteration of existing stuff.

> 
> 
> > > This would mean that we cannot trust the result from CacheIndex::HasEntry()
> > > since we don't have this information at the time we call HasEntry(), i.e.
> > > CacheIndex::HasEntry() can return EXISTS while the subsequent
> > > CacheFileIOManager::OpenFile() will ignore the file. But it shouldn't be IMO
> > > a problem for the usage of in HashEntry() in CacheFile::Init().
> > 
> > I think CacheIndex::HasEntry() cannot be considered (even w/o what I
> > suggest) as nothing more then a hint.  If it lies, we still go to disk and
> > discover the true.  Just a bit later...
> 
> Well, HasEntry() should return a trustful information and that's why the
> result can be DO_NOT_KNOW. As I wrote, it won't be a problem in
> CacheFileIOManager::OpenFile() since the change will generate false
> positives, but it would be a problem if it produced false negatives. We must
> keep this in mind if we ever use HasEntry() for some other task.

Yep, I understand, in this case it will be a false positive, which is OK.
(In reply to Honza Bambas (:mayhemer) from comment #5)
> > We have to doom the active entries since any later update changes the
> > last modified time. Just an example:
> > 
> > - we have some entry E with context C that is being written
> > - we call EvictByContext(C), so we remember timestamp T of this moment
> > - we append some data to the entry E at a timestamp T2
> > - we iterate over all files and remove any file with last modified time
> > smaller than T, but T2 is bigger so we won't remove this file
> 
> That is a good argument.  I cannot think of a simple solution how we could
> only doom a context in the IO manager.  If you don't know how to do it then
> probably no one else does.

So, this whole seems to be just a misunderstanding and it's my fault since I use term "entry" inconsistently. In fact, I meant that we need to doom all CacheFileHandles with the given context, which is possible. I.e. nsICacheStorage::CacheEvictByContect(C) must post an event (the same priority as open, doom etc.) to IO thread and this event will iterate all handles in hashtable and it will doom all that matches context C.
(In reply to Michal Novotny (:michal) from comment #6)
> So, this whole seems to be just a misunderstanding and it's my fault since I
> use term "entry" inconsistently. In fact, I meant that we need to doom all
> CacheFileHandles with the given context, which is possible. I.e.
> nsICacheStorage::CacheEvictByContect(C) must post an event (the same
> priority as open, doom etc.) to IO thread and this event will iterate all
> handles in hashtable and it will doom all that matches context C.

Aha, ok.  So just make 100% sure I understand: EvictionRunnable may freely go away.  On the side of the service I will only empty the appropriate context hashtables.  Then I will call some IOManager->EvictContext(key) that will post an eviction event (using FlushOpensAndDispatch method on IO thread) that will take care of everything in one place.

If so, I can start working on my parts in the service.  Only thing to agree on is the IO manager API.
Backup
Attached patch CacheStorageService parts wip2 (obsolete) — Splinter Review
Backup 2
Attachment #8390620 - Attachment is obsolete: true
Blocks: 909282
Attached patch CacheFile part v1 (obsolete) — Splinter Review
Attachment #8396485 - Flags: review?(honzab.moz)
Michal, can you please add few top-level explanatory points about the patch to ease the review?  Thanks.
Depends on: 976866
Attachment #8391145 - Attachment description: wip2 → CacheStorageService parts wip2
Comment on attachment 8396485 [details] [diff] [review]
CacheFile part v1

Review of attachment 8396485 [details] [diff] [review]:
-----------------------------------------------------------------

f- for:

- in general, again, a complete lack of comments
- using directly the context name for the signal files what limits (hard to engage) the character set we can use for it, should be either a hash (cannot parse tho) or some form of base64 or 32 which is safe for file system and cheap to decompile


The way we evict a separate storage should also be documented at https://developer.mozilla.org/en-US/docs/HTTP_Cache

::: netwerk/cache2/CacheFileContextEvictor.cpp
@@ +39,5 @@
> +
> +nsresult
> +CacheFileContextEvictor::Init(nsIFile *aCacheDirectory)
> +{
> +  LOG(("CacheFileContextEvictor::Init()"));

assert IO thread here.

@@ +85,5 @@
> +
> +  CacheFileContextEvictorEntry *entry = nullptr;
> +
> +  for (uint32_t i = 0 ; i < mEntries.Length() ; ++i) {
> +    if (mEntries[i]->mInfo->Equals(aLoadContextInfo)) {

hashtable by the key rather?

@@ +99,5 @@
> +  }
> +
> +  entry->mTimeStamp = PR_Now() / PR_USEC_PER_MSEC;
> +
> +  PersistEvictionInfoToDisk(aLoadContextInfo);

assert io thread

@@ +105,5 @@
> +  if (mIndexIsUpToDate) {
> +    if (entry->mIterator) {
> +      entry->mIterator->Close();
> +      entry->mIterator = nullptr;
> +    }

please explain in a comment.

@@ +111,5 @@
> +    rv = CacheIndex::GetIterator(aLoadContextInfo, false,
> +                                 getter_AddRefs(entry->mIterator));
> +    if (NS_FAILED(rv)) {
> +      // This could probably happen during shutdown. Remove the entry from
> +      // the array, but leave the info on the disk. No entry can be opened

why the entry has to be removed?

@@ +138,5 @@
> +    mIndexIsUpToDate = isUpToDate;
> +    return NS_OK;
> +  }
> +
> +  if (!isUpToDate == !mIndexIsUpToDate) {

remove both the !

@@ +142,5 @@
> +  if (!isUpToDate == !mIndexIsUpToDate) {
> +    return NS_OK;
> +  }
> +
> +  if (isUpToDate && mIndexIsUpToDate) {

I don't follow this condition, isn't is wrong?
should the new state be remembered?

@@ +148,5 @@
> +      return NS_OK;
> +    }
> +
> +    // We're not evicting, but we should?!
> +    LOG(("CacheFileContextEvictor::CacheIndexStateChanged() - "));

?

@@ +168,5 @@
> +  return NS_OK;
> +}
> +
> +nsresult
> +CacheFileContextEvictor::WasRemoved(const nsACString &aKey, nsIFile *aFile,

s/aKey/aContextKey/

@@ +182,5 @@
> +  if (!info) {
> +    LOG(("CacheFileContextEvictor::WasRemoved() - Cannot parse key!"));
> +    *_retval = false;
> +    return NS_OK;
> +  }

another argument for a str->entry hashtable.

@@ +200,5 @@
> +    return NS_OK;
> +  }
> +
> +  PRTime lastModifiedTime;
> +  rv = aFile->GetLastModifiedTime(&lastModifiedTime);

assert io thread

@@ +233,5 @@
> +  }
> +
> +  PRFileDesc *fd;
> +  rv = file->OpenNSPRFileDesc(PR_RDWR | PR_CREATE_FILE | PR_TRUNCATE, 0600,
> +                              &fd);

assert io thread

@@ +240,5 @@
> +  }
> +
> +  PR_Close(fd);
> +
> +  LOG(("CacheFileContextEvictor::PersistEvictionInfoToDisk() - File created."));

some comments with explanation of what this file is and what info it carries would be useful.

@@ +262,5 @@
> +  }
> +
> +  rv = file->Remove(false);
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    return rv;

please log this.

@@ +277,5 @@
> +{
> +  LOG(("CacheFileContextEvictor::LoadEvictInfoFromDisk() [this=%p]", this));
> +
> +  nsresult rv;
> +

assert thread here please

@@ +321,5 @@
> +      continue;
> +    }
> +
> +    nsCOMPtr<nsILoadContextInfo> info = CacheFileUtils::ParseKey(
> +      Substring(leaf, strlen(kContextEvictionPrefix)));

try use nsArraySize or something instead of strlen please, that gives a compile time known value

@@ +334,5 @@
> +      continue;
> +    }
> +
> +    CacheFileContextEvictorEntry *entry = nullptr;
> +    entry = new CacheFileContextEvictorEntry();

CacheFileContextEvictorEntry *entry = new CacheFileContextEvictorEntry(); ?
or at least no need to init with nullptr.

@@ +351,5 @@
> +  nsresult rv;
> +
> +  nsAutoCString leafName;
> +  leafName.Assign(NS_LITERAL_CSTRING(kContextEvictionPrefix));
> +  CacheFileUtils::CreateKeyPrefix(aLoadContextInfo, leafName);

oh, I think the fact that CreateKeyPrefix appends is a bug and a bad thing to count on.  We should either rename it to AppendKeyPrefix or change the behavior.  I prefer the former.  I'll do it with update to patch for bug 968593.

this also means the ':' MUST NOT be part of the key prefix and also puts some restrictions on it, since not all chars can be used in file names..

@@ +377,5 @@
> +  CloseIterators();
> +
> +  nsresult rv;
> +
> +  for (uint32_t i = mEntries.Length() ; i != 0 ; --i) {

please:

for (uint32_t i = mEntries.Length() ; i > 0;) {
   --i
   mEntries[i]->...
}

@@ +427,5 @@
> +    LOG(("CacheFileContextEvictor::StartEvicting() - Cannot dispatch event to "
> +         "IO thread. [rv=0x%08x]", rv));
> +  }
> +
> +  mEvicting = true;

if this is happening on the io thread, please assert it

::: netwerk/cache2/CacheFileContextEvictor.h
@@ +16,5 @@
> +namespace net {
> +
> +class CacheIndexIterator;
> +
> +struct CacheFileContextEvictorEntry

class

@@ +23,5 @@
> +  PRTime                       mTimeStamp; // in milliseconds
> +  nsRefPtr<CacheIndexIterator> mIterator;
> +};
> +
> +class CacheFileContextEvictor : public nsISupports

To express this may evict more then one context, maybe call it CacheFileContext_s_Evictor ?

@@ +26,5 @@
> +
> +class CacheFileContextEvictor : public nsISupports
> +{
> +public:
> +  NS_DECL_THREADSAFE_ISUPPORTS

NS_INLINE_DECL_THREADSAFE_REFCOUNTING instead please

@@ +33,5 @@
> +  virtual ~CacheFileContextEvictor();
> +
> +  nsresult Init(nsIFile *aCacheDirectory);
> +
> +  uint32_t NumOfContexts();

ContextsCount?

@@ +52,5 @@
> +  nsresult EvictEntries();
> +
> +  bool mEvicting;
> +  bool mIndexIsUpToDate;
> +  static bool mDiskAlreadySearched;

should have 's' prefix

@@ +53,5 @@
> +
> +  bool mEvicting;
> +  bool mIndexIsUpToDate;
> +  static bool mDiskAlreadySearched;
> +  nsTArray<nsAutoPtr<CacheFileContextEvictorEntry> > mEntries;

may IMO be more useful and cleaner as a hashtable

@@ +56,5 @@
> +  static bool mDiskAlreadySearched;
> +  nsTArray<nsAutoPtr<CacheFileContextEvictorEntry> > mEntries;
> +  nsCOMPtr<nsIFile> mCacheDirectory;
> +  nsCOMPtr<nsIFile> mEntriesDir;
> +};

some comments on everything would be useful.

::: netwerk/cache2/CacheFileIOManager.cpp
@@ +1549,5 @@
>                                       uint32_t aFlags,
>                                       CacheFileHandle **_retval)
>  {
>    LOG(("CacheFileIOManager::OpenFileInternal() [hash=%08x%08x%08x%08x%08x, "
> +       "key=%s, flags=%d]", LOGSHA1(aHash), PromiseFlatCString(aKey).get(),

AFAIK you don't need to flatten the string, just BeginReading() will do.

BTW, once I was told to to use nsCSubstring class for internal functions instead.  But this is just for information, I believe I break this rule myself on many places.

@@ +2630,5 @@
> +  nsCOMPtr<nsIRunnable> ev;
> +  ev = NS_NewRunnableMethodWithArg<nsCOMPtr<nsILoadContextInfo> >
> +         (ioMan, &CacheFileIOManager::EvictByContextInternal, aLoadContextInfo);
> +
> +  rv = ioMan->mIOThread->Dispatch(ev, CacheIOThread::OPEN_PRIORITY);

here you have to use FlushOpensAndDispatch from bug 976866

@@ +2681,5 @@
> +                                                   aLoadContextInfo,
> +                                                   &equals);
> +    MOZ_ASSERT(NS_SUCCEEDED(rv));
> +    if (NS_FAILED(rv)) {
> +      LOG(("CacheFileIOManager::EvictByContextInternal() - Cannot parse key in "

if (NS_FAILED(rv)) {
  MOZ_CRASH("...");
  LOG(("...

?

@@ +2727,5 @@
> +
> +  nsCOMPtr<nsIEventTarget> ioTarget = IOTarget();
> +  MOZ_ASSERT(ioTarget);
> +
> +  rv = ioTarget->Dispatch(ev, nsIEventTarget::DISPATCH_NORMAL);

would Dispatch(CURRENT_LEVEL) be useful here?

@@ +3386,5 @@
> +    mContextEvictor = new CacheFileContextEvictor();
> +    mContextEvictor->Init(mCacheDirectory);
> +    if (mContextEvictor->NumOfContexts() == 0) {
> +      mContextEvictor = nullptr;
> +    }

maybe swap through a local nsRefPtr instead of storing/removing to/from the member?

::: netwerk/cache2/CacheIndex.cpp
@@ +352,5 @@
>         "dontMarkIndexClean=%d]", index->mState, index->mIndexOnDiskIsValid,
>         index->mDontMarkIndexClean));
>  
> +  LOG(("CacheIndex::PreShutdown() - Closing iterators."));
> +  for (uint32_t i = index->mIterators.Length() ; i != 0 ; --i) {

for (i = Length(); i > 0;)
{
  --i;
  mIterators[i];

@@ +3095,5 @@
> +{
> +  AssertOwnsLock();
> +
> +  for (uint32_t i = 0 ; i < mIterators.Length() ; ++i) {
> +    mIterators[i]->RemoveRecord(aRecord);

no shouldBeUpdated check here?

@@ +3106,5 @@
> +{
> +  AssertOwnsLock();
> +
> +  for (uint32_t i = 0 ; i < mIterators.Length() ; ++i) {
> +    mIterators[i]->ReplaceRecord(aOldRecord, aNewRecord);

as well here?

::: netwerk/cache2/CacheIndex.h
@@ +562,5 @@
>  
> +  // Returns an iterator that returns entries matching a given context that were
> +  // present in the index at the time this method was called. If aUpdate is true
> +  // then the iterator will also return entries created after this call.
> +  static nsresult GetIterator(nsILoadContextInfo *aInfo, bool aUpdate,

maybe rename aUpdate to aKeepUpdated?

@@ +989,5 @@
> +    mLocked = false;
> +  }
> +
> +private:
> +  nsRefPtr<CacheIndex> mIndex;

is there really a need to refcount?  Isn't this a stack class only?  also, should be a guard object and should hide default ctor().

::: netwerk/cache2/CacheIndexContextIterator.h
@@ +22,5 @@
> +  virtual ~CacheIndexContextIterator();
> +
> +private:
> +  void AddRecord(CacheIndexRecord *aRecord);
> +  void AddRecords(const nsTArray<CacheIndexRecord *> &aRecords);

please keep the 'virtual' word

::: netwerk/cache2/CacheIndexIterator.cpp
@@ +68,5 @@
> +       aStatus));
> +
> +  // Make sure status will be a failure
> +  MOZ_ASSERT(NS_FAILED(aStatus));
> +  if (!NS_FAILED(aStatus)) {

NS_SUCCEEDED?

::: netwerk/cache2/CacheIndexIterator.h
@@ +18,5 @@
> +
> +class CacheIndexIterator : public nsISupports
> +{
> +public:
> +  NS_DECL_THREADSAFE_ISUPPORTS

use http://hg.mozilla.org/mozilla-central/annotate/c69c55582faa/xpcom/glue/nsISupportsImpl.h#l443 instead.  not necessary to introduce 3 virtual methods here.

::: netwerk/cache2/CacheStorageService.cpp
@@ +1092,5 @@
> +    nsCOMPtr<nsILoadContextInfo> info = CacheFileUtils::ParseKey(aContextKey);
> +    MOZ_ASSERT(info);
> +    if (info) {
> +      CacheFileIOManager::EvictByContext(info);
> +    }

I didn't realize few details here, this is not quite correct:
- we must not call this for PB contexts
- we must not call this when clearing the whole cache, there is EvictAll called for that, [1]

I will update my patch part as following:
This method will take an optional nsILoadContextInfo argument that when non-null you will pass to EvictByContext().  This way we will easily bypass for pb contexts and when called from Clear().  And there is also no need to reparse the key when there is probably the info object a frame above.

[1] http://mxr.mozilla.org/mozilla-central/source/netwerk/cache2/CacheStorageService.cpp#509
Attachment #8396485 - Flags: review?(honzab.moz) → feedback-
Attached patch CacheStorageService part v1 (obsolete) — Splinter Review
- hopefully the changes are well commented in the code
- for those being removed: 
  - the background one-by-one eviction event is no longer needed, entries are just synchrnously removed (means effectively dooming for memory and pb entries) and disk entries are obvious ; this event was responsible for the callback, it is now synthesized
  - before this patch we were just removing the memory hash table and special code in AddStorageEntry took care to doom the memory entry left in the disk hash table ; no longer needed, memory entries are rather removed manually from the disk hashtable as well in a single loop ; stats show that there is usually a very small number of memory-only entries and deletion of just memory entries is rare

Please apply this patch AFTER the CacheFile part.  The CacheStorageService changes in that patch must be removed before that.
Attachment #8391145 - Attachment is obsolete: true
Attachment #8398127 - Flags: review?(michal.novotny)
Attached patch CacheFile part v2 (obsolete) — Splinter Review
(In reply to Honza Bambas (:mayhemer) from comment #12)
> @@ +85,5 @@
> > +
> > +  CacheFileContextEvictorEntry *entry = nullptr;
> > +
> > +  for (uint32_t i = 0 ; i < mEntries.Length() ; ++i) {
> > +    if (mEntries[i]->mInfo->Equals(aLoadContextInfo)) {
> 
> hashtable by the key rather?

No, there will be just one item most of the time. In fact, hashtable would be slower in this case.


> @@ +111,5 @@
> > +    rv = CacheIndex::GetIterator(aLoadContextInfo, false,
> > +                                 getter_AddRefs(entry->mIterator));
> > +    if (NS_FAILED(rv)) {
> > +      // This could probably happen during shutdown. Remove the entry from
> > +      // the array, but leave the info on the disk. No entry can be opened
> 
> why the entry has to be removed?

The comment is quite verbose, so I think the reason is obvious. We keep the entry for two purposes:

- to delete files from disk
- to prevent opening files that should be considered as deleted

As the comment says, no entry file will be opened during shutdown, so nobody is going to call WasRemoved(). Also we cannot delete any entry file without the iterator.


> @@ +142,5 @@
> > +  if (!isUpToDate == !mIndexIsUpToDate) {
> > +    return NS_OK;
> > +  }
> > +
> > +  if (isUpToDate && mIndexIsUpToDate) {
> 
> I don't follow this condition, isn't is wrong?
> should the new state be remembered?

Wrong was the condition above this one. It is now fixed and some comments were added. 


> @@ +168,5 @@
> > +  return NS_OK;
> > +}
> > +
> > +nsresult
> > +CacheFileContextEvictor::WasRemoved(const nsACString &aKey, nsIFile *aFile,
> 
> s/aKey/aContextKey/

?? This is not a context key, this is a key.


> @@ +182,5 @@
> > +  if (!info) {
> > +    LOG(("CacheFileContextEvictor::WasRemoved() - Cannot parse key!"));
> > +    *_retval = false;
> > +    return NS_OK;
> > +  }
> 
> another argument for a str->entry hashtable.

?? How can a hashtable ensure that nobody passes a garbage to this function?


> ::: netwerk/cache2/CacheFileContextEvictor.h
> @@ +16,5 @@
> > +namespace net {
> > +
> > +class CacheIndexIterator;
> > +
> > +struct CacheFileContextEvictorEntry
> 
> class

Why?

 
> ::: netwerk/cache2/CacheFileIOManager.cpp
> @@ +1549,5 @@
> >                                       uint32_t aFlags,
> >                                       CacheFileHandle **_retval)
> >  {
> >    LOG(("CacheFileIOManager::OpenFileInternal() [hash=%08x%08x%08x%08x%08x, "
> > +       "key=%s, flags=%d]", LOGSHA1(aHash), PromiseFlatCString(aKey).get(),
> 
> AFAIK you don't need to flatten the string, just BeginReading() will do.

But BeginReading() doesn't guarantee that the string is null terminated, or am I wrong?


> @@ +2727,5 @@
> > +
> > +  nsCOMPtr<nsIEventTarget> ioTarget = IOTarget();
> > +  MOZ_ASSERT(ioTarget);
> > +
> > +  rv = ioTarget->Dispatch(ev, nsIEventTarget::DISPATCH_NORMAL);
> 
> would Dispatch(CURRENT_LEVEL) be useful here?

I don't understand this comment.


> ::: netwerk/cache2/CacheIndex.cpp
> @@ +3095,5 @@
> > +{
> > +  AssertOwnsLock();
> > +
> > +  for (uint32_t i = 0 ; i < mIterators.Length() ; ++i) {
> > +    mIterators[i]->RemoveRecord(aRecord);
> 
> no shouldBeUpdated check here?

No, comment added.


> @@ +3106,5 @@
> > +{
> > +  AssertOwnsLock();
> > +
> > +  for (uint32_t i = 0 ; i < mIterators.Length() ; ++i) {
> > +    mIterators[i]->ReplaceRecord(aOldRecord, aNewRecord);
> 
> as well here?

No, comment added as well.

 
> @@ +989,5 @@
> > +    mLocked = false;
> > +  }
> > +
> > +private:
> > +  nsRefPtr<CacheIndex> mIndex;
> 
> is there really a need to refcount?  Isn't this a stack class only?  also,
> should be a guard object and should hide default ctor().

Note that this is not a new code. I've just moved it to header file so iterators could use it too.


> ::: netwerk/cache2/CacheIndexIterator.h
> @@ +18,5 @@
> > +
> > +class CacheIndexIterator : public nsISupports
> > +{
> > +public:
> > +  NS_DECL_THREADSAFE_ISUPPORTS
> 
> use
> http://hg.mozilla.org/mozilla-central/annotate/c69c55582faa/xpcom/glue/
> nsISupportsImpl.h#l443 instead.  not necessary to introduce 3 virtual
> methods here.

This would declare AddRef()/Release() as non-virtual, but CacheIndexContextIterator inherits from this class so they must be virtual, right?
Attachment #8396485 - Attachment is obsolete: true
Attachment #8399500 - Flags: review?(honzab.moz)
Comment on attachment 8398127 [details] [diff] [review]
CacheStorageService part v1

Review of attachment 8398127 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache2/CacheStorageService.cpp
@@ +1125,5 @@
>    }
>  
> +  // An artificial callback.  This is a candidate for removal tho.  In the new cache
> +  // any 'doom' or 'evict' function ensures that the entry or entries being doomed
> +  // is/are not acessible after the function returns.  So there is probably no need

accessible
Attachment #8398127 - Flags: review?(michal.novotny) → review+
Attachment #8399500 - Attachment is obsolete: true
Attachment #8399500 - Flags: review?(honzab.moz)
Attachment #8399944 - Flags: review?(honzab.moz)
Comment on attachment 8399944 [details] [diff] [review]
CacheFile part v3 - fixed a compilation problem

Review of attachment 8399944 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments

::: netwerk/cache2/CacheFileContextEvictor.cpp
@@ +159,5 @@
> +    // Index is up to date and this is no change.
> +    if (mEvicting) {
> +      // If we are evicting, just exist.
> +      return NS_OK;
> +    }

the new comments may use some clean up I think

@@ +168,5 @@
> +         "Starting now."));
> +
> +    CreateIterators();
> +    StartEvicting();
> +    return NS_OK;

maybe just let it fall through?  this is a code duplication...

@@ +375,5 @@
> +    }
> +
> +    // We encode the load context info with the trailing null, but we cannot
> +    // trust the files found on disk, so we need to append a terminating null
> +    // to the decoded data before we use it.

"but we cannot trust the files found on disk" - really?

@@ +376,5 @@
> +
> +    // We encode the load context info with the trailing null, but we cannot
> +    // trust the files found on disk, so we need to append a terminating null
> +    // to the decoded data before we use it.
> +    decoded[decodedLen] = 0;

- base 64 is using '/' char that is not OK for windows, you need to replace it with '-'
- please use http://mxr.mozilla.org/mozilla-central/source/xpcom/io/Base64.h

@@ +425,5 @@
> +  keyPrefix.Append(":foo");
> +
> +  char* b64 = PL_Base64Encode(keyPrefix.get(),
> +                              keyPrefix.Length() + 1, // include trailing null
> +                              nullptr);

Base64.h here as well.

@@ +457,5 @@
> +  CloseIterators();
> +
> +  nsresult rv;
> +
> +  for (uint32_t i = 0 ; i < mEntries.Length() ; ) {

for (x; y; z) { ?

::: netwerk/cache2/CacheFileContextEvictor.h
@@ +46,5 @@
> +  // be considered as evicted. It returns true when there is a matching context
> +  // info to the given key and the last modified time of the entry file is
> +  // earlier than the time stamp of the time when the context was added to the
> +  // evictor.
> +  nsresult WasRemoved(const nsACString &aKey, nsIFile *aFile, bool *_retval);

WasEvicted?  (Removed is too general)

@@ +52,5 @@
> +private:
> +  // Writes information about eviction of the given context to the disk. This is
> +  // done for every context added to the evictor to be able to recover eviction
> +  // after a shutdown or crash. When the context file is found after startup, we
> +  // restore mTimeStap from the last modified time of the file.

mTimeStap

@@ +57,5 @@
> +  nsresult PersistEvictionInfoToDisk(nsILoadContextInfo *aLoadContextInfo);
> +  // Once we are done with eviction for the given context, the eviction info is
> +  // removed from the disk.
> +  nsresult RemoveEvictInfoFromDisk(nsILoadContextInfo *aLoadContextInfo);
> +  // Tries to load all context from the disk. This method is called just once

contexts?

::: netwerk/cache2/CacheFileIOManager.cpp
@@ +2686,5 @@
> +    if (NS_FAILED(rv)) {
> +      LOG(("CacheFileIOManager::EvictByContextInternal() - Cannot parse key in "
> +           "handle! [handle=%p, key=%s]", handles[i].get(),
> +           handles[i]->Key().get()));
> +      MOZ_CRASH("Unexpected error!");

then remove the MOZ_ASSERT(NS_SUCCEEDED)?

::: netwerk/cache2/CacheIndex.cpp
@@ +352,5 @@
>         index->mDontMarkIndexClean));
>  
> +  LOG(("CacheIndex::PreShutdown() - Closing iterators."));
> +  for (uint32_t i = 0 ; i < index->mIterators.Length() ; ) {
> +    rv = index->mIterators[i]->CloseInternal(NS_ERROR_FAILURE);

maybe a comment that CloseInternal removes itself from the index->mIterators array.

@@ +3106,5 @@
> +{
> +  AssertOwnsLock();
> +
> +  for (uint32_t i = 0 ; i < mIterators.Length() ; ++i) {
> +    // Remove the record from iterator always, it make no sence to return

it makes no

::: netwerk/cache2/CacheIndexContextIterator.h
@@ +14,5 @@
> +
> +class CacheIndexContextIterator : public CacheIndexIterator
> +{
> +public:
> +  NS_DECL_THREADSAFE_ISUPPORTS

remove (correctly inherited from the base class)

::: netwerk/cache2/CacheIndexIterator.h
@@ +18,5 @@
> +
> +class CacheIndexIterator : public nsISupports
> +{
> +public:
> +  NS_DECL_THREADSAFE_ISUPPORTS

as discussed, this can use the inline as well.  you could also use mfbt/RefPtr.h, up to you...

@@ +39,5 @@
> +  nsresult CloseInternal(nsresult aStatus);
> +
> +  virtual bool ShouldBeNewAdded() { return mAddNew; }
> +  virtual void AddRecord(CacheIndexRecord *aRecord);
> +  virtual void AddRecords(const nsTArray<CacheIndexRecord *> &aRecords);

only AddRecord and AddRecords should actually be virtual, no?
Attachment #8399944 - Flags: review?(honzab.moz) → review+
Attached patch CacheFile part v4 (obsolete) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=64e9f8072eae
Attachment #8399944 - Attachment is obsolete: true
Attachment #8401969 - Flags: review+
Michal, anything that prevents landing this right now?
Assignee: nobody → michal.novotny
Status: NEW → ASSIGNED
I'm waiting for the result on try server: https://tbpl.mozilla.org/?tree=Try&rev=3c8fe94c0360
Attachment #8398127 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/40a275481f3f
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: