Closed Bug 913819 Opened 11 years ago Closed 10 years ago

HTTP cache v2: preload several chunks in advance in input stream to speed up reading

Categories

(Core :: Networking: Cache, defect)


Tracking


RESOLVED FIXED
mozilla32

People

(Reporter: mayhemer, Assigned: michal)

References

Details

(Whiteboard: [cache2])

Attachments

(1 file, 3 obsolete files)

Part of this bug should be:
- determine optimal chunk size (now 16kB)
- preload chunks in advance to avoid repeated OnDataAvailable calls carrying only 16kB of data each (pretty suboptimal, as tests show)
Whiteboard: [cache2]
Blocks: cache2enable
I don't think there will be enough time to finish this in the given time frame.  One of the first optimizations to come after that!
Blocks: cache2afterenable
No longer blocks: cache2enable
This shows up as a cause of talos regressions, so it needs to be done for the first deployment.  Another related problem may be that there are too many posts in the async reading chain.  I analyzed this in the past with the Event Tracer and had a patch that tried to remove most of them, though it was a bit hacky.

I can play with just enlarging chunk to say 512kB, but from my experience this may not have much performance influence.
Blocks: cache2enable
No longer blocks: cache2afterenable
(In reply to Honza Bambas (:mayhemer) from comment #2)
> I can play with just enlarging chunk to say 512kB, but from my experience
> this may not have much performance influence.

I don't understand this part of your comment. If 16kB chunk size causes the talos regression, then increasing the chunk size must have a positive effect.
(In reply to Michal Novotny (:michal) from comment #3)
> (In reply to Honza Bambas (:mayhemer) from comment #2)
> > I can play with just enlarging chunk to say 512kB, but from my experience
> > this may not have much performance influence.
> 
> I don't understand this part of your comment. If 16kB chunk size causes the
> talos regression, then increasing the chunk size must have a positive effect.

It will, but I somehow sense it may not be enough.  I'll definitely try it right now; it's easy.
tsvgr_opacity

cache2 - 16kb (m-c) : https://tbpl.mozilla.org/?tree=Try&rev=a505e96204f6
cache2 - 512kB (patched) : https://tbpl.mozilla.org/?tree=Try&rev=56e8a11532b3
So, with the chunk size at 512kB it seems we are still slower, though not as dramatically this time:

I've ignored extreme values over 240ms and took about 15 measurements each for cache1 and cache2 with a 512kB chunk.

               median,   average
cache1      :  148.5     159.2666666667
cache2+512kb:  175       167.4333333333


Seems like chunk size enlargement alone is not enough.  However, doing so may still be a good improvement in general.

Michal, what size of the chunk would you suggest?
Flags: needinfo?(michal.novotny)
Ignore my previous comment!  Asking on IRC is always a good thing to do ;)  This is how the results should be compared:

http://compare-talos.mattn.ca/?oldRevs=c0ec2e234457&newRev=56e8a11532b3&server=graphs.mozilla.org&submit=true

It shows that we actually improve svg opacity with this.  Cool!

I'll change the bug dependencies.
Blocks: cache2afterenable
No longer blocks: cache2enable
Severity: major → normal
I moved the discussion to bug 988318.
Flags: needinfo?(michal.novotny)
Seems like some of our talos tests are still regressing because of this (https://bugzilla.mozilla.org/show_bug.cgi?id=982598#c15).

Tentatively blocking cache2 enable.

Michal, if you give me some clues how this should be implemented I can work on it.
Blocks: cache2enable
No longer blocks: cache2afterenable
I can imagine some algorithm like |while (preloadedChunks.count < 4) preloadChunk(lastPreloadedChunkIndex + 1);| invoked immediately after we verify the metadata, and maybe make the "4" a preference.
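Roughly, a sketch of that idea in C++ (the container and helper names here are made up for illustration, not the actual cache2 code):

// Sketch only - illustrative names, not real cache2 members.
// Run on the cache IO thread right after the entry's metadata is verified.
void PreloadAhead(uint32_t aFirstIndex)
{
  // The "4" would come from a preference, like CacheObserver::PreloadChunkCount()
  // does in the later patch.
  const uint32_t kPreloadCount = 4;

  for (uint32_t i = aFirstIndex; i < aFirstIndex + kPreloadCount; ++i) {
    if (preloadedChunks.Contains(i)) {
      continue;  // already in memory or currently being read
    }
    preloadChunk(i);  // kick off an asynchronous read of chunk i from disk
  }
}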
Attached patch patch v1 (obsolete) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=cfcfabd70081
Attachment #8414068 - Flags: review?(honzab.moz)
Comment on attachment 8414068 [details] [diff] [review]
patch v1

Review of attachment 8414068 [details] [diff] [review]:
-----------------------------------------------------------------

Looks pretty good! f+

This patch is, however, missing one important factor we were talking about: start the preload immediately after the metadata has been successfully loaded (ASAP on the IO thread).  I checked in the debugger that the first time CacheFile::PreloadChunks is called, it is called from:

>	xul.dll!mozilla::net::CacheFile::PreloadChunks(0) Line 1191	C++
 	xul.dll!mozilla::net::CacheFile::GetChunkLocked(0, false, 0x1826a798, 0x1826a7a8) Line 1056	C++
 	xul.dll!mozilla::net::CacheFileInputStream::EnsureCorrectChunk(false) Line 546	C++
 	xul.dll!mozilla::net::CacheFileInputStream::AsyncWait(0x189cd03c, 0, 0, 0x00452c80) Line 333	C++
 	xul.dll!nsInputStreamPump::EnsureWaiting() Line 142	C++
 	xul.dll!nsInputStreamPump::AsyncRead(0x18cd3318, 0x00000000) Line 382	C++
 	xul.dll!mozilla::net::nsHttpChannel::ReadFromCache(true) Line 3609	C++
 	xul.dll!mozilla::net::nsHttpChannel::ContinueConnect() Line 366	C++
 	xul.dll!mozilla::net::nsHttpChannel::OnCacheEntryAvailableInternal(0x189377e0, false, 0x00000000, NS_OK) Line 3110	C++
 	xul.dll!mozilla::net::nsHttpChannel::OnCacheEntryAvailable(0x189377e0, false, 0x00000000, NS_OK) Line 3056	C++
 	xul.dll!mozilla::net::CacheEntry::InvokeAvailableCallback({...}) Line 719	C++
 	xul.dll!mozilla::net::CacheEntry::AvailableCallbackRunnable::Run() Line 176	C++

...on the main thread - that is way too late.

Best would be to start in CacheFile::OnMetadataRead.  In CacheFile::ShouldKeepChunk then, when there are no input streams, behave as if there were a single input stream active at position 0 - which effectively means always preloading the first 4 chunks of payload.
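In other words, something roughly like this (a simplified sketch of the suggestion, not the actual patch):

// Simplified sketch; not the actual patch.
nsresult CacheFile::OnMetadataRead(nsresult aResult)
{
  // ... existing metadata handling ...

  if (NS_SUCCEEDED(aResult) && !mMemoryOnly) {
    // Start reading the payload ahead of any consumer, right on the IO thread.
    PreloadChunks(0);
  }
  return NS_OK;
}

ShouldKeepChunk() would then keep the first PreloadChunkCount() chunks alive while no input stream exists yet (see the one-liner suggested below).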

You may counter-argue that when no one ever opens an input stream (which is IMO very, very rare) we will keep the chunks almost forever for no use.  To avoid that (which I think can wait until after enabling) we may resurrect some of the ThrowMemoryCacheData() callers.  It's still there, just not used right now.

::: netwerk/cache2/CacheFile.cpp
@@ +987,5 @@
> +  // Preload chunks from disk when this is disk backed entry and the listener
> +  // is reader. NOTE: preloading code calls this method without callback, so
> +  // do not preload chunks when there is no listener.
> +  bool preload = !mMemoryOnly && !aWriter && aCallback;
> +  bool calledByPreloader = !aWriter && !aCallback;

time to change |bool aWriter| to |enum { READER, WRITER, PRELOADER } aCaller| ?

Guessing the caller type from the values of some arguments is definitely not a robust way of coding, but if this passes try, let's leave it for this version (a followup is OK).
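As a rough illustration of that followup (the enum and the signature here are hypothetical, not what this patch does):

// Hypothetical followup sketch.
enum ECallerType {
  READER,    // an input stream asks for the chunk
  WRITER,    // the output stream asks for the chunk
  PRELOADER  // PreloadChunks() asks for the chunk
};

nsresult CacheFile::GetChunkLocked(uint32_t aIndex, ECallerType aCaller,
                                   CacheFileChunkListener *aCallback,
                                   CacheFileChunk **_retval)
{
  bool preload = !mMemoryOnly && aCaller == READER;
  // ... no more guessing the caller from the combination of argument values ...
}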

@@ +996,5 @@
>    if (mChunks.Get(aIndex, getter_AddRefs(chunk))) {
>      LOG(("CacheFile::GetChunkLocked() - Found chunk %p in mChunks [this=%p]",
>           chunk.get(), this));
>  
> +    MOZ_ASSERT(!calledByPreloader, "Unexpected!");

Please add a comment explaining why we must not get here when called by the preload code, and do the same in the other places where you add this assert.

@@ +1180,5 @@
> +  AssertOwnsLock();
> +
> +  uint32_t lastChunk = aIndex + CacheObserver::PreloadChunkCount();
> +
> +  for (uint32_t i = (aIndex + 1); i <= lastChunk; ++i) {

I think it would be better to pass the index you actually want to preload from, not the index of a chunk we already have (from what I can see of how this is called).

That is also needed to preload from chunk 0 in OnMetadataRead(), and IMO it makes a bit more sense.
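That is, roughly (a sketch of the calling convention only, not the final code):

// GetChunkLocked():  PreloadChunks(aIndex + 1);  // chunk aIndex is already in hand
// OnMetadataRead():  PreloadChunks(0);           // nothing loaded yet
// and inside PreloadChunks() the loop starts at the passed index:
//   for (uint32_t i = aIndex; i < aIndex + CacheObserver::PreloadChunkCount(); ++i) ...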

@@ +1189,5 @@
> +      return;
> +    }
> +
> +    if (mChunks.GetWeak(i) || mCachedChunks.GetWeak(i)) {
> +      // This chunk is already in memory or is being read right now.

maybe LOG this?

@@ +1216,5 @@
> +#else
> +  // Cache chunk when this is memory only entry or we don't have a handle yet.
> +  if (mMemoryOnly || mOpeningFile) {
> +    return true;
> +  }

simply add:

if (!mInputs.Length() && aIndex < CacheObserver::PreloadChunkCount()) return true; :)
Attachment #8414068 - Flags: review?(honzab.moz) → feedback+
Attached patch patch v2 (obsolete) — Splinter Review
(In reply to Honza Bambas (:mayhemer) from comment #12)
> @@ +1189,5 @@
> > +      return;
> > +    }
> > +
> > +    if (mChunks.GetWeak(i) || mCachedChunks.GetWeak(i)) {
> > +      // This chunk is already in memory or is being read right now.
> 
> maybe LOG this?

This will happen very often so I don't want to bloat the log with this message.


> @@ +1216,5 @@
> > +#else
> > +  // Cache chunk when this is memory only entry or we don't have a handle yet.
> > +  if (mMemoryOnly || mOpeningFile) {
> > +    return true;
> > +  }
> 
> simply add:
> 
> if (!mInputs.Length() && aIndex < CacheObserver::PreloadChunkCount()) return
> true; :)

I've introduced a new flag mPreloadWithoutInput instead of checking mInputs.Length(). The idea is that some timer will clear this flag and perform a cleanup of the cached chunks, but I'll do that later in a follow-up bug.
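Roughly, the check then looks like this (a simplified sketch of the idea, not the exact hunk):

// Simplified sketch; not the exact patch.
bool CacheFile::ShouldKeepChunk(uint32_t aIndex)
{
  if (mMemoryOnly || mOpeningFile) {
    return true;
  }

  // mPreloadWithoutInput is set when we preloaded before any input stream was
  // opened; a timer in the follow-up bug will clear it and release the chunks.
  if (mPreloadWithoutInput && aIndex < CacheObserver::PreloadChunkCount()) {
    return true;
  }

  // ... otherwise decide based on the positions of the existing input streams ...
  return false;
}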
Attachment #8414068 - Attachment is obsolete: true
Attachment #8414718 - Flags: review?(honzab.moz)
Comment on attachment 8414718 [details] [diff] [review]
patch v2

Review of attachment 8414718 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!  r=honzab

Now we just need to check that this helps with the talos regressions.

::: netwerk/cache2/CacheFile.cpp
@@ +1231,5 @@
> +  if (mMemoryOnly || mOpeningFile) {
> +    return true;
> +  }
> +
> +  uint32_t preloadChunkCount = CacheObserver::PreloadChunkCount();

Maybe return false when preloadChunkCount == 0 here (even though the math below will not pass for 0 anyway).

::: netwerk/cache2/CacheFile.h
@@ +188,5 @@
>    bool           mOpenAsMemoryOnly;
>    bool           mDataAccessed;
>    bool           mDataIsDirty;
>    bool           mWritingMetadata;
> +  bool           mPreloadWithoutInput;

Maybe mPreloadWithoutInputStreams (Input is too general) or mInitialPreload or so..  Up to you.
Attachment #8414718 - Flags: review?(honzab.moz) → review+
Attached patch patch v3 - final patch (obsolete) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=f2f2351b07fc
Attachment #8414718 - Attachment is obsolete: true
Attachment #8414823 - Flags: review+
Attached patch patch v4Splinter Review
Fixed a call to CacheFile::PreloadChunks() that was being made outside the lock from CacheFile::OnMetadataRead().

https://tbpl.mozilla.org/?tree=Try&rev=c379ea07551a
Attachment #8414823 - Attachment is obsolete: true
Attachment #8415012 - Flags: review+
Michal, can you please land the patch even though it doesn't fully solve the talos regression?  We want to do a trial on Nightly for a few days again, and this should be in by then.  Thanks.
The discussion about invoking the preload from CacheEntry::InvokeCallback() after the entry has been examined with OnCacheEntryCheck() should happen in a followup.

The more I think about it, the more I tend to preload every time (as the current patch does), even though revalidation may fail.  The ratio is more or less 50:50 (200 vs. 304 responses).
Blocks: 1004185
https://hg.mozilla.org/mozilla-central/rev/8a6a5bfdedc6
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32