Closed Bug 1163719 Opened 4 years ago Closed 4 years ago

Write StartupCache entries by loading order

Categories

(Core :: XPCOM, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: jchen, Assigned: jchen)

Details

Attachments

(1 file, 2 obsolete files)

Right now StartupCache entries are ordered arbitrarily -- according to the hashes in the hashtable, but when reading from the cache, it's better to have the entries in loading order, to minimize hard faults when reading from the archive.
This patch makes us write StartupCache entries in the order that we load them. Theoretically, this will improve performance when reading from the cache. However, I have not actually measured the impact, because I'm not sure what the best way to do the measurement is (maybe something like counting the number of hard faults when reading the StartupCache archive).
Attachment #8604352 - Flags: review?(nfroyd)
Comment on attachment 8604352 [details] [diff] [review]
Store startup cache entries in loading order (v1)

Review of attachment 8604352 [details] [diff] [review]:
-----------------------------------------------------------------

f+ pending some performance numbers, both on the reading *and* writing sides.

Also, WDYT about just making CacheEntry inherit from mozilla::LinkedListElement<CacheEntry>, and keeping a LinkedList that we can iterate over?  Should be slightly smaller.

::: startupcache/StartupCache.cpp
@@ +413,3 @@
>  {
> +  if (!data) {
> +    return;

Why is this extra check necessary now?

@@ +426,5 @@
>    rv = writer->HasEntry(key, &hasEntry);
>    NS_ASSERTION(NS_SUCCEEDED(rv) && hasEntry == false, 
>                 "Existing entry in disk StartupCache.");
>  #endif
> +  rv = writer->AddEntryStream(key, holder->time, nsIZipWriter::COMPRESSION_BEST,

Can you please put the COMPRESSION_BEST change (hooray for excellent type checking) in a separate bug?  I'm not so sure that we want the best compression here, possibly just the fastest.

@@ +483,5 @@
>    holder.time = now;
>  
> +  for (size_t i = 0; i < mPendingWrites.length(); ++i) {
> +    const auto& key = mPendingWrites[i];
> +    CacheCloseHelper(key, mTable.Get(key), &holder);

Can you file a followup bug on eliminating CacheWriteHolder now that we're not constrained by the Enumerate() interface?

::: startupcache/StartupCache.h
@@ +164,5 @@
>                                           mozilla::MallocSizeOf mallocSizeOf,
>                                           void *);
>  
>    nsClassHashtable<nsCStringHashKey, CacheEntry> mTable;
> +  mozilla::Vector<nsCString> mPendingWrites;

I think it'd be slightly better to use nsTArray here.
Attachment #8604352 - Flags: review?(nfroyd) → feedback+
Vladan, do you have any suggestions for testing the performance of the startup cache changes here?
Flags: needinfo?(vdjeric)
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #3)
> Vladan, do you have any suggestions for testing the performance of the
> startup cache changes here?

A Talos try push with the startup tests is a good place to begin.
Re-running the startup Talos tests locally with a cold disk cache and pre-fetch disabled would also be interesting. 
Measuring "hard faults" & read timings locally during a brief but "typical" session would also be interesting.
We should also add Telemetry for this and evaluate the success on Aurora & Nightly. Of course, we should look for impact on startup numbers from Telemetry as well.
Flags: needinfo?(vdjeric)
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #2)
> Comment on attachment 8604352 [details] [diff] [review]
> Store startup cache entries in loading order (v1)
> 
> Review of attachment 8604352 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> f+ pending some performance numbers, both on the reading *and* writing sides.

So I measured the total time spent reading the cache during typical startup, and total time spent writing the cache on first-run. I ran each test 10 times and calculated the average. I measured Fennec on a Nexus 4, because this bug benefits Fennec the most. Desktop pre-compiles its startup cache into omnijar and already reorders the resulting omnijar. We can't pre-compile for Fennec (yet) so it doesn't benefit.

Without reordering, the average read took 79.6ms (stddev=5.6ms)
With reordering, the average read took 68.0ms (stddev=3.5ms)
The difference in read time was -11.6ms (-14.6%)

Without reordering, the average write took 326ms
With reordering, the average write took 337ms
The difference in write time was +11ms (+3.4%)

I think the read improvement is small but not negligible and is worth the increase in (one-time) write time.

In terms of memory, this change adds around 1.5kB to hold the array itself on first-run (the acual string buffers are shared with the hashtable). The array is not populated on subseuqent runs so there's no additional memory cost.

> Also, WDYT about just making CacheEntry inherit from
> mozilla::LinkedListElement<CacheEntry>, and keeping a LinkedList that we can
> iterate over?  Should be slightly smaller.

CacheEntry only contains the data of the entry, but not the name of the entry. However, when we write the entry, we need both the name and the data. If we only have a list of entry data, we can't easily retrieve the corresponding entry names. So we really need a list of entry names, and retrieve the corresponding entry data through the hashtable.

> ::: startupcache/StartupCache.cpp
> @@ +413,3 @@
> >  {
> > +  if (!data) {
> > +    return;
> 
> Why is this extra check necessary now?

It was a sanity check to make sure the key was found in the hashtable. I changed it to an assertion.

> @@ +426,5 @@
> >    rv = writer->HasEntry(key, &hasEntry);
> >    NS_ASSERTION(NS_SUCCEEDED(rv) && hasEntry == false, 
> >                 "Existing entry in disk StartupCache.");
> >  #endif
> > +  rv = writer->AddEntryStream(key, holder->time, nsIZipWriter::COMPRESSION_BEST,
> 
> Can you please put the COMPRESSION_BEST change (hooray for excellent type
> checking) in a separate bug?  I'm not so sure that we want the best
> compression here, possibly just the fastest.

Removed the change.

> @@ +483,5 @@
> >    holder.time = now;
> >  
> > +  for (size_t i = 0; i < mPendingWrites.length(); ++i) {
> > +    const auto& key = mPendingWrites[i];
> > +    CacheCloseHelper(key, mTable.Get(key), &holder);
> 
> Can you file a followup bug on eliminating CacheWriteHolder now that we're
> not constrained by the Enumerate() interface?

Okay.

> ::: startupcache/StartupCache.h
> @@ +164,5 @@
> >                                           mozilla::MallocSizeOf mallocSizeOf,
> >                                           void *);
> >  
> >    nsClassHashtable<nsCStringHashKey, CacheEntry> mTable;
> > +  mozilla::Vector<nsCString> mPendingWrites;
> 
> I think it'd be slightly better to use nsTArray here.

Changed to nsTArray. I also changed |idStr| in PutBuffer to an nsCString; this was necessary to make the hashtable keys and the array entries share string buffers.
Attachment #8604352 - Attachment is obsolete: true
Attachment #8605445 - Flags: review?(nfroyd)
Comment on attachment 8605445 [details] [diff] [review]
Store startup cache entries in loading order (v2)

Review of attachment 8605445 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for providing the performance numbers.  I thought the cache entries held their names for some reason, but you're right, the linked list wouldn't work all that well.

::: startupcache/StartupCache.h
@@ +164,5 @@
>                                           mozilla::MallocSizeOf mallocSizeOf,
>                                           void *);
>  
>    nsClassHashtable<nsCStringHashKey, CacheEntry> mTable;
> +  nsTArray<nsCString> mPendingWrites;

Can you add a comment here, something like:

// We write out the cache entries in access order.  This array stores the keys in the order they were accessed.
Attachment #8605445 - Flags: review?(nfroyd) → review+
Added check for existing entries in PutBuffer. StartupCache specifies that we should *not* call PutBuffer twice for the same entry, but we seem to be doing that in several places already. We should address this in a follow-up.
Attachment #8605445 - Attachment is obsolete: true
Attachment #8609596 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/d773854c6324
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.