Closed Bug 1491151 Opened Last year Closed Last year

Add an MRU cache data structure

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: erahm, Assigned: erahm)

Details

Attachments

(7 files, 1 obsolete file)

Attached patch Part 1: Add MruCache (obsolete) — Splinter Review
This adds a most recently used cache that can be used as a quick lookup table
for frequently used entries.

This design works for the 5 use cases we currently have, but I'm open to suggestions on different approaches.
Attachment #9008916 - Flags: review?(nfroyd)
Attachment #9008916 - Flags: review?(bugs)
Assignee: nobody → erahm
Status: NEW → ASSIGNED
Attachment #9008917 - Flags: review?(bugs)
This adds debug statistics to the MRU cache. It's helpful for verifiying it's being used and what effectiveness it has. We probably don't want to land it.
Comment on attachment 9008916 [details] [diff] [review]
Part 1: Add MruCache

Review of attachment 9008916 [details] [diff] [review]:
-----------------------------------------------------------------

I have a bunch of style/taste comments, but overall this looks like a nice cleanup of a bunch of duplicated code.  f+ while we work out the below.

::: xpcom/ds/MruCache.h
@@ +20,5 @@
> +///
> +/// `NotNull` will return true if `Value` is not a pointer type or if the
> +/// pointer value is not null.
> +template <typename Value, bool IsPtr = std::is_pointer<Value>::value>
> +struct NullCheckerHelper

Maybe `EmptyChecker` or similar would be a better name?  `Null` is too tied up with pointers; what we are really using this for is determining whether cache entries are worth checking or not, and I think the name should reflect that?

@@ +43,5 @@
> +///   - HashNumber Hash(const KeyType& aKey)
> +///   - bool Match(const KeyType& aLhs, const ValueType& aRhs)
> +/// @see MruCacheHasher for an example.
> +template <class Hasher, size_t Size=31>
> +class MruCache

Alternative design suggestion for this:

template<class Key, class Value, class Cache, size_t Size=31>
class MruCache
{
public:
  using KeyType = Key;
  using ValueType = Value;

  // Will have to use `this->` qualification for `Hash`/`Match` below.
  ...
};

class NodeInfoCache : public MruCache<NodeInfoInner, NodeInfo*, NodeInfoCache>
{
  static HashNumber Hash(const KeyType&) {...}
  static bool Match(const KeyType&, const ValueType&) {...}
};

I think the client code is slightly nicer in this case: there's a bit more of a connection between the thing and what sort of values it's storing.  This is a bikeshedding sort of thing, so not going to insist, more just putting it out there for consideration.

@@ +100,5 @@
> +  ///    }
> +  ///
> +  ///    auto foo = new Foo();
> +  ///    mTable.Insert(aKey, foo);
> +  ///    p.Insert(foo);

I think this method should be called `Set` rather than `Insert`; the latter connotes growing the containing entity, which we are explicitly not doing.  Or maybe `Entry` should just provide `operator=` instead of `Insert`?

@@ +121,5 @@
> +      Data() = std::move(aValue);
> +      mMatch = true;
> +    }
> +
> +    void Insert(ValueType& aValue)

Better C++ here, if scarier, would be something like:

template<typename ValueConvertible> // Or `U`
void Insert(ValueConvertible&& aValue)
{
  Data() = std::forward<ValueConvertible>(aValue);
  mMatch = true;
}

and then you get both methods for free, plus enabling passing in things convertible to ValueType.

@@ +158,5 @@
> +private:
> +    ValueType mCache[Size] = {};
> +};
> +
> +/// Example hash adapter. Intended to be specialized for the main container type.

I think this is misleading documentation: none of the subsequent patches specialize this class.  They all inherit from it, which is quite different.  And at that point, it's also not an "example". :)

@@ +167,5 @@
> +  using KeyType = Key;
> +  using ValueType = Value;
> +
> +  static HashNumber Hash(const KeyType& aKey) { return HashGeneric(aKey); }
> +  static bool Match(const KeyType& aKey, const ValueType& aVal)

I'm not sure these are doing enough work to warrant including them (only one of the subsequent patches uses the default definition), and I tend to think overriding non-virtual methods is somewhat confusing.  WDYT?
Attachment #9008916 - Flags: review?(nfroyd) → feedback+
Comment on attachment 9008916 [details] [diff] [review]
Part 1: Add MruCache

>+
>+/// Helper struct for checking if a value is null.
>+///
>+/// `NotNull` will return true if `Value` is not a pointer type or if the
>+/// pointer value is not null.
Please use normal comments, so either // or /* */
Same also elsewhere.
(/// smells like rust-ism)


>+template <typename Value, bool IsPtr = std::is_pointer<Value>::value>
>+struct NullCheckerHelper
>+{
>+  static bool NotNull(const Value&) { return true; }
>+};
>+/// Template specialization for the `IsPtr == true` case.
>+template <typename Value>
>+struct NullCheckerHelper<Value, true>
>+{
>+  static bool NotNull(const Value& aVal) { return aVal != nullptr; }
>+};
>+
Because NotNull is used as a type in Gecko, could the methods be IsNotNull
Or is IsNonNull better?




>+  /// Helper that holds an entry that matched a lookup key. Usage:
>+  ///
>+  ///    auto p = mCache.Lookup(aKey);
Of course I'd prefer using concrete type here, since auto just hides the information from reader.
But I guess we use auto often with datastructures, so fine.

>+struct StringStruct
>+{
>+  nsCString mKey;
>+  nsCString mOther;
>+};
>+
>+struct StringStructHasher : public MruCacheHasher<nsCString, StringStruct>
>+{
>+  static HashNumber Hash(const KeyType& aKey) { return *aKey.BeginReading() - 1; }
>+  static bool Match(const KeyType& aKey, const ValueType& aVal) { return aKey == aVal.mKey; }
>+};
>+
>+TEST(MruCache, TestEmptyCache)
>+{
>+  {
>+    // Test a basic empty cache.
>+    mozilla::MruCache<IntMapHasher> mru;
>+
>+    // Make sure the default values are set.
>+    for (int i = 1; i < 32; i++) {
>+      auto p = mru.Lookup(i);
>+
>+      // Shouldn't be found.
>+      EXPECT_FALSE(p);
>+
>+      // Index should currently be defaulted.
>+      EXPECT_EQ(p.Data(), 0);
Ah, this is a bit weird, that Data() returns always some value, even though it might not be valid.

I think Data() implementation should assert that it isn't use if mMatch is false.
then Insert implementation needs to set mMatch to true before accessing Data()



NotNull renamed and asserting in Data(), r+
Attachment #9008916 - Flags: review?(bugs) → review+
Comment on attachment 9008917 [details] [diff] [review]
Part 2: Use MruCache for eTLD table

># HG changeset patch
># User Eric Rahm <erahm@mozilla.com>
># Date 1536615989 25200
>#      Mon Sep 10 14:46:29 2018 -0700
># Node ID 309939a7562284aaaabe31de7db6b185d12057a1
># Parent  9a0805c9c80556fa44d59c9d929c6f39b16b3e08
>Bug 1491151 - Part 2: Use MruCache for eTLD table. r=smaug
>
>diff --git a/netwerk/dns/nsEffectiveTLDService.cpp b/netwerk/dns/nsEffectiveTLDService.cpp
>--- a/netwerk/dns/nsEffectiveTLDService.cpp
>+++ b/netwerk/dns/nsEffectiveTLDService.cpp
>@@ -216,15 +216,18 @@ nsEffectiveTLDService::GetBaseDomainInte
>   // main thread-only as the cache is not thread-safe.
>   TLDCacheEntry* entry = nullptr;
>   if (aAdditionalParts == 1 && NS_IsMainThread()) {
>-    if (LookupForAdd(aHostname, &entry)) {
>+    auto p = mMruTable.Lookup(aHostname);
>+    if (p) {
>       // There was a match, just return the cached value.
>-      aBaseDomain = entry->mBaseDomain;
>+      aBaseDomain = p.Data().mBaseDomain;
>       if (trailingDot) {
>         aBaseDomain.Append('.');
>       }
> 
>       return NS_OK;
>     }
>+
>+    entry = &p.Data();
Hmm, I see, you use Data() this way. Could we please not. This is rather confusing and error prone.
One method for accessing valid values and another for using existing lookup result to update the entry.
update the main and and then this too
Attachment #9008917 - Flags: review?(bugs) → review+
Attachment #9008918 - Flags: review?(bugs) → review+
Attachment #9008919 - Flags: review?(bugs) → review+
Attachment #9008920 - Flags: review?(bugs) → review+
Comment on attachment 9008921 [details] [diff] [review]
Part 6: Convert CCGraphBuilder to use MRU cache

>-  Variant<PtrInfo*, uint32_t> cacheVariant = mGraphCache.GetEntryOrIndex(aPtr);
>-  if (cacheVariant.is<PtrInfo*>()) {
>-    MOZ_ASSERT(cacheVariant.as<PtrInfo*>()->mParticipant == aParticipant,
>+  auto cp = mGraphCache.Lookup(aPtr);
not that other patches have had descriptive variable names either, but I have no idea what cp might mean.

>@@ -2319,7 +2292,7 @@ CCGraphBuilder::AddNode(void* aPtr, nsCy
>                "nsCycleCollectionParticipant shouldn't change!");
>   }
> 
>-  mGraphCache.Add(cacheVariant.as<uint32_t>(), result);
>+  cp.Insert(result);
This should use the new method to update the existing entry in the cache.
(the new version of Data())
Attachment #9008921 - Flags: review?(bugs) → review+
(In reply to Nathan Froyd [:froydnj] from comment #8)
> Comment on attachment 9008916 [details] [diff] [review]
> Part 1: Add MruCache
> 
> Review of attachment 9008916 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I have a bunch of style/taste comments, but overall this looks like a nice
> cleanup of a bunch of duplicated code.  f+ while we work out the below.
> 
> ::: xpcom/ds/MruCache.h
> @@ +20,5 @@
> > +///
> > +/// `NotNull` will return true if `Value` is not a pointer type or if the
> > +/// pointer value is not null.
> > +template <typename Value, bool IsPtr = std::is_pointer<Value>::value>
> > +struct NullCheckerHelper
> 
> Maybe `EmptyChecker` or similar would be a better name?  `Null` is too tied
> up with pointers; what we are really using this for is determining whether
> cache entries are worth checking or not, and I think the name should reflect
> that?

I'll go with EmptyChecker.

> @@ +43,5 @@
> > +///   - HashNumber Hash(const KeyType& aKey)
> > +///   - bool Match(const KeyType& aLhs, const ValueType& aRhs)
> > +/// @see MruCacheHasher for an example.
> > +template <class Hasher, size_t Size=31>
> > +class MruCache
> 
> Alternative design suggestion for this:
> 
> template<class Key, class Value, class Cache, size_t Size=31>
> class MruCache
> {
> public:
>   using KeyType = Key;
>   using ValueType = Value;
> 
>   // Will have to use `this->` qualification for `Hash`/`Match` below.
>   ...
> };
> 
> class NodeInfoCache : public MruCache<NodeInfoInner, NodeInfo*,
> NodeInfoCache>
> {
>   static HashNumber Hash(const KeyType&) {...}
>   static bool Match(const KeyType&, const ValueType&) {...}
> };
> 
> I think the client code is slightly nicer in this case: there's a bit more
> of a connection between the thing and what sort of values it's storing. 
> This is a bikeshedding sort of thing, so not going to insist, more just
> putting it out there for consideration.

That seems reasonable. Originally I was hoping to have things work a bit like w/ nsHashKeys, but moved away from that.

> @@ +100,5 @@
> > +  ///    }
> > +  ///
> > +  ///    auto foo = new Foo();
> > +  ///    mTable.Insert(aKey, foo);
> > +  ///    p.Insert(foo);
> 
> I think this method should be called `Set` rather than `Insert`; the latter
> connotes growing the containing entity, which we are explicitly not doing. 
> Or maybe `Entry` should just provide `operator=` instead of `Insert`?

I'll go with switching to `Set`.

> @@ +121,5 @@
> > +      Data() = std::move(aValue);
> > +      mMatch = true;
> > +    }
> > +
> > +    void Insert(ValueType& aValue)
> 
> Better C++ here, if scarier, would be something like:
> 
> template<typename ValueConvertible> // Or `U`
> void Insert(ValueConvertible&& aValue)
> {
>   Data() = std::forward<ValueConvertible>(aValue);
>   mMatch = true;
> }
> 
> and then you get both methods for free, plus enabling passing in things
> convertible to ValueType.

Thanks for the tip, that's much cleaner.

> @@ +158,5 @@
> > +private:
> > +    ValueType mCache[Size] = {};
> > +};
> > +
> > +/// Example hash adapter. Intended to be specialized for the main container type.
> 
> I think this is misleading documentation: none of the subsequent patches
> specialize this class.  They all inherit from it, which is quite different. 
> And at that point, it's also not an "example". :)
> 
> @@ +167,5 @@
> > +  using KeyType = Key;
> > +  using ValueType = Value;
> > +
> > +  static HashNumber Hash(const KeyType& aKey) { return HashGeneric(aKey); }
> > +  static bool Match(const KeyType& aKey, const ValueType& aVal)
> 
> I'm not sure these are doing enough work to warrant including them (only one
> of the subsequent patches uses the default definition), and I tend to think
> overriding non-virtual methods is somewhat confusing.  WDYT?

Getting rid of this in favor of your previous suggestion.
(In reply to Nathan Froyd [:froydnj] from comment #8)

> Alternative design suggestion for this:
> 
> template<class Key, class Value, class Cache, size_t Size=31>
> class MruCache
> {
> public:
>   using KeyType = Key;
>   using ValueType = Value;
> 
>   // Will have to use `this->` qualification for `Hash`/`Match` below.
>   ...
> };
> 
> class NodeInfoCache : public MruCache<NodeInfoInner, NodeInfo*,
> NodeInfoCache>
> {
>   static HashNumber Hash(const KeyType&) {...}
>   static bool Match(const KeyType&, const ValueType&) {...}
> };
This approach makes it even harder to understand what KeyType and ValueType mean, when
one is forced always pass extra template argument.
Though, I guess Hash and Match could in concrete implementations take the actual types, not typedefs.
(In reply to Olli Pettay [:smaug] from comment #9)
> Comment on attachment 9008916 [details] [diff] [review]
> Part 1: Add MruCache
> 
> >+
> >+/// Helper struct for checking if a value is null.
> >+///
> >+/// `NotNull` will return true if `Value` is not a pointer type or if the
> >+/// pointer value is not null.
> Please use normal comments, so either // or /* */
> Same also elsewhere.
> (/// smells like rust-ism)

Yeah, I was in the rust-zone, I'll clean it up.

> >+template <typename Value, bool IsPtr = std::is_pointer<Value>::value>
> >+struct NullCheckerHelper
> >+{
> >+  static bool NotNull(const Value&) { return true; }
> >+};
> >+/// Template specialization for the `IsPtr == true` case.
> >+template <typename Value>
> >+struct NullCheckerHelper<Value, true>
> >+{
> >+  static bool NotNull(const Value& aVal) { return aVal != nullptr; }
> >+};
> >+
> Because NotNull is used as a type in Gecko, could the methods be IsNotNull
> Or is IsNonNull better?

Good point, I'll switch to `IsNonNull`.


> >+  /// Helper that holds an entry that matched a lookup key. Usage:
> >+  ///
> >+  ///    auto p = mCache.Lookup(aKey);
> Of course I'd prefer using concrete type here, since auto just hides the
> information from reader.
> But I guess we use auto often with datastructures, so fine.

Right, this is pretty common now. Otherwise we'd have to do:

> MruCache<KeyType, ValueType, Cache, Size>::Entry p = mCache.Lookup(aKey);

Which is a bit overboard.

> >+struct StringStruct
> >+{
> >+  nsCString mKey;
> >+  nsCString mOther;
> >+};
> >+
> >+struct StringStructHasher : public MruCacheHasher<nsCString, StringStruct>
> >+{
> >+  static HashNumber Hash(const KeyType& aKey) { return *aKey.BeginReading() - 1; }
> >+  static bool Match(const KeyType& aKey, const ValueType& aVal) { return aKey == aVal.mKey; }
> >+};
> >+
> >+TEST(MruCache, TestEmptyCache)
> >+{
> >+  {
> >+    // Test a basic empty cache.
> >+    mozilla::MruCache<IntMapHasher> mru;
> >+
> >+    // Make sure the default values are set.
> >+    for (int i = 1; i < 32; i++) {
> >+      auto p = mru.Lookup(i);
> >+
> >+      // Shouldn't be found.
> >+      EXPECT_FALSE(p);
> >+
> >+      // Index should currently be defaulted.
> >+      EXPECT_EQ(p.Data(), 0);
> Ah, this is a bit weird, that Data() returns always some value, even though
> it might not be valid.
> 
> I think Data() implementation should assert that it isn't use if mMatch is
> false.
> then Insert implementation needs to set mMatch to true before accessing
> Data()

Makes sense, I'll have to make the tests a little more complicated but that seems worth it.

> NotNull renamed and asserting in Data(), r+

Thanks for the thorough review!
(In reply to Olli Pettay [:smaug] from comment #13)
> (In reply to Nathan Froyd [:froydnj] from comment #8)
> 
> > Alternative design suggestion for this:
> > 
> > template<class Key, class Value, class Cache, size_t Size=31>
> > class MruCache
> > {
> > public:
> >   using KeyType = Key;
> >   using ValueType = Value;
> > 
> >   // Will have to use `this->` qualification for `Hash`/`Match` below.
> >   ...
> > };
> > 
> > class NodeInfoCache : public MruCache<NodeInfoInner, NodeInfo*,
> > NodeInfoCache>
> > {
> >   static HashNumber Hash(const KeyType&) {...}
> >   static bool Match(const KeyType&, const ValueType&) {...}
> > };
> This approach makes it even harder to understand what KeyType and ValueType
> mean, when
> one is forced always pass extra template argument.
> Though, I guess Hash and Match could in concrete implementations take the
> actual types, not typedefs.

I can switch those to concrete types in the updated implementations.
(In reply to Eric Rahm [:erahm] from comment #14)
> (In reply to Olli Pettay [:smaug] from comment #9)
> > Comment on attachment 9008916 [details] [diff] [review]
> > Part 1: Add MruCache
> > 
> > >+
> > >+/// Helper struct for checking if a value is null.
> > >+///
> > >+/// `NotNull` will return true if `Value` is not a pointer type or if the
> > >+/// pointer value is not null.
> > Please use normal comments, so either // or /* */
> > Same also elsewhere.
> > (/// smells like rust-ism)
> 
> Yeah, I was in the rust-zone, I'll clean it up.
> 
> > >+template <typename Value, bool IsPtr = std::is_pointer<Value>::value>
> > >+struct NullCheckerHelper
> > >+{
> > >+  static bool NotNull(const Value&) { return true; }
> > >+};
> > >+/// Template specialization for the `IsPtr == true` case.
> > >+template <typename Value>
> > >+struct NullCheckerHelper<Value, true>
> > >+{
> > >+  static bool NotNull(const Value& aVal) { return aVal != nullptr; }
> > >+};
> > >+
> > Because NotNull is used as a type in Gecko, could the methods be IsNotNull
> > Or is IsNonNull better?
> 
> Good point, I'll switch to `IsNonNull`.

Oh sorry, per Nathan's suggestion was switching to `EmptyChecker::NotEmpty`. I can do `IsNotEmpty` instead.
that is ok
smaug's suggestion for using the concrete types instead of the typedefs makes the code much clearer!
Updates per reviews.
Attachment #9009295 - Flags: review?(nfroyd)
Attachment #9008916 - Attachment is obsolete: true
(In reply to Olli Pettay [:smaug] from comment #10)
> Comment on attachment 9008917 [details] [diff] [review]
> Part 2: Use MruCache for eTLD table
> 
> ># HG changeset patch
> ># User Eric Rahm <erahm@mozilla.com>
> ># Date 1536615989 25200
> >#      Mon Sep 10 14:46:29 2018 -0700
> ># Node ID 309939a7562284aaaabe31de7db6b185d12057a1
> ># Parent  9a0805c9c80556fa44d59c9d929c6f39b16b3e08
> >Bug 1491151 - Part 2: Use MruCache for eTLD table. r=smaug
> >
> >diff --git a/netwerk/dns/nsEffectiveTLDService.cpp b/netwerk/dns/nsEffectiveTLDService.cpp
> >--- a/netwerk/dns/nsEffectiveTLDService.cpp
> >+++ b/netwerk/dns/nsEffectiveTLDService.cpp
> >@@ -216,15 +216,18 @@ nsEffectiveTLDService::GetBaseDomainInte
> >   // main thread-only as the cache is not thread-safe.
> >   TLDCacheEntry* entry = nullptr;
> >   if (aAdditionalParts == 1 && NS_IsMainThread()) {
> >-    if (LookupForAdd(aHostname, &entry)) {
> >+    auto p = mMruTable.Lookup(aHostname);
> >+    if (p) {
> >       // There was a match, just return the cached value.
> >-      aBaseDomain = entry->mBaseDomain;
> >+      aBaseDomain = p.Data().mBaseDomain;
> >       if (trailingDot) {
> >         aBaseDomain.Append('.');
> >       }
> > 
> >       return NS_OK;
> >     }
> >+
> >+    entry = &p.Data();
> Hmm, I see, you use Data() this way. Could we please not. This is rather
> confusing and error prone.
> One method for accessing valid values and another for using existing lookup
> result to update the entry.
> update the main and and then this too

Good point, I was being a bit lazy. Updated to use `Maybe<MruCache<...>::Entry> entry` instead and then `if (entry) entry->Set(...)` further down.
(In reply to Olli Pettay [:smaug] from comment #11)
> Comment on attachment 9008921 [details] [diff] [review]
> Part 6: Convert CCGraphBuilder to use MRU cache
> 
> >-  Variant<PtrInfo*, uint32_t> cacheVariant = mGraphCache.GetEntryOrIndex(aPtr);
> >-  if (cacheVariant.is<PtrInfo*>()) {
> >-    MOZ_ASSERT(cacheVariant.as<PtrInfo*>()->mParticipant == aParticipant,
> >+  auto cp = mGraphCache.Lookup(aPtr);
> not that other patches have had descriptive variable names either, but I
> have no idea what cp might mean.

Switched to `PtrInfoCache::Entry cached`.

> >@@ -2319,7 +2292,7 @@ CCGraphBuilder::AddNode(void* aPtr, nsCy
> >                "nsCycleCollectionParticipant shouldn't change!");
> >   }
> > 
> >-  mGraphCache.Add(cacheVariant.as<uint32_t>(), result);
> >+  cp.Insert(result);
> This should use the new method to update the existing entry in the cache.
> (the new version of Data())

Switched to `cached.Set`.
Attachment #9009295 - Flags: review?(nfroyd) → review+
Pushed by erahm@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/eac44c4585ed
Part 1: Add MruCache. r=smaug,froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/8248f7b4da44
Part 2: Use MruCache for eTLD table. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/54da92d94f47
Part 3: Use MRU cache in NodeInfoManager. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/c138f6d72ee1
Part 4: Convert atom table to use MRU cache. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/4a46b4673978
Part 5: Convert content list to use MRU cache. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/5ab4db62dc82
Part 6: Convert CCGraphBuilder to use MRU cache. r=smaug
You need to log in before you can comment on or make changes to this bug.