Store variable-length prefix to disk

RESOLVED DUPLICATE of bug 1305801

Status

()

Toolkit
Safe Browsing
P2
normal
RESOLVED DUPLICATE of bug 1305801
2 years ago
2 years ago

People

(Reporter: dimi, Assigned: dimi)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: #sbv4-m1)

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment, 6 obsolete attachments)

(Assignee)

Description

2 years ago
In V2, we store fixed-length prefix to .cache file. We will also need to store variable-length prefix to disk
(Assignee)

Comment 1

2 years ago
Created attachment 8767100 [details] [diff] [review]
WIP patch

This patch is currently based on patches in bug 1283007
(Assignee)

Updated

2 years ago
Whiteboard: #sbv4-v1
(Assignee)

Updated

2 years ago
Whiteboard: #sbv4-v1 → #sbv4-m1
Blocks: 1167038
Depends on: 1284178
Priority: -- → P2
(Assignee)

Comment 2

2 years ago
Created attachment 8785177 [details] [diff] [review]
P1. Store variable length prefix to disk

Hi Henry,
The lookupCache refactor in this patch is to separate v2 and v4 specific things. One thing I should highlight here is I move |mUpdateCompletions| to V2 only because we won't need additional array to handle update completions in V4 anymore(bug 1288833).

I will upload testcase next week.
Attachment #8767100 - Attachment is obsolete: true
Attachment #8785177 - Flags: feedback?(hchang)
(Assignee)

Updated

2 years ago
Attachment #8785177 - Attachment description: Patch. Store variable length prefix to disk → P1. Store variable length prefix to disk
Just had a quick look and looks pretty neat! Will have a deeper review next week. My understanding to what this patch would result is:

1) The TableUpdateV4 would be used to generate VLPrefixSet and LookupCacheV4.
2) However, the prefixes, including the 4-byte and variable length ones,
   are not going to be checked. It doesn't break safebrowsing since we still
   have other prefixes in V2 tables.

Correct me if I am wrong :)

Note that this patch needs to remove the V4 update guard which is going to
be added by Bug 1296820.

Thanks!
(Assignee)

Comment 4

2 years ago
(In reply to Henry Chang [:henry][:hchang] from comment #3)
> 1) The TableUpdateV4 would be used to generate VLPrefixSet and LookupCacheV4.

Yes, TableUpdateV4 will generate VLPrefixSet when update happens.
As for LookupCacheV4, I won't say it generated by TableUpdateV4. LookupCacheV4 will be created
whenever someone calls |GetLookupCache| with table name ends up with '-proto', for example, 
https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/Classifier.cpp#277

> 2) However, the prefixes, including the 4-byte and variable length ones,
>    are not going to be checked. It doesn't break safebrowsing since we still
>    have other prefixes in V2 tables.
> 
Yes, you are right. Currently we loop through all the tables to check if a url matchs[1]. After applying this patch, it will also include V4 tables. And in this patch, the |LookupCacheV4::Has| always return false, i filed a bug to implement this in M2(bug 1298257).

[1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/Classifier.cpp#300
(Assignee)

Comment 5

2 years ago
Created attachment 8785905 [details] [diff] [review]
P2. Testcase
Attachment #8785905 - Flags: feedback?(hchang)
(Assignee)

Comment 6

2 years ago
Created attachment 8785906 [details] [diff] [review]
P2. Testcase

Remove debug message
Attachment #8785905 - Attachment is obsolete: true
Attachment #8785905 - Flags: feedback?(hchang)
Attachment #8785906 - Flags: feedback?(hchang)
Comment on attachment 8785906 [details] [diff] [review]
P2. Testcase

Review of attachment 8785906 [details] [diff] [review]:
-----------------------------------------------------------------

Excellent test cases! These test cases inspired me a lot :) 

BTW, are the 3 test cases able to be extracted the common part to one function? I'm not sure if it's trivial. Just for your reference. Thanks!
Attachment #8785906 - Flags: feedback?(hchang) → feedback+
(Assignee)

Comment 8

2 years ago
(In reply to Henry Chang [:henry][:hchang] from comment #7)
> BTW, are the 3 test cases able to be extracted the common part to one
> function? I'm not sure if it's trivial. Just for your reference. Thanks!

Yes! I think so. Thanks for your suggestion, i will refine it in next patch
Comment on attachment 8785177 [details] [diff] [review]
P1. Store variable length prefix to disk

Review of attachment 8785177 [details] [diff] [review]:
-----------------------------------------------------------------

It seems inevitable to separate LookupCache to V2 and V4 even we've split TableUpdate and ProtocolParser. I couldn't come up with a better approach than this patch. Thanks!
Attachment #8785177 - Flags: feedback?(hchang) → feedback+
(Assignee)

Comment 10

2 years ago
Created attachment 8789239 [details] [diff] [review]
[Rebased] bug1283007.patch

Rebased patch for testing
(Assignee)

Comment 11

2 years ago
Created attachment 8789241 [details] [diff] [review]
[Rebased] bug1283009.patch

Rebased patch for testing
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Comment 15

2 years ago
Hi Francois,
This patch is based on patch of bug 1283007. Although that bug is still waiting for gcp's final review but I think we can move forward on this one since theses two patches should be fairly independent(Unless there is a major change of interface).

Could you help review this ? thanks!
Flags: needinfo?(francois)
(Assignee)

Comment 16

2 years ago
Also please see Comment 2 for additional information about this patch, thanks.
(Assignee)

Comment 17

2 years ago
(In reply to Dimi Lee[:dimi][:dlee] from comment #15)
> Hi Francois,
> This patch is based on patch of bug 1283007. Although that bug is still
> waiting for gcp's final review but I think we can move forward on this one
> since theses two patches should be fairly independent(Unless there is a
> major change of interface).
> 
> Could you help review this ? thanks!

gcp is already r+ bug 1283007, cancel needinfo
Flags: needinfo?(francois)

Comment 18

2 years ago
mozreview-review
Comment on attachment 8789680 [details]
Bug 1283009 - Store variable-length prefix to disk.

https://reviewboard.mozilla.org/r/77816/#review77030

That looks good. A few comments inline.

Great test, as always :)

::: toolkit/components/url-classifier/Classifier.cpp:933
(Diff revision 3)
>    return NS_OK;
>  }
>  
>  nsresult
> +Classifier::UpdateTableV4(nsTArray<TableUpdate*>* aUpdates,
> +                           const nsACString& aTable)

nit: indentation

::: toolkit/components/url-classifier/Classifier.cpp:964
(Diff revision 3)
> +      for (auto iter = map.Iter(); !iter.Done(); iter.Next()) {
> +        nsCString* prefix = new nsCString(iter.Data()->GetPrefixString());
> +        prefixes.Put(iter.Key(), prefix);
> +      }
> +    } else {
> +      // TODO: Bug 1287058, partial update

Since we don't support partial updates yet, maybe we should clear the state that we save so that the next update is guaranteed to be a full update again.

That way we have something that works fully, even though it's inefficient.

::: toolkit/components/url-classifier/Classifier.cpp:977
(Diff revision 3)
> +
> +  rv = lookupCache->WriteFile();
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  int64_t now = (PR_Now() / PR_USEC_PER_SEC);
> +  LOG(("Successfully updated %s\n", aTable.BeginReading()));

The pattern I've seen elsewhere is `aTable.get()`.

I'm not sure that matters, but it seems simpler.

::: toolkit/components/url-classifier/Classifier.cpp:1015
(Diff revision 3)
>      if (mLookupCaches[i]->TableName().Equals(aTable)) {
>        return mLookupCaches[i];
>      }
>    }
>  
> -  UniquePtr<LookupCache> cache(new LookupCache(aTable, mRootStoreDirectory));
> +  // TODO : It would be better if we have a more general non-main thread

Should we file a follow-up bug for this?

::: toolkit/components/url-classifier/LookupCache.h:125
(Diff revision 3)
> -               bool* aHas, bool* aComplete);
> -  bool IsPrimed();
>  
> +  virtual nsresult Open();
> +  virtual nsresult Init() = 0;
> +  virtual nsresult ClearPrefix() = 0;

grammar nit: `ClearPrefixes()`

::: toolkit/components/url-classifier/LookupCache.cpp:385
(Diff revision 3)
> +}
> +
> +nsresult
> +LookupCacheV2::Init()
> +{
> +  mPrefixSet = new nsUrlClassifierPrefixSet();

Do we ever delete this?

::: toolkit/components/url-classifier/LookupCache.cpp:430
(Diff revision 3)
> +
> +  if (found) {
> +    *aHas = true;
> +  }
> +
> +  // TODO: We may need to distinguish completions found in cache or update in the future

If you think we definitely need to do this, then we should file a follow-up bug.

Otherwise, if we're not sure, I would suggest removing the "TODO:" from the comment.

Alternatively, we could split them up now:

```
  if (mGetHashCache.BinaryIndexOf(aCompletion) != nsTArray<Completion>::NoIndex) {
    LOG(("Complete (cache) in %s", mTableName.get()));
    *aComplete = true;
    *aHas = true;
  } else if (mUpdateCompletions.BinaryIndexOf(aCompletion) != nsTArray<Completion>::NoIndex) {
    LOG(("Complete (update) in %s", mTableName.get()));
    *aComplete = true;
    *aHas = true;
  }
```

::: toolkit/components/url-classifier/LookupCache.cpp:598
(Diff revision 3)
> +#endif
>  
> -  bool exists;
> -  rv = psFile->Exists(&exists);
> +nsresult
> +LookupCacheV4::Init()
> +{
> +  mVLPrefixSet = new VariableLengthPrefixSet();

Does that member ever get deleted?

This class has an empty destructor and I didn't see a delete anywhere.

::: toolkit/components/url-classifier/tests/gtest/TestPerProviderDirectory.cpp:88
(Diff revision 3)
>      // used as the private store.
> -    VerifyPrivateStorePath<LookupCache>("goog-phish-shavar", "google", rootDir, false);
> +    VerifyPrivateStorePath<LookupCacheV2>("goog-phish-shavar", "google", rootDir, false);
>  
>      // For V4 tables, if provider is found, use per-provider subdirectory;
>      // If not found, use root directory.
> -    VerifyPrivateStorePath<LookupCache>("goog-noprovider-proto", "", rootDir, false);
> +    VerifyPrivateStorePath<LookupCacheV2>("goog-noprovider-proto", "", rootDir, false);

Shouldn't this be `LookupCacheV4`?

::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:30
(Diff revision 3)
> +  }
> +}
> +
> +// N: Number of prefixes, MIN/MAX: minimum/maximum prefix size
> +// This function will append generated prefixes to outArray.
> +// All elemets in the generated array will be different.

typo: `elements`

::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:39
(Diff revision 3)
> +                              uint32_t MAX,
> +                              _PrefixArray& outArray)
> +{
> +  outArray.SetCapacity(outArray.Length() + N);
> +
> +  uint32_t range = (MAX - MIN + 1);

This could be `const` to make sure it doesn't get accidentally modified.

::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:46
(Diff revision 3)
> +  for (uint32_t i = 0; i < N; i++) {
> +    uint32_t prefixSize = (rand() % range) + MIN;
> +    _Prefix prefix;
> +    prefix.SetLength(prefixSize);
> +
> +    while(true) {

nit: space between `while` and `(`

::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:69
(Diff revision 3)
> +GenerateUpdateData(bool fullUpdate,
> +                   PrefixStringMap& add,
> +                   nsTArray<uint32_t>& removal,
> +                   nsTArray<TableUpdate*>& tableUpdates)
> +{
> +  TableUpdateV4* tableUpdate = new TableUpdateV4(NS_LITERAL_CSTRING("test-proto"));

I would suggest using a valid table name here, like `gtest-malware-proto`

::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:95
(Diff revision 3)
> +                         getter_AddRefs(file));
> +
> +  file->AppendRelativeNativePath(NS_LITERAL_CSTRING("safebrowsing/test-proto.pset"));
> +
> +  RefPtr<VariableLengthPrefixSet> load = new VariableLengthPrefixSet;
> +  load->Init(NS_LITERAL_CSTRING("test-proto"));

Again, `gtest-malware-proto`.
Attachment #8789680 - Flags: review?(francois) → review-
(Assignee)

Comment 19

2 years ago
mozreview-review-reply
Comment on attachment 8789680 [details]
Bug 1283009 - Store variable-length prefix to disk.

https://reviewboard.mozilla.org/r/77816/#review77030

Thanks for review!

> Since we don't support partial updates yet, maybe we should clear the state that we save so that the next update is guaranteed to be a full update again.
> 
> That way we have something that works fully, even though it's inefficient.

I intend to land bug 1283007, bug 1283009(this one) & bug 1287058(partial update) together once they are all r+.
Is that ok for you?

> The pattern I've seen elsewhere is `aTable.get()`.
> 
> I'm not sure that matters, but it seems simpler.

|aTable| here is nsACString, nsACString doesn't have |get| function.
I guess the pattern you saw is "nsCString"

> Do we ever delete this?

mPrefixSet is reference counting by RefPtr[1]
It will be deleted automatically when LookupCache is freed.

[1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/LookupCache.h#140

> If you think we definitely need to do this, then we should file a follow-up bug.
> 
> Otherwise, if we're not sure, I would suggest removing the "TODO:" from the comment.
> 
> Alternatively, we could split them up now:
> 
> ```
>   if (mGetHashCache.BinaryIndexOf(aCompletion) != nsTArray<Completion>::NoIndex) {
>     LOG(("Complete (cache) in %s", mTableName.get()));
>     *aComplete = true;
>     *aHas = true;
>   } else if (mUpdateCompletions.BinaryIndexOf(aCompletion) != nsTArray<Completion>::NoIndex) {
>     LOG(("Complete (update) in %s", mTableName.get()));
>     *aComplete = true;
>     *aHas = true;
>   }
> ```

We don't this for v2, since lookupCache is separated now.
This comment should be removed, thanks for reminding!

> Does that member ever get deleted?
> 
> This class has an empty destructor and I didn't see a delete anywhere.

The same reason as mPrefixSet

> Shouldn't this be `LookupCacheV4`?

Yes, you are right, sorry i didn't check it carefully.
Comment hidden (mozreview-request)
(Assignee)

Updated

2 years ago
Attachment #8785177 - Attachment is obsolete: true
(Assignee)

Updated

2 years ago
Attachment #8785906 - Attachment is obsolete: true
(Assignee)

Updated

2 years ago
Attachment #8789239 - Attachment is obsolete: true
(Assignee)

Updated

2 years ago
Attachment #8789241 - Attachment is obsolete: true
(In reply to Dimi Lee[:dimi][:dlee] from comment #19)
> I intend to land bug 1283007, bug 1283009(this one) & bug 1287058(partial
> update) together once they are all r+.
> Is that ok for you?

Sounds good to me.

> > The pattern I've seen elsewhere is `aTable.get()`.
> > 
> > I'm not sure that matters, but it seems simpler.
> 
> |aTable| here is nsACString, nsACString doesn't have |get| function.
> I guess the pattern you saw is "nsCString"

You're right. I was missing the PromiseFlatCString() part:

http://searchfox.org/mozilla-central/rev/bcc9ea6947878ca6378e5b7d6f08c1c090ed9eb7/dom/security/SRIMetadata.cpp#33
 
> > Do we ever delete this?
> 
> mPrefixSet is reference counting by RefPtr[1]
> It will be deleted automatically when LookupCache is freed.
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-
> classifier/LookupCache.h#140

Ah, of course. I should have caught that.

Comment 22

2 years ago
mozreview-review
Comment on attachment 8789680 [details]
Bug 1283009 - Store variable-length prefix to disk.

https://reviewboard.mozilla.org/r/77816/#review77190
Attachment #8789680 - Flags: review?(francois) → review+
(Assignee)

Updated

2 years ago
Attachment #8789680 - Flags: review?(gpascutto)
(Assignee)

Comment 23

2 years ago
Hi gcp,
Could you help review this one ? This patch is based on bug 1283007.
Thanks!

Comment 24

2 years ago
mozreview-review
Comment on attachment 8789680 [details]
Bug 1283009 - Store variable-length prefix to disk.

https://reviewboard.mozilla.org/r/77816/#review77818

::: toolkit/components/url-classifier/Classifier.cpp
(Diff revision 4)
>  
>    // Read the part of the store that is (only) in the cache
> -  LookupCache *lookupCache = GetLookupCache(store.TableName());
> -  if (!lookupCache) {
> -    return NS_ERROR_FAILURE;
> +  LookupCacheV2* lookupCache =
> +    LookupCache::Cast<LookupCacheV2>(GetLookupCache(store.TableName()));
> +  NS_ENSURE_TRUE(lookupCache, NS_ERROR_FAILURE);
> -  }

FWIW this was originally an if ()+return pair, and that is actually the style recommended by our guidelines (which disrecommend NS_ENSURE_*), so there was no need to change this. I guess it was done for consistency with the rest of the code?

::: toolkit/components/url-classifier/Classifier.cpp:961
(Diff revision 4)
> +      prefixes.Clear();
> +      TableUpdateV4::PrefixesStringMap& map = updateV4->Prefixes();
> +
> +      for (auto iter = map.Iter(); !iter.Done(); iter.Next()) {
> +        nsCString* prefix = new nsCString(iter.Data()->GetPrefixString());
> +        prefixes.Put(iter.Key(), prefix);

Maybe this could use a comment pointing out that PrefixStringMap is an nsClassHashtable, which takes ownership of put objects.

::: toolkit/components/url-classifier/Classifier.cpp:977
(Diff revision 4)
> +
> +  rv = lookupCache->WriteFile();
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  int64_t now = (PR_Now() / PR_USEC_PER_SEC);
> +  LOG(("Successfully updated %s\n", aTable.BeginReading()));



::: toolkit/components/url-classifier/Classifier.cpp:977
(Diff revision 4)
> +
> +  rv = lookupCache->WriteFile();
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  int64_t now = (PR_Now() / PR_USEC_PER_SEC);
> +  LOG(("Successfully updated %s\n", aTable.BeginReading()));

There's no guarantee nsACString is null-terminated, and that's exactly why it doesn't have a get() method.

Use the PromiseFlatCString().get() style used earlier in this function.

See for example this doc:
https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Glue_classes/nsEmbedCString

which makes it clear the get() (and null termination guarantee) only comes from concrete implementations of nsACString.

::: toolkit/components/url-classifier/Classifier.cpp:1048
(Diff revision 4)
>                               uint32_t aCount,
>                               PrefixArray* aNoiseEntries)
>  {
> -  LookupCache *cache = GetLookupCache(aTableName);
> -  if (!cache) {
> -    return NS_ERROR_FAILURE;
> +  // TODO : Bug 1297962, support adding noise for v4
> +  LookupCacheV2 *cache = static_cast<LookupCacheV2*>(GetLookupCache(aTableName));
> +  NS_ENSURE_TRUE(cache, NS_ERROR_FAILURE);

Same remark as before, we no longer recommend this style so that's moving backwards.

::: toolkit/components/url-classifier/LookupCache.h:181
(Diff revision 4)
> +
> +#if DEBUG
> +  void DumpCompletions();
> +#endif
> +
> +  static const int VER = 2;

Some time in the future you're going to have the classic "Undefined reference to LookupCacheV2::VER." gcc error.

You need a definition in addition to the declaration.

::: toolkit/components/url-classifier/LookupCache.h:218
(Diff revision 4)
> +  virtual nsresult Has(const Completion& aCompletion,
> +                       bool* aHas, bool* aComplete) override;
> +
> +  nsresult Build(PrefixStringMap& aPrefixMap);
> +
> +  static const int VER = 4;

Same.
Attachment #8789680 - Flags: review?(gpascutto) → review+
(Assignee)

Comment 25

2 years ago
mozreview-review-reply
Comment on attachment 8789680 [details]
Bug 1283009 - Store variable-length prefix to disk.

https://reviewboard.mozilla.org/r/77816/#review77818

Thanks for review!
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Updated

2 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1305801
You need to log in before you can comment on or make changes to this bug.