Closed Bug 1283009 Opened 9 years ago Closed 9 years ago

Store variable-length prefix to disk

Categories

(Toolkit :: Safe Browsing, defect, P2)

defect

Tracking

()

RESOLVED DUPLICATE of bug 1305801

People

(Reporter: dimi, Assigned: dimi)

References

Details

(Whiteboard: #sbv4-m1)

Attachments

(1 file, 6 obsolete files)

In V2, we store fixed-length prefix to .cache file. We will also need to store variable-length prefix to disk
Attached patch WIP patch (obsolete) — Splinter Review
This patch is currently based on patches in bug 1283007
Whiteboard: #sbv4-v1
Whiteboard: #sbv4-v1 → #sbv4-m1
Depends on: 1284178
Priority: -- → P2
Hi Henry, The lookupCache refactor in this patch is to separate v2 and v4 specific things. One thing I should highlight here is I move |mUpdateCompletions| to V2 only because we won't need additional array to handle update completions in V4 anymore(bug 1288833). I will upload testcase next week.
Attachment #8767100 - Attachment is obsolete: true
Attachment #8785177 - Flags: feedback?(hchang)
Attachment #8785177 - Attachment description: Patch. Store variable length prefix to disk → P1. Store variable length prefix to disk
Just had a quick look and looks pretty neat! Will have a deeper review next week. My understanding to what this patch would result is: 1) The TableUpdateV4 would be used to generate VLPrefixSet and LookupCacheV4. 2) However, the prefixes, including the 4-byte and variable length ones, are not going to be checked. It doesn't break safebrowsing since we still have other prefixes in V2 tables. Correct me if I am wrong :) Note that this patch needs to remove the V4 update guard which is going to be added by Bug 1296820. Thanks!
(In reply to Henry Chang [:henry][:hchang] from comment #3) > 1) The TableUpdateV4 would be used to generate VLPrefixSet and LookupCacheV4. Yes, TableUpdateV4 will generate VLPrefixSet when update happens. As for LookupCacheV4, I won't say it generated by TableUpdateV4. LookupCacheV4 will be created whenever someone calls |GetLookupCache| with table name ends up with '-proto', for example, https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/Classifier.cpp#277 > 2) However, the prefixes, including the 4-byte and variable length ones, > are not going to be checked. It doesn't break safebrowsing since we still > have other prefixes in V2 tables. > Yes, you are right. Currently we loop through all the tables to check if a url matchs[1]. After applying this patch, it will also include V4 tables. And in this patch, the |LookupCacheV4::Has| always return false, i filed a bug to implement this in M2(bug 1298257). [1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/Classifier.cpp#300
Attached patch P2. Testcase (obsolete) — Splinter Review
Attachment #8785905 - Flags: feedback?(hchang)
Attached patch P2. Testcase (obsolete) — Splinter Review
Remove debug message
Attachment #8785905 - Attachment is obsolete: true
Attachment #8785905 - Flags: feedback?(hchang)
Attachment #8785906 - Flags: feedback?(hchang)
Comment on attachment 8785906 [details] [diff] [review] P2. Testcase Review of attachment 8785906 [details] [diff] [review]: ----------------------------------------------------------------- Excellent test cases! These test cases inspired me a lot :) BTW, are the 3 test cases able to be extracted the common part to one function? I'm not sure if it's trivial. Just for your reference. Thanks!
Attachment #8785906 - Flags: feedback?(hchang) → feedback+
(In reply to Henry Chang [:henry][:hchang] from comment #7) > BTW, are the 3 test cases able to be extracted the common part to one > function? I'm not sure if it's trivial. Just for your reference. Thanks! Yes! I think so. Thanks for your suggestion, i will refine it in next patch
Comment on attachment 8785177 [details] [diff] [review] P1. Store variable length prefix to disk Review of attachment 8785177 [details] [diff] [review]: ----------------------------------------------------------------- It seems inevitable to separate LookupCache to V2 and V4 even we've split TableUpdate and ProtocolParser. I couldn't come up with a better approach than this patch. Thanks!
Attachment #8785177 - Flags: feedback?(hchang) → feedback+
Attached patch [Rebased] bug1283007.patch (obsolete) — Splinter Review
Rebased patch for testing
Attached patch [Rebased] bug1283009.patch (obsolete) — Splinter Review
Rebased patch for testing
Hi Francois, This patch is based on patch of bug 1283007. Although that bug is still waiting for gcp's final review but I think we can move forward on this one since theses two patches should be fairly independent(Unless there is a major change of interface). Could you help review this ? thanks!
Flags: needinfo?(francois)
Also please see Comment 2 for additional information about this patch, thanks.
(In reply to Dimi Lee[:dimi][:dlee] from comment #15) > Hi Francois, > This patch is based on patch of bug 1283007. Although that bug is still > waiting for gcp's final review but I think we can move forward on this one > since theses two patches should be fairly independent(Unless there is a > major change of interface). > > Could you help review this ? thanks! gcp is already r+ bug 1283007, cancel needinfo
Flags: needinfo?(francois)
Comment on attachment 8789680 [details] Bug 1283009 - Store variable-length prefix to disk. https://reviewboard.mozilla.org/r/77816/#review77030 That looks good. A few comments inline. Great test, as always :) ::: toolkit/components/url-classifier/Classifier.cpp:933 (Diff revision 3) > return NS_OK; > } > > nsresult > +Classifier::UpdateTableV4(nsTArray<TableUpdate*>* aUpdates, > + const nsACString& aTable) nit: indentation ::: toolkit/components/url-classifier/Classifier.cpp:964 (Diff revision 3) > + for (auto iter = map.Iter(); !iter.Done(); iter.Next()) { > + nsCString* prefix = new nsCString(iter.Data()->GetPrefixString()); > + prefixes.Put(iter.Key(), prefix); > + } > + } else { > + // TODO: Bug 1287058, partial update Since we don't support partial updates yet, maybe we should clear the state that we save so that the next update is guaranteed to be a full update again. That way we have something that works fully, even though it's inefficient. ::: toolkit/components/url-classifier/Classifier.cpp:977 (Diff revision 3) > + > + rv = lookupCache->WriteFile(); > + NS_ENSURE_SUCCESS(rv, rv); > + > + int64_t now = (PR_Now() / PR_USEC_PER_SEC); > + LOG(("Successfully updated %s\n", aTable.BeginReading())); The pattern I've seen elsewhere is `aTable.get()`. I'm not sure that matters, but it seems simpler. ::: toolkit/components/url-classifier/Classifier.cpp:1015 (Diff revision 3) > if (mLookupCaches[i]->TableName().Equals(aTable)) { > return mLookupCaches[i]; > } > } > > - UniquePtr<LookupCache> cache(new LookupCache(aTable, mRootStoreDirectory)); > + // TODO : It would be better if we have a more general non-main thread Should we file a follow-up bug for this? ::: toolkit/components/url-classifier/LookupCache.h:125 (Diff revision 3) > - bool* aHas, bool* aComplete); > - bool IsPrimed(); > > + virtual nsresult Open(); > + virtual nsresult Init() = 0; > + virtual nsresult ClearPrefix() = 0; grammar nit: `ClearPrefixes()` ::: toolkit/components/url-classifier/LookupCache.cpp:385 (Diff revision 3) > +} > + > +nsresult > +LookupCacheV2::Init() > +{ > + mPrefixSet = new nsUrlClassifierPrefixSet(); Do we ever delete this? ::: toolkit/components/url-classifier/LookupCache.cpp:430 (Diff revision 3) > + > + if (found) { > + *aHas = true; > + } > + > + // TODO: We may need to distinguish completions found in cache or update in the future If you think we definitely need to do this, then we should file a follow-up bug. Otherwise, if we're not sure, I would suggest removing the "TODO:" from the comment. Alternatively, we could split them up now: ``` if (mGetHashCache.BinaryIndexOf(aCompletion) != nsTArray<Completion>::NoIndex) { LOG(("Complete (cache) in %s", mTableName.get())); *aComplete = true; *aHas = true; } else if (mUpdateCompletions.BinaryIndexOf(aCompletion) != nsTArray<Completion>::NoIndex) { LOG(("Complete (update) in %s", mTableName.get())); *aComplete = true; *aHas = true; } ``` ::: toolkit/components/url-classifier/LookupCache.cpp:598 (Diff revision 3) > +#endif > > - bool exists; > - rv = psFile->Exists(&exists); > +nsresult > +LookupCacheV4::Init() > +{ > + mVLPrefixSet = new VariableLengthPrefixSet(); Does that member ever get deleted? This class has an empty destructor and I didn't see a delete anywhere. ::: toolkit/components/url-classifier/tests/gtest/TestPerProviderDirectory.cpp:88 (Diff revision 3) > // used as the private store. > - VerifyPrivateStorePath<LookupCache>("goog-phish-shavar", "google", rootDir, false); > + VerifyPrivateStorePath<LookupCacheV2>("goog-phish-shavar", "google", rootDir, false); > > // For V4 tables, if provider is found, use per-provider subdirectory; > // If not found, use root directory. > - VerifyPrivateStorePath<LookupCache>("goog-noprovider-proto", "", rootDir, false); > + VerifyPrivateStorePath<LookupCacheV2>("goog-noprovider-proto", "", rootDir, false); Shouldn't this be `LookupCacheV4`? ::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:30 (Diff revision 3) > + } > +} > + > +// N: Number of prefixes, MIN/MAX: minimum/maximum prefix size > +// This function will append generated prefixes to outArray. > +// All elemets in the generated array will be different. typo: `elements` ::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:39 (Diff revision 3) > + uint32_t MAX, > + _PrefixArray& outArray) > +{ > + outArray.SetCapacity(outArray.Length() + N); > + > + uint32_t range = (MAX - MIN + 1); This could be `const` to make sure it doesn't get accidentally modified. ::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:46 (Diff revision 3) > + for (uint32_t i = 0; i < N; i++) { > + uint32_t prefixSize = (rand() % range) + MIN; > + _Prefix prefix; > + prefix.SetLength(prefixSize); > + > + while(true) { nit: space between `while` and `(` ::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:69 (Diff revision 3) > +GenerateUpdateData(bool fullUpdate, > + PrefixStringMap& add, > + nsTArray<uint32_t>& removal, > + nsTArray<TableUpdate*>& tableUpdates) > +{ > + TableUpdateV4* tableUpdate = new TableUpdateV4(NS_LITERAL_CSTRING("test-proto")); I would suggest using a valid table name here, like `gtest-malware-proto` ::: toolkit/components/url-classifier/tests/gtest/TestUrlClassifierTableUpdateV4.cpp:95 (Diff revision 3) > + getter_AddRefs(file)); > + > + file->AppendRelativeNativePath(NS_LITERAL_CSTRING("safebrowsing/test-proto.pset")); > + > + RefPtr<VariableLengthPrefixSet> load = new VariableLengthPrefixSet; > + load->Init(NS_LITERAL_CSTRING("test-proto")); Again, `gtest-malware-proto`.
Attachment #8789680 - Flags: review?(francois) → review-
Comment on attachment 8789680 [details] Bug 1283009 - Store variable-length prefix to disk. https://reviewboard.mozilla.org/r/77816/#review77030 Thanks for review! > Since we don't support partial updates yet, maybe we should clear the state that we save so that the next update is guaranteed to be a full update again. > > That way we have something that works fully, even though it's inefficient. I intend to land bug 1283007, bug 1283009(this one) & bug 1287058(partial update) together once they are all r+. Is that ok for you? > The pattern I've seen elsewhere is `aTable.get()`. > > I'm not sure that matters, but it seems simpler. |aTable| here is nsACString, nsACString doesn't have |get| function. I guess the pattern you saw is "nsCString" > Do we ever delete this? mPrefixSet is reference counting by RefPtr[1] It will be deleted automatically when LookupCache is freed. [1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/LookupCache.h#140 > If you think we definitely need to do this, then we should file a follow-up bug. > > Otherwise, if we're not sure, I would suggest removing the "TODO:" from the comment. > > Alternatively, we could split them up now: > > ``` > if (mGetHashCache.BinaryIndexOf(aCompletion) != nsTArray<Completion>::NoIndex) { > LOG(("Complete (cache) in %s", mTableName.get())); > *aComplete = true; > *aHas = true; > } else if (mUpdateCompletions.BinaryIndexOf(aCompletion) != nsTArray<Completion>::NoIndex) { > LOG(("Complete (update) in %s", mTableName.get())); > *aComplete = true; > *aHas = true; > } > ``` We don't this for v2, since lookupCache is separated now. This comment should be removed, thanks for reminding! > Does that member ever get deleted? > > This class has an empty destructor and I didn't see a delete anywhere. The same reason as mPrefixSet > Shouldn't this be `LookupCacheV4`? Yes, you are right, sorry i didn't check it carefully.
Attachment #8785177 - Attachment is obsolete: true
Attachment #8785906 - Attachment is obsolete: true
Attachment #8789239 - Attachment is obsolete: true
Attachment #8789241 - Attachment is obsolete: true
(In reply to Dimi Lee[:dimi][:dlee] from comment #19) > I intend to land bug 1283007, bug 1283009(this one) & bug 1287058(partial > update) together once they are all r+. > Is that ok for you? Sounds good to me. > > The pattern I've seen elsewhere is `aTable.get()`. > > > > I'm not sure that matters, but it seems simpler. > > |aTable| here is nsACString, nsACString doesn't have |get| function. > I guess the pattern you saw is "nsCString" You're right. I was missing the PromiseFlatCString() part: http://searchfox.org/mozilla-central/rev/bcc9ea6947878ca6378e5b7d6f08c1c090ed9eb7/dom/security/SRIMetadata.cpp#33 > > Do we ever delete this? > > mPrefixSet is reference counting by RefPtr[1] > It will be deleted automatically when LookupCache is freed. > > [1] > https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url- > classifier/LookupCache.h#140 Ah, of course. I should have caught that.
Attachment #8789680 - Flags: review?(francois) → review+
Attachment #8789680 - Flags: review?(gpascutto)
Hi gcp, Could you help review this one ? This patch is based on bug 1283007. Thanks!
Comment on attachment 8789680 [details] Bug 1283009 - Store variable-length prefix to disk. https://reviewboard.mozilla.org/r/77816/#review77818 ::: toolkit/components/url-classifier/Classifier.cpp (Diff revision 4) > > // Read the part of the store that is (only) in the cache > - LookupCache *lookupCache = GetLookupCache(store.TableName()); > - if (!lookupCache) { > - return NS_ERROR_FAILURE; > + LookupCacheV2* lookupCache = > + LookupCache::Cast<LookupCacheV2>(GetLookupCache(store.TableName())); > + NS_ENSURE_TRUE(lookupCache, NS_ERROR_FAILURE); > - } FWIW this was originally an if ()+return pair, and that is actually the style recommended by our guidelines (which disrecommend NS_ENSURE_*), so there was no need to change this. I guess it was done for consistency with the rest of the code? ::: toolkit/components/url-classifier/Classifier.cpp:961 (Diff revision 4) > + prefixes.Clear(); > + TableUpdateV4::PrefixesStringMap& map = updateV4->Prefixes(); > + > + for (auto iter = map.Iter(); !iter.Done(); iter.Next()) { > + nsCString* prefix = new nsCString(iter.Data()->GetPrefixString()); > + prefixes.Put(iter.Key(), prefix); Maybe this could use a comment pointing out that PrefixStringMap is an nsClassHashtable, which takes ownership of put objects. ::: toolkit/components/url-classifier/Classifier.cpp:977 (Diff revision 4) > + > + rv = lookupCache->WriteFile(); > + NS_ENSURE_SUCCESS(rv, rv); > + > + int64_t now = (PR_Now() / PR_USEC_PER_SEC); > + LOG(("Successfully updated %s\n", aTable.BeginReading())); ::: toolkit/components/url-classifier/Classifier.cpp:977 (Diff revision 4) > + > + rv = lookupCache->WriteFile(); > + NS_ENSURE_SUCCESS(rv, rv); > + > + int64_t now = (PR_Now() / PR_USEC_PER_SEC); > + LOG(("Successfully updated %s\n", aTable.BeginReading())); There's no guarantee nsACString is null-terminated, and that's exactly why it doesn't have a get() method. Use the PromiseFlatCString().get() style used earlier in this function. See for example this doc: https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Glue_classes/nsEmbedCString which makes it clear the get() (and null termination guarantee) only comes from concrete implementations of nsACString. ::: toolkit/components/url-classifier/Classifier.cpp:1048 (Diff revision 4) > uint32_t aCount, > PrefixArray* aNoiseEntries) > { > - LookupCache *cache = GetLookupCache(aTableName); > - if (!cache) { > - return NS_ERROR_FAILURE; > + // TODO : Bug 1297962, support adding noise for v4 > + LookupCacheV2 *cache = static_cast<LookupCacheV2*>(GetLookupCache(aTableName)); > + NS_ENSURE_TRUE(cache, NS_ERROR_FAILURE); Same remark as before, we no longer recommend this style so that's moving backwards. ::: toolkit/components/url-classifier/LookupCache.h:181 (Diff revision 4) > + > +#if DEBUG > + void DumpCompletions(); > +#endif > + > + static const int VER = 2; Some time in the future you're going to have the classic "Undefined reference to LookupCacheV2::VER." gcc error. You need a definition in addition to the declaration. ::: toolkit/components/url-classifier/LookupCache.h:218 (Diff revision 4) > + virtual nsresult Has(const Completion& aCompletion, > + bool* aHas, bool* aComplete) override; > + > + nsresult Build(PrefixStringMap& aPrefixMap); > + > + static const int VER = 4; Same.
Attachment #8789680 - Flags: review?(gpascutto) → review+
Comment on attachment 8789680 [details] Bug 1283009 - Store variable-length prefix to disk. https://reviewboard.mozilla.org/r/77816/#review77818 Thanks for review!
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: