Closed Bug 453723 Opened 16 years ago Closed 16 years ago

Short-circuit known-clean hosts in the urlclassifier

Categories

(Toolkit :: Safe Browsing, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: dcamp, Assigned: dcamp)

References

Details

Attachments

(2 files, 4 obsolete files)

Attached patch v1 (obsolete) — Splinter Review
After a lookup on a clean host, we don't need to keep checking every url on that host.  Attached patch keeps a small cache of clean host keys, and shortcuts lookups on these hosts.

The cache is reset after an update.
Attachment #336932 - Flags: review?(tony)
Comment on attachment 336932 [details] [diff] [review]
v1

>+  PRLock* mCleanHostKeysLock;

Can you add a comment explaining what the lock is used to protect?

>+  // First, if this key has been checked since our last update and had
>+  // no entries, we can exit early.  We also do this check before
>+  // posting the lookup to this thread, but in case multiple lookups
>+  // are queued at the same time, it's worth checking again here.
>+  nsAutoLock lock(mCleanHostKeysLock);
>+
>+  if (mCleanHostKeys.Has(hostKey))
>+    return NS_OK;
>+
>+  lock.unlock();

Maybe put this code in a local scope so you don't need to manually call lock.unlock()?  Same with the use of the autolock below.

> nsresult
> nsUrlClassifierDBService::LookupURI(nsIURI* uri,
>-                                    nsIUrlClassifierCallback* c)
>+                                    nsIUrlClassifierCallback* c,
>+                                    PRBool forceLookup,
>+                                    PRBool *didLookup)

Can we put all the in params before the out params (so uri, forceLookup, c, didLookup)?  Isn't that what normally happens in other Moz code?

>+class nsUrlClassifierFragmentSet

There should be a comment describing this class (linked list).


Should there be new unittests or do the existing ones cover it?  E.g., querying the same host twice or query host, update, query host.
Attachment #336932 - Flags: review?(tony) → review+
(In reply to comment #1)

> > nsresult
> > nsUrlClassifierDBService::LookupURI(nsIURI* uri,
> >-                                    nsIUrlClassifierCallback* c)
> >+                                    nsIUrlClassifierCallback* c,
> >+                                    PRBool forceLookup,
> >+                                    PRBool *didLookup)
> 
> Can we put all the in params before the out params (so uri, forceLookup, c,
> didLookup)?  Isn't that what normally happens in other Moz code?

uri, c, and forcelookup are all in parameters...

> >+class nsUrlClassifierFragmentSet
> 
> There should be a comment describing this class (linked list).
> Should there be new unittests or do the existing ones cover it?  E.g., querying
> the same host twice or query host, update, query host.

Oops, didn't hg add the unit test file.  Will post a new patch soon...
Attached patch v2 (obsolete) — Splinter Review
This new version adds a couple more small caches.  The first is a clean fragment cache.  So if we're doing checks on a host that does have malware, we'll avoid hashing/looking up fragments that have already been checked.  This should help avoid excess hashing on pages that load a lot of similar urls:

http://foo.bar.com/app/scripts/foo.js
http://foo.bar.com/app/scripts/bar.js
http://foo.bar.com/app/scripts/baz.js

Previously, this would generate hashes for each of foo.bar.com/, foo.bar.com/app, and foo.bar.com/app/scripts for each of these lookups.  With this patch, we'll only generate those once (during the first check), and then only check the leaf nodes on subsequent loads.

Finally, we cache the last db lookup of a host key.  So for the examples above, only the first load will result in a db lookup for foo.bar.com.  Subsequent checks will just look at the results from the earlier foo.bar.com query.

Between these three caches, a typical page load will still only result in one or two db lookups, and subsequent checks for script, css, etc should do minimal work.
Attachment #336932 - Attachment is obsolete: true
Attachment #340462 - Flags: review?(tony)
Comment on attachment 340462 [details] [diff] [review]
v2

looks good!
Attachment #340462 - Flags: review?(tony) → review+
Landed as http://hg.mozilla.org/mozilla-central/rev/1a971f9406f8
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
unit test failed, this was backed out.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch followup (obsolete) — Splinter Review
Thunderbird and seamonkey unit tests were failing because they don't turn on the malware/phishing prefs.  New patch turns them on explicitly in the unit tests, and also clears the caches during a database reset.
Attachment #341007 - Flags: review?(tony)
Comment on attachment 341007 [details] [diff] [review]
followup

OK
Attachment #341007 - Flags: review?(tony) → review+
Landed as http://hg.mozilla.org/mozilla-central/rev/74aad43f37a5
Status: REOPENED → RESOLVED
Closed: 16 years ago16 years ago
Resolution: --- → FIXED
Attached file test workaround (obsolete) —
There's a race in the test cases (starting a lookup directly after a resetDatabase call).  I'm landing this workaround until the race (bug 457790) can be properly fixed.
Depends on: 457828
Reverted again, due to bug 457828.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch v3Splinter Review
New patch includes a fix for the crash, and a test that triggers the crash without this fix.  Interdiff coming up...
Attachment #340462 - Attachment is obsolete: true
Attachment #341007 - Attachment is obsolete: true
Attachment #341031 - Attachment is obsolete: true
Attachment #343641 - Flags: review?(tony)
Attached patch interdiffSplinter Review
Comment on attachment 343641 [details] [diff] [review]
v3

make it so
Attachment #343641 - Flags: review?(tony) → review+
(re-re-)landed as http://hg.mozilla.org/mozilla-central/rev/7825b962281d
Status: REOPENED → RESOLVED
Closed: 16 years ago16 years ago
Resolution: --- → FIXED
Depends on: 475436
Product: Firefox → Toolkit
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: