Lookup certificate information in whitelists to suppress remote lookups

RESOLVED FIXED in mozilla30

Status

()

defect
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: mmc, Assigned: mmc)

Tracking

(Blocks 1 bug)

unspecified
mozilla30
x86
macOS
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 5 obsolete attachments)

Once bug 928536 is done, we need to lookup certificate information for signed binaries in goog-white-digest256 to suppress remote lookups. Unfortunately this isn't documented, but the call in Chrome is here:

https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/safe_browsing/download_protection_service.h&q=CertificateWhitelistStrings&sq=package:chromium&type=cs&l=168
Posted patch check-cert-whitelists (obsolete) — Splinter Review
Needs tests, not ready for review
Assignee: nobody → mmc
Posted patch check-cert-whitelists (obsolete) — Splinter Review
Hey Gian-Carlo,

The idea behind this patch is that for application reputation checks, if a binary is signed then we can use the certificate information to whitelist it from remote checks. Basically we take the signatures, which may contain multiple certificate chains from multiple signers, then construct URLs from them using their common name and organization name and sha1 fingerprint to look up in the digest256 tables.

There is no documentation for this, I took all the logic from the Chrome code. Let me know if you want a walk through over vidyo or something.

Thanks,
Monica
Attachment #8371197 - Attachment is obsolete: true
Attachment #8371990 - Flags: review?(gpascutto)
Comment on attachment 8371990 [details] [diff] [review]
check-cert-whitelists

Review of attachment 8371990 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/downloads/ApplicationReputation.cpp
@@ +342,5 @@
> +  // There are no more URIs to check against local list, so send the remote
> +  // query if we can.
> +  // Revert to just ifdef XP_WIN when remote lookups are enabled (bug 933432)
> +//#if 0 and defined(XP_WIN)
> +#if defined(XP_WIN)

This needs to be #if 0 before checkin, again.
Comment on attachment 8371990 [details] [diff] [review]
check-cert-whitelists

+David for a small amount of PSM stuff, and for double-checking that nothing needs to be an NSSShutdownObject.
Attachment #8371990 - Flags: review?(dkeeler)
Comment on attachment 8372070 [details] [diff] [review]
Check certificate whitelist strings before sending remote lookup (

Forgot test file.
Attachment #8372070 - Flags: review?(gpascutto)
Attachment #8372070 - Flags: review?(dkeeler)
Attachment #8371990 - Attachment is obsolete: true
Attachment #8371990 - Flags: review?(gpascutto)
Attachment #8371990 - Flags: review?(dkeeler)
Comment on attachment 8372070 [details] [diff] [review]
Check certificate whitelist strings before sending remote lookup (

Review of attachment 8372070 [details] [diff] [review]:
-----------------------------------------------------------------

Your use of PSM apis looks good. Nothing more needs to be a nsNSSShutDownObject.

::: toolkit/components/downloads/ApplicationReputation.cpp
@@ +147,5 @@
> +    const ClientDownloadRequest_CertificateChain& aChain);
> +
> +  // For signed binaries, generate strings of the form:
> +  // http://sb-ssl.google.com/safebrowsing/csd/certificate/
> +  //   <issuer_cert_sha1_fingerprint>[/CN=<cn>][/O=<org>][/OU=<unit>]

You might make it a bit more clear that <cn>, <org>, and <unit> refer to cert, not the issuer.

@@ +458,5 @@
> +    const_cast<char *>(aChain.element(0).certificate().data()),
> +    aChain.element(0).certificate().size(), getter_AddRefs(signer));
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  for (int i = 1; i < aChain.element_size(); ++i) {

Are the chains just <signer, issuer> pairs? I didn't get that impression from the other patch. If they can have multiple intermediates, won't you end up with whitelist strings for pairs of certificates where neither of them issued the other? Maybe that's the way Chrome does it, but it seems strange.

::: toolkit/components/downloads/test/unit/data/signed_win.exe
@@ +1,1 @@
> +<?xml version="1.0" encoding="UTF-8"?>

I imagine this is supposed to be a binary?
Attachment #8372070 - Flags: review?(dkeeler) → review+
Comment on attachment 8372070 [details] [diff] [review]
Check certificate whitelist strings before sending remote lookup (

Review of attachment 8372070 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for sanity-checking.

::: toolkit/components/downloads/ApplicationReputation.cpp
@@ +458,5 @@
> +    const_cast<char *>(aChain.element(0).certificate().data()),
> +    aChain.element(0).certificate().size(), getter_AddRefs(signer));
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  for (int i = 1; i < aChain.element_size(); ++i) {

They can have multiple intermediaries. So the first one is supposed to be the signer, and that one is always used to construct the whitelist string. Any of its ancestors may be used in conjunction to whitelist, too, even if it didn't directly issue the signing cert. I will add a comment pointing to the chrome code.

https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/safe_browsing/download_protection_service.cc&q=CertificateChainIsWhitelisted&sq=package:chromium&type=cs&l=714

::: toolkit/components/downloads/test/unit/data/signed_win.exe
@@ +1,1 @@
> +<?xml version="1.0" encoding="UTF-8"?>

Oops, this is me adding the wrong test file from the wrong machine yet again. It should be the same as netwerk/test/unit/data/signed_win.exe, will fix.
gcp sez: document what chrome is doing and put it in a comment somewhere. Also he can't get to the review until next week when his workweek is over.
Comment on attachment 8372070 [details] [diff] [review]
Check certificate whitelist strings before sending remote lookup (

Review of attachment 8372070 [details] [diff] [review]:
-----------------------------------------------------------------

I'm finding it hard to figure out what all this code is doing, which is a bit scary. Realistically if anyone besides you is going to have to fix something here, they'll end up having to reverse engineer both Chrome source and Firefox source to try to determine what Chrome is actually doing and what Firefox is doing differently. This seems like a maintenance nightmare. The old SafeBrowsing code was potentially even less legible than this but it had the advantage you could read the spec and understand what it was trying to do, which helps when reading the code and understanding control flow.

Based on this, as I already mentioned on IRC, I'd strongly recommend you document what you reverse engineered from Chrome and try to document the protocol and what this code is actually doing. This can be very brief and I'd stick it at the top of whatever the main cpp of this feature is. But I think we really need this. If there's bugs then, we can check if it's a case of the doc not matching what Chrome does, or Firefox not doing what the doc describes, which sounds far easier to me.

::: toolkit/components/downloads/ApplicationReputation.cpp
@@ +191,5 @@
> +
> +  // Look up the given URI in the safebrowsing DBs, optionally on both the allow
> +  // list and the blocklist. If there is a match, call
> +  // PendingLookup::OnComplete. Otherwise, call PendingLookup::LookupNext.
> +  nsresult LookupSpec(const nsACString& aSpec, bool allowListOnly);

nit: for consistency it should probably aAllowListOnly

@@ +212,5 @@
> +                   nsIUrlClassifierCallback)
> +
> +PendingDBLookup::PendingDBLookup(PendingLookup* aPendingLookup) :
> +  mAllowListOnly(false),
> +  mPendingLookup(aPendingLookup)

So we're passing in a raw ptr (actually "this" from the caller), and storing it in an nsRefPtr. This doesn't really seem to make sense to me? What ownership problem is the nsRefPtr trying to solve here?

@@ +230,5 @@
> +{
> +  LOG(("Checking principal %s", aSpec.Data()));
> +  mSpec = aSpec;
> +  nsresult rv = LookupSpecInternal(aSpec, allowListOnly);
> +  if (!NS_SUCCEEDED(rv)) {

!NS_SUCCEEDED = NS_FAILED?

@@ +243,5 @@
> +nsresult
> +PendingDBLookup::LookupSpecInternal(const nsACString& aSpec,
> +                                    bool allowListOnly)
> +{
> +  mAllowListOnly = allowListOnly;

You can move this to LookupSpec and simplify the prototype.

@@ +264,5 @@
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  // Check local lists to see if the URI has already been whitelisted or
> +  // blacklisted.
> +  nsCOMPtr<nsIUrlClassifierDBService> dbService = nullptr;

nit: Why nullptr here, but not elsewhere in this code? It's not very consistent.

@@ +266,5 @@
> +  // Check local lists to see if the URI has already been whitelisted or
> +  // blacklisted.
> +  nsCOMPtr<nsIUrlClassifierDBService> dbService = nullptr;
> +  dbService = do_GetService(NS_URLCLASSIFIERDBSERVICE_CONTRACTID, &rv);
> +  NS_ENSURE_SUCCESS(rv, rv);

This function seems to do a whole bunch of GetService calls for every lookup. I don't know how big their overhead is but this seems quite suboptimal? We don't do a lot of downloads but for other parts of the code we might end up calling this several times in succession?

@@ +328,5 @@
> +  // Look up all of the URLs that could whitelist this download.
> +  int index = mAllowlistSpecs.Length() - 1;
> +  if (index >= 0) {
> +    nsCString spec = mAllowlistSpecs[index];
> +    mAllowlistSpecs.RemoveElementAt(index);

Let's file a bug to make nsTArray.Pop() and nsTArray.Shift()? I'm surprised we have no concept of a stack in Firefox.

@@ +329,5 @@
> +  int index = mAllowlistSpecs.Length() - 1;
> +  if (index >= 0) {
> +    nsCString spec = mAllowlistSpecs[index];
> +    mAllowlistSpecs.RemoveElementAt(index);
> +    nsRefPtr<PendingDBLookup> lookup(new PendingDBLookup(this));

This is really an AutoPtr/ScopedPtr?

@@ +341,5 @@
> +  }
> +  // There are no more URIs to check against local list, so send the remote
> +  // query if we can.
> +  // Revert to just ifdef XP_WIN when remote lookups are enabled (bug 933432)
> +//#if 0 and defined(XP_WIN)

Get rid of this.

@@ +488,5 @@
> +nsresult
> +PendingLookup::StartLookup()
> +{
> +  nsresult rv = DoLookupInternal();
> +  if (!NS_SUCCEEDED(rv)) {

NS_FAILED

@@ +545,5 @@
>    ClientDownloadRequest_SignatureInfo* aSignatureInfo)
>  {
> +  // If we haven't been set for any reason, bail.
> +  if (!aSigArray) {
> +    return NS_OK;

Bit weird to return OK on failure. The next calls in the callers are now guaranteed to fail, no?
Attachment #8372070 - Flags: review?(gpascutto) → review-
Thanks for the review, gcp! I started on the documentation and then realized it is too big and needs to be on a wiki, but I'm not going to get it done before my long weekend (tomorrow-monday). About the RefPtr stuff: the previous iteration of this code just had RefPtr<PendingLookup>, which the DB service kept alive as a callback. Since PendingLookup now doesn't do lookups directly, its child PendingDBLookup still needs to keep a handle to it until the final callback is complete, otherwise PendingLookup will get GCed. From the xpcshell test and manual inspection of the logs, things seem to be allocated and deallocated as expected, but I am happy to do it a better way if you have any ideas.

The null sigArray should only ever happen in testing. I can set those in the unittest and will fix everything else.
>Since PendingLookup now doesn't do lookups directly, its child PendingDBLookup 
>still needs to keep a handle to it until the final callback is complete, otherwise 
>PendingLookup will get GCed.

So what it's really doing is transferring ownership of the this object to the callee, which will delete it when he himself exits scope? I guess it's safe because all calls are of the form "return lookup->Callback()".

I'm confused by your use of the word GC here, because the prototype is this:
   PendingDBLookup(PendingLookup* aPendingLookup);

That's a raw pointer, so any reference counts are broken. It can only work if it's transfer of ownership, I think?

Reading on a bit, I think this:
nsRefPtr<PendingLookup> lookup(new PendingLookup(aQuery, aCallback));

is also really an AutoPtr/ScopedPtr.
(In reply to Gian-Carlo Pascutto (:gcp) from comment #13)
> >Since PendingLookup now doesn't do lookups directly, its child PendingDBLookup 
> >still needs to keep a handle to it until the final callback is complete, otherwise 
> >PendingLookup will get GCed.
> 
> So what it's really doing is transferring ownership of the this object to
> the callee, which will delete it when he himself exits scope? I guess it's
> safe because all calls are of the form "return lookup->Callback()".
> 
> I'm confused by your use of the word GC here, because the prototype is this:
>    PendingDBLookup(PendingLookup* aPendingLookup);
> 
> That's a raw pointer, so any reference counts are broken. It can only work
> if it's transfer of ownership, I think?

I don't think so, it gets assigned to an nsRefPtr<PendingLookup> mPendingLookup which increases the ref count, same as allocating any new nsRefPtr. Is there a more idiomatic way to do this?

> 
> Reading on a bit, I think this:
> nsRefPtr<PendingLookup> lookup(new PendingLookup(aQuery, aCallback));
> 
> is also really an AutoPtr/ScopedPtr.

I'm not really sure what you mean, since it must outlive the scope of the block that allocates it and is not a ScopedPtr. PendingDBLookup must live until it re-enters its parent callback, and PendingLookup must live until it re-enters the original callback. I'm sure I can improve the commenting here.
> > That's a raw pointer, so any reference counts are broken. It can only work
> > if it's transfer of ownership, I think?
> 
> I don't think so, it gets assigned to an nsRefPtr<PendingLookup>
> mPendingLookup which increases the ref count, same as allocating any new
> nsRefPtr.

Yes, here's the code:
http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsAutoPtr.h#945
>Is there a more idiomatic way to do this?

I think it's more clear in this case to use an AutoPtr/ScopedPtr with a .forget() or .release() or whatever it's called in the call. My issue with using an nsRefPtr here is that this implies sharing of ownership. But that's not true, as the object is only owned by one class at a time. 

http://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#304
http://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#325

I think this is the exact same as what you're doing, but without the overhead of the RefPtr and with clear transfer-of-ownership semantics.

>I'm not really sure what you mean, since it must outlive the scope of the block that allocates it and 
>is not a ScopedPtr.

I should have indicated what I was looking at for the second example:

nsresult ApplicationReputationService::QueryReputationInternal

  // Create a new pending lookup and start the call chain.
  nsRefPtr<PendingLookup> lookup(new PendingLookup(aQuery, aCallback));
  NS_ENSURE_STATE(lookup);

  return lookup->StartLookup();

There's no way lookup escapes the scope here. So again, no reason to use nsRefPtr.
(In reply to Gian-Carlo Pascutto (:gcp) from comment #16)
> nsresult ApplicationReputationService::QueryReputationInternal
> 
>   // Create a new pending lookup and start the call chain.
>   nsRefPtr<PendingLookup> lookup(new PendingLookup(aQuery, aCallback));
>   NS_ENSURE_STATE(lookup);
> 
>   return lookup->StartLookup();
> 
> There's no way lookup escapes the scope here. So again, no reason to use
> nsRefPtr.

StartLookup is an asynchronous call that doesn't call nsIApplicationReputationCallback until the DB lookups and remote lookups are done. nsAutoPtr deallocates at the end of that block before the callback is complete. I consistently get a crash when using nsAutoPtr here. Am I missing something?
Attachment #8372070 - Attachment is obsolete: true
Comment on attachment 8377891 [details] [diff] [review]
Check certificate whitelist strings before sending remote lookup (

- Address gcp's comments except for the AutoPtr vs RefPtr thing.
- Wrote up https://wiki.mozilla.org/Security/Features/Application_Reputation_Design_Doc
- Filed https://bugzilla.mozilla.org/show_bug.cgi?id=974153 for nsTArray
Attachment #8377891 - Flags: review?(gpascutto)
>- Wrote up https://wiki.mozilla.org/Security/Features/Application_Reputation_Design_Doc

Question here: if I understand correctly, on non-Windows platforms, we only use the blocklist. The whitelist is useless because even if an entry isn't on it, we cannot do a remote probe so the download will pass anyway. Correct?

If that is true, we should disable updates for those on non-Windows platforms. No need to waste diskspace and bandwidth especially on mobile for data we can't use.
Comment on attachment 8377891 [details] [diff] [review]
Check certificate whitelist strings before sending remote lookup (

Review of attachment 8377891 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM. I want to fiddle with the nsRefPtr usage a bit to understand what's going on here, but that shouldn't hold this patch up.

::: toolkit/components/downloads/ApplicationReputation.cpp
@@ +273,5 @@
> +  // HandleEvent is guaranteed to call either:
> +  // 1) PendingLookup::OnComplete if the URL can be classified locally, or
> +  // 2) PendingLookup::LookupNext if the URL can be cannot classified locally.
> +  // Allow listing trumps block listing.
> +  nsCString allowList;

nsCAutoString

@@ +281,5 @@
> +    LOG(("Found principal %s on allowlist [this = %p]", mSpec.get(), this));
> +    return mPendingLookup->OnComplete(false, NS_OK);
> +  }
> +
> +  nsCString blockList;

same

@@ +350,5 @@
> +nsCString
> +PendingLookup::EscapeCertificateAttribute(const nsACString& aAttribute)
> +{
> +  // Escape '/' because it's a field separator, and '%' because Chrome does
> +  nsCString escaped;

You can perhaps pre-allocate because you already know the minimum length? Does SetCapacity do the right thing?

@@ +369,5 @@
> +nsCString
> +PendingLookup::EscapeFingerprint(const nsACString& aFingerprint)
> +{
> +  // Google's fingerprint doesn't have colons
> +  nsCString escaped;

same
Attachment #8377891 - Flags: review?(gpascutto) → review+
Attachment #8377891 - Attachment is obsolete: true
Attachment #8378496 - Attachment is obsolete: true
Address gcp's comments. Also I realized I was leaking memory in making the nsCOMPtr<service> member variables of ApplicationReputationService. I timed it, and do_getService for the ones we are using take between 0.001 and 0.01 ms, so it's in the noise. Lookups take around 0.2-0.5 ms total, not including remote lookups.

Try: https://tbpl.mozilla.org/?tree=Try&rev=7cd100e9600e(In reply to Gian-Carlo Pascutto (:gcp) from comment #20)

> >- Wrote up https://wiki.mozilla.org/Security/Features/Application_Reputation_Design_Doc
> 
> Question here: if I understand correctly, on non-Windows platforms, we only
> use the blocklist. The whitelist is useless because even if an entry isn't
> on it, we cannot do a remote probe so the download will pass anyway. Correct?
> 
> If that is true, we should disable updates for those on non-Windows
> platforms. No need to waste diskspace and bandwidth especially on mobile for
> data we can't use.

Filed https://bugzilla.mozilla.org/show_bug.cgi?id=974579.
Depends on: 974786
https://hg.mozilla.org/mozilla-central/rev/892751e80036
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
No longer blocks: 977236
You need to log in before you can comment on or make changes to this bug.