Closed Bug 1514202 Opened 10 months ago Closed 10 months ago

Port flash url-classifier to nsIUrlClassifierFeature

Categories

(Toolkit :: Safe Browsing, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox66 --- fixed

People

(Reporter: baku, Assigned: baku)

References

Details

Attachments

(2 files, 5 obsolete files)

No description provided.
Attached patch part 1 - flash features (obsolete) — Splinter Review
Attachment #9031404 - Flags: review?(dlee)
Attached patch part 2 - cleanup (obsolete) — Splinter Review
Attachment #9031405 - Flags: review?(dlee)
Priority: -- → P1
Comment on attachment 9031404 [details] [diff] [review]
part 1 - flash features

Edgar, can you review the DOM part of this patch?
Attachment #9031404 - Flags: review?(echen)
Attachment #9031405 - Flags: review?(echen)
Comment on attachment 9031405 [details] [diff] [review]
part 2 - cleanup

Review of attachment 9031405 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me!

::: dom/base/nsDocument.cpp
@@ -379,5 @@
> -  bool mAsyncClassified;
> -  nsTArray<nsCString> mMatchedTables;
> -  mozilla::dom::FlashClassification mResult;
> -};
> -

I guess we can also get rid of the #include "nsIURIClassifier.h" from nsDocument.cpp & .h  !!!

@@ +12008,5 @@
>   */
> +FlashClassification nsIDocument::DocumentFlashClassification() {
> +  // If neither pref is on, skip the null-principal and principal URI checks.
> +  if (StaticPrefs::plugins_http_https_only() &&
> +      !StaticPrefs::plugins_flashBlock_enabled()) {

The comment and the behavior seems doesn't match.
It should be 
if (!StaticPrefs::plugins_http_https_only() &&    
    !StaticPrefs::plugins_flashBlock_enabled()) {
}

I check the revision history, it looks like this is wrong after the previous fix?[1]

[1] https://hg.mozilla.org/mozreview/gecko/rev/aa00a4c2253498133b99f7cb9198c4521cebfefe#index_header

@@ +12029,5 @@
>      // * chrome documents
>      // * "bare" data: loads
>      // * FTP/gopher/file
>      nsAutoCString scheme;
> +    nsresult rv = classificationURI->GetScheme(scheme);

Any reason we declare another nsresult here?

@@ +12069,5 @@
>    }
>  
> +  if (state == nsIHttpChannel::FlashPluginAllowed) {
> +    classification = FlashClassification::Allowed;
> +  }

I would suggest the following because I think it is more straightforward.

FlashClassification classification = FlashClassification::Unknown;
nsCOMPtr<nsIHttpChannel> httpChannel = do_QueryInterface(GetChannel());
if (httpChannel) {
  nsIHttpChannel::FlashPluginState state = nsIHttpChannel::FlashPluginUnknown;
  if (...)
}
Attachment #9031405 - Flags: review?(dlee) → review+
Comment on attachment 9031404 [details] [diff] [review]
part 1 - flash features

Review of attachment 9031404 [details] [diff] [review]:
-----------------------------------------------------------------

Baku, thank you for working on this. This is great!
We can now easily fix Bug 1342333(Remove sync classifyLocalWithTables API) which
requires lots of work previously!!!

Please see comments below and let me know how do you think.

::: netwerk/url-classifier/AsyncUrlChannelClassifier.cpp
@@ +298,1 @@
>      nsAutoCString errorName;

Add UC_LOG_ENABLED()

@@ +301,5 @@
>          nsPrintfCString("BlacklistClassifierCallback[%p]:OnClassifyComplete "
>                          "got an unexpected error (rv=%s) while trying to "
>                          "create a whitelist URI. Allowing tracker.",
>                          this, errorName.get())
>              .get());

Should we move this debug message in GetFeatureTasks after GetURIFromChannel ?

@@ +320,1 @@
>                                        std::move(mChannelCallback));

For now, tracking protection/tracking annotation/flash they all have its whitelist table.
Just an idea that we could consider optimizing in the future that if a feature doesn't
have a whitelist preference or the whitelist table is somehow empty, we don't need to invoke ::AsyncClassifyLocalWithFeatures or maybe try to return ASAP in ::AsyncClassifyLocalWithFeatures

@@ +337,5 @@
> +             "calling AsyncClassifyLocalWithFeatures with rv=%s.",
> +             this, errorName.get()));
> +      }
> +
> +      return TrackerFound(mResults, mChannel, mChannelCallback);

Is it ok to call |mChannelCallback| multiple times? Not sure if the caller passed the callback to ::CheckChannel expects that

@@ +362,5 @@
>    if (features.IsEmpty()) {
>      UC_LOG(("AsyncUrlChannelClassifier: Feature list is empty for channel %p",
>              aChannel));
>      return NS_ERROR_FAILURE;
>    }

It is not easy for people to understand the meaning of feature here.
Since here is the entry point of "classification", I think it will be useful
if we add some comment and maybe a example before UrlClassifierFeatureFactory::GetFeaturesFromChannel
to give the audience an overall idea.

@@ +371,5 @@
> +  if (NS_WARN_IF(NS_FAILED(rv)) || tasks.IsEmpty()) {
> +    return rv;
> +  }
> +
> +  MOZ_ASSERT(!tasks.IsEmpty());

Maybe some comments about why GetFeatureTasks is needed.
just to make sure if I understand this correctly, because ::AsyncClassifyLocalWithFeatures is per URI call and it is possible that those features using different URIs, so FeatureTask groups features with the same URI together?

@@ +396,5 @@
> +    rv = uriClassifier->AsyncClassifyLocalWithFeatures(
> +        task.mURI, task.mFeatures, nsIUrlClassifierFeature::blacklist,
> +        callback);
> +    if (NS_WARN_IF(NS_FAILED(rv))) {
> +      return rv;

Maybe an edge case that what if it fails on the second run.
Do we need to update callback.mTaskCount accordingly and skip doing things after
first callback returns?
Will it be easier if we just skip these features if returns failure?

::: netwerk/url-classifier/UrlClassifierCommon.cpp
@@ +243,5 @@
>  
>    return NS_OK;
>  }
>  
> +/* static */ nsresult UrlClassifierCommon::CreateWhiteListURI(

We should change the function name and add some comment.
I'll suggest at least include the following to avoid misuse.

// Use this function only when you are looking for a pairwise whitelist uri with the format:
// http://toplevel.page/?resource=channel.uri.domain
/* static */ nsresult UrlClassifierCommon::CreatePairwiseWhiteListURI(

::: netwerk/url-classifier/UrlClassifierFeatureFactory.cpp
@@ +75,5 @@
> +    nsTArray<nsCOMPtr<nsIUrlClassifierFeature>> flashFeatures;
> +    UrlClassifierFeatureFlash::GetFeatures(
> +        contentPolicyType == nsIContentPolicy::TYPE_SUBDOCUMENT, flashFeatures);
> +    aFeatures.AppendElements(flashFeatures);
> +  }

I think we should try to hide feature dependent things from Factory.
And I do search for UrlClassifierFeatureFlash::MaybeCreate to see where we decide to
create flash features and then I found out it is here and ::GetFeature.

So how about putting these to UrlClassifierFeatureFlash::MaybeCreate(aChannel)?

::: netwerk/url-classifier/UrlClassifierFeatureFlash.cpp
@@ +66,5 @@
> +  }
> +}
> +
> +/* static */ void UrlClassifierFeatureFlash::GetFeatures(
> +    bool aIs3rdParty, nsTArray<nsCOMPtr<nsIUrlClassifierFeature>>& aFeatures) {

I am not sure about the performance impact here.
The original approach calls ::CheckIfClassifyNeeded before triggering classification.

If I understand correctly, right now we only check:
1. If it is a document/sub-document load
2. m3rdPartyOnly || aIs3rdParty

We should also include those preference related check in ::CheckIfClassifyNeeded to
avoid unnecessary classifications.

@@ +71,5 @@
> +  uint32_t numFeatures =
> +      (sizeof(sFlashFeaturesMap) / sizeof(sFlashFeaturesMap[0]));
> +  for (uint32_t i = 0; i < numFeatures; ++i) {
> +    MOZ_ASSERT(sFlashFeaturesMap[i].mFeature);
> +    if (!sFlashFeaturesMap[i].m3rdPartyOnly || aIs3rdParty) {

If I understand correctly, this is true when it is TYPE_SUBDOCUMENT, doesn't mean it is third-party.
I don't see the nsIDocument::IsThirdParty() check we used to have.

@@ +72,5 @@
> +      (sizeof(sFlashFeaturesMap) / sizeof(sFlashFeaturesMap[0]));
> +  for (uint32_t i = 0; i < numFeatures; ++i) {
> +    MOZ_ASSERT(sFlashFeaturesMap[i].mFeature);
> +    if (!sFlashFeaturesMap[i].m3rdPartyOnly || aIs3rdParty) {
> +      aFeatures.AppendElement(sFlashFeaturesMap[i].mFeature);

It is still possible that we have a preference but the table list is empty
I guess we can do something similar to the FeatureLoginReputation in Lazy Loading patch.
If the blacklist table is empty, then we don't create the future.

::: netwerk/url-classifier/UrlClassifierFeatureLoginReputation.cpp
@@ +104,5 @@
> +    return aChannel->GetURI(aURI);
> +  }
> +
> +  MOZ_ASSERT(aListType == nsIUrlClassifierFeature::whitelist);
> +  return UrlClassifierCommon::CreateWhiteListURI(aChannel, aURI);

This is wrong, the whitelist in login reputation doesn't use the pairwise whitelist
We should use aChannel->GetURI(aURI); here and MOZ_ASSERT when listType is a blacklist

::: netwerk/url-classifier/UrlClassifierFeatureTrackingProtection.cpp
@@ +132,5 @@
>  
> +NS_IMETHODIMP
> +UrlClassifierFeatureTrackingProtection::GetURIFromChannel(
> +    nsIChannel* aChannel, nsIUrlClassifierFeature::listType aListType,
> +    nsIURI** aURI) {

How do you think if we change the name to something like "GetURIByListType"?
I am thinking of this because the point of this function is "by listype", maybe
it is more clear to emphasize that.
Attachment #9031404 - Flags: review?(dlee) → review-
Comment on attachment 9031404 [details] [diff] [review]
part 1 - flash features

Review of attachment 9031404 [details] [diff] [review]:
-----------------------------------------------------------------

It looks good to me regarding the DOM part, but I don't have enough knowledge about url-classifier, I assume Dimi will review other things. :)

::: dom/ipc/URLClassifierParent.cpp
@@ +105,5 @@
> +  NS_IMETHOD
> +  GetURIFromChannel(nsIChannel* aChannel,
> +                    nsIUrlClassifierFeature::listType aListType,
> +                    nsIURI** aURI) override {
> +    // This method should not be called, but we have a URI, let's return it.

Nit: Maybe move this comment to line 110.

e.g.

// This method should not be called, but we have a URI, let's return it.
nsCOMPtr<nsIURI> uri = mURI;
...

::: netwerk/url-classifier/UrlClassifierFeatureFactory.cpp
@@ +73,5 @@
> +  if (contentPolicyType == nsIContentPolicy::TYPE_DOCUMENT ||
> +      contentPolicyType == nsIContentPolicy::TYPE_SUBDOCUMENT) {
> +    nsTArray<nsCOMPtr<nsIUrlClassifierFeature>> flashFeatures;
> +    UrlClassifierFeatureFlash::GetFeatures(
> +        contentPolicyType == nsIContentPolicy::TYPE_SUBDOCUMENT, flashFeatures);

Not sure why we treat subdocuments (e.g. iframes) as 3rd party, it seems to me that it is worth adding some comment explaining why or renaming the parameter of UrlClassifierFeatureFlash::GetFeatures to make it more clear.
Attachment #9031404 - Flags: review?(echen) → review+
> We can now easily fix Bug 1342333(Remove sync classifyLocalWithTables API)
> which
> requires lots of work previously!!!

Yes, when I land this, I'll ask you to review the removing of classifyLocalWithTables.

> For now, tracking protection/tracking annotation/flash they all have its
> whitelist table.
> Just an idea that we could consider optimizing in the future that if a
> feature doesn't
> have a whitelist preference or the whitelist table is somehow empty, we
> don't need to invoke ::AsyncClassifyLocalWithFeatures or maybe try to return
> ASAP in ::AsyncClassifyLocalWithFeatures

Right. Follow up?

> If I understand correctly, this is true when it is TYPE_SUBDOCUMENT, doesn't
> mean it is third-party.
> I don't see the nsIDocument::IsThirdParty() check we used to have.

A document is a 3rd party if loaded in an iframe. We use TYPE_SUBDOCUMENT for iframes, so yes, if contentType == TYPE_SUBDOCUMENT, the current channel is a 3rd party channel.

> It is still possible that we have a preference but the table list is empty
> I guess we can do something similar to the FeatureLoginReputation in Lazy
> Loading patch.
> If the blacklist table is empty, then we don't create the future.

Yes, you are right in principle, but we need the feature in order to know if the pref or the table is empty.
> > I don't see the nsIDocument::IsThirdParty() check we used to have.

Oh! I was wrong here. Ignore that part of my last comment.
Attached patch part 1 - flash features (obsolete) — Splinter Review
Attachment #9031404 - Attachment is obsolete: true
Attachment #9032141 - Flags: review?(dlee)
Comment on attachment 9031405 [details] [diff] [review]
part 2 - cleanup

Review of attachment 9031405 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsDocument.cpp
@@ -12079,5 @@
>   */
> -FlashClassification nsIDocument::PrincipalFlashClassification() {
> -  MOZ_ASSERT(mPrincipalFlashClassifier);
> -  return mPrincipalFlashClassifier->ClassifyMaybeSync(NodePrincipal(),
> -                                                      IsThirdParty());

I was trying to find where IsThirdParty() is handled after applying these patch.
Now I see the new part 1 patch includes that.

And I guess we could also get rid of nsIDocument::IsThirdParty.

::: dom/webidl/Document.webidl
@@ -516,5 @@
>  // For more information on Flash classification, see
>  // toolkit/components/url-classifier/flash-block-lists.rst
>  enum FlashClassification {
> -  "unclassified",   // Denotes a classification that has not yet been computed.
> -                    // Allows for lazy classification.

Webidl changes require a reivew from dom peer (I am not a DOM peer technically), though this is a chrome-only changes.
Attachment #9031405 - Flags: review?(echen) → review+
Blocks: 1342333
Blocks: 1515168
Comment on attachment 9032141 [details] [diff] [review]
part 1 - flash features

Review of attachment 9032141 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM! Again, thanks for getting this done!
Do you think we should ask a necko peer to review this patch since this involves necko ipc as well?

::: netwerk/url-classifier/AsyncUrlChannelClassifier.cpp
@@ +39,5 @@
> +// Features are able to classify particular URIs from a channel. For instance,
> +// tracking-annotation feature uses the top-level URI to whitelist the current
> +// channel's URI; flash feature always uses the channel's URI.  Because of
> +// this, this function aggregates feature per URI in an array of FeatureTask
> +// object.

This is very clear!
Attachment #9032141 - Flags: review?(dlee) → review+
(In reply to Andrea Marchesini [:baku] from comment #7)
> > don't need to invoke ::AsyncClassifyLocalWithFeatures or maybe try to return
> > ASAP in ::AsyncClassifyLocalWithFeatures
> 
> Right. Follow up?
> 
I created Bug 1515168 for follow up
Comment on attachment 9032141 [details] [diff] [review]
part 1 - flash features

Valentin, can you take a look at the necko part of this patch?
Attachment #9032141 - Flags: review?(valentin.gosu)
Comment on attachment 9032141 [details] [diff] [review]
part 1 - flash features

Review of attachment 9032141 [details] [diff] [review]:
-----------------------------------------------------------------

necko/ipdl bits look good.
nit: a more detailed description to the patch would be great!
r+ with these comments addressed.

::: dom/ipc/PURLClassifierLocal.ipdl
@@ +6,5 @@
>   * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
>  
>  include protocol PContent;
> +
> +include URIParams;

can you try this instead?
using refcounted class nsIURI from "mozilla/ipc/URIUtils.h";

I have a patch to convert a bunch of URIParams to nsIURI that I want to land soon.

::: netwerk/protocol/http/nsIHttpChannel.idl
@@ +503,5 @@
> +      FlashPluginAllowed = 1,
> +      FlashPluginDenied = 2,
> +
> +      // Keep this equal to the last value.
> +      FlashPluginLastValue = 2,

Can you add a static_assert for this somewhere? It's better than trusting programmers :)
Attachment #9032141 - Flags: review?(valentin.gosu) → review+
Blocks: 1515272
Attached patch part 1 - flash features (obsolete) — Splinter Review
Dimi, can you take a look at how we calculate the 3rd party flag here?
It turned out that nsIDocument::IsThirdParty() does something different than mozIThirdPartyUtils.

https://searchfox.org/mozilla-central/rev/13788edbabb04d004e4a1ceff41d4de68a8320a2/dom/base/nsDocument.cpp#12545-12621

nsIDocument::IsThirdParty() compares document's nsIPrincipals, mozIThirdPartyUtils checks the base-domains.

This means that: http://subdocument.example.com is considered 3rd-party by the former, and first-party by the latter.

We don't have the full chain of documents in the nsIChannel. I can just compare the channel's principal with the top-level principal.
Attachment #9032141 - Attachment is obsolete: true
Attachment #9032417 - Flags: review?(dlee)
qdot, if you can take a look at UrlClassifierFeatureFlash.cpp, "bool Is3rdPartyChannel(nsIChannel* aChannel)". This cannot be equal to nsIDocument::IsThirdParty because we don't have the whole chain of documents, at a channel level, but it's similar. See comment 15. Thanks.
Flags: needinfo?(kyle)
(In reply to Andrea Marchesini [:baku] from comment #15)
> 
> We don't have the full chain of documents in the nsIChannel. I can just
> compare the channel's principal with the top-level principal.

From Bug 1345611, it seems to check if the parent is 3rd party is the requirement.
If we only compare with top level, testcase like this[1] may fail?

[1] https://searchfox.org/mozilla-central/rev/2e5e28f518524f8af5b158ddb605022b6a2d68cf/toolkit/components/url-classifier/tests/browser/classifierTester.js#130
Dimi, do you know why we cannot simply use mozIThirdPartyUtil instead? And fix the test, of course.
Flags: needinfo?(dlee)
Hmm, yeah, this is going to be a problem because our current flash classifier expects to be able to walk the document tree to the top level, so that if we have a document parent/child relationship of a -> b -> a-with-flash, we consider the a-with-flash load denied. I don't believe mozIThirdPartyUtil has a way to pull off that chain checking either.

Unfortunately I'm not sure I have much advice on fixing that, because I'm not real sure what you *do* have access to at the channel level.
Flags: needinfo?(kyle)
(In reply to Andrea Marchesini [:baku] from comment #18)
> Dimi, do you know why we cannot simply use mozIThirdPartyUtil instead? And
> fix the test, of course.

Sorry, I don't :(

So if we can't find a way to get the whole chain thirty party information in the channel level, then I guess we have to:
1. Always include the flashSubDocTable in classification
2. In ::ProcessChannel, have a way to let document know the matched tables or set stats, like "denied_subdocument".
3. Check if it is 3rd party in ::DocumentFlashClassification and return the classification result.
Flags: needinfo?(dlee)
Attached patch part 1 - flash features (obsolete) — Splinter Review
Attachment #9032417 - Attachment is obsolete: true
Attachment #9032417 - Flags: review?(dlee)
Attachment #9032656 - Flags: review?(dlee)
Attached patch part 2 - cleanupSplinter Review
Attachment #9031405 - Attachment is obsolete: true
Attachment #9032658 - Flags: review?(dlee)
Comment on attachment 9032656 [details] [diff] [review]
part 1 - flash features

Review of attachment 9032656 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/url-classifier/UrlClassifierFeatureFlash.cpp
@@ +133,5 @@
> +      return;
> +    }
> +  }
> +
> +  bool is3rdParty = Is3rdPartyChannel(aChannel);

just curious, what if this is the case
domains: ["http://subdocument.example.com", "http://example.com", "http://subdocument.example.com"]

I can't find a scenario like this in testcase[1], should we deny or allow?
If I understand Comment 19 correctly, I guess the answer is deny, so probably we should not use 3rd party check here.

[1] https://searchfox.org/mozilla-central/rev/2e5e28f518524f8af5b158ddb605022b6a2d68cf/toolkit/components/url-classifier/tests/browser/classifierTester.js#130
Hi Kyle, could you help check the question in Comment 23? Thanks!
Flags: needinfo?(kyle)
Comment on attachment 9032658 [details] [diff] [review]
part 2 - cleanup

Review of attachment 9032658 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me!
Attachment #9032658 - Flags: review?(dlee) → review+
Flags: needinfo?(kyle)
Attachment #9032656 - Attachment is obsolete: true
Attachment #9032656 - Flags: review?(dlee)
Attachment #9034377 - Flags: review?(dlee)
Comment on attachment 9034377 [details] [diff] [review]
part 1 - flash features

Review of attachment 9034377 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me!
Attachment #9034377 - Flags: review?(dlee) → review+
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ddd69d7ae1a0
Port flash url-classifier to nsIUrlClassifierFeature - part 1 - Flash feature, r=dimi, r=edgar, r=valentin
https://hg.mozilla.org/integration/mozilla-inbound/rev/b8155ef9f71f
Port flash url-classifier to nsIUrlClassifierFeature - part 2 - Cleanup Flash classification, r=dimi, r=edgar
Yeah, that's correct, I believe it should be denied. Since this has now landed, I suppose will find out if it wasn't via more bugs being filed. :)
(In reply to Kyle Machulis [:qdot] [:kmachulis] (if a patch has no decent commit message, automatic r-) from comment #29)
> Yeah, that's correct, I believe it should be denied. Since this has now
> landed, I suppose will find out if it wasn't via more bugs being filed. :)

Thank you for the information :)
It turned out that I reviewed the older version of the patch, the landed patch did behave the same as your comment.
https://hg.mozilla.org/mozilla-central/rev/ddd69d7ae1a0
https://hg.mozilla.org/mozilla-central/rev/b8155ef9f71f
Status: NEW → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
Blocks: 1513046
Depends on: 1514488
Blocks: 1523274
You need to log in before you can comment on or make changes to this bug.