Closed Bug 1360581 Opened 3 years ago Closed 3 years ago

HTTP channel TP annotation happens too late

Categories

(Core :: Networking: HTTP, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: mayhemer, Assigned: kershaw)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-active])

Attachments

(3 files, 14 obsolete files)

7.23 KB, patch
kershaw
: review+
Details | Diff | Splinter Review
36.96 KB, patch
kershaw
: review+
Details | Diff | Splinter Review
6.03 KB, patch
mayhemer
: review+
Details | Diff | Splinter Review
channel classifier (started from nsHttpChannel::BeginConnect) calls SetIsTrackingResource and SetPriority on the channel too late, when the transaction may already be initiated and put to the http conn manager machinery.  Hence, there is probably no effect at all and the results from WPT might only be a jitter.

Not sure what to do here...  First, I'd like to race opening the cache entry and classification.  There is probably nothing from the cache we would need to classify.  This still can delay going out with transactions that are vital for the page first render and probably not TP listed.

Hence, opening this more for a discussion sake expecting this become just a wontfix or something not quantum-blocking.

I think bug 1358348 and bug 1360580 will mitigate this problem well enough.
Whiteboard: [necko-next]
Are you sure? We do not do speculativeConnect before classifier returns and transaction is created later as well in ContinueConnect.
This somewhat blocks prioritization decisions on bug 1358348 and also completely blocks implementation of bug 1358060 (which is quite important).  

We need to double check how we actually behave here.  I'm afraid, that we get the TP annotation too late if it happens asycnhronously (which I believe it is).  Only response we get from the classifier is Resume() of the channel, nothing more.  Result of the classification doesn't block scheduling of the transaction which may get activated on a connection/session sooner (with the default priority) than we get the annotation, so that the TB-based prioritization may not have any effect at all, specially for h1.

Kershaw, could you please take this as your #1 CDP bug and diagnose how the things actually work here?

If I'm wrong and the annotation comes early enough before the classifier does SetPriority() on the channel, or it's easy to fix w/o perfomance impact, we don't need bug 1358348.  Otherwise, this might be tricky regarding bug 1358060.

Thanks.
Assignee: nobody → kechang
Priority: P2 → P1
Whiteboard: [necko-next] → [necko-active]
Blocks: tailing
> Kershaw, could you please take this as your #1 CDP bug and diagnose how the
> things actually work here?
> 
> If I'm wrong and the annotation comes early enough before the classifier
> does SetPriority() on the channel, or it's easy to fix w/o perfomance
> impact, we don't need bug 1358348.  Otherwise, this might be tricky
> regarding bug 1358060.
> 

I can confirm that sometimes SetPriority is called too late because we call ContinueBeginConnectWithResult() at [1].

I don't think there is an easy fix without performance impact. Need some time to figure out a solution.

[1] http://searchfox.org/mozilla-central/rev/9a7fbdee1d54f99cd548af95b81231d80e5f9ad1/netwerk/protocol/http/nsHttpChannel.cpp#6447
    // mLocalBlocklist is true only if tracking protection is enabled and the
    // URI is a tracking domain

Looks like I was wrong.  Source of this bug is my lack of knowledge how "local blocking" and the classification actually works.

Determining mLocalBlocklist precedes the malware classification and is always async when performed, while performing it depends on a number of conditions (actually, mainly one - Tracking Protection being enabled for the context).  

It also blocks calling BeginConnectActual() which is the method leading immediately to initiating the transactions, hence we have tools to know if the URL is a tracker prior to scheduling the transaction.

I was testing with privacy.trackingprotection.annotate_channels at true (which _is_ the default!) and it makes channelClassifier->ShouldEnableTrackingProtection in nsHttpChannel::InitLocalBlockList return false bypassing resolution of mLocalBlocklist.

Hence, I'm not sure what the code at [1] does and I'm not sure if it's not just misplaced.

Kershaw, you originally wrote that piece [2] in bug 1141814, I reviewed it :')  Are we sure we can't get the "is a tracker" info sooner as part of the mLocalBlocklist resolution when either TP or just annotation is enabled?  Or how this all actually works? :)

CC'ing François that may know even better.


[1] https://dxr.mozilla.org/mozilla-central/rev/85e5d15c31691c89b82d6068c26260416493071f/netwerk/base/nsChannelClassifier.cpp#935
[2] https://hg.mozilla.org/mozilla-central/annotate/fb18b507a912/netwerk/base/nsChannelClassifier.cpp#l692
(In reply to Honza Bambas (:mayhemer) from comment #4)
>     // mLocalBlocklist is true only if tracking protection is enabled and the
>     // URI is a tracking domain
> 
> Looks like I was wrong.  Source of this bug is my lack of knowledge how
> "local blocking" and the classification actually works.
> 
> Determining mLocalBlocklist precedes the malware classification and is
> always async when performed, while performing it depends on a number of
> conditions (actually, mainly one - Tracking Protection being enabled for the
> context).  
> 
> It also blocks calling BeginConnectActual() which is the method leading
> immediately to initiating the transactions, hence we have tools to know if
> the URL is a tracker prior to scheduling the transaction.
> 
> I was testing with privacy.trackingprotection.annotate_channels at true
> (which _is_ the default!) and it makes
> channelClassifier->ShouldEnableTrackingProtection in
> nsHttpChannel::InitLocalBlockList return false bypassing resolution of
> mLocalBlocklist.
> 

"privacy.trackingprotection.annotate_channels" has nothing to do with ShouldEnableTrackingProtection.
It only affects two things:
1. Whether to call HttpBaseChannel::SetIsTrackingResource.
2. Always use TP table to classify an URI when calling nsUrlClassifierDBService::Classify.

> Hence, I'm not sure what the code at [1] does and I'm not sure if it's not
> just misplaced.
> 

The code at [1] is triggered by calling nsChannelClassifier::Start at [3]. Unfortunately, it could happen too late when mLocalBlocklist is false. 

> Kershaw, you originally wrote that piece [2] in bug 1141814, I reviewed it
> :')  Are we sure we can't get the "is a tracker" info sooner as part of the
> mLocalBlocklist resolution when either TP or just annotation is enabled?  Or
> how this all actually works? :)
> 

The only thing that could change the value of mLocalBlocklist is whether TP is enabled or not.
So, the current behavior can be summarized below:
1. When TP is enabled:
 - nsUrlClassifierDBService::AsyncClassifyLocalWithTables will be called at [4].
 - mLocalBlocklist could be set to true at [5].
 - If mLocalBlocklist is true, ContinueBeginConnectWithResult will be called until we get the result from channel classifer.

2. When TP is disabled:
 - mLocalBlocklist is always false.
 - ContinueBeginConnectWithResult could be called before the code at [1], so SetPriority may have no effect.

> CC'ing François that may know even better.
> 
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/rev/
> 85e5d15c31691c89b82d6068c26260416493071f/netwerk/base/nsChannelClassifier.
> cpp#935
> [2]
> https://hg.mozilla.org/mozilla-central/annotate/fb18b507a912/netwerk/base/
> nsChannelClassifier.cpp#l692

[3] http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/netwerk/protocol/http/nsHttpChannel.cpp#6466
[4] http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/netwerk/protocol/http/nsHttpChannel.cpp#6121
[5] http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/netwerk/protocol/http/nsHttpChannel.cpp#6391
Kershaw, thanks for the summary, but it doesn't shade more light for me on this.  Few questions I still don't get answered (sorry for my ignorance):

- is mLocalBlocklist enough (or just that) to find out whether the resource is being loaded from a tracker domain? (aka "is a tracker")
- what exactly the "channel annotation" does?  I though it was introduced to determine whether a resources is a tracker even though TP was off, so that we can change priorities etc ; it seems to me that the annotation enables the malware blacklisting though (aka "more than we need to know"), so I'm quite confused
- if the answer to the first question is positive, how complicated would it be to enable determination of mLocalBlocklist when only channel annotation is enabled?

Francois, maybe you can shade some light here?  Thanks.
Flags: needinfo?(kechang)
Flags: needinfo?(francois)
> - is mLocalBlocklist enough (or just that) to find out whether the
>   resource is being loaded from a tracker domain? (aka "is a tracker")

Close, but not exactly. For a resource to be considered a tracker, it needs
to be both:

- on the local blacklist (i.e. mLocalBlocklist), and
- not on the local entity whitelist (which whitelists trackers that are
  first-party with the top-level page)

For the purposes of delaying BeginConnect(), we take a shortcut and only
consider the blacklist.

Based on that, we disable early DNS resolution and speculative connections:

https://searchfox.org/mozilla-central/rev/02cae69058d41e2c6b635edccbec91a6ca2cb57f/netwerk/protocol/http/nsHttpChannel.cpp#6420
https://searchfox.org/mozilla-central/rev/02cae69058d41e2c6b635edccbec91a6ca2cb57f/netwerk/protocol/http/nsHttpChannel.cpp#730

> - what exactly the "channel annotation" does? I though it was introduced
>   to determine whether a resources is a tracker even though TP was off, so
>   that we can change priorities etc ; it seems to me that the annotation
>   enables the malware blacklisting though (aka "more than we need to
>   know"), so I'm quite confused

The Classify() call (which may cause blocking remote lookups) checks all of
the lists that are enabled (incl. malware/phishing).

The call to AsyncClassifyLocalWithTables() on the other hand, which is
entirely local, only uses the tracking lists:

https://searchfox.org/mozilla-central/rev/02cae69058d41e2c6b635edccbec91a6ca2cb57f/netwerk/protocol/http/nsHttpChannel.cpp#6116

and is what we use to set mLocalBlocklist and delay BeginConnect().

> - if the answer to the first question is positive, how complicated would
>   it be to enable determination of mLocalBlocklist when only channel
>   annotation is enabled?

I'm not 100% sure, but it could be as simple as adding a check for
privacy.trackingprotection.annotate_channels here:

http://searchfox.org/mozilla-central/rev/02cae69058d41e2c6b635edccbec91a6ca2cb57f/netwerk/protocol/http/nsHttpChannel.cpp#6100
Flags: needinfo?(francois)
(In reply to François Marier [:francois] from comment #7)
> > - is mLocalBlocklist enough (or just that) to find out whether the
> >   resource is being loaded from a tracker domain? (aka "is a tracker")
> 
> Close, but not exactly. For a resource to be considered a tracker, it needs
> to be both:
> 
> - on the local blacklist (i.e. mLocalBlocklist), and
> - not on the local entity whitelist (which whitelists trackers that are
>   first-party with the top-level page)
> 

Current implementation of channel annotation only calls nsUrlClassifierDBService::Classify with the argument aTrackingProtectionEnabled set to true.

http://searchfox.org/mozilla-central/rev/79f625ac56d936ef5d17ebe13a123b25efee4241/netwerk/base/nsChannelClassifier.cpp#510

Does this mean this not enough to tell if a resource is a tracker?
Do we need to call AsyncClassifyLocalWithTables as well?

Francois, could you answer my two questions above? Thanks.

> > - if the answer to the first question is positive, how complicated would
> >   it be to enable determination of mLocalBlocklist when only channel
> >   annotation is enabled?
> 
> I'm not 100% sure, but it could be as simple as adding a check for
> privacy.trackingprotection.annotate_channels here:
> 
> http://searchfox.org/mozilla-central/rev/
> 02cae69058d41e2c6b635edccbec91a6ca2cb57f/netwerk/protocol/http/nsHttpChannel.
> cpp#6100

My original code in bug 1141814 did this:
https://hg.mozilla.org/mozilla-central/diff/fb18b507a912/netwerk/protocol/http/nsHttpChannel.cpp#l1.22

However, it has been removed by Ehsan due to a performance regression. Please see bug 1324053.
Flags: needinfo?(kechang) → needinfo?(francois)
Flags: needinfo?(francois)
I've had a discussion in person with Francois today, and I'd like to summarize some things below.

1. Current channel annotation is based on calling nsUrlClassifierDBService::Classify. This is able to tell whether a resource is a tracker because the local blacklist and whitelist are both checked. However, the moment we get the result from classifier is too late - the transaction could be already dispatched.

2. mLocalBlocklist is not 100% enough to know if its a tracker because we don't check the whitelist.

Based on two points above, I think using mLocalBlocklist could be an option to annotate channels a bit earlier with some changes below:
 - Adding a check for privacy.trackingprotection.annotate_channels at [1]
 - When TP is disabled and mLocalBlocklist is true, don't disable speculative connection and DNS prefetching.
 - Perhaps check whitelist at [2] when we get the result.

Honza, what do you think?

[1] http://searchfox.org/mozilla-central/rev/02cae69058d41e2c6b635edccbec91a6ca2cb57f/netwerk/protocol/http/nsHttpChannel.cpp#6100
[2] http://searchfox.org/mozilla-central/rev/2933592c4a01b634ab53315ce2d0e43fccb82181/netwerk/protocol/http/nsHttpChannel.cpp#6074
Flags: needinfo?(honzab.moz)
(In reply to Kershaw Chang [:kershaw] from comment #9)
> I've had a discussion in person with Francois today, and I'd like to
> summarize some things below.
> 
> 1. Current channel annotation is based on calling
> nsUrlClassifierDBService::Classify. This is able to tell whether a resource
> is a tracker because the local blacklist and whitelist are both checked.
> However, the moment we get the result from classifier is too late - the
> transaction could be already dispatched.
> 
> 2. mLocalBlocklist is not 100% enough to know if its a tracker because we
> don't check the whitelist.
> 
> Based on two points above, I think using mLocalBlocklist could be an option
> to annotate channels a bit earlier with some changes below:
>  - Adding a check for privacy.trackingprotection.annotate_channels at [1]
>  - When TP is disabled and mLocalBlocklist is true, don't disable
> speculative connection and DNS prefetching.
>  - Perhaps check whitelist at [2] when we get the result.
> 
> Honza, what do you think?
> 
> [1]
> http://searchfox.org/mozilla-central/rev/
> 02cae69058d41e2c6b635edccbec91a6ca2cb57f/netwerk/protocol/http/nsHttpChannel.
> cpp#6100
> [2]
> http://searchfox.org/mozilla-central/rev/
> 2933592c4a01b634ab53315ce2d0e43fccb82181/netwerk/protocol/http/nsHttpChannel.
> cpp#6074

Kershaw, good work!  Yes, if whitelist checking is fast (relatively cheap in-memory only operation) then yes, I agree with all the points.

Are you willing to write a patch?
Flags: needinfo?(honzab.moz)
Blocks: 1360580
> Are you willing to write a patch?

Sure thing. I am working on it.
(In reply to Kershaw Chang [:kershaw] from comment #9)
>  - Perhaps check whitelist at [2] when we get the result.

This is the function in which the whitelist is checked:

https://searchfox.org/mozilla-central/rev/a14524a72de6f4ff738a5e784970f0730cea03d8/netwerk/base/nsChannelClassifier.cpp#829
Attached patch WIP (obsolete) — Splinter Review
Please see commit message for details.

This patch still needs some cleanup and a full test, so only ask for feedback.
Thanks.
Attachment #8872078 - Flags: feedback?(honzab.moz)
Comment on attachment 8872078 [details] [diff] [review]
WIP

Review of attachment 8872078 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!  good work.  sorry for the delay.

(going to test this locally as well with the new throttling patch)

::: netwerk/base/nsChannelClassifier.cpp
@@ -73,5 @@
>  
>    void Init();
>    bool IsAllowListExample() { return sAllowListExample;}
> -  bool IsLowerNetworkPriority() { return sLowerNetworkPriority;}
> -  bool IsAnnotateChannelEnabled() { return sAnnotateChannelEnabled;}

could these rather be made static and public methods than moving the pref access to nshttpchannel.cpp?  but only if that is no too much work!

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +6178,5 @@
> +                                                   const nsACString& aLists,
> +                                                   const nsACString& aProvider,
> +                                                   const nsACString& aPrefix)
> +{
> +    // If the URI is not a tracker, call BeginConnectActual() right away.

on which thread is this called?

I presume on main thread.  does it make sense (in a followup bug please!) to do both checks (is in local black list/is whitelisted) in parallel to speed things up somehow?

::: netwerk/protocol/http/nsHttpChannel.h
@@ +705,5 @@
>      // Will be true if the onCacheEntryAvailable callback is not called by the
>      // time we send the network request
>      Atomic<bool> mRaceCacheWithNetwork;
>  
> +    bool mTrackingProtectionEnabled = false;

can be a bit field?
Attachment #8872078 - Flags: feedback?(honzab.moz) → feedback+
(In reply to Honza Bambas (:mayhemer) from comment #14)
> Comment on attachment 8872078 [details] [diff] [review]
> WIP
> 
> Review of attachment 8872078 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks!  good work.  sorry for the delay.
> 
> (going to test this locally as well with the new throttling patch)
> 
> ::: netwerk/base/nsChannelClassifier.cpp
> @@ -73,5 @@
> >  
> >    void Init();
> >    bool IsAllowListExample() { return sAllowListExample;}
> > -  bool IsLowerNetworkPriority() { return sLowerNetworkPriority;}
> > -  bool IsAnnotateChannelEnabled() { return sAnnotateChannelEnabled;}
> 
> could these rather be made static and public methods than moving the pref
> access to nshttpchannel.cpp?  but only if that is no too much work!
> 

Actually, I am thinking not to remove these code in nsChannelClassifier.cpp, but moving the logic of channel annotation from nsHttpChannel.cpp to nsChannelClassifier.cpp in my next patch. By this we can make the code in nsHttpChannel.cpp a bit clear.



> ::: netwerk/protocol/http/nsHttpChannel.cpp
> @@ +6178,5 @@
> > +                                                   const nsACString& aLists,
> > +                                                   const nsACString& aProvider,
> > +                                                   const nsACString& aPrefix)
> > +{
> > +    // If the URI is not a tracker, call BeginConnectActual() right away.
> 
> on which thread is this called?
> 
> I presume on main thread.  does it make sense (in a followup bug please!) to
> do both checks (is in local black list/is whitelisted) in parallel to speed
> things up somehow?
> 

I am not sure, but this is definitely worth a followup bug for further discussion.

> ::: netwerk/protocol/http/nsHttpChannel.h
> @@ +705,5 @@
> >      // Will be true if the onCacheEntryAvailable callback is not called by the
> >      // time we send the network request
> >      Atomic<bool> mRaceCacheWithNetwork;
> >  
> > +    bool mTrackingProtectionEnabled = false;
> 
> can be a bit field?

If I move the code back to nsChannelClassifier.cpp, I can reuse nsChannelClassifier::mTrackingProtectionEnabled. So, there is no need to save this in nsHttpChannel.h.
Attached patch WIP - v2 (obsolete) — Splinter Review
Summary:
 - Move the annotation channel code back to nsChannelClassifier.cpp.
 - Reuse the current code in nsChannelClassifier.cpp as much as possible.

I think this should be close to final version, but I didn't test it thoroughly.
One thing that I need your advice, do we want to also check whitelist when TP is enabled and channel annotation is disabled? I am not sure we want to change the current behavior. Please note in this patch I assume we want to also check whitelist when TP is enabled.
Attachment #8872078 - Attachment is obsolete: true
Attachment #8873095 - Flags: feedback?(honzab.moz)
(In reply to Kershaw Chang [:kershaw] from comment #16)
> Created attachment 8873095 [details] [diff] [review]
> WIP - v2
> 
> Summary:
>  - Move the annotation channel code back to nsChannelClassifier.cpp.
>  - Reuse the current code in nsChannelClassifier.cpp as much as possible.
> 
> I think this should be close to final version, but I didn't test it
> thoroughly.
> One thing that I need your advice, do we want to also check whitelist when
> TP is enabled and channel annotation is disabled?

I'm probably not the right person to ask.  I don't know the classifier code and its usage that much (if at all) to answer that.  But probably we should?

Francois is the one to ask about this.

> I am not sure we want to
> change the current behavior. 

Likely not :)  Only purpose here is to get necessary tracking info when (!tp && annotate) before BeginConnectActual, nothing else.

> Please note in this patch I assume we want to
> also check whitelist when TP is enabled.

I really can't deduce of the patch if you do or not :D
Comment on attachment 8873095 [details] [diff] [review]
WIP - v2

Review of attachment 8873095 [details] [diff] [review]:
-----------------------------------------------------------------

I can't give much feedback on the classifier changes, Francois is the one to ask.  The http changes seems OK!  Thanks!

::: netwerk/base/nsChannelClassifier.cpp
@@ +223,5 @@
>  {
> +  if (mTrackingProtectionEnabled) {
> +    *result = mTrackingProtectionEnabled.value();
> +    return NS_OK;
> +  }

don't understand this change

@@ +856,5 @@
> +           "in whitelist so we won't block it", mClosure.get()));
> +      rv = NS_OK;
> +    }
> +
> +    return mClosure->OnClassifyCompleteInternal(rv, mList, mProvider, mPrefix);

you probably want to throw the closure away

::: netwerk/base/nsChannelClassifier.h
@@ +49,5 @@
> +    // Returning NS_OK means the check will be processed
> +    // and the caller should wait for the result.
> +    nsresult CheckIsTrackerWithLocalTable(nsIURIClassifierCallback* aCallback);
> +
> +    already_AddRefed<nsIURI> CreateWhiteListURI() const;

a comment needed.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +6113,5 @@
>      // We skip speculative connections by setting mLocalBlocklist only
>      // when tracking protection is enabled. Though we could do this for
>      // both phishing and malware, it is not necessary for correctness,
>      // since no network events will be received while the
>      // nsChannelClassifier is in progress. See bug 1122691.

is the comment outdated?  looks like we now do this (set the local block list) all the time and prefs to annotate/tp are tested inside the classifier, right?

@@ -6112,5 @@
> -        return false;
> -    }
> -
> -    nsAutoCString tables;
> -    Preferences::GetCString("urlclassifier.trackingTable", &tables);

thanks for removing this ! :)
Attachment #8873095 - Flags: feedback?(honzab.moz) → feedback+
(In reply to Honza Bambas (:mayhemer) from comment #18)
> Comment on attachment 8873095 [details] [diff] [review]
> WIP - v2
> 
> Review of attachment 8873095 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I can't give much feedback on the classifier changes, Francois is the one to
> ask.  The http changes seems OK!  Thanks!
> 

Thanks for the feedback. I'll ask Francois to take a look.

> ::: netwerk/base/nsChannelClassifier.cpp
> @@ +223,5 @@
> >  {
> > +  if (mTrackingProtectionEnabled) {
> > +    *result = mTrackingProtectionEnabled.value();
> > +    return NS_OK;
> > +  }
> 
> don't understand this change
> 

This is because I don't want to call ShouldEnableTrackingProtectionInternal again. If mTrackingProtectionEnabled already has a value, just return it.

> @@ +856,5 @@
> > +           "in whitelist so we won't block it", mClosure.get()));
> > +      rv = NS_OK;
> > +    }
> > +
> > +    return mClosure->OnClassifyCompleteInternal(rv, mList, mProvider, mPrefix);
> 
> you probably want to throw the closure away
> 

I think this is fine since IsTrackerWhitelistedCallback object will be gone after calling OnClassifyCompleteInternal.

> ::: netwerk/base/nsChannelClassifier.h
> @@ +49,5 @@
> > +    // Returning NS_OK means the check will be processed
> > +    // and the caller should wait for the result.
> > +    nsresult CheckIsTrackerWithLocalTable(nsIURIClassifierCallback* aCallback);
> > +
> > +    already_AddRefed<nsIURI> CreateWhiteListURI() const;
> 
> a comment needed.
> 

Got it. Thanks.

> ::: netwerk/protocol/http/nsHttpChannel.cpp
> @@ +6113,5 @@
> >      // We skip speculative connections by setting mLocalBlocklist only
> >      // when tracking protection is enabled. Though we could do this for
> >      // both phishing and malware, it is not necessary for correctness,
> >      // since no network events will be received while the
> >      // nsChannelClassifier is in progress. See bug 1122691.
> 
> is the comment outdated?  looks like we now do this (set the local block
> list) all the time and prefs to annotate/tp are tested inside the
> classifier, right?
> 

Yes, I'll update it.
(In reply to Kershaw Chang [:kershaw] from comment #19)
> > > +    return mClosure->OnClassifyCompleteInternal(rv, mList, mProvider, mPrefix);
> > 
> > you probably want to throw the closure away
> > 
> 
> I think this is fine since IsTrackerWhitelistedCallback object will be gone
> after calling OnClassifyCompleteInternal.

General rule: when you don't need something, throw it away immediately.  It's better to prevent leaks this way.  You never know if there is or isn't a cycle.
(In reply to Honza Bambas (:mayhemer) from comment #17)
> (In reply to Kershaw Chang [:kershaw] from comment #16)
> > Created attachment 8873095 [details] [diff] [review]
> > WIP - v2
> > 
> > Summary:
> >  - Move the annotation channel code back to nsChannelClassifier.cpp.
> >  - Reuse the current code in nsChannelClassifier.cpp as much as possible.
> > 
> > I think this should be close to final version, but I didn't test it
> > thoroughly.
> > One thing that I need your advice, do we want to also check whitelist when
> > TP is enabled and channel annotation is disabled?
> 
> I'm probably not the right person to ask.  I don't know the classifier code
> and its usage that much (if at all) to answer that.  But probably we should?
> 
> Francois is the one to ask about this.
> 
> > I am not sure we want to
> > change the current behavior. 
> 
> Likely not :)  Only purpose here is to get necessary tracking info when (!tp
> && annotate) before BeginConnectActual, nothing else.
> 

I think we should not check whitelist here when TP is enabled, because the whitelist will be checked again at [1].
I'll update the patch later.

[1] http://searchfox.org/mozilla-central/rev/1a0d9545b9805f50a70de703a3c04fc0d22e3839/netwerk/base/nsChannelClassifier.cpp#896
Comment on attachment 8873095 [details] [diff] [review]
WIP - v2

Review of attachment 8873095 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +378,5 @@
>      nsresult rv;
>  
> +    // We don't need channel classifier at this point.
> +    mChannelClassifier = nullptr;
> +

If this channel was canceled before calling connect, we have no chance to release |mChannelClassifier|.

I think the best way to break the ref cycle is using weak ptr here. I'll update a patch to do this later.
Summary:
 - Make nsChannelClassifier support weak ptr
 - Save nsChannelClassifier's weak ptr in nsHttpChannel
 - Create nsChannelClassifier if the weak ptr is null
Attachment #8873095 - Attachment is obsolete: true
Attachment #8873369 - Flags: review?(honzab.moz)
Hi Francois,

Could you take a look at this patch? Details could be found in commit message.
Thanks.
Attachment #8873371 - Flags: review?(francois)
Comment on attachment 8873369 [details] [diff] [review]
Part1: Avoid creating channel classifier twice

Review of attachment 8873369 [details] [diff] [review]:
-----------------------------------------------------------------

I think you wanted to submit a different patch?
Attachment #8873369 - Flags: review?(honzab.moz)
Summary:
 - Remove weakptr of channel classifier
 - Make ReleaseListeners() virtual and override it in nsHttpChannel
 - Release mChannelClassifier in ReleaseListeners()

Thanks for your suggestion, Honza. Releasing mChannelClassifier in ReleaseListeners() works to solve ref count cycle problem.

Please take a look at this patch.
Attachment #8873369 - Attachment is obsolete: true
Attachment #8873420 - Flags: review?(honzab.moz)
Update the patch since there is no need for nsChannelClisifier to support WeakPtr.
Attachment #8873371 - Attachment is obsolete: true
Attachment #8873371 - Flags: review?(francois)
Attachment #8873424 - Flags: review?(francois)
Comment on attachment 8873420 [details] [diff] [review]
Part1: Avoid creating channel classifier twice, v2

Review of attachment 8873420 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpChannel.h
@@ +711,5 @@
>  
>  protected:
>      virtual void DoNotifyListenerCleanup() override;
>  
> +    // Override ReleaseListeners() because mChannelClassifier is only existed

mChannelClassifier only exists
Attachment #8873420 - Flags: review?(honzab.moz) → review+
Kershaw, could you please add logs like:

"nsHttpChannel %p created nsChannelClassifier &p"

so that it's traceable how stuff is interconnected?  thanks.
(In reply to Honza Bambas (:mayhemer) from comment #29)
> Kershaw, could you please add logs like:
> 
> "nsHttpChannel %p created nsChannelClassifier &p"
> 
> so that it's traceable how stuff is interconnected?  thanks.

Sure. I'll do it.
Comment on attachment 8873424 [details] [diff] [review]
Part2: Add new API CheckIsTrackerWithLocalTable to check again local blacklist

Review of attachment 8873424 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/nsChannelClassifier.cpp
@@ +919,5 @@
> +       " status=0x%x, tpEnabled=%d",
> +       mChannelClassifier.get(), status, tpEnabled));
> +
> +  // If this is not in local blacklist or tracking protection is enabled,
> +  // directly sent the status back.

"send"

nit: it may be worth mentioning that the reason we don't care when TP is enabled is because it will handled in the call to Classify().

@@ +940,5 @@
> +  nsCOMPtr<nsIURIClassifierCallback> callback =
> +    new IsTrackerWhitelistedCallback<IsTrackerBlacklistedCallback>(
> +      this, aLists, aProvider, aPrefix, whitelistURI);
> +
> +  // If IsTrackerWhitelisted is failed, it means the uri is not in whitelist.

"has failed"

@@ +943,5 @@
> +
> +  // If IsTrackerWhitelisted is failed, it means the uri is not in whitelist.
> +  if (NS_FAILED(mChannelClassifier->IsTrackerWhitelisted(whitelistURI, callback))) {
> +    LOG(("IsTrackerBlacklistedCallback[%p]:OnClassifyComplete channel [%p] "
> +         "IsTrackerWhitelisted is failed.",

ditto
Attachment #8873424 - Flags: review?(francois) → review+
Kershaw, is there anything preventing you to land this TODAY?  It's exactly one week to merge, hence this is the due date for new features.
Flags: needinfo?(kechang)
(In reply to Honza Bambas (:mayhemer) from comment #32)
> Kershaw, is there anything preventing you to land this TODAY?  It's exactly
> one week to merge, hence this is the due date for new features.

I am waiting the final result on try.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a696d876de4c64c8d051970e38ff0de0d5a27024

It looks fine, so I'm going to land it.
Flags: needinfo?(kechang)
Carry r+.
Attachment #8873420 - Attachment is obsolete: true
Attachment #8873424 - Attachment is obsolete: true
Attachment #8874757 - Flags: review+
Correct the previous patch.
Attachment #8874757 - Attachment is obsolete: true
Attachment #8874759 - Flags: review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4d81e7dfd020
Part 1: Avoid nsChannelClassifier being created twice. r=mayhemer
https://hg.mozilla.org/integration/mozilla-inbound/rev/d1c5998858a3
Part 2: Do channel annotation a bit earlier. r=francois
Keywords: checkin-needed
Depends on: 1370583
(In reply to Wes Kocher (:KWierso) from comment #38)
> I had to back these out for asan failures in
> test_workerupdatefoundevent.html like
> https://treeherder.mozilla.org/logviewer.html#?job_id=104970773&repo=mozilla-
> inbound
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/
> 78aa5a83314a9d35fe03c8f7ab552ccfa81e1190

It seems this failure is not caused by my patch.

I also see the same failures [1] at this push [2]. A simple one line modification also causes this failure, so maybe the culprit is at somewhere else?

[1] https://treeherder.mozilla.org/logviewer.html#?job_id=104964934&repo=mozilla-inbound&lineNumber=4219
[2] https://hg.mozilla.org/integration/mozilla-inbound/rev/1b21c086b9530c5e36cd0b9b7251541967cce58d
Flags: needinfo?(kechang)
(In reply to Wes Kocher (:KWierso) from comment #40)
> It definitely appears to have started with your landing and stopped with the
> backout:
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-
> searchStr=ea202f6d70b69a8e8d18cd211ce48e611620bf1d&fromchange=ee3c032059e517c
> f3d25b80c835dc45e2711b504&group_state=expanded&tochange=5801aa478de12a62b2b29
> 82659e787fcc4268d67
> 
> Your [2] link is prior to my backout, so your code was still landed at that
> point.

Thanks for correcting me. Apology for not checking the order of patches.
The test failure mentioned in comment #38 seems to be caused by spending some extra time before starting http connection.

In part 2 patch, AsyncClassifyLocalWithTables is called twice for checking both blacklist and whitelist and BeginConnectActual will be called until get the result from classifier.
Based on the try push [1], I found that if I only call AsyncClassifyLocalWithTables once and not check whitelist, this failure is gone.

Moreover, by only checking blacklist we should be able to alleviate the performance regression mentioned in bug 1370583 from 8.x% to 4.x% since AsyncClassifyLocalWithTables is invoked once.

Honza, what do you think? I think this could be the simplest workaround for now.



[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=19129fe2292ff396baa7c5472549d520c94c8653&group_state=expanded
Flags: needinfo?(honzab.moz)
(In reply to Kershaw Chang [:kershaw] from comment #42)
> The test failure mentioned in comment #38 seems to be caused by spending
> some extra time before starting http connection.
> 
> In part 2 patch, AsyncClassifyLocalWithTables is called twice for checking
> both blacklist and whitelist and BeginConnectActual will be called until get
> the result from classifier.
> Based on the try push [1], I found that if I only call
> AsyncClassifyLocalWithTables once and not check whitelist, this failure is
> gone.

Isn't this a revert to the original behavior?  If not, will it fulfill the purpose of the bug?  What is actually AsyncClassifyLocalWithTables doing that it takes so long?

> 
> Moreover, by only checking blacklist we should be able to alleviate the
> performance regression mentioned in bug 1370583 from 8.x% to 4.x% since
> AsyncClassifyLocalWithTables is invoked once.
> 
> Honza, what do you think? I think this could be the simplest workaround for
> now.

Can't say until I get an answer to the question above.

> 
> 
> 
> [1]
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=19129fe2292ff396baa7c5472549d520c94c8653&group_state=e
> xpanded
Flags: needinfo?(honzab.moz)
(In reply to Honza Bambas (:mayhemer) from comment #43)
> (In reply to Kershaw Chang [:kershaw] from comment #42)
> > The test failure mentioned in comment #38 seems to be caused by spending
> > some extra time before starting http connection.
> > 
> > In part 2 patch, AsyncClassifyLocalWithTables is called twice for checking
> > both blacklist and whitelist and BeginConnectActual will be called until get
> > the result from classifier.
> > Based on the try push [1], I found that if I only call
> > AsyncClassifyLocalWithTables once and not check whitelist, this failure is
> > gone.
> 
> Isn't this a revert to the original behavior?  If not, will it fulfill the
> purpose of the bug?  What is actually AsyncClassifyLocalWithTables doing
> that it takes so long?
> 

No, it's not a revert. BeginConnectActual will not be called until we get the result of whether the uri is in blacklist.
I am proposing to only check blacklist for saving some time.

AsyncClassifyLocalWithTables itself doesn't take too much time. What really expensive is the period waiting the result.
In our current code [1] and in part2 patch, we do channel annotation for every http channel when privacy.trackingprotection.annotate_channels is on.
I think this could be wrong because we actually don't have to check every uri.
Looking into the code in ShouldEnableTrackingProtectionInternal [2], we can skip the TP check for for first party and top-level loads. That means we should apply the same rule for channel annotation? If yes, we should add a check for privacy.trackingprotection.annotate_channels at [3], not [1].

Francois, could you correct me if I am wrong? Thanks.


@Honza, if I am right, we can skip a lot of unnecessary checks and the performance regression would be much less noticeable.


[1] http://searchfox.org/mozilla-central/rev/1a054419976437d0778a2b89be1b00207a744e94/netwerk/base/nsChannelClassifier.cpp#515-516
[2] http://searchfox.org/mozilla-central/rev/1a054419976437d0778a2b89be1b00207a744e94/netwerk/base/nsChannelClassifier.cpp#241-249
[3] http://searchfox.org/mozilla-central/rev/1a054419976437d0778a2b89be1b00207a744e94/netwerk/base/nsChannelClassifier.cpp#202
Flags: needinfo?(francois)
(In reply to Kershaw Chang [:kershaw] from comment #45)
> Looking into the code in ShouldEnableTrackingProtectionInternal [2], we can
> skip the TP check for for first party and top-level loads. That means we
> should apply the same rule for channel annotation?

That's correct. We only flag third-party tracking. We're not interested in first-party tracking.
Flags: needinfo?(francois)
Summary:
 - Add a new function nsChannelClassifier::ShouldEnableChannelAnnotation to see if channel annotation is enabled.
Attachment #8874758 - Attachment is obsolete: true
Attachment #8876073 - Flags: review?(francois)
Attached patch interdiff for part2 v3 and v2 (obsolete) — Splinter Review
Just for reference.
Summary:
 - Remove readonly of topWindowURI
 - Since the uri classifier needs topWindowURI to decide whether or not to enable channel annotation, we have to allow to change this attribute in js for passing the unit test.
Attachment #8876077 - Flags: review?(honzab.moz)
Comment on attachment 8876077 [details] [diff] [review]
Part3: Add setter for nsIHttpChannelInternal::topWindowURI

Review of attachment 8876077 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpBaseChannel.cpp
@@ +2209,5 @@
>  
>  NS_IMETHODIMP
> +HttpBaseChannel::SetTopWindowURI(nsIURI *aTopWindowURI)
> +{
> +  mTopWindowURI = aTopWindowURI;

I would introduce a method for conditionally setting this, rather than making the attribute r/w.  Name it "setTopWindowURIIfUnknown".

The method then should (to make this more safe):
- when mTopWindowURI is already set, let it fail and not modify the uri
- when mTopWindowURI is null, try to get it with [1]
- when mTopWindowURI was found (there is a window) let this fail and do nothing
- only when all this has passed, change the URI.

mention what this actually does in the IDL comment, of course


[1] https://dxr.mozilla.org/mozilla-central/rev/e61060be36424240058f8bef4c5597f401bc8b7e/netwerk/protocol/http/HttpBaseChannel.cpp#2211
Attachment #8876077 - Flags: review?(honzab.moz) → review-
Comment on attachment 8876073 [details] [diff] [review]
Part2: Add new API CheckIsTrackerWithLocalTable to check again local blacklist, v3

Review of attachment 8876073 [details] [diff] [review]:
-----------------------------------------------------------------

Given that channel annotation is now exactly the same as what tracking protection would do (only first-party, only interested in the tracking lists), maybe we should call is mTrackingAnnotationEnabled?

To speed things up further, we could add a flag to ShouldEnableTrackingProtection() and skip the permission manager checks when we're looking for annotations only: http://searchfox.org/mozilla-central/rev/a798ee4fc323f9387b7576dbed177859d29d09b7/netwerk/base/nsChannelClassifier.cpp#255-317

The permission manager is the way that a user can manually unblock a site that's broken with TP. Annotations won't break anything so these exceptions aren't really useful. The same argument goes for exempting add-ons: http://searchfox.org/mozilla-central/rev/a798ee4fc323f9387b7576dbed177859d29d09b7/netwerk/base/nsChannelClassifier.cpp#251-253

Also, the part about updating the security state and firing a securityUI event doesn't make sense for annotations: http://searchfox.org/mozilla-central/rev/a798ee4fc323f9387b7576dbed177859d29d09b7/netwerk/base/nsChannelClassifier.cpp#319-336

So, I think you can return just before the AddonMayLoad() check in the case of annotations.
Attachment #8876073 - Flags: review?(francois)
Summary:
 - Rename mChannelAnnotationEnabled to mTrackingAnnotationEnabled
 - Add a new parameter aUseTrackingProtection in ShouldEnableTrackingProtectionInternal().

Thanks.
Attachment #8876073 - Attachment is obsolete: true
Attachment #8876074 - Attachment is obsolete: true
Attachment #8876636 - Flags: review?(francois)
Summary:
 - Fix the previous comment.
 - Add new function setTopWindowURIIfUnknown in nsIHttpChannelInternal

Thanks.
Attachment #8876077 - Attachment is obsolete: true
Attachment #8876637 - Flags: review?(honzab.moz)
Update the correct patch.
Attachment #8876636 - Attachment is obsolete: true
Attachment #8876636 - Flags: review?(francois)
Attachment #8876680 - Flags: review?(francois)
Comment on attachment 8876637 [details] [diff] [review]
Part3: Add setTopWindowURIIfUnknown in  nsIHttpChannelInternal

Review of attachment 8876637 [details] [diff] [review]:
-----------------------------------------------------------------

yes, just few details need polishing first.

::: netwerk/protocol/http/HttpBaseChannel.cpp
@@ +2215,5 @@
> +  }
> +
> +  if (mTopWindowURI) {
> +    return NS_ERROR_FAILURE;
> +  }

add LOG()s also to both the returns

@@ +2220,5 @@
> +
> +  // Don't modify |mTopWindowURI| if we can get one from GetTopWindowURI().
> +  nsCOMPtr<nsIURI> uri;
> +  if (NS_SUCCEEDED(GetTopWindowURI(getter_AddRefs(uri)))) {
> +    return NS_ERROR_FAILURE;

this is wrong.  look into the code of GetTopWindowURI how it works.
Attachment #8876637 - Flags: review?(honzab.moz) → feedback+
Comment on attachment 8876680 [details] [diff] [review]
Part2: Add new API CheckIsTrackerWithLocalTable to check again local blacklist, v4

Review of attachment 8876680 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with the changes below

::: netwerk/base/nsChannelClassifier.cpp
@@ +278,5 @@
>  
>  nsresult
> +nsChannelClassifier::ShouldEnableTrackingProtectionInternal(
> +                                                   nsIChannel *aChannel,
> +                                                   bool aUseTrackingProtection,

To make this flag clearer, I would suggest reversing the logic and calling this flag "aAnnotationsOnly".

@@ +332,5 @@
>        }
>        return NS_OK;
>      }
>  
> +    // Channel annotation only cares whether this is a third-party load or not,

I would rephrase this to highlight the reason for returning early:

"Unlike full Tracking Protection, annotations don't block anything so we don't need to take into account add-ons or user exceptions."
Attachment #8876680 - Flags: review?(francois) → review+
Fix the previous comments and carry r+.
Attachment #8876680 - Attachment is obsolete: true
Attachment #8877043 - Flags: review+
Summary:
 - Add some logs in SetTopWindowURIIfUnknown.
 - Correct the use of GetTopWindowURI.

Thanks.
Attachment #8876637 - Attachment is obsolete: true
Attachment #8877044 - Flags: review?(honzab.moz)
Comment on attachment 8877044 [details] [diff] [review]
Part3: Add setTopWindowURIIfUnknown in  nsIHttpChannelInternal, r=mayhemer

Review of attachment 8877044 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks
Attachment #8877044 - Flags: review?(honzab.moz) → review+
Attachment #8877044 - Attachment description: Part3: Add setTopWindowURIIfUnknown in nsIHttpChannelInternal → Part3: Add setTopWindowURIIfUnknown in nsIHttpChannelInternal, r=mayhemer
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/899f52bac76c
Part1: Avoid nsChannelClassifier being created twice, r=mayhemer
https://hg.mozilla.org/integration/mozilla-inbound/rev/db91fc6b9522
Part2: Do channel annotation a bit earlier and only for third-party uri, r=francois
https://hg.mozilla.org/integration/mozilla-inbound/rev/4eb6b24ccd80
Part3: Add setTopWindowURIIfUnknown in nsIHttpChannelInternal, r=mayhemer
Keywords: checkin-needed
Depends on: 1385576
No longer depends on: 1385576
See Also: → 1385576
You need to log in before you can comment on or make changes to this bug.