Closed Bug 1474280 Opened 6 years ago Closed 6 years ago

[FastBlock] Cancel slow HTTP requests of tracking resources

Categories

(Core :: Networking: HTTP, enhancement, P1)

enhancement

Tracking

()

VERIFIED FIXED
mozilla63
Tracking Status
firefox63 --- verified

People

(Reporter: bagder, Assigned: xeonchen)

References

Details

(Whiteboard: [necko-triaged])

Attachments

(2 files, 4 obsolete files)

Provide a pref to set the time-out time and when enabled, cancel HTTP requests that are marked as tracking resources that take longer than the given time to get response data.
Initial take on the fastblock logic in necko, looking for feedback on the approach. I'm basically going for roughly what the responsetimeout logic is already doing.
Attachment #8990699 - Flags: feedback?(honzab.moz)
Comment on attachment 8990699 [details] [diff] [review]
0001-bug-1474280-cancel-slow-HTTP-requests-of-tracking-re.patch

Review of attachment 8990699 [details] [diff] [review]:
-----------------------------------------------------------------

Based on how I understand the feature, you want this on (and only on) the channel level and you must carry the timer over redirects.  Start a single timer at the time of the page navigation start.  When it fires, cancel all in-progress channels that are marked as "ever went through a tracking domain" as we (probably) want to kill channel that have redirected to a tracker and back, but I would double check this in the ghostery code.

So, that when there is a page load timeline like:

0..................1....................2....................3..................4................5
>-----[GET tracker1.com/tracker] --- [301: tracker2.com/script] --- [301: tracker3.com/script] --X

On "X" you will cancel the tracker3.com/script channel.  And obviously, any channel to a tracking domain made after the 5 sec interval will be immediately `asyncCancel`ed.

Please fix me if I'm wrong.
Attachment #8990699 - Flags: feedback?(honzab.moz) → feedback-
I think the way the Ghostery feature works is similar to what Daniel's patch implements, that is, they don't take the page navigation start time into account, rather they take each individual connection's latency into consideration individually, and if it exceeds 5 seconds and the connection is to a tracking resource, the connection is canceled (sans the exception rules, of course).
(In reply to :Ehsan Akhgari from comment #3)
> I think the way the Ghostery feature works is similar to what Daniel's patch
> implements, that is, they don't take the page navigation start time into
> account, rather they take each individual connection's latency into
> consideration individually, and if it exceeds 5 seconds and the connection
> is to a tracking resource, the connection is canceled (sans the exception
> rules, of course).

I was always told something else.  OK, then we have to make sure how this works because you as well say "I think".  

Someone has to look into the Ghostery's extension code and look how that works.
(In reply to Honza Bambas (:mayhemer) from comment #4)
> (In reply to :Ehsan Akhgari from comment #3)
> > I think the way the Ghostery feature works is similar to what Daniel's patch
> > implements, that is, they don't take the page navigation start time into
> > account, rather they take each individual connection's latency into
> > consideration individually, and if it exceeds 5 seconds and the connection
> > is to a tracking resource, the connection is canceled (sans the exception
> > rules, of course).
> 
> I was always told something else.  OK, then we have to make sure how this
> works because you as well say "I think".  
> 
> Someone has to look into the Ghostery's extension code and look how that
> works.

Fair enough.  I wasn't 100% sure, but I had looked at the Ghostery code before so I decided to take another look to refresh my memory and indeed you are right.  Here are some links...

This code is the beginning of their smart blocking logic: https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/EventHandlers.js#L359  This runs on chrome.webRequest.onBeforeRequest.

The latency information is updated here: https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/EventHandlers.js#L390.  The latency information per each request is maintained by this class: https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/Latency.js

Here is the check to decide whether to activate the smart blocking logic: https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/EventHandlers.js#L425.  This function's implementation is here: https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/PolicySmartBlock.js#L87.  The slowness check is done here: https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/PolicySmartBlock.js#L260.  There you can see the latency is calculated based on `pageTimestamp` which maps to the timestamp of navigation (it is set to Date.now() here: https://github.com/ghostery/ghostery-extension/blob/8c4491d6d161504cd93ee98501257317823c25b3/src/classes/EventHandlers.js#L69).
The PHD for fastblock describes this feature as "It monitors trackers waiting for the first byte of data since opening the request channel. If this is not received within 5s, the request is canceled. If any bytes are received, the 5s timer is stopped."

Given this input, I take it the "first byte of data" is then actually the first byte of the response body.
(In reply to Daniel Stenberg [:bagder] from comment #6)
> The PHD for fastblock describes this feature as "It monitors trackers
> waiting for the first byte of data since opening the request channel. If
> this is not received within 5s, the request is canceled. If any bytes are
> received, the 5s timer is stopped."

Looks like that's wrong, I made a comment in that document...

> Given this input, I take it the "first byte of data" is then actually the
> first byte of the response body.

I think "since the start of the navigation of the current tab's top-level document" describes Ghostery's behavior even more precisely.
Status: NEW → ASSIGNED
> since the start of the navigation of the current tab's top-level document

So that's a defined moment in time? "start of navigation"?

If the page and its resources load very slowly, doesn't that risk that such a timer get to run for a while before all the resources have even started to load? Ie, it doesn't only kill slow loading tracking resources, it will also kill very fast tracking resources if the main page is slow enough?
... it seems exactly what this check is about: https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/PolicySmartBlock.js#L260

It seems to mark the request as "slow" if it took more than 5 seconds to *start* the request.
(In reply to Daniel Stenberg [:bagder] from comment #8)
> > since the start of the navigation of the current tab's top-level document
> 
> So that's a defined moment in time? "start of navigation"?

Yes, basically that maps to when nsDocShell::CreateAboutBlankContentViewer() is called <https://searchfox.org/mozilla-central/source/docshell/base/nsDocShell.h>.

You can get the navigation start time from the docshell, using mTiming->GetNavigationStartHighRes() for example()...

> If the page and its resources load very slowly, doesn't that risk that such
> a timer get to run for a while before all the resources have even started to
> load? Ie, it doesn't only kill slow loading tracking resources, it will also
> kill very fast tracking resources if the main page is slow enough?

Sure.  The goal of this feature is to cut down the overall page load time.  Usually a lot of tracking resources are loaded as a result of other tracking domains loading and running (e.g. a tracking script loading and running triggering the loading of other tracking resources) so in order to reduce the overall page load time cutting down the load of tracking resources past a window of time after the navigation makes sense.
I'm not disputing what ghostery does, I'm mostly noting that the definition of what fastblock is has changed quite a lot now, at least in my head. I want us to agree to what exactly it should do.

Is fastblock about avoiding issuing the tracking requests that aren't started within the given time period from start of navigation?

If the tracking request starts within the fastblock time, do we let the request "go on" even if it isn't completed within that time?
(In reply to Daniel Stenberg [:bagder] from comment #11)
> I'm not disputing what ghostery does, I'm mostly noting that the definition
> of what fastblock is has changed quite a lot now, at least in my head. I
> want us to agree to what exactly it should do.
> 
> Is fastblock about avoiding issuing the tracking requests that aren't
> started within the given time period from start of navigation?

Yes (including redirects, of course, where the replacement channel request starting past that point should be canceled similarly).

> If the tracking request starts within the fastblock time, do we let the
> request "go on" even if it isn't completed within that time?

Yes.
Summary: Cancel slow HTTP requests of tracking resources → [FastBlock] Cancel slow HTTP requests of tracking resources
Gary, welcome back!

I'm reassigning this one to you and I will mentor/review the work.
Assignee: daniel → xeonchen
Ehsan, thanks for comment 5!

From how I read in the code, it can be summarized as:

- the initial top level [1] main frame navigation request (the first original request) is not-blocked [2]
- top level request _redirects_ are treated as sub-requests (are submitted to the following conditions)
- instantly block any http: and ws: (to non-localhost) sub-requests when the top level page is https: (not sure we want to implement it in this bug)
- if a request is not a tracker (matching the host/path/some other conditions), it's not-blocked
- if the host name of the sub-request is the same as the host name of the top level page, it's not-blocked
- if the time between the navigation start [5][6] and the time request is made is more than 5000 secs, block the request immediately [3][4]
- and most importantly, I can't find ANY timer started when the navigation starts at [7] to cancel any pending tracking requests

This all makes it very simple to implement.  We simply kill a channel from inside asyncOpen() when it's 5+ seconds since navigation start.  Done.



[1] onBeforeRequest has one argument: details, a simple JS object with (among others) a tabId property.  It's id of the while _tab_ and is same for the top level document and any iframes in that document.
[2] the `not-blocked` term means that a request is immediately left to go freely, no further examination and no blocking is performed on it
[3] https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/PolicySmartBlock.js#L77-L116
[4] https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/PolicySmartBlock.js#L254-L270
[5] https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/EventHandlers.js#L70
[6] https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/TabInfo.js#L58
[7] https://github.com/ghostery/ghostery-extension/blob/e5f9ef76829f74a77ea4c7e0d810e6432185d8a6/src/classes/EventHandlers.js#L88-L126
The mDefaultRequestCreationTime field of the parent-most nsLoadGroup object should correspond to the navigation timestamp of the top-level document in the content process, I believe, which should be used for the 6th bullet point in comment 14.  This information may not be available in the parent process, in which case we can send it to the parent during asyncOpen() if we need to.
Gary,

at [1] you have the top level document in hands.  You can do:

document->GetDocShell()->GetNavigationTiming()->GetNavigationStartTimeStamp()

That is mozilla::TimeStamp with the start of the navigation.  Send this to the parent process, best as a new TimeStamp opening argument at [2].

Then, on the parent process at [3] AsyncAbort() the channel if TimeStamp::NowLoRes() - navigation start timestamp > TimeDuration::FromSeconds(5).

The '5' has to be a preference.  Best is to use [4].  We want to allow '0' (to block immediately) and we also need to turn this feature off (which will be the default to ship).  If you have one pref that can be set to something like 999999 or infinity or something (to turn off) or you have two prefs (browser.fastblock.enable and browser.fastblock.timeout) is entirely up to you.  Please use milliseconds resolution for the pref.

I don't think we need a special code for websockets, since each web socket needs a bootstrap http channel, that (please double check!) is submit to the same tracking protection annotation as a regular http channel request.



Ehsan, 

your service worker interception concern was wise.  I can see in the code of the http child channel that we break the child channel opening very soon (before going to the parent process) if interception may happen.

I know that there is a new code (we ship it currently disabled by default) that performs the interception fully in the parent process, after TP annotation.  Enabling that change depends on a larger project of sw refactoring, it's in incomplete state right now.  I believe for the study we can ship w/o this and not fast-block channels that would be tp annotated and intercepted.  To allow that, architectural changes are needed and those are pretty much in par with what we already have in the tree (disabled).



[1] https://searchfox.org/mozilla-central/rev/b0275bc977ad7fda615ef34b822bba938f2b16fd/netwerk/protocol/http/HttpChannelChild.cpp#2697
[2] https://searchfox.org/mozilla-central/rev/b0275bc977ad7fda615ef34b822bba938f2b16fd/netwerk/ipc/NeckoChannelParams.ipdlh#189-245
[3] https://searchfox.org/mozilla-central/rev/b0275bc977ad7fda615ef34b822bba938f2b16fd/netwerk/protocol/http/nsHttpChannel.cpp#573
[4] https://searchfox.org/mozilla-central/rev/b0275bc977ad7fda615ef34b822bba938f2b16fd/modules/libpref/Preferences.h#392-395
(In reply to Honza Bambas (:mayhemer) from comment #16)
> Ehsan, 
> 
> your service worker interception concern was wise.  I can see in the code of
> the http child channel that we break the child channel opening very soon
> (before going to the parent process) if interception may happen.
> 
> I know that there is a new code (we ship it currently disabled by default)
> that performs the interception fully in the parent process, after TP
> annotation.  Enabling that change depends on a larger project of sw
> refactoring, it's in incomplete state right now.  I believe for the study we
> can ship w/o this and not fast-block channels that would be tp annotated and
> intercepted.  To allow that, architectural changes are needed and those are
> pretty much in par with what we already have in the tree (disabled).

That sounds reasonable, thanks for checking!
Please review this first, I'll check websocket part later.
Attachment #8992680 - Flags: review?(honzab.moz)
Comment on attachment 8992680 [details] [diff] [review]
0001-Bug-1474280-Cancel-timed-out-trackers-r-mayhemer.patch

Review of attachment 8992680 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/nsITimedChannel.idl
@@ +52,5 @@
>    [noscript] attribute TimeStamp handleFetchEventStart;
>    [noscript] attribute TimeStamp handleFetchEventEnd;
>  
> +  // This is use to determine the duration since navigation started.
> +  [noscript] attribute TimeStamp navigationStartTimeStamp;

should this really be exposed here?  maybe just have it exposed on nsihttpchannelinternal?

::: netwerk/protocol/http/HttpBaseChannel.h
@@ +671,5 @@
>    TimeStamp                         mDispatchFetchEventStart;
>    TimeStamp                         mDispatchFetchEventEnd;
>    TimeStamp                         mHandleFetchEventStart;
>    TimeStamp                         mHandleFetchEventEnd;
> +  TimeStamp                         mNavigationStartTimeStamp;

do we really need this to be on the httpbasechannel?  does the child side really need it?  we want to save content memory as much as possible, so it would be better to move this to nshttpchannel.

::: netwerk/protocol/http/NullHttpChannel.cpp
@@ +720,5 @@
> +NS_IMETHODIMP
> +NullHttpChannel::GetNavigationStartTimeStamp(mozilla::TimeStamp *_retval) {
> +  MOZ_ASSERT(_retval);
> +  *_retval = mAsyncOpenTime;
> +  return NS_OK;

I think NS_ERROR_NOT_IMPLEMENTED for both would be better and enough

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +573,5 @@
>      if (isTrackingResource) {
>          AddClassFlags(nsIClassOfService::Tail);
> +
> +        if (CheckFastBlocked()) {
> +            Unused << AsyncAbort(NS_ERROR_ABORT);

yup, plus please add CloseCacheEntry(false); right under

and please move this block before setting the Tail COS flag

@@ +601,5 @@
> +        Preferences::AddUintVarCache(&sFastBlockTimeout, "browser.fastblock.timeout");
> +    }
> +
> +    if (sIsFastBlockEnabled) {
> +        TimeDuration duration = TimeStamp::NowLoRes() - mNavigationStartTimeStamp;

only if mNavigationStartTimeStamp is non null!  if it is, behave as fastblock were disabled.

@@ +603,5 @@
> +
> +    if (sIsFastBlockEnabled) {
> +        TimeDuration duration = TimeStamp::NowLoRes() - mNavigationStartTimeStamp;
> +        if (duration.ToMilliseconds() > sFastBlockTimeout) {
> +            LOG(("nsHttpChannel::Connect FastBlock timeout (%lf)\n",

add %p with |this| to the log please.

@@ +609,5 @@
> +            return true;
> +        }
> +    }
> +
> +    return false;

my personal flavor preference: the code is usually more readable when a single purpose function like this does early returns (|return false;|) on every condition that doesn't pass, than nesting of `if` blocks like this.

::: netwerk/protocol/http/nsHttpChannel.h
@@ +654,5 @@
>      nsresult AsyncOpenOnTailUnblock();
>      // Called on untail when tailed because of being a tracking resource.
>      nsresult ConnectOnTailUnblock();
>  
> +    // Check if current connection satisfied the FastBlock criteria.

current "connection"?  a bit more rich comment would be better
Attachment #8992680 - Flags: review?(honzab.moz) → feedback+
Honza, thank you for the feedback on previous patch.
Attachment #8992680 - Attachment is obsolete: true
Attachment #8993005 - Flags: review?(honzab.moz)
Comment on attachment 8993005 [details] [diff] [review]
0001-Bug-1474280-Cancel-timed-out-trackers-r-mayhemer.patch

Review of attachment 8993005 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/init/all.js
@@ +5842,5 @@
>  pref("dom.event.default_to_passive_touch_listeners", true);
> +
> +// Enable FastBlock?
> +pref("browser.fastblock.enabled", false);
> +// The timeout since navigation start

say what units it uses

and honestly better comments would be nice, since this is going to be discussed publicly soon and people will start looking here and wonder what this is.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +6527,5 @@
>    return true;
>  }
>  
> +NS_IMETHODIMP
> +nsHttpChannel::GetNavigationStartTimeStamp(TimeStamp* aTimeStamp) {

{ on a new line
Attachment #8993005 - Flags: review?(honzab.moz) → review+
fix nit, carry r+
Attachment #8993005 - Attachment is obsolete: true
Attachment #8993263 - Flags: review+
Depends on: 1476993
Attachment #8990699 - Attachment is obsolete: true
Add logs for new functions, changing log message, carrying r+
Attachment #8993263 - Attachment is obsolete: true
Attachment #8994176 - Flags: review+
Attachment #8994177 - Flags: review?(valentin.gosu) → review+
Keywords: checkin-needed
Pushed by shindli@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/318799faa5fe
Part 1: Cancel timed-out trackers; r=mayhemer
https://hg.mozilla.org/integration/mozilla-inbound/rev/74e441c3bf95
Part 2: carry timestamp when redirects; r=valentin
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/318799faa5fe
https://hg.mozilla.org/mozilla-central/rev/74e441c3bf95
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Gary, thank you for landing this bug fast. Good job!
Ehsan, Daniel, and Valentin, thanks for your help!
Oh, I missed one person! Honza, thank you so much!
Depends on: 1478081
I can confirm the pref code is implemented. I verified using Fx 63.0b11, on Windows 10 x64, macOS 10.13.6 and Ubuntu 14.04 LTS.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: