Add an API to nsIDocument to determine whether a script is on the tracking protection list

RESOLVED FIXED in Firefox 53

Status

()

defect
RESOLVED FIXED
3 years ago
5 months ago

People

(Reporter: Ehsan, Assigned: Ehsan)

Tracking

(Depends on 1 bug)

unspecified
mozilla53
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox53 fixed)

Details

Attachments

(1 attachment)

In order to be able to put timeouts in the correct bucket as soon as
they are scheduled, we need to be able to synchronously tell whether a
timeout is coming from a script that is on the tracking protection list.
But the URL Classifier API which we use for this task is asynchronous.
Because of this, and to avoid unnecessary IPC from the content to the
parent process every time we need to know where a script came from, we
cache this information in nsIDocument.  The hash table mTrackingScripts
will have one entry per script loaded in the document which is on the
tracking protection list.

For performance reasons, we coalesce querying whether a script source
URL is on the tracking protection list with the load of the script from
the network.  In most cases we'll have the response from the URL
Classifier service before the script load is finished, but on the off
chance that the load finishes first, we wait before finishing the script
load to get the response from the URL Classifier service.
Blocks: 1312514
Comment on attachment 8816533 [details] [diff] [review]
Add an API to nsIDocument to determine whether a script is on the tracking protection list

Review of attachment 8816533 [details] [diff] [review]:
-----------------------------------------------------------------

Looks reasonable.  r=me with comments addressed.

Does this mean we always do two classification lookups for scripts now?  One for our normal malware/phishing/tacking protection and now one here?  Seems a possible future optimization would be to make the existing classification lookup mark the channel if it sees a tracking script even its only blocking malware/phishing (tp is disabled).  That would avoid the need to perform two lookups.

::: dom/base/nsIDocument.h
@@ +2876,5 @@
>    CreateEventTarget(const char* aName,
>                      mozilla::dom::TaskCategory aCategory) override;
>  
> +  void NoteScriptTrackingStatus(const nsACString& aURL, bool isTracking);
> +  bool IsScriptTracking(const nsACString& aURL) const;

Please add comments that these are the original URLs and not the final URLs.  They don't take into account any redirects that may have taken place.

I believe this is the case because:

https://dxr.mozilla.org/mozilla-central/source/dom/base/nsScriptLoader.cpp#2086

Is this what you intended?  I guess it depends on what will be calling IsScriptTracking().  We should probably make sure there is a test that redirects.

Also, you should probably note that they are fully resolved URLS.  Or can we just use nsIURI to avoid confusion?

::: dom/base/nsScriptLoader.cpp
@@ +2854,5 @@
> +    BasePrincipal::CreateCodebasePrincipal(mRequest->mURI, attrs);
> +  NS_ENSURE_TRUE(prin, NS_ERROR_FAILURE);
> +
> +  bool expectCallback = false;
> +  uriClassifier->Classify(prin, true, this, &expectCallback);

Can you add comment describing what the "true" is here?  I guess it means tracking protection.

::: dom/base/nsScriptLoader.h
@@ +706,5 @@
>    // SRI data verifier.
>    nsAutoPtr<mozilla::dom::SRICheckDataVerifier> mSRIDataVerifier;
>  
> +  // Status of the channel load.
> +  nsresult                      mChannelStatus;

Since you are passing the entire channel now it seems that you don't need to track and pass the status separately.
Attachment #8816533 - Flags: review?(bkelly) → review+
Depends on: 1321874
(In reply to Ben Kelly [:bkelly] from comment #2)
> Does this mean we always do two classification lookups for scripts now?  One
> for our normal malware/phishing/tacking protection and now one here?  Seems
> a possible future optimization would be to make the existing classification
> lookup mark the channel if it sees a tracking script even its only blocking
> malware/phishing (tp is disabled).  That would avoid the need to perform two
> lookups.

Filed bug 1321874 for this.

> ::: dom/base/nsIDocument.h
> @@ +2876,5 @@
> >    CreateEventTarget(const char* aName,
> >                      mozilla::dom::TaskCategory aCategory) override;
> >  
> > +  void NoteScriptTrackingStatus(const nsACString& aURL, bool isTracking);
> > +  bool IsScriptTracking(const nsACString& aURL) const;
> 
> Please add comments that these are the original URLs and not the final URLs.
> They don't take into account any redirects that may have taken place.
> 
> I believe this is the case because:
> 
> https://dxr.mozilla.org/mozilla-central/source/dom/base/nsScriptLoader.
> cpp#2086
> 
> Is this what you intended?  I guess it depends on what will be calling
> IsScriptTracking().  We should probably make sure there is a test that
> redirects.

Yes, this is what I intended.  The thing that matters here isn't the pre vs post redirect URL, it's the specific URL we pass to SpiderMonkey, because we ultimately need to match it against what JS::DescribeScriptedCaller() returns.  I will document this.

> Also, you should probably note that they are fully resolved URLS.  Or can we
> just use nsIURI to avoid confusion?

Since the thing we have on both ends is a string, going to an nsIURI and then back to a string seems pointless.  In practice we only ever call this on what JS::DescribeScriptedCaller() will return.

> ::: dom/base/nsScriptLoader.cpp
> @@ +2854,5 @@
> > +    BasePrincipal::CreateCodebasePrincipal(mRequest->mURI, attrs);
> > +  NS_ENSURE_TRUE(prin, NS_ERROR_FAILURE);
> > +
> > +  bool expectCallback = false;
> > +  uriClassifier->Classify(prin, true, this, &expectCallback);
> 
> Can you add comment describing what the "true" is here?  I guess it means
> tracking protection.
> 
> ::: dom/base/nsScriptLoader.h
> @@ +706,5 @@
> >    // SRI data verifier.
> >    nsAutoPtr<mozilla::dom::SRICheckDataVerifier> mSRIDataVerifier;
> >  
> > +  // Status of the channel load.
> > +  nsresult                      mChannelStatus;
> 
> Since you are passing the entire channel now it seems that you don't need to
> track and pass the status separately.

Sadly, the channel's status and the status passed to OnStopRequest are *not* the same thing (thanks, Necko!).  See <http://searchfox.org/mozilla-central/rev/0f92c398ce929a3f3d9dad30132c218def154e39/netwerk/base/nsBaseChannel.cpp#820>.  So unfortunately we'll need to pipe this variable through separately.
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/eb3ac9d2775b
Add an API to nsIDocument to determine whether a script is on the tracking protection list; r=bkelly
Assignee: nobody → ehsan
https://hg.mozilla.org/mozilla-central/rev/eb3ac9d2775b
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Depends on: 1322105
Depends on: 1322204
Depends on: 1323220
Depends on: 1325255
Depends on: 1325651
No longer depends on: 1325651
Depends on: 1370211
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.