Closed Bug 1242871 Opened 8 years ago Closed 8 years ago

Add webRequest "request origin" property

Categories

(WebExtensions :: Untriaged, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dev, Assigned: ma1)

References

Details

(Keywords: dev-doc-complete, Whiteboard: [triaged])

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0 Iceweasel/38.4.0
Build ID: 20151103235259

Steps to reproduce:

I'd like to request a "request origin" property being added to "webRequest.onBeforeRequest", which should give the same info like the "aRequestOrigin" parameter of the "shouldLoad()" [1] function.

The add-on I'm developing [2] depends heavily on the request origin, so it's required to port the add-on to WebExtensions.

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Interface/nsIContentPolicy#shouldLoad%28%29
[2] RequestPolicy Continued: https://github.com/RequestPolicyContinued/requestpolicy
For reference: my corresponding topic on Discourse: https://discourse.mozilla-community.org/t/webextensions-request-origin-info-in-webrequest/5137
Priority: -- → P3
Whiteboard: [triaged]
Yes, NoScript needs it too (especially for its ABE module). I was "implicitly" going to fix it in my current webRequest overhaul work, thanks for making this explicit. Actually, an "origin" property should be available in every call back, and should reflect the principal which started the load (correct me if I'm wrong). This is far less trivial than it seems, as shown by the long discussion in bug 1105556 and related ones, so I'm tracking it even though I'll try to land an initial approximation of what currently happens with NoScript sooner.
Assignee: nobody → g.maone
Depends on: 105556
Depends on: 1105556
No longer depends on: 105556
We have had a lot of discussions on requestingPrincipal and have settled on it in Bug 1105556.  But it would be good to hear about the following from addon developers using shouldLoad.

For aRequestingPrincipal in ShouldLoad(), do you expect:
* the principal passed into the be the principal of the page in which the resource will be loaded (loadingPrincipal) OR
* the principal of the element that initiated the load (triggeringPrincipal)

Thanks!
(In reply to Tanvi Vyas [:tanvi] from comment #2)

> For aRequestingPrincipal in ShouldLoad(), do you expect:
> * the principal passed into the be the principal of the page in which the
> resource will be loaded (loadingPrincipal) OR
> * the principal of the element that initiated the load (triggeringPrincipal)

The latter (triggeringPrincipal), mainly for the reasons you mentioned at point 3 in https://bugzilla.mozilla.org/show_bug.cgi?id=1105556#c38
Comment on attachment 8734551 [details]
MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed + nits). r?kmag

https://reviewboard.mozilla.org/r/42347/#review39009

This looks good, but I'd like to see a few more tests.

I take it that this needs to wait for bug 1105556 to land?

::: toolkit/components/extensions/test/mochitest/test_ext_webrequest.html:114
(Diff revision 1)
>        }
>      }
>    }
>  
> +  function checkOrigin(details) {
> +    if (details.type !== "main_frame") {

Shouldn't we check for the main frame as well?

::: toolkit/components/extensions/test/mochitest/test_ext_webrequest.html:115
(Diff revision 1)
>      }
>    }
>  
> +  function checkOrigin(details) {
> +    if (details.type !== "main_frame") {
> +      browser.test.assertTrue(/\/file_WebRequest_page\d\.html$/.test(details.originUrl),  `originUrl for ${details.url} is correct (${details.originUrl})`);

It would be nice to test this in some other instances, such as clicking a link or submitting a form.

::: toolkit/modules/addons/WebRequest.jsm:130
(Diff revision 1)
>          // It's possible that this listener has been removed and the
>          // child hasn't learned yet.
>          continue;
>        }
>        let response = null;
> -      let data = {
> +      let data = Object.assign({requestId, browser}, msg.data);

It looks like we'll get an additional `ids` property after this change. Is that expected?

::: toolkit/modules/addons/WebRequestContent.js:158
(Diff revision 1)
>        }
>      }
>  
>      let data = {ids,
>                  url,
> +                originUrl: requestOrigin && requestOrigin.spec || "",

It seems like it would be more consistent to omit this when we have no origin data.
Attachment #8734551 - Flags: review?(kmaglione+bmo)
Comment on attachment 8734551 [details]
MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed + nits). r?kmag

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/42347/diff/1-2/
Attachment #8734551 - Flags: review?(kmaglione+bmo)
https://reviewboard.mozilla.org/r/42347/#review39009

> Shouldn't we check for the main frame as well?

Done.

> It would be nice to test this in some other instances, such as clicking a link or submitting a form.

Added both tests.

> It looks like we'll get an additional `ids` property after this change. Is that expected?

Nope, that's not meant for the public API and now we remove it.

> It seems like it would be more consistent to omit this when we have no origin data.

Agreed, changed. We should warn consumers, though (in my version a string was guaranteed to be always there, therefore you could always invoke methods on it which now would throw, requiring a preliminary test).
https://reviewboard.mozilla.org/r/42347/#review39009

I believe my patch is resilient enough. If not, the tests will tell us (and them).
Actually the form and link tests, which did work fine for me locally, on try cause other tests to fail apparently because the script which should close the landing page doesn't get to properly run.
Investigating...
Comment on attachment 8734551 [details]
MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed + nits). r?kmag

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/42347/diff/2-3/
Attachment #8734551 - Attachment description: MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests. r?kmag → MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed). r?kmag
Comment on attachment 8734551 [details]
MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed + nits). r?kmag

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/42347/diff/3-4/
Comment on attachment 8734551 [details]
MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed + nits). r?kmag

https://reviewboard.mozilla.org/r/42347/#review40857

::: toolkit/components/extensions/test/mochitest/test_ext_webrequest.html:120
(Diff revision 4)
>        }
>      }
>    }
>  
> +  function checkOrigin(details) {
> +    let isCorrectOrigin = details.url.includes("_page1.html") && details.originUrl.endsWith("/test_ext_webrequest.html") ||

I think this should be `? ... :` rather than `&& ... ||`

::: toolkit/components/extensions/test/mochitest/test_ext_webrequest.html:125
(Diff revision 4)
> +    let isCorrectOrigin = details.url.includes("_page1.html") && details.originUrl.endsWith("/test_ext_webrequest.html") ||
> +                          /\/file_WebRequest_page\d\.html\b/.test(details.originUrl);
> +    browser.test.assertTrue(isCorrectOrigin, `originUrl for ${details.url} is correct (${details.originUrl})`);
> +  }
> +
> +

Extra line.

::: toolkit/modules/addons/WebRequest.jsm:542
(Diff revision 4)
> +          });
> +        } else {
> +          Object.assign(commonData, {
> +            windowId: 0,
> +            parentWindowId: 0,
> +            originUrl: "",

Should this property be absent rather than an empty string?
Attachment #8734551 - Flags: review?(kmaglione+bmo) → review+
Comment on attachment 8734551 [details]
MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed + nits). r?kmag

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/42347/diff/4-5/
Attachment #8734551 - Attachment description: MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed). r?kmag → MozReview Request: Bug 1242871 - add an originUrl property to every webRequest event details object + tests (extended & fixed + nits). r?kmag
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
For the docs: this makes me think we should document the "details" object that get passed to the listener as a separate type, in its own page, and link to it from the docs page from each request listener.

At the moment the "details" object is duplicated in the docs page for each listener. This is already quite messy, but it will get worse with properties like this that aren't supported in all browsers. It means that there's no one place to say "originUrl isn't supported in Chrome" - we have to say it 8 times, once for each event, and I think it'll get really messy.

Is there any reason not to do this? Have a type called "RequestDetails" that the listener pages can link to? I know not all properties are defined for all listeners, but I think it'll be clearer to note these anomalies in the one page, than to duplicate all the other properties.
Flags: needinfo?(g.maone)
(In reply to Giorgio Maone [:mao] from comment #3)
> (In reply to Tanvi Vyas [:tanvi] from comment #2)
> 
> > For aRequestingPrincipal in ShouldLoad(), do you expect:
> > * the principal passed into the be the principal of the page in which the
> > resource will be loaded (loadingPrincipal) OR
> > * the principal of the element that initiated the load (triggeringPrincipal)
> 
> The latter (triggeringPrincipal), mainly for the reasons you mentioned at
> point 3 in https://bugzilla.mozilla.org/show_bug.cgi?id=1105556#c38
Hi Giorgio,

Is this true for both TYPE_DOCUMENT loads and other subresource loads?  For TYPE_SCRIPT, TYPE_IMAGE, TYPE_SUBDOCUMENT, etc you also want the element that initiated the load to be the triggeringPrincipal, even if that is different than the document it is being loaded into?
(In reply to Tanvi Vyas [:tanvi] from comment #17)

> > The latter (triggeringPrincipal), mainly for the reasons you mentioned at
> > point 3 in https://bugzilla.mozilla.org/show_bug.cgi?id=1105556#c38
> Hi Giorgio,
> 
> Is this true for both TYPE_DOCUMENT loads and other subresource loads?  For
> TYPE_SCRIPT, TYPE_IMAGE, TYPE_SUBDOCUMENT, etc you also want the element
> that initiated the load to be the triggeringPrincipal, even if that is
> different than the document it is being loaded into?

My understanding is that this way JavaScript clients of the API (e.g. WebExtensions through webRequest) would have access to both the data points, both being important to evaluate different potential scenarios (e.g. CSRF vs injections), while otherwise the former would be lost.
If this is correct, yes, I want triggeringPrincipal to always reflect the "entity" which initiated the load (e.g. a document triggering a load in a different window/frame through the target attribute of a link or the target argument of window.open(), or the CSS document including a certain image) while loadingPrincipal to reflect the document which will ultimately host the loaded resource (the content of the frame and the HTML page embedding the CSS document in our example).
I'll probably end to expose both, i.e. triggeringPrincipal's URL through details.orginURL (this bug), and loadingPrincipal's URL (which could currently be indirectly and asynchronously inferred from details.windowId) in a new convenience attribute, e.g. details.documentURL.

(In reply to Will Bamberg [:wbamberg] from comment #16)

> Is there any reason not to do this? Have a type called "RequestDetails" that
> the listener pages can link to? I know not all properties are defined for
> all listeners, but I think it'll be clearer to note these anomalies in the
> one page, than to duplicate all the other properties.

Both approaches have their drawbacks. Chrome's documentation doesn't commit to a type because the actually present properties can vary savagely depending both on the specific listener and on the options/filters passed on registration. Effectively conveying this information to developers in a single point (e.g. 'responseHeaders: optional, an array of name-value pairs representing the HTTP headers delivered with the response. Is exposed in the details argument passed to onHeadersReceived, onAuthRequired, onBeforeRedirect, onResponseStarted, onCompleted listeners if and only if the "responseHeaders" option had been passed on registration. It can be modified if the "blocking" option had been passed on listener registration.') can be equally challenging. For the sake of clarity and helping extensions developers, I'd prefer a way to store this information in a single place but generate multiple views on it, depending if one is looking at the listener, the registration options or the details object: it would be easier to retrieve, even if definitely redundant.
Flags: needinfo?(g.maone)
I've added "originUrl" to all the events under https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/WebRequest, and updated the browser compat tables. Please let me know if there's anything else needed here.

> 
> (In reply to Will Bamberg [:wbamberg] from comment #16)
> 
> > Is there any reason not to do this? Have a type called "RequestDetails" that
> > the listener pages can link to? I know not all properties are defined for
> > all listeners, but I think it'll be clearer to note these anomalies in the
> > one page, than to duplicate all the other properties.
> 
> Both approaches have their drawbacks. Chrome's documentation doesn't commit
> to a type because the actually present properties can vary savagely
> depending both on the specific listener and on the options/filters passed on
> registration. Effectively conveying this information to developers in a
> single point (e.g. 'responseHeaders: optional, an array of name-value pairs
> representing the HTTP headers delivered with the response. Is exposed in the
> details argument passed to onHeadersReceived, onAuthRequired,
> onBeforeRedirect, onResponseStarted, onCompleted listeners if and only if
> the "responseHeaders" option had been passed on registration. It can be
> modified if the "blocking" option had been passed on listener
> registration.') can be equally challenging. For the sake of clarity and
> helping extensions developers, I'd prefer a way to store this information in
> a single place but generate multiple views on it, depending if one is
> looking at the listener, the registration options or the details object: it
> would be easier to retrieve, even if definitely redundant.

Having the info in one place but generating different views for different events would be doable, but would require some very narrowly targeted macros, and I'm not keen to add more code like that in MDN. So I've just lived with the duplication.

I did try out a standalone page: https://developer.mozilla.org/en-US/docs/User:wbamberg/webRequest.RequestDetails but I'm not sure it's that useful, if each event page also has its own view of the object.
Flags: needinfo?(g.maone)
The originUrl treatment looks good, thanks.

Regarding the RequestDetails page, I agree, probably is not useful enough to justify its complexity and maintenance burden.
Flags: needinfo?(g.maone)
Product: Toolkit → WebExtensions
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: