Open Bug 1210990 Opened 10 years ago Updated 3 years ago

Implement content-side hooks to handle WebRequest decisions which affect or are affected by the DOM.

Categories

(WebExtensions :: Request Handling, enhancement, P3)

Product:

Component:

Type:

enhancement

Priority:

P3

Severity:

N/A

Tracking

(firefox44 affected)

Status:

NEW

Tracking Flags:

Tracking

Status

firefox44

---

affected

People

(Reporter: ma1, Unassigned)

References

Details

(Whiteboard: [webRequest]triaged)

Attachments

(1 file)

WIP 10 years ago Tom S. (please needinfo tschuster) 13.58 KB, patch		Details \| Diff \| Splinter Review

Giorgio Maone [:ma1]

Reporter

Description

•

10 years ago

nsIContentPolicy has a concept of "DOM node triggering the load", which is very useful for content-blocking clients but is completely lost in Chromium's webRequest. billm's idea to restore this kind of context, which sounds good to me: <QUOTE> Basically, we could fire an event in the content script at two times: - Before trying to load something, we would give the content script a chance to attach data to the element being loaded. This data would be passed to the webRequest handler somehow. - After the blocking decision was made, we would fire another event in the content script. That event could include information passed down from the add-on when it made the webRequest decision. </QUOTE>

Updated

•

10 years ago

Whiteboard: [webRequest]

Updated

•

10 years ago

Blocks: webext-port-noscript

Wladimir Palant

Comment 1

•

10 years ago

Please note that this "event" shouldn't be a regular DOM event - these could be intercepted by the webpage and allow it to mess with extensions (synthetic events, cancelling propagation) as well as detect extension decisions (particularly when something is being blocked). Instead, I guess that the content script should be able to add a listener (similar to chrome.runtime.onMessage). chrome.webRequest.onCollectRequestData and chrome.webRequest.onBlocked seem to match the naming scheme.

Blocks: abp

Tom S. (please needinfo tschuster)

Updated

•

10 years ago

Mentor: wmccloskey → evilpies

Tom S. (please needinfo tschuster)

Updated

•

10 years ago

Assignee: nobody → evilpies

Mentor: evilpies

Tom S. (please needinfo tschuster)

Updated

•

10 years ago

Mentor: wmccloskey

Tom S. (please needinfo tschuster)

Comment 2

•

10 years ago

Attached patch WIP — Details — Splinter Review

I implemented a very basic version of chrome.webRequest.onCollectRequestData. We need to decide how to do permission checking. And if we need to some option like ["blocking"] for this. Also we might want to disable the onCollectRequestData callback, when there is no corresponding onBeforeRequest for that addon in the parent.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Updated

•

10 years ago

Attachment #8677525 - Attachment is patch: true

Updated

•

10 years ago

Flags: blocking-webextensions-

Giorgio Maone [:ma1]

Reporter

Updated

•

10 years ago

Blocks: webext-port-abp

Updated

•

10 years ago

Whiteboard: [webRequest] → [webRequest]triaged

Giorgio Maone [:ma1]

Reporter

Updated

•

10 years ago

No longer blocks: e10s-noscript

Comment 3

•

9 years ago

I'm wondering if this WIP patch is still valid and worth taking a look at?

Flags: needinfo?(g.maone)

Flags: needinfo?(evilpies)

Tom S. (please needinfo tschuster)

Comment 4

•

9 years ago

It's really not a huge patch so rebasing it probably just means rewriting everything. The basic approach should still work.

Flags: needinfo?(evilpies)

Tom S. (please needinfo tschuster)

Updated

•

9 years ago

Assignee: evilpies → nobody

Comment 5

•

9 years ago

Doesn't seem like a P1, but noting it does block a few add-ons so dropping to a P2.

Component: WebExtensions: Untriaged → WebExtensions: Request Handling

Flags: blocking-webextensions-

Priority: P1 → P2

Giorgio Maone [:ma1]

Reporter

Comment 6

•

9 years ago

https://bugzilla.mozilla.org/attachment.cgi?id=8677525 is not valid anymore, as we currently use or nsIContentPolicy for non-HTTP URIs only, mostly for performance reasons. I've figured out some ways to work-around this problem in raw e10s-land for NoScript 2.9.5 which I'm trying to release ASAP, and I will necessarily have to tackle it as a WebExtensions issue in next step, NoScript 5.0 as an hybrid add-on. Therefore I expect this bug to get busy in the next few weeks.

Flags: needinfo?(g.maone)

Wladimir Palant

Updated

•

9 years ago

No longer blocks: abp

Giorgio Maone [:ma1]

Reporter

Comment 7

•

9 years ago

One way to tackle this would be mapping each non-document DOM element which loads its own content (especially if it has a layout, such as <object> and <img>) to an unique domElementId (unique for the document, so an integer counter would do) which is passed in nsILoadInfo and from there included in webRequest's details object, just like frameId and parentFrameId. On the content script side, an API should be provided to query this map (document.getLoadingNode(domElementId) or something like that) which returns the DOM node, or null if the id is unused or the node has been removed from the DOM in the meanwhile. If over this machinery some automagic could match domElementIds and requestIds, enabling loading nodes to be queried directly by requestId, that would might be more elegant, but I suspect the additional complexity, overhead and state would be not worth the effort.

Comment 8

•

9 years ago

Addons may actually want to inspect the DOM element *before* deciding whether to allow the request to continue. Just attaching IDs leads to a lot of round trip times because the addon would have send messages back to the content script to inspect the DOM node and then send yet another message back up to resume or cancel the suspended request. There are two use-cases for access to the DOM node triggering a request: a) annotating the DOM with information about the request, most commonly that it has been blocked. this can happen some point *after* the request has reached its final state. b) gathering additional information about the page state in general and the Node in particular *before* deciding whether the request should be allowed to go forward Both require scripts touching the content anyway. It would seem more efficient to have those callbacks executed in the content script environment.

Updated

•

9 years ago

webextensions: --- → ?

Giorgio Maone [:ma1]

Reporter

Updated

•

9 years ago

Blocks: 1331463

Giorgio Maone [:ma1]

Reporter

Comment 9

•

9 years ago

In email discussion with :andym CCed, :kmag wrote: "Providing a unique ID for the loading elements is probably the best solution [to enable extensions-managed Click2Play features]. We already get a reference to the element when we create the loadInfo. The only other realistic option I can think of would be to provide a CSS or XPath path to the element, but that has obvious drawbacks. You'll need to get a DOM peer to weigh in on that, though." So I'm asking :billm and/or :bz to comment my proposal at comment 7 from their DOM peers POV. Thank you.

Flags: needinfo?(wmccloskey)

Flags: needinfo?(bzbarsky)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 10

•

9 years ago

I guess I'd like some more context. Firing an event in the content script would allow you to attach an ID yourself (although admittedly this could be somewhat leak-prone). What's the rationale for only doing IDs? Because it's simpler?

Giorgio Maone [:ma1]

Reporter

Comment 11

•

9 years ago

(In reply to Bill McCloskey (:billm) from comment #10) > I guess I'd like some more context. Firing an event in the content script > would allow you to attach an ID yourself (although admittedly this could be > somewhat leak-prone). What's the rationale for only doing IDs? Because it's > simpler? I'd obviously prefer the ability for content scripts to attach an arbitrary serializable object to the requestInfo data passed to webRequest.onBeforeRequest, but I remember somebody commenting that it was expensive and complex, even though I cannot find this comment anymore (the source was Kris, maybe?) If this (which was :billm's original plan, IIRC) is actually feasible, and in a reasonable time-frame, I'd surely be more than happy.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 12

•

9 years ago

I was thinking we could attach some structured clone data to the loadinfo. Kris, do you have any thoughts on that?

Flags: needinfo?(wmccloskey) → needinfo?(kmaglione+bmo)

Kris Maglione [:kmag]

Comment 13

•

9 years ago

I'd rather not encourage people to attach content scripts to every content page for the sake of content blocking. For Giorgio's use case, he just needs to be able to map a blocked request to a specific element after he blocked it, so he can replace it with a click-to-play UI, and determine if later requests came from elements that have been whitelisted. The click-to-play UI would require a content script, but only when an element has been blocked. If we're going to depend on content scripts attaching data to the requests, anyway, it would probably make more sense to just do the processing directly in the content script. But I'm worried that would encourage people to use more content scripts than necessary, or to make them more complicated than necessary, so I'd still rather we stick with global listeners that rely on static data about the element.

Flags: needinfo?(kmaglione+bmo)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 14

•

9 years ago

(In reply to Kris Maglione [:kmag] from comment #13) > I'd rather not encourage people to attach content scripts to every content > page for the sake of content blocking. For Giorgio's use case, he just needs > to be able to map a blocked request to a specific element after he blocked > it, so he can replace it with a click-to-play UI, and determine if later > requests came from elements that have been whitelisted. > > The click-to-play UI would require a content script, but only when an > element has been blocked. This seems like kind of an unusual case, though. Pretty much any "blocker"-type add-on I can think of is going to need a content script anyway.

Kris Maglione [:kmag]

Comment 15

•

9 years ago

(In reply to Bill McCloskey (:billm) from comment #14) > This seems like kind of an unusual case, though. Pretty much any > "blocker"-type add-on I can think of is going to need a content script > anyway. I'm not sure how true that is, but either way, if they're going to include a content script in every site, I'd rather it be as simple as possible. The more of the blocking logic that happens in the background script, the less logic and configuration data has to be bundled into every content script.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 16

•

9 years ago

(In reply to Kris Maglione [:kmag] from comment #15) > (In reply to Bill McCloskey (:billm) from comment #14) > > This seems like kind of an unusual case, though. Pretty much any > > "blocker"-type add-on I can think of is going to need a content script > > anyway. > > I'm not sure how true that is, but either way, if they're going to include a > content script in every site, I'd rather it be as simple as possible. The > more of the blocking logic that happens in the background script, the less > logic and configuration data has to be bundled into every content script. But as comment 8 describes, this could lead to much worse performance overall due to more roundtrips. The amount of logic for a content script to save the ID of a node is pretty small. Maybe 10 lines of code. My only fear is that they would forget to clear the mapping in the "after" callback. I guess I'm not opposed to providing an ID-based API since that would ensure that it uses weak references. But I do think we need to offer the option of a content script that runs first.

Wladimir Palant

Comment 17

•

9 years ago

(In reply to Bill McCloskey (:billm) from comment #16) > The amount of logic for a content script to save the ID of a node is pretty > small. Maybe 10 lines of code. My only fear is that they would forget to > clear the mapping in the "after" callback. Anybody doing that should be using WeakMap, same is true for any functionality associating data with nodes. Not everybody will use it of course, but a specialized API is going to have the same issue. So maybe that's a documentation/review issue?

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 18

•

9 years ago

(In reply to Wladimir Palant from comment #17) > (In reply to Bill McCloskey (:billm) from comment #16) > > The amount of logic for a content script to save the ID of a node is pretty > > small. Maybe 10 lines of code. My only fear is that they would forget to > > clear the mapping in the "after" callback. > > Anybody doing that should be using WeakMap, same is true for any > functionality associating data with nodes. Not everybody will use it of > course, but a specialized API is going to have the same issue. So maybe > that's a documentation/review issue? I don't think there's a way to use a WeakMap for this. You need an ID -> Element map. WeakMaps only allow you to do Element -> ID.

Wladimir Palant

Comment 19

•

9 years ago

You are right, I missed that. We actually have to go through all elements in order to retrieve the data we attached via WeakMaps, not really an option here. Yes, there is no easy way to implement this then, JavaScript doesn't have a map where values are being stored as weak pointers.

Boris Zbarsky [:bzbarsky]

Comment 21

•

9 years ago

Sorry for the lag here.... Attaching an extra ID to various elements directly in the DOM could be done, but it wouldn't be simple (because we don't want to attach it to all nodes for memory reasons), and the putative document.getLoadingNode would not be fast unless we were willing to pay the cost of maintaining yet another hashtable all the time just on the off chance this stuff gets used. What we could do, though is have API on a document that works like this: 1) Start with an element. 2) Ask the document explicitly to create an integer handle for that element and hand it to you. This allows you to ship the handle around and then later query the element. The document will handle holding a weak ref so the element can go away. The handle can be used to retrieve the element later, if it still exists. This would require explicit action to create handles for the elements you care about, but the performance cost will be limited to those elements someone cares about.... Would that work for people?

Flags: needinfo?(bzbarsky)

Comment 22

•

9 years ago

I still think this is unnecessarily complicated and slow because it is founded on the assumption that such a ID would be generated, shuttled up to the background script, the request gets suspended there, the ID is sent back down to a content script, gets used there to inspect the DOM and the result is sent to the background script to resume the request or maybe not. That's 3 IPCs per resource loaded. Maybe 4 if the decision whether to block the element is sent back down to a content script after the 3rd step. Instead something like webrequest.onContentRequest.addListener(script, filter, {blocking: boolean}) would do the job. Whether it gets executed synchronously before the request so it can inspect and block or after the request so it can modify the DOM is decided by the blocking argument. The filters could be made more powerful by taking CSS selectors, page URLs (in addition to the target URLs) and a flag whether it applies to frames so that many requests can be prefiltered before they ever need to hit a content script sandbox. In the end some addon that is designed to interact with every single content page and frame will do so anyway. There is no point in adding all those indirections through element IDs. Other addons that only want to apply code to specific sites can use the filters And everything else that just filters requests without touching the DOM can use webRequest.onBeforeRequest as they do now. Well, that's the webextensions perspective. From a platform perspective I think it would require the ability for nsIContentPolicy to SUSPEND requests in addition to ACCEPT/REJECT and then resume them later. That way scripts could be called a tick later in the task loop where the "shouldLoad() can be called while the DOM and layout of the document involved is in an inconsistent state." restrictions don't apply anymore and then the request can be resumed or rejected asynchronously.

Giorgio Maone [:ma1]

Reporter

Comment 23

•

9 years ago

(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #21) > What we could do, though is have API on a document that works like this: > > 1) Start with an element. > 2) Ask the document explicitly to create an integer handle for that element > and hand it to you. > [...] > Would that work for people? That would work for my use case (custom click to play), together with the ability to attach an event handler, a callback or something to do this whenever a DOM element triggers a load and to attach the handle to the data passed to webRequest.onBeforeRequest(), or even better with this being done automatically for any element triggering a load if webRequest listeners are in use (which doesn't seem a big burden to me, given the machinery a network load puts in motion anyway).

Boris Zbarsky [:bzbarsky]

Comment 24

•

9 years ago

I can't speak to the bigger picture well here; I haven't thought it through yet. Changing nsIContentPolicy to be able to suspend, or just to respond async, is a bunch of work, but could be done. We did it for necko redirect notifications... The fact that content policy checks are mostly hidden in asyncOpen now might make it simpler. We'd need to change the Gecko-facing API to async-cancel instead of returning error from AsyncOpen, and fix consumers (image loading is the obvious one I know about) who depend on the current behavior. Oh, and navigation to a new window target; I have no idea how to make _that_ async. :( Anyway, if that's done, then making the content policy itself async is not so bad. Again, a lot of work, but may be well worth doing. Making content policy checks async, so they don't run at unsafe times is something we've literally wanted to do for over a decade now.

Comment 25

•

9 years ago

Kris will file another bug to allow the element identification and we'll make that the blocker for 57.

Flags: needinfo?(kmaglione+bmo)

Updated

•

9 years ago

webextensions: ? → ---

Boris Zbarsky [:bzbarsky]

Comment 26

•

9 years ago

Did comment 25 ever happen? Right now this looks like it's not being tracked anywhere for 57...

Flags: needinfo?(amckay)

Comment 27

•

9 years ago

Kris said he did create one, not sure where though. He's needinfo'd on this bug.

Flags: needinfo?(amckay)

Rob Wu [:robwu]

Updated

•

8 years ago

See Also: → 1325814

Updated

•

8 years ago

Flags: needinfo?(kmaglione+bmo) → needinfo?(amckay)

Wladimir Palant

Updated

•

8 years ago

See Also: → 1418975

Updated

•

8 years ago

Flags: needinfo?(amckay)

Priority: P2 → P3

Updated

•

7 years ago

Product: Toolkit → WebExtensions

Luca Greco [:rpl] [:luca] [:lgreco]

Comment 28

•

6 years ago

This bug would need a new mentor, :billm would not be able to mentor this anymore.

Mentor: bill.mccloskey

Updated

•

3 years ago

Severity: normal → S3

BugBot [:suhaib / :marco/ :calixte]

Comment 29

•

3 years ago

The severity field for this bug is relatively low, S3. However, the bug has 15 votes.
:robwu, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(rob)

BugBot (nomail) [:suhaib / :marco/ :calixte]

Comment 30

•

3 years ago

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(rob)

Marco Castelluccio [:marco]

Updated

•

3 years ago

Severity: S3 → --

Type: defect → enhancement

Shane Caraveo (:mixedpuppy)

Updated

•

3 years ago

Severity: -- → N/A

Tomislav Jovanovic :zombie

Updated

•

3 years ago

Severity: -- → N/A

You need to log in before you can comment on or make changes to this bug.