Closed Bug 1255894 Opened 8 years ago Closed 7 years ago

Support filtering streaming response data via webRequest API

Categories

(WebExtensions :: Request Handling, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED
mozilla57
Iteration:
48.2 - Apr 4
webextensions +

People

(Reporter: ma1, Assigned: kmag)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: dev-doc-complete, Whiteboard: [webRequest] triaged, [we-enterprise])

Attachments

(11 files, 4 obsolete files)

59 bytes, text/x-review-board-request
mixedpuppy
: review+
Details
59 bytes, text/x-review-board-request
baku
: review+
Details
59 bytes, text/x-review-board-request
mixedpuppy
: review+
baku
: review+
Details
59 bytes, text/x-review-board-request
mixedpuppy
: review+
dragana
: review+
Details
59 bytes, text/x-review-board-request
mixedpuppy
: review+
baku
: review+
Details
59 bytes, text/x-review-board-request
mixedpuppy
: review+
Details
59 bytes, text/x-review-board-request
mixedpuppy
: review+
Details
59 bytes, text/x-review-board-request
kanru
: review+
Details
59 bytes, text/x-review-board-request
dragana
: review+
billm
: review+
Details
59 bytes, text/x-review-board-request
dragana
: review+
Details
13.58 KB, patch
Details | Diff | Splinter Review
Extensions like NoScript or  EPUBReader ( https://addons.mozilla.org/en-US/firefox/addon/epubreader ), whose developer has recently contacted me to investigate the feasibility of a WebExtensions migration, need to manipulate HTTP responses on the fly and possibly divert them from their original consumer.

The Chrome extensions API does support a requestBody property in the webRequest.onBeforeRequest event (which, strangely enough, represents raw data a single array of bytes, with no way to consume it in chunks when it's a very large upload), allows for request and response headers manipulation, but has no way to analyze, let alone modify, the response payload.

Like requestBody, responseChunk be potentially very large and its consumers may require asynchronous processing not to block the parent process, therefore this too depends on bug 1254204.



Therefore I propose to extend the webRequest API with an additional event, tentatively called onResponseData, which is called one or more times as the response data is streamed down. The details object would contain the usual properties, and especially requestId, allowing clients to keep per-request state as needed, plus a new property called responseChunk, i.e. an array filled with the response bytes in chunks, as they're made available to the onDataAvailable() method of nsIStreamListener attached to the underlying HTTP channel.

Since the webRequest API guarantees that either the onCompleted or the onErrorOccurred listener is called when a request is done, we don't need any special EOF signaling: the client would handle its cleanup operations in those events' callbacks.
No longer blocks: 1201979
Whiteboard: [webRequest] → [webRequest] triaged
Flags: blocking-webextensions+
Priority: -- → P2
Flags: blocking-webextensions+
Please update us on the progress of onResponseData implementation

Our add-on needs to read incoming data on the fly
Component: WebExtensions: Untriaged → WebExtensions: Request Handling
We need to make sure this work doesn't cause performance problems.  We've done a lot of work in necko to allow off-main thread delivery (see bug 497003)--currently the HTML parser and the image parser both use that to improve performance.  Currently that work gets disabled if we detect that the nsITraceableChannel (the standard--and IIUC the only--way to intercept and modify data before OnDataAvailable is called on the original listener). See bug 908398 and bug 908405)

So we want to 1) discourage use of this API unless it's really needed (and I get that it may be for some use cases), and 2) make sure that we don't bake in any performance hit when these addons aren't being used.  I'm guessing we'll get #2 for free--that's how things already work (necko does a search through the listener chain to make sure all listeners are off-main-ready before switching to off-main delivery: the presence of any JS-implemented listeners (added by nsITraceableChannel), for instance, causes code to stay main-thread only).

So maybe there's no issue here--but I want to make sure. SC, can you keep an eye on the architecture used here and make sure we don't break off-main delivery any more than necessary?  If you need help figuring that out ask Honza.
Flags: needinfo?(schien)
The extension code that's consuming this will be running in its own process, so we can probably implement it in a way that's compatible with off-main-thread delivery. Either way, I definitely agree that we want to discourage this whenever it's not absolutely necessary.
OOP addons are still single-threaded, maybe it should be handled by workers?
(In reply to The 8472 from comment #4)
> OOP addons are still single-threaded, maybe it should be handled by workers?

It's not so much an issue of them being single-threaded as it is of not blocking the main UI thread while the data is being processed. Granted, they'd still block any UI that's running in the extension process, but that's a much smaller issue.

It definitely would be nice to be able to push these kinds of APIs into something like a service worker in the future, but that's a ways down the road.
Not quite as simple. If an extension takes time to mangle each request this can also slow down page loads. Parallelizing it could improve user experience.
If the data processing takes longer than the data transfer from the network, that's probably true. But workers don't buy us anything unless we somehow spawned multiple workers to handle parallel requests (which presents its own whole set of problems), and we'd like to discourage extensions from doing things like that in any case.
Yes, the idea was to have multiple workers. But even with a single worker it would still improve throughput if multiple addons use this feature.
Only if they happen to be running in the same process. But, again, if we have more consumers of this API than we have extension processes, or those extensions process data significantly slower than we can read it from the network, we're doing something wrong. That's not how this API is meant to be used.
This bug is blocking me from porting my extension to WebExtensions. I have an extension that scans JPEGs that are coming over the wire to alert on them containing EXIF GPS information (which is potentially privacy-compromising). My extension currently uses an nsIObserver and snoops incoming responses with JPEG MIME types. As far as I can tell, there isn't a way to do this using WebExtensions.

The source is here: https://github.com/allanlw/gps-detect/ and the listing is: https://addons.mozilla.org/en-US/firefox/addon/gpsdetect/
webextensions: --- → ?
Assignee: g.maone → nobody
Whiteboard: [webRequest] triaged → [webRequest] triaged, [we-enterprise]
Kris - Thanks for merging the bugs. My addon Integrated Inbox will cease to work, as it sounds like is the case for many others, unless this is addressed. 

Have a great weekend,
Michael
Attached patch webrequest_oda_uint8.diff (obsolete) — Splinter Review
Would something like this be what we're looking for, once bug 1261585 is fixed?

This patch just adds an onResponseData listener to WebRequest, which can optionally be "blocking" if response-rewriting is desired rather than simple inspection. It passes the data into the listener as a Uint8Array named responseBytes.

It just builds on the existing StartStopListener::OnDataAvailable, only doing extra work if an onResponseData listener was detected (hopefully I did that correctly).
Attachment #8845654 - Flags: review?(kmaglione+bmo)
Attached patch webrequest_oda_uint8.diff (obsolete) — Splinter Review
Here's a fresh version of the last patch, which changes the onDataAvailable handler to be a bit more solid, now that I've had a chance to test it a bit more thoroughly.
Attachment #8845654 - Attachment is obsolete: true
Attachment #8845654 - Flags: review?(kmaglione+bmo)
Attachment #8846323 - Flags: feedback?(kmaglione+bmo)
I started to look at bug 1326298 and was going to put some focus on that.  I'd rather not see this land simply to have to convert it to native code right away.
Comment on attachment 8846323 [details] [diff] [review]
webrequest_oda_uint8.diff

Review of attachment 8846323 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry I didn't see this sooner. As commented elsewhere in the bug, this needs to be implemented in C++, with careful attention to thread safety and performance. I thought the bug was already assigned to me.
Attachment #8846323 - Flags: feedback?(kmaglione+bmo) → feedback-
Assignee: nobody → kmaglione+bmo
Comment on attachment 8846323 [details] [diff] [review]
webrequest_oda_uint8.diff

Oh, no problem! One way or the other, I figured it would be worth providing a patch for others to experiment with if they need such an API for their extensions.
Attachment #8846323 - Attachment is obsolete: true
Comment on attachment 8850697 [details]
Bug 1255894: Part 1 - Allow applying content conversions in parent rather than child when necessary.

https://reviewboard.mozilla.org/r/123230/#review125854

This is not going to work. We do not apply conversion always. Only when the last listener in a listener chain (that is URILoader ans ExternalHelper) is called we know whether to apply conversion or not, e.g. this has to do with saving gzip files as gzip. And then there is UnknownDecoder that calls the last listener only when some OnDataAvailable is called, e.g. respons is already transfer to the child so nsITracableChannel cannot modify it. For more info please read bug 1261585.

So let me finish bug 1342386 and 1261585.
Bug 1261585 is done, it is only waiting for bug 1342386.
I have a patch for bug 1342386, but i did not have time to test it yet. It touch some very old code that I am afraid to break.(This bug transforms UnknownDecoder from a stream listener to a sniffer).
Attachment #8850697 - Flags: review?(dd.mozilla) → review-
Sorry, I'm not sure when I'll be able to get to this Kris. The last few days I was really busy preparing for a workweek next week, and I expect I won't have much time at the workweek itself. If there's anyone else who can review, please feel free to give it to them. The earliest I can likely review is April 3rd.
Shane and I will also be at a work week next week. I'll probably have time to be responsive, but if Shane doesn't think he'll have time to get to it either, it can probably wait until after.
Flags: needinfo?(mixedpuppy)
I've been digging through a couple patches but am wanting to wrap my head around the entirety before reviewing any single patch.  Was last looking at all the state engine stuff.  I don't expect to have time over work week to finish reviews.
Flags: needinfo?(mixedpuppy)
Attachment #8850699 - Flags: review?(wmccloskey) → review?(amarchesini)
Attachment #8850700 - Flags: review?(wmccloskey) → review?(amarchesini)
Attachment #8850702 - Flags: review?(wmccloskey) → review?(amarchesini)
Baku, Bill suggested you might be a good person to review this. Let me know if you have any questions, or you don't feel like you're the right person.

Thanks!
webextensions: ? → +
Comment on attachment 8850699 [details]
Bug 1255894: Part 3 - Create skeleton IPDL framework for OOP stream filters.

https://reviewboard.mozilla.org/r/123234/#review127646

This needs a better name. PStream looks like IPCStream that is about nsIInputStream.
Can you rename it?
I guess the fun part is in some other patch :)

::: toolkit/components/extensions/webrequest/PStreamFilter.ipdl:9
(Diff revision 1)
> +
> +include protocol PBackground;
> +
> +namespace mozilla {
> +namespace extensions {
> +

Add a comment about what this all is about.
Attachment #8850699 - Flags: review?(amarchesini) → review+
Comment on attachment 8850700 [details]
Bug 1255894: Part 4 - Implement StreamFilter IPC protocol.

https://reviewboard.mozilla.org/r/123236/#review127664

Ok, This is a complex patch. I need to review it more than once.
Here the first comments. I have a couple of questions:

1. Why do we need to have a double logic in the child and in the parent? Can we have it only in the child?
I would like to have a 'simpler' parent that does only what the child requires. Basically, I think we can have the parent without mState.
If we do that, we just need a boolean to know if the channel is suspended or not in case we need to resume it.

2. We must be sure that we don't send messages if the child crashes. ActorDestroyed() should be used to avoid that. Currently this is not implemented.

3. IO Thread or main-thread? There are a couple of naming issues.

::: toolkit/components/extensions/webrequest/PStreamFilter.ipdl:36
(Diff revision 1)
> +
> +  async StartRequest();
> +  async Data(uint8_t[] data);
> +  async StopRequest(nsresult aStatus);
> +
>    async __delete__();

Can you write here a comment explaining why sending __delete__ in the child is safe? What does prevent the parent to send another message in the meantime?

::: toolkit/components/extensions/webrequest/StreamFilterBase.h:22
(Diff revision 1)
> +{
> +public:
> +  typedef nsTArray<uint8_t> Data;
> +
> +protected:
> +  class BufferedData : public LinkedListElement<BufferedData> {

{ in a new line

::: toolkit/components/extensions/webrequest/StreamFilterBase.h:32
(Diff revision 1)
> +  };
> +
> +  LinkedList<BufferedData> mBufferedData;
> +
> +  inline void
> +  BufferData(Data&& aData) {

same here. \n{

::: toolkit/components/extensions/webrequest/StreamFilterChild.h:30
(Diff revision 1)
>  {
>  public:
>    NS_INLINE_DECL_REFCOUNTING(StreamFilterChild)
>  
> -  StreamFilterChild() {}
> +  StreamFilterChild()
> +    : mState(State::Uninitialized)

mNextState ?

::: toolkit/components/extensions/webrequest/StreamFilterChild.h:34
(Diff revision 1)
> -  StreamFilterChild() {}
> +  StreamFilterChild()
> +    : mState(State::Uninitialized)
> +    , mReceivedOnStop(false)
> +  {}
> +
> +  enum class State {

can you write comments explaining each state?

::: toolkit/components/extensions/webrequest/StreamFilterChild.h:49
(Diff revision 1)
> +    Disconnecting,
> +    Disconnected,
> +    Error,
> +  };
> +
> +  void Suspend(ErrorResult& rv);

aRv

::: toolkit/components/extensions/webrequest/StreamFilterChild.h:52
(Diff revision 1)
> +  };
> +
> +  void Suspend(ErrorResult& rv);
> +  void Resume(ErrorResult& rv);
> +  void Disconnect(ErrorResult& rv);
> +  void Close(ErrorResult& rv);

aRv to all these methods.

::: toolkit/components/extensions/webrequest/StreamFilterChild.h:57
(Diff revision 1)
> +  void Close(ErrorResult& rv);
> +  void Cleanup();
> +
> +  void Write(Data&& aData, ErrorResult& aRv);
> +
> +  inline State GetState() const

inline is not needed in class definition

::: toolkit/components/extensions/webrequest/StreamFilterChild.h:66
(Diff revision 1)
>  
>  protected:
> +  virtual IPCResult RecvInitialized(const bool& aSuccess) override;
> +
> +  virtual IPCResult RecvStartRequest() override;
> +  virtual IPCResult RecvData(Data&& data) override;

aData

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:34
(Diff revision 1)
> +  case State::Disconnected:
> +    break;
> +
> +  default:
> +    ErrorResult rv;
> +    Disconnect(rv);

If you don't care about the result, use IgnoreErrorResult instead. the DTOR of ErrorResult, if it contains and error, does MOZ_CRASH.

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:44
(Diff revision 1)
> +/*****************************************************************************
> + * State change methods
> + *****************************************************************************/
> +
> +void
> +StreamFilterChild::Suspend(ErrorResult& rv)

aRv, here and elsewhere.

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:55
(Diff revision 1)
> +
> +    SendSuspend();
> +    break;
> +
> +  case State::Suspending:
> +    switch (mNextState) {

instead having 2 nested switch, what about:
if (mNextStep == State::Suspended && mNextState == State::Resuming) { ... } else { aRv.Throw(NS_ERROR_FAILURE); }

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:62
(Diff revision 1)
> +    case State::Resuming:
> +      mNextState = State::Suspended;
> +      break;
> +
> +    default:
> +      rv = NS_ERROR_FAILURE;

rv.Throw(NS_ERROR_FAILURE);

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:68
(Diff revision 1)
> +      return;
> +    }
> +    break;
> +
> +  case State::Resuming:
> +    switch (mNextState) {

same comments here.

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:84
(Diff revision 1)
> +
> +  case State::Suspended:
> +    break;
> +
> +  default:
> +    rv = NS_ERROR_FAILURE;

aRv.Throw...

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:90
(Diff revision 1)
> +    break;
> +  }
> +}
> +
> +void
> +StreamFilterChild::Resume(ErrorResult& rv)

aRv

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:91
(Diff revision 1)
> +  }
> +}
> +
> +void
> +StreamFilterChild::Resume(ErrorResult& rv)
> +{

1. aRv
2. can we add MOZ_ASSERT about the mState?

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:126
(Diff revision 1)
> +
> +  FlushBufferedData();
> +}
> +
> +void
> +StreamFilterChild::Disconnect(ErrorResult& rv)

aRv

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:326
(Diff revision 1)
> +
> +    default:
> +      aRv = NS_ERROR_FAILURE;
> +      return;
> +    }
> +    break;

Why are we calling SendWrite() when we are in suspending or suspended mode? (I still need to read the parent side).

It seems that basically we are sending data always if we are not disconnected/closed.

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:39
(Diff revision 1)
>    NS_DECL_THREADSAFE_ISUPPORTS
>  
>    explicit StreamFilterParent(uint64_t aChannelId, const nsAString& aAddonId);
>  
> +  enum class State {
> +    Uninitialized,

Comments here.

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:60
(Diff revision 1)
>    void Init(already_AddRefed<nsIContentParent> aContentParent);
>  
>  protected:
>    virtual ~StreamFilterParent();
>  
> +  virtual IPCResult RecvWrite(Data&& data) override;

aData

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:83
(Diff revision 1)
>    virtual void ActorDestroy(ActorDestroyReason aWhy) override;
>  
> +  void Broken();
> +
> +  inline void
> +  CheckResult(bool result)

aResult

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:110
(Diff revision 1)
> +    MOZ_ASSERT(NS_IsMainThread());
> +  }
> +
> +  // Ensures that a strong reference to this instance is held alive while a
> +  // method-based or closure-based runnable is being dispatched.
> +  class KungFuDeathGrip : public Runnable {

{ in a new line

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:112
(Diff revision 1)
> +
> +  // Ensures that a strong reference to this instance is held alive while a
> +  // method-based or closure-based runnable is being dispatched.
> +  class KungFuDeathGrip : public Runnable {
> +  public:
> +    KungFuDeathGrip(StreamFilterParent* aGrip, already_AddRefed<nsIRunnable> aRunnable);

make it private

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:126
(Diff revision 1)
> +    }
> +
> +    NS_IMETHOD Run() override;
> +
> +  protected:
> +    virtual ~KungFuDeathGrip();

= default

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:130
(Diff revision 1)
> +  protected:
> +    virtual ~KungFuDeathGrip();
> +
> +  private:
> +    RefPtr<StreamFilterParent> mGrip;
> +    RefPtr<nsIRunnable> mRunnable;

Runnable. Why nsIRunnable? If it is nsIRunnable, should it be nsCOMPtr?

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:135
(Diff revision 1)
> +    RefPtr<nsIRunnable> mRunnable;
> +  };
> +
> +  // Dispatches a runnable to the given event target, while making sure a strong
> +  // reference to this instance is kept alive until it completes.
> +  inline void

remove inline here an elsewhere.

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:138
(Diff revision 1)
> +  // Dispatches a runnable to the given event target, while making sure a strong
> +  // reference to this instance is kept alive until it completes.
> +  inline void
> +  SafeDispatch(nsIEventTarget* aEventTarget, already_AddRefed<nsIRunnable> aRunnable)
> +  {
> +      nsCOMPtr<nsIRunnable> runnable = aRunnable;

indentation

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:158
(Diff revision 1)
> +
> +  template<typename Function>
> +  inline void
> +  RunOnWorkerThread(Function&& aFunc)
> +  {
> +    SafeDispatch(mWorkerThread,

same line?

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:166
(Diff revision 1)
> +
> +  template<typename Function>
> +  inline void
> +  RunOnIOThread(Function&& aFunc)
> +  {
> +    SafeDispatch(mIOThread,

same line?

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:46
(Diff revision 1)
> + *****************************************************************************/
>  
>  StreamFilterParent::StreamFilterParent(uint64_t aChannelId, const nsAString& aAddonId)
>    : mChannelId(aChannelId)
>    , mAddonId(NS_Atomize(aAddonId))
> +  , mWorkerThread(NS_GetCurrentThread())

Call it PBackgroundThread

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:47
(Diff revision 1)
>  
>  StreamFilterParent::StreamFilterParent(uint64_t aChannelId, const nsAString& aAddonId)
>    : mChannelId(aChannelId)
>    , mAddonId(NS_Atomize(aAddonId))
> +  , mWorkerThread(NS_GetCurrentThread())
> +  , mIOThread(do_GetMainThread())

main thread? Why?

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:64
(Diff revision 1)
>  void
>  StreamFilterParent::Init(already_AddRefed<nsIContentParent> aContentParent)
>  {
> -  nsCOMPtr<nsIContentParent> contentParent = aContentParent;
> -  NS_ReleaseOnMainThread(contentParent.forget());
> +  AssertIsWorkerThread();
> +
> +  nsIContentParent* contentParent = aContentParent.take();

Instead having this raw pointer, can you do something similar to what we have in BackgroundImplParent with CheckPrincipalRunnable?

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:71
(Diff revision 1)
> +    DoInit(contentParent);
> +  });
> +}
> +
> +void
> +StreamFilterParent::DoInit(nsIContentParent* aContentParent)

call it InitOnMainThread

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:79
(Diff revision 1)
> +
> +  RefPtr<nsIContentParent> contentParent = dont_AddRef(aContentParent);
> +
> +  bool success = false;
> +  auto guard = MakeScopeExit([&] {
> +    RunOnWorkerThread([=] {

RunOnBackgroundThread

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:103
(Diff revision 1)
> + *****************************************************************************/
> +
> +void
> +StreamFilterParent::Broken()
> +{
> +  AssertIsWorkerThread();

rename worker to backgroundThread here. 'Worker' reminds DOM Workers.

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:139
(Diff revision 1)
> +
> +  if (mState == State::TransferringData) {
> +    RunOnMainThread([this] {
> +      mChannel->Suspend();
> +    });
> +    RunOnWorkerThread([this] {

why a runnable here?

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:141
(Diff revision 1)
> +    RunOnMainThread([this] {
> +      mChannel->Suspend();
> +    });
> +    RunOnWorkerThread([this] {
> +      mState = State::Suspended;
> +      CheckResult(SendSuspended());

This is wrong. here you have a runnable. But what about if the child is already crashed? You should deal with ActorDestroyed().

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:156
(Diff revision 1)
> +
> +  if (mState == State::Suspended) {
> +    RunOnMainThread([this] {
> +      mChannel->Resume();
> +    });
> +    RunOnWorkerThread([this] {

same here.
Attachment #8850700 - Flags: review?(amarchesini) → review-
Comment on attachment 8850702 [details]
Bug 1255894: Part 6 - Implement StreamFilter DOM bindings.

https://reviewboard.mozilla.org/r/123240/#review128652

::: dom/webidl/StreamFilter.webidl:6
(Diff revision 1)
> +/* -*- Mode: IDL; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this file,
> + * You can obtain one at http://mozilla.org/MPL/2.0/.
> + */
> +

Documentation, URL of what this spec is about. Or a good comment explaining why we need it and where it is used.

::: dom/webidl/StreamFilterDataEvent.webidl:5
(Diff revision 1)
> +/* -*- Mode: IDL; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this file,
> + * You can obtain one at http://mozilla.org/MPL/2.0/. */
> +

documentation here as well.

::: toolkit/components/extensions/webrequest/StreamFilter.h:14
(Diff revision 1)
> +
> +#include "mozilla/dom/BindingDeclarations.h"
> +#include "mozilla/dom/StreamFilterBinding.h"
> +#include "mozilla/extensions/StreamFilterChild.h"
> +
> +#include "jsapi.h"

why do you need this?

::: toolkit/components/extensions/webrequest/StreamFilter.h:43
(Diff revision 1)
> +  static already_AddRefed<StreamFilter>
> +  Create(GlobalObject& global,
> +         uint64_t aRequestId,
> +         const nsAString& aAddonId);
> +
> +  explicit StreamFilter(nsIGlobalObject* aParent,

no explicit here. we have more than 1 argument.

::: toolkit/components/extensions/webrequest/StreamFilter.h:60
(Diff revision 1)
> +  void GetError(nsAString& aError)
> +  {
> +    aError = mError;
> +  }
> +
> +  inline StreamFilterStatus

no inline if it's already inline

::: toolkit/components/extensions/webrequest/StreamFilter.h:61
(Diff revision 1)
> +  {
> +    aError = mError;
> +  }
> +
> +  inline StreamFilterStatus
> +  Status() const {

{ in a new line

::: toolkit/components/extensions/webrequest/StreamFilter.h:66
(Diff revision 1)
> +  Status() const {
> +    MOZ_ASSERT(mActor);
> +    return mActor->Status();
> +  }
> +
> +  inline void Suspend(ErrorResult& aRv)

same comment about the inline

::: toolkit/components/extensions/webrequest/StreamFilter.h:74
(Diff revision 1)
> +    return mActor->Suspend(aRv);
> +  }
> +
> +  inline void Resume(ErrorResult& aRv)
> +  {
> +    MOZ_ASSERT(mActor);

nothing guarantees that mActor exists.

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:115
(Diff revision 1)
> + *****************************************************************************/
> +
> +void
> +StreamFilter::Write(const ArrayBufferOrUint8Array& aData, ErrorResult& aRv)
> +{
> +  nsTArray<uint8_t> data;

make it FallibleTArray

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:121
(Diff revision 1)
> +
> +  if (aData.IsArrayBuffer()) {
> +    const ArrayBuffer& buffer = aData.GetAsArrayBuffer();
> +
> +    buffer.ComputeLengthAndData();
> +    data.SetLength(buffer.Length());

SetLength() could fail if FallibleTArray. return OOM.

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:127
(Diff revision 1)
> +    memcpy(data.Elements(), buffer.Data(), buffer.Length());
> +  } else if (aData.IsUint8Array()) {
> +    const Uint8Array& array = aData.GetAsUint8Array();
> +
> +    array.ComputeLengthAndData();
> +    data.SetLength(array.Length());

same here.

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:145
(Diff revision 1)
> +
> +void
> +StreamFilter::FireEvent(const nsAString& aType)
> +{
> +  AutoEntryScript aes(mParent, "StreamFilter event");
> +  JSContext* cx = aes.cx();

remove this.

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:147
(Diff revision 1)
> +StreamFilter::FireEvent(const nsAString& aType)
> +{
> +  AutoEntryScript aes(mParent, "StreamFilter event");
> +  JSContext* cx = aes.cx();
> +
> +  RootedDictionary<EventInit> init(cx);

it doesn't need to be rooted.

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:167
(Diff revision 1)
> +  JSContext* cx = aes.cx();
> +
> +  RootedDictionary<StreamFilterDataEventInit> init(cx);
> +  init.mBubbles = false;
> +  init.mCancelable = false;
> +  init.mData.Init(ArrayBuffer::Create(cx, this, aData.Length(), aData.Elements()));

ArrayBuffer::Create can fail.

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:368
(Diff revision 1)
> +  case State::Resuming:
> +  case State::Suspending:
> +    switch (mNextState) {
> +    case State::TransferringData:
> +    case State::Resuming:
> +      return StreamFilterStatus::Transferringdata;

This means that if I do:

foo.resume();
console.log(foo.status) -> it returns transferringData

::: toolkit/components/extensions/webrequest/StreamFilterEvents.h:42
(Diff revision 1)
> +  static already_AddRefed<StreamFilterDataEvent>
> +  Constructor(EventTarget* aEventTarget,
> +              const nsAString& aType,
> +              const StreamFilterDataEventInit& aParam);
> +
> +  static inline already_AddRefed<StreamFilterDataEvent>

inline

::: toolkit/components/extensions/webrequest/StreamFilterEvents.h:52
(Diff revision 1)
> +  {
> +    nsCOMPtr<EventTarget> target = do_QueryInterface(aGlobal.GetAsSupports());
> +    return Constructor(target, aType, aParam);
> +  }
> +
> +  inline void GetData(JSContext* aCx, JS::MutableHandleObject aResult) {

1. inline
2. {
Attachment #8850702 - Flags: review?(amarchesini) → review-
Comment on attachment 8850700 [details]
Bug 1255894: Part 4 - Implement StreamFilter IPC protocol.

https://reviewboard.mozilla.org/r/123236/#review127664

1) I tried to keep as much of the state logic in the child as possible. There are still a lot of things we need to keep track of in the parent, though, particularly for the sake of disconnecting the filter from the child and having communication continue unfiltered in the parent.

2) You're right, I meant to implement ActorDestroyed and handle it the way we handle a failure to send.

3) IO and main thread are two separate threads. The IO thread is the thread where channel IO happens. That's usually the main thread, but it can be retargeted for some channels. The main thread is where we need to do all non-IO interaction with the channel.

> Can you write here a comment explaining why sending __delete__ in the child is safe? What does prevent the parent to send another message in the meantime?

We don't send delete in the child. The child sends `Close()` to the parent, and then the parent sends `__delete__` to the child when it's finished.

> mNextState ?

It shouldn't matter. Any use of `mNextState` before it's initialized elsewhere would be an error, but I suppose initializing it to a sentinel value wouldn't be a bad idea.

> inline is not needed in class definition

I know, but I generally prefer to include it anyway. But I don't really have a strong opinion.

> If you don't care about the result, use IgnoreErrorResult instead. the DTOR of ErrorResult, if it contains and error, does MOZ_CRASH.

I think crashing on error is probably what we want here. `Disconnect()` should only ever return an error if we call it in an invalid state.

> instead having 2 nested switch, what about:
> if (mNextStep == State::Suspended && mNextState == State::Resuming) { ... } else { aRv.Throw(NS_ERROR_FAILURE); }

I don't really have a strong opinion on it. The nested switch is consistent with the other places where we need to do more complicated checks, and has the benefit of compiler warnings if we forget a state. But I'd be willing to switch to an `if`.

> 1. aRv
> 2. can we add MOZ_ASSERT about the mState?

What assert would we add? This is called from JS, so we really want to throw rather than crash if the state isn't what we expect.

> Why are we calling SendWrite() when we are in suspending or suspended mode? (I still need to read the parent side).
> 
> It seems that basically we are sending data always if we are not disconnected/closed.

So, basically, `Suspend()` only stops new incoming data from being dispatched to the JS listener. When the channel is suspended, the JS side is still allowed to write its own data to the channel.

Once it's called disconnect(), though, the filter is effectively dead. All data passes through on the parent side unfiltered, and the child isn't allowed to write additional data.

> Runnable. Why nsIRunnable? If it is nsIRunnable, should it be nsCOMPtr?

I guess it probably can be `Runnable`. I'm not sure if we currently use it with any other kind of `nsIRunnable`. It should have been `nsCOMPtr`, though, yeah.

> main thread? Why?

Because that's the default IO thread for all channels. It's never redirected to other threads until someone calls `RetargetDeliveryTo()` on the channel.

> Instead having this raw pointer, can you do something similar to what we have in BackgroundImplParent with CheckPrincipalRunnable?

I could, but it gets verbose very fast when we start creating special-purpose runnable classes, and closures don't work well with smart pointers.

> rename worker to backgroundThread here. 'Worker' reminds DOM Workers.

I mostly chose the terminology for consistency with the storage actors, but I'm not especially fond of it either.

> why a runnable here?

Oh, that was originally inside the `RunOnMainThread` runnable, so we didn't wind up doing it until after the channel was actually suspended. I'm no sure why I moved it out.

> This is wrong. here you have a runnable. But what about if the child is already crashed? You should deal with ActorDestroyed().

Hm. That's a good point.
Comment on attachment 8850702 [details]
Bug 1255894: Part 6 - Implement StreamFilter DOM bindings.

https://reviewboard.mozilla.org/r/123240/#review128652

> why do you need this?

I probably don't anymore. A previous version needed it for dealing with typed arrays.

> no explicit here. we have more than 1 argument.

Implicit constructures with multiple arguments are allowed in C++11

> nothing guarantees that mActor exists.

Hm. You're right. Originally I wasn't exposing the object to JS until it was initialized, but now that isn't guaranteed.

> This means that if I do:
> 
> foo.resume();
> console.log(foo.status) -> it returns transferringData

Right, that's intended. From the point of view of the JS consumer, there are no intermediate states. We just need to track them internally so that we don't try to change to a new state before the last state change call is complete.
Attachment #8850697 - Attachment is obsolete: true
Comment on attachment 8850698 [details]
Bug 1255894: Part 2 - Add mozIWebRequestService service for tracking filtered requests.

https://reviewboard.mozilla.org/r/123232/#review129576
Attachment #8850698 - Flags: review?(mixedpuppy) → review+
Comment on attachment 8850704 [details]
Bug 1255894: Part 8 - Add tests for response stream filtering.

https://reviewboard.mozilla.org/r/123244/#review129614
Attachment #8850704 - Flags: review?(mixedpuppy) → review+
Comment on attachment 8850703 [details]
Bug 1255894: Part 7 - Expose response stream filtering via the webRequest API.

https://reviewboard.mozilla.org/r/123242/#review129578

::: toolkit/components/extensions/schemas/web_request.json:210
(Diff revision 2)
> +      },
> +      {
> +        "name": "filterResponseData",
> +        "permissions": ["webRequestBlocking"],
> +        "type": "function",
> +        "description": "...",

add description
Attachment #8850703 - Flags: review?(mixedpuppy) → review+
Comment on attachment 8850700 [details]
Bug 1255894: Part 4 - Implement StreamFilter IPC protocol.

https://reviewboard.mozilla.org/r/123236/#review130056

That took a bit to wrap ones head around, but I feel pretty comfortable with all the state logic at this point.
Attachment #8850700 - Flags: review?(mixedpuppy) → review+
Comment on attachment 8850702 [details]
Bug 1255894: Part 6 - Implement StreamFilter DOM bindings.

https://reviewboard.mozilla.org/r/123240/#review130058

There's a lot in this that I haven't touched before so I don't feel comfortable reviewing the code, baku can finish that up.  From what I understand is going on here, I feel ok with what it does in the context of the overall patch set.
Attachment #8850702 - Flags: review?(mixedpuppy) → review+
Comment on attachment 8850701 [details]
Bug 1255894: Part 5 - Hook up StreamFilterParent as a stream listener.

https://reviewboard.mozilla.org/r/123238/#review130064

r+ for my part
Attachment #8850701 - Flags: review?(mixedpuppy) → review+
Cheers. I owe all of you drinks :)
Comment on attachment 8856132 [details]
Bug 1255894: Allow capturing `this` in lambda expressions when marked safe.

https://reviewboard.mozilla.org/r/128076/#review130936

I don't like this solution to the problem of avoiding raw pointer inside lambda static analysis. 

a) I don't think that the decision of whether or not a raw pointer to a type may be captured should be based on a type level flag. Instead it should be based on the call site.
b) CustomTypeAnnotation is the wrong way to implement this. This implements an inheritable attribute, which is inherited from members and base classes, which is definitely not what we want here (as a member being safe for static analysis doesn't imply that the outer type is at all).

I would suggest for now using a workaround in your code similar to this one (http://searchfox.org/mozilla-central/rev/eace920a0372051a11d8d275bd9b8f14f3024ecd/dom/media/MediaEventSource.h#143-154) in the media code (although I would probably use a name like `AssertSafeForLambdaCapture<T>`). I'm not the biggest fan of the workaround in general, but if you need to capture a raw pointer to a reference counted object by value it's probably the way to do it. This moves the decision to the place where the lambda is created which I think is much closer to the actual problem.
Attachment #8856132 - Flags: review?(michael) → review-
Comment on attachment 8856132 [details]
Bug 1255894: Allow capturing `this` in lambda expressions when marked safe.

https://reviewboard.mozilla.org/r/128076/#review130936

I didn't particularly like the solution either. I'd rather make the decision based on the function we're passing the lambda to, but given how unlikely it is to be used elsewhere in the near future, I decided to go with the simpler solution.

I didn't realize the type attribute would be inherited by other class members, but given that it only allows `this` to be captured, I don't think it could have an effect on them.

The RawPtr solution doesn't help much, since we really need to capture and access `this`, and the SafeDispatch method was meant to avoid the noise of capturing both `this` and a separate RefPtr in so many different places. I wrote it before I realized the clang plugin would be a problem.

I'll try to come up with a better solution that works with the argument passed to the dispatch functions rather than the type.
(In reply to Kris Maglione [:kmag] from comment #61)
> The RawPtr solution doesn't help much, since we really need to capture and
> access `this`, and the SafeDispatch method was meant to avoid the noise of
> capturing both `this` and a separate RefPtr in so many different places. I
> wrote it before I realized the clang plugin would be a problem.

You don't have to capture both - you can just write:

    RefPtr<SelfType> self = this;
    auto lambda = [self] () { ...; self->PrivateMethod(); };

and it will work just fine. I much much prefer that you do that compared to this SafeDispatch method. We also support:

    RefPtr<SelfType> self = this;
    auto lambda = [this, self] () { SOME_MACRO_WHICH_EXPANDS_TO_CALL_A_METHOD_ON_THIS; };

and the static analysis recognizes that the `this` variable's capture is guarded. If you want to use `=` default captures, you can even do:

    RefPtr<SelfType> self = this;
    auto lambda = [=, self] () { /* You can use this in here and it will be safe! */ };

> I'll try to come up with a better solution that works with the argument
> passed to the dispatch functions rather than the type.

I would prefer that you capture the `this` pointer with a strong reference as referred to above. I don't like the idea of adding a new special case into the static analysis just to clean up a couple of lambda call sites which could easily be changed to obviously correctly keep their captures alive.
Comment on attachment 8850700 [details]
Bug 1255894: Part 4 - Implement StreamFilter IPC protocol.

https://reviewboard.mozilla.org/r/123236/#review134332

::: toolkit/components/extensions/webrequest/PStreamFilter.ipdl:9
(Diff revision 3)
>  
>  include protocol PBackground;
>  
>  namespace mozilla {
>  namespace extensions {
>  

We need a comment here describing what this protocol is for.

::: toolkit/components/extensions/webrequest/StreamFilterBase.h:22
(Diff revision 3)
> +{
> +public:
> +  typedef nsTArray<uint8_t> Data;
> +
> +protected:
> +  class BufferedData : public LinkedListElement<BufferedData> {

{ in a new line

::: toolkit/components/extensions/webrequest/StreamFilterBase.h:22
(Diff revision 3)
> +{
> +public:
> +  typedef nsTArray<uint8_t> Data;
> +
> +protected:
> +  class BufferedData : public LinkedListElement<BufferedData> {

final

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:33
(Diff revision 3)
> +  case State::Disconnecting:
> +  case State::Disconnected:
> +    break;
> +
> +  default:
> +    ErrorResult rv;

IgnoredErrorResult

::: toolkit/components/extensions/webrequest/StreamFilterChild.cpp:427
(Diff revision 3)
> + * Glue
> + *****************************************************************************/
> +
>  void
>  StreamFilterChild::ActorDestroy(ActorDestroyReason aWhy)
>  {

you should have a similar policy as the parent actor here. You should call Cleanup here.

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:131
(Diff revision 3)
> +    MOZ_ASSERT(NS_IsMainThread());
> +  }
> +
> +  // Ensures that a strong reference to this instance is held alive while a
> +  // method-based or closure-based runnable is being dispatched.
> +  class KungFuDeathGrip : public Runnable

final

::: toolkit/components/extensions/webrequest/StreamFilterParent.h:131
(Diff revision 3)
> +    MOZ_ASSERT(NS_IsMainThread());
> +  }
> +
> +  // Ensures that a strong reference to this instance is held alive while a
> +  // method-based or closure-based runnable is being dispatched.
> +  class KungFuDeathGrip : public Runnable

This name is super confusing. KungFuDeathGrip is not used for runnable names.
KeepAliveStreamFilterActorRunnable? Find a better name please.

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:24
(Diff revision 3)
>  
> -using namespace mozilla::dom;
> +/*****************************************************************************
> + * Helpers
> + *****************************************************************************/
> +
> +StreamFilterParent::KungFuDeathGrip::KungFuDeathGrip(StreamFilterParent* aGrip,

Why this cannot be inline?

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:31
(Diff revision 3)
> +  : mGrip(aGrip)
> +  , mRunnable(aRunnable)
> +{}
> +
> +NS_IMETHODIMP
> +StreamFilterParent::KungFuDeathGrip::Run()

same here.

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:251
(Diff revision 3)
> +void
> +StreamFilterParent::DoSendData(Data&& aData)
> +{
> +  AssertIsPBackgroundThread();
> +
> +  if (mState == State::TransferringData) {

?
Attachment #8850700 - Flags: review?(amarchesini) → review+
Comment on attachment 8850702 [details]
Bug 1255894: Part 6 - Implement StreamFilter DOM bindings.

https://reviewboard.mozilla.org/r/123240/#review134340

::: toolkit/components/extensions/webrequest/StreamFilter.h:27
(Diff revision 3)
> +
> +class StreamFilterChild;
> +
> +using namespace mozilla::dom;
> +
> +class StreamFilter : public DOMEventTargetHelper

final?

::: toolkit/components/extensions/webrequest/StreamFilter.h:41
(Diff revision 3)
> +  static already_AddRefed<StreamFilter>
> +  Create(GlobalObject& global,
> +         uint64_t aRequestId,
> +         const nsAString& aAddonId);
> +
> +  explicit StreamFilter(nsIGlobalObject* aParent,

explicit is not needed here.

::: toolkit/components/extensions/webrequest/StreamFilter.h:70
(Diff revision 3)
> +  }
> +
> +  void Suspend(ErrorResult& aRv)
> +  {
> +    if (mActor) {
> +      mActor->Suspend(aRv);

you are not including the actor header.
This works only because we unified c++ files.
Move these methods to the cpp file.

::: toolkit/components/extensions/webrequest/StreamFilter.h:103
(Diff revision 3)
> +    } else {
> +      aRv.Throw(NS_ERROR_NOT_INITIALIZED);
> +    }
> +  }
> +
> +  nsISupports* GetParentObject() const { return mParent; }

This is not needed. DOMEventTargetHelper does it for you.

::: toolkit/components/extensions/webrequest/StreamFilter.h:124
(Diff revision 3)
> +
> +private:
> +  void
> +  ConnectToPBackground();
> +
> +  nsCOMPtr<nsIGlobalObject> mParent;

you don't need this.

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:38
(Diff revision 3)
> + * Initialization
> + *****************************************************************************/
> +
> +StreamFilter::StreamFilter(nsIGlobalObject* aParent,
> +                           uint64_t aRequestId,
> +                           const nsAString& aAddonId)

DOMEventTargetHelper(aParent)

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:39
(Diff revision 3)
> + *****************************************************************************/
> +
> +StreamFilter::StreamFilter(nsIGlobalObject* aParent,
> +                           uint64_t aRequestId,
> +                           const nsAString& aAddonId)
> +  : mParent(aParent)

and remove this.

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:139
(Diff revision 3)
> +    ok = ReadTypedArrayData(data, aData.GetAsUint8Array(), aRv);
> +  } else {
> +    MOZ_ASSERT_UNREACHABLE("Argument should be ArrayBuffer or Uint8Array");
> +    return;
> +  }
> +

Do we want to keep the actor alive if ok is false?

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:175
(Diff revision 3)
> +  init.mBubbles = false;
> +  init.mCancelable = false;
> +
> +  auto buffer = ArrayBuffer::Create(cx, this, aData.Length(), aData.Elements());
> +  if (!buffer) {
> +    // TODO: There is no way to recover from this. This chunk of data is lost.

follow up?
If we are OOM I would not dispatch an event.

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:222
(Diff revision 3)
> +NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION_INHERITED(StreamFilter)
> +  NS_INTERFACE_MAP_ENTRY(nsIIPCBackgroundChildCreateCallback)
> +NS_INTERFACE_MAP_END_INHERITING(DOMEventTargetHelper)
> +
> +NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(StreamFilter)
> +  NS_IMPL_CYCLE_COLLECTION_UNLINK(mParent)

remove this

::: toolkit/components/extensions/webrequest/StreamFilter.cpp:226
(Diff revision 3)
> +NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(StreamFilter)
> +  NS_IMPL_CYCLE_COLLECTION_UNLINK(mParent)
> +NS_IMPL_CYCLE_COLLECTION_UNLINK_END
> +
> +NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(StreamFilter, DOMEventTargetHelper)
> +  NS_IMPL_CYCLE_COLLECTION_TRAVERSE(mParent)

remove this

::: toolkit/components/extensions/webrequest/StreamFilterChild.h:139
(Diff revision 3)
>  
>    State mState;
>    State mNextState;
>    bool mReceivedOnStop;
> +
> +  StreamFilter* mStreamFilter;

a comment about why we have this raw pointer. And what keeps alive what.

::: toolkit/components/extensions/webrequest/StreamFilterEvents.h:26
(Diff revision 3)
> +namespace extensions {
> +
> +using namespace JS;
> +using namespace mozilla::dom;
> +
> +class StreamFilterDataEvent : public Event

final?

::: toolkit/components/extensions/webrequest/StreamFilterEvents.h:57
(Diff revision 3)
> +  void GetData(JSContext* aCx, JS::MutableHandleObject aResult)
> +  {
> +    aResult.set(mData);
> +  }
> +
> +  virtual JSObject* WrapObjectInternal(JSContext* aCx,

no virtual here and in the DTOR
Attachment #8850702 - Flags: review?(amarchesini) → review+
Comment on attachment 8850702 [details]
Bug 1255894: Part 6 - Implement StreamFilter DOM bindings.

https://reviewboard.mozilla.org/r/123240/#review134340

> explicit is not needed here.

C++11 allows implicit construction with multiple arguments, so I'd rather keep the `explicit`.

> you are not including the actor header.
> This works only because we unified c++ files.
> Move these methods to the cpp file.

I'm including StreamFilterChild.h, where this class is defined. I'd rather keep these methods inline, since they're only called from a single place in StreamFilterBinding, and it would be a shame to add extra linkage/call overhead for functions as simple as this.

> Do we want to keep the actor alive if ok is false?

I guess not. We should probably close the channel with NS_ERROR_OUT_OF_MEMORY.

> follow up?
> If we are OOM I would not dispatch an event.

I think we should probably close the channel with NS_ERROR_OUT_OF_MEMORY and still dispatch the error event.
Comment on attachment 8850700 [details]
Bug 1255894: Part 4 - Implement StreamFilter IPC protocol.

https://reviewboard.mozilla.org/r/123236/#review134332

> ?

Sorry, I should have added a comment here when I split up the patch. The next patch in the series adds the logic that deals with attaching and manipulating traceable channels. Here, it adds a call to send data to the filtered channel.
Thanks for the reviews :)
I ran two sets of talos tests against these patches. Both sets compare running a skeleton WebExtension with a request listener for all URLs against the same extension that also adds a trivial response body filter to each request.

These are the results for in-process extensions:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=96fd2eb71647f50c0bd082645ed3609f34dcfb82&newProject=try&newRevision=4f727a99b3b767235187de53658623a3212f5265&framework=1&showOnlyImportant=0

And for out-of-process extensions:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=6c3f7be80481d153f06792d6ba56051e8b80edb3&newProject=try&newRevision=0abb3d405985020a1fb12865ab48989ac2ea44e7&framework=1&showOnlyImportant=0

In both cases, there is virtually no performance difference between the normal case and the request filtering case. In a few tests, there appear to be slight regressions (~.5-1ms on tp5o responsiveness), although the numbers for those are quite noisy, and within the margins of error.

The notable exceptions are the main netio metrics, which in almost all cases significantly decreased, and nonmain netio, which slightly increased. I'm not quite sure of the reason for this. My best guess is that it has something to do with redirecting the content conversions to the parent process, but I really don't know.
It seems that responses grabbed from the RAM cache cannot be read or altered by the current patchset; filter.onerror is triggered instead of ondata/onstop, where the event.target is a StreamFilter object with error "Invalid request ID" and status "failed" (and timestamp 0).
I have some related questions:
(Q1) Can the webRequest.onResponseData event be fired before the webRequest.onHeadersReceived event listeners are called?
(Q2) Can the webRequest.onResponseData event be fired before the webRequest.onHeadersReceived event listeners have returned?

To elaborate,

(Q1) If so, what guarantees that no onResponseData events are not dropped, because the following seems possible in theory:
1. (parent process->child process) webRequest.onHeadersReceived should be fired
2. (parent process) webRequest.onResponseData should be fired. There are no listeners yet, so the event is dropped.
3. (child process) webRequest.onHeadersReceived is fired.

(Q2) If not, how can an add-on implement redirect or cancel the request based on the response body?
For example, to implement content-sniffing if desired (e.g. if Content-Type is application/octet-stream and the response body starts with "%PDF", then the response is likely a PDF file).
Flags: needinfo?(g.maone)
We had a long discussion about this. Discussion was about how to implement this.

nsITracableChannel is tricky to make it work correctly in e10s. The problem is decision whether to decode data is made on the child process and in some cases also needs actual data. So we must fire OnDataAvailably (and deliver it from a parent to a child) before we know whether we should decode data or not. (nsITracableChannel should deliver decoded data).

From our discussion it seems that we can deliver data from one child(content process) to webextension process. This will make this much easier.
(In reply to Rob Wu [:robwu] from comment #70)
> I have some related questions:
> (Q1) Can the webRequest.onResponseData event be fired before the
> webRequest.onHeadersReceived event listeners are called?
> (Q2) Can the webRequest.onResponseData event be fired before the
> webRequest.onHeadersReceived event listeners have returned?

If there was such an event I've been tempted to say "No" to both the questions, for the mechanism to be fully effective and flexible.
But as far as I can see from the patches the API is quite different now from the original suggestion: for instance, there's no onResponseData event but we've got a filterResponseData(requestId) function instead.
So I'll let Kris explain when/where it's supposed to be called and how it works.
Flags: needinfo?(g.maone) → needinfo?(kmaglione+bmo)
All, 

I am trying to to understand with all the changes if we will still be able to edit the content (JS/HTML) before it is rendered?

Thanks,
Michael
(In reply to Michael Balazs from comment #73)
> All, 
> 
> I am trying to to understand with all the changes if we will still be able
> to edit the content (JS/HTML) before it is rendered?
> 
> Thanks,
> Michael

Yes.
Is this code out in any release at the moment that we can begin testing with?

Thanks,
Michael
Hi Kris,

I will implement a part of this bug, or I can take over the bug, as you prefer. I just need couple of clarifications on how webextensions work:

I have looked at your patches and most of StreamFilterParent/Child can move as it is to be between Child and WebExtension process.

The only problematic thing is WebRequest.jsm which keeps track of permissions. Is this correct? And it is sitting on the parent process?

HttpChannelChild needs to implement nsITraceableChannel which is easy.

when do we want to allow registration of a streamFilter? before OnStartRequest or also after OnStartRequest?
If we want to allow it after OnStartRequest it is a bit tricky to install the the listener on the right place, i.e. after decoders.
I tried to finish this bug. I rebased the patches but I have problem with the test.
I see an error:
JavaScript error: resource://gre/modules/ExtensionCommon.jsm, line 435: Error: Not implemented

but I really do not know anything about webextensions and I do not know how to fix it.

the error occurs when filterResponseData is called. I have change ext-c-webRequest.js to:

"use strict";

this.webRequest = class extends ExtensionAPI {
  getAPI(context) {
    return {
      webRequest: {
        filterResponseData(requestId) {
          requestId = parseInt(requestId, 10);

          return context.cloneScope.StreamFilter.create(
            requestId, context.extension.id);
        },
      },
    };
  }
};
With robwu_nl's help, Ihave fix the problem from comment 77.
While working on this, I realized one potential problem.

I am working on integrating comment 71. All data will be sent from a content process. If I remember correctly other events (on-request-sent, I do not know if on-response is used) will be sent from the parent process. It will be hard to guarantee synchronization of this events.

Could this be a potential problem? :mao, do you know answer to this question or do you know who knows?
Flags: needinfo?(g.maone)
(In reply to Dragana Damjanovic [:dragana] from comment #79)
> All data will be sent from a content
> process. If I remember correctly other events (on-request-sent, I do not
> know if on-response is used) will be sent from the parent process. It will
> be hard to guarantee synchronization of this events.
> 
> Could this be a potential problem? :mao, do you know answer to this question
> or do you know who knows?

The main if not the only problem I could see (beside properly documenting this async behavior): given this limitation will we be able to access the response headers (and especially content-type) from within our filter?
Flags: needinfo?(g.maone)
Comment on attachment 8850701 [details]
Bug 1255894: Part 5 - Hook up StreamFilterParent as a stream listener.

https://reviewboard.mozilla.org/r/123238/#review167676

Dropping my review on this one. This needs to be done in a different way, as we discussed a couple of weeks ago.
Attachment #8850701 - Flags: review?(dd.mozilla)
Attachment #8856132 - Attachment is obsolete: true
I shifted the review of part 9 to Kan-Ru (although it's not showing up in bugzilla yet...). Usually with these sorts of reviews, Kris, it's best to include a diff that shows how the generated code changed. I suspect Kan-Ru would appreciate this.

For part 10, what parts should I review? I'm not familiar with any of this code. baku or bkelly are probably better reviewers.
(In reply to Bill McCloskey (:billm) from comment #92)
> For part 10, what parts should I review? I'm not familiar with any of this
> code. baku or bkelly are probably better reviewers.

Just the parts that deal with creating the endpoints, transferring them to the child processes, and initializing the protocols. Dragana should be able to review the rest. I flagged you since you were the one who suggested this approach to Jason.
Flags: needinfo?(kmaglione+bmo)
(In reply to Bill McCloskey (:billm) from comment #92)
> I shifted the review of part 9 to Kan-Ru (although it's not showing up in
> bugzilla yet...). Usually with these sorts of reviews, Kris, it's best to
> include a diff that shows how the generated code changed. I suspect Kan-Ru
> would appreciate this.

The attached patch shows all of the generated code changes as a result of this patch.
Attachment #8901651 - Flags: review?(wmccloskey) → review?(kchen)
Comment on attachment 8901652 [details]
Bug 1255894: Part 10 - Move StreamFilterParent to necko child process.

https://reviewboard.mozilla.org/r/173076/#review179738

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:50
(Diff revision 1)
>                                      mContext.forget());
>  }
>  
> -void
> -StreamFilterParent::Init(already_AddRefed<nsIContentParent> aContentParent)
> +bool
> +StreamFilterParent::Create(dom::ContentParent* aContentParent, uint64_t aChannelId, const nsAString& aAddonId,
> +                           Endpoint<PStreamFilterChild>& aEndpoint)

Can you pass this as a pointer instead of a reference since it's an outparam?
Attachment #8901652 - Flags: review?(wmccloskey) → review+
Comment on attachment 8901651 [details]
Bug 1255894: Part 9 - Allow returning non-copyable types in async IPDL methods.

https://reviewboard.mozilla.org/r/173074/#review179952
Attachment #8901651 - Flags: review?(kchen) → review+
Comment on attachment 8850701 [details]
Bug 1255894: Part 5 - Hook up StreamFilterParent as a stream listener.

https://reviewboard.mozilla.org/r/123238/#review180132

::: toolkit/components/extensions/webrequest/StreamFilterParent.cpp:402
(Diff revision 4)
> +      NewRunnableMethod<Data&&>("StreamFilterParent::DoSendData",
> +                                this,
> +                                &StreamFilterParent::DoSendData,
> +                                Move(data)),
> +      NS_DISPATCH_NORMAL);
> +  }

Is there an option that webExtension onlysee data but does not change it? As far as I can tell form this patches answer is no?
Attachment #8850701 - Flags: review?(dd.mozilla)
Comment on attachment 8850701 [details]
Bug 1255894: Part 5 - Hook up StreamFilterParent as a stream listener.

https://reviewboard.mozilla.org/r/123238/#review180132

> Is there an option that webExtension onlysee data but does not change it? As far as I can tell form this patches answer is no?

No. It may be worth adding in the future, but this solution covers both cases, and so far hasn't shown a measurable performance impact, so I think it's a good starting point. For the cases that we've talked about so far, even if an extension may not need to modify the data, it will generally need to be able cancel the request before offending data is processed while it's scanning the stream, so I'm not sure we really have much to gain.
Comment on attachment 8850701 [details]
Bug 1255894: Part 5 - Hook up StreamFilterParent as a stream listener.

https://reviewboard.mozilla.org/r/123238/#review180462
Attachment #8850701 - Flags: review+
One note:

if a resource is treated as a download, we are transferring it back to the parent process. During OnStartRequest, the OnStartRequest URILoader is called. URILoader tries to find proper listenter for a channel that will handle the data. During this it can come to decision that the resource should be downloaded. If that happen HttpChannelChild will send data back to the parent and the traceable channel will not see data.

Looking at one of these patches StreamFilterParent sends OnStartRequest massage to StreamFilterChild and immediately after that calls OnStartRequest for the next listener. So OnStartRequests will be called in one chain, like it is now and this will work fine with downloads. Only StreamFilterChild will never receive OnData and OnStop.
https://hg.mozilla.org/integration/mozilla-inbound/rev/31b01934c647ee8e45a1b2ae13a27947b1841d7b
Bug 1255894: Part 2 - Add mozIWebRequestService service for tracking filtered requests. r=mixedpuppy

https://hg.mozilla.org/integration/mozilla-inbound/rev/c90ba1cf0de266a399896f541bb9bfb08257a4ba
Bug 1255894: Part 3 - Create skeleton IPDL framework for OOP stream filters. r=baku

https://hg.mozilla.org/integration/mozilla-inbound/rev/eedc32719af7a4a34c6fb2651b05d25d86688e3f
Bug 1255894: Part 4 - Implement StreamFilter IPC protocol. r=baku,mixedpuppy

https://hg.mozilla.org/integration/mozilla-inbound/rev/acc5f5f5946cec6afcd87ed4cebc73a0a8c4226e
Bug 1255894: Part 5 - Hook up StreamFilterParent as a stream listener. r=dragana,mixedpuppy

https://hg.mozilla.org/integration/mozilla-inbound/rev/26ce00544f9862adf2e07ce2b001280e59015969
Bug 1255894: Part 6 - Implement StreamFilter DOM bindings. r=baku,mixedpuppy

https://hg.mozilla.org/integration/mozilla-inbound/rev/3599cdb1a849999f4d669885ddd7dbab21909cf5
Bug 1255894: Part 7 - Expose response stream filtering via the webRequest API. r=mixedpuppy

https://hg.mozilla.org/integration/mozilla-inbound/rev/cc92f00288cd3e433f47ab7d5d724afbc46d0450
Bug 1255894: Part 8 - Add tests for response stream filtering. r=mixedpuppy

https://hg.mozilla.org/integration/mozilla-inbound/rev/15097fdfa8e972216cb71cfd98c95591cf857e55
Bug 1255894: Part 8.1 - Disable response data filtering in release builds. r=me

https://hg.mozilla.org/integration/mozilla-inbound/rev/96b6a3d0231b93016f6d35a55b8f5ed98a6a8712
Bug 1255894: Part 9 - Allow returning non-copyable types in async IPDL methods. r=kanru
Landing the reviewed parts enabled for non-release builds (and without support for content converters) so that people have a chance to experiment before the merge.
Keywords: leave-open
Summary: Support a webRequest.onResponseData event to filter HTTP response bytes as they come → Support filtering streaming response data via webRequest API
https://hg.mozilla.org/integration/mozilla-inbound/rev/277554180e9faaab82cb27a891f5f8d01ede4d9d
Bug 1255894: Follow-up: Disable response body filtering tests on Android for mochitest server issues.
Thanks a ton for landing this experimental version, kmag!

It seems that this version does not see any requests read from the RAM cache (or something to that effect, as I mentioned in comment 69); I cannot tell if it's related to what Dragana mentioned in comment 100.

I take it that is just some of the work remaining to be done?
Comment on attachment 8901652 [details]
Bug 1255894: Part 10 - Move StreamFilterParent to necko child process.

https://reviewboard.mozilla.org/r/173076/#review181680
Attachment #8901652 - Flags: review?(dd.mozilla) → review+
Comment on attachment 8901653 [details]
Bug 1255894: Part 11 - Move StreamFilterParent to STS thread.

https://reviewboard.mozilla.org/r/173078/#review181682
Attachment #8901653 - Flags: review?(dd.mozilla) → review+
Ah, disregard my comments about the RAM-cache. My earlier patch used to get those RAM-cached details somehow (or seemed to do so), but given what's stated in the handlerBehaviorChanged MDN page, it's clear that it should not have.
Depends on: 1396648
(In reply to Thomas Wisniewski from comment #106)
> Thanks a ton for landing this experimental version, kmag!
> 
> It seems that this version does not see any requests read from the RAM cache
> (or something to that effect, as I mentioned in comment 69); I cannot tell
> if it's related to what Dragana mentioned in comment 100.
> 
> I take it that is just some of the work remaining to be done?

(In reply to Thomas Wisniewski from comment #109)
> Ah, disregard my comments about the RAM-cache. My earlier patch used to get
> those RAM-cached details somehow (or seemed to do so), but given what's
> stated in the handlerBehaviorChanged MDN page, it's clear that it should not
> have.

That documentation applies more to Chrome than to Firefox. As far as I know, the only memory cache that we don't notify request listeners for is the image cache (and I guess the bfcache), since it doesn't actually trigger any requests.

In theory, this API should still work on any request that looks like a request, but I'm not 100% sure. I'll do some more detailed checks when I land the final parts.
For those who will experiment with this API, here are some things I've noticed which might be good to note:

- Since Firefox defaults to sending an accept-encoding request header, the server will likely send back gzipped (or whatever) data that you will have to appropriately decode to process in your filter. In addition, you cannot override the content-encoding response header (an exception is thrown). As such, to prevent a content decoding error you either have to re-encode your replacement data back to that same gzip/whatever format, or just avoid the whole de/encode round-trip by preventing Firefox from sending an accept-encoding request header in the first place (by filtering it out with a blocking webRequest.onBeforeSendHeaders listener).

- Not just any arraybuffer/uint8array is acceptable for filter.write: it must be an instance from the same window object that the filter events are fired from (otherwise, you're cryptically told that you aren't sending back an arraybuffer or uint8array). As such you need to wait for one of the filter events to trigger, and use its scope to create any replacement buffers/arrays you send to filter.write. (In other words, be careful with using ES6 arrow functions for your filter-handlers.)

- You will want to ensure that you don't accidentally filter.write replacement data more than once by carelessly doing so in the ondata handler (which of course could fire multiple times). So if you already know what data you want to filter.write before it comes in, do so in the onstart handler instead.

Speaking of which, for that last use-case (of replacing the response wholesale with a pre-known result), I suspect it would be good to close the request on filter.close, to prevent wasting resources (if that isn't already being done; I haven't had time to check yet).
(In reply to Thomas Wisniewski from comment #111)
> For those who will experiment with this API, here are some things I've
> noticed which might be good to note:
> 
> - Since Firefox defaults to sending an accept-encoding request header, the
> server will likely send back gzipped (or whatever) data that you will have
> to appropriately decode to process in your filter. In addition, you cannot
> override the content-encoding response header (an exception is thrown). As
> such, to prevent a content decoding error you either have to re-encode your
> replacement data back to that same gzip/whatever format, or just avoid the
> whole de/encode round-trip by preventing Firefox from sending an
> accept-encoding request header in the first place (by filtering it out with
> a blocking webRequest.onBeforeSendHeaders listener).

That's only true because the last two parts haven't landed yet. Once they do, the proper content decoders will be applied before the data is sent to the extension.

> - Not just any arraybuffer/uint8array is acceptable for filter.write: it
> must be an instance from the same window object that the filter events are
> fired from (otherwise, you're cryptically told that you aren't sending back
> an arraybuffer or uint8array).

Hm. That's surprising.

> As such you need to wait for one of the filter events to trigger, and use
> its scope to create any replacement buffers/arrays you send to filter.write.
> (In other words, be careful with using ES6 arrow functions for your
> filter-handlers.)

I'm not sure what you mean. The scope that you create the event filter in is
the scope that the array buffers should belong to.

> Speaking of which, for that last use-case (of replacing the response
> wholesale with a pre-known result), I suspect it would be good to close the
> request on filter.close, to prevent wasting resources (if that isn't already
> being done; I haven't had time to check yet).

You need to call filter.close() after you're done writing data, no matter
what. If you're not interested in the actual stream data, you should do that
without waiting for onstop.
>I'm not sure what you mean.

Actually, I might be off-base here. It's been a long day on my end, so I might have just been seeing things. I'll see if I can reproduce it again the next chance I get.
Keywords: leave-open
https://hg.mozilla.org/integration/mozilla-inbound/rev/4668460d74e8b61bbb3e7a29ecd74adb25b206cd
Bug 1255894: Part 10 - Move StreamFilterParent to necko child process. r=billm,dragana

https://hg.mozilla.org/integration/mozilla-inbound/rev/74c32b25b40364c4126872c52ee24c02e8cb4a2e
Bug 1255894: Part 11 - Move StreamFilterParent to STS thread. r=dragana

https://hg.mozilla.org/integration/mozilla-inbound/rev/604fd3302562dc2059a8de3231818a1b618b8ffe
Bug 1255894: Part 12 - Re-enable response body filtering in release builds. r=me
Blocks: 1398950
how can it be used in webextension?

can you make a sample of webextension with this feature? or just a sample of code for webextension?
(In reply to Charles Jones from comment #116)
> how can it be used in webextension?
> 
> can you make a sample of webextension with this feature? or just a sample of
> code for webextension?

You can look here:
https://reviewboard.mozilla.org/r/123244/diff/4#index_header
Please see https://github.com/mdn/webextensions-examples/tree/master/http-response. If you have further questions, please use the mailing list or IRC: https://wiki.mozilla.org/Add-ons#Getting_in_touch
Verified the issue using the test add-on from Comment #118 (https://github.com/mdn/webextensions-examples/tree/master/http-response) and I can confirm that in the Latest Nightly(Build ID 20170918220054), the webextension is correctly adding the word "Example" on https://example.com so it becomes "WebExtension Example" and this thing is not happening on Latest Firefox Release. 


Also, I tried https://addons.mozilla.org/en-US/firefox/addon/epubreader from bug description on both Latest Nightly(Build ID 20170918220054) and Latest Release and the only notable difference was that after opening http://www.gutenberg.org/ebooks/27458 with the extension installed , on Release I was not able to go back to this page by pressing back button, but I was able to do that on Nightly which means that if this is related to this bug, it was solved.


Please let me know where can I find a valid version of NoScript webextension from bug description and I will test it too. Also, if you have special scenarios in mind for both add-ons from above please add them here.
(In reply to Victor Carciu from comment #119)
Thanks Victor. As far as I can tell epubreader does not use filterResponseData yet. If there's issues there, please file a bug that we can ping over to the author of that extension because its not related to this.

I don't know if builds of NoScript as a webextension are available yet, but pinging the author.
Flags: needinfo?(g.maone)
(In reply to Andy McKay [:andym] from comment #120)

> I don't know if builds of NoScript as a webextension are available yet, but
> pinging the author.

Not publicly yet. Also, the features which require filterResponseData are not among the most widely used, so they're likely to be fully ported later than others.
Anyway I'll post an update here as soon as they're done.
Flags: needinfo?(g.maone)
Depends on: 1403546
Depends on: 1404381
Depends on: 1404408
Depends on: 1405286
Depends on: 1405375
Depends on: 1405506
Flags: needinfo?(schien)
I've written some pages on this, as well as I can understand it from the IDL and playing with the API:

https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/filterResponseData
https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/StreamFilter

Let me know what else is needed. I've submitted a PR for the compat data, but it needs to be merged and deployed before it's visible on MDN. But I though it would be a good idea to get these docs reviewed sooner rather than wait for that.

My examples for suspend() and resume() don't work, I'm obviously doing something wrong but don't know what.
Flags: needinfo?(kmaglione+bmo)
https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/filterResponseData: "You'd typically call this function from a webRequest event listener"

I think we should specify which event listeners. An example is shown for browser.webRequest.onBeforeRequest but when i tried with browser.webRequest.onResponseStarted, the browser.webRequest.filterResponseData did not catch anything. I assumed it is "too late" to filter the response data (and this makes this service useless for all my use cases).

That should be documented somewhere.
Depends on: 1422578
Depends on: 1432419
Depends on: 1416227
Product: Toolkit → WebExtensions
Flags: needinfo?(kmaglione+bmo)

I think this has been documented;

https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/filterResponseData

etc.

Let me know if you disagree, and let me know what still needs doing. Thanks!

You need to log in before you can comment on or make changes to this bug.