Closed Bug 980481 (director) Opened 6 years ago Closed 5 years ago

Debuggee instrumentation a.k.a. director

Categories

(Add-on SDK Graveyard :: General, defect, P1)

x86
macOS
defect

Tracking

(firefox38 fixed)

RESOLVED FIXED
mozilla38
Tracking Status
firefox38 --- fixed

People

(Reporter: irakli, Assigned: rpl)

References

Details

Attachments

(3 files, 34 obsolete files)

954 bytes, patch
Details | Diff | Splinter Review
10.77 KB, patch
Details | Diff | Splinter Review
59.22 KB, patch
Details | Diff | Splinter Review
`inpectionTarget` is an object representing entity targeted by a developer
toolbox. Toolbox Pane API exposes it in a form of `event.inspectionTarget`
that implements message event `source` API defined by [HTML WebMessaging][] specification.

We need to expose a feature to evaluate code once toolbox pane is activated
in order to instrument inspection target. We may be able to shoehorn page-mods, API for that, but would rather do something designed specifically for this, so
that we could support remote targets for example.
Summary: Code injection into inpectionTarget / debuggee → Code injection into inspectionTarget / debuggee
Priority: -- → P1
Attached file WIP prototype (obsolete) —
Given the discussion today on IRC I have updated JEP and posting parts of it here:



While firefox debugging protocol may provide some build-in instrumentation
code in form of actors that may not necessarily address cover everything
that developer tool add-on may want to accomplish. There for add-on should
be able to provide own custom instrumentation for a debugee. In most
browsers this is accomplished via code evaluation on the inpsection target,
but that's pretty limited. We could leverage work done to build firefox
devtools itself and build on top of actors. Add-on should be able to provide
custom actor definition in form of a module (with limited access, so that
we won't have to send all of the add-on modules to debuggee which maybe
on mobile phone). Add-on actors would be loaded when developer toolbox is
activated & there for it will be able to instrument debuggee. It also could
provide high level API that can be consumed from the toolbox panel or
add-on itself.

As a side effect community could implement actor providing chrome API &
a script for pane document as a clinet to it, essentially providing polyfill
that would bridge differences between browsers.
Summary: Code injection into inspectionTarget / debuggee → inspectionTarget / debuggee instrumentation
I'm not sure if we'll be able to feature introduced via Bug 977443 out of the box, but it's definitely
a step towards where we wanna get.

I would also think that making actors same privileged as inspection target would be a good default, we
could allow chrome privileges on explicit request though.
Depends on: 977443
I've prepared a new prototype (work in progress), it's based on the previous prototype with the following changes applied:

- rewritten to be applied to the fx-team branch 
- basic test case (using mochitest-chrome test suite)
- split responsibilities into 3 actors:
  - "devtools-addon": which forge "tab-instrumenter" actors 
  - "tab-instrumenter": which automatically setup requested instrumentation
  - "tab-worker": which expose the optional ContentScript worker
Attachment #8390708 - Attachment is obsolete: true
Alias: director
Summary: inspectionTarget / debuggee instrumentation → Debuggee instrumentation a.k.a. director
Here are some of the thoughts we had for the director actor:

1. It should be loaded with a rest of the devtools to be available
   regardless of the addon lifetimes.
2. Director shoud supervise worker actors installed by add-ons. It
   should be able to install / uninstall several actor workers arbitrary
   times through out the session.
3. Director should load it's worker actors into content privileged sandbox,
   similar to how content scripts are. So actors defined by add-ons won't
   have chrome privileges.
4. Worker actors should be able to define their own methods that could be
   invoked from the client (through RDP) & hook methods that director could
   invoke to notify worker about state changes, like page navigation etc..
5. Worker actor sholud be able to expose events along with methods so that
   it could push updates onto connected client.
6. Ideally worker API should not be constrained with `protocol.js`, meaning
   that worker actor implementer should not have to import `protocol.js` in
   some way & ideally user won't have to deal with type annotations either
   (although not clear if later will be doable). Definitely workers should
   not deal with defining other actors they could return, in most cases same
   API could be exposed by adding methods to worker actor that takes handle
   as an argument.
7. Director should load worker actors lazily, meaning that worker sandbox
   should be created & code should be loaded into it on demand (when associated
   panel is selected).
   
We were thingking that director on new worker registration would load it's source
into sandbox or maybe analyze it through parsing instead to detect it's methods,
that is to generate a shell actor using `protocol.js` that would have a same
interface as a worker definition. At the instantiation worker shell can take care
of sandbox creation and a worker loading into it. Worker shell would also proxy
method calls to a worker difinition in the sandbox. That way worker loaded in
the debuggee could be exposed by volcan making it's methods simply callable from
the client side. Although it's not clear how can we figure out types of the method
arguments & return values or how to annotate them. Although it's likely that we
could limit typesystem to two types of values JSON and Object grips. Alternative
could be to implement `protocol.js` backend part in non privileged JS similar to
how volcan implements frontend part, although that seems like a lot of work & not
the simplest API for users to use, maybe we could start with something easier.
We could just expose `Grip` class in the worker context that could be used to
box return values so that shell could distinguish weather return value is `json`
or `grip` value by checking `value instanceof Grip` and have all methods have
`JSON|Grip` type.
First prototype of the DirectorActor, based on the description of its behaviour.

And a first draft JEP design doc for the Director (currently it's mostly a description of the requirements and a some notes about the "work-in-progress" design and implementation choices and next steps of the current prototype):

- https://gist.github.com/rpl/681b1c827a483d699791#file-jep_devtoolssdk_director-md
Attachment #8429322 - Attachment is obsolete: true
This new revision introduce small tweaks on the previous patch, and I've used it to test the Direcor and Instrumenter actors on an Ember Inspector experimental branch:

- https://github.com/rpl/ember-extension/tree/experimental/remote-instrumenter

Using this prototype I've done some preliminary test on the following scenarios, Ember Inspector connected to:

- a local tab
- a remote tab running on Firefox for Desktop
- a remote tab running on Firefox for Android: http://youtu.be/A4cl-7W0xGg
- a remote app running on Firefox OS Simulator, embedded into the App Manager: http://youtu.be/Nou16YyltQY
- a remote app running on Firefox OS Simulator, embedded into the WebIDE: http://youtu.be/1wjzYdlrhfI

Changes:

- fix options injection in the sandbox using Cu.cloneInto

- tracked down issues on keep the same sandbox between window reloads: we have to create a new sandbox on every window origin change, 
  this is an intended behaviour and it's correct to get privilege errors on accessing a page elements from a different window origin.

- enabled wantExportHelpers sandbox attribute, which enable useful feature in the Instrumenter sandbox 
  (e.g. evailInWindow which is able to evaluate javascript code in a iframe window)

- renamed Instrumenter actor methods *sendEvent* to *sendSandboxEvent* and *callMethod* to *callSandboxMethod*, because I noticed
  that all protocol.js classes have an helper function named *_sendEvent* 

Things to "discuss / looking at" and implementation notes/concerns:

- the instrumenter sandbox is visible to the debugger server, but it is not able to list its sources, I need to take a deeper look to the problem to check if it is related to something I didn't do or I didn't do correctly, or if it's a limitation of the current implementation in terms of debuggable contexts)

- tweaks to the API visible to the addon developers: my main concerns here are related to keep the sandbox weight not too heavy (I remember some notes about lazy getter and reduce the addon-sdk modules use to get better performance on Firefox OS devices with small available RAM, sent from Alexandre Poirot in the devtools developers ml)

- reusing addon-sdk sandbox/worker code: I'm not fully convinced if it is better to adapt the current addon-sdk implementation and add supports to the new use-case or if it is better to get only what we need without sharing too much code, my main goal here is to keep the impact on the existent code base as low as possible

- tweaks to the Instrumenter lifecycle: my concerns here are to do not confuse addon developers with a sandbox which sometimes is automatically destroyed and recreated (but I think it would be interesting to keep the same sandbox for the entire debugging session, at least where we know that we will not change window origin, e.g. in a connected Firefox OS app)

- submit to code review: who is better to ask for code review on the Director/Instrumenter actors code?

Next Steps: 

- add more test cases: now that I'm pretty sure I'm not doing anything so wrong to prevents the Director and Instrumenter actors to work on Firefox OS scenarios, I want to create a couple of more basic test cases and then start tweaking the API which will be visible to the addon developers.

- try one more scenario: try to connect the Director/Instrumenter actors to a app running inside a real Firefox OS device
Assignee: nobody → luca.greco
Attachment #8442738 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Flags: needinfo?(rFobic)
So I am finally on this & I'm eager to run this to a completion as soon asap. Right now I'm looking at the JEP gist, there are a lot of details there thanks for documenting them it really helps. I am afraid I don't fully follow the diagrams, if I'll still have questions after looking at them few more times I'll try to ask questions over video.

Let me just propose making actual JEP out of that gist. At the moment we use following flow:

1. Create branch of  https://github.com/mozilla/addon-sdk/tree/JEP
2. Add an markdown file with JEP content
3. Send pull request to https://github.com/mozilla/addon-sdk/tree/JEP (not a master branch)

That way we can inline conversation about the design & use all the other pull request features. It should be pretty straight forward to transform your gist into this. In the meantime I'll ask questions and add comments here:


1. It seems that each tab will get it's own director, although installing / uninstalling instrumenter actors will affect all tabs, while activating / deactivating I suspect will be per tab basis. I think this is little awkward (although I think it's mostly due to tab actors being overloaded with ton of unrelated things). I think it would be more natural if root actor got a `director` as it's field that could install / uninstall / list instrumenters. Tab actor on the other hand could just gain a lazy getters for all the installed instrumenters.

2. It could be that I don't yet get as complete picture as you do so bare with me if I sound stupid, but I think that now instrumenter get's only a messaging channel with a client (toolbox/panel). It seems that it is unable to extend API by adding more methods. I guess what I'm talking about is that currently tab actors have fields for actors associated with a given tab like console, webaudio, etc... Than each one of them defines it's own set of methods (or requestTypes to be precise) to do certain things. Those actors are also defined in form of a class who's "public" methods become exposed to a client. I would love if instrumenters were only different by having a simpler API and by being loaded in the debuggee privileged sandbox. In other words I really wish they could define their own methods (request types). I realize it maybe a complicated task and it may make sense to defer that to a future, but still would be nice to have a thought on how to do it so we don't design current solution against it.

3. Looking further into a gist it seems there is a way to define custom methods / requestTypes like `callableEchoMethod` but it's not very clear what methods are going to be exposed by an actor front and what their type signatures will be like. It becomes even more confusing as some exports are input hooks while others are exposed front end methods. I would love if we could design some API where fronts would closely reflect what's defined in the back. I also think `instrumenter` in case of instrumenter script should be renamed to `debuggee` and probably be passed in as an argument to one of the hooks rather than being exposed as a global. 

Going to add more comments later
Flags: needinfo?(rFobic)
The new prototype doesn't yet include the "sandbox" side of the previous one, because it's focused on exploring potential issues or changes needed to be able to register/unregister instrumenter actors dinamically during the RemoteDebuggerServer life.

In the new prototype the DirectorActor create and register a new InstrumenterTabActor class dinamically from a json description received in the **director.install** request.

The dinamycally defined InstrumenterTabActor is then able to describe itself using the protocolDescription, like the other existent actors (and volcan will be able to dinamically create a client using the above protocol description).

Follows a couple of notes and potential issues identified hacking on this new prototype:

- can we register a new actor dinamically:
  yes, it's possible even if it's not currently used in any of the existent actors

- can we unregister the new actor?:
  yes, it doesn't seem to affect the existent actor instances, and on new connections its actor instance will not be created anymore

- can we unregister its protocol.js type?:
  this needs changes and more evaluation about when it is safe to unregister the protocol.js type,
  currently there isn't a protocol.js helper to unregister a type, and for a couple of good reasons:
    - none of the current actors changes its interface or is unregistered during the debugger server life
    - instrumenter type should be unregistered only when all the existent instances are destroyed (or all the connections closed)
    - some of the actor instances can be running in a subprocess

- can we update a previous installed actors with a new one?
  yes, currently the re-installed actor type seems to overwrite the previous registered actor type (in the protocol.js actor types map)
  - same problems for unregistering a protocol.js type (e.g. it should not be changed if old versions exists)

- how to handle e10s cases? (e.g. FirefoxOS apps)
  this prototype will not work in a multi-process tabs/apps scenario, because in the child process all the actors are redefined by statically loading actors defined in commonjs modules in the child process' RemoteDebuggerServer, so it will need to receive from the main process enough info to be able to register the same instrumenter actor in the child process as well, currently I'm thinking to explore one (or even both) of the following solutions:

  - SOLUTION 1: customize the child process remote debugger server to exchange
    registered directors metadata from the main process and dinamically regenerate
    needed actor classes

  - SOLUTION 2: change strategy, persist actor installed to files in a defined
    directory and customize the remote debugger server to load all modules defined
    there (and marked as active from the user for security reasons?)
Flags: needinfo?(rFobic)
sorry Bug980491 in the previous comment should be Bug980481
Flags: needinfo?(dcamp)
Cc-ing Alex as dcamp said he's our e10s guy!
Flags: needinfo?(rFobic) → needinfo?(poirot.alex)
Hi Luca, you again! It looks like we are on the same track ;)

(In reply to Luca Greco from comment #9)
> Created attachment 8469368 [details] [diff] [review]
> rev4 - Bug980491 statically registered Global DirectorActor and dynamically
> registered Tab InstrumenterActor
> 
> The new prototype doesn't yet include the "sandbox" side of the previous
> one, because it's focused on exploring potential issues or changes needed to
> be able to register/unregister instrumenter actors dinamically during the
> RemoteDebuggerServer life.

I'm not sure I follow why your code is much more complex than the patch from bug 977443.
Ideally we would just post a big source string and just evaluate that.
So that you wouldn't need createInstrumenterFrontClass/createInstrumenterActorClass, which adds quite a bunch of complexity!

> 
> In the new prototype the DirectorActor create and register a new
> InstrumenterTabActor class dinamically from a json description received in
> the **director.install** request.

What is the value/requirement behind this JSON description?
protocol.js actors are already able to describe themself from their implementation.
This JSON input looks like a duplicate data with the actor implementation.

> - how to handle e10s cases? (e.g. FirefoxOS apps)
>   this prototype will not work in a multi-process tabs/apps scenario,
> because in the child process all the actors are redefined by statically
> loading actors defined in commonjs modules in the child process'
> RemoteDebuggerServer, so it will need to receive from the main process
> enough info to be able to register the same instrumenter actor in the child
> process as well, currently I'm thinking to explore one (or even both) of the
> following solutions:
> 
>   - SOLUTION 1: customize the child process remote debugger server to
> exchange
>     registered directors metadata from the main process and dinamically
> regenerate
>     needed actor classes

+1

> 
>   - SOLUTION 2: change strategy, persist actor installed to files in a
> defined
>     directory and customize the remote debugger server to load all modules
> defined
>     there (and marked as active from the user for security reasons?)

-1, child processes have a very restricted access to the file system.

So I would suggest you to hack into main.js. I would add the dynamic list of additional actors right on DebuggerServer. Then I would hack into DebuggerServer.connectToChild in order to pipe the minimal set of information to the child when we start debugging it. I wouldn't send the actor source immediatly on connect.
Instead, I would suggest hacking on top of bug 988237, where I intend to load actors lazily. So that the actors source gets also pulled lazily from the child process.
I can do some progress on bug 988237 next days if that can help. But I will have to wait for Panos to come back in order to review it.

Here is a more concrete proposal in a kind of pseudo code:
1) Set a Map of dynamically registered actors on DebuggerServer.
  DebuggerServer.actors = new Map();

2) Expose two new methods on DebuggerServer to register dynamic actors by adding/removing them from the Map, but also call addTabActor/addGlobalActor.
  DebuggerServer.registerDynamicActor = function () {}
  DebuggerServer.unregisterDynamicActor = function () {}
  
3) Hack into DebuggerServer.connectToChild

  http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/server/main.js#648
  mm.sendAsyncMessage("debug:connect", { prefix: prefix, actors: this.actors });
  // I don't know if sendAsyncMessage correctly pipe Maps...

4) Hack child.js to take care of this actors Map

  http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/server/child.js#29
  let onConnect = DevToolsUtils.makeInfallible(function (msg) {
    ...
    let actors = msg.actors;
    actors.forEach(actor => DebuggerServer.registerDynamicActor(actor));
    ...
  });

5) Tweak registerDynamicActor/unregisterDynamicActor, connectToChild and child.js in order to send a new dedicated message for adding/removing actors to content processes that already exists
  DebuggerServer.registerDynamicActor = function (actor) {
    events.emit(this, "register", actor);
  }
  DebuggerServer.connectToChild = function () {
    events.on("register", function (actor) {
      mm.sendAsyncMessage("debug:register", {actor:actor});
    } 
  }
  child.js:
  addMessageListener("debug:register", function (msg) {
    DebuggerServer.registerDynamicActor(msg.actor);
  })

Then from the content process, in order to lazy load dynamic actor, you would use sendSyncMessage in order to fetch actor sources from the parent process. (sendSyncMessage only exists from child to parent process)
Flags: needinfo?(poirot.alex)
(In reply to Alexandre Poirot [:ochameau] from comment #13)
> Hi Luca, you again! It looks like we are on the same track ;)

absolutely true, and it's a very interesting and challenging track ;-) 
 
I'm happy to be in this track with you, I've learned a lot about the interaction between e10s and the Remote Debugger Server studying your work on fully exposing Firefox OS webapps to the DevTools.   

> (In reply to Luca Greco from comment #9)
> > Created attachment 8469368 [details] [diff] [review]
> > rev4 - Bug980491 statically registered Global DirectorActor and dynamically
> > registered Tab InstrumenterActor
> > 
> > The new prototype doesn't yet include the "sandbox" side of the previous
> > one, because it's focused on exploring potential issues or changes needed to
> > be able to register/unregister instrumenter actors dinamically during the
> > RemoteDebuggerServer life.
> 
> I'm not sure I follow why your code is much more complex than the patch from
> bug 977443.
> Ideally we would just post a big source string and just evaluate that.
> So that you wouldn't need
> createInstrumenterFrontClass/createInstrumenterActorClass, which adds quite
> a bunch of complexity!
> 
> > 
> > In the new prototype the DirectorActor create and register a new
> > InstrumenterTabActor class dinamically from a json description received in
> > the **director.install** request.
> 
> What is the value/requirement behind this JSON description?
> protocol.js actors are already able to describe themself from their
> implementation.
> This JSON input looks like a duplicate data with the actor implementation.

This feature started mostly as a remote version of a content-script, because a lot of the DevTools addon that I've studyied or ported (e.g. Ember Inspector) use content scripts to inject code that will control, instrument and exchange messages with the target tab.

Then, from prototype to prototype, we've discussed how DevTools SDK will expose this feature to the Addon Developers and I'm in the process of balancing it between "do not expose low level details to the addon developers" and "try to leverage as much as possible of the protocol.js self-descripted actors feature".

The current prototype is not definitive (and the dynamic registering of actors is the last thing I've prototyped, so it was in the exploring phase "test how far we can go without changing anything too much"),
but the idea is:

- addon developers code used by the director's instrumenter actors will run in a content privileged sandbox, mostly like content-scripts
- addon developers will not need to learn anything about protocol.js to leverage the feature from DevTools SDK APIs
- director's instrumenter actors will describe themselves using protocol.js, so DevTools SDK's volcan (http://github.com/Gozala/volcan)  will be able to auto-generate a client object like it currently does with other protocol.js actors

> > - how to handle e10s cases? (e.g. FirefoxOS apps)
> >   this prototype will not work in a multi-process tabs/apps scenario,
> > because in the child process all the actors are redefined by statically
> > loading actors defined in commonjs modules in the child process'
> > RemoteDebuggerServer, so it will need to receive from the main process
> > enough info to be able to register the same instrumenter actor in the child
> > process as well, currently I'm thinking to explore one (or even both) of the
> > following solutions:
> > 
> >   - SOLUTION 1: customize the child process remote debugger server to
> > exchange
> >     registered directors metadata from the main process and dinamically
> > regenerate
> >     needed actor classes
> 
> +1
> 
> > 
> >   - SOLUTION 2: change strategy, persist actor installed to files in a
> > defined
> >     directory and customize the remote debugger server to load all modules
> > defined
> >     there (and marked as active from the user for security reasons?)
> 
> -1, child processes have a very restricted access to the file system.

sounds good, as the solutions numbering suggests I preferred the solution 1 from the start :-)

> So I would suggest you to hack into main.js. I would add the dynamic list of
> additional actors right on DebuggerServer. Then I would hack into
> DebuggerServer.connectToChild in order to pipe the minimal set of
> information to the child when we start debugging it. I wouldn't send the
> actor source immediatly on connect.
> Instead, I would suggest hacking on top of bug 988237, where I intend to
> load actors lazily. So that the actors source gets also pulled lazily from
> the child process.
> I can do some progress on bug 988237 next days if that can help. But I will
> have to wait for Panos to come back in order to review it.
> ...

I was already going through this path following the changes you've worked on to expose webapps child processes through the Remote Debugger (as I said I'm definitely one of your most loyal followers on bugzilla :-))

but now that you've described it as the preferred path (and with a lot of useful details) I can move even faster to the next prototype, with much more confidence.

I'm taking a look to the bug 988237, so I can take it into account and prepare to support the lazy loading once ready.
This patch contains a new complete version of the director and director-instrumenters actor modules (and a new MessagePortActor).

The director module is responsible for:
- statically registering the director actor (as global and tab actor)
- exporting helpers to propagate the instrumenter descriptions to a child process (the actual change to the remote debugger main/child.js is splitted in a separate patch)

The director-instrumenter module is responsible for:
- exporting an helper function to create an InstrumenterActor class from the instrumenter description (id, contentScript, contentScriptOptions and an optional instrumenterOptions)
- define the MessagePortActor and the messageportevent dict type
- define a DirectorInstrumenterWorker which extend the Worker class from "sdk/content/worker" to leverage the addon-sdk content-script environment and provide a simpler interface for the InstrumenterActor, which uses the worker internally to manage and interact with the content-script sandbox.

Every InstrumenterActor provides a couple of common methods (setup, connectPort, emitPortEvent and finalize) and common events ("attach", "detach", "error", "message") routed from the content-worker to the actor (and forwarded to the debugger client)

instrumenterActor.connectPort returns to the client a MessagePortActor which wraps a MessagePort and give the client a direct communication channel between client and the content-script (this feature needs some changes to the addon-sdk, which will be described and discussed in a separate comment) 

optionally the instrumenter methods and events can be customized using the instrumenterOptions:

- to forward more events to the client, to be able to receive "self.port.emit" events originated by the content-script, using instrumenterOptions.events (e.g. { events:["myevent1"] })
- to map an instrumenter actor method call to a "self.on" handler in the content-script, using instrumenterOptions.methods (e.g. { methods:["myMethod"] })

Mapping custom instrument methods to the content-script needs more changes to the addon-sdk to get it nicer, and could be probably inhibited until it is better specified.
Attachment #8458082 - Attachment is obsolete: true
Attachment #8469368 - Attachment is obsolete: true
Attachment #8469372 - Attachment is obsolete: true
This patch contains the proposed changes to propagate the instrumenters to the child processes.
This patch contains mochitests used during the prototyping (I've done a first cleanup pass, but it's my first time with mochitests and I'm pretty sure they need more changes/cleanups)
This is a temporary change (not intended to be merged) to the addon-sdk, I've used this change to expose the MessageChannel constructor in the content-script as an experiment, and be able to prototype and test the MessagePortActor, waiting for a better and complete solution from the addon-sdk.
In this last patch, I'm trying a nicer way to forge a pair of MessagePorts and expose them to the content-script and the debugger-client:

- create a MessageChannel in the InstrumenterActor using the "sdk/messaging" addon-sdk module
- send port2 to a connectPort event handler in the content-script
- wrap port1 in a MessagePortActor and return it to the debugger-client

Working on this experiment, I had to ignore with a try-catch an exception which is raised during the require("sdk/messaging"), as a temporary workaround, waiting to be able to discuss it and find a better and permanent solution in the addon-sdk.

The exception is originated from an internal require("@test/options) which raises a "not found module" exception.
 
This problem is probably related to another problem which we need to resolve if we use the "sdk/content/worker" to manage the sandboxed code:

- in the remote debugger commonjs loader 'require("self").id' returns "undefined", because it's not an addon loader, and the id is used by some of the addon-sdk modules, e.g. to set the console object logging level in the "sdk/content/worker"
Flags: needinfo?(rFobic)
Comment on attachment 8475580 [details] [diff] [review]
rev5 - Bug980481 Director, Instrumenter and MessagePort Actors

Review of attachment 8475580 [details] [diff] [review]:
-----------------------------------------------------------------

I made bunch of comments in regards to emitSyncEvent although I guess it's not very important to fix them now, I should just change `sdk/content/worker` such that it would expose a `port` object. Everything else can be added / changed later. That being said I'd prefer to avoid exposing `emytSyncEvent` on the actors themself and would rather expose something like `invokeMethod -> Promise <Null|JSON>` even if under the hood that would use `contentWorker.emitSync`, that way we could change `invokeMethod` to use message ports in a future. I would also get rid of `self.port` and `self` message piping as again here I'd much rather leverage message port that will be injected into contentWorker.

Either way I don't think any of my comments should block this as at the end of the day end users should not know or care about anything but instance of `MessagePort` that will be available in the content script scope.


Also I'm pretty sure that in volcan we already do something along the lines of proposed `invokeMethod` that we should probably just copy over.

I am also forwarding review request to ochameau as he's more qualified to do this review from the devtools perspective.

::: toolkit/devtools/server/actors/director-instrumenter.js
@@ +83,5 @@
> +    this.port.postMessage(msg);
> +  }, {
> +    oneway: true,
> +    request: {}
> +  }),

I am not sure what `start` and `stop` do here. MessagePort's do have `start` and `close` methods maybe that is what you were planing on exposing ?

@@ +195,5 @@
> +  },
> +
> +  emitSyncEvent: function(name, ...args) {
> +    return this._contentWorker.emitSyncEvent.apply(this._contentWorker, [name].concat(args));
> +  },

If I understand intention here correctly, it's for invoking methods in the content script and then for obtaining values. In which case I'd rather not carry legacy of `emitSyncEvent` but rather implement here something along this lines:

....
_invokeID: 0,
observeReturns: function() {
  this.port.addEventListener("message", ({data}) => {
    if (data.type === "invoke/actor/return") {
       if (this.returns.has(data.id) {
         this.returns.get(data.id)(data.value)
         this.returns.delete(data.id)
       } else {
         throw Error("Unexpected return")
       }
    }
  })
},
invokeMethod: function(name, ...args) {
   return this.Return(id => {
     this._contentWorker.port.postMessage({
       type: "invoke/actor/method",
       name: name,
       params: args,
       id: id
     }
   });
},
Return: function(async) {
  return new Promise(resolve => {
    this.returns.set(++this._invokeID, resolve);
    async(this._invokeID);
  });
}
....

Of course stuff generating actors in the content scripts will also have to tinker with ports to make this transparent, but that's ok we were thinking of server side volcan anyhow :) 

P.S. Probably it would make more sense to store reference to `port` separately in the context of this actor.

@@ +245,5 @@
> +    oneway: true
> +  }),
> +
> +  connectPort: method(function () {
> +    let res = this._contentWorker.emitSyncEvent("connectPort");

I think this maybe little over-complicating things, how about doing something simpler like:

```js
const { port1, port2 } = new this._contentWorker.window.MessageChannel();
this._contentWorker.port = port1;
const actor = new MessagePortActor(this.conn, port2);
```

@@ +400,5 @@
> +        reject(e);
> +      }
> +    });
> +  },
> +  emitPortEvent: function(...args) {

I'm not sure there is much point in getting self.port or self.on to work. I would suggest to not even wire them, instead let users regular message ports.
Attachment #8475580 - Flags: review?(poirot.alex)
Attachment #8475580 - Flags: feedback+
Comment on attachment 8475582 [details] [diff] [review]
Bug980481_RegisterDirector_and_InstrumentersPropagatedToChildProcess.patch

Review of attachment 8475582 [details] [diff] [review]:
-----------------------------------------------------------------

I presume this depends on https://bugzilla.mozilla.org/attachment.cgi?id=8475580, that's definitely something I think ochameau should review :)
Attachment #8475582 - Flags: review?(poirot.alex)
Attachment #8475582 - Flags: feedback+
Comment on attachment 8475588 [details] [diff] [review]
TEMP MessageChannel_constructor_exposed_to_ContentScripts.patch

Review of attachment 8475588 [details] [diff] [review]:
-----------------------------------------------------------------

::: addon-sdk/source/lib/sdk/content/content-worker.js
@@ +291,5 @@
> +    // NOTE: exposed MessageChannel constructor to the content script
> +    Object.defineProperty(exports, "MessageChannel", {
> +      value: MessageChannel
> +    });
> +

Does this makes a trick ? I'm really surprised this makes a difference as MessageChannel is only enabled for chrome and this does not supposed to run with chrome principals. Also I'd expect if `MessageChannel` is available here it should be available in other content scripts to as we load both scripts identically o_0
Comment on attachment 8475596 [details] [diff] [review]
TMP Bug980481_sdk_messaging_to_InstrumenterActors.patch

Review of attachment 8475596 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think this approach is good, the thing is messaging module exposes stuff from the window with expanded principals. Passing port of the different window will leak capabilities and can pose security risk. It's would be a lot better to just create message channel from the window being debugged as proposed in the original patch. Hopefully I'll finish my patch for content scripts soon that would just expose a message port so you won't have to deal with any of this at all, but in the meantime I'd suggest to use thing I've suggested in first patch.
Attachment #8475596 - Flags: feedback-
Flags: needinfo?(rFobic)
- removed previous MessageChannel hack in favor of the better workaround suggested in the previous review
- removed inactive/inhibited code for exchanging self/self.port events, instrumenter custom events and methods definitions (for a much more readable change without the unused/inactive code)
- changed setup method to accept reload as an option (like webaudio and webgl setup method)
- defined instrumenter-error, instrumenter-attach, instrumenter-detach dict types and route error, attach and detach events on the RDP connection
Attachment #8475580 - Attachment is obsolete: true
Attachment #8475584 - Attachment is obsolete: true
Attachment #8475588 - Attachment is obsolete: true
Attachment #8475596 - Attachment is obsolete: true
Attachment #8475580 - Flags: review?(poirot.alex)
Comment on attachment 8475582 [details] [diff] [review]
Bug980481_RegisterDirector_and_InstrumentersPropagatedToChildProcess.patch

Review of attachment 8475582 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good but I have some comments for the other patch.
Attachment #8475582 - Flags: review?(poirot.alex) → feedback+
Comment on attachment 8477573 [details] [diff] [review]
Bug980481 Director and InstrumenterActors - rev6

Review of attachment 8477573 [details] [diff] [review]:
-----------------------------------------------------------------

I still have hard time following the complexity of the final approach.
I got your point from comment 14:
 - should be able to evaluate the actor with content priviledges,
 - no need to know protocol.js,
 - should be able to describe its API spec.
But I'm not sure you really want/need API description.
Comment 6 tend to say it may be hard to accomplish,
especially if you want to make it automagically described.
In its current state, this patch navigates between calling method via protocol.js abstraction
and emitting/receiving messages via MessagePort.
You can just assume that all the communication with the actor will be through MessagePort
and drastically simplify the whole patch (and the usage of it for addon devs)!

I'm quite stressed to see such complexity on top of protocol.js which itself is already complex enough.
In this new revision of the patch I've removed all the complexity introduced by the dynamic actor registering, in favor of a simpler pre-registered ContentScriptInstrumenterActor which can be retrieved using the tab-level DirectorActor (which create the actor instances lazily when requested for the first time).

ContentScriptInstrumenterActor exposes a simple pre-defined protocol.js interface to setup ("enable content tab globals creation tracking") the instrumenter and receive its messageportactor (if any).

DirectorActor is registered as global and tab actor, the global DirectorActor can be used to pre-install the instrumenter definitions, which will be accessible in all tabs and propagated to the child processes.
Attachment #8475582 - Attachment is obsolete: true
Attachment #8477573 - Attachment is obsolete: true
Attachment #8477581 - Attachment is obsolete: true
Attachment #8483477 - Flags: feedback?(poirot.alex)
Comment on attachment 8483477 [details] [diff] [review]
Bug980481 Director and Instrumenter Actors - rev7

Review of attachment 8483477 [details] [diff] [review]:
-----------------------------------------------------------------

I've only done a high level review, but that looks already much simplier!
I'm bit lost in the Worker API, but I'm wondering if we can simplify this even more
thanks to the window-ready/window-destroyed events.

::: toolkit/devtools/server/actors/director-instrumenter.js
@@ +80,5 @@
> +  }),
> +
> +  /**
> +   * Starts to receive and send queued messages on this message port.
> +   */

Requirering to call `start()` seems to only complexify the API for no clear purpose?

@@ +277,5 @@
> +
> +    this._setupCalled = true;
> +
> +    this._contentObserver = new ContentObserver(this._tabActor);
> +    on(this._contentObserver, "global-created", this._onGlobalCreated);

ContentObserver is now deprecated. Instead you should be listenening to window-ready event being dispatched on TabActor:
  on(this._tabActor, "window-ready", function ({window, isTopLevel}) {...});
window-ready is dispatched for any document loaded in the tab, so it is also dispatched for iframes.
If you want to only take care of top level tab document, filter with `if (isTopLevel) {...}`.
This event will also be dispatched when switching to an iframe as well as when tab navigates or when documents get frozen/unfrozen in the bfcache.
So that when you listen to this event, you should automagically support iframe switching.

@@ +283,5 @@
> +    if (reload) {
> +      this.window.location.reload();
> +    } else {
> +      // fake a global created event to attach without reload
> +      emit(this._contentObserver, "global-created",this.window);

You shouldn't fake such event, instead you should call a sub method of onGlobalCreated.

@@ +328,5 @@
> +
> +  // local helpers
> +  get window() {
> +    return this._tabActor.window;
> +  },

I'm not sure such helper is really useful when used once/twice.

@@ +346,5 @@
> +
> +     // subscribe content script events
> +     on(this._contentWorker, "error", this._onContentScriptError);
> +     on(this._contentWorker, "attach", this._onContentScriptAttach);
> +     on(this._contentWorker, "detach", this._onContentScriptDetach);

In case you don't need the whole Worker API, note that two events are dispatched on TabActor: window-ready and window-destroyed.
They are somewhat equivalent to the event being tracked to implement attach/detach events.
So I don't know but you could just create a sandbox and get rid of the attach/detach things.

::: toolkit/devtools/server/main.js
@@ +648,5 @@
> +    mm.sendAsyncMessage("debug:connect", {
> +      prefix: prefix,
> +      // propagate installed instrumenters definitions to the child process
> +      directorInstrumenters: require("./actors/director").getDirectorInstrumenters()
> +    });

Hum I just realized that getDirectorInstrumenters() is most likely going to contain the big contentScript strings.
Regarding memory and performances, I think it would be better to lazily fetch contentScripts only when needed, i.e. when we call getInstrumenter.
It will most likely complexify things, but I think it is worth it as actor sources might be of a significant size!
from main.js you can introduce another addMessageListener,
and from child.js you can introduce some sendSyncMessage/sendAsyncMessage() in order to fetch instrumenters options on demand.

::: toolkit/devtools/server/tests/mochitest/test_director.html
@@ +173,5 @@
> +    var waitForInstrumenterAttach = waitForEvent(testInstrumenterClient, "attach");
> +
> +    // activate the instrumenter without window reloading
> +    // (and wait for attach)
> +    testInstrumenterClient.setup({reload: false});

In term of API, you have 3 steps:
- getInstrumenter
- setup
- start
And then you can send'n receive.
Are all these 3 steps really necessary?
Can't we merge all of them into getInstrumenter? Or at least end up with only 2?
Attachment #8483477 - Flags: feedback?(poirot.alex) → feedback+
Attachment #8486055 - Flags: feedback?(poirot.alex)
Comment on attachment 8486054 [details] [diff] [review]
Bug980481 Director and Instrumenter Actors - rev8 (port from deprecated ContentObserver to the new TabActor events)

Review of attachment 8486054 [details] [diff] [review]:
-----------------------------------------------------------------

Usage of tab actor events looks good, what about other remarks from comment 29?

::: toolkit/devtools/server/actors/director-instrumenter.js
@@ +275,5 @@
> +    }
> +
> +    this._setupCalled = true;
> +
> +    on(this.tabActor, "window-ready", this._onGlobalCreated);

At some point you should unregister this listener, may be in destroy?
Attachment #8486054 - Flags: feedback?(poirot.alex) → feedback+
Comment on attachment 8486055 [details] [diff] [review]
Bug980481 Lazy Propagated Director Instrumenters

Review of attachment 8486055 [details] [diff] [review]:
-----------------------------------------------------------------

I'm bit concerned about memory usage in child process, but the overall idea looks good.

::: toolkit/devtools/server/actors/director.js
@@ +57,5 @@
> +      });
> +    } else {
> +      console.debug("skip instrumenter (already installed)", instrumenterId);
> +    }
> +  });

Instead of hacking gDirectorInstrumenters for lazy instrumenters,
wouldn't it be easier to follow by always storing functions in gDirectorInstrumenters? Or an object with a constructor function as attribute?

::: toolkit/devtools/server/child.js
@@ +48,5 @@
> +          getInstrumenterById: (instrumenterId) => {
> +            return sendSyncMessage("debug:get-director-instrumenter",
> +                                   {instrumenterId: instrumenterId});
> +          }
> +      });

I wish we could lazy load director actor as well,
but don't have clear idea how to do so.
It ends up pulling director module and its dependencies: protocol.js (which isn't mandatory to use actors), director-instrumenter, and SDK world (content/worker, window/utils, heritage,...) :/
May be we could hook some attributes on DebuggerServer and use them from director.js?
  DebuggerServer.directorInstrumentersIds = msg.data.directorInstrumentersIds;
  DebuggerServer.getInstrumenterById = (..) => ...

::: toolkit/devtools/server/main.js
@@ +579,5 @@
>      let onActorCreated = DevToolsUtils.makeInfallible(function (msg) {
>        mm.removeMessageListener("debug:actor", onActorCreated);
>  
> +      // listen for lazy director instrumenter definition requests from the child process
> +      mm.addMessageListener("debug:get-director-instrumenter", onRequestDirectorInstrumenter);

It may just be easier to listen to 'debug:get-director-instrumenter' immediately like 'debug:actor'.

let onRequestDirectorInstrumenter = DevToolsUtils.makeInfallible(function (msg) {
		return require("./actors/director")
		.getDirectorInstrumenterById(msg.json.instrumenterId);
		}).bind(this);
// listen for lazy director instrumenter definition requests from the child process
mm.addMessageListener("debug:get-director-instrumenter", onRequestDirectorInstrumenter);

@@ +668,5 @@
>  
>      mm.sendAsyncMessage("debug:connect", {
>          prefix: prefix,
> +        // propagate installed instrumenters ids to the child process
> +        directorInstrumentersIds: require("./actors/director").getDirectorInstrumentersIds()

Since you are using director module here, in regular codepath...
you can import actor module in main.js header.
Attachment #8486055 - Flags: feedback?(poirot.alex) → feedback+
(In reply to Alexandre Poirot [:ochameau] from comment #32)
> Comment on attachment 8486054 [details] [diff] [review]
> Bug980481 Director and Instrumenter Actors - rev8 (port from deprecated
> ContentObserver to the new TabActor events)
> 
> Review of attachment 8486054 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Usage of tab actor events looks good, what about other remarks from comment
> 29?

I agree with the your remarks about the worker class and the needs to streamline the workflow directly exposed to the add-on developers.

I've used the "sdk/content/worker" temporary workaround to prove the general approach, waiting to be able to discuss it further with Irakli.

I'm going to rethink it and propose an approach based on your feedback and the outcome
of the discussion with Irakli asap. 
 
> ::: toolkit/devtools/server/actors/director-instrumenter.js
> @@ +275,5 @@
> > +    }
> > +
> > +    this._setupCalled = true;
> > +
> > +    on(this.tabActor, "window-ready", this._onGlobalCreated);
> 
> At some point you should unregister this listener, may be in destroy?

in the finalize:

> +    off(this.tabActor);

if I'm not wrong 'off' without event and callback parameters should remove all the
handlers.

This is a direct change of the previous 'off(this._contentObserver)', but now that we use
tabActor events removing all the tabActor event handlers is definitely wrong (because it will probably remove event handlers added by other modules as well).

So I'm going to change it into

> -    off(this.tabActor);
> +    off(this.tabActor, "window-ready", this._onGlobalCreated);

as suggested.
Clearing this needinfo flag - I'm pretty sure we talked since this flag, feel free to reping if I'm wrong.
Flags: needinfo?(dcamp)
In this new revision I've included the following changes (related to the previous feedback):

- renamed InstrumenterContentScript to DebugScript
- streamlined workflow (builk activation of debug scripts from the director actor, debug script events routed from the director actor
- moved register/unregister/list/get debug scripts helpers in the RemoteDebuggerServer 
- lazy list and get directors in RemoteDebuggerServer running in a child process
- removed addon-sdk content worker dependencies (replaced with a simpler non-privileged sandbox)
- messageport injected in the debug script sandbox before debug script code evaluation (currently as "port")

I've tested the lazy child process debug scripts resolver manually because I've verified that the iframe used in the "test_director_connectToChild.html" test case is not running in a child process.

I'm going to search deeply in the various testsuite for a better place to automate "e10s & B2G" test cases.

Any advice about where should I look first?

NOTE: currently the debug-scripts attach/detach/error events are sent twice (one from the debug-script actor and the other from the director actor), 
if the new simplified workflow looks good on a preliminary feedback, I'm going
to remove that events from the DebugScriptActor protocol.js definition and leave the responsability to the director actor.
Attachment #8486054 - Attachment is obsolete: true
Attachment #8486055 - Attachment is obsolete: true
Attachment #8495274 - Flags: feedback?(poirot.alex)
In this small incremental change I'm exploring how the debug-script will be able to evaluate javascript code in the target window now that evalInWindow is deprecated (https://bugzilla.mozilla.org/show_bug.cgi?id=1042840).

One of the options added by this change is to expose in the debug-script sandbox the evalWithDebugger function implemented by the webconsole actor, it is what all the native devtools use when they need to evaluate code in the target global (e.g. the webconsole/splitconsole and the scratchpad), it is pretty safe from a security point of view (e.g. it can't be redefined from the target, even if the target is chrome-privileged ) and its behaviour will be consistent to the other tools.

how it looks to you?
Attachment #8496820 - Flags: feedback?(rFobic)
Attachment #8496820 - Flags: feedback?(poirot.alex)
Comment on attachment 8495274 [details] [diff] [review]
Bug980481_DirectorDebugScripts_rev9.patch

Review of attachment 8495274 [details] [diff] [review]:
-----------------------------------------------------------------

To summarize a bit the major issue I see with this patch,
I think it would help splitting DirectorActor in two.
One that would expose installDebugScript, is a singleton global actor and lives in parent process
and the other once that exposes activateDebugScript, which is a tab actor instantiated once per tab
and may live in a parent process.
Then this split would most likely help moving code from main.js back to director.js.

Finally, you are using the word "debug" for debug script everywhere,
I'm not convinced that's the best naming for it.
Unfortunately I don't know what would be the best name. I can only think about "content script",
which may conflict with the current usage of this name.
But I must say that it really looks like a content script except that it is remote and it is content script actors.

At the end I still like the way this patch evolve. We may be close to something ready for landing.

::: toolkit/devtools/server/actors/director-debug-script.js
@@ +3,5 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +"use strict";

I'm not a big fan of "director-debug-script.js" and Debug* class names.
I'm not sure the usage of "debug" describe correctly all these classes.
Wouldn't ContentScriptActor and ContentScriptSandbox be better?
And what about calling this file director-content-script.js? or director-scripts.js?
(Same comment apply to other files where you are using the word "debug")

It makes me wonder what is the difference between the content script API and directors!?
Is there any except that it works remotely and also on e10s tabs?

@@ +312,5 @@
> +      // fake a global created event to attach without reload
> +      this._onGlobalCreated({ id: getWindowID(this.window), window: this.window, isTopLevel: true });
> +    }
> +  }, {
> +    request: { reload: Option(0, "boolean") },

What about skipAttach argument?

@@ +340,5 @@
> +
> +    off(this.tabActor, "window-ready", this._onGlobalCreated);
> +    off(this.tabActor, "window-destroyed", this._onGlobalDestroyed);
> +
> +    this._debugScriptSandbox.destroy("debug-script-finalize");

You may not always have a _debugScriptSandbox set.

@@ +341,5 @@
> +    off(this.tabActor, "window-ready", this._onGlobalCreated);
> +    off(this.tabActor, "window-destroyed", this._onGlobalDestroyed);
> +
> +    this._debugScriptSandbox.destroy("debug-script-finalize");
> +    this._onGlobalDestroyed({ window: this.window, isTopLevel: true });

The call to _debugScriptSandbox.destroy and the call to _onGlobalDestroyed look redundant.
sandbox.destroy is also called from onGlobalDestroyed.

@@ +363,5 @@
> +       return;
> +     }
> +
> +     // TODO: check if we want to share a single sandbox per global
> +     //       for multiple debugger clients

Until then, I would imagine it is better to prevent creating a new sandbox if there is already a sandbox set.

@@ +388,5 @@
> +       return;
> +     }
> +
> +     // unmanage and cleanup the messageport actor
> +     if (!!this._messagePortActor) {

Is `!!` really useful?

@@ +394,5 @@
> +       this._messagePortActor = null;
> +     }
> +
> +     // NOTE: destroy here the old worker
> +     if (!!this._debugScriptSandbox) {

Same question here.

@@ +421,5 @@
> +  },
> +  _onDebugScriptAttach: function(window, port) {
> +    let portActor = new MessagePortActor(this.conn, port);
> +    this.manage(portActor);
> +    this._messagePortActor = portActor;

Shouldn't we do something if there is already a _messagePortActor being set? (unmanage/destroy?)

@@ +425,5 @@
> +    this._messagePortActor = portActor;
> +
> +    emit(this, "attach", {
> +      debugScriptId: this._debugScriptId,
> +      url: (!!window && !!window.location) ? window.location.toString() : "",

I'm really unsure all these `!!` are any useful.

@@ +450,5 @@
> +const DebugScriptSandbox = Class({
> +  initialize: function({debugScriptId, debugScriptCode, debugScriptOptions}) {
> +    this._debugScriptId = debugScriptId;
> +    this._debugScriptCode = debugScriptCode;
> +    this._debugScriptOptions = debugScriptOptions;

No need to prefix all variable of DebugScriptSandbox by _debugScript.
It will be easier to read by calling these variables _id, _code and _options!

::: toolkit/devtools/server/actors/director.js
@@ +30,5 @@
> +
> +exports.unregister = function(handle) {
> +  handle.removeTabActor(DirectorActor);
> +  handle.removeGlobalActor(DirectorActor);
> +};

register/unregister are no longer useful since we are loading actor lazily.

::: toolkit/devtools/server/main.js
@@ +572,5 @@
>      });
> +    this.registerModule("devtools/server/actors/director", {
> +      prefix: "director",
> +      constructor: "DirectorActor",
> +      type: { global: true, tab: true }

This actor ends up a bit weird regarding being a global or tab actor.
There is the usage of the two kinds, which is a bit weird, especially since one part of its methods are meant to be usage only as global,
and the others only as tab actor.

For ex, installDebugScript is only meant to be usage when instanciated on the root actor, whereas activateDebugScripts is only meant to be used when instanciated out of a tab actor.

I'm wondering if it wouldn't be really easier to follow if we split it into two disctinct actors!

@@ +613,5 @@
> +        get: debugScriptDef.get
> +      });
> +
> +      return true;
> +    }

It looks like some old code. I don't see any usage of `lazyLoading` attribute in this patch.

@@ +721,5 @@
> +    }
> +
> +    return this.debugScripts[id];
> +  },
> +

Hum, I'm now a bit skeptical when I see so many "director specifics" into DebuggerServer.
Directors now doesn't register any actor on DebuggerServer, it only maintains "debug scripts". It ends up being actors, but managed by the director actor itself. They aren't global nor tab actors, but something really special and specific to director.

Now, I think all this code should live in director.
But the issue is that you need a way to communicate between the director actor instanciated in the main process [1] (which is a singleton, it should be instanciated only once and exposes the installDebugScript method) and the child processes where you may have many director actors [2] exposing the activeDebugScript method.

So you should be able to move all these methods back to DirectorActor, or just merge them as most of them are already there.
The trick would be to get access to sendSyncMessage and addMessageListener directly from director.js!
I would suggest, from child.js to do that:
  if (!DebuggerServer.initialized) {
    DebuggerServer.init();
    
    DebuggerServer.parentMessageManager = {
      sendSyncMessage: sendSyncMessage,
      addMessageListener: addMessageListener
    };
  }
I'm not sure "parentMessageManager" is the best name to choose.
But that would help knowing, from an actor definition, if we are within a child, and if we are, how to communicate with the parent process!

Then, from connectToChild, instead of hacking it, it would be better to send an event and listen for it from director.js.
  events.emit(this, "connect-to-child", mm);
(Instead of this.installParentDebugScriptsHandlers(mm);)
Attachment #8495274 - Flags: feedback?(poirot.alex) → feedback+
Comment on attachment 8496820 [details] [diff] [review]
Bug980481_DebugScriptsEvalWithDebuggerHelper.patch

Review of attachment 8496820 [details] [diff] [review]:
-----------------------------------------------------------------

We shouldn't intanciate a console actor just for eval method!
Also the console eval method does way more stuff than we expect here.
I'm pretty sure it is way easier to expose a safe eval method now.

Irakli, isn't this something we should ask to gabor?
Or may be you know which xpconnect method we should be using?
Attachment #8496820 - Flags: feedback?(poirot.alex) → feedback-
Also note that bug 977443 has some updated patch attached.
We plan to also support on-demand registration of regular actors with full privileges.
I hoped that we could somehow share some code between your patches (your patch and honza one),
but I looked at both and it seems very unlikely, except the thing I just suggested in comment 38 about DebuggerServer.parentMessageManager and connect-to-child event.
I don't think that's so bad to not share much as director is implementing something quite different from regular actors.
(In reply to Alexandre Poirot [:ochameau] from comment #40)
> Also note that bug 977443 has some updated patch attached.
> I hoped that we could somehow share some code between your patches (your
> patch and honza one),
Yes this would be great. Luca we managed to fix Nick's patch. It introduces a 'ActorRegistryActor' that can be used to dynamically register new actors. The patch is quite simple and I think we could build other (more complex) logic on top of it (and share code if it makes the code cleaner).

> but I looked at both and it seems very unlikely, except the thing I just
> suggested in comment 38 about DebuggerServer.parentMessageManager and
> connect-to-child event.
Sounds good to me. Luca, I remember you created nice (UML?) diagrams explaining the logic behind Director actor. Are those still available?

> register/unregister are no longer useful since we are loading actor lazily.
Alex, what do you mean by "we are loading actor lazily"? (comes from the comment #38)

Honza
Updated patch:

- renamed DebugScript into DirectorScript
- splitted DirectorActor into a global DirectorRegistry actor and a tab DirectorManager actor
- minimized the addon-sdk modules loaded by the director-registry module
to the single "protocol.js" commonjs module (because it needs to be loaded when we connect to a child process)
- added a DirectorManagerActor.disableByScriptIds method
- changed DirectorManagerActor.list to return two array of scriptIds: installed and enabled  
- general patch cleanup
Attachment #8495274 - Attachment is obsolete: true
Attachment #8496820 - Attachment is obsolete: true
Attachment #8496820 - Flags: feedback?(rFobic)
Attachment #8508109 - Flags: review?(poirot.alex)
In this patch (required by the rev10 patch, attachment 8508109 [details] [diff] [review]), I've isolated the proposed changes on DebuggerServer main.js and child.js modules that optionally expose "actor module setup on parent/child process" hooks to the actor modules.

It is based on the suggestion from the previous comments, plus the change to the Debugger Server lazy module registering to ensure that we know which modules needs to be required and called to complete their setup on the parent and child process sides.

I don't like too much the parameter of the setupParentProcess/setupChildProcess functions exported by the actor modules that needs this hooks (because it doesn't explicit what you should do of this DebuggerServer parameter and when/how you should unsubscribe the connect-to-child/disconnect-to-child handlers from the DebuggerServer).

How about the following alternative hooks?

- childProcessSetup({ sendSyncMessage, sendAsyncMessage, addMessageListener })
- parentProcessConnectToChild(mm)
- parentProcessDisconnectFromChild(mm)
Attachment #8508129 - Flags: review?(poirot.alex)
Running the director's mochitests (from the attachment 8508129 [details] [diff] [review]), the DirectorManagerActor.finalize throws an exception:

when DirectorManagerActor.finalize is called from the DirectorManagerActor.destroy, "this.unmanage" will raise an exception because this.__poolMap is null.

This patch turns the "unmanage" method into a nop if "this.__poolMap is null or undefined" (as many of the "actor pool management" helpers defined in the protocol.js module already do)
Attachment #8508140 - Flags: review?(poirot.alex)
Comment on attachment 8508129 [details] [diff] [review]
Bug980481_DebuggerServer_E10S_setup.patch

Review of attachment 8508129 [details] [diff] [review]:
-----------------------------------------------------------------

You got it right, but I think we can do even better! With even more lazyness, because been lazy is great \o/
I'm sorry this patch looks like such a quest...

DebuggerServer.runE10SModuleSetup*Child*Process(); looks useless.
We should try to execute setupChildProcess directly from directory-registry as soon as we are in a child.
With your current patch we can figure this out by checking if DebuggerServer.parentMessageManager is defined,
but we could also expose a more explicit boolean flag to help readability (DebuggerServer.isInChildProcess or something).
And it isn't just useless, it also forces loading directory-registry.js no matter if we use director or not in child :/

So please remove runE10SModuleSetupChildProcess in this patch and tweak directory-registry.js to call setupChildProcess during module loading, only if we are in a child.

runE10SModuleSetup*Parent*Process can also be tweaked. That would be really great, but we can tweak that in a followup if you want.
As-is, it is similar to what we currently do for network monitor. We load some network monitor specifics in parent process,
no matter if we end up using the netmon in the child or not.
It would be perfect if we could load these ressources only if the related actor is loaded in the child.
Instead of sticking to only actors, I imagined something more flexible.
From director-registry.js, we would do something like:
  if (DebuggerServer.isInChildProcess) {
    DebuggerServer.parentMessageManager.addMessageListener(..., ...);
    DebuggerServer.requireInParent("toolkit/devtools/my/module", "symbolNameToCall");
    // In your case, you want to load the actor module, but in network monitor,
    // we load just some parent<->child helper code.
    // For you, the module path will be toolkit/devtools/server/actors/director-registry.
  }

DebuggerServer.requireInParent would use custom message manager messages to load modules in the parent and call one of its methods.


Both suggestions are based on the laziness of tab actors in the child process,
from here, we start loading dependencies on demand.
Attachment #8508129 - Flags: review?(poirot.alex)
Comment on attachment 8508109 [details] [diff] [review]
Bug980481_DirectorScripts_rev10.patch

Review of attachment 8508109 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, we would need some tweaks according to attachment 8508129 [details] [diff] [review].

I mainly reviewed the actor/devtools part.
Irakli can you look at the API being exposed?
I talked a bit about the API itself in comment 29 but I don't have any strong opinion on what you want to expose to addons.

::: toolkit/devtools/server/actors/director-manager.js
@@ +704,5 @@
> +      return { enumerable: true, configurable: false, writable: false,
> +        value: directorScriptConsole[fun] };
> +    };
> +
> +    let properties = Object.keys(directorScriptConsole["__exposedProps__"])

I think __exposedProps__ is going away :/
It would be great to find another way to do that.
Attachment #8508109 - Flags: review?(rFobic)
Attachment #8508109 - Flags: review?(poirot.alex)
Attachment #8508109 - Flags: review+
Comment on attachment 8508140 [details] [diff] [review]
Bug980481_fix_protocoljs_unmanage_during_actordestroy.patch

Review of attachment 8508140 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks Luca for keeping up submitting patches!!
Attachment #8508140 - Flags: review?(poirot.alex) → review+
In this patch I've reworked the e10s setup following the suggestions from comment #45

In the 'toolkit/devtools/server/docs/lazy-actor-modules.md' I've added some notes about the registering of lazy loading actor modules to the DebuggerServer (because it is not currently documented in the protocol.js.md doc file) and added to it a section which explain how the e10s setup works.

Basically the DebuggerServer running in the child process will call the setupChild helper exported by the actor module when it loads the module for the first time.
Then the setupChild helper can optionally use the DebuggerServer.setupInParent helper to ask to the DebuggerServer running in the parent process to load the module (if not yet loaded) and to call its setupParent exported helper (which will subscribe event handlers on the frame MessageManager (the custom message manager messages) and the DebuggerServer (the "disconnected-from-child" event).

I preferred to keep the explicit "setupChild exported symbol name" instead of running the setup sequence implicitly, because the actor modules are required on the client to use the actors front classes and it will potentially run the setup sequence even when it is not needed/expected.
Attachment #8516082 - Flags: review?(poirot.alex)
(In reply to Alexandre Poirot [:ochameau] from comment #46)
> I think __exposedProps__ is going away :/
> It would be great to find another way to do that.

agreed, absolutely.

I'm still working to find the correct way to expose the console object without using the __exposedProps__, I need to do more tests to understand what I'm doing wrong with cu.cloneInto (and I hope to find the correct solution asap and submit a proposed change as a separate patch).
This change remove the deprecated __exposedProps__ as suggested in comment #46

The console API is now injected into the sandbox using the Cu.cloneInto helper with the cloneFunctions option.
Attachment #8516835 - Flags: review?(poirot.alex)
Comment on attachment 8516835 [details] [diff] [review]
Bug980481_remove_deprecated_exposedProps.patch

Review of attachment 8516835 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but I don't know much about this new cloneFunction argument.
I think you want gabor review on that.
Attachment #8516835 - Flags: review?(poirot.alex)
Attachment #8516835 - Flags: review?(gkrizsanits)
Attachment #8516835 - Flags: review+
Comment on attachment 8516082 [details] [diff] [review]
Bug980481_LazyE10SActorModuleSetup.patch

Review of attachment 8516082 [details] [diff] [review]:
-----------------------------------------------------------------

Could you merge both patches so that I can easily test them?

::: toolkit/devtools/server/actors/director-manager.js
@@ +36,5 @@
> +  // load and route the director-registry child setup to the
> +  // director-registry module
> +  var { setupChildProcess } = require("./director-registry");
> +  setupChildProcess();
> +};

I got your point about not executing this code when the actor is used by the front end. But I'm not sure it is worth the complexity.
1/ It will bail out on `if (!DebuggerServer.isInChildProcess)` (front it in Firefox UI, within parent process),
2/ you can move that within the actor constructor/initialize method.

I'm sorry to insist, I really like KISS:
http://en.wikipedia.org/wiki/KISS_principle

::: toolkit/devtools/server/actors/director-registry.js
@@ +114,5 @@
> +  if (gTrackedMessageManager.has(mm)) {
> +    return;
> +  }
> +
> +  gTrackedMessageManager.add(mm);

It doesn't look like you are making any use of gTrackedMessageManager, expect maintaining itself. Is it any useful?

@@ +117,5 @@
> +
> +  gTrackedMessageManager.add(mm);
> +
> +  // listen for director-script requests from the child process
> +  mm.addMessageListener("debug:parent-director-registry", handleChildRequest);

nit: `parent`, in the event name sounds weird. what about dropping it ? or what about debug:director-registry-request ?

@@ +119,5 @@
> +
> +  // listen for director-script requests from the child process
> +  mm.addMessageListener("debug:parent-director-registry", handleChildRequest);
> +
> +  DebuggerServer.once("disconnected-from-child", handleMessageManagerListeners);

I think you can now rename handleMessageManagerListeners to something more precise like handleMessageManagerDisconnect

::: toolkit/devtools/server/main.js
@@ +779,5 @@
> +          frame: aFrame,
> +          mm: mm,
> +          actor: actor,
> +          childID: childID
> +      });

Some error checking with nice and exception error message would help debugging this new feature!
Like: let m; try {m = require(module)} catch(e) {dump(...)}
      if (!setupParent in m) dump(...);
Attachment #8516082 - Flags: review?(poirot.alex)
Comment on attachment 8516835 [details] [diff] [review]
Bug980481_remove_deprecated_exposedProps.patch

Review of attachment 8516835 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8516835 - Flags: review?(gkrizsanits) → review+
minimal changes over the previous attachment #8508140 [details] [diff] [review]:

- rebased on the new head commit in the fx-team branch
- moved in the head of the patch queue ('cause it doesn't depends on the patches that follow)
Attachment #8508140 - Attachment is obsolete: true
This new patch includes:

- obsolete attachment #8508129 [details] [diff] [review] and attachment #8516082 [details] [diff] [review] folded together
- cleaned of all changes related to the director-registry and director-manager modules (moved in the patch which follows it)
- applied tweaks based on suggestions from comment #52
Attachment #8508129 - Attachment is obsolete: true
Attachment #8516082 - Attachment is obsolete: true
Attachment #8520257 - Flags: feedback?(poirot.alex)
Comment on attachment 8520257 [details] [diff] [review]
Bug980481_DebuggerServer_E10S_setup.patch

My apologies, fixed previous attachment flag.
Attachment #8520257 - Flags: feedback?(poirot.alex) → review?(poirot.alex)
This is the last patch in the queue, and it includes:

- folded together the previous (now obsolete) attachment #8508109 [details] [diff] [review] and attachment #8516835 [details] [diff] [review]
- included tweaks based on feedback and suggestions from #comment 52

NOTE: In this patch I've not yet removed the gTrackedMessagedManager, because in my current vision/understanding of the RemoteDebuggerServer e10s lifecycle, it helps to prevent registering message manager handlers more than once.
Attachment #8508109 - Attachment is obsolete: true
Attachment #8516835 - Attachment is obsolete: true
Attachment #8508109 - Flags: review?(rFobic)
Attachment #8520270 - Flags: review?(poirot.alex)
(In reply to Alexandre Poirot [:ochameau] from comment #52)
> Could you merge both patches so that I can easily test them?

Absolutely, I've reordered and folded the patched in three non-overlapping patched.

> I'm sorry to insist, I really like KISS:
> http://en.wikipedia.org/wiki/KISS_principle

No worries, writing code is often like talking/discussing together.
and I really like how this patch has taken shapes from our discussions :-)

 
> ::: toolkit/devtools/server/actors/director-registry.js
> @@ +114,5 @@
> > +  if (gTrackedMessageManager.has(mm)) {
> > +    return;
> > +  }
> > +
> > +  gTrackedMessageManager.add(mm);
> 
> It doesn't look like you are making any use of gTrackedMessageManager,
> expect maintaining itself. Is it any useful?

I've described in the comment #57 the reasons, nevertheless I'm open to rethink it further.
Comment on attachment 8520257 [details] [diff] [review]
Bug980481_DebuggerServer_E10S_setup.patch

Review of attachment 8520257 [details] [diff] [review]:
-----------------------------------------------------------------

Here is another couple of tweaks, but that looks good to me!

::: toolkit/devtools/server/main.js
@@ +824,5 @@
>      let prefix = aConnection.allocID("child");
>      let childID = null;
>      let netMonitor = null;
>  
> +    // provides hook to actors that needs to lazily e10s propagation of their

This comment looks truncated and doesn't make much sense to me?

@@ +833,5 @@
> +      try {
> +        m = require(module);
> +
> +        if (!setupParent in m) {
> +          dumpn("ERROR: module ", module, "does not export ", setupParent);

dumpn doesn't seem to support more than one argument??
http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/DevToolsUtils.js#315

@@ +834,5 @@
> +        m = require(module);
> +
> +        if (!setupParent in m) {
> +          dumpn("ERROR: module ", module, "does not export ", setupParent);
> +          return false;

Does message manager care about the returned value of message listeners?
I'm not sure it does, so you can just `return;`

@@ +843,5 @@
> +          frame: aFrame,
> +          mm: mm,
> +          actor: actor,
> +          childID: childID
> +        });

You are passing many things here, but at the end you end up using only `mm`. In order to stick with KISS and prevent landing preemptive features, I would prefer just passing `mm` and add new stuff only when we will need it.

@@ +853,5 @@
> +        dumpn(["ERROR: exception during actor module parent setup",
> +              " \n\t module: ", module, " \n\t setupParent: ", setupParent,
> +              DevToolsUtils.safeErrorString(e)
> +              ].join(""));
> +        return false;

The use of DevToolsUtils.makeInfallible and this exception handler in redundant. makeInfaillible goal is to try/catch and report the error.
It is very likely that makeInfallible error handler is less useful than this one, so you can just get rid of makeInfallible.

@@ +888,5 @@
>      let onMessageManagerDisconnect = DevToolsUtils.makeInfallible(function (subject, topic, data) {
>        if (subject == mm) {
>          Services.obs.removeObserver(onMessageManagerDisconnect, topic);
> +
> +        // provides hook to actors that needs to lazily e10s propagation of their

Same thing here
Attachment #8520257 - Flags: review?(poirot.alex) → review+
Comment on attachment 8520270 [details] [diff] [review]
Bug980481_DirectorScripts_rev11.patch

Review of attachment 8520270 [details] [diff] [review]:
-----------------------------------------------------------------

Same for this patch, that looks good to me, just some more tweaks and nits.

Here is a try run:
  https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c85ba9d17ed6
I haven't tested this patch, have you? Also on a device/simulator?

Also wish we could hear some feedback from SDK people...
I don't have much opinion on the Sandbox/script API and haven't reviewed it extensively.

::: toolkit/devtools/server/actors/director-manager.js
@@ +32,5 @@
> +  // load and route the director-registry child setup to the
> +  // director-registry module
> +  var { setupChildProcess } = require("./director-registry");
> +  setupChildProcess();
> +}

I think it would be better to move this code to director-registry.js.
director-registry can easily self-maintain itself without any external dependency!

@@ +97,5 @@
> +      return;
> +    }
> +
> +    if (Cu.isDeadWrapper(this.port)) {
> +      console.error(ERR_MESSAGEPORT_DEAD);

Same comment here about hiding failure reason to the clients.
It may be better to throw?

@@ +291,5 @@
> +    this.tabActor = tabActor;
> +
> +    this._scriptId = scriptId;
> +    this._scriptCode = scriptCode;
> +    this._scriptOptions = scriptOptions;

nit: it may be easier to just save `initialize` second argument, instead of saving each of its attribute?

::: toolkit/devtools/server/actors/director-registry.js
@@ +20,5 @@
> +const ERR_DIRECTOR_UNINSTALL_UNKNOWN = "Trying to uninstall an unkown director-script";
> +
> +const ERR_DIRECTOR_PARENT_UNKNOWN_METHOD = "Unknown parent process method";
> +const ERR_DIRECTOR_CHILD_NOTIMPLEMENTED_METHOD = "Unexpected call to notImplemented method";
> +const ERR_DIRECTOR_CHILD_UNEXPECTED_REPLY = "Unexpected reply to called parent method";

We may mention that we received more than one response, which is the "unexpected" fact.

@@ +50,5 @@
> +    }
> +
> +    gDirectorScripts[id] = scriptDef;
> +
> +    return true;

nit (up to you): by returning a boolean instead of throwing in case of error, you prevent the client code to know why install request failed.
So that, when playing on device, you have to look at device logs to figure it out.

@@ +111,5 @@
> +let gTrackedMessageManager = new Set();
> +
> +exports.setupParentProcess = function setupParentProcess({mm}) {
> +  // prevents multiple subscriptions on the same messagemanager
> +  if (gTrackedMessageManager.has(mm)) { return; }

Here and elsewhere, the general code guideline in mozilla-central is to not use onliner like this.
We always use braces `{}` and always jump to a new line after `{`.

@@ +117,5 @@
> +
> +  // listen for director-script requests from the child process
> +  mm.addMessageListener("debug:director-registry-request", handleChildRequest);
> +
> +  DebuggerServer.on("disconnected-from-child", handleMessageManagerDisconnected);

I'm wondering at the end if "disconnected-from-child" is any useful.
It looks like we can just listen to: Services.obs.addObserver("message-manager-disconnect", handleMessageManagerDisconnect, false);

But that may be easier and clearer to use this new event.
I let this up to you.

@@ +197,5 @@
> +
> +  /* init & destroy methods */
> +  initialize: function(conn, tabActor) {
> +    protocol.Actor.prototype.initialize.call(this, conn);
> +    this.tabActor = tabActor;

It looks like you are not using `tabActor` anymore.

@@ +198,5 @@
> +  /* init & destroy methods */
> +  initialize: function(conn, tabActor) {
> +    protocol.Actor.prototype.initialize.call(this, conn);
> +    this.tabActor = tabActor;
> +    this._directorScriptActorsMap = new Map();

nor this map

::: toolkit/devtools/server/actors/root.js
@@ +150,5 @@
>      addNewRule: true,
>      // Whether the dom node actor implements the getUniqueSelector method
>      getUniqueSelector: true,
> +    // Whether the director scripts are supported
> +    directorScripts: true,

A trait isn't mandatory as we can check for actor availibility
`if (form.directorRegistryActor)`. But it's ok to add a trait if that can help JPM or whatever tool to detect the feature.
Attachment #8520270 - Flags: review?(poirot.alex) → review+
Updated patch with changes based on feedback from comment #59
Attachment #8520257 - Attachment is obsolete: true
Attachment #8526713 - Flags: review?(poirot.alex)
Updated patch with changes based on feedback from comment #60
Attachment #8520270 - Attachment is obsolete: true
Attachment #8526714 - Flags: review?(poirot.alex)
(In reply to Alexandre Poirot [:ochameau] from comment #59)
> ::: toolkit/devtools/server/main.js
> @@ +824,5 @@
> >      let prefix = aConnection.allocID("child");
> >      let childID = null;
> >      let netMonitor = null;
> >  
> > +    // provides hook to actors that needs to lazily e10s propagation of their
> 
> This comment looks truncated and doesn't make much sense to me?

definitely, I've changed and completed the comments (both) in the updated patch

> @@ +833,5 @@
> > +      try {
> > +        m = require(module);
> > +
> > +        if (!setupParent in m) {
> > +          dumpn("ERROR: module ", module, "does not export ", setupParent);
> 
> dumpn doesn't seem to support more than one argument??
> http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/DevToolsUtils.
> js#315

My fault, probably introduce converting console.log to dumpn in bulk.

fixed in the updated patch.

 
> @@ +834,5 @@
> > +        m = require(module);
> > +
> > +        if (!setupParent in m) {
> > +          dumpn("ERROR: module ", module, "does not export ", setupParent);
> > +          return false;
> 
> Does message manager care about the returned value of message listeners?
> I'm not sure it does, so you can just `return;`

If I'm not wrong, if the caller use sendSyncMessage the return value will be returned accordingly,

So the setupParent helper returns true or false to give the caller the change to know if the setup completed correctly.

> 
> @@ +843,5 @@
> > +          frame: aFrame,
> > +          mm: mm,
> > +          actor: actor,
> > +          childID: childID
> > +        });
> 
> You are passing many things here, but at the end you end up using only `mm`.
> In order to stick with KISS and prevent landing preemptive features, I would
> prefer just passing `mm` and add new stuff only when we will need it.

sounds good, I've added the actor and frame properties because it will be helpful
to port the network monitor to this helpers in a follow up.

In the last commit I've removed the currently unused parameters and added a TODO comment
about the possible followup.

> 
> @@ +853,5 @@
> > +        dumpn(["ERROR: exception during actor module parent setup",
> > +              " \n\t module: ", module, " \n\t setupParent: ", setupParent,
> > +              DevToolsUtils.safeErrorString(e)
> > +              ].join(""));
> > +        return false;
> 
> The use of DevToolsUtils.makeInfallible and this exception handler in
> redundant. makeInfaillible goal is to try/catch and report the error.
> It is very likely that makeInfallible error handler is less useful than this
> one, so you can just get rid of makeInfallible.

Agree, in the updated patch I've removed the redunant DevToolsUtils.makeInfallible
(In reply to Alexandre Poirot [:ochameau] from comment #60)
> Comment on attachment 8520270 [details] [diff] [review]
> Bug980481_DirectorScripts_rev11.patch
> 
> Review of attachment 8520270 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Same for this patch, that looks good to me, just some more tweaks and nits.
> 
> Here is a try run:
>   https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c85ba9d17ed6
> I haven't tested this patch, have you? Also on a device/simulator?

I'm testing it on an E10S-enabled Firefox Desktop profile on every change,
and I've tested it in the simulator some revisions ago.

I'm going to build a custom simulator or b2g asap, but I wonder if I can
add a couple of e10s/b2g tests somewhere:
I've tried the mochitest-chrome testsuite with the "--e10s" but it get stuck before start to run any test cases. 
  

> Also wish we could hear some feedback from SDK people...
> I don't have much opinion on the Sandbox/script API and haven't reviewed it
> extensively.

should I add the feeback flag to the director patch or a comment with a needinfo flag?
 
> ::: toolkit/devtools/server/actors/director-manager.js
> @@ +32,5 @@
> > +  // load and route the director-registry child setup to the
> > +  // director-registry module
> > +  var { setupChildProcess } = require("./director-registry");
> > +  setupChildProcess();
> > +}
> 
> I think it would be better to move this code to director-registry.js.
> director-registry can easily self-maintain itself without any external
> dependency!

Unfortunately it can't self-maintain itself because the director registry isn't a tab actor and it will never be loaded in the child process.

The current lazy loading sequence is:

- the debugger client asks for the tab directorManager actor
- the debugger server running in the child process lazily load the directorManager for the first time
- the directorManager load the directorRegistry actor module and starts its "child process" setup
- the directorRegistry asks to the DebuggerServer running in the parent process to run the setup helper the director-registry parent counterpart  

> 
> @@ +97,5 @@
> > +      return;
> > +    }
> > +
> > +    if (Cu.isDeadWrapper(this.port)) {
> > +      console.error(ERR_MESSAGEPORT_DEAD);
> 
> Same comment here about hiding failure reason to the clients.
> It may be better to throw?

sound good, I've removed all the dead messageport checks to leave the exception to be throw

> ::: toolkit/devtools/server/actors/director-registry.js
> @@ +20,5 @@
> > +const ERR_DIRECTOR_UNINSTALL_UNKNOWN = "Trying to uninstall an unkown director-script";
> > +
> > +const ERR_DIRECTOR_PARENT_UNKNOWN_METHOD = "Unknown parent process method";
> > +const ERR_DIRECTOR_CHILD_NOTIMPLEMENTED_METHOD = "Unexpected call to notImplemented method";
> > +const ERR_DIRECTOR_CHILD_UNEXPECTED_REPLY = "Unexpected reply to called parent method";
> 
> We may mention that we received more than one response, which is the
> "unexpected" fact.

In the updated patch I've splitted the UNEXPECTED_REPLY into two more specific errors:
ERR_DIRECTOR_CHILD_MULTIPLE_REPLIES and ERR_DIRECTOR_CHILD_NO_REPLY.

> @@ +111,5 @@
> > +let gTrackedMessageManager = new Set();
> > +
> > +exports.setupParentProcess = function setupParentProcess({mm}) {
> > +  // prevents multiple subscriptions on the same messagemanager
> > +  if (gTrackedMessageManager.has(mm)) { return; }
> 
> Here and elsewhere, the general code guideline in mozilla-central is to not
> use onliner like this.
> We always use braces `{}` and always jump to a new line after `{`.

fixed coding style in the updated patch

> @@ +117,5 @@
> > +
> > +  // listen for director-script requests from the child process
> > +  mm.addMessageListener("debug:director-registry-request", handleChildRequest);
> > +
> > +  DebuggerServer.on("disconnected-from-child", handleMessageManagerDisconnected);
> 
> I'm wondering at the end if "disconnected-from-child" is any useful.
> It looks like we can just listen to:
> Services.obs.addObserver("message-manager-disconnect",
> handleMessageManagerDisconnect, false);
> 
> But that may be easier and clearer to use this new event.
> I let this up to you.

This made me to think about it a bit:

in the current form it doesn't add any value to what 'Services.obs.addObserver("message-manager-disconnect", ...)' already does (besides the need to include both Services and DebuggerServer in the module which use this "lazy e10s setup" feature)  

... and I dont' like a lot the "manual" cleanup approach which it's error prone:

> // listen for director-script requests from the child process
> mm.addMessageListener("debug:director-registry-request", handleChildRequest);
> 
> DebuggerServer.on("disconnected-from-child", handleMessageManagerDisconnected);
> function handleMessageManagerDisconnected(mm) {
>   if (/* we need to unsubscribe? */) {
>     DebuggerServer.off("disconnected-from-child", handleMessageManagerDisconnected);
>   }
> }

So in the updated patch I'm now using the childID to emit and subscribe "disconnected-from-child:CHILDID" events, and being able to use "DebuggerServer.once" to register the cleanup handler (which will be auto-unsubscribed):

> DebuggerServer.once("disconnected-from-child:" + ChildID, handleMessageManagerDisconnected); 
 
how it looks to you?

> @@ +197,5 @@
> > +
> > +  /* init & destroy methods */
> > +  initialize: function(conn, tabActor) {
> > +    protocol.Actor.prototype.initialize.call(this, conn);
> > +    this.tabActor = tabActor;
> 
> It looks like you are not using `tabActor` anymore.
> 
> @@ +198,5 @@
> > +  /* init & destroy methods */
> > +  initialize: function(conn, tabActor) {
> > +    protocol.Actor.prototype.initialize.call(this, conn);
> > +    this.tabActor = tabActor;
> > +    this._directorScriptActorsMap = new Map();
> 
> nor this map

removed unused tabActor and _directorScriptActorsMap.
we need a last feedback on the final API exposed to the sandbox by someone from the AddonSDK team:

- a messageport "shared" between the director-script and the debugger client exposed in the sandbox as "port"
- a director-script can optionally define a function to be called on sandbox nuking: "exports.onUnload = function () ..."
- a console object (created using the PlainTextConsole module from the addon-sdk) exposed in the sandbox as "console"

The director script introduce 4 new actor types:

- director registry (global actor): install/uinstall director script definitions
- director manager (tab actor): enable/disable installed director scripts
- director script (lazily created by director manager): manage the content script sandbox lifecycle
- message port (created by director script): move message port events between the remote debugger server and the connected debugger client
Flags: needinfo?(rFobic)
Comment on attachment 8526713 [details] [diff] [review]
Bug980481_DebuggerServer_E10S_setup.patch

Review of attachment 8526713 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/devtools/server/main.js
@@ +839,5 @@
> +          return false;
> +        }
> +
> +        // TODO: in a followup, frame (aFrame), actor (actor)
> +        //       1will be useful to move the network monitor from its custom

nit: typo `lwill`

@@ +840,5 @@
> +        }
> +
> +        // TODO: in a followup, frame (aFrame), actor (actor)
> +        //       1will be useful to move the network monitor from its custom
> +        //       e10s parent/child message exchange to the setupParent helper

I'm not sure this comment is any useful in codebase.
It may be enough to talk about that in a followup bug opened sooner than later ;)

@@ +846,5 @@
> +
> +        return true;
> +      } catch(e) {
> +        DevToolsUtils.reportException(["module: '", module, "', setupParent: '", setupParent,
> +                                       "'"].join(""), e);

Isn't it easier to use + rather than [].join("") ??

@@ +890,5 @@
> +        // provides hook to actor modules that need to exchange messages
> +        // between e10s parent and child processes
> +        // TODO: in a followup, frame (aFrame), actor (actor)
> +        //       1will be useful to move the network monitor from its custom
> +        //       e10s parent/child message exchange to the setupParent helper

Same comments for this TODO comment ;)
Attachment #8526713 - Flags: review?(poirot.alex) → review+
(In reply to Luca Greco from comment #64)
> (In reply to Alexandre Poirot [:ochameau] from comment #60)
> > Comment on attachment 8520270 [details] [diff] [review]
> > ::: toolkit/devtools/server/actors/director-manager.js
> > @@ +32,5 @@
> > > +  // load and route the director-registry child setup to the
> > > +  // director-registry module
> > > +  var { setupChildProcess } = require("./director-registry");
> > > +  setupChildProcess();
> > > +}
> > 
> > I think it would be better to move this code to director-registry.js.
> > director-registry can easily self-maintain itself without any external
> > dependency!
> 
> Unfortunately it can't self-maintain itself because the director registry
> isn't a tab actor and it will never be loaded in the child process.
> 
> The current lazy loading sequence is:
> 
> - the debugger client asks for the tab directorManager actor
> - the debugger server running in the child process lazily load the
> directorManager for the first time
> - the directorManager load the directorRegistry actor module and starts its
> "child process" setup
> - the directorRegistry asks to the DebuggerServer running in the parent
> process to run the setup helper the director-registry parent counterpart  
> 

You should really be able to move it to director-registry.js, as you are loading it anyway, thanks to:
  const { DirectorRegistry } = require("./director-registry");
few lines before!
Comment on attachment 8526714 [details] [diff] [review]
Bug980481_DirectorScripts_rev11.patch

Review of attachment 8526714 [details] [diff] [review]:
-----------------------------------------------------------------

> > DebuggerServer.once("disconnected-from-child:" + ChildID, handleMessageManagerDisconnected); 
>  
> how it looks to you?
> 

It looks good, thanks!
Attachment #8526714 - Flags: review?(poirot.alex) → review+
fixed nits based on feedback from comment #66 and rebased on the updated fx-team branch head
Attachment #8526713 - Attachment is obsolete: true
fixed nits based on feedback from comment #67 and rebased on the updated fx-team branch head to fix conflicts in advance
Attachment #8526714 - Attachment is obsolete: true
(In reply to Alexandre Poirot [:ochameau] from comment #67)
> You should really be able to move it to director-registry.js, as you are
> loading it anyway, thanks to:
>   const { DirectorRegistry } = require("./director-registry");
> few lines before!

Now I see what you mean, applied in the updated patch.
updated commit message + minor tweaks to the doc file
Attachment #8529714 - Attachment is obsolete: true
updated commit message
Attachment #8529716 - Attachment is obsolete: true
try build is green, so I'm going to add checkin-needed flags for the first two patches as discussed with Alex on IRC:

- attachment 8529744 [details] [diff] [review] (Bug 980481 - check for null __poolMap in the protocol.js unmanage method r=ochameau)
- attachment 8529745 [details] [diff] [review] (Bug 980481 - RemoteDebuggerServer e10s setupParent helper r=ochameau)

the last patch will wait for a final feedback from Addon SDK Team (I'm going to add "checkin-needed" on it once the needinfo request will be resolved):

- attachment 8529746 [details] [diff] [review] (Bug 980481 - DirectorRegistry, DirectorManager, DirectorScript and MessagePort RemoteDebuggerServer actors r=ochameau)
Keywords: checkin-needed
Comment on attachment 8529746 [details] [diff] [review]
Bug980481_DirectorScripts_rev11.patch

Review of attachment 8529746 [details] [diff] [review]:
-----------------------------------------------------------------

There are few things that I dislike about the current API (even though I was the one who suggested some of these earlier). Let me summarize them in the listing below:

1. API implies that script is executed per each window debugged.
2. Lifetime of a port is coupled with a lifetime of a window debugged.
3. Context of script execution is like module's although there are subtle differences & additional globals.
4. Special export `onUnload` makes it hard to define reusable higher level components.


While first two items from list above are true at present (because of the limitations of sandbox API),
I would be more comfortable to not encode them into an API so we could change that in a future. For
example we could reuse same sandbox across multiple windows or ports.

I think emulating a module environment is a good idea, but we should reduce subtle differences. More precisely
`module` should match sdk's (more details in diff). We should also expose `require` (that will throw exception
for now). Finally we should not add more globals (`window`, `port`, `scriptOptions`) instead we could pass those
as arguments to a main export.

I am not a fan of predefined names like `exports.onUnload`, it makes it hard to define reusable higher level components. I would personally much rather prefer to expose `unload` hook via `port` itself. It would be best to dispatch a special "message" event on the "port" `{data: {type: "unload" }}` instance to let the script know it needs to unload itself. Of course this means script will end up unloading if client sends that message but I don't think that's so unreasonable after all client send an unload message. Also note that once we will implement server part of volcan it would make even more sense.


Given all this I would suggest an API like this:

```js
exports.onMessage = function(message) {
  if (message.data.type === "unload") {
    exports.onUnload(message);
  } else {
    exports.receiveMessage(message);
  }
}

exports.attach = function({window, port, otherOptions}) {
  port.addEventListener("message", exports.onMessage);
}
```

That way there is no implication of how many times script is executed or how many times `attach` method
is invoked, which means we could change that in a future.



To summarize,  following changes are needed to land this:

1. Get rid of static exports.onUnload hook in favor of a special message on the exposed `port`.
2. Change `module` to match it's interface of sdk (https://github.com/mozilla/addon-sdk/blob/master/lib/toolkit/loader.js#L678-L686)
3. Pass window and port as an option to a main export, which by default will be `module.exports`.

Following changes could land as a followup after this this lands (or together if that's preferred)

4. Make main export configurable during script installation, so that it could be configured to `exports.attach` instead
of `module.exports`. I'll be ok with making it even part of static options.
5. Add `require` to the script loading context that for now will just throw module not found exception (in a future we will let
   it load server version of volcan). 



I can also be convinced that my suggestions are wrong & if you feel that is a the case please speak up.

::: toolkit/devtools/server/actors/director-manager.js
@@ +676,5 @@
> +        "inner-window-id": this._innerId
> +      }
> +    });
> +
> +    this._sandbox.scriptOptions = Cu.cloneInto(this._scriptOptions, this._sandbox);

Are `scriptOptions` really necessary ? Or maybe better question is what are the use cases for it ?

@@ +680,5 @@
> +    this._sandbox.scriptOptions = Cu.cloneInto(this._scriptOptions, this._sandbox);
> +    this._sandbox.window = window;
> +
> +    // enable CommonJS module exports
> +    this._sandbox.module = this._sandbox;

I'm not sure I like this idea in case of modules `this` is different from `module` also `module.uri` seems to reflect `uri` of the source for this module. Could you please update this to match sdk loaded modules closer (see https://github.com/mozilla/addon-sdk/blob/master/lib/toolkit/loader.js#L766 for more details)
Attachment #8529746 - Flags: feedback-
Flags: needinfo?(rFobic)
This patch introduces the following changes (based on the feedback on the API exposed in the director script sandbox from comment #79):

- use module.exports or module.exports[scriptOptions.attachMethod] as director script attach method 
- pass window and port as params to the exported attach method
- onUnload helper, which can be used to enqueue callbacks to be called during sandbox destroying, passed in the attach method params
- commonjs-based module object modelled after the addon-sdk module object
- fake require method exposed in the sandbox, which currently raise a "not implemented" exception

Things which are different from the comment #79:

- I preferred to don't change unload into a port event because we need to ensure that the unload callbacks are called before we nuke the sandbox (or they will probably never be called)
- the new onUnload helper is more composable that the previous exports.onUnload (much like the proposed port event) because it enables the sandbox code to enqueue more than one cleanup callback

Things which I'm not sure of:

- what do we want to put into the module.id and module.uri attributes?
  (currently I'm using the director script id as module.id and the 
  director script javascript code as uri)
  NOTE: we need to take into account that the original script data url
  can be on the other side of the debugger client connection.

PS: I'm set the obsolete flag on the previous revision of this patch and on the previous patches which are already merged.
Attachment #8529744 - Attachment is obsolete: true
Attachment #8529745 - Attachment is obsolete: true
Attachment #8529746 - Attachment is obsolete: true
Attachment #8537241 - Flags: review?(rFobic)
(In reply to Irakli Gozalishvili [:irakli] [:gozala] [@gozala] from comment #79) 
> ::: toolkit/devtools/server/actors/director-manager.js
> @@ +676,5 @@
> > +        "inner-window-id": this._innerId
> > +      }
> > +    });
> > +
> > +    this._sandbox.scriptOptions = Cu.cloneInto(this._scriptOptions, this._sandbox);
> 
> Are `scriptOptions` really necessary ? Or maybe better question is what are
> the use cases for it ?

In the ember inspector addon I'm using content worker options for resources useful during the worker life (e.g. small javascript snippets to inject into the target window)
Comment on attachment 8537241 [details] [diff] [review]
Bug980481_DirectorScripts_rev12.patch

I'm adding a review flag assigned to Alex.

The applied changes (briefly described in the #comment 80) are restricted to:

In the toolkit/devtools/server/actors/director-manager.js:

- the DirectorScriptSandbox attach and destroy methods
- minor changes in the DirectorScriptActor (only the minimal changes needed to adapt to the changes in the DirectorScriptSandbox)


In the toolkit/devtools/server/tests/mochitest/test_director.html:

- new test cases to cover the above changes
- minor reworks and cleanups
Attachment #8537241 - Flags: review?(poirot.alex)
Comment on attachment 8537241 [details] [diff] [review]
Bug980481_DirectorScripts_rev12.patch

Review of attachment 8537241 [details] [diff] [review]:
-----------------------------------------------------------------

Luca, consider it carrying r+ from me until you end up modifying main.js/root.js or e10s specific code again.
I'm fine with whatever Irakli r+ for the actors.
Attachment #8537241 - Flags: review?(poirot.alex) → review+
Attachment #8537241 - Flags: review?(rFobic) → review+
the last patch in the queue is ready for landing, checkin-needed flag added:

- attachment 8537241 [details] [diff] [review] (Bug 980481 - DirectorRegistry, DirectorManager, DirectorScript and MessagePort RemoteDebuggerServer actors r=ochameau)
Keywords: checkin-needed
(In reply to Luca Greco from comment #80)
> PS: I'm set the obsolete flag on the previous revision of this patch and on
> the previous patches which are already merged.

Why? They aren't obsolete just because they already landed :)
Whiteboard: [leave open]
Attachment #8529744 - Attachment description: Bug980481_fix_protocoljs_unmanage_during_actordestroy.patch → Bug980481_fix_protocoljs_unmanage_during_actordestroy.patch [checkin: comment 77]
Attachment #8529744 - Attachment is obsolete: false
Attachment #8529745 - Attachment description: Bug980481_DebuggerServer_E10S_setup.patch → Bug980481_DebuggerServer_E10S_setup.patch [checkin: comment 77]
Attachment #8529745 - Attachment is obsolete: false
Attachment #8537241 - Attachment description: Bug980481_DirectorScripts_rev12.patch → Bug980481_DirectorScripts_rev12.patch [checkin: comment 86]
https://hg.mozilla.org/mozilla-central/rev/724554c093a8
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → mozilla37
I had to backout the last patch because Windows XP was hitting intermittent timeouts in test_director.html.
https://hg.mozilla.org/integration/fx-team/rev/3580a90de286

https://treeherder.mozilla.org/logviewer.html#?job_id=1597649&repo=fx-team
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla37 → ---
Attachment #8537241 - Attachment description: Bug980481_DirectorScripts_rev12.patch [checkin: comment 86] → Bug980481_DirectorScripts_rev12.patch
Attachment #8537241 - Attachment is obsolete: true
Flags: needinfo?(luca.greco)
I've identified the test which got stuck during the trybuild on the win32 target and 
then I tried to reproduce the testcase manually on the win32 binary built from the failed trybuild in a Windows XP VM, but it did work correctly like on the other platforms.

Nevertheless, during this tests I found a couple of points that could break the testcase flow (e.g. in the previous patch a directorScript which raises an error with some undefined attributes will raise a protocol.js exception).

So I've prepared a new rev13 patch revision to be pushed to the trybuild:

this new revision fixes the above corner cases and tweaks the test helpers to prevents the trybuild from getting stuck waiting for an event which will never be raised.

In the mean time I'm leaving the needinfo request opened.
The try launched by Irakli for me failed, but this time on Windows8 instead of WindowsXP, the good news is that the last changes on the test cases helped to prevent the try build from getting stuck,
the bad news is that it has not produced any useful log messages.

Then I've added another small tweak to prevents silent errors during director-script attach when the debugger client doesn't subscribe to the error events  on the DirectorManager Actor (which I've merged in this rev14 of the patch), and having received my try access info, I launched two new try builds, both restricted to the win32 and win64 platforms:

- https://treeherder.mozilla.org/#/jobs?repo=try&revision=863e6b68d70d
- https://treeherder.mozilla.org/#/jobs?repo=try&revision=039dd4e0f530

Neither of these two builds has presented the above issues anymore, so the previous timeouts on Windows XP and Windows 8 could be related to some kind of race condition which is difficult to reproduce manually and it doesn't always happen.

If it will fail in a next try, now it should produce an error log useful to track the issue.

is it reasonable to procede to the merge of this patch (now that it cannot get stuck indefinitely anymore) even if the tests could become intermittent?
Attachment #8548262 - Attachment is obsolete: true
Flags: needinfo?(luca.greco) → needinfo?(ryanvm)
1) Looks like you pushed on top of some other bustage? (the bc1 failures) - In general I'd recommend pushing on top of mozilla-central to avoid that situation. Really muddies the waters for Try pushes :)

2) The failures were intermittent, so to have confidence that the problem is fixed, you need multiple runs of the same job. In this case, you'd want maybe 10-20 Windows mochitest-other runs. You can trigger more by selecting the job on Treeherder, then clicking the little looping arrow icon near the bottom-left side with a "Retrigger the job" tooltip when you hover it.

3) Since you're debugging a mochitest-other issue, the Try syntax you used is quite inefficient (running a lot of jobs that aren't giving you any useful information). |try: -b do -p win32,win64 -u mochitest-o -t none| will give you a Windows-only Try run that only runs mochitest-other.

4) In general, our policy is to not permit landing patches with known issues like intermittent failures. We have enough as it is without knowingly adding more. Sorry, you'll need to sort these failures out before we can land it.

So what should you do at this point? Rebase your patch on top of mozilla-central, push to Try using the syntax in point #3, then retrigger 10-20 mochitest-other runs on each, then re-request checkin if your failures don't show up :)
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #93)
> So what should you do at this point? Rebase your patch on top of
> mozilla-central, push to Try using the syntax in point #3, then retrigger
> 10-20 mochitest-other runs on each, then re-request checkin if your failures
> don't show up :)

Thank you Ryan, these are definitely the advices that I needed.
Finally I managed to find how to stop the tests to fail intermittently on windows platforms: I had to rewrite the test cases which open tabs in the mochitest-devtools testsuite (leaving the test cases which interact only with the global DirectorRegistryActor in the mochitest-other testsuite).

Then I ran the rewritten tests on the windows platforms a number of times and they appear to be stable now (the exception and retry failures in the run didn't seem to be related to the issue that I was working on):

- https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ececdd85b0d

Now I rebased the patch on the new mozilla-central head and launched a new try build with the other platforms, to be sure that the rewritten tests will work correctly on the remaining platform as well:

- https://treeherder.mozilla.org/#/jobs?repo=try&revision=2ab2c6d8c88b
Attachment #8552557 - Attachment is obsolete: true
The last try runs look good and pretty stable now that I've moved most of the test cases in the mochitest-devtools testsuite.

checkin-needed keywork added.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/2bf7be61d48b
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → mozilla38
You need to log in before you can comment on or make changes to this bug.