Closed Bug 1523104 (protocdp) Opened 4 years ago Closed 4 years ago

Land CDP prototype


(Remote Protocol :: Agent, enhancement, P1)



(Not tracked)



(Reporter: ato, Assigned: ato)


(Blocks 3 open bugs, )



(1 file, 1 obsolete file)

We want to write and land a prototype for a new agent/server that
enables Firefox to speak the Chrome DevTools Protocol (CDP).

This is an HTTP, WebSockets, and JSON based protocol for inspecting
state and controlling execution of web content documents, instrumenting
the browser in interesting ways, simulating user interaction for
automation purposes, and debugging JavaScript execution.

This bug is scoped to put together a framework and transport layer
suitable for implementing the CDP API and protocol. The initial
implementation will be an alternate target for third-party integrations
to connect to, supplementing the existing Firefox DevTools server.

The aim of this work is to enable Firefox to explore implementing
parts of the CDP APIs.

See also:

Assignee: nobody → ato
Blocks: puppeteer
Priority: -- → P1
Alias: protocdp
Attached patch 0001-wip.patch (obsolete) — Splinter Review

I’ve taken the original prototype I worked on and ported it to
use the new BrowsingContext based child actor system that is
now being deployed in the Firefox frontend. The idea is that this
will be a safer way to structure new code, given that it is its
long-term intention to entirely replace the frame script concept.

The attached prototype is a sparse, but working, implementation
of CDP in Gecko. Notably, it is missing almost every feature that
CDP has, but provides just enough functionality to make the demo in
remote/test/demo.js work. The limited feature set is intentional,
as the first order of business should be that we agree on a general
design approach.

A few words of warning when reading the diff:

  • The actor infrastructure has a limitation that child actors have
    to be predefined and registered in the parent process before the
    first child content process starts. This is a problem since
    the remote agent is essentially an opt-in feature initialised
    with a flag. Since this is the first Gecko component to need
    on-demand installation of child actors, I don’t believe this
    to be a show-stopper.

  • The transport layer is indisputably terrible, and is held
    together with a piece of string. We should accept that it is
    necessary to compromise on this to deliver Puppeteer support in
    the short term. But we should be aware that it is a priority to
    move to a high-quality HTTPD based on rust-cdp in the longer term.

    As the rust-cdp crate provides a complete, typed Rust
    implementation of the CDP protocol, automatically generated
    from Chrome’s protocol definition, we could remove entirety
    of the remote/server subfolder and the protocol schema checking
    done in Protocol.jsm. One of the particularly interesting
    features I find with rust-cdp is that the serde deserialisations
    could be used to delegate to browser-local clients over IPDL,
    meaning this XPCOM remote agent could be just one possible
    underlying implementation.

  • The 17k-dump of the CDP protocol definition in remote/Protocol.jsm
    is pure garbage. (-;

I’d like to know what you think of this general approach? I know
there are alternate ways we could approach this (building it in
to devtools’ actor system, but with a CDP-emitting serialiser is
one option), but I feel that the prospect of a modular IPDL/XPCOM
implementation supported by plumbing in rust-cdp holds potential.

Attachment #9039832 - Flags: feedback?(poirot.alex)

Andreas, I haven't had time to context switch this week, but I'll take time to review this early next week.

But it looks very promising. I'm not too concerned about the http endpoint as there isn't much happening there. We may even live with a very incomplete http implementation and that would still work. Most of CDP happens in the WebSocket connection. When you analyze how chromium works, I'm not sure the http endpoint is justified to list the targets. It looks like an historical thing that you have to go through. I say that because you can listen for other targets and communicate with them via a single WebSocket connection.

Also, I took some time to analyze how chromium was handling site-isolation. I'll try to summarize what I found next week as well. But I can already say that there is a lot of similar patterns between CDP / RDP and pur front-ends.

Comment on attachment 9039832 [details] [diff] [review]

Review of attachment 9039832 [details] [diff] [review]:

Thanks a lot for working on this. It is great to see that CDP could be implemented in Firefox somewhat easily!

I will start with a high level feedback.

1) Unused/useless code.

There is quite some cruft here and there that doesn't help reviewing this patch. I'll post another comment mentioning all the things that aren't used/useful.

2) HTTP vs WebSocket

In the current patch, you are starting one HTTP server on the port specified in the preference/command line.
And then, one WebSocket per target on random ports.
This is a diverging point from CDP, which serves everything from the base HTTP server, typically opened on port 9222. I'm afraid we would break support from random tooling based on CDP with such difference.
All targets on chrome's CDP are served from port 9222 via URL that look like:
Only the ID changes, the word "page" is used for all target kinds, even background page and workers.
It would be great if we could open only one port and serve the websockets from it, like chrome.

3) Implementing the domain in Target's process.

Then, the main comment I have is architectural. It will have a significant impact on this patch. 
In the current patch, and like the other puppeteer's patch you were starting from, you are implementing all the domains in the parent process. The domain implementation then interacts with message manager API in order to reach the content processes and interact with the DOM, listen to console message and do other typical interactions with the document to debug.
It has the side effect of listening to all documents messages and piping more data through message manager communication channel than you really need.
For domains like Console, DOM and Debugger, the vast majority of the requests and events require direct access/interation with the Target context (Page, Frame, Worker, ...). It means that you will manually setup Frame script messages to implement each method and event.
Doing this won't scale.
That's not what we do in existing RDP and CDP implementations.
Both Firefox and Chromium implement the Domains inspecting the Target within the target's context.
More concretely, it means that `remote/domain/Log.jsm` should be loaded in the page's process rather than in the parent process. Then `remote/ConsoleServiceObserver.jsm`, which is a frame script in the current patch, should now be merged into Log.jsm and used by it directly.
Your job here, if you accept it, will be to setup the architecture to automatically load Log.jsm within the target's process on-demand, if this domain is queried.

Now, not all Domain are specific to the target. Like "Browser" domain. This one can and probably should be implemented in the parent process. We also have this concept in Firefox. We have the "global actors", which are loaded in the parent process once, and the "target-scoped actors", which are loaded once for each target, in their process.
Last but not least, in chromium, they support hybrid domains, which can be implemented in both. A first implementation in the parent process will be called and may optionally delegate to another one being run in the target's process(es) (very precise detail: it looks like it may be able to call all the targets).
That may be very handy to support that, but I don't know if that's a must-have in the first iteration?

I wrote a detailed description of CDP, which hopefully will help working on this patch:

4) Fission / site-isolation

As you will see in the document I linked in the previous paragraph, remoted frames are debuggable through distinct target. You will first connect to the page target via websocket and then from its `Target` domain, be able to listen to new targets via `Target.targetCreated` and attach to it via `Target.attachTarget` method. Then you can attach to the new target, created for remote frame via the same websocket connection.
This is all based on a `sessionId` attribute passed in the JSON objects.
We should implement that at some point, but I think we can ignore this for now.
Looking at how it is being implemented, it looks fairly trivial and require some "router" on both server and client side to redirect each request/event to the right target instance. It may actually align with the code dispatching the request from the parent process to the target's process.

5) Lazyness

A first look, everything is pretty much loaded on startup or on first connection.
A lesson learned from DevTools is that lazy loading is important.
So it will be useful to ensure that we have a minimum level of lazy loading.
For example:
* load the domain implementation only if we use them, if a client query it,
* avoid loading anything but the command line handler if the command line argument isn't passed.

6) Modules (and compartments)

This comment is more about the future. We can probably move forward with JSM as a first iteration.
6.1) JSMs vs ES Modules
You are using JSM, but there is a strong motivation to start using ES Modules in Firefox codebase.
The only issue is that ES Modules aren't easily usable from chrome codebase. AFAIK, it still requires a document.
But it may be worth pinging :jonco about that. It would be fantastic if we could be using ES Modules right away.
We have plan to start using them in DevTools, but it is much harder on an existing and large codebase.

6.2) Compartments
Using JSMs will later prevent us from debugging chrome targets.
This is clearly out of scope of this project, but I'm already thinking about using CDP to debug Firefox, including browser.xul and all chrome privileged code. To do that, this codebase would have to be loaded in its own compartment, flagged with "invisibleToDebugger=true" so that the Debugger API know what is debuggee versus debugger frames. It is clearly not feasible with JSM as JSMs are global and there will be no way to say that everything under remote/ should be loaded slightly differently.
It may be different if we are using ES Modules as we may be loading module from a dedicated Document/Context, which we should be able to flag accordingly.
So I think it is best to ignore that for now, and see how things are going with ES Modules instead.

7) Testing

chrome-remote-interface is supposed to work from a web page:
So, I imagine we should be able to make it work from an xpshell test script and easily run demo.js, which would be awesome!

You may need a document to make it work.
Note the existence of this test helper, which helps load'n spawn a document in XPCShell:
This is based on top of Services.appShell.createWindowlessBrowser, in case this helper isn't useful.
Attachment #9039832 - Flags: feedback?(poirot.alex) → feedback+
Comment on attachment 9039832 [details] [diff] [review]

Review of attachment 9039832 [details] [diff] [review]:

Here is some more low level random feedback, close to a code-review.
I didn't went through everything, but I think we can simplify the current patch and start focusing on the point I raised in the previous comment.

Also, we should find ways to collaborate here so that you aren't alone doing all the patches!

::: remote/Debugger.jsm
@@ +18,5 @@
> + * and for each accepted connection a new Session is created.
> + * There can be multiple sessions per target.
> + * The session's lifetime is equal to the lifetime of the debugger connection.
> + */
> +this.Debugger = class {

Debugger isn't a great name for this class. Especially given that we are going to have a Debugger domain.

We are having tons of poorly named classes in DevTools that no longer make any sense, so I would like to be careful about the names :)

What about TargetServer or TargetWebSocket or ...?

@@ +26,5 @@
> +    this.sessions = new Map();
> +    this.nextConnID = 0;
> +  }
> +
> +  get connected() {

This would better be named "listening".

::: remote/Handler.jsm
@@ +17,5 @@
> +
> +class Handler {
> +  register(server) {
> +    server.registerPathHandler(this.path, this.rawHandle);
> +  }

This isn't used.

::: remote/RemoteAgent.js
@@ +25,5 @@
> +
> +const ENABLED = "remote.enabled";
> +const FORCE_LOCAL = "remote.force-local";
> +const HTTPD = "remote.httpd";
> +const SCHEME = `${HTTPD}.scheme`;

This isn't really used.

@@ +71,5 @@
> +    this.consoleService = new ConsoleServiceObserver();
> +
> +    this.tabs = new TabObserver({registerExisting: true});
> +    this.tabs.on("open", tab => this.targets.connect(tab.linkedBrowser));
> +    this.tabs.on("close", tab => this.targets.disconnect(tab.linkedBrowser));

We might want to move all of that in listen or whenever we accept the first connection in order to do things lazily.

@@ +73,5 @@
> +    this.tabs = new TabObserver({registerExisting: true});
> +    this.tabs.on("open", tab => this.targets.connect(tab.linkedBrowser));
> +    this.tabs.on("close", tab => this.targets.disconnect(tab.linkedBrowser));
> +
> +    Services.ppmm.addMessageListener("RemoteAgent:IsRunning", this);

This event isn't used.

@@ +143,5 @@
> +  get scheme() {
> +    if (!this.server) {
> +      return null;
> +    }
> +    return this.server.identity.primaryScheme;

This may be better inlined into listen method as you will nullify all the pref if the server failed starting.
Same applies to host and port.

@@ +165,5 @@
> +
> +  async handle(cmdLine) {
> +    let flag;
> +    try {
> +      flag = cmdLine.handleFlagWithParam("debug", false);

`--debug` already exists and spawn a C++ debugger.
We should find another name. Should we align on the one used in chrome? Or align with existing command line used in devtools?

@@ +231,5 @@
> +    ]);
> +  }
> +}
> +
> +class Targets {

It may be easier to follow the code if each Class was in its own file?

@@ +279,5 @@
> +  }
> +
> +  toJSON() {
> +    return [...this];
> +  }

TBH, this is really hard to follow *and* this is hurting you later in remote/Target.jsm's toJSON.
The combination of already kind of magic "toJSON" + iterator is too much.
You would benefit from being more explicit:
async toJSON() {
  const list = [];
  for (const target of this._targets.values()) {
      list.push(await target.toJSON());
  return list;
I don't know if it is fine to use async/await when you explicitely call toJSON? Otherwise it is worth having custom method name for this.

@@ +334,5 @@
> +    }
> +
> +    if (!this.instance_) {
> +      if (Services.appinfo.processType == Ci.nsIXULRuntime.PROCESS_TYPE_CONTENT) {
> +        this.instance_ = new ChildRemoteAgent();

This content process branch isn't used/useful. You can get rid of ChildRemoteAgent.
If this code happens to be trigerred in the content (which should not and probably be investigated),
we should throw Cr.NS_ERROR_NOT_IMPLEMENTED or some error.

::: remote/Session.jsm
@@ +24,5 @@
> +    this.connection.onmessage = null;
> +;
> +  }
> +
> +  async despatch({id, method, params}) {

This method has a typo.

@@ +101,5 @@
> +      throw new Error("No such domain: " + name);
> +    }
> +
> +    const inst = new Cls(this.session,;
> +    inst.on("*", this.session);

It would be great to stick to toolkit's EventEmitter, or the DevTools one, which still supports wildcard listener, but differently from you.

But I'm not sure we should do that. We should probably not listen to all the events.
It is pretty error prone as *all* events will end up being piped on the transport layer.
Whereas you may have internal events that can be emitted on the Domain, that shouldn't go on the wire.
It would be better to either:
* be more explicit on each domain, when we want to fire a network event to be sent to the client. So use another API. this.sendEvent("Log.entryAdded", {entry}) or this.emit("event", "Log.entryAdded", { entry }), or?
* filter and only pipe the events mentioned in the schema. That's what we do in RDP.

::: remote/Target.jsm
@@ +14,5 @@
> +XPCOMUtils.defineLazyServiceGetter(this, "Favicons",
> +    ";1", "nsIFaviconService");
> +
> +/** A debugging target. */
> +this.Target = class {

That may be a very personal opinion, but `class Target {` looks more readable to me.
If that's to please JSM export symbols, may be putting later `this.Target = Target;` is a better tradeoff?

@@ +19,5 @@
> +  constructor(browser) {
> +    this.browser = browser;
> +    this.debugger = new Debugger(this);
> +
> +    EventEmitter.decorate(this);

DevTools's event emitter supports ES classes and allows extending from it.
It may be worth using it, or tweaking toolkit's one to support that?

@@ +25,5 @@
> +
> +  connect() {
> +    Services.obs.addObserver(this, "message-manager-disconnect");
> +    this.debugger.listen();
> +    this.emit("connect");

Noone listen for connect/disconnect. We should remove this code until there is a usage of it.

::: remote/nsIRemoteAgent.idl
@@ +12,5 @@
> + * Remote agent is a low-level debugging interface
> + * based on the CDP protocol.
> + */
> +[scriptable, uuid(ccbd36a6-a8fc-4930-b747-8be79c7a04b2)]
> +interface nsIRemoteAgent : nsISupports

This nsIRemoteAgent isn't used, we should remove it.

::: remote/server/Socket.jsm
@@ +4,5 @@
> +
> +"use strict";
> +
> +  "ConnectionHandshake",

This symbol isn't used externaly to this module.

@@ +8,5 @@
> +  "ConnectionHandshake",
> +  "SocketListener",
> +];
> +
> +// This is an XPCOM-service-ified copy of ../devtools/shared/security/socket.js.

Hum. I'm wondering if we can make it so that devtools's server switches to ES Modules and have the CDP server share the files here... It may pull too many deps from shared... I don't know.
Component: Remote Agent → Agent
Product: DevTools → Remote Protocol

ochameau and I have addressed all the feedback from above, plus
some more.

Today I forcefully rebased and rewrote the history of the protocdp
branch in preparation of landing it in central. A few more minor
tweaks to documentation, READMEs, and a cursory review of the commit
log history is still pending.

Instead of landing the prototype as one giant commit, I intend to
overlay the history into central as has been the practice with other
projects developed out-of-tree in the past (testing/geckodriver,
testing/webdriver, testing/mozbase/rust, et al.). This is useful
when we find ourselves understanding why a certain change was made.

Attachment #9039832 - Attachment is obsolete: true
Keywords: leave-open
Pushed by
remote: initial cdp prototype; r=ochameau
remote: identify firefox-source-docs.m.o location absolutely; r=ochameau
remote: fix incorrect directory name for documentation; r=ochameau
remote: remove unused index; r=ochameau
remote: refer to correct bugzilla product in documentation; r=ochameau
remote: fix links to mailing list; r=ochameau
remote: rename Debugger to TargetListener; r=ochameau
remote: fix eslint error; r=ochameau
remote: document what a Target is; r=ochameau
remote: remove unused ChildRemoteAgent; r=ochameau
remote: s/despatch/dispatcher/g; r=ochameau
remote: fix TargetListener export; r=ato
remote: fix exporting formatError; r=ato
remote: stop binding JSM exported globals on this; r=ato
remote: execute the domains in the content process by piping all WebSocket request, response and events via message manager API; r=ato
remote: update the demo script to use a local page; r=ato
remote: fix eslint by removing unecessary commas; r=ato
remote: remove unused modules; r=ato
remote: remove unused nsIRemoteAgent interface; r=ato
remote: add missing Connection.close method called from TargetListener.close; r=ato
remote: add browser mochitest to test the agent via chrome-remote-interface; r=ato
remote: add a workaround in chrome-remote-interface to support server; r=ato
remote: wait for browser element init before creating Target; r=ato
remote: fix build by removing non-existent browser_tabs.js test; r=ochameau
remote: remove head.js support file to make it possible to execute browser_cdp.js; r=ochameau
remote: update testing documentation; r=ochameau
remote: ignore lint errors in chrome-remote-interface.js; r=ochameau
remote: fix lint problems related to new browser-chrome tests; r=ochameau
remote: salvage some code style documentation from Marionette; r=me
remote: change flag from --debug to --remote-debugger; r=ochameau
remote: mention which host it binds to by default in help text; r=ochameau
remote: contribute usage instructions; r=me a=docs
remote: release test_Session.js under CC-0; r=ochameau
remote: move Targets class to separate module; r=ochameau
remote: fix IPv6 loopback hostname; r=ochameau
remote: use application/json mime type for responses; r=ochameau
remote: respond with 500 Internal Server Error if unable to serialise JSON; r=ochameau
remote: re-add browser_tabs.js test and push head.js to fix the mochitests; r=ato
remote: drop protocol schema validation; r=ochameau
remote: fix lint errors by removing unused variables; r=ochameau
remote: make Session#receiveMessage slightly more succinct; r=ochameau
remote: remove reference to Network domain in the test; r=ato
remote: display error message and a stack in mochitest; r=ato
remote: allow Domains to be implemented in either parent or content processes; r=ato
remote: introduce MessagePromise sync primitive; r=ochameau
remote: test that method own property on domain class is function; r=ochameau
remote: s/remote-protocol/remote/g; r=ochameau
remote: destruct domain instances on closing session; r=ochameau
remote: refactor Session#dispatch; r=ochameau
remote: refactor Domains to not extend Map and wean us off custom EventEmitter; r=ochameau
remote: improve error legibility somewhat; r=ochameau
remote: propagate underlying cause to superclass; r=ochameau
remote: rename TargetListener.connected to listening; r=ato
remote: merge Handler into JSONHandler as we only have JSONHandlers; r=ato
remote: remove unused informative remote.httpd.* preferences; r=ato
remote: stop emitting unused connect/disconnect events; r=ato
remote: rename Prefs.jsm to match its exported symbol; r=ato
remote: remove unused listener argument passed to Connection; r=ato
remote: remove unused EventEmitter interface from Target; r=ato
remote: use toolkit's event emitter; r=ato
remote: destroy the related Session when the WebSocket connection drops; r=ato
remote: correctly unregister accepted listener in TargetListener; r=ato
remote: remove unused Target symbol from RemoteAgent; r=ato
remote: format error packets according to puppeteer expectations; r=ato
remote: remove unsupported feature in demo; r=ochameau
remote: associate formatting with error prototype; r=ochameau
remote: add Log.verbose; r=ochameau
remote: emit events when targets connect and disconnect; r=ochameau
remote: document Session class; r=ochameau
remote: use fatal error when unable to start HTTPD; r=ochameau
remote: clarify error message when unable to start HTTPD; r=ochameau
remote: self-assign id to connection; r=ochameau
remote: signal that transport is ready when connection is created; r=ochameau
remote: drop outdated todos; r=me
remote: add WebSocketServer.upgrade for upgrading existing httpd.js requests; r=ochameau
remote: upgrade to WebSocket on existing HTTPD; r=ochameau
remote: remove unused BrowserObserver; r=ochameau
remote: remove unused WindowManager.isWindowIncluded; r=ochameau
remote: drop unused module imports; r=ochameau
remote: drop unused functions in Browser domain; r=ochameau
remote: sort imports; r=ochameau
remote: format; r=me a=docs
remote: remove unused variables; r=ochameau
remote: fix WindowManager docs; r=me a=docs
remote: disconnect from build system; r=me

We have a new remote protocol in Firefox that is based on the Chrome
DevTools Protocol (CDP). This is a low-level debugging interface with
which you can inspect the state and control execution of documents
running in web content, instrument Firefox in interesting ways, simulate
user interaction for automation purposes, and debug JavaScript execution.

This patch hooks the server part of this implementation, known as the
remote agent, up to the build system. The remote agent is not enabled
in the default build, so the following is needed in mozconfig to build it:

ac_add_options --enable-cdp

A subsequent change to enable the remote agent in Nightly builds only
will follow in due course. That would allow us to run TaskCluster
test jobs to verify the remote protocol's functional integrity.

Blocks: 1533831
Pushed by
remote: hook new remote protocol up to build system; r?#build
Keywords: leave-open

Closed: 4 years ago
Resolution: --- → FIXED
Depends on: 1534827
Blocks: 1549421
Blocks: 1549448
Blocks: 1549449
Blocks: 1549451
Blocks: 1549470
Blocks: 1549469
Blocks: 1549475
Blocks: 1549511
You need to log in before you can comment on or make changes to this bug.