Support Chrome Puppeteer API
Categories
(DevTools :: General, enhancement, P2)
Tracking
(Not tracked)
People
(Reporter: aslushnikov, Assigned: aslushnikov)
References
Details
Attachments
(1 file)
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36 Expected results: Let's make https://github.com/GoogleChrome/puppeteer/ work with Firefox!
Assignee | ||
Comment 1•4 years ago
|
||
This patch: - introduces a new event "onFrameLocationChange" that is needed to track same-document navigations inside iframes. - adds a new "//testing/juggler" folder that implements a custom automation protocol to back Puppeteer API. With this patch, running Firefox with "-juggler 0" flag will expose a Juggler debugging server.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 2•4 years ago
|
||
Comment on attachment 9029755 [details]
Introduce "juggler" remote debugging protocol
Flagging Alex and Brian for a first round of feedback.
Comment 3•4 years ago
|
||
Thanks, I've left a couple of initial comments in Phabricator. Andrey, would you be willing to take a patch into the main Puppeteer library that enables switching the browser that gets used? Possibly something like `puppeteer.launch({ browser: 'firefox' })` or similar. I think the steps that would be needed for this on the Firefox side on top of the current implementation of Juggler would be: 1) Use CDP for transport 2) Implement the current CDP methods that Puppeteer uses for chrome (as opposed to the sometimes-modified ones that Juggler currently uses). Or alternatively, adapt Chrome to use the modified methods. We've started to look into doing this a bit, and it seems viable (even while sharing much of the rest of the Juggler implementation).
Assignee | ||
Comment 4•4 years ago
|
||
Hi Brian, thanks for looking through this! > Andrey, would you be willing to take a patch into the main Puppeteer library that enables switching the browser that gets used? Possibly something like `puppeteer.launch({ browser: 'firefox' })` or similar. How do you see this working? Puppeteer npm package only downloads Chromium and doesn't download Firefox. Requiring "puppeteer" and specifying a "browser" in the launch options won't work. We currently ship a separate NPM package - `puppeteer-firefox`. This package downloads Firefox and provides an API to drive it. Puppeteer users can pick a driver by just requiring the right package, or even require both and run the same tests against two different browsers - that's what we currently do in our own tests: https://github.com/GoogleChrome/puppeteer/blob/0cccc586cc3e7140fb9c51d859943cfb39d83b02/experimental/puppeteer-firefox/test/test.js#L63-L72 And as a bonus, TypeScript types work out-of-the-box :) > 1) Use CDP for transport Not using CDP is a conscious decision. CDP is a poor place to align upon: CDP exposes a lot of chrome-specific aspects. As Chromium evolves over time, CDP changes with it. Puppeteer became *the* place where we deal with CDP complexities to provide a neutral, semver-compliant automation API. From engineering point of view, it's just way cheaper for us to do this in Puppeteer than in Chromium. Many Chrome-specific things that leak big times through CDP (e.g. out-of-process-iframes and cross-process navigations) are foreign concepts to Gecko. It's an endless uphill battle to try to mock how Chrome works, and benefits are unclear to me. I do see a lot of shortcomings though. IMO it's crucial for the success of puppeteer-firefox to use a protocol that's natural to Firefox. This not only keeps things simple on the implementation side, but gives all the leverage to address user needs.
Comment 5•4 years ago
|
||
(In reply to aslushnikov from comment #4) > > Andrey, would you be willing to take a patch into the main Puppeteer library that enables switching the browser that gets used? Possibly something like `puppeteer.launch({ browser: 'firefox' })` or similar. > > How do you see this working? Puppeteer npm package only downloads Chromium > and doesn't download Firefox. Requiring "puppeteer" and specifying a > "browser" in the launch options won't work. > > We currently ship a separate NPM package - `puppeteer-firefox`. This package > downloads Firefox and provides an API to drive it. > Puppeteer users can pick a driver by just requiring the right package, or > even require both and run the same tests against two different browsers - > that's what we currently do in our own tests: > https://github.com/GoogleChrome/puppeteer/blob/ > 0cccc586cc3e7140fb9c51d859943cfb39d83b02/experimental/puppeteer-firefox/test/ > test.js#L63-L72 I see the concern with shipping both binaries in the main package. I'm also now reading about puppeteer-core at https://pptr.dev/#?product=Puppeteer&version=v1.11.0&show=api-puppeteer-vs-puppeteer-core, and assuming that puppeteer-firefox is also using this shared core, then adding new package(s) for individual browsers does make sense. My main concern when asking this is around making sure that the API is complete and well-tested in Firefox so that puppeteer-firefox doesn't fall out of date as things on either the Pupeteer side or the Firefox side change over time. The fact that the tests are running against Firefox in the main repo is helpful. A few follow up questions: 1) Does that run all of the Puppeteer tests or just a subset? 2) Have you had a chance to run this against any third party application Puppeteer test suites to see if they are passing? I'm sure our testing team will want to look into this, so just curious if you've already looked into that or have pointers to projects using a lot of these tests. 3) How do you ensure that chromium commits don't break Puppeteer? Are the Puppeteer tests vendored-in to chromium? > > 1) Use CDP for transport > > Not using CDP is a conscious decision. CDP is a poor place to align upon: > CDP exposes a lot of chrome-specific aspects. As Chromium evolves over time, > CDP changes with it. Puppeteer became *the* place where we deal with CDP > complexities to provide a neutral, semver-compliant automation API. From > engineering point of view, it's just way cheaper for us to do this in > Puppeteer than in Chromium. > > Many Chrome-specific things that leak big times through CDP (e.g. > out-of-process-iframes and cross-process navigations) are foreign concepts > to Gecko. It's an endless uphill battle to try to mock how Chrome works, and > benefits are unclear to me. I do see a lot of shortcomings though. > > IMO it's crucial for the success of puppeteer-firefox to use a protocol > that's natural to Firefox. This not only keeps things simple on the > implementation side, but gives all the leverage to address user needs. I haven't yet had a chance to dig into each method in Juggler and compare to the corresponding CDP method, but I believe there are some changes that look superficial enough that if we stuck with CDP naming it could perhaps let puppeteer share more with firefox-puppeteer and cause less divergence between the browers? For example: https://phabricator.services.mozilla.com/D13889#inline-74729. I think it's important for us to decide on our side if we want to use an RDP-like thing for transport or a CDP-like thing. If we want to use something like RDP, I believe there's some code reuse (from either DevTools or Marionette) that can be done inside of Juggler to avoid using a copy of it. I think Alex and Andreas have both looked into this in more detail, so I'll ask them to add any more details that I'm missing or getting wrong. Relatedly: I'll be mostly offline this coming week, so it'd be great to get their input.
Comment 6•4 years ago
|
||
Comment on attachment 9029755 [details] Introduce "juggler" remote debugging protocol (In reply to Brian Grinstead [:bgrins] from comment #5) > I think Alex and Andreas have both looked into this in more detail, so I'll > ask them to add any more details that I'm missing or getting wrong. > Relatedly: I'll be mostly offline this coming week, so it'd be great to get > their input.
Assignee | ||
Comment 7•4 years ago
|
||
> 1) Does that run all of the Puppeteer tests or just a subset? ATM puppeteer-firefox passes 285 tests out of 519 puppeteer tests. The goal is to have all of them once puppeteer-firefox supports all the API surface. > 2) Have you had a chance to run this against any third party application Puppeteer test suites to see if they are passing? Not really. In many cases, puppeteer test suites rely on request interception that is currently not supported in puppeteer-firefox. However, a few of my personal Puppeteer automation scripts run just fine with Puppeteer-Firefox. > 3) How do you ensure that chromium commits don't break Puppeteer? Are the Puppeteer tests vendored-in to chromium? Oftentimes Chromium commits do break Puppeteer for legitimate reasons - we update Puppeteer code once we roll a new chromium revision into it. For example: https://github.com/GoogleChrome/puppeteer/pull/2989 Occasional regressions do happen as well since there are no Puppeteer tests in Chromium repository. This, however, doesn't bother us much - both CDP and headless have a pretty good test coverage. I can recall only a few regressions in Chromium during the 1.5 years of Puppeteer existence. > but I believe there are some changes that look superficial enough that if we stuck with CDP naming it could perhaps let puppeteer share more with firefox-puppeteer and cause less divergence between the browers? I don't think code reuse is necessary and that much of a goal. From code complexity standpoint, the whole Puppeteer is less than 8KLOC - and there's *a lot* of sugar APIs in there. I expect puppeteer-firefox to be even less since there's a possibility of a more high-level protocol on the Firefox side. IMO both Puppeteer and Puppeteer-Firefox are quite small and low-effort projects. When it comes to API compatibility, we have to rely on compatibility test suite rather then on code reuse. This is the main reason we want puppeteer-firefox to be part of Puppeteer repo - this way we can effectively reuse all our testing infrastructure for both Puppeteer and Puppeteer-Firefox.
Comment 8•4 years ago
|
||
Comment on attachment 9029755 [details] Introduce "juggler" remote debugging protocol I’ve studied the Juggler code in detail and I think it holds great potential! The overarching concern I have is similar to bgrins’. This looks (as you would expect!) a lot like an implementation of CDP and it is unclear to me if the minor deviations from that are necessary to serve the goal of supporting Puppeteer. Reusing the CDP APIs directly rather than making minor changes, would have a benefit for long-term maintainability. The way Juggler uses RDP is incompatible with how the devtools actor hierarchy works, which I assume is why it spins up its own TCP port. It is simultaneously incompatible with the Marionette protocol that we use for WebDriver and various automation such as mochitests, reftests, and WPT. The only thing these three protocols have in common is that they share the same transport layer (JSON over TCP) and the packet sizing and framing approach provided by DebuggerTransport.jsm. Adding another remote protocol to support a single client application seems like a bad engineering choice, as it is unlikely to gain any wide-spread adoption or give us compatibility with other CDP-based tooling, such as VSCode. For similar reasons I am also concerned that having a specific client application for Firefox increases maintenance costs (any API change will need to be made in two places) and is therefore likely to lead to deviations in functionality. Drawing from personal experience as one of the primary authors of the WebDriver standard, this approach seems analogous to technical choices made early in the history of Selenium that it took years to recover from. Rather than browser-specific backends, we today have a standard that provides a contract and a transparent protocol more in the spirit of the open web. Another technical observation is that Juggler stores a lot of state in content frame scripts. This is risky since content frame scripts get reloaded when content browsers change processes. As remoteness changes are bound to get more frequent in the short-to-medium term with Fission (https://wiki.mozilla.org/Project_Fission), we should make every precaution that any new code that we land doesn’t add to the laundry list of existing code we need to fix. Please note that I will be away for the rest of the year, but before I take a much-deserved holiday, I want to stress that I’m overall impressed with the work done here. (-:
Comment 9•4 years ago
|
||
> The way Juggler uses RDP is incompatible with how the devtools actor hierarchy works, which I assume is why it spins up its own TCP port. That's the main concern I have with the current patch. It isn't quite CDP, nor is it RDP. So that it makes the whole testing/juggler code really specific to puppeteer and only puppeteer. It sounds unlikely that any of this is going to be reused by any other project. I do understand that integrating with existing chrome Domains (Page, Target, Browser, ...) is a much larger work and as you said, could force us to expose chrome specifics. It would be easier to expose brand new APIs/Domains, specific to Puppeteer. That's exactly what you did here, but with a custom protocol and with some confusion by reusing existing CDP domain names and methods. > IMO it's crucial for the success of puppeteer-firefox to use a protocol that's natural to Firefox. This not only keeps things simple on the implementation side, but gives all the leverage to address user needs. As of today, this protocol is RDP. But I'm not sure that's the key here. > I expect puppeteer-firefox to be even less since there's a possibility of a more high-level protocol on the Firefox side. The main benefit you got from the current patch is about writing code in Firefox that matches exactly puppeteers APIs. That's what you call "high-level protocol". That's exactly what makes puppeteer-firefox simple, simplier than the chrome equivalent. It isn't so much about using RDP versus CDP. If you start using RDP existing actors, you are most likely going to leak firefox specifics into puppeteer-firefox and complexify it. > Many Chrome-specific things that leak big times through CDP (e.g. out-of-process-iframes... A quick note about out of process iframes. We are actively working on that. This is the "Fission" project already mentioned in the previous comment. This is going to require some changes to FrameTree component as the way to iterate over DocShells and connecting to them is going to change. FrameScripts are going to be replaced by some other API. To me, the best option from Firefox engineering standpoint would be to have puppeteer-firefox to reuse existing RDP actors and contribute to them any missing feature. We would not have to maintain: another protocol, new code to iterate over tabs and frames, as well as code to inspect them by piercing process boundaries. It would be an equivalent of what is being done for chrome in puppeteer, but using RDP actors instead of CDP domains. Now, that's a suggestion I made by ignoring the intent of Mozilla [1] to start exposing a CDP server. It may be a good time to actually start having a base CDP server and have puppeteer be the first usecase for it. Andreas recently demonstrated that we can reasonably have such base CDP server in-tree. May be we should land such base server and rebase your patch on top of it? [1] https://groups.google.com/forum/#!msg/mozilla.dev.platform/4-4A8W-nP5g/Y9C9UkWTAAAJ
Updated•4 years ago
|
Comment 10•4 years ago
|
||
I’m tying up a few loose ends today by making sure Bugzilla reflects
reality.
As a result of the hard work by aslushnikov et al. on this patch and
in fact since discussions going as far back as to TPAC two years ago,
we have had multiple Mozilla-internal discussions debating how best to
tackle the challenge of gaining Puppeteer support in Firefox. It
should be abundantly clear to all parties involved that Puppeteer is
increasingly making its way into web authors’ toolboxes as a preferred
browser automation tool, and that its bidirectional and asynchronous
API is appealing to many.
Not supporting Puppeteer also poses significant web interoperability
risk to Mozilla. As websites migrate automation to Puppeteer, the
lack of cross-browser support in many cases leaves Firefox untested,
forcing Firefox to play a catch-up game addressing web compatibility
issues.
Simultenously Mozilla’s ambitions stretch beyond Puppeteer’s feature
set. Puppeteer is backed by the Chrome DevTools Protocol (CDP), which
also drives an armada of other tools. The simlarities between the
Juggler protocol presented in this patch and CDP are uncanny, and we
believe that it is in the best interest of the open web for us to
instead implement the subsets of CDP that allows us to support the
Puppeteer API surface.
To this effect, we have filed
https://bugzilla.mozilla.org/show_bug.cgi?id=puppeteer as a tracking
bug for gaining Puppeteer support in Firefox. The scope of this work
is limited to the subset of CDP that is necessary to support
Puppeteer’s API.
Due to the astonishing progress made on
https://github.com/aslushnikov/gecko-dev/tree/master/testing/juggler,
we would like to uplift and port large parts of the existing Juggler
implementation into our CDP implementation in Firefox.
I see great potential for collaboration in this area.
Have a great weekend, and I’ll see you all in the meeting on Tuesday!
Description
•