1499845 - Fission: Implement frontend API(s) for taking a screenshot of OOP iframes

Reporter

Description

7 years ago

With Fission we will have remote iframes for cross-origin sites. This means that a single content process will not be painting all of the web content inside of a tab. We'll be able to support this using our compositor infrastructure for normal rendering, as the compositor will receive layers from all frames in a tab and composite them into one image.

This will not work for CanvasRenderingContext2D::DrawWindow, which is used in lots of places for taking a screenshot of web content.

There are two different problems that make me believe a new API is needed.

The first is that we shouldn't allow a content process to request a screenshot of cross-origin content. If a content process is compromised this could allow it to access sensitive information. So this new API should be limited to the browser process if possible, or privileged content processes if needed.

The second is that painting a cross-process document tree is an async operation. It will be slower than traditional drawWindow, and the browser process cannot block on content processes. So this new API should be async.

I implemented internals to support this with an unused API in bug 1475139.

I've now been looking at all the different use cases an API should support. To my knowledge these are the use cases in mozilla-central.

1. Draw the whole browser window
2. Draw the full tab including non-visible areas
3. Draw the visible rect of a tab
4. Draw a specified rect of a tab
5. Draw a specific DOM node in a tab
6. Draw the contents of a <iframe mozbrowser>

These all have different requirements, and we have multiple re-implementations of all of them using drawWindow in the code base. It'd be nice for the screenshot API to support all these use cases.

I think something like this could work:

```
// Takes a screenshot of the clientBoundingRect of element found
// by querySelector(selector). If no selector is given, then takes
// a screenshot of the current viewport.
Promise<ImageBitmap> captureSelector(DOMString selector, CaptureElementOptions options)

dictionary CaptureElementOptions {
  // Whether to scroll to the selected element before capturing it
  // Useful as it will lay out fixed and sticky position items correctly
  // Will need to restore scroll offsets after the capture
  bool scrollIntoView = true,
  // An offset to be applied to the selected client bounding rect
  float top = 0,
  // An offset to be applied to the selected client bounding rect
  float left = 0,
  // A width to override that of the selected client bounding rect
  float width = undefined,
  // A height to override that of the selected client bounding rect
  float height = undefined,
  // A resolution to render the content at, defaults to devicePixelRatio
  float resolution = undefined,
}
```

I think that we could provide a WebExtensions API for this similar to captureVisibleTab. [1] Internally, I think we could expose this off of <browser> as well.

The biggest question I have is where this API needs to be exposed and which processes it needs to be available in. Is only allowing the API in the parent process viable, or do we need to allow it in certain other processes?

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/tabs/captureVisibleTab
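
To make the option semantics above concrete, here is a minimal sketch of how a caller-side helper might turn an element's bounding rect plus CaptureElementOptions into a concrete capture request. The helper name `resolveCaptureRequest`, the `Rect` and `CaptureRequest` types, and the rounding of the output size are assumptions made for illustration; nothing here is an existing Gecko API.

```typescript
// Hypothetical mirror of the proposed dictionary; every member is optional.
interface CaptureElementOptions {
  scrollIntoView?: boolean;
  top?: number;
  left?: number;
  width?: number;
  height?: number;
  resolution?: number;
}

interface Rect { x: number; y: number; width: number; height: number; }

interface CaptureRequest {
  rect: Rect;           // region to paint, in CSS pixels
  pixelWidth: number;   // size of the resulting bitmap, in device pixels
  pixelHeight: number;
  scrollIntoView: boolean;
}

// Assumed semantics: top/left shift the rect, width/height replace its size,
// and resolution falls back to devicePixelRatio when not given.
function resolveCaptureRequest(
  bounds: Rect,
  options: CaptureElementOptions = {},
  devicePixelRatio: number = 1,
): CaptureRequest {
  const resolution = options.resolution ?? devicePixelRatio;
  const rect: Rect = {
    x: bounds.x + (options.left ?? 0),
    y: bounds.y + (options.top ?? 0),
    width: options.width ?? bounds.width,
    height: options.height ?? bounds.height,
  };
  return {
    rect,
    // Rounding up is an assumption so that partially covered pixels are kept.
    pixelWidth: Math.ceil(rect.width * resolution),
    pixelHeight: Math.ceil(rect.height * resolution),
    scrollIntoView: options.scrollIntoView ?? true,
  };
}
```

Under these assumptions, use case 4 (a specified rect of a tab) would be expressed by passing no selector and setting all of top, left, width and height.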

Ryan Hunt [:rhunt]

Reporter

Comment 1

7 years ago

Mike, Brian, would you have any insights into the requirements for this API? Or know who else to talk to.

Here are some of the uses of drawWindow I was mentioning.

Page thumbnail [1], Devtools screen capture [2], Devtools eye dropper [3].

[1] https://searchfox.org/mozilla-central/rev/eef79962ba73f7759fd74da658f6e5ceae0fc730/toolkit/components/thumbnails/PageThumbUtils.jsm#227
[2] https://searchfox.org/mozilla-central/rev/3d989d097fa35afe19e814c6d5bc2f2bf10867e1/devtools/shared/screenshot/capture.js#94
[3] https://searchfox.org/mozilla-central/rev/3d989d097fa35afe19e814c6d5bc2f2bf10867e1/devtools/server/actors/highlighters/eye-dropper.js#490

Flags: needinfo?(mconley)

Flags: needinfo?(bgrinstead)

Mike Conley (:mconley) (:⚙️)

Comment 2

7 years ago

Making this async makes perfect sense, as does the idea that content processes shouldn't be able to capture pixels from other processes.

We should get some of the browser Screenshots devs to look this proposal over - I've needinfo'd Jared Hirsch.

Flags: needinfo?(mconley) → needinfo?(jhirsch)

Brian Grinstead [:bgrins]

Comment 3

7 years ago

(In reply to Ryan Hunt [:rhunt] from comment #1)
> Mike, Brian, would you have any insights into the requirements for this API?
> Or know who else to talk to.
>
> Here are some of the uses of drawWindow I was mentioning.
>
> Page thumbnail [1], Devtools screen capture [2], Devtools eye dropper [3].
>
> [1] https://searchfox.org/mozilla-central/rev/eef79962ba73f7759fd74da658f6e5ceae0fc730/toolkit/components/thumbnails/PageThumbUtils.jsm#227
> [2] https://searchfox.org/mozilla-central/rev/3d989d097fa35afe19e814c6d5bc2f2bf10867e1/devtools/shared/screenshot/capture.js#94
> [3] https://searchfox.org/mozilla-central/rev/3d989d097fa35afe19e814c6d5bc2f2bf10867e1/devtools/server/actors/highlighters/eye-dropper.js#490

I think your (1), (2), (3), and (5) in Comment 0 cover the DevTools usecases (using --chrome, no args, --fullpage, and --selector, respectively).

Yulia was the last person to work with screenshots in DevTools (during GCLI removal), so she may be able to give more feedback or be able to redirect appropriately.

Flags: needinfo?(bgrinstead) → needinfo?(ystartsev)

Yulia Startsev [:yulia] OOO until July 2026

Comment 4

7 years ago

Hi Ryan,

We already have a cross process issue with screenshots on devtools, this is the bug for it https://bugzilla.mozilla.org/show_bug.cgi?id=1474006

It would be really great to have a general api so we don't keep reimplementing the same functionality! At the moment, screenshots are represented by a target scoped actor. (https://searchfox.org/mozilla-central/source/devtools/server/actors/screenshot.js, uses the capture code that you mentioned)

Since it is done this way, we have the capability to take a screenshot in every process that we need to. The hard part is merging the image. In order to make it fission ready, what we would need to do is traverse the targets in a given tab, take a screenshot in each one, determine the offset of the screenshot, then combine the images into a final product.

I am not sure how the browser folks are doing it, so I am looking forward to their ideas as well.
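
The "traverse the targets and determine the offset" step described in this comment could look roughly like the sketch below. It is only an illustration: `FrameTarget`, `Placement` and `collectPlacements` are hypothetical names, not DevTools actors, and the sketch assumes each frame knows its plain x/y offset inside its parent, ignoring scrolling, transforms and clipping.

```typescript
// Hypothetical description of one frame (target) in a tab.
interface FrameTarget {
  id: string;
  // Top-left corner of this frame inside its parent frame, in CSS pixels.
  offsetInParent: { x: number; y: number };
  children: FrameTarget[];
}

// Where a frame's screenshot would be pasted in the final image.
interface Placement { id: string; x: number; y: number; }

// Walk the frame tree parent-first, accumulating offsets so that every
// frame ends up with a position relative to the top-level document.
function collectPlacements(root: FrameTarget, originX = 0, originY = 0): Placement[] {
  const x = originX + root.offsetInParent.x;
  const y = originY + root.offsetInParent.y;
  const placements: Placement[] = [{ id: root.id, x, y }];
  for (const child of root.children) {
    placements.push(...collectPlacements(child, x, y));
  }
  return placements;
}
```

Because parents are listed before their children, a later paste step that walks this list in order would draw each subframe on top of its parent.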

Flags: needinfo?(ystartsev)

Ryan Hunt [:rhunt]

Reporter

Comment 5

7 years ago

(In reply to Yulia Startsev [:yulia] from comment #4)
> Hi Ryan,
>
> We already have a cross process issue with screenshots on devtools, this is
> the bug for it https://bugzilla.mozilla.org/show_bug.cgi?id=1474006
>
> It would be really great to have a general api so we don't keep
> reimplementing the same functionality! At the moment, screenshots are
> represented by a target scoped actor.
> (https://searchfox.org/mozilla-central/source/devtools/server/actors/screenshot.js, uses the capture code that you mentioned)
>
> Since it is done this way, we have the capability to take a screenshot in
> every process that we need to. The hard part is merging the image. In order
> to make it fission ready, what we would need to do is traverse the targets
> in a given tab, take a screenshot in each one, determine the offset of the
> screenshot, then combine the images into a final product.
>
> I am not sure how the browser folks are doing it, so I am looking forward to
> their ideas as well.

Unfortunately it's not always possible to correctly merge the contents of a tab in that manner. For example, content in one frame can overlap and underlap the contents of a sub frame [1]. In addition, with transforms [2] and filters [3] it's not a simple copy and paste operation.

That's why we've been doing some work to provide a platform API which will take care of all of this. We'd also like to not expose this API inside processes containing web content.

Does the devtools panel run in the browser process or a content process? I'm not super familiar with devtools internals so I'm not sure where it'd be convenient for this API to be exposed.

[1] https://eqrion.github.io/web-tests/embed/same-origin-overlap.html
[2] https://eqrion.github.io/web-tests/embed/same-origin-3d-transform.html
[3] https://eqrion.github.io/web-tests/embed/same-origin-filter.html
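
The overlap problem can be shown with a toy model. The sketch below is an illustration only: it uses a single row of labelled "pixels" instead of real bitmaps, and `paintOver` is a hypothetical helper, not a platform function. It compares painting layers in their true z-order with pasting a subframe screenshot onto a flattened parent screenshot.

```typescript
type Pixel = string | null; // null means transparent

// Paint `layer` over `canvas` starting at column `offset`.
// Transparent pixels and out-of-range columns leave the canvas untouched.
function paintOver(canvas: Pixel[], layer: Pixel[], offset: number): Pixel[] {
  const out = canvas.slice();
  layer.forEach((p, i) => {
    const at = offset + i;
    if (p !== null && at >= 0 && at < out.length) {
      out[at] = p;
    }
  });
  return out;
}

// One row of pixels: parent background ("B"), an iframe ("F") at columns 2..5,
// and a parent-owned overlay ("O") at columns 4..6 that sits above the iframe.
const background: Pixel[] = ["B", "B", "B", "B", "B", "B", "B", "B"];
const iframe: Pixel[] = ["F", "F", "F", "F"];
const overlay: Pixel[] = ["O", "O", "O"];

// Compositor-style result: layers from both documents painted in z-order.
const composited = paintOver(paintOver(background, iframe, 2), overlay, 4);

// Per-process merge: the parent's screenshot already has its own content
// flattened together, and the iframe screenshot is pasted on afterwards.
const parentShot = paintOver(background, overlay, 4);
const merged = paintOver(parentShot, iframe, 2);

console.log(composited.join("")); // BBFFOOOB
console.log(merged.join(""));     // BBFFFFOB
```

In the merged row the iframe covers the part of the overlay that should have been drawn above it. Pasting the parent last would instead hide the iframe under the parent's background, so neither paste order reproduces the composited result.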

Jared Hirsch [:jhirsch] (he/him) (Needinfo please)

Comment 6

7 years ago

Sorry for the long delay.

The Screenshots use cases are:

2. Draw the full tab including non-visible areas
3. Draw the visible rect of a tab
4. Draw a specified rect of a tab

Screenshots is implemented as a webextension, so you'll need to expose the API to the webextension process (Screenshots is a specially-privileged webextension with the 'mozillaAddons' permission; not sure if this affects anything at the process level).

Screenshots captures page content via a content script iframe that calls chrome.drawWindow, if that API is available. If not, it falls back to calling captureVisibleTab from the background page. We could just call the new API from the background page instead, I reckon.

An async API works fine for us.
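
The fallback chain described in this comment could be sketched as follows. This is a hedged illustration, not the Screenshots source: the backends are injected as plain functions, `newApi` stands in for whatever API this bug ends up adding, and the preference order (new API first, then drawWindow, then captureVisibleTab) is an assumption.

```typescript
// A capture backend resolves to an image, here represented as a data URL string.
type Capture = () => Promise<string>;

// Hypothetical stand-ins for the capture paths mentioned in the comment.
interface CaptureBackends {
  newApi?: Capture;            // placeholder for the API proposed in this bug
  drawWindow?: Capture;        // content script path using drawWindow
  captureVisibleTab?: Capture; // background page path using captureVisibleTab
}

type BackendName = keyof CaptureBackends;

// Assumed preference order; returns the first backend that is present.
function pickBackend(backends: CaptureBackends): BackendName | null {
  const order: BackendName[] = ["newApi", "drawWindow", "captureVisibleTab"];
  for (const name of order) {
    if (backends[name]) {
      return name;
    }
  }
  return null;
}

// Every path is async, so callers await the result regardless of backend.
async function captureWithFallback(
  backends: CaptureBackends,
): Promise<{ via: BackendName; image: string }> {
  const via = pickBackend(backends);
  if (via === null) {
    throw new Error("no capture backend available");
  }
  const capture = backends[via] as Capture;
  return { via, image: await capture() };
}
```

Since the caller already awaits the result, swapping the synchronous drawWindow path for an async platform API would not change the calling code.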

Flags: needinfo?(jhirsch)

Yulia Startsev [:yulia] OOO until July 2026

Comment 7

7 years ago

Hi Ryan,

I just saw that you worked on this concurrently to this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1475139
To answer your question - the devtools panel runs in the parent process.

The API that you worked on there looks really promising. Would we be able to use it?

Flags: needinfo?(rhunt)

Ryan Hunt [:rhunt]

Reporter

Comment 8

7 years ago

Hi Yulia, I haven't had any time to work on it since this proposal. It should be fairly simple to implement though.

I'll try and take a look at it in the next week.

Flags: needinfo?(rhunt)

Ryan Hunt [:rhunt]

Reporter