Closed Bug 1311041 (marionette-window-tracking) Opened 8 years ago Closed 5 years ago

Window handle changes when framescript is moved to a different process

Categories

(Remote Protocol :: Marionette, defect, P1)

Version 3
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: ato, Assigned: ato)

References

(Blocks 3 open bugs, )

Details

(Keywords: pi-marionette-server, pi-marionette-spec)

Attachments

(1 file, 2 obsolete files)

When a browser goes through the painful process of a remoteness change, its outerWindowID that we use as the WebDriver window handle also changes.  This means users cannot rely on the window handle to stay consistent.

This is a violation of the “unique resource identifier” a window handle should be mandated by the WebDriver specification.

mconley mentions in https://bugzilla.mozilla.org/show_bug.cgi?id=1259055#c22 that we could use a <xul:browser>’s permanentKey as a replacement, as they are transferred when a browser is torn out to a new window.
Blocks: webdriver
Blocks: 1202246
Assignee: nobody → ato
Status: NEW → ASSIGNED
Andreas, it would be good if you could wait for bug 1124604 being landed, which also includes a couple of changes to browser.js and also adds a couple of new tests. I'm planning to finish the patch today. Thanks.
Blocks: 1330309
Alias: marionette-window-tracking
Blocks: 1301073
Blocks: 1380513
Priority: -- → P1
No longer blocks: 1380513
Blocks: 1313338
Attached patch wip (obsolete) — Splinter Review
MozReview-Commit-ID: CcJA7TyCwKs
Thank you for sharing the WIP. There are some parts we might want to discuss especially the different context classes and which methods those offer. I see upcoming conflicts for different applications. But maybe it's too early, and you still have considered to move around once the initial code is working. Otherwise lots of work done already!
Blocks: 1255946
Attached patch wip (obsolete) — Splinter Review
MozReview-Commit-ID: CcJA7TyCwKs
Attachment #8893591 - Attachment is obsolete: true
Blocks: 1373444
Depends on: 1392281
No longer blocks: 1373444
Depends on: 1392318
Depends on: 1392339
Depends on: 1392346
Depends on: 1392323
Depends on: 1393812
Depends on: 1394849
Depends on: 1394881
Depends on: 1400256
Depends on: 1404946
Depends on: 1405004
Depends on: 1408454
Depends on: 1409379
Blocks: 1420461
Depends on: 1422593
Depends on: 1422915
Depends on: 1423209
Depends on: 1423282
Depends on: 1423356
Depends on: 1423357
Depends on: 1423358
Depends on: 1423359
Blocks: 1426154
Blocks: 1426208
Depends on: 1379490
Attachment #8899161 - Attachment is obsolete: true
Depends on: 1429082
Depends on: 1429083
Depends on: 1429091
Depends on: 1429111
Depends on: 1429116
No longer blocks: 1426154
Depends on: 1426154
Depends on: 1429199
Depends on: 1430077
Depends on: 1430109
Depends on: 1430119
Depends on: 1431155
Depends on: 1431462
Depends on: 1432212
Depends on: 1432538
Blocks: 1431116
Blocks: 1363368
Blocks: 1337898
Blocks: 1364434
Blocks: 1288926
Blocks: 1447024
Depends on: 1447045
Depends on: 1460656
It’s time to make a brief update on the progress of this bug.  Some
months ago I had completed a patch that provided functioning window
tracking and passed all the Mn tests on try.  When I turned my
attention to Wr (WPT reftests) I discovered that the approach I had
chosen for tracking windows in Mn tests was not going to cut it for
Wr.

The short explanation is that the reftest harness manually spawns
a ChromeWindow and attaches a custom <xul:browser> to it.  This
content browser wouldn’t get automatically registered by the
BrowserObserver in charge of picking up new windows and browsers
because it was relying on the TabOpen/TabClose events.  Since the
<xul:browser> in the reftest window was not associated with a
tabbrowser, these events would not fire.

I experimented with various workarounds for having the reftest
harness manually add the content browser to the browser store used
by BrowserObserver, but ran into a range of synchronisation problems:
because the construction and initialisation of the <xul:browser>
in the reftest harness is not atomic, I went down a deep rabbit
hole of trying to make the BrowserObserver, which was designed to
work in a reactive, event-based manner, work with in a synchronous
and blocking call.  In the end I concluded this workaround was not
feasible and returned to the drawing board.

When new content browsers are created, they happen to fire a
Browser:Init event on their message manager.  This is sent after
the creation of the <xul:browser> chrome element itself and after
any IPC messages from the frame script (such as Marionette:Register).
The great thing about it, is that it is also sent independently
from <tabbrowser>, e.g. is useful for reftest.

Unfortunately, and to my great sorrow, there is no equivalent to
Browser:Init for when the content browser has been torn down and
been removed.  Through much help from mconley I discovered
message-manager-{close,disconnect}: "close" fires before the message
manager is freed and disconnect fires after the fact.  Unfortunately,
again, it also fires when a content browser suffers a remoteness
change which makes it unsuitable for removing the browser from the
cache: if the content browser were to re-register from a new child
process we would no longer hold a reference to it in the cache.

After much thinking and tinkering I’ve come up with a solution that
on message-manager-disconnect tests if the chrome element <xul:browser>
has been detached from the DOM.  This seems to reliably tell us
whether the content browser has been removed and should be removed
from the cache, or whether it is merely transitioning to a new child
process.

The contents of the patches I will submit for this bug will mainly
consist of two things: a window- and browser tracking system, which
allows Marionette to track content browsers across _all_ windows
and in all contexts; and a set of browsing context abstractions
which provide specialisations for Firefox and Fennec.

The latter has been pretty stable throughout the work on this patch,
but will allow us to separate out application-specific implementations.
This will be great if we want to add GeckoView support in the future
as it provides clean interfaces ridden of the if-condition spaghetti
code that we currently find throughout Marionette.

I will give a brief introduction to these abstractions.  Because
WebDriver speak in the web platform terminology of ‘browsing contexts’
I thought it would be a nice challenge to see if we could map down
to this terminology in Marionette.

Each ChromeWindow is represented by a ChromeContext that in addition
to providing various functionality pertaining to <window> also
implements WebDriver-specific functionality such as a blocking API
for window resizing, repositioning, &c.

Similarily each content browser (represented by <xul:browser> in
chrome space and by a frame script in the child process) is represented
with a ContentContext abstract class.  It has a TabbedContentContext
specialisation representing content browsers that are associated
to a tabbrowser, and this again has two specialisations: a
TabbedFirefoxContext and a TabbedFennecContext.  The justifications
for where to draw the line on these abstractions is simply where
the API of Firefox and Fennec differs.  They happen to differ when
dealing with tab management, so that’s where the line is drawn.

From a usage perspective this means that ContentContext#close()
will do the right thing in driver.js, and that callers don’t have
to worry specifically about what the implementation specifics for
a particular application are.  This is a great leap forward and
significantly reduces if-condition and switch-statement clutter in
driver.js and browser.js.
Depends on: 1464469
Depends on: 1465527
Blocks: 1465719
Depends on: 1467215
No longer blocks: 1264259
Depends on: 1469307
Blocks: 1471288
Blocks: 1121705
No longer blocks: 1121705
Depends on: 1121705
Depends on: 1473614
Depends on: 1476305
Blocks: 1286893
Depends on: 1490333
Depends on: 1492499
Blocks: 1501778
Lets update the summary of the bug given that this is not only for remoteness changes anymore. It happens each time the framescript is moved to a different process.
Summary: Window handle changes on remoteness change → Window handle changes when framescript is moved to a different process
No longer blocks: 1351940

In the recent weeks I have come to the conlusion that fixing the
window handles to stay unique following remoteness changes is
somewhat directly related to ensuring Fission compatiblity in
Marionette.

As process swaps get more frequent and as nested browsing contexts
such as <frame> and <iframe> shift from being traversable through
the docshell to becoming remote, there is a pressing need to be
able to uniquely keep track of content browsers (<browser>) across
all processes.

The DOM team is pulling a lot of infrastructure for handling content
browsers from the product frontends (Firefox and Fennec) up to the
platform level. One of the new underlying primitives introduced
is the BrowsingContext interface that is meant to carry the same
meaning as “browsing contexts” in the HTML specification. It assigns
a universally unique identifier to each context, and because they
will be serialisable you can query and reference them from any
process.

BrowsingContext directly comparable to the (wait for it…) BrowsingContext
interfaces I had been working on in this patch. But whereas my
patch had support for Firefox and Fennec using existing toolkit
technology to keep track of content browsers appearing and disappearing,
the new platform BrowsingContext does all of this at the platform
level, which means it will be future-compatible with any new products
(e.g. based on GeckoView).

The second half of the patches I’ve been working on for this bug
were a series of class specialisations to get rid of spaghetti
if-conditions for product-specific behaviour. These are still
relevant and we would like to see these in Marionette long-term.
However, as we discuss implementing a new remote interface in
Firefox, we face the question what to do with the Marionette protocol.

It’s a bigger discussion what to do with Marionette and how to make
it Fission compatible (as it reaches beyond the scope of this bug),
but the addition of BrowsingContext does however render this bug
superfluous.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX

(In reply to Andreas Tolfsen ⦗:ato⦘ from comment #19)

The DOM team is pulling a lot of infrastructure for handling content
browsers from the product frontends (Firefox and Fennec) up to the
platform level. One of the new underlying primitives introduced
is the BrowsingContext interface that is meant to carry the same
meaning as “browsing contexts” in the HTML specification. It assigns
a universally unique identifier to each context, and because they
will be serialisable you can query and reference them from any
process.

Note that we still have to find the window the wanted browsing context is contained in. Is there an easy way to do that like an API which returns the current window handle for a browsing context? If not we might have to filter all open chrome windows for the requested browsing context id. Maybe internally we can simply store the chrome window handle together with the browsing context id, so we don't have to do the filtering each time.

The second half of the patches I’ve been working on for this bug
were a series of class specialisations to get rid of spaghetti
if-conditions for product-specific behaviour. These are still
relevant and we would like to see these in Marionette long-term.

I have to start working on bug 1496773 for GeckoView soon, which will pull-in at least one more platform. As such more if-conditions will have to be added. Without looking at your full patch, what do you think how much work it is to just get those separations extracted and landed? Would it be worth it? Or should I for now just add the necessary conditions?

It’s a bigger discussion what to do with Marionette and how to make
it Fission compatible (as it reaches beyond the scope of this bug),
but the addition of BrowsingContext does however render this bug
superfluous.

For the Fission work see bug 1518468.

Flags: needinfo?(ato)
No longer blocks: 1501778
No longer blocks: 1426208

Note that I filed bug 1519335 to investigate the usage of the browsing context id.

No longer blocks: 1465719

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #20)

Note that we still have to find the window the wanted browsing
context is contained in. Is there an easy way to do that like
an API which returns the current window handle for a browsing
context? If not we might have to filter all open chrome windows
for the requested browsing context id. Maybe internally we can
simply store the chrome window handle together with the browsing
context id, so we don't have to do the filtering each time.

A BrowsingContext is an associated state of the content browser, so
you can find the window a browsing context is contained in the usual
way by traversing up the tree to find the nearest <window> element:

myBrowser = gBrowser.browsers[0]
myBrowser.browsingContext
<- BrowsingContext
myBrowser.closest("window")
<- <window id="main-window" …>

ChromeWindows aren’t browsing contexts, but they also always exist
in the parent process, so their outerWindowIDs won’t change.

I have to start working on bug 1496773 for GeckoView soon,
which will pull-in at least one more platform. As such more
if-conditions will have to be added. Without looking at your full
patch, what do you think how much work it is to just get those
separations extracted and landed? Would it be worth it? Or should
I for now just add the necessary conditions?

If Marionette as we know it is going to be discontinued, it doesn’t
make sense to spend a lot of time on major refactorings.

Flags: needinfo?(ato)

If Marionette as we know it is going to be discontinued

What will be alternative?

(In reply to support from comment #23)

If Marionette as we know it is going to be discontinued

What will be alternative?

We should narrow the scope of this discussion to talk about using
the new platform BrowsingContext abstractions for dealing with host
process changes.

To be clear, the future of the Marionette remote protocol has not
yet been decided. We will likely make a decision about that when
doing our H2 planning.

No longer blocks: 1477977
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: