Closed Bug 1692468 Opened 3 years ago Closed 2 years ago

Move element reference cache from the parent to the content process

Categories

(Remote Protocol :: Marionette, enhancement, P1)

Default
enhancement
Points:
8

Tracking

(firefox110 fixed)

RESOLVED FIXED
110 Branch
Tracking Status
firefox110 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [webdriver:m5][wptsync upstream][webdriver:relnote])

Attachments

(6 files, 3 obsolete files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

As noted by James on https://phabricator.services.mozilla.com/D104525#inline-587388 we might want to create a new custom object in element.getElementId() that wraps ContentDOMReference, and includes the extra properties as needed. Right now it just sticks additional properties onto that reference object.

https://searchfox.org/mozilla-central/source/testing/marionette/element.js#789-793

Maybe that could be done together with bug 1680479. At that time we will have to touch this code anyway.

See Also: → 1680479

To wrap up what needs to be done here:

  1. Create a wrapper around ContentDOMReference that has to live in the same process as the JSWindowActor child instances. That will be in the child process except when Marionette runs in chrome scope where it would live in the parent process instead.
  2. Move the element store from the parent process into the child process which includes the uuid generation for element ids.

With that we have to make sure that stale elements can still be handled as expected by the WebDriver classic specification and serialization keeps working as well.

Blocks: 1770420
Points: --- → 8
Priority: P3 → P1
Summary: Create custom type around ContentDOMReference for storing additional data → Create wrapper around ContentDOMReference to generate uuids already in the content process
Whiteboard: [webdriver:m4]
Blocks: 1770731
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Blocks: 1772484
No longer blocks: 1772484
Depends on: 1772484
Blocks: 1774182
Depends on: 1775036
Depends on: 1775064

Given that my local patch is quite old and I never pushed it as WIP and having all the recent fixes coming in by a rebase, I want to have a copy of my patch not only on my local machine. Sadly I cannot push to phab so I pushed it to try:

https://treeherder.mozilla.org/jobs?repo=try&revision=299f52851510f1f2400b7fe78fc81c9b56fc2a34

I'll rebase early next week.

Attachment #9284509 - Attachment is obsolete: true
Attachment #9284508 - Attachment is obsolete: true
Attachment #9284507 - Attachment is obsolete: true

Note that landing these patches is blocked by the final decision on https://github.com/w3c/webdriver/pull/1667.

With the recent discussion on the phabricator review we found cases where the element cache actually has to be kept per process so that it can be shared between different windows and frames. To correctly determine of an element is part of an active document and to throw a no such element error we would need a new JS exposed API. This will be added via bug 1779808.

Depends on: 1779808

I pushed the patch set with the final patch to move the cache to a singleton per process to try:
https://treeherder.mozilla.org/jobs?repo=try&revision=146fc4ad98c71d668c0adc11dd6c75072513270e

James, not sure how we want to handle the reload of pages. As the try results show it won't work with no such element anymore. Most likely it will be kept as stale element? Sadly I don't have the time anymore to check that. Maybe you can have a look and also run against your example tests? I hope it helps to figure out what's left to do for the WebDriver classic spec.

Flags: needinfo?(james)

Yeah, maybe the thing to do is to just drop the WebDriver PR for now, fix this properly in BiDi, and then later make any changes to WebDriver required to make it compatible with BiDi semantics. For BiDi we'll likely define a per-browsing-context-group mapping between shared id and node, and ensure the serialization has enough information to allow -classic to maintain it's current "same document references only" bevhaiour.

Flags: needinfo?(james)

(In reply to James Graham [:jgraham] from comment #13)

Yeah, maybe the thing to do is to just drop the WebDriver PR for now, fix this properly in BiDi, and then later make any changes to WebDriver required to make it compatible with BiDi semantics. For BiDi we'll likely define a per-browsing-context-group mapping between shared id and node,

The ContentDOMReference module has its weak map as a singleton and as such it exists across the lifetime of the process. Note that a single process can actually have more than one browsing context group. Further individual element references are stored per browsing context and not per group. As such whenever a browsing context is destroyed the references will be gone as well. So I don't see why we would need a per-browsing-context-group mapping.

and ensure the serialization has enough information to allow -classic to maintain it's current "same document references only" bevhaiour.

If we drop the WebDriver PR we could still try to keep the current behavior which includes that we fail with a stale element error when navigating away from the current page. Given that the staleness checks are done in Marionette specific modules, we could certainly have a different implementation for such a check in BiDi. Requirement here IMHO would still be that both protocols can resolve the same element references at any time.

I'm going to use some of the examples from James over from this GitHub issue to have a comprehensive list of tests to use. I'm going to file new bugs for other issues that might have to get fixed first.

Depends on: 1398792

While adding more of these tests I came to the conclusion that it would be better to stop working on this bug for now. There are definitions in the WebDriver classic specification like get a known connected element which will make it hard to have a good sharing strategy with BiDi. As best we will wait until James is back, or already discuss it during the Test summit in Berlin end of August.

Status: ASSIGNED → NEW
Whiteboard: [webdriver:m4] → [webdriver:m4:blocked]
Priority: P1 → P2
Whiteboard: [webdriver:m4:blocked] → [webdriver:backlog:blocked]
No longer blocks: 1770420
Blocks: 1778936
Blocks: 1790361
Priority: P2 → P1
Whiteboard: [webdriver:backlog:blocked] → [webdriver:m5:blocked]
Depends on: 1793920
Blocks: 1274251
Blocks: 1794078
No longer blocks: 1774182
Depends on: 1774182
Status: NEW → ASSIGNED
Summary: Create wrapper around ContentDOMReference to generate uuids already in the content process → Move element reference cache from the parent to the content process
Whiteboard: [webdriver:m5:blocked] → [webdriver:m5]
Depends on: 1800737
No longer depends on: 1779808
Blocks: 1770733
No longer blocks: 1770731
Blocks: 1764594

My latest version of the patch shows an assertion and crash for MOZ_ASSERT(nsContentUtils::IsSafeToRunScript()) when trying to get a process actor for the parent process. The code where it's failing can be found in the observer handling within ProcessDataParent.sys.mjs. It looks like:

var observer = {
  async observe(subject, topic) {
    if (topic === "dom-window-destroyed") {
      const actor = subject
        .browsingContext
        .currentWindowGlobal
        .domProcess
        .getActor("WebDriverProcessData");
      await actor.cleanUp({ browsingContext: subject.browsingContext });
    }
  }
}

Note that this usage is based on the documentation for JS actors.

Is this specific assertion in getActor() necessary? Or is the code above wrong?

Nika or Olli, your help would be appreciated. Thanks!

Flags: needinfo?(smaug)
Flags: needinfo?(nika)

There is a script blocker here https://searchfox.org/mozilla-central/rev/17aeb39742eba71e0936ae44a51a54197100166d/layout/base/nsDocumentViewer.cpp#797
Script blocker doesn't really block scripts, and some chrome js does run with script blockers.
But since actors tend to do something with web pages, that shouldn't happen while we have a blocker on stack.

Could you do the cleanup asynchronously? Just store domProcess and call cleanUp in a separate task?

Flags: needinfo?(smaug)
Flags: needinfo?(nika)
Duplicate of this bug: 1691918
Attachment #9284512 - Attachment description: Bug 1692468 - [marionette] Move Element Reference store into content process. → Bug 1692468 - [marionette] Move Element Reference store as node cache into content process.

As it turned out we have to register for the browsing-context-discarded observer in the child actor. That way we can circumvent the issue with the currentWindowGlobal already being set to null. The latest try builds look finally pretty green.

Blocks: 1806593
Blocks: 1807227
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d91e6dd09853
[marionette] Align calling conventions for fromJSON and toJSON. r=webdriver-reviewers,jgraham
https://hg.mozilla.org/integration/autoland/rev/891bf02067de
[marionette] Fix Marionette unit tests for staleness checks. r=webdriver-reviewers,jgraham
https://hg.mozilla.org/integration/autoland/rev/74b088e7e642
[marionette] Improve unit tests for element references. r=webdriver-reviewers,jdescottes
https://hg.mozilla.org/integration/autoland/rev/9928d906cf24
[wdspec] Enhance "no such element" tests for closed other frame and window. r=webdriver-reviewers,jdescottes
https://hg.mozilla.org/integration/autoland/rev/1db2547849c6
[marionette] Refactor starting and stopping of observing windows. r=webdriver-reviewers,jdescottes
https://hg.mozilla.org/integration/autoland/rev/eda24b3ff95b
[marionette] Move Element Reference store as node cache into content process. r=jgraham,webdriver-reviewers,jdescottes
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/37717 for changes under testing/web-platform/tests
Whiteboard: [webdriver:m5] → [webdriver:m5], [wptsync upstream]
Upstream PR merged by moz-wptsync-bot
Whiteboard: [webdriver:m5], [wptsync upstream] → [webdriver:m5][wptsync upstream][webdriver:relnote]
Product: Testing → Remote Protocol
Regressions: 1822466
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: