Closed Bug 1670332 Opened 3 years ago Closed 3 years ago

Review contentWindow usage in marionette JSWindowActors

Categories

(Remote Protocol :: Marionette, task, P1)

Default
task

Tracking

(Fission Milestone:M7, firefox85 fixed)

RESOLVED FIXED
85 Branch
Fission Milestone M7
Tracking Status
firefox85 --- fixed

People

(Reporter: jdescottes, Assigned: impossibus)

References

Details

(Whiteboard: [marionette-fission-mvp])

Attachments

(1 file)

As explained in Bug 1669961, the contentWindow property of a JSWindowActorChild might be out of sync and point to another window than the one corresponding to the current actors pair.

Bug 1669961 will most like make contentWindow return null instead of pointing to a wrong window.

The goal of this bug is to review the current usage of this.contentWindow in the JSWindowActors used in the marionette codebase and make sure we handle the null case correctly. Alternatively we could also use document.defaultView instead of content window, since document is supposed to always point to the correct document.

What exactly does a null state of this.contentWindow mean? From the other bug I read:

this.contentWindow is a windowproxy which might point to a window from another process.

I thought that each window has its own JSWindowActor pair, so how could this actually point to a different (out-of-process) window?

Right now I don't see issue around this usage in actors when enabling those for Fission jobs. So maybe lets revisit once bug 1669961 gets fixed, or deal with the fallout then. I would just wait here for now.

Severity: -- → S3
Priority: -- → P3
Whiteboard: [marionette-fission-mvp]

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #1)

What exactly does a null state of this.contentWindow mean?

As I said in the summary, I assume that the fix for Bug 1669961 will be to make this.contentWindow return null.
That's based on what :kmag mentioned about this bug in #Fission, "I'm thinking we should make the JSWindowActorChild .contentWindow getter return null or throw if the inner window isn't current.".

From the other bug I read:

this.contentWindow is a windowproxy which might point to a window from another process.

I thought that each window has its own JSWindowActor pair, so how could this actually point to a different (out-of-process) window?

:nika said this could happen when accessing a window in the bfcache. I assume there might be other use cases. For my wpt tests it seems this happened when creating JSWindowActors right before the window global got destroyed.
Now exactly why this happens, why this.contentWindow might not be current probably comes down to the implementation of the window proxy or the jswindowactorchild? I think we should rather ask in Bug 1669961 about this.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #2)

Right now I don't see issue around this usage in actors when enabling those for Fission jobs. So maybe lets revisit once bug 1669961 gets fixed, or deal with the fallout then. I would just wait here for now.

I agree! I think this is really an edge case. We hit this with the Reftest JSWindowActor because we are creating actors when the navigation happens. The current code should be fine for the Marionette JSWindowActor.

Tracking marionette-fission-mvp bugs for Fission Beta milestone (M7)

Fission Milestone: --- → M7

Quick update here, the current patch on bug 1669961 will make contentWindow return null whenever the issue occurs.
Searchfox query to list our current usage of contentWindow in child actors:
https://searchfox.org/mozilla-central/search?q=contentWindow&path=marionette**FrameChild

We need to understand in which cases this can really occur, beyond the edge case I noticed when working on the RefTest actors.
If this is really only happening in edge cases, we could probably throw in MarionetteFrameChild::receiveMessage. But looking at the test being added for Bug 1669961, it seems this should happen whenever the current page is loaded from the bfcache. If this is the case, then we should maybe consider using this.document.defaultView instead?

We have to fix that. I see failures when running the following command:

mach wpt --setpref="marionette.log.level=trace" testing/web-platform/tests/accelerometer/Accelerometer-iframe-access.https.html

Assignee: nobody → mjzffr
Status: NEW → ASSIGNED
Priority: P3 → P1

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #7)

We have to fix that. I see failures when running the following command:

mach wpt --setpref="marionette.log.level=trace" testing/web-platform/tests/accelerometer/Accelerometer-iframe-access.https.html

This test fails for me with the same error both with JSWindowActors enabled and disabled:

0:04.67 pid:734703 1605710640150	Marionette	DEBUG	2 -> [0,16,"WebDriver:ExecuteAsyncScript",{"scriptTimeout":null,"newSandbox":false,"args":[],"filename":"testing/web-platform/test ... EventListener(\"load\", () => {\n    resolve();\n  }, { once: true });\n} else {\n  resolve();\n}","sandbox":null,"line":84}]
 0:04.68 pid:734703 1605710640160	Marionette	DEBUG	2 <- [1,16,{"error":"javascript error","message":"JavaScriptError: Document was unloaded","stacktrace":"WebDriverError@chrome://ma ... tte/content/error.js:360:5\nevaluate.sandbox/promise</unloadHandler<@chrome://marionette/content/evaluate.js:130:20\n"},null]

So I don't think it's related to this bug. Do you see something different, Henrik?

Flags: needinfo?(hskupin)

Here's a try push where I blindly changed all contentWindow usage to document.defaultView: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=c63ae1bbaff2c77b9c96f980c914cc4204d8969c

Looks pretty green so far.

I'd like to write a wdspec test that creates a bfcache scenario where contentWindow ends up being null.

The required fix for this bug will make sure that we don't see the following error message:

0:34.18 pid:56060 JavaScript error: chrome://marionette/content/actors/MarionetteCommandsChild.jsm, line 62: TypeError: can't access property "addEventListener", this.contentWindow is null

But it will not change the result of the above mentioned test. There might be other tests, which might face a different outcome. I would suspect perform action specific ones.

Flags: needinfo?(hskupin)

Okay, I found a similar example with wpt /inert/inert-retargeting-iframe.tentative.html

0:04.96 pid:895048 1605883318516	Marionette	DEBUG	2 -> [0,16,"WebDriver:ExecuteScript",{"script":"window.open('about:blank', '453ef49e-80ad-412c-a63d-183f332bf94c', 'noopener')","n ... ,"filename":"testing/web-platform/tests/tools/wptrunner/wptrunner/executors/executormarionette.py","sandbox":null,"line":84}]
 0:04.97 pid:895048 1605883318530	Marionette	TRACE	[38] Frame script loaded
 0:04.97 pid:895048 1605883318532	Marionette	DEBUG	2 <- [1,16,null,{"value":null}]
 0:04.98 pid:895048 1605883318534	Marionette	DEBUG	2 -> [0,17,"WebDriver:GetWindowHandles",{}]
 0:04.98 pid:895048 1605883318534	Marionette	DEBUG	2 <- [1,17,null,["9","38"]]
 0:04.98 pid:895048 1605883318538	Marionette	DEBUG	2 -> [0,18,"WebDriver:SwitchToWindow",{"handle":"38","name":"38","focus":true}]
 0:04.99 pid:895048 1605883318541	Marionette	DEBUG	2 <- [1,18,null,{"value":null}]
 0:04.99 pid:895048 1605883318545	Marionette	DEBUG	2 -> [0,19,"WebDriver:ExecuteAsyncScript",{"scriptTimeout":null,"newSandbox":false,"args":[],"filename":"testing/web-platform/test ... EventListener(\"load\", () => {\n    resolve();\n  }, { once: true });\n} else {\n  resolve();\n}","sandbox":null,"line":84}]
 0:04.99 pid:895048 1605883318547	Marionette	DEBUG	2 <- [1,19,null,{"value":null}]
 0:04.99 pid:895048 1605883318549	Marionette	DEBUG	2 -> [0,20,"WebDriver:Navigate",{"url":"http://web-platform.test:8000/inert/inert-retargeting-iframe.tentative.html"}]
 0:04.99 pid:895048 1605883318550	Marionette	TRACE	[38] MarionetteCommands actor created for window id 25
 0:04.99 pid:895048 JavaScript error: chrome://marionette/content/actors/MarionetteCommandsChild.jsm, line 62: TypeError: can't access property "addEventListener", this.contentWindow is null

With the change to document.defaultView (as in the try push in Comment 9) there are no more TypeErrors of this kind.

Pushed by mjzffr@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/d1c1abe62c6c
[marionette] Replace contentWindow usage with document.defaultView r=marionette-reviewers,whimboo
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 85 Branch
No longer regressions: 1685454
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.