Closed Bug 1330348 Opened 7 years ago Closed 7 years ago

Make forward- and backward commands synchronous

Categories

(Remote Protocol :: Marionette, defect, P1)

Version 3
defect

Tracking

(firefox52 fixed, firefox-esr52 fixed, firefox53 fixed, firefox54 fixed, firefox55 fixed)

RESOLVED FIXED
mozilla55
Tracking Status
firefox52 --- fixed
firefox-esr52 --- fixed
firefox53 --- fixed
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: ato, Assigned: whimboo)

References

(Blocks 2 open bugs, )

Details

(Keywords: pi-marionette-server, pi-marionette-spec)

Attachments

(2 files, 1 obsolete file)

Tests using the forward- and backwards commands in Marionette are known to cause intermittents.  We suspect this is because they are not implemented in a synchronous manner, e.g. they don’t wait for the document to reach the expected state before returning.

Because navigating backwards and forwards may either reload a document from bfcache or navigate internally in a document (if push state or internal anchors have been used), we _do not_ expect a document load event to occur.  I’m not entirely sure what the right fix is, but suspect the fix might involve updating the WebDriver standard to be more specific.
Blocks: webdriver
https://bugzilla.mozilla.org/show_bug.cgi?id=1291320 is for fixing the Refresh command.  The refresh command, in contrast to Back and Forward, always expect the document to reload and for the document load events to fire.
Keep in mind that also JS code can trigger navigating back and forward. For those and like other commands (click, keypress) which cause a load of a page, it will be hard to check that in Marionette server with the current algorithm.

In Mozmill I solved this problem by assigning a uuid for each page load. Once a page gets unloaded it gets reset (and the current value backup'ed), and once loaded a new value is set. With that we were able to see that a page load was happening for a tab. So this was able to also handle external page loads triggered by whatever the user is doing.
Blocks: 1332998
Severity: normal → major
Priority: -- → P1
I feel this should wait for a solution on bug 1288336, so we know better what could be done here. Andreas, David, what do you think?
Blocks: 1335952
Well, the bug number I wanted to use should have been bug 1333458.
I don’t think we should wait.  This is a P1.
Blocks: 1305659
I will give this a shot.
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Experimenting with events while using go_back() in a Marionette test shows the following:

> ******** Caught unload event: pagehide
> ******** Caught load event: DOMContentLoaded
> ******** Caught load event: pageshow

This means that we also get a `DOMContentLoaded` event fired we could make use of. There shouldn't be a need to wait for the full page being loaded. This is just similar to what we do for the `get()` command.

For about: pages I do not see any event being fired, and I think that this might be covered by the progress listener as part of the `get()` command.

My proposal would really be to improve the generic page load behavior, at least the part for splitting out the waitForPageLoad method so that this method can be also used by other commands.
(In reply to Henrik Skupin (:whimboo) from comment #9)
> Experimenting with events while using go_back() in a Marionette test shows
> the following:
> 
> > ******** Caught unload event: pagehide
> > ******** Caught load event: DOMContentLoaded
> > ******** Caught load event: pageshow
> 
> This means that we also get a `DOMContentLoaded` event fired we could make
> use of. There shouldn't be a need to wait for the full page being loaded.
> This is just similar to what we do for the `get()` command.

Bear in mind that not all traversing of the browsing context’s history by delta +/- 1 cause DOMContentLoaded to fire.  This is the case if the document by default does not fire DOMContentLoaded events (ImageDocument) as well as for a few ‘special’ schemes (see https://url.spec.whatwg.org/#is-special), or when navigating _within_ one document by fragment.

> For about: pages I do not see any event being fired, and I think that this
> might be covered by the progress listener as part of the `get()` command.

This is also true for other ‘special’ local schemes, including "blob", "data", and "filesystem".

> My proposal would really be to improve the generic page load behavior, at
> least the part for splitting out the waitForPageLoad method so that this
> method can be also used by other commands.

Yes, that is a good starting point.  I would suggest taking the existing implementation with its current behaviour and make that work independently before trying to improve on it.
(In reply to Andreas Tolfsen ‹:ato› from comment #6)
> I don’t think we should wait.  This is a P1.

Given that the page load detection code will completely change via bug 1333458, I don't think it make sense to work on that code now, which gets thrown away soon again.
Depends on: webdriver-navigate
I spoke with David yesterday and we decided to not wait for bug 1333458 but go ahead right now to get it actually fixed.

So while working on the implementation I noticed a situation when the spec is not that clear and causes a huge delay in running the tests. This happens when back/forward gets called if no history is actually available. In such cases we would try to navigate but then fail after 300s with a timeout. The browser should always know if such a navigation is possible, so bailing out early without loosing 300s would be great.

I filed the following issue for the webdriver spec: https://github.com/w3c/webdriver/issues/776
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

This is an early WIP which just makes it working for the couple of examples. I would like to know if this is the right direction I'm heading. 

Short note... Similar to get() this code runs all in content. When I try to attach the event listeners in chrome, I do not see any events coming through. Also I made a unload method, which could generally be used. It returns if a page load is expected, and if so the calling code can execute the waitForPageLoad method. Both we could also use for get() later, if it turns out my idea from running all in chrome doesn't work.
Attachment #8838209 - Flags: feedback?(dburns)
Attachment #8838209 - Flags: feedback?(dburns) → feedback+
Also as noticed the post navigation steps for backward and forward are exactly identical to get(). So it's not enough to only wait for pageshow. Take into account that the former page could have been an error page, local file URL, and all that.

The open webdriver spec issue for that is: https://github.com/w3c/webdriver/issues/777
So I just uploaded the latest state of the patch which works for back and forward except for remoteness changes. Here I have the problem in how I have to specify the parameters for the waitForPageLoaded method. Maybe you can give me a hint, Andreas?

Otherwise I have a problem now with the following:

> **** rejecting promise for timeout
> JavaScript error: chrome://marionette/content/error.js, line 160: Error: Error loading page, timed out (checkLoad)

As it looks like the new code in error.js doesn't correctly forward the rejected promise, or I didn't update something in my patch in regards to those changes.

Andreas, I would appreciate if you can have a look over this WIP patch and tell me your feedback. Thanks.

Please be aware that I will not be around all next week.
Flags: needinfo?(ato)
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review115248

::: testing/marionette/listener.js:1159
(Diff revision 2)
> +function waitForPageLoaded(msg) {
> +  let {pageTimeout, startTime} = msg;

Changing `msg` to `msg.json` should let you extract `pageTimeout` and `startTime` correctly.
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review115250

::: testing/marionette/driver.js:1034
(Diff revision 2)
> -  yield this.listener.goBack();
> +  if (!this.curBrowser.tab) {
> +    return;
> +  }

What do these test for?

::: testing/marionette/driver.js:1046
(Diff revision 2)
> +  // If a remoteness update interrupts our page load, this will never return
> +  // We need to re-issue this request to correctly poll for readyState and
> +  // send errors.
> +  dump('*** current message id: ' + this.listener.activeMessageId + '\n');
> +  this.curBrowser.pendingCommands.push(() => {
> +    cmd.parameters.pageTimeout = this.timeouts.pageLoad;
> +    cmd.parameters.startTime = new Date().getTime();
> +    this.mm.broadcastAsyncMessage(
> +        "Marionette:waitForPageLoaded" + this.curBrowser.curFrameId,
> +        cmd.parameters);
> +  });

I’m not a great fan of the `pendingCommands.push(…)` API, but I guess this works fine for now?

In a more ideal world we would make _all_ calls through the `AsyncProxyChannel` (`this.listener`) repeat if they failed to execute so that we wouldn’t have to special case `waitForPageLoaded`.

I don’t know off hand if that is feasible or if we can even detect if a command failed, but it would certainly be possible to assume that all listener messages are prone to failure because the content frame script might disappear at any time.  We could then clear `pendingCommands` or the equivalent every time a response comes through from the listener.

Anyway, I’m not expecting you to fix that here.  Just explaining that `pendingCommands` is an abomination and we would do well by getting rid of it.
Just curious about waitForPageLoaded/waitForPageUnloaded: I thought you said you would do this waiting entirely in chrome space?  That the web progress listener would only send the right events if it was attached to the <xul:browser> in chrome?

If we were to move this code to chrome, the only thing we would have to contact the listener about would be to get the document’s readyState.
Flags: needinfo?(ato)
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review115248

> Changing `msg` to `msg.json` should let you extract `pageTimeout` and `startTime` correctly.

That's what I also tried but in one case it gives an error that msg.json is undefined. So I believe there is a special calling convention when I want to call this method from inside listener.js itself?
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review115250

> What do these test for?

As you can see we need the contentBrowser to determine if we can navigate backwards. To retrieve it we need a valid tab, and as such we have to check that here. Without a valid tab we simply return.

> I’m not a great fan of the `pendingCommands.push(…)` API, but I guess this works fine for now?
> 
> In a more ideal world we would make _all_ calls through the `AsyncProxyChannel` (`this.listener`) repeat if they failed to execute so that we wouldn’t have to special case `waitForPageLoaded`.
> 
> I don’t know off hand if that is feasible or if we can even detect if a command failed, but it would certainly be possible to assume that all listener messages are prone to failure because the content frame script might disappear at any time.  We could then clear `pendingCommands` or the equivalent every time a response comes through from the listener.
> 
> Anyway, I’m not expecting you to fix that here.  Just explaining that `pendingCommands` is an abomination and we would do well by getting rid of it.

This code path has been used to stay in compat with the get() command for now until we have a properly working solution for bug 1333458. As determined with David it's more immportant to get those commands in place, and maybe find some blockers where my idea for bug 1333458 doesn't work. And in this case I noticed that not all events are getting forwarded to the contentBrowser in the chrome process, which might still force us to listen for all inside the framescript. :/
(In reply to Andreas Tolfsen from comment #20)
> Just curious about waitForPageLoaded/waitForPageUnloaded: I thought you said
> you would do this waiting entirely in chrome space?  That the web progress
> listener would only send the right events if it was attached to the
> <xul:browser> in chrome?

As noticed while working on this bug we seem to not get ALL the necessary events in the chrome process. But I thought it's a good idea to get those commands implemented first to see if we can go ahead with the approach on bug 1333458 or not. Otherwise we would make the change to chrome and notice later it won't work and we have to revert.

> If we were to move this code to chrome, the only thing we would have to
> contact the listener about would be to get the document’s readyState.

This only applies to `waitForPageLoaded` or `pollForReadyState`. Something we definitely need is a way to figure out if an unload was happening. Without that a call to the before mentioned methods is useless. And `waitForPageUnloaded` should run successfully before a remoteness change kills our connection to the framescript.
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review115250

> As you can see we need the contentBrowser to determine if we can navigate backwards. To retrieve it we need a valid tab, and as such we have to check that here. Without a valid tab we simply return.

So I guess we don’t have a valid tab when we’re in chrome context and `this.curBrowser` is set to something that isn’t a content window?  But :1032 asserts that we’re in content context.  Under what other circumstances do we not have a valid tab?

> This code path has been used to stay in compat with the get() command for now until we have a properly working solution for bug 1333458. As determined with David it's more immportant to get those commands in place, and maybe find some blockers where my idea for bug 1333458 doesn't work. And in this case I noticed that not all events are getting forwarded to the contentBrowser in the chrome process, which might still force us to listen for all inside the framescript. :/

That is unfortunate.  The ability to listen in chrome would, I think, save us a lot of work.  Are the events you are missing specific to about: pages?
(In reply to Andreas Tolfsen ‹:ato› from comment #24)
> > As you can see we need the contentBrowser to determine if we can navigate backwards. To retrieve it we need a valid tab, and as such we have to check that here. Without a valid tab we simply return.
> 
> So I guess we don’t have a valid tab when we’re in chrome context and
> `this.curBrowser` is set to something that isn’t a content window?  But
> :1032 asserts that we’re in content context.  Under what other circumstances
> do we not have a valid tab?

This will happen for all non-browser chrome windows. None of them have tabs and as such we would fail. Any kind of navigation should only be possible in a browser (navigator) window. Maybe we need another kind of assert which checks for the type of chrome window and if it's a browser?

> > This code path has been used to stay in compat with the get() command for now until we have a properly working solution for bug 1333458. As determined with David it's more immportant to get those commands in place, and maybe find some blockers where my idea for bug 1333458 doesn't work. And in this case I noticed that not all events are getting forwarded to the contentBrowser in the chrome process, which might still force us to listen for all inside the framescript. :/
> 
> That is unfortunate.  The ability to listen in chrome would, I think, save
> us a lot of work.  Are the events you are missing specific to about: pages?

I cannot fully remember now given that I changed so many things. But in general I would like to delay any of this work to bug 1333458 and just have something simple working for back/forward/refresh now.

I did some more testing today and found the following:

1. The pendingCommands usage is required when one of the commands trigger a page load which causes a remoteness change. Without adding the still required method to this list, it will never return because my current waitForPageLoaded method will be wiped out together with the framescript. Reinstalling the framescript will not trigger it again. This is identical to the get() command.

2. I notice a hang when my promise based waitForPageLoaded method resolves the promise in the content process. After some thinking I believe that I know why. Reason is the same as under 1). Given the reload of the framescript we create a new promise instance in that method. Resolving this doesn't have any impact to the promise we have used before. As such we never resolve the promise in driver.js.

Especially point 2) seems to require me now to change my page load methods to make use of callbacks instead of promises.
Hi Blake, I have a question regarding frame scripts and about reloading them when a remoteness change is happening. So for Marionette we have methods running in the framescript (content process) which check for the page load status while e.g a navigation happens. Registering for events is fine when no remoteness change is happening. What I wonder is how are we reloading framescripts in case of remoteness changes?

In the above example we will have to re-register for all the DOM events and progress listeners. While we are doing that could it be that we are missing events as sent between the remoteness change and our re-registering? Or are framescripts always reloaded before any kind of activity happens on the document, so that we can be sure not to miss events?

We are using the message manager to load the script and with aAllowDelayedLoad set to true.

Thanks.
Flags: needinfo?(mrbkap)
As noticed today navigating through documents with goBack and goForward is kinda complicated for framesets. Especially if you switched to a frame and triggered a navigation inside this frame before. Both commands can not automatically switch to the top frame, or stay at the frame currently selected, because the latter would cause a failure if goBack is called another time which would unload the frameset and navigate to the page as loaded before.

I'm not sure how to fix that at the moment. If polling for readyState doesn't work and we indeed have to listen for events, I would be blocked on the former comment.

David, how important would it be right now to have support for frames? I wonder if we could move this out until we have a better wait for page load algorithm.
Flags: needinfo?(dburns)
(In reply to Henrik Skupin (:whimboo) from comment #27)

> As noticed today navigating through documents with goBack and
> goForward is kinda complicated for framesets. Especially if you
> switched to a frame and triggered a navigation inside this frame
> before. Both commands can not automatically switch to the top frame,
> or stay at the frame currently selected, because the latter would
> cause a failure if goBack is called another time which would unload
> the frameset and navigate to the page as loaded before.

Traversal backwards and forwards in a tab isn’t specific to a frame.
If a frame has navigated, clicking the back button will cause that
frame to navigate back to its previous location.  In other words, the
navigation history for a <xul:browser> is shared amongst the outermost
window and any descendant windows (in <iframe> or <frame>).

Calling the Back/Forward commands are _not_ supposed to _switch_ to
the top frame unless it causes the top-level document to go away.  In
that case we need to “switch” back to the top-most frame, since the
context we were in has disappeared.
(In reply to Andreas Tolfsen ‹:ato› from comment #28)
> Calling the Back/Forward commands are _not_ supposed to _switch_ to
> the top frame unless it causes the top-level document to go away.  In
> that case we need to “switch” back to the top-most frame, since the
> context we were in has disappeared.

And the question is how to detect that, or if that should be done in the test by the user. Whatever behavior we choose has not been added to teh spec yet. See also https://github.com/w3c/webdriver/issues/777#issuecomment-283893980.
My latest patch includes support for all known situations except special URLs like file:// which we even cannot load via `get()` at the moment. So it's hard to create testcases for.

For frames I assumed that the test knows when frame boundaries are crossed and the parent frame or a specific frame has to be switched to. Given that the spec is unclear in that part, I will leave it for now.

Questions remaining:

1) The test_navigation.py defines a new class for back/forward now. For all those tests we let Marionette navigate to the pages we already test in the TestNavigate class. I wonder if we could cut down those loads to only what we need for back/forward, which will save us a lot of time.

2) I was not able to get the new waitForPageLoad/waitForPageUnload methods implemented with Promises because of possible remoteness changes. Under those conditions the original promise is not available anymore, and a hang is caused. Compared to the pollForReadyState this method contains an extra check for the formerly known loaded page. This is necessary because a remoteness change can happen when pagehide has already been sent, or even before. So a simple check for readyState == 'complete' would cause us to return too early and leads to a race. I could integrate that part in `pollForReadyState()` so we have only a single method for this purpose.

Andreas, maybe you can give some feedback? Thanks.
Flags: needinfo?(ato)
Attachment #8838209 - Flags: review?(dburns)
Depends on: 1344863
Depends on: 1344868
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review119578

::: testing/marionette/driver.js:1100
(Diff revision 9)
> +    let parameters = {
> +      // TODO(ato): Bug 1242595
> +      command_id: this.listener.activeMessageId,
> +      lastSeenURL: currentURL,
> +      pageTimeout: this.timeouts.pageLoad,
> +      startTime: new Date().getTime(),

Shouldn’t the start time be at the point `yield this.listener.getCurrentUrl` was called?

::: testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py:48
(Diff revision 9)
> +
> +        super(TestBackForward, self).tearDown()
> +
> +    def run_test(self, test_pages):
> +        for page in test_pages:
> +            if page.get("error"):

Use `has_key("error")` or add `is not None`.

::: testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py:55
(Diff revision 9)
> +                    self.marionette.navigate(page["url"])
> +            else:
> +                self.marionette.navigate(page["url"])
> +            self.assertEqual(page["url"], self.marionette.get_url())
> +
> +        for page in test_pages[1::-1]:

Can you explain what this slice does?

::: testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py:56
(Diff revision 9)
> +            else:
> +                self.marionette.navigate(page["url"])
> +            self.assertEqual(page["url"], self.marionette.get_url())
> +
> +        for page in test_pages[1::-1]:
> +            if page.get("error"):

Same with regards to `has_key`.

::: testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py:64
(Diff revision 9)
> +            else:
> +                self.marionette.go_back()
> +            self.assertEqual(page["url"], self.marionette.get_url())
> +
> +        for page in test_pages[1::]:
> +            if page.get("error"):

`has_key` again.

::: testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py:73
(Diff revision 9)
> +    def test_no_history_items(self):
> +        self.marionette.go_back()
> +        self.marionette.go_forward()

I would probably also check `window.history.length` here to ensure we’re in the correct state.

::: testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py:114
(Diff revision 9)
> +        # We cannot use get_url() to wait until the target page has been loaded,
> +        # because it will return the URL of the top browsing context.
> +        Wait(self.marionette, timeout=self.marionette.timeout.page_load).until(
> +            expected.element_present(*test_element_locator),
> +            message="Target element 'email' has not been found")

You could maybe use `execute_script("return window.location.href")`?

::: testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py:126
(Diff revision 9)
> +        # Go back to the non-frameset page
> +        # TODO: Do we have to switch automatically? I don't think so, because
> +        # in the case of goForward we cannot determine the correct browsing context.
> +        self.marionette.switch_to_parent_frame()

Your TODO here is correct: switching context should be explicit.

::: testing/marionette/listener.js:1123
(Diff revision 9)
> + * @param {Function} trigger
> + *     Callback to execute which triggers a page navigation.
> + * @param {Function} doneCallback
> + *     Callback to execute when the current page has been unloaded.

s/Function/function/g

See https://google.github.io/styleguide/javascriptguide.xml?showone=JavaScript_Types#JavaScript_Types for some good example on how to define API documentation types in JS.

::: testing/marionette/listener.js:1128
(Diff revision 9)
> + * @return {object}
> + *     Dictionary with the following items:
> + *         loading - Boolean flag if a page load will follow.
> + *         lastSeenURL - Last seen URL before the navigation request.
> + *         startTime - Time when the navigation request has been triggered.

Merge this into the `doneCallback` documentation, since this is passed there and not returned from the function.

::: testing/marionette/listener.js:1134
(Diff revision 9)
> + *     Dictionary with the following items:
> + *         loading - Boolean flag if a page load will follow.
> + *         lastSeenURL - Last seen URL before the navigation request.
> + *         startTime - Time when the navigation request has been triggered.
> + */
> +function waitForPageUnloaded(trigger, doneCallback) {

How come this function is implemented with callbacks instead of a promise?  Is it because `goBack`/`goForward` are not wrapped in `dispatch`?

::: testing/marionette/listener.js:1158
(Diff revision 9)
> +        doneCallback({loading: false, lastSeenURL: currentURL});
> +        break;
> +
> +      case "pagehide":
> +      case "unload":
> +        if (event.originalTarget == curContainer.frame.document) {

Probably award yourself a triple `===` here.

::: testing/marionette/listener.js:1184
(Diff revision 9)
> + * @param {Number} command_id
> + *     ID of the currently handled message between the driver and listener.
> + * @param {String} lastSeenURL
> + *     Last URL as seen before the navigation request got triggered.
> + * @param {Number} pageTimeout
> + *     Timeout in seconds the method has to wait for the page being finished loading.
> + * @param {Number} startTime
> + *     Unix timestap when the navitation request got triggred.

Lowercase all primitive types:

s/Number/number/g
s/String/string/g

::: testing/marionette/listener.js:1193
(Diff revision 9)
> + * @param {Number} pageTimeout
> + *     Timeout in seconds the method has to wait for the page being finished loading.
> + * @param {Number} startTime
> + *     Unix timestap when the navitation request got triggred.
> + */
> +function waitForPageLoaded(msg) {

This function is essentially unchanged from the version that is in mozilla-central, yes?

::: testing/marionette/listener.js:1255
(Diff revision 9)
> +  waitForPageUnloaded(() => {
> -  curContainer.frame.history.back();
> +    curContainer.frame.history.back();
> +  }, pageLoadStatus => {

This can be simplified to `waitForPageUnloaded(curContainer.frame.history.back)`.  You don’t need an arrow function here.

::: testing/marionette/listener.js:1275
(Diff revision 9)
> + * @param {Number} command_id
> + *     ID of the currently handled message between the driver and listener.
> + * @param {Number} pageTimeout
> + *     Timeout in seconds the method has to wait for the page being finished loading.

s/Number/number/g

::: testing/marionette/listener.js:1277
(Diff revision 9)
> + * @param {Number} pageTimeout
> + *     Timeout in seconds the method has to wait for the page being finished loading.

I think this is passed in milliseconds?

::: testing/marionette/listener.js:1283
(Diff revision 9)
> +  waitForPageUnloaded(() => {
> -  curContainer.frame.history.forward();
> +    curContainer.frame.history.forward();
> -  sendOk(msg.json.command_id);
> +  }, pageLoadStatus => {

Same here about the arrow function.
Attachment #8838209 - Flags: review?(ato) → review-
(In reply to Henrik Skupin (:whimboo) from comment #32)

> My latest patch includes support for all known situations except
> special URLs like file:// which we even cannot load via `get()` at the
> moment. So it's hard to create testcases for.

When navigating to a file: protocol, my experience is that the document
does load but that we don’t detect when the file: document has
finished loading.  Probably this is because of some missing event?

In any case, it’s fine not to fix that here.  Not supporting file:
doesn’t cause any regressions from the status quo.

> For frames I assumed that the test knows when frame boundaries are
> crossed and the parent frame or a specific frame has to be switched
> to. Given that the spec is unclear in that part, I will leave it for
> now.

I don’t exactly understand what you mean here.  Can you elaborate?

> 1) The test_navigation.py defines a new class for back/forward
> now. For all those tests we let Marionette navigate to the pages we
> already test in the TestNavigate class. I wonder if we could cut down
> those loads to only what we need for back/forward, which will save us
> a lot of time.

I reviewed your new class, TestBackForward, and found it a pleasure to
read.  I think it is a good test in its own right.  I would say that
TestNavigate is useful for keep testing other navigation behaviour, but
that we could probably consider dropping TestNavigate.test_go_forward
and TestNavigate.test_go_back.

> 2) I was not able to get the new waitForPageLoad/waitForPageUnload
> methods implemented with Promises because of possible remoteness
> changes.

I see.  I raised an issue about that before I saw this.

> Under those conditions the original promise is not available anymore,
> and a hang is caused. Compared to the pollForReadyState this method
> contains an extra check for the formerly known loaded page. This is
> necessary because a remoteness change can happen when pagehide has
> already been sent, or even before. So a simple check for readyState ==
> 'complete' would cause us to return too early and leads to a race. I
> could integrate that part in `pollForReadyState()` so we have only a
> single method for this purpose.

I think that would make sense.  This code is complex and I would rather
have as few versions of it as possible.  One, well-maintained function
for polling for document ready state is better than two (or three).
Flags: needinfo?(ato)
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review119578

> Shouldn’t the start time be at the point `yield this.listener.getCurrentUrl` was called?

Please keep in mind that this place is the fallback case when we have to keep the current command active if a remoteness change occured. We do not want to loose this time as what we currently do for get(), and take a time which indeed has a small difference but which is better than using a new one after the remoteness change.

> Use `has_key("error")` or add `is not None`.

It's better to use `"error" in page` here. `has_key` has been removed in Python 3.x, and as best should not be used.

> Can you explain what this slice does?

It reverts the array (the last -1), and then slices everything except the first entry. It's needed because the first go_back will cause the 2nd last test page from the list loaded. But hey, it contained a failure, so it has to actually be `[-2::-1]`.

> You could maybe use `execute_script("return window.location.href")`?

This would not wait until the page has been loaded and cause a race. I will add this to the comment to make it clear.

> How come this function is implemented with callbacks instead of a promise?  Is it because `goBack`/`goForward` are not wrapped in `dispatch`?

Promises don't work in cases when a remoteness change occurs. For details please see bug 1330348 comment 25.

> This function is essentially unchanged from the version that is in mozilla-central, yes?

As we talked via Vidyo yesterday it is mostly a copy of `pollForReadyState` with the only exception that it contains extra code to optionally check if a specific page has been unloaded. I will merge the code into the existing method.

> I think this is passed in milliseconds?

You are right. Thank's for spotting.
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review119578

> This can be simplified to `waitForPageUnloaded(curContainer.frame.history.back)`.  You don’t need an arrow function here.

Actually this does not work. When I pass in the function reference like this I get a JS error:

> JavaScript error: chrome://marionette/content/listener.js, line 1172: TypeError: 'forward' called on an object that does not implement interface History.

Maybe this is because `history.forward` and `history.back` are native code and cannot be passed along that way:

> trigger: function forward() {
>    [native code]
> }

I'm going to revert this.

> Same here about the arrow function.

Dropping here too similar to the other case.
(In reply to Henrik Skupin (:whimboo) from comment #26)
> events as sent between the remoteness change and our re-registering? Or are
> framescripts always reloaded before any kind of activity happens on the
> document, so that we can be sure not to miss events?

I think that framescripts should be registered earlier than any of these events. ni?smaug to verify that.
Flags: needinfo?(mrbkap) → needinfo?(bugs)
(for page load/unload events one really wants to use pageshow/hide, since those work correctly even when page is loaded from bfcache. Bfcache doesn't cause any reload, so no load event, only pageshow)


framescripts aren't ever reloaded. They are loaded when loadFrameScript is called, assuming there is the child side already running. If not, and loadFrameScript is called with aAllowDelayedLoad=true, the script is loaded ASAP the child side is created, before any web page is loaded there.
Though, running the scripts may create the initial about:blank

remoteness change (from a tab being first non-remote and then recreated being remote, or other way round) loads framescripts in the newly created tab, assuming one calls loadFrameScript again somewhere.
(remoteness change recreates nsFrameLoader, so also message manager is recreated).

If one uses global or window level message managers and loadFrameScript and aAllowDelayedLoad=true there, the script will be loaded automatically whenever a new tab is created.
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #44)
> remoteness change (from a tab being first non-remote and then recreated
> being remote, or other way round) loads framescripts in the newly created
> tab, assuming one calls loadFrameScript again somewhere.
> (remoteness change recreates nsFrameLoader, so also message manager is
> recreated).

Which would mean that also all saved states of the former framescript instance are gone, and it will be better to manage all that in the chrome process. 

Something strange which I noticed when I updated the patch lately and excessively tested remoteness changes, in some cases we get a pagehide event with an undefined event.originalTarget. Does it mean the window/document is already destroyed, and the framescript will be also killed very soon? How does it come that another pagehide event is received for the same URL when the remoteness change is done, and the framescript is reloaded? Shouldn't a pagehide event before the remoteness change indicate that the page has been fully unloaded?

> If one uses global or window level message managers and loadFrameScript and
> aAllowDelayedLoad=true there, the script will be loaded automatically
> whenever a new tab is created.

Right, this is what we are doing. With loading I simply meant that a new instance is created for the newly remote tab.
Flags: needinfo?(bugs)
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review119578

> As we talked via Vidyo yesterday it is mostly a copy of `pollForReadyState` with the only exception that it contains extra code to optionally check if a specific page has been unloaded. I will merge the code into the existing method.

I tried to do this but run into other troubles now because when running the back command, the `pollForReadyState` method gets invoked by probably a timer from the first `get` command. Seems like we are leaking timer instances, which now confuses other commands using the shared `pollForReadyState` method.
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review119578

> I tried to do this but run into other troubles now because when running the back command, the `pollForReadyState` method gets invoked by probably a timer from the first `get` command. Seems like we are leaking timer instances, which now confuses other commands using the shared `pollForReadyState` method.

This was actually easy. So in the webprogress listener we abort waiting in case of image documents. But there we only remove the event listener for DOMContentLoaded, and forget about the registered listener itself. So when we have a goBack/goForward which navigates to an image, the still running progress listener hits the same code path and calls `pollForReadyState`. This is in conflict with the call for goBack/goForward. 

I will push a fix with a separate commit.
Blocks: 1344684
as was discussed in the W3C issue, we are going to drop synchronicity from the spec.

In Marionette however it will be good if we can try to wait for the page load, if it looks like we shouldnt then the onus is on the user of Marionette to wait for the page to be properly there.
Flags: needinfo?(dburns)
(In reply to Henrik Skupin (:whimboo) from comment #45)
> (In reply to Olli Pettay [:smaug] from comment #44)
> > remoteness change (from a tab being first non-remote and then recreated
> > being remote, or other way round) loads framescripts in the newly created
> > tab, assuming one calls loadFrameScript again somewhere.
> > (remoteness change recreates nsFrameLoader, so also message manager is
> > recreated).
> 
> Which would mean that also all saved states of the former framescript
> instance are gone, and it will be better to manage all that in the chrome
> process. 

Well of course, since if you change remoteness, you change process and state just don't magically move to another process.

> 
> Something strange which I noticed when I updated the patch lately and
> excessively tested remoteness changes, in some cases we get a pagehide event
> with an undefined event.originalTarget.
Very odd

 Does it mean the window/document is
> already destroyed, and the framescript will be also killed very soon?
window/document lifetime has nothing to do with framescripts. Framescripts are run in a JS scope on top of the web pages. So while new pages are loaded to a tab, framescripts stay there

> How
> does it come that another pagehide event is received for the same URL when
> the remoteness change is done, and the framescript is reloaded?
framescript isn't reloaded. Framescripts may be loaded in new tabs (remoteness effectively kills the old and creates a new tab)

> Shouldn't a
> pagehide event before the remoteness change indicate that the page has been
> fully unloaded?>
I'm not sure we dispatch pagehide properly when doing remoteness change. That is after all an edge case when showing up browser UI in a tab. There has been some work to fix remoteness change properly.
But, you should get an unload event targeted to the TabChildGlobal (the global in which framescript run) when a tab is closing. event.target should be 'this' in the framescript.
Perhaps that helps you?
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #52)
> I'm not sure we dispatch pagehide properly when doing remoteness change.
> That is after all an edge case when showing up browser UI in a tab. There
> has been some work to fix remoteness change properly.

So would it make sense to get a bug filed for that? If it's such an edge-case I don't think that we are going to get this fixed, given that we want to also handle all about: pages as remoteness?

> But, you should get an unload event targeted to the TabChildGlobal (the
> global in which framescript run) when a tab is closing. event.target should
> be 'this' in the framescript.
> Perhaps that helps you?

Not such much for our case. What we are currently thinking about to do for Marionette is to move the page load handling to the chrome scope and do the listening via the progress listener on the browser element. With that we wouldn't suffer from loosing state for specific tabs when a remoteness change occurs. Current investigation is done via bug 1333458.

Thanks so far.
(In reply to Henrik Skupin (:whimboo) from comment #53)
> (In reply to Olli Pettay [:smaug] from comment #52)
> > I'm not sure we dispatch pagehide properly when doing remoteness change.
> > That is after all an edge case when showing up browser UI in a tab. There
> > has been some work to fix remoteness change properly.
> 
> So would it make sense to get a bug filed for that?
There are existing bugs for that, I think. freesamael might know that.

> If it's such an
> edge-case I don't think that we are going to get this fixed, given that we
> want to also handle all about: pages as remoteness?
I'm pretty sure we don't want all about: pages to be remote.
about:preferences as an example. (But I don't know what plans other people have, but making
all about:* remote sounds weird)
Comment on attachment 8845364 [details]
Bug 1330348 - get() command should not automatically focus tab.

https://reviewboard.mozilla.org/r/118564/#review120690

> The spec doesn't say something about a requirement to focus the tab when a navigation
> request is happening.

This is true, but the default focus in the browser is in the address bar, and if the focus happens to be elsewhere as a result of previous interaction steps or accidental user interference, issues such as https://github.com/mozilla/geckodriver/issues/394 happens.

> Also it would break tests when navigation requests are issued for background tabs.

Loading in background tabs is not a use case Marionette supports.
Attachment #8845364 - Flags: review?(ato) → review-
Comment on attachment 8845365 [details]
Bug 1330348 - Make pollForReadyState a shared method for navigation commands.

https://reviewboard.mozilla.org/r/118566/#review120694

::: testing/marionette/listener.js:923
(Diff revision 1)
> +      if (doc.readyState === "complete") {
> -      // document fully loaded
> +        // document fully loaded
> -      if (doc.readyState == "complete") {
> +        cleanupCallback();

I find it very confusing to read this code when the comments come after the fact.  Usually comments explain what is about to happen, which means I have to backtrace to understand what the condition is testing for.
Attachment #8845365 - Flags: review?(ato) → review+
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review119578

> Please keep in mind that this place is the fallback case when we have to keep the current command active if a remoteness change occured. We do not want to loose this time as what we currently do for get(), and take a time which indeed has a small difference but which is better than using a new one after the remoteness change.

This is a small issue, but I feel we should respect the overall timeout and not restart after a remoteness change.  If the remoteness change is slow, that is a bug in Firefox.

> It's better to use `"error" in page` here. `has_key` has been removed in Python 3.x, and as best should not be used.

OK.

> It reverts the array (the last -1), and then slices everything except the first entry. It's needed because the first go_back will cause the 2nd last test page from the list loaded. But hey, it contained a failure, so it has to actually be `[-2::-1]`.

Please add comments about the two slicing uses.

> This was actually easy. So in the webprogress listener we abort waiting in case of image documents. But there we only remove the event listener for DOMContentLoaded, and forget about the registered listener itself. So when we have a goBack/goForward which navigates to an image, the still running progress listener hits the same code path and calls `pollForReadyState`. This is in conflict with the call for goBack/goForward. 
> 
> I will push a fix with a separate commit.

Thanks for looking into that.

> Actually this does not work. When I pass in the function reference like this I get a JS error:
> 
> > JavaScript error: chrome://marionette/content/listener.js, line 1172: TypeError: 'forward' called on an object that does not implement interface History.
> 
> Maybe this is because `history.forward` and `history.back` are native code and cannot be passed along that way:
> 
> > trigger: function forward() {
> >    [native code]
> > }
> 
> I'm going to revert this.

Huh, that’s surprising.  But then this is JS.
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review120698
Attachment #8838209 - Flags: review?(ato) → review+
Comment on attachment 8845364 [details]
Bug 1330348 - get() command should not automatically focus tab.

https://reviewboard.mozilla.org/r/118564/#review120690

The focus issue surely applies to first window we open, and this is a known behavior. But after any navigation attempt the focus should never be left in the location bar. 

Anyway I think it warrants a different bug to maybe revise that decision. So I will delete the commit from this patch series.
Comment on attachment 8845365 [details]
Bug 1330348 - Make pollForReadyState a shared method for navigation commands.

https://reviewboard.mozilla.org/r/118566/#review120694

> I find it very confusing to read this code when the comments come after the fact.  Usually comments explain what is about to happen, which means I have to backtrace to understand what the condition is testing for.

Well, I will revert it.
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review119578

> This is a small issue, but I feel we should respect the overall timeout and not restart after a remoteness change.  If the remoteness change is slow, that is a bug in Firefox.

`getCurrentUrl()` hasn't anything to do with the page load. Why should we pick a start time which is located before some other actions we have to get done before actually triggering the page load? 

Also as said above this is only for remoteness changes. Without this code we currently reset the timeout and start by 0.
Attachment #8845364 - Attachment is obsolete: true
Comment on attachment 8838209 [details]
Bug 1330348 - Make forward- and backward commands synchronous.

https://reviewboard.mozilla.org/r/113166/#review119578

> `getCurrentUrl()` hasn't anything to do with the page load. Why should we pick a start time which is located before some other actions we have to get done before actually triggering the page load? 
> 
> Also as said above this is only for remoteness changes. Without this code we currently reset the timeout and start by 0.

Given that this is already an improvement to what we had before I'm marking this as dropped. We can try to make further improvements in a follow-up bug if really necessary.
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0fbfe6e94cb2
Make pollForReadyState a shared method for navigation commands. r=ato
https://hg.mozilla.org/integration/autoland/rev/e18d3dd20e8d
Make forward- and backward commands synchronous. r=ato
https://hg.mozilla.org/mozilla-central/rev/0fbfe6e94cb2
https://hg.mozilla.org/mozilla-central/rev/e18d3dd20e8d
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Lets uplift this test-only patch to aurora for now. If all works well over the next days I will request other uplifts. Thanks.
Whiteboard: [checkin-needed-aurora]
So far nothing turned out which would require a backout of this patch. We just noticed a couple of intermittent crashes in NSPR and mozglue land, which might get triggered by the new testcases. All have been filed already.

I would like to get this test-only patch uplifted to 53 beta as next step. Thanks.
Whiteboard: [checkin-needed-beta]
Depends on: 1346371
So long all fine and I think it would be great to get this test-only patch uplifted to esr52 and to release if still possible. The feature addition here is a P1 for webdriver compatibility. Thanks.
Whiteboard: [checkin-needed-esr52][checkin-needed-release]
Depends on: 1348227
Depends on: 1346209
Depends on: 1353447
Depends on: 1352132
Blocks: 1337464
Depends on: 1381339
No longer depends on: webdriver-navigate
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: