Open Bug 1450876 Opened 4 years ago Updated 12 days ago

click and page load aborts too early if target URL loads too slowly (delayed pagehide event)

Categories

(Testing :: Marionette, defect, P3)

defect

Tracking

(firefox-esr60 wontfix, firefox62 affected, firefox63 affected)

Tracking Status
firefox-esr60 --- wontfix
firefox62 --- affected
firefox63 --- affected

People

(Reporter: whimboo, Unassigned)

References

(Depends on 1 open bug, Blocks 3 open bugs, )

Details

Attachments

(2 files)

Originally reported as: https://github.com/mozilla/geckodriver/issues/1236

If the target page is loading slowly and exceeds our hard-coded 5s timeout for `pagehide` the load listener will be canceled. This is unfortunate given that `pagehide` seem to get fired when the first data from the target page arrives. So our current logic doesn't make that much sense, and we should re-think in how to improve the situation. I might find some time this week to have a look at.

Here an example Marionette test which demonstrates the issue and is not covered by any unit test at the moment:

>     def test(self):
>         target = self.marionette.absolute_url("slow?delay=10")
>         self.marionette.navigate(self.inline("""<a id="link" href="{}">Slow</a>""".format(target)))
>         link = self.marionette.find_element(By.ID, "link")
>         link.click()
>         self.marionette.find_element(By.ID, "delay")
Summary: click and page load aborts too early if target URL loads too slowly → click and page load aborts too early if target URL loads too slowly (delayed pagehide event)
Sounds like a conformance issue.
Priority: P3 → P2
Sorry, that this bug was lost in my queue. I just had a quick look and found the following...

Given that `beforeunload` handlers are not allowed by the WebDriver specification for all the navigation and close window commands, and we have that implemented since bug 1434872, it should be easier now to simplify the page load logic.

Means when we detect an `onbeforeunload` event we can assume that a page load is undergoing, and as such directly wait for the page being finished loading. There shouldn't be a need to wait for the `pagehide` event too anymore.

I will see if I can fix this today.
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Depends on: 1434872
Priority: P2 → P1
Here are the try results:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=51b78995fb9751d828dc0cb5210b1a39a93a7c5d&selectedJob=191060385

It looks pretty good except that `test_click_link_install_addon` is failing now. Interestingly it was only working because we had this extra timeout of 5s included, but which should not have made it work at the first place. Basically installing an extension should return from the page load event with the same logic as for unknown MIME types which is bug 1366035.

Dave, I feel that fixing this bug has a high priority, even when we will have a regression on our side for clicking a link which installs an addon. I wonder if you could live with a regression in that area until bug 1366035 has been fixed?
Flags: needinfo?(dave.hunt)
I believe there's an AMO test that depends on installing an add-on from a link. I don't know which version of Firefox they target. Are you anticipating that this regression would only affect Nightly? Benny: Could you comment on the AMO test(s)?
Flags: needinfo?(dave.hunt) → needinfo?(bennyjr169)
We have 1 install test for AMO, but we also test installation on the Discovery Pane. We test both on nightly specifically too.
Flags: needinfo?(bennyjr169)
Ok, so it looks like we cannot fix this bug until bug 1366035 has been properly fixed.
Assignee: hskupin → nobody
Status: ASSIGNED → NEW
Depends on: 1366035
Priority: P1 → P3

My patch on bug 1612831 will fix this.

Depends on: 1612831
No longer depends on: 1366035, 1434872

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #9)

My patch on bug 1612831 will fix this.

Note that I actually missed to fix it in that other bug. So lets do it now.

Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Priority: P3 → P1

Currently the command only waits 5s for the target page to
be loaded. This needs to be extended to the currently set
page timeout.

Depends on D90413

I forgot about the install_addon case when writing this patch. Extending the timeout to the set page timeout will actually cause a timeout in those cases. See my comment 5 from 2 years ago. That said we need an early return from navigation when such an install notification pops-up. I wonder how best detect it. Gijs, do you know if there is some kind of event that gets fired for newly opened notifications (doorhanger?)?

Flags: needinfo?(gijskruitbosch+bugs)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #12)

I forgot about the install_addon case when writing this patch. Extending the timeout to the set page timeout will actually cause a timeout in those cases. See my comment 5 from 2 years ago. That said we need an early return from navigation when such an install notification pops-up. I wonder how best detect it. Gijs, do you know if there is some kind of event that gets fired for newly opened notifications (doorhanger?)?

There will be a "popupshowing" and "popupshown" event fired on the doorhanger in question. The event bubbles so you can listen for it on the document root and then check the event target to see what is being shown. You probably don't want to do whatever it is you're trying to do for e.g. tooltip and context menus showing up.

In fact, I also don't understand why you need to return from navigation when a doorhanger pops up... the comments here don't really explain - it sounds to me like the test is just broken and should be fixed, not that navigations should always be canceled by doorhangers. If doorhangers were to always cancel navigation, you'd have lots of other problems (e.g. every time we show a doorhanger, even for completely different things ("we blocked a tracker!", "you can use reader mode here!", "you bookmarked a thing", "you opened a panel while the navigation was happening"), navigation would break).

Flags: needinfo?(gijskruitbosch+bugs)

The problem is that when navigating to URLs with MIME types not handled directly in the tab Marionette currently times out (see bug 1366035), because we only see the beforeunload and unload events fired. There is no other event that tells us (for our current navigation listener implementation) that the navigation is done. Maybe not checking for DOMContentLoaded and pageshow but using a webprogress listener (bug 1664165) would help? Or is there another way to better detect such situations?

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #14)

The problem is that when navigating to URLs with MIME types not handled directly in the tab

Can you give a concrete way of reproducing this / example of what you mean?

Marionette currently times out (see bug 1366035), because we only see the beforeunload and unload events fired.

This doesn't really make sense - if something doesn't load in the tab, it shouldn't trip an "unload" event, because the current document should stay loaded.

There is no other event that tells us (for our current navigation listener implementation) that the navigation is done. Maybe not checking for DOMContentLoaded and pageshow but using a webprogress listener (bug 1664165) would help? Or is there another way to better detect such situations?

Putting more of this in the parent process using webprogress listeners does seem like it'd be a good idea regardless.

What would marionette actually want to happen when an attempt to navigate does not result in the tab loading a new page (instead e.g. opening an external app via a protocol, or a file type we can't handle, or a file type we handle by offering to install an add-on, or whatever else we support) ?

Flags: needinfo?(hskupin)

(In reply to :Gijs (he/him) from comment #15)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #14)

The problem is that when navigating to URLs with MIME types not handled directly in the tab

Can you give a concrete way of reproducing this / example of what you mean?

There is the following test for installing an addon by clicking a link:

https://searchfox.org/mozilla-central/rev/89d33e1c3b0a57a9377b4815c2f4b58d933b7c32/testing/marionette/harness/marionette_harness/tests/unit/test_click.py#438-444

Here appropriate log output:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=316700896&repo=mozilla-central&lineNumber=7448-7457

Marionette currently times out (see bug 1366035), because we only see the beforeunload and unload events fired.

This doesn't really make sense - if something doesn't load in the tab, it shouldn't trip an "unload" event, because the current document should stay loaded.

Sorry, that was my fault. We indeed don't see a unload event but only beforeunload. See the above referenced log output.

There is no other event that tells us (for our current navigation listener implementation) that the navigation is done. Maybe not checking for DOMContentLoaded and pageshow but using a webprogress listener (bug 1664165) would help? Or is there another way to better detect such situations?

Putting more of this in the parent process using webprogress listeners does seem like it'd be a good idea regardless.

What would marionette actually want to happen when an attempt to navigate does not result in the tab loading a new page (instead e.g. opening an external app via a protocol, or a file type we can't handle, or a file type we handle by offering to install an add-on, or whatever else we support) ?

For unknown file types that we cannot handle we might have to install an override handler, right? When we hit that the navigation should stop. But when installing an add-on there is no such dialog that we can set a handler for, but the notifications.

I assume the above situations might be similar to when a beforeunload event gets rejected and the navigation isn't going to happen. In such a case we might also have to stop waiting. The Webdriver spec defines the following to wait for the navigation to complete:

https://w3c.github.io/webdriver/#dfn-waiting-for-the-navigation-to-complete

But the above cases aren't covered here and would need special treatment for Firefox specific behavior.

Not sure if it would make sense to set a timeout for unload and if that doesn't happen within a specific time also return from the navigation request. It feels a bit racy to me similar to what we do with beforeunload.

Flags: needinfo?(hskupin)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #16)

For unknown file types that we cannot handle we might have to install an override handler, right?

I don't know; I think you may be able to deal with it using just web progress listeners. But that'd need some testing.

When we hit that the navigation should stop. But when installing an add-on there is no such dialog that we can set a handler for, but the notifications.

Yeah.

I assume the above situations might be similar to when a beforeunload event gets rejected and the navigation isn't going to happen. In such a case we might also have to stop waiting.

Yes. I'd expect that webprogress listeners should notify you of a start and then stop of a load (or perhaps never even register the start, but either way, state should be consistent).

The Webdriver spec defines the following to wait for the navigation to complete:

https://w3c.github.io/webdriver/#dfn-waiting-for-the-navigation-to-complete

But the above cases aren't covered here and would need special treatment for Firefox specific behavior.

I mean, other browsers also support downloading files, right? The HTML spec lists what happens at https://html.spec.whatwg.org/#process-a-navigate-response . I don't know how webdriver is supposed to handle this.

I'm currently not working on this bug. Also given the reply from Gijs it might be good to wait until we make use of a webprogress listener for tracking an active navigation. See bug 1664165.

Assignee: hskupin → nobody
Status: ASSIGNED → NEW
Depends on: 1664165
Priority: P1 → P3
You need to log in before you can comment on or make changes to this bug.