Open Bug 1333458 (webdriver-navigate) Opened 7 years ago Updated 1 year ago

[meta] Improve page load algorithm of Marionette

Categories

(Remote Protocol :: Marionette, enhancement)

Version 3
enhancement

Tracking

(Not tracked)

People

(Reporter: whimboo, Unassigned)

References

(Depends on 8 open bugs, Blocks 1 open bug)

Details

(Keywords: meta)

There are a couple of situations when a get() request currently causes a hang because we are waiting for a page load event which never occurs. Bug 1322772 simply fixed a case for image documents. Others I will add as dependency to this bug, which are:

* file:// uri's
* XPI URLs
* loading the slow test page from our harness

There might still be others.

Thinking more about all that I'm not sure if a blacklist of possible MIME types etc would make sense. I would say we need a better strategy in figuring out if there is an event or not. I will have to play around with all that the next couple of days.
Also as what I noticed with the slow test page, even if a load event occurs it is not obeyed if happened with a delay like 5s. The listener correctly sends Marionette.ok() but this never reaches the driver code. Maybe we should also finally get bug 1242595 fixed.
Depends on: 1316813
Blocks: 1316813
No longer depends on: 1316813
Mike, maybe you have an idea how we can improve this situation? Thanks.
Flags: needinfo?(mconley)
Redirecting to Mossop, who has graciously offered to take some of my needinfo / review queue!
Flags: needinfo?(mconley) → needinfo?(dtownsend)
The hang on navigation to an .xpi file is https://bugzilla.mozilla.org/show_bug.cgi?id=1331967.
For reference, our current “wait for document to finish loading” algorithm is https://github.com/mozilla/gecko-dev/blob/master/testing/marionette/listener.js#L890-L1084.  We use a combination of the web load listener and event checks for DOMContentLoaded.
(In reply to Andreas Tolfsen ‹:ato› from comment #5)
> For reference, our current “wait for document to finish loading” algorithm
> is
> https://github.com/mozilla/gecko-dev/blob/master/testing/marionette/listener.
> js#L890-L1084.  We use a combination of the web load listener and event
> checks for DOMContentLoaded.

At a guess I'd say you'd see more reliability from just using a web progress listener alone rather than watching for a DOM event.

Where does the code run, in the parent or child process?
Flags: needinfo?(dtownsend) → needinfo?(ato)
Dave, we jump back and forth between chrome and content, but the event listeners we attach at the moment are all done in the framescript.

Would another option like getting rid of the event listener "DOMContentLoaded" and webProgress handling, and simply checking `document.readyState` for the completeness of the load action instead also work? In that case we shouldn't have to worry about which load actions actually require a page load event and which not.
DOMContentLoaded events are not fired for all types of documents, for example if the protocol is javascript:, the loaded document is an ImageDocument, or we navigate by fragment to the same document.  See http://searchfox.org/mozilla-central/source/testing/marionette/navigate.js#15 for some examples.

The URL spec also specifies some ‘special local schemes’ (https://url.spec.whatwg.org/#is-special) which we also ought to handle in the future.  The WebDriver standard was recently updated to include these, and we haven’t made the necessary changes yet.

Checking just `document.readyState` seems brittle: how do you determine _when_ it is safe to check it?  Navigation may not have started yet, and you would end up testing the current document.
Flags: needinfo?(ato)
For fragment navigation you want the hashchange DOM event. For real loads I'd just use a webprogress listener waiting for STATE_IS_WINDOW & STATE_STOP, that should always occur after any potential DOMContentLoaded event.
Thanks Dave! I will have a look into this.
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Blocks: 1337464
Dave, I have two more question in case you can answer them directly. If not I will have to dig into this topic tomorrow.

1) As it looks like file:// URLs are handled differently. I do not see a start nor stop state reported for a window or document when using the webprogress listener. Do you know if there is a specific event we could listen for?

2) I assume that in those cases when we only want to wait until the DOM is ready, we could add a DOMContentLoaded event listener , and escape the webprogress listener when it got fired?
Flags: needinfo?(dtownsend)
(In reply to Henrik Skupin (:whimboo) from comment #11)
> Dave, I have two more question in case you can answer them directly. If not
> I will have to dig into this topic tomorrow.
> 
> 1) As it looks like file:// URLs are handled differently. I do not see a
> start nor stop state reported for a window or document when using the
> webprogress listener. Do you know if there is a specific event we could
> listen for?

These shouldn't behave any differently. Those states are the same ones that we listen to to control the tab busy indicator.

> 2) I assume that in those cases when we only want to wait until the DOM is
> ready, we could add a DOMContentLoaded event listener , and escape the
> webprogress listener when it got fired?

If you know there is a DOM to be ready I guess. I thought that that was what you were having problems with before though.
Flags: needinfo?(dtownsend)
(In reply to Dave Townsend [:mossop] from comment #12)
> > 1) As it looks like file:// URLs are handled differently. I do not see a
> > start nor stop state reported for a window or document when using the
> > webprogress listener. Do you know if there is a specific event we could
> > listen for?
> 
> These shouldn't behave any differently. Those states are the same ones that
> we listen to to control the tab busy indicator.

But I do not see a single invocation of the webprogress listener even not when I use NOTIFY_ALL. I may have to look for the tab busy indicator and the relevant code if I can borrow some of its code.

> > 2) I assume that in those cases when we only want to wait until the DOM is
> > ready, we could add a DOMContentLoaded event listener , and escape the
> > webprogress listener when it got fired?
> 
> If you know there is a DOM to be ready I guess. I thought that that was what
> you were having problems with before though.

For page navigation we have different page load strategies. Those can be found here:
https://w3c.github.io/webdriver/webdriver-spec.html#navigation

If a strategy is set, like "eager" we should not wait for the readyState "complete" but already abort when it reaches "interactive".

Andreas, am I correct here? Not sure if we even implemented that yet - would have to more closely check the current code.
Flags: needinfo?(ato)
> If a strategy is set, like "eager" we should not wait for the readyState "complete" but already abort when it reaches "interactive".

That is correct.  You can read the spec as well as I.
Flags: needinfo?(ato)
Btw. the webprogress listener gets attached inside the framescript by Marionette, while for the tab busy indicator it's all done in chrome scope:

https://dxr.mozilla.org/mozilla-central/source/browser/base/content/tabbrowser.xml#681

Not sure if this could cause different behavior.
Ok, so that is indeed the problem. Attaching the progress listener via the tabbrowser for the current tab makes it work and I get a START and STOP event for ANY of our problematic URLs!
Ok, so it's not related to in which context the progress listener gets added but on which element:

Works: window.gBrowser.addProgressListener()
Fails: window.gBrowser.selectedBrowser.addProgressListener()

Interesting is the following topic on MDN which describes how the tabbrowser is handling event:
https://developer.mozilla.org/en-US/docs/Listening_to_events_in_Firefox_extensions#How_events_are_used_by_the_tabbrowser

It's using an injected browser-status-filter (https://dxr.mozilla.org/mozilla-central/source/toolkit/components/statusfilter/nsBrowserStatusFilter.cpp) which pre-processes events and filters out unnecessary entries. Maybe this is what makes it working when attaching the webprogress listener to the tabbrowser?

Sadly addProgressListener() is not available for Fennec on window.BrowserApp, so it cannot easily be used.
So I was able to also get the START and STOP events for STATE_IS_WINDOW when attaching the progress listener to the browser object itself but by using browser-status-filter. When I try the exact same thing in a framescript, but using the following technique the events are not received:

  let docShell = curContainer.frame
      .document
      .defaultView
      .QueryInterface(Ci.nsIInterfaceRequestor)
      .getInterface(Ci.nsIWebNavigation)
      .QueryInterface(Ci.nsIDocShell);
  let webProgress = docShell.QueryInterface(Ci.nsIInterfaceRequestor)
      .getInterface(Ci.nsIWebProgress);
  webProgress.addProgressListener(..)

So my question now is, would it be better to listen for those events always on the chrome side, or do we simply attach the listener to the wrong object?

Boris, would you mind to give us an advice in how to best listen for those progress events? Thanks.
Flags: needinfo?(bzbarsky)
Blocks: 1145668
So we just discussed this topic in detail in a meeting, and as everyone agreed I will continue to investigate what's necessary to move all the page load code out of listener.js (framescript) into chrome. It will give us actually a lot of bonus points, and will cause a lot of code to be not used anymore. I will try to have a running prototype ready for Firefox and Fennec soon (maybe even tomorrow).

I will keep the ni? for Boris set, so we know if that is an expected behavior or something we have to get fixed.
Summary: Improve logic to determine if a load event is expected → Improve page load strategy of Marionette
Blocks: 1242595
I would expect the code in comment 18 to do the right thing if run in a framescript, if the document involved is the actual content document.

I would expect listening for progress events in _chrome_ to not work for _content_ loads at all.

To the extent that either of these expectations is not matching reality, clearly I don't have a correct mental model of _something_, but I don't know of what.
Flags: needinfo?(bzbarsky)
Blocks: 1330348
Blocks: 1291320
Well, the latest summary change was bad because "page load strategy" is something else, which will be part of the overall redesign of the page load algorithm.
Summary: Improve page load strategy of Marionette → Improve page load algorithm of Marionette
Some points of what we discussed yesterday:

* We want to move the page load detection code out of listener.js (framescript) into the chrome process.

The benefit is that we can have a drastic reduction in the back and forward communication with the framescript. Also first attempts show that we can get rid of lots of code.

The downside is that we need extra calls to the framescript to retrieve information about the content document like baseURI and readyState. But that should not be that cost intensive.

* We can leave our progress listeners attached to all the individual tabs.

With that we can even recognize page loads when Marionette doesn't run any command at the time of a triggered page load. It means that each command can query the info from the progress listener at any time for the current tab. It makes features like on bug 1335778 (page load after click) way easier to implement.
(In reply to Henrik Skupin (:whimboo) from comment #22)
> The downside is that we need extra calls to the framescript to retrieve
> information about the content document like baseURI and readyState. But that
> should not be that cost intensive.

Indeed, and crucially it will work with the synchronous calls through testing/marionette/proxy.js.
Blocks: 1344684
No longer blocks: 1344684
While working on bug 1291320 I have found the solution for all of our problems for page loads. It is working inside the framescript, which means in content scope. And I don't think as of now it warrants further research in how to get it working in chrome scope.

So what is it all about? The get command will now also make use of waitForPageUnloaded, similar to goBack and goForward. The trigger is simply `curContainer.frame.location = url;`. Which means all the cryptic page load handling stuff with event listeners and webprogress listeners can be removed from this method. It's now only 8 lines of code. We can also remove the additionally added files, which gives even more removals.

I have also changed pollForReadyState to make it not polling anymore but actually listening for the correct events which will be `DOMContentLoaded` and `pageshow`. With that we do not run into races by checking `readyState` and returning immediately because it is still `complete`. Timeout handling will just be a single invocation of a timer. Sadly it's a bit more complicated to do, because it turned out that setting the URL this way is a blocking call. As result I have to refactor our current methods. Actually I'm planning to combine both and that they use a global pageLoad object to share the data of the current command.

In regards of remoteness changes it's so that our framescript is initialized again before any events of the web page are getting fired. So the change from above will not result in missing events.

I also have an idea how to improve the dispatching, but this might have to wait a bit more until we can exactly determine if a page load is happening for any command.
Depends on: 1323755
With bug 1291320 we already get a kinda good improvement for the page load algorithm. But it's still in content and not chrome.
No longer blocks: 1291320
Depends on: 1291320
Priority: -- → P1
Blocks: 1366035
Priority: P1 → P3
So far it still doesn't turn out to be neccessary to do another round of larger improvements like moving to the chrome process. I will still keep an eye on it but for now I don't work on it at all.
Assignee: hskupin → nobody
Status: ASSIGNED → NEW
No longer blocks: 1337464
No longer blocks: 1332122
Depends on: 1418273
Depends on: 1421628
Depends on: 1422769
Depends on: 1126362
Alias: webdriver-navigate
Depends on: 1363418
Priority: P3 → P1
Summary: Improve page load algorithm of Marionette → [meta] Improve page load algorithm of Marionette
Depends on: 1287221
Priority: P1 → --
Depends on: 1312679
Depends on: 1400005
Depends on: 1450876
Depends on: 1611857
Depends on: 1612831
Depends on: 1664165
Depends on: 1666755
Depends on: 1675728
Depends on: 1678455
Type: defect → enhancement
Severity: normal → S3
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.