Closed Bug 865641 Opened 11 years ago Closed 11 years ago

Very first waitForPageLoad() can fail if the startup page takes too long to load

Categories

(Testing Graveyard :: Mozmill, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

(Whiteboard: [mozmill-2.0+][ateamtrack: p=mozmill q=2013q3 m=1])

Attachments

(2 files, 1 obsolete file)

I can see that httpd.js is failing on my Ubuntu 12.10 x64 machine by loading the referenced test file. A 'bad request' gets returned and that's why we fail to retrieve any kind of elements.

$ mozmill -b /mozilla/bin/beta/firefox -t mutt/mutt/tests/js/controller/contextmenu.js
Gtk-Message: Failed to load module "overlay-scrollbar"
TEST-START | contextmenu.js | setupModule
TEST-START | contextmenu.js | testMenu
ERROR | Test Failure | {
  "exception": {
    "stack": [
      "createInstance@resource://mozmill/driver/mozelement.js:45", 
      "ID@resource://mozmill/driver/mozelement.js:57", 
      "testMenu@resource://mozmill/modules/frame.js -> file:///mozilla/code/mozmill/mutt/mutt/tests/js/controller/contextmenu.js:21", 
      "Runner.prototype.wrapper@resource://mozmill/modules/frame.js:634", 
      "Runner.prototype.runTestModule@resource://mozmill/modules/frame.js:704", 
      "Runner.prototype.runTestFile@resource://mozmill/modules/frame.js:602", 
      "runTestFile@resource://mozmill/modules/frame.js:749", 
      "Bridge.prototype._execFunction@resource://jsbridge/modules/Bridge.jsm:140", 
      "Bridge.prototype.execFunction@resource://jsbridge/modules/Bridge.jsm:147", 
      "@resource://jsbridge/modules/Server.jsm:33", 
      ""
    ], 
    "message": "could not find element ID: fname", 
    "fileName": "resource://mozmill/driver/mozelement.js", 
    "name": "Error", 
    "lineNumber": 45
  }
}

This does not happen with Aurora but all other branches. I might have to find a possible regression vs. fix range here. It sounds complicated.

This should be a blocker given that serving local pages is commonly used now.
If I use a quick loading tab (e.g. "about:blank") as a startup page (--app-arg="about:blank"), then the problem goes away.

With this logic, if I put a sleep/waitFor in the test also passes.

And I'm also facing with this issue while playing with bug 872237 on Win7.
OS: Linux → All
I could think of that this bug is related to our broken window initialization code as covered on bug 872237. Szabolcs, do you still see it with the proposed patch on the other bug applied?
Flags: needinfo?(xabolcs)
(In reply to Henrik Skupin (:whimboo) from comment #2)
> I could think of that this bug is related to our broken window
> initialization code as covered on bug 872237. Szabolcs, do you still see it
> with the proposed patch on the other bug applied?

Yes, I still see it with attachment 755849 [details] [diff] [review] applied.
And if I run a manifest like 'mutt/mutt/tests/js/controller/tests.ini',
I got "TypeError: can't access dead object"! after the "fname" failure. :-/

> CRITICAL | Framework Failure | {
>  "message": "[JavaScript Error: \"TypeError: can't access dead object \" 
>  {file: \"resource://mozmill/modules/windows.js\" line: 188]"
> }


See bug 872237 comment #14:
> If I change |if (aWindow.content)| to |if (aWindow.document.readyState == "complete")| near L262, then I'm facing with bug 865641.
Flags: needinfo?(xabolcs)
this might be fixed with bug 872237
Depends on: 872237
Drop my last comment. It will not be fixed by that bug. It's somewhat in our start-up code and that we are starting the test too early and httpd.js is not ready yet.
No longer depends on: 872237
Blocks: 879491
This failure I see very rarely those days. Also it happens only for beta, release, and esr17. Then it fails with a rate of <10%. Can anyone confirm that?

Given that I don't think that we should block 2.0 on it. We might reconsider if someone else can find an always to reproduce testcase.
Assignee: hskupin → nobody
Whiteboard: [mozmill-2.0?] → [mozmill-2.1?]
So good news here. I had an idea. I just did the task in the last minutes and surprisingly it turned out what's wrong here. Thankfully it's NOT httpd.js but our waitForPageLoad() method! It means that it sometimes fails for the very first test, when the initial startup page gets loaded first. We somewhat have an overlap of page load states, so we simply fetch this load state instead of the wanted web page. As result the element we are looking for does not exist. The bug will not happen for any other test for the current session.

We should indeed fix this problem in 2.0 otherwise we risk broken tests depending on if the first test immediately makes use of waitForPageLoad(). I don't think that it will be too hard to fix. 

As reference I will attach a log with all the events and handlings as attachment. There you can see the difference between a pass and a fail. A test which exercises this problem shouldn't be that hard to create.

I will work on this bug tomorrow.
Assignee: nobody → hskupin
Hardware: x86_64 → All
Summary: httpd.js sometimes fails in loading referenced test files → Very first waitForPageLoad() can fail because it waits for the startup page but not the page to be loaded
Whiteboard: [mozmill-2.1?] → [mozmill-2.0?]
Attached file events log
Here you can see in details what's going on:

*** 'load' event: id=3, baseURI=chrome://browser/content/browser.xul
* current map: {"3":{"loaded":true}}
TEST-START | /mozilla/code/mozmill/mutt/mutt/tests/js/controller/contextmenu.js | setupModule
TEST-START | /mozilla/code/mozmill/mutt/mutt/tests/js/controller/contextmenu.js | testMenu
*** controller.open for http://localhost:43336/form.html
*** 'pageshow' event: id=8, baseURI=http://www.mozilla.org/en-US/firefox/23.0/firstrun/
* current map: {"3":{"loaded":true},"8":{"loaded":true}}
* current map: {"3":{"loaded":true},"8":{"loaded":true,"id_load_in_transition":{"name":"","number":"{da97872e-4c17-4cfb-80ed-c4be554755c8}","valid":true}}}
*** Page status updated: id=8, loaded=true, uuid={da97872e-4c17-4cfb-80ed-c4be554755c8}

Right after we call controller.open() the startup page gets finished loading. So we detect this as the wanted pageshow event.

Maybe we want to stop any page loading action for that tab before calling 'this.browserObject.loadURI(url);' in controller.js.
Ok, so this issue has nothing to do with waitForPageLoad but occurs if the startup page takes too long to get loaded. In such a case we have an overlap. But also other controller.open() calls can fail due to that. So we can use browserObject.stop() to stop an actual page load before calling loadURI(). This works fine for me and I'm just doing a sanity check before uploading the patch.
Summary: Very first waitForPageLoad() can fail because it waits for the startup page but not the page to be loaded → Very first waitForPageLoad() can fail if the startup page takes too long to load
Whiteboard: [mozmill-2.0?] → [mozmill-2.0+]
Attached patch Patch v1 (obsolete) — Splinter Review
This patch fixes the problem and also adds a test which uses a very slow loading external website on startup.
Attachment #773862 - Flags: review?(dave.hunt)
Comment on attachment 773862 [details] [diff] [review]
Patch v1

Review of attachment 773862 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. I had one nit and a couple of questions, but I'm okay with this landing as is.

::: mutt/mutt/tests/python/test_slow_pageload_on_startup.py
@@ +7,5 @@
> +
> +import mozmill
> +
> +class TestPageLoadOnStartup(unittest.TestCase):
> +    def test_no_abort_via_client_tests(self):

nit: Empty line before this line

@@ +9,5 @@
> +
> +class TestPageLoadOnStartup(unittest.TestCase):
> +    def test_no_abort_via_client_tests(self):
> +        testpath = os.path.join("js-modules", "testPageLoadOnStartup.js")
> +        self.do_test(testpath, passes=1, fails=0, skips=0)

you can omit fails and skips as the default values are 0

@@ +16,5 @@
> +        testpath = os.path.join(os.path.dirname(os.path.abspath(__file__)),
> +                                relative_test_path)
> +        tests = [{'path': testpath}]
> +
> +        runner_args = {'cmdargs': ['http://www.nbc.com']}

do we want to be relying on an external resource?
Attachment #773862 - Flags: review?(dave.hunt) → review+
(In reply to Dave Hunt (:davehunt) from comment #12)
> @@ +16,5 @@
> > +        testpath = os.path.join(os.path.dirname(os.path.abspath(__file__)),
> > +                                relative_test_path)
> > +        tests = [{'path': testpath}]
> > +
> > +        runner_args = {'cmdargs': ['http://www.nbc.com']}
> 
> do we want to be relying on an external resource?

There is no other way at the moment. We can't slowdown httpd.js that it delivers pages slower. So we have to live with that.

Other nits will be addressed in the next version of the patch.
Attached patch Patch v2Splinter Review
Attachment #773862 - Attachment is obsolete: true
Attachment #773883 - Flags: review+
Landed on master:
https://github.com/mozilla/mozmill/commit/c0277a214788b57282dd5032ba4dcf2f0e1d9159

Given that our page loaded logic for 1.5 is kinda broken, I don't think we should or have to change anything for 1.5.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [mozmill-2.0+] → [mozmill-2.0+][ateamtrack: p=mozmill q=2013q2 m=4]
Whiteboard: [mozmill-2.0+][ateamtrack: p=mozmill q=2013q2 m=4] → [mozmill-2.0+][ateamtrack: p=mozmill q=2013q3 m=1]
Depends on: 895803
Product: Testing → Testing Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: