Mozmill tests are failing with timeouts in waitForPageLoad() due to 'about:newtab'

RESOLVED FIXED

Status

P1
normal
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: vladmaniac, Assigned: whimboo)

Tracking

(Depends on: 1 bug)

unspecified

Firefox Tracking Flags

(firefox18 fixed, firefox19 fixed)

Details

(Whiteboard: [mozmill-test-failure][blocked by bug 799433] s=121001 u=failure c=awesomebar p=1, URL)

Attachments

(2 attachments, 5 obsolete attachments)

(Reporter)

Description

6 years ago
This happened today on Firefox 18. 
Mozmill version: 1.5.18 
Platform: Mac OS X 10.8 (x86_64)
Report link: http://mozmill-ci.blargon7.com/#/functional/report/671677a5d9d5ca25f3cf5ae1c4697387

Even if this seems to be just a hiccup, we should at least monitor it.
(Reporter)

Updated

6 years ago
status-firefox18: --- → affected
Whiteboard: [mozmill-test-failure]
(Reporter)

Comment 1

6 years ago
This next failure report shows the exact same error but in multiple tests 
http://mozmill-ci.blargon7.com/#/functional/report/671677a5d9d5ca25f3cf5ae1c46a1fa8
(Reporter)

Comment 2

6 years ago
So this was not a single time, as I suspected, but still intermittent
http://mozmill-ci.blargon7.com/#/functional/report/671677a5d9d5ca25f3cf5ae1c46b0b2b
(Assignee)

Comment 3

6 years ago
Could be that a single testrun was affected by an issue with httpd.js.
OS: Mac OS X → All
Priority: -- → P2
Hardware: x86_64 → All
(Reporter)

Comment 4

6 years ago
(In reply to Henrik Skupin (:whimboo) from comment #3)
> Could be that a single testrun was affected by an issue with httpd.js.

I'm afraid we have to look into this at some point. Builds are still affected, though it's not that often.
This appears to be affecting every Nightly on Mac OS X 10.8 (French locale) since 2012-09-11, and only en-US from 2012-09-11 to 2012-09-12. The timeout is after loading the URL defined by window.BROWSER_NEW_TAB_URL.
(Assignee)

Updated

6 years ago
Whiteboard: [mozmill-test-failure] → [mozmill-test-failure] s=q3 u=failure c=awesomebar p=1
(Reporter)

Comment 6

6 years ago
I will take this. I need a testing environment on Mac OS X 10.8 first, but this will be my P1 for today, unless you guys have other ideas of course.
Assignee: nobody → vlad.mozbugs
Status: NEW → ASSIGNED
(Reporter)

Comment 7

6 years ago
This is very strange.

As a preliminary investigation, I can reproduce the error, but only if I run the whole functional testrun. If I run the test on its own from the command line it works as expected.
The setup module fails with a timeout error in tabs.closeAllTabs(), which closes all tabs and opens about:blank. The test times out there, while opening about:blank, as Dave pointed out in comment 5.
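
For readers unfamiliar with this failure mode, here is a minimal sketch of a polling wait with a timeout. It is a generic illustration only, not Mozmill's implementation of waitForPageLoad(); the names `wait_for`, `WaitTimeout` and `page_never_loads` are hypothetical. If the "page loaded" condition never becomes true, the wait ends with a timeout error like the one in the reports.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes true within the timeout."""


def wait_for(condition, timeout=1.0, interval=0.05):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # Message copied from the failure reported in this bug.
    raise WaitTimeout("Timeout waiting for page loaded.")


def page_never_loads():
    """Hypothetical stand-in for a tab whose load never completes."""
    return False
```

Calling `wait_for(page_never_loads, timeout=0.1)` raises `WaitTimeout`, while a condition that is already true returns immediately.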
(Reporter)

Comment 8

6 years ago
If I change the URL defined by window.BROWSER_NEW_TAB_URL from 'about:newtab' to 'about:blank', or in fact to any webpage, the test works as expected and we have a pass. Therefore, the preliminary conclusion is that there is a problem with about:newtab in this particular case.
(Reporter)

Comment 9

6 years ago
Created attachment 662994 [details] [diff] [review]
disable test v1.0

As we agreed in the 'ask an expert' session today, we are disabling this test for Mac. Here is the patch, tested and internally reviewed by Alex.
Attachment #662994 - Flags: review?(hskupin)
(Reporter)

Updated

6 years ago
Attachment #662994 - Flags: review?(dave.hunt)
The patch was fine for disabling the test for just Mac, however this issue is not just occurring on Mac... Why was it decided to do this? I'll hold off backing out the patch until we have an answer here as I joined the 'ask an expert' session part-way through this conversation.
(Assignee)

Comment 12

6 years ago
Vlad pointed that out when I asked. But you are right, Dave. The given link in the URL field is wrong. Vlad, next time please make sure to select all platforms and all versions.

We should create a new patch which disables the test on all platforms. It should be based not on the last patch but on the original content. Once that patch is ready we will back out the former one and directly land the new skip patch.

I would kinda appreciate that we check for a regression range, because it could be a bug in Firefox.
Keywords: regression, regressionwindow-wanted
(Assignee)

Comment 14

6 years ago
Dave, when landing skip patches please take care of the flags next time.
status-firefox-esr10: --- → affected
status-firefox15: --- → disabled
status-firefox16: --- → disabled
status-firefox17: --- → disabled
status-firefox18: affected → disabled
Whiteboard: [mozmill-test-failure] s=q3 u=failure c=awesomebar p=1 → [mozmill-test-failure][mozmill-test-skipped] s=q3 u=failure c=awesomebar p=1
(Reporter)

Comment 15

6 years ago
Created attachment 663371 [details]
simple testcase to demonstrate when we fail

We fail if we remove all browser history and then access about:newtab.
The failure in testGoButton.js is just a coincidence; it could have been any test at all.
I cannot reproduce this manually, but if I run the simple test within the functional testrun, it fails all the time, at least on Mac OS X 10.8.
(Assignee)

Comment 16

6 years ago
Interesting, so what are your next steps here to get the full details why it fails? There are still some different factors in this testcase which could cause the problem.
(Reporter)

Comment 17

6 years ago
So far I have tried to reproduce this by manually clearing out all the history, but I had no luck. Yesterday I was looking into the Firefox code for clearing history.

It can be related to about:newtab, i.e. a bug in how this feature interacts with clearing history.
Or it can be something in our testing framework, but I have nothing conclusive atm.
(Assignee)

Comment 18

6 years ago
What about reducing the testcase even further or finally starting a hg bisect? Those two things would be the most valuable actions on this bug.
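
As an illustration of why bisecting is so valuable here, the following is a minimal sketch of the binary search that a bisect performs. It is not hg's implementation; `first_bad` and the `is_bad` callback are hypothetical names, and the sketch assumes the first revision is known good, the last is known bad, and every revision after the first bad one is also bad.

```python
def first_bad(revisions, is_bad):
    """Return the first revision for which is_bad() is True.

    Assumes revisions[0] is good, revisions[-1] is bad, and that once a
    revision is bad every later one is bad too.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return revisions[hi]
```

Each step halves the remaining range, so even a range of about a hundred changesets needs only a handful of build-and-test rounds, where `is_bad` would stand for "build this revision and run the failing test".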
(Reporter)

Comment 19

6 years ago
(In reply to Henrik Skupin (:whimboo) from comment #18)
> What about reducing the testcase even further or finally starting a hg
> bisect? Those two things would be the most valuable actions on this bug.

Thanks for the tips. I will start on those
(In reply to Henrik Skupin (:whimboo) from comment #12)
> We should create a new patch which disables all platforms which should not
> be based on the last one but on the original content. Once that patch is
> ready we will backout the formerly one and directly land the new skip patch.

Any progress on this? I think some of our other tests are failing for the same reason. For example, see bug 794400.
(Reporter)

Comment 21

6 years ago
(In reply to Dave Hunt (:davehunt) from comment #20)
> (In reply to Henrik Skupin (:whimboo) from comment #12)
> > We should create a new patch which disables all platforms which should not
> > be based on the last one but on the original content. Once that patch is
> > ready we will backout the formerly one and directly land the new skip patch.
> 
> Any progress on this? I think some of our other tests are failing for the
> same reason. For example, see bug 794400.

Well, there is some progress, but we have not yet found the reason why this is happening.
I should really try the bisect today.
(Assignee)

Updated

6 years ago
Blocks: 794400
(Assignee)

Comment 22

6 years ago
This is now a P1. Thanks Dave for making the bridge to other tests.

(In reply to Maniac Vlad Florin (:vladmaniac) from comment #21)
> I should really try the bisect today

Please do not try to do it. It has to be done today! Make it your top priority please.
Priority: P2 → P1
(Assignee)

Updated

6 years ago
Blocks: 794392
(Reporter)

Comment 23

6 years ago
Are you sure that the failures of the other tests are connected to this one?
I do not think that bug 794392 is related, but bug 794400 fails in the closeAllTabs method, so yes, I think this is related.
(Assignee)

Comment 25

6 years ago
I'm 99% sure that it's related too, because we make use of closeAllTabs() for each iteration of the endurance test. I don't know of any other failure we currently have which is connected to waitForPageLoad().
Fair enough, so it's really important that we get this fixed!
(Reporter)

Comment 27

6 years ago
As I said earlier on IRC, this is happening only within the functional testrun. It does not happen when running the test manually via the command line, so I am tempted to assume we have something wrong there. I was bisecting mozilla-central without any luck in finding a bad Firefox changeset.
(Assignee)

Comment 28

6 years ago
I really don't understand your latest statement Vlad. You have attached a simplified testcase which also showed the problem without having to run the whole tests. So why are you stepping back again? If it's clearly reproducible with the test why don't you use it for the regression test?
(Assignee)

Comment 29

6 years ago
Comment on attachment 663371 [details]
simple testcase to demonstrate when we fail

So the problem here is that we are cloning the repository into a temporary location like:

"/var/folders/wd/zmy4z7xn7wd7sjq90z1y52f80000gn/T/tmpQ6YQZ2.mozmill-tests".

So something is not working when the tests are located there. Not sure yet if it is a failure in httpd.js or Firefox itself. Will check tomorrow morning.
Attachment #663371 - Attachment is obsolete: true
(Assignee)

Comment 30

6 years ago
So here are the notes on what I did:

1. I have checked how we call Mozmill from our functional testrun:

http://hg.mozilla.org/qa/mozmill-automation/file/416592141962/libs/testrun.py#l340

2. The only item which made me wonder was the 'self._mozmill.tests' property. All others shouldn't be involved. So I have updated it to not use the cloned repository in the tmp folder but my default one under /data/code/mozmill-tests/nightly -> that was working

3. Execute a 'mkdir /private/var/folders/wd/zmy4z7xn7wd7sjq90z1y52f80000gn/T/tmpQ6YQZ2.mozmill-tests/' and copy all the mozmill-tests files into that folder.

4. Run 'mozmill -b %path% -t /private/var/folders/wd/zmy4z7xn7wd7sjq90z1y52f80000gn/T/tmpQ6YQZ2.mozmill-tests/tests/functional/' -> same failure happens.

5. It does not happen when you only run the awesomebar tests. So one of the addon tests is causing this problem.

Vlad, please continue the investigation today with the information from above.
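
Steps 3 and 4 above can be sketched as a small script. This is a rough sketch only: `stage_tests` and `mozmill_command` are hypothetical helper names, the checkout location is whatever you pass in, and only the `-b` and `-t` options quoted in step 4 are used. `dirs_exist_ok` requires Python 3.8 or newer.

```python
import os
import shutil
import tempfile


def stage_tests(checkout_dir):
    """Copy a mozmill-tests checkout into a temp folder named like the one
    the testrun creates (tmpXXXXXX.mozmill-tests)."""
    target = tempfile.mkdtemp(suffix=".mozmill-tests")
    # mkdtemp already created the folder, so allow copying into it.
    shutil.copytree(checkout_dir, target, dirs_exist_ok=True)
    return target


def mozmill_command(binary, tests_root):
    """Build the command line from step 4: mozmill -b <binary> -t <tests>."""
    functional = os.path.join(tests_root, "tests", "functional")
    return ["mozmill", "-b", binary, "-t", functional]
```

The returned list can be passed to `subprocess.call()` on a machine where Mozmill is installed; the sketch itself only stages the files and builds the command.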
(Assignee)

Updated

6 years ago
Blocks: 794750
(In reply to Henrik Skupin (:whimboo) from comment #30)
> 5. It does not happen when you only run the awesomebar tests. So one of the
> addon tests is causing this problem.
> 
> Vlad, please continue the investigation today with the information from
> above.
I ran with manifest files today, but can't reproduce that failure when running with
>[include:testAddons/manifest.ini]
>[include:testAwesomeBar/manifest.ini]
(Reporter)

Comment 32

6 years ago
(In reply to Henrik Skupin (:whimboo) from comment #30)
> So here the notes what I did:
> 
> 1. I have checked how we call Mozmill from our functional testrun:
> 
> http://hg.mozilla.org/qa/mozmill-automation/file/416592141962/libs/testrun.
> py#l340
> 
> 2. The only item which made me wonder was the 'self._mozmill.tests'
> property. All others shouldn't be involved. So I have updated it to not use
> the cloned repository in the tmp folder but my default one under
> /data/code/mozmill-tests/nightly -> that was working
> 
> 3. Execute a 'mkdir
> /private/var/folders/wd/zmy4z7xn7wd7sjq90z1y52f80000gn/T/tmpQ6YQZ2.mozmill-
> tests/' and copy all the mozmill-tests file in that folder.
> 
> 4. Run 'mozmill -b %path% -t
> /private/var/folders/wd/zmy4z7xn7wd7sjq90z1y52f80000gn/T/tmpQ6YQZ2.mozmill-
> tests/tests/functional/ -> same failure happens.
> 
> 5. It does not happen when you only run the awesomebar tests. So one of the
> addon tests is causing this problem.
> 

There is no need to do all that. Just have a test which uses closeAllTabs run before the simple testcase, and it will reproduce in the command line environment.
I was not able to build a testcase in a single file; we need two files to be run for this to reproduce.
The final scenario will be:
1. have the simple test file and, e.g., testManagerKeyboardShortcut in the same folder
2. run the folder with mozmill -t on the command line

It will fail, at least it does for me. It's very strange that if we put another test there instead of testManagerKeyboardShortcut, it will pass locally. If we use the testrun script, for example, it is likely to fail frequently; you just need to have two or more tests in the folder, both making use of tabs.closeAllTabs() in setupModule.

Both Alex and I were looking into this one today, but sadly we need more time as nothing shows up.

We also tried to do a hg bisect but we got no Firefox changeset as bad.
(Assignee)

Updated

6 years ago
Whiteboard: [mozmill-test-failure][mozmill-test-skipped] s=q3 u=failure c=awesomebar p=1 → [mozmill-test-failure][mozmill-test-skipped] s=121001 u=failure c=awesomebar p=1
(Reporter)

Comment 33

6 years ago
Created attachment 667875 [details]
simple test 1

This is the first simple test file. 
Sorry to say it contains a dependency, 'prefs', because we need to set a pref when closing all tabs. I decided to leave it there as an exception to our simple testcases rule, because otherwise it would complicate the code in the test, and setting a pref does not have anything to do with our error.

I've tested it on other machines also and it does not reproduce on all of them.
I wanted to create a screencast, and the strange thing is that it does not reproduce while recording the screencast, only outside of it. This is new information but I can't explain why atm.
(Reporter)

Comment 34

6 years ago
Created attachment 667876 [details]
simple test 2

* this is the 2nd simple test

Please run both tests in the same run if you want to try to reproduce the failure.
They should go in a folder under the tests/functional folder.
(Assignee)

Comment 35

6 years ago
(In reply to Maniac Vlad Florin (:vladmaniac) from comment #33)
> Sorry to say it would contain a dependency, 'prefs' because we need to set a
> pref when closing all tabs. I decided to leave it there as an exception to

Please use the Services.jsm module to handle setting/getting prefs.

Also please combine both in a patch which makes it easier for us to run. Thanks!
(Reporter)

Comment 36

6 years ago
(In reply to Henrik Skupin (:whimboo) from comment #35)
> (In reply to Maniac Vlad Florin (:vladmaniac) from comment #33)
> > Sorry to say it would contain a dependency, 'prefs' because we need to set a
> > pref when closing all tabs. I decided to leave it there as an exception to
> 
> Please use the Services.jsm module to handle setting/getting prefs.
> 
> Also please combine both in a patch which makes it easier for us to run.
> Thanks!

Oki doki, on it!
(Assignee)

Updated

6 years ago
Blocks: 760411
(Assignee)

Updated

6 years ago
No longer blocks: 760411
(Reporter)

Comment 37

6 years ago
Reproduced accidentally also on Windows 7
http://mozmill-crowd.blargon7.com/#/functional/report/d11b1de413a0179d904e737230cf6ca5

but this is intermittent so I do not think this is mac dependent at all.
(Assignee)

Comment 38

6 years ago
So what about the minimized testcase? We are still waiting for it here. Also what are the results of my proposal I gave you after the Ask an Expert session? I haven't seen an update since then.
(Reporter)

Comment 39

6 years ago
(In reply to Henrik Skupin (:whimboo) from comment #38)
> So what about the minimized testcase? We are still waiting for it here. Also
> what are the results of my proposal I gave you after the Ask an Expert
> session? I haven't seen an update since then.

On Friday I was trying to set up the prerequisites for building Firefox on Mac and had some issues with Xcode. I was building Firefox with hg bisect, using a divide-and-conquer approach to reduce the regression range to a single changeset. During the weekend there was probably a blackout and the PC restarted, so the minimized testcase is useless now because I can no longer reproduce the failure on my Mac box. I tried to, then I investigated another issue today and found out that it happens again on Win 7. A Firefox build takes 5-6 hours for me, so I could not possibly be fast with this one...
(Assignee)

Comment 40

6 years ago
You forgot the details from the meeting. As I have said it happens each day on the Linux VM we are running for Mozmill CI. So while this box is not utilized we can make sure to use it for testing.
(Reporter)

Comment 41

6 years ago
Created attachment 669445 [details] [diff] [review]
simple test patch v 1.0

I have updated the simplified testcases
No dependencies now
No manifests in the patch because we will not check this in
Attachment #667875 - Attachment is obsolete: true
Attachment #667876 - Attachment is obsolete: true
(Reporter)

Comment 42

6 years ago
Created attachment 669447 [details] [diff] [review]
simple test patch v1.1

Just realized that test1 has no blank line at the end of the file, not sure why because locally it had one.
Just fixed that in version 1.1.
Attachment #669445 - Attachment is obsolete: true
(Reporter)

Comment 43

6 years ago
seems the issue is still there, but I cannot see it locally. strange. hope it will be ok judging that it won't be checked in
(Assignee)

Comment 44

6 years ago
Created attachment 669451 [details]
minimized testcase
Attachment #669447 - Attachment is obsolete: true
(Assignee)

Comment 45

6 years ago
Thankfully tinderbox builds are still available. I will use those to nail down the regression range even further.
QA Contact: hskupin
(Assignee)

Comment 46

6 years ago
Regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=12dad118c02f&tochange=8b46964e55c9

Let's see if builds from fxteam are around. If not, I highly suspect bug 762094 to be the cause here.
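
For anyone who wants to continue from this range locally, here is a small sketch that pulls the two changesets out of the pushlog URL above and turns them into the usual `hg bisect` start-up commands. The helper names `regression_range` and `bisect_commands` are hypothetical, and the sketch assumes that `fromchange` is the last known good changeset and `tochange` the first known bad one.

```python
from urllib.parse import parse_qs, urlparse

PUSHLOG_URL = ("http://hg.mozilla.org/mozilla-central/pushloghtml"
               "?fromchange=12dad118c02f&tochange=8b46964e55c9")


def regression_range(pushlog_url):
    """Return (fromchange, tochange) from a pushlog URL's query string."""
    query = parse_qs(urlparse(pushlog_url).query)
    return query["fromchange"][0], query["tochange"][0]


def bisect_commands(good, bad):
    """Commands that start an hg bisect between a good and a bad changeset."""
    return [
        "hg bisect --reset",
        "hg bisect --good {}".format(good),
        "hg bisect --bad {}".format(bad),
    ]


if __name__ == "__main__":
    for command in bisect_commands(*regression_range(PUSHLOG_URL)):
        print(command)
```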
(Assignee)

Comment 47

6 years ago
fxteam builds were still around. So it's indeed a regression by bug 762094. I will file a new Firefox bug for it so we can hopefully get this addressed asap.
(Assignee)

Updated

6 years ago
Depends on: 799433
(Assignee)

Updated

6 years ago
Keywords: regression, regressionwindow-wanted
Whiteboard: [mozmill-test-failure][mozmill-test-skipped] s=121001 u=failure c=awesomebar p=1 → [mozmill-test-failure][mozmill-test-skipped][blocked by bug 799433] s=121001 u=failure c=awesomebar p=1
(Assignee)

Updated

6 years ago
Assignee: vlad.mozbugs → nobody
Summary: Mozmill test failure /testAwesomeBar/testGoButton.js | controller.waitForPageLoad(): Timeout waiting for page loaded. → Mozmill tests are failing with timeouts in waitForPageLoad() due to 'about:newtab'
(Assignee)

Updated

6 years ago
Depends on: 764782
(Assignee)

Updated

6 years ago
No longer blocks: 794750
(Assignee)

Comment 48

6 years ago
Not sure why testGoButton.js has been disabled on OS X for all the branches. Only 18.0 and 19.0 were affected.

Backed out the patch across branches:
http://hg.mozilla.org/qa/mozmill-tests/rev/26a730907ac6 (default)
http://hg.mozilla.org/qa/mozmill-tests/rev/fbb13d16bbc4 (aurora)
http://hg.mozilla.org/qa/mozmill-tests/rev/0f71274296f8 (beta)
http://hg.mozilla.org/qa/mozmill-tests/rev/9377b609fe95 (release)
http://hg.mozilla.org/qa/mozmill-tests/rev/005fe5bc4930 (esr10)

If there are still waitForPageLoad() failures in the next few days, please feel free to reopen. Vlad, would you mind re-enabling the Litmus test for Firefox 10? Thanks.
Assignee: nobody → hskupin
Blocks: 794750
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
status-firefox-esr10: affected → ---
status-firefox15: disabled → ---
status-firefox16: disabled → ---
status-firefox17: disabled → ---
status-firefox18: disabled → fixed
status-firefox19: --- → fixed
Flags: in-litmus?(vlad.mozbugs)
Resolution: --- → FIXED
Whiteboard: [mozmill-test-failure][mozmill-test-skipped][blocked by bug 799433] s=121001 u=failure c=awesomebar p=1 → [mozmill-test-failure][blocked by bug 799433] s=121001 u=failure c=awesomebar p=1
(Assignee)

Updated

6 years ago
No longer blocks: 794400

Comment 49

6 years ago
Litmus no longer available – page gets redirected to MozTrap; no existing test cases in MozTrap yet (nothing to enable)
(Assignee)

Updated

6 years ago
Flags: in-litmus?(vlad.mozbugs)