Closed Bug 614973 Opened 14 years ago Closed 13 years ago

Expression is NULL failure in testPasswordNotSaved

Categories

(Mozilla QA Graveyard :: Mozmill Tests, defect)

All
Linux
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: u279076, Assigned: u279076)

References

()

Details

(Whiteboard: [mozmill-test-failure][mozmill-branch-fail][MozmillTestday][fixed-by-637138])

Attachments

(2 files, 8 obsolete files)

MODULE: testPasswordManager/testPasswordNotSaved.js
TEST: testPasswordNotificationBar
ERROR: Expression "{"value":"password-save"}" returned null. Anonymous == false
BRANCH: mozilla1.9.1
PLATFORM: Linux 10.04
Blocks: 604718
Whiteboard: [mozmill-test-failure][mozmill-branch-fail][linux-only]
Assignee: nobody → anthony.s.hughes
OS: All → Linux
Whiteboard: [mozmill-test-failure][mozmill-branch-fail][linux-only] → [mozmill-test-failure][mozmill-branch-fail][MozmillTestday]
Summary: Expression is NULL failure in testPassswordNotSaved → Expression is NULL failure in testPasswordNotSaved
Attached patch Patch v1 (mozilla1.9.1) (obsolete) — Splinter Review
Seems like the failure is actually in testPopupBlocked.  We are incorrectly looking for password-save instead of popup-blocked notification bar.  Here is a patch which fixes this.
Attachment #493418 - Flags: review?(aaron.train)
Comment on attachment 493418 [details] [diff] [review]
Patch v1 (mozilla1.9.1)

Strange. Thanks.
Attachment #493418 - Flags: review?(aaron.train) → review+
(In reply to comment #1)
> Created attachment 493418 [details] [diff] [review]
> Patch v1 (mozilla1.9.1)

Landed:
http://hg.mozilla.org/qa/mozmill-tests/rev/5d042b4155b4 [mozilla1.9.1]
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Looks like this is still happening and is being reported on mozilla1.9.2 and mozilla1.9.1 now.  The fix I applied in comment 3 was needed but was not the cause of the issue, apparently.

Reopening to investigate what is going on with the password notification bar.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Again, this is linux-only.  I've attached a URL to the brasstacks report showing the failures.
After several attempts, I am unable to reproduce this locally.  I'll try investigating using qa-set.
I'm unable to reproduce this on qa-set either.  Henrik, is it possible this is a problem with QA-Horus? Perhaps it needs to be restarted? I seem to remember seeing odd issues before in daily-testruns where restarting the VMs and the box fixed it.
Hm, this Linux VM was the only VM which was made visible on this machine, and there was a shell window. I have closed the window and minimized the VM. Lets see if this has changed anything in todays testrun.
Still failing on that VM and nowhere else.  Seems it is failing on the Linux32 VM only, not the Linux64 VM.  In fact, I can't find this failure at all going back 10 days on the Linux64 VM -- only the Linux32 VM.  This makes me think something is wrong with the VM itself.  Any chance we can restart the VM and see if it self-corrects itself?
Go ahead. Simply check tomorrows results to proof.
(In reply to comment #10)
> Go ahead. Simply check tomorrows results to proof.

Restarting the VM seems to have cleared the failure on 3.6 but it is still occurring on 3.5. I'll have to investigate this locally on qa-horus.
Two things I've noticed while investigating the VMs:

Ubuntu64 - I'm seeing a JSBridge component failure when launching Mozmill:
jsbridge: Exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIServerSocket.init]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: file:///tmp/tmp8b_mWm.mozrunner/extensions/jsbridge@mozilla.com/resource/modules/server.js :: anonymous :: line 279"  data: no]

Ubuntu32 - I'm seeing a setuptools error when launching Mozmill:
/usr/local/bin/mozmill:5: UserWarning: Unbuilt egg for setuptools [unknown version] (/usr/lib/python2.6/dist-packages)
  from pkg_resources import load_entry_point

I have no idea if these impact the reliability of the tests, but I don't see these errors on the qa-set VMs or my local VMs.
(In reply to comment #12)
> Ubuntu64 - I'm seeing a JSBridge component failure when launching Mozmill:
> jsbridge: Exception: [Exception... "Component returned failure code: 0x80004005
> (NS_ERROR_FAILURE) [nsIServerSocket.init]"  nsresult: "0x80004005

there is a running instance. Simply kill all jsbridge processes or the browser, if it is still open.

> Ubuntu32 - I'm seeing a setuptools error when launching Mozmill:
> /usr/local/bin/mozmill:5: UserWarning: Unbuilt egg for setuptools [unknown
> version] (/usr/lib/python2.6/dist-packages)
>   from pkg_resources import load_entry_point
> 
> I have no idea if these impact the reliability of the tests, but I don't see
> these errors on the qa-set VMs or my local VMs.

I will look at this.
Status: REOPENED → ASSIGNED
> Ubuntu32 - I'm seeing a setuptools error when launching Mozmill:
> /usr/local/bin/mozmill:5: UserWarning: Unbuilt egg for setuptools [unknown
> version] (/usr/lib/python2.6/dist-packages)
>   from pkg_resources import load_entry_point
> 
> I have no idea if these impact the reliability of the tests, but I don't see
> these errors on the qa-set VMs or my local VMs.

> I will look at this.

I fixed this error...running a daily testrun now to see if it changes our brasstacks results.
Well, that didn't fix this failure.  Continuing to investigate...
Status: ASSIGNED → NEW
Ok, I have something interesting.  

If I execute a manual testrun, the test passes:
mozmill -b <build> -t ./firefox/

If I execute a daily testrun, the test fails:
testrun_general.py <build>

Any ideas, Henrik?
I recall that the order of tests is somewhat mixed-up on Linux. Can you please check if that happens here?
(In reply to comment #17)
> I recall that the order of tests is somewhat mixed-up on Linux. Can you please
> check if that happens here?

testNavigateFTP happens before testPasswordNotSaved, same as on other platforms.
(In reply to comment #18)
> testNavigateFTP happens before testPasswordNotSaved, same as on other
> platforms.

So what you can do is, use your cloned version of the automation scripts and open lib/testrun.py. Search for:

self.test_path = os.path.join(self._base_path, 'tests')

And replace it with 

self.test_path = os.path.join(self._base_path, 'tests', 'testPasswordManager')

That will only execute the specified folder then. Does that change anything?
(In reply to comment #19)
> (In reply to comment #18)
> > testNavigateFTP happens before testPasswordNotSaved, same as on other
> > platforms.
> 
> So what you can do is, use your cloned version of the automation scripts and
> open lib/testrun.py. Search for:
> 
> self.test_path = os.path.join(self._base_path, 'tests')
> 
> And replace it with 
> 
> self.test_path = os.path.join(self._base_path, 'tests', 'testPasswordManager')
> 
> That will only execute the specified folder then. Does that change anything?

I did this but it still ran all of the tests, resulting in the same failure.
Actually, I got it to run just the Password Manager tests by changing the following:

http://hg.mozilla.org/qa/mozmill-automation/file/1f4778986d18/libs/testrun.py#l358
self.test_path = os.path.join('firefox/testPasswordManager')

The result is all 8 of the Password Manager tests pass.  Henrik, please advise.

Thanks.
Can you please attach the full list of tests which get run in-front of this failing one, and that for both ways of execution? Looks like a test from another subfolder could be causing this.

You can also pull the hotfix-1.5.2, where you now can specify multiple tests/folders with --tests. That should make it easier to build-up a list of tests to run in the specified order.
It definitely seems to be some sort of ordering condition which is unique to running the testrun_general.py script on the Linux32 VM...

Under all testing conditions, except for testrun_general.py on Linux32, the order of execution is the order on the filesystem (alphabetical).  However, when you run testrun_general.py on the Linux32 VM, the order is all over the map:

testAddons
testAwesomeBar
testBookmarks
testCookies
testDownloading
testFindInPage
testFormManager
testGeneral
testInstallation
testLayout
testPasswordManager
testPopups
testPreferences
testPrivateBrowsing
testSearch
testSecurity
testSessionStore
testTabbedBrowsing
testTechnicalTools
testToolbar

- BECOMES - 

testSessionStore
testBookmarks
testCookies
testTechnicalTools
testInstallation
testPreferences
testFormManager
testLayout
testPasswordManager
testAwesomeBar
testSecurity
testFindInPage
testPopups
testSearch
testTabbedBrowsing
testToolbar
testAddons
testGeneral
testDownloading
testPrivateBrowsing
(In reply to comment #23)
> Under all testing conditions, except for testrun_general.py on Linux32, the
> order of execution is the order on the filesystem (alphabetical).  However,
> when you run testrun_general.py on the Linux32 VM, the order is all over the
> map:

That's strange, because I do only forward the folder name to Mozmill. There is not happening any mixing of folders on the automation script side. To investigate this please file a new bug for it under automation.

To get this bug solved, please checkout the latest version of Mozmill from the hotfix-1.5.2 branch. By using it you can specify multiple -t options. That way you can force Mozmill to exactly run the same order of tests as the general test-run does. It should help to nail down this problem.
I think I'm not doing this right because it is only running the last folder.

Here is my command:
(mozmill1.5.2-venv)host-3-58:firefox ashughes$ mozmill -b /Applications/Shiretoko.app/ -t testSessionStore -t testBookmarks -t testCookies -t testTechnicalTools -t testInstallation -t testPreferences -t testFormManager -t testLayout -t testPasswordManager --show-errors

I'm using mozmill-1.5.2pre from:
k0s-mozmill-e0f995a

The only tests that execute are the ones in testPasswordManager, nothing else is even executed.
(In reply to comment #25)
> I'm using mozmill-1.5.2pre from:
> k0s-mozmill-e0f995a

You shouldn't use Jeff's repository. Get the latest revision from :
https://github.com/mozautomation/mozmill/tree/hotfix-1.5.2
(In reply to comment #26)
> (In reply to comment #25)
> > I'm using mozmill-1.5.2pre from:
> > k0s-mozmill-e0f995a
> 
> You shouldn't use Jeff's repository. Get the latest revision from :
> https://github.com/mozautomation/mozmill/tree/hotfix-1.5.2

Moving this issue over to bug 623992.
Just to note, even with the mixed order our tests should not fail. Means we should find the root cause of this problem and get the right test fixed.
Depends on: 623992
So, I've run the tests using hotfix-1.5.2, and there are no failures whatsoever.  Henrik, can you get the testrun_general.py script working so I can try it with hotfix-1.5.2?

Right now, the script only starts Firefox, no tests run.
(In reply to comment #29)
> So, I've run the tests using hotfix-1.5.2, and there are no failures
> whatsoever.  Henrik, can you get the testrun_general.py script working so I can
> try it with hotfix-1.5.2?
> 
> Right now, the script only starts Firefox, no tests run.

Sorry, this was meant for bug 623992.
Using hotfix and libs/testrun.py set to only run testPasswordManager tests, the test fails via script, but passes if run with Mozmill.

mozmill -b <build> -t firefox/testPasswordManager: PASS

testrun_general.py <build>: FAIL

Henrik, do you have any advice on how I can troubleshoot this further?
Status: NEW → ASSIGNED
(In reply to comment #31)
> Using hotfix and libs/testrun.py set to only run testPasswordManager tests, the
> test fails via script, but passes if run with Mozmill.

How does your change look like?
http://hg.mozilla.org/qa/mozmill-automation/file/1f4778986d18/libs/testrun.py#l358
self.test_path = os.path.join('firefox/testPasswordManager')
(In reply to comment #33)
> http://hg.mozilla.org/qa/mozmill-automation/file/1f4778986d18/libs/testrun.py#l358
> self.test_path = os.path.join('firefox/testPasswordManager')

Sorry, that should be
http://hg.mozilla.org/qa/mozmill-automation/file/tip/libs/testrun.py#l358
(In reply to comment #34)
> (In reply to comment #33)
> > http://hg.mozilla.org/qa/mozmill-automation/file/1f4778986d18/libs/testrun.py#l358
> > self.test_path = os.path.join('firefox/testPasswordManager')

You will have to use: os.path.join('firefox', 'testPasswordManager').

Also a full error message would be good. Not sure what you are seeing.
> You will have to use: os.path.join('firefox', 'testPasswordManager').
Done. Error still exists.

> Also a full error message would be good. Not sure what you are seeing.
Test Failure: {"exception": {"message": "Unexpectedly found element Lookup: /id(\"main-window\")/id(\"mainPopupSet\")/id(\"notification-popup\")/id(\"password-save-notification\")"
I'm starting to think this is something unique to either qa-horus or the Ubuntu VM.  

On qa-horus, the error is:
Expression "{"value":"password-save"}" returned null. Anonymous == false

I cannot reproduce the error on my Ubuntu 10.10 VM.  I'm currently copying over the 10.04 VM (same as the one qa-horus uses).
This seems to only fail on qa-horus.  I've tried to reproduce this failure locally on Ubuntu 10.10, Ubuntu 10.04, on qa-set, and on qa-horus.  The only place it is reproducible on qa-horus.

Henrik, can you think of any reason why the test would only fail on qa-horus?
Not out of my head, but you should probably dump as many as possible data to the console right before the failing line. If you can prepare the test-run so I could start it with a simple command, I could offer my help in investigation.
Any update here?
(In reply to comment #40)
> Any update here?

Nothing new yet.  I've been focusing on Firefox 4.0b9 testing this week.  It should be off my plate by noon PDT today.  This bug is my top Mozmill priority.
Attached patch Patch v1 (default) (obsolete) — Splinter Review
Refactor test to use waitFor() instead of assert().

I've created this patch to refactor the assertion after the ESC key is pressed to close the notification bar.  The error we are receiving indicates that notificationBar.isNull in the assert().  This makes me think the assert() is not fired fast enough (ie. the assert is being fired after the notification bar has already been destroyed).  Hopefully a waitFor() will be more robust here.
Attachment #493418 - Attachment is obsolete: true
Attachment #504044 - Flags: review?(aaron.train)
Attachment #504044 - Flags: review?(hskupin)
Attachment #504044 - Flags: review?(aaron.train)
Attachment #504044 - Flags: review+
Comment on attachment 504044 [details] [diff] [review]
Patch v1 (default)

>   controller.keypress(passwordNotification, "VK_ESCAPE", {});
>-  controller.assert(function() {
>-    return passwordNotification.getNode().parentNode.state == "closed";
>-  }, "Password notification should be closed");
>+  controller.waitFor(function() {

A waitFor makes really sense here because we have a slight animation. There has to be a space between function and the opening bracket.

>+    return passwordNotification.getNode().parentNode.state === "closed";
>+  }, "PasswordNotificationBar.State: got '" +
>+     passwordNotification.getNode().parentNode.state + "' expected 'closed'");

The message should be formal text followed by " - got XXX, expected: YYY", i.e. "Password notification bar has been closed...".  

r=me with both updated.
Attachment #504044 - Flags: review?(hskupin) → review+
Attached patch Patch v1.1 (default) (obsolete) — Splinter Review
Patch with revisions applied.
Attachment #504044 - Attachment is obsolete: true
(In reply to comment #44)
> Created attachment 504068 [details] [diff] [review]
> Patch v1.1 (default)
> 
> Patch with revisions applied.

Landed:
http://hg.mozilla.org/qa/mozmill-tests/rev/029aa329b8be (default)
I can't directly backport this patch yet because I've found a rather large discrepency between the test on default vs branch.  I'll attach the two files so they can be diff'd.
Attached file testPasswordNotSaved.js (default) (obsolete) —
Attached file testPasswordNotSaved.js (mozilla1.9.2) (obsolete) —
Attachment #504069 - Attachment is patch: true
Attachment #504070 - Attachment is patch: true
Attachment #504069 - Attachment is patch: false
Attachment #504070 - Attachment is patch: false
Attached image Screenshot: diff
That didn't work exactly how I liked it, so here is a screenshot from Komodo showing the files side by side; default is on the left, mozilla1.9.2 is on the right.
Nevermind all that, I forgot that DEFAULT uses a doorhanger, whereas BRANCH uses a notification bar.

I'll create a separate patch for branch.
(In reply to comment #50)
> Nevermind all that, I forgot that DEFAULT uses a doorhanger, whereas BRANCH
> uses a notification bar.
> 
> I'll create a separate patch for branch.

On a side note, I think it might be worth refactoring the DEFAULT version of this test to remove references to "notification bar" as this is what caused my confusion initially.
Agreed, anyone reading these tests would undoubtedly be confused by the false comments without a refactor.
waitForElementNotPresent (bug 610757) does not fix this error on qa-horus. I propose the following:

1. Refactor test on default to remove references to "bar"
2. Refactor test on branch to yield more valuable error messages
3a. If that gives us more clues, investigate and fix
3b. If not, disable the test until we can find a fix

Henrik, what do you think?
Sure, but we should probably start with the branch tests, because those are failing while default doesn't show the failure. If you can come up with a patch I will make sure to land it if it gets r+ before tomorrows testrun.
I vote we simply just disable this test RE: https://bugzilla.mozilla.org/show_bug.cgi?id=623992#c16

Going back to the list in comment 53, 

1. Refactor test on default to remove references to "bar"
 - we can still do this easily
2. Refactor test on branch to yield more valuable error messages
 - there really isn't anything to refactor here.  We have waitFor() in place via getTabPanelElement().  The error which is reporting to brasstacks is actually coming from elementslib.

I reiterate, we should remove "bar" from default and disable this test on branch until bug 623992 is resolved.

Henrik, what do you think?
One other thing I noticed during debugging, testAccessPageInfo.js does not cleanly destruct.  It leaves the Page Info dialog hanging around.  This doesn't seem to have any bearing on the testrun, but we should make sure it cleanly destructs anyway.

Henrik, should I file a bug for this?
The issue here is exactly the testAccessPageInfoDialog.js test. Once it does not open the window, everything is fine. See my tests:

With page info window opened:
http://mozmill-crowd.brasstacks.mozilla.com/#/general/report/90365723b3905a8fdc8895b9b805a6b2

Page info window not opened:
http://mozmill-crowd.brasstacks.mozilla.com/#/general/report/90365723b3905a8fdc8895b9b805ab9d

That means fixing testAccessPageInfoDialog.js will also fix this bug. Can you create a patch Anthony? And thanks for the investigation.
As what I have also seen on the Linux VM on horus, we have problems with HTTPd. In some cases we cannot connect. That gives me the impression that not only the page info window is responsible for the failure. I have removed it and the failure persists:

http://mozmill-crowd.brasstacks.mozilla.com/#/general/report/90365723b3905a8fdc8895b9b805b27d

So lets get the page info clean-up work happen on its own bug, so we can check-in the patch ASAP. Then we can check what has been changed for this test failure.
testAccessPageInfoDialog.js: bug 626674
Depends on: 626674
Using the refactoring done in bug 626674, we now fail with a waitFor() directly in the test (no longer in elementslib with an obscure exception).  However, we do get a strange scenario where the GOT is the same as EXPECTED but still fail.

Here is the code:
// The notification bar should not be visible before submitting the credentials
  var button = null;
  controller.waitFor(function () {
    button = tabBrowser.getTabPanelElement(tabBrowser.selectedIndex, '/{"value":"password-save"}/anon({"type":"info"})/{"class":"messageCloseButton tabbable"}');  
    return button === null;
  }, "Checking for notification bar close button: got " + button + ", expected null");

Here is the error message:
Checking for notification bar close button: got null, expected null
Shouldn't we check here for the notification bar and not the button in general? When you run the test please add a dump statement like the following into the waitFor callback to check the real value of the element:

dump("** button element: " + button + "\n");
dump("** button node: " + button.getNode() + "\n");

That will give us an idea.
Also please keep in mind that getTabPanelElement always returns an element, which means that button is always a valid object.
(In reply to comment #61)
> Shouldn't we check here for the notification bar and not the button in general?

This is what I was doing before.  The panel containing notification bars has a .hasNotification property.  It has a value of 0 if none, and 1 or more if one or more.  I don't think it's feasible enough for us to check if > 1 because it will give us a false positive if a notification bar is loaded before password-saved and not closed (ie. Mozilla Rights).  After some discussion, we agreed it would be best to just check for the (X) button of the password-save bar.
Ok, I've added your dumps.

I get 51 instances of the dumps before the waitFor() times out:
 *** button element: [object Object] *** 
 *** button node: [object XULElement]*** 

Again, the error message is:
Checking for notification bar close button: got null, expected null
Do you have a patch you can attach, so I can try it tomorrow? Just to make sure we have the same code base. Thanks.
(In reply to comment #65)
> Do you have a patch you can attach, so I can try it tomorrow? Just to make sure
> we have the same code base. Thanks.

A patch to test for this will have to wait for bug 626674. I'll land that asap and then lets look at this bug.
This issue is still persistent on Linux on the default branch. I think we should run it with Mozmill 1.5.2 just to see if something will be changed.

Therefore you can find a virtualenv sandbox with the latest hotfix changeset in /data/testing/hskupin/linux-sandbox. Just enable it with "source linux-sandbox/bin/activate". From inside the env you could start the daily test-run, which should report to mozmill-archive.
No longer depends on: 623992
Attached patch Patch v2 (obsolete) — Splinter Review
Refactoring assertNodeNotExist into a waitFor seems to have prevented this failure.
Attachment #504068 - Attachment is obsolete: true
Attachment #504069 - Attachment is obsolete: true
Attachment #504070 - Attachment is obsolete: true
Attachment #507543 - Flags: review?(aaron.train)
Comment on attachment 507543 [details] [diff] [review]
Patch v2

Looks fine by me
Attachment #507543 - Flags: review?(aaron.train) → review+
Comment on attachment 507543 [details] [diff] [review]
Patch v2

Quick spotcheck please, Henrik.
Attachment #507543 - Flags: review?(hskupin)
Comment on attachment 507543 [details] [diff] [review]
Patch v2

>   // The notification should not be visible before submitting the credentials
>-  controller.assertNodeNotExist(passwordNotification);
>+  controller.waitFor(function () {
>+    return passwordNotification.getNode().parentNode.state === "closed";
>+  }, "Password notification closed: got '" +
>+     passwordNotification.getNode().parentNode.state + "' expected 'closed'");

Why do we need a waitFor here? There shouldn't be any notification open at this time. If that's the case we should really have a reset call in setupModule or teardown.
Attachment #507543 - Flags: review?(hskupin) → review-
(In reply to comment #71)
>> Comment on attachment 507543 [details] [diff] [review]
>> Patch v2
>
> Why do we need a waitFor here? There shouldn't be any notification open at this
> time. If that's the case we should really have a reset call in setupModule or
> teardown.

The purpose of this check is to make sure there is no notification before we submit the credentials; it should only appear after (something which you requested be in this test when it was being created).
(In reply to comment #72)
> The purpose of this check is to make sure there is no notification before we
> submit the credentials; it should only appear after (something which you
> requested be in this test when it was being created).

Even when a notification would be shown, we wouldn't catch this with a waitFor. The check you are doing immediately returns because it checks for the closed property. The opening of the notification is delayed. So it would slip through. If we want to test this correctly we would have to wait for the notification and catch the thrown exception.
Personally, I never understood why we were checking to make sure a notification does not appear before the Login button is clicked. There is no historical evidence (ie. bug) where this has ever happened.  I'd be perfectly happy to leave this out of the test completely.  

> If we want to test this correctly we would have to wait for the notification
> and catch the thrown exception.

I'm not sure I understand what you are asking for here.  Are you asking to implement a waitFor() which waits for a notification to exists and when it fails to do so we catch that as a PASS, not a FAIL?
(In reply to comment #74)
> Personally, I never understood why we were checking to make sure a notification
> does not appear before the Login button is clicked. There is no historical
> evidence (ie. bug) where this has ever happened.  I'd be perfectly happy to
> leave this out of the test completely.  

Well, having no bug doesn't mean it cannot regress by any future change. But such a regression would be highly visible and nightly testers would catch it immediately. Even, as you said, the risk that it will happen is low. So let us remove this part.

> > If we want to test this correctly we would have to wait for the notification
> > and catch the thrown exception.
> 
> I'm not sure I understand what you are asking for here.  Are you asking to
> implement a waitFor() which waits for a notification to exists and when it
> fails to do so we catch that as a PASS, not a FAIL?

For future cases you have understood it correctly. Yes, that's what I have said.
Attached patch Patch v3 (obsolete) — Splinter Review
With the check for "notification not present before login" removed, we now get a reference to the notification after (not before) the Login button is clicked.

Also, I renamed the test method to testPasswordNotification to remove confusion about bar vs doorhanger.
Attachment #507543 - Attachment is obsolete: true
Attachment #507772 - Flags: review?(aaron.train)
Comment on attachment 507772 [details] [diff] [review]
Patch v3

>-  // After logging in, close the notification bar
>+  // Click the login button and wait for the form to process
>+  controller.click(new elementslib.ID(controller.tabs.activeTab, "LogIn"));
>+  controller.waitForPageLoad();

Nit: Pull out the element declaration outside the call to click.
Attachment #507772 - Flags: review?(aaron.train) → review-
Attached patch Patch v4 (obsolete) — Splinter Review
Attachment #507772 - Attachment is obsolete: true
Attachment #507916 - Flags: review?(aaron.train)
Attachment #507916 - Flags: review?(aaron.train) → review+
Attachment #507916 - Flags: review?(hskupin)
Comment on attachment 507916 [details] [diff] [review]
Patch v4

>   // Close the notification and check its state
>   controller.keypress(passwordNotification, "VK_ESCAPE", {});
>   controller.waitFor(function () {
>     return passwordNotification.getNode().parentNode.state === "closed";
>-  }, "Password notification bar closed: got '" +
>+  }, "Password notification closed: got '" +
>      passwordNotification.getNode().parentNode.state + "' expected 'closed'");

Isn't that a boolean like expression? We can only have two states here, so I don't think we should add the got/expected part. "Password notification has been closed" should be enough IMO.

r+ with it updated.
Attachment #507916 - Flags: review?(hskupin) → review+
Ok, I'll make that change.  Before I do though, I have a question about backporting these changes.  A big part of this patch is removing the confusion of "notification bar" vs "doorhanger" by referring to it simply as "notification".  Is there any objections to me backporting that change?

This change would make the tests the same across all branches.
I don't see an issue for older branches, but we do not have eliminated all instances of notifications in Firefox 4. That means notifications and doorhangers will co-exist. Sorry, I totally forgot about that part. That said, we have totally different ways to access such content. We could take care of in the shared module refactoring. For now I would agree, lets call it notification.
Attachment #507916 - Attachment is obsolete: true
Attachment #508040 - Flags: review+
Comment on attachment 508040 [details] [diff] [review]
Patch v4.1 [checked-in]

Landed:
http://hg.mozilla.org/qa/mozmill-tests/rev/f4209042c07c [default]

Backport patch coming up.
Attachment #508040 - Attachment description: Patch v4.1 → Patch v4.1 [checked-in]
(In reply to comment #83)
> Backport patch coming up.

On second thought, since this test is not failing on branches, lets not touch it.  In fact, this test has not failed for 10 days on the branches.  We can address that separately, in a new bug, if/when it starts failing on branches.

Marking resolved fixed -- will mark verified after tomorrow's testrun.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago13 years ago
Resolution: --- → FIXED
Error started reappearing on the branches only on Feb 5th and is happening every day.  Reopening to investigate.

http://mozmill-release.brasstacks.mozilla.com/#/general/report/bb657a86662f7d3425d7730a33c00948
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I've tried to reproduce this error locally and on the VM on qa-horus, unsuccessfully.  The only time I'm able to reproduce the failure is if I forcibly steal focus away from Firefox while the test is running.

Is it possible this is a focus issue and that something is stealing it from Firefox on this VM?  It might explain why we only see it recently on the Ubuntu 10.10 32-bit VM on qa-horus recently.

Henrik, please advise.
How did you run the test? With the testrun_all script? If not please do so.
(In reply to comment #87)
> How did you run the test? With the testrun_all script? If not please do so.

Ok, I can trigger the failure using testrun_all locally.  The only difference I can see between testrun_all and testrun_general is that the update tests are triggered first.  Are there any other differences? Is there a chance our update tests are causing the failure?
Sounds great! Next step would be to strip down the mozmill-tests repository to only those tests which are necessary to run before the password one, which could cause this problem. That's the first thing we have to check, before we could make assumptions about the update tests.
Status: REOPENED → ASSIGNED
I've narrowed this down to a set of tests:

testBookmarks
testCookies
testFormManager
testSessionStore
testPasswordManager

Remove any of these and the test passes.  Tests must be run using testrun_all.py otherwise they pass.

So, it would seem that there is something happening between all of these tests which is causing the problem.  I have no idea how to debug this any further.

Henrik, do you have any suggestions?
I miss the tests inside those folders. Which of those are really necessary? Most of these folders have multiple test modules in it.
(In reply to comment #91)
> I miss the tests inside those folders. Which of those are really necessary?
> Most of these folders have multiple test modules in it.

I haven't broken those folders out test by test yet.  That will take me quite a bit more time.  I'll do that and report back in a few days.
Ok, I've narrowed it down to a couple of tests:

testSessionStore/testUndoTabFromContextMenu.js
testFormManager/testClearFormHistory.js
testPasswordManager/testPasswordNotSaved.js
(in that order)

Henrik, any recommendations on how I can debug this further?
This seems to be a focus issue.  We are failing to type the credentials in the form.  However, there is no dialog or context menus blocking focus.  I tried setting the form element to autofocus and even tried to force .focus() but we just can't seem to get focus here.

I'm wondering if there is something we could be doing to better detect and debug focus in Mozmill itself.
Ok, I've found a workaround...adding tabBrowser.closeAllTabs() to the end of the setupModule().  It would seem like testClearFormHistory (which has no teardownModule) is finishing in a dirty state.

Henrik, please advise.
As per discussion on IRC, I tried inserting a sleep(30000) before the controller.type().  I can't even manually get focus on the textboxes in the form.  So it would seem there is a focus issue going on here (I suspect something bad is happening in testClearFormHistory).

To test yourself on qa-horus:
1. Switch to the Ubuntu 10.10 32-bit VM
2. Open terminal to /mnt/hgfs/data/testing/ashughes
3. Run ./mozmill-automation/testrun_general.py --repository=./mozmill-tests-1.9.1/ --logfile=testrun.log ~/Desktop/Link\ to\ builds/shiretoko/firefox
4. Let the tests run until the login form appears
5. Try to click one of the textboxes and type

Result:
Cannot get focus of form elements.
Just a follow-up on what was discussed in IRC today.

We copied the Ubuntu 10.10 32-bit VM from qa-horus to qa-set to see if the bug is reproducible and here is what I found...

Ubuntu 10.10 32-bit on QA-Set: test passes
Ubuntu 10.10 32-bit on QA-Horus: test fails
Ubuntu 10.10 32-bit from QA-Horus on QA-Set: test fails

So, it would seem there is a "problem" with the VM from QA-Horus.  The obvious quick-fix here is to simply clone the QA-Set VM to replace the QA-Horus VM.  

Henrik, please advise.
This is scary, because I have installed a complete fresh VM for Ubuntu 10.10 some weeks back. Al, how did the upgrade happen on QA-Set? I would say you haven't installed a fresh 10.10?

For the easiest case I can copy over the Ubuntu machine from qa-set and add the cronjobs. I will do it right away, so we can check our todays test-run.
Machine has been copied over and I have started a testrun to update all machines to yesterdays build. Lets see the results.
Looks like it has been fixed the problem. Lets wait for another test-run today.
The previous VM was an upgrade from a 9.10 VM to 10.4 and then 10.10 as distro upgrades within linux. It was not a clean install (and most of our users aren't doing clean installs either).
This has not failed since cloning the VM from QA-Set -- marking RESOLVED FIXED.

I'll mark it verified if it's still passing next week.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
http://mozmill-release.brasstacks.mozilla.com/#/general/failure?branch=All&platform=All&from=2011-02-17&to=2011-02-26&test=firefox%2FtestPasswordManager%2FtestPasswordNotSaved.js&func=testPasswordNotificationBar

This test is intermittent orange, failing 3 times in the last 10 days, only on the Ubuntu 10.10 32-bit VM.  What's our policy for dealing with intermittent orange tests?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Anthony, looks like we can close this bug again. The new Ubuntu machine has been fixed this.
Looks like this has not failed since Feb 25th -- re-resolving.
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
No failures in quite a while.  Marking VERIFIED.
Status: RESOLVED → VERIFIED
Whiteboard: [mozmill-test-failure][mozmill-branch-fail][MozmillTestday] → [mozmill-test-failure][mozmill-branch-fail][MozmillTestday][fixed-by-637138]
Product: Mozilla QA → Mozilla QA Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: