Closed Bug 1069897 Opened 10 years ago Closed 10 years ago

Test failure "Window has been found" in testSecurity/testSSLStatusAfterRestart.js

Tracking

(firefox34 fixed, firefox35 fixed, firefox36 fixed)

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

firefox34

---

fixed

firefox35

---

fixed

firefox36

---

fixed

People

(Reporter: AndreeaMatei, Assigned: cosmin-malutan)

References

(
URL
)

Details

(Whiteboard: [mozmill-test-failure][sprint])

Attachments

(3 files, 2 obsolete files)

skip.patch 10 years ago Andrei Eftimie 953 bytes, patch	AndreeaMatei : review+	Details \| Diff \| Splinter Review
testcase.js 10 years ago Daniel Gherasim 1.83 KB, application/javascript		Details
patch v1.0 10 years ago Cosmin Malutan, [:cosmin-malutan] 1.15 MB, patch		Details \| Diff \| Splinter Review
patch v1.0 10 years ago Cosmin Malutan, [:cosmin-malutan] 2.42 KB, patch		Details \| Diff \| Splinter Review
patch v2.0 10 years ago Cosmin Malutan, [:cosmin-malutan] 3.32 KB, patch	AndreeaMatei : review+ AndreeaMatei : checkin+	Details \| Diff \| Splinter Review

Andreea Matei [:AndreeaMatei]

Reporter

Description

•

10 years ago

This test was pushed today and we see a few failures on Aurora de and en-us, Linux 13.10 and 12.04:
http://mozmill-daily.blargon7.com/#/remote/report/2f982f72826307fed840a3b11c6d7b9e
http://mozmill-daily.blargon7.com/#/remote/report/2f982f72826307fed840a3b11c68a381

Affected line:
http://hg.mozilla.org/qa/mozmill-tests/file/default/lib/windows.js#l130

In the same test I see a "Notification popup has been open" failure:
http://mozmill-daily.blargon7.com/#/remote/report/2f982f72826307fed840a3b11c6d7302

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 1

•

10 years ago

Please backout the causing test, there shouldn't be a need for this bug. A fix has to take place on the implementation bug.

Andrei Eftimie

Comment 2

•

10 years ago

We will skip the test, not backout because of a dependency tree.

Andrei Eftimie

Comment 3

•

10 years ago

Only linux seems to be affected.

Andrei Eftimie

Comment 4

•

10 years ago

Attached patch skip.patch — Details — Splinter Review

Attachment #8492140 - Flags: review?(andreea.matei)

Andrei Eftimie

Comment 5

•

10 years ago

For safety disabled on all platforms.

Andreea Matei [:AndreeaMatei]

Reporter

Comment 6

•

10 years ago

Comment on attachment 8492140 [details] [diff] [review]
skip.patch

Review of attachment 8492140 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, please land it.

Attachment #8492140 - Flags: review?(andreea.matei) → review+

Andreea Matei [:AndreeaMatei]

Reporter

Updated

•

10 years ago

status-firefox35: --- → affected

Andrei Eftimie

Comment 7

•

10 years ago

Disabled:
https://hg.mozilla.org/qa/mozmill-tests/rev/7757dc1f1ef5 (default)
https://hg.mozilla.org/qa/mozmill-tests/rev/85f8ed5098bc (mozilla-aurora)

status-firefox34: affected → disabled

status-firefox35: affected → disabled

Daniel Gherasim

Comment 8

•

10 years ago

So this is reproducible only on CI and it's failing on the OV certificate check, I'll investigate further.

Daniel Gherasim

Updated

•

10 years ago

Assignee: nobody → daniel.gherasim

Status: NEW → ASSIGNED

Daniel Gherasim

Updated

•

10 years ago

Priority: -- → P1

Daniel Gherasim

Comment 9

•

10 years ago

If I remove the check for 
>  27   url: "http://ssl-ov.mozqa.com/",
>  28   identity: "",
>  29   type: "unknownIdentity",
>  30   callback: checkSecurityTab_OV
>  31 }


Test passes, I will continue the investigation on Monday.

Andrei Eftimie

Updated

•

10 years ago

Blocks: 1025849

Daniel Gherasim

Comment 10

•

10 years ago

This only reproduces on CI and not on our local machines, that mostly because of the less memory we have on the CI which could cause race conditions.

If I put a controller.sleep(500) after we change the tab we are (changing value of the tabBrowser.selectedIndex) the test passed 200/200 times.

Here is where I added the code:
> // Check that the correct status is shown for each certificate type
> TEST_DATA.forEach(aPage => {
>   controller.waitForPageLoad();
> 
>   cert = security.getCertificate(tabBrowser.securityUI);
>   verifyCertificateStatus(aPage);
>   tabBrowser.selectedIndex++;
>   controller.sleep(500); 
> });

Daniel Gherasim

Comment 11

•

10 years ago

So along with "Window has been found" we also get a :
> controller.waitForPageLoad(URI=http://ssl-ov.mozqa.com/, readyState=complete)

It's likely that this is the first line that fails, and then we fail in waiting for the window.

Andrei Eftimie

Comment 12

•

10 years ago

Does that mean we're not waiting for the right observers in the new Window classes?
Does tabbrowser wait for the tab to load when we change selectedIndex?

Daniel Gherasim

Comment 13

•

10 years ago

(In reply to Andrei Eftimie from comment #12)
> Does that mean we're not waiting for the right observers in the new Window
> classes?

Nope. We don't use here the new window class and the specific observers we added.
We are only using the handleWindow method which just got moved from one file to another.
I don't think it's something related to how we handle windows, the failure most likely it's above this, and the window doesn't get open.

> Does tabbrowser wait for the tab to load when we change selectedIndex?

It waits for the this._tabs.getNode().selectedIndex to be equal with our expected tab.
> assert.waitFor(function () {
>      return this.selectedIndex === aIndex;
>    }, "The tab with index '" + aIndex + "' has been selected", undefined,
>     undefined, this);

We have to investigate if we can wait for a tab to be loaded (once we click on it).

Andrei Eftimie

Comment 14

•

10 years ago

(In reply to daniel.gherasim from comment #13)
> It waits for the this._tabs.getNode().selectedIndex to be equal with our
> expected tab.
> > assert.waitFor(function () {
> >      return this.selectedIndex === aIndex;
> >    }, "The tab with index '" + aIndex + "' has been selected", undefined,
> >     undefined, this);
> 
> We have to investigate if we can wait for a tab to be loaded (once we click
> on it).

This. We might need to wait for some animations to finish.
AFAIK we implement this in relation to tab loading (new tab, close tab). Maybe we need to apply the same logic to changing tabs.

Daniel Gherasim

Comment 15

•

10 years ago

Attached file testcase.js — Details

Testcase with the failure: "Notification popup has been opened" 7/50 times.
That's similar failure with what we do in the test, and it's same problem, that doing something imediately after we change the .selectedIndex using our tabBrowser may fail sometime.

I would suggest to do a workaround here so we can have the test up & running, and fix .selectedIndex in another bug.

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 16

•

10 years ago

I think what we need here is the TabShown event. Can you please test this Daniel?

https://developer.mozilla.org/en-US/docs/Web/Events/TabShow

Daniel Gherasim

Comment 17

•

10 years ago

(In reply to Henrik Skupin (:whimboo) from comment #16)
> I think what we need here is the TabShown event. Can you please test this
> Daniel?
> 
> https://developer.mozilla.org/en-US/docs/Web/Events/TabShow

TabShow doesn't work, I think it's only triggered when it's loaded & shown.
But "TabSelect" event works locally, https://developer.mozilla.org/en-US/docs/Web/Events/TabSelect .
I'll test on the CI machine once I find one free to see if it fixes this problem.

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 18

•

10 years ago

I see. That indeed should be the correct event. Thanks for letting us know. Btw how many of our tests are using selectedIndex? I would assume this problem could be the base of many other failures we are seeing!

Daniel Gherasim

Comment 19

•

10 years ago

 (In reply to Henrik Skupin (:whimboo) from comment #18)
> I see. That indeed should be the correct event. Thanks for letting us know.
> Btw how many of our tests are using selectedIndex? I would assume this
> problem could be the base of many other failures we are seeing!

There are 11 tests which get used of this method.

So I tested with the "TabSelect" event and it's still failing. (Maybe would be good to add this event anyway in another bug for cases we select a tab).

After digging more into this, I found out what's causing this failure.
So we are clicking the tab > then call waitForPageLoad() > we click on the 'identity box' to open the notificationPopup but somehow untill the popup is open, we focus on the url bar and notification get's automatically closed before we click the moreInfoButton on it. So it's a race condition that causes this.

Daniel Gherasim

Comment 20

•

10 years ago

I'm not sure about the focus on url bar, but something it's clearely closing it really fast.

Daniel Gherasim

Comment 21

•

10 years ago

Printing the notification state after we click on the box it shows: showing, closed, closed.

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 22

•

10 years ago

(In reply to daniel.gherasim from comment #19)
> So I tested with the "TabSelect" event and it's still failing. (Maybe would
> be good to add this event anyway in another bug for cases we select a tab).

Absolutely. Lets get this fixed sooner than later. Please file a bug for it.

Daniel Gherasim

Comment 23

•

10 years ago

While for the other pages is: showing, open, open.

So maybe ssl-ov.mozqa.com it's the fastest page and that's why we fail on it.
That's one reason we should assert in the waitForNotificationPanel that we got the expected state on the panel, and not just check for the event.

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 24

•

10 years ago

The cause of this failure is that we click on "identity-popup-more-info-button" when the popup is closed for some reason, I couldn't see this failure but if I interfere with the testrun, I get exactly this issue.
Also I triggered an 'escape' event prior of clicking and I saw the issue, and if I call button.doCommand() the windows opens even if the notification has been closed.

So the issue here is not that the window hasn't been open, is that we didn't had the notification in the correct state, we have to wait for notification to close.

This line:
>http://hg.mozilla.org/qa/mozmill-tests/file/3c0b8bae9dff/firefox/tests/remote/testSecurity/testSSLStatusAfterRestart.js#l118
Should be:
>  locationBar.waitForNotificationPanel(() => {
>    moreInfoButton.click();
>  }, {type: "identity", open: false});


Also we trigger escape event on notification even if it has already closed in, this if it hits a race condition it can close something else, the try/finaly from the line above has to be removed because handling the window has nothing to do with notification, the assertion I suggest above should be sufficient.
http://hg.mozilla.org/qa/mozmill-tests/file/3c0b8bae9dff/firefox/tests/remote/testSecurity/testSSLStatusAfterRestart.js#l123

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 25

•

10 years ago

Attached patch patch v1.0 (obsolete) — Details — Splinter Review

This patch should fix this issue.
http://mozmill-crowd.blargon7.com/#/remote/report/ad8dad905f3281cb521a23ddf310fd8b

Assignee: daniel.gherasim → cosmin.malutan

Attachment #8506021 - Flags: review?(andrei.eftimie)

Attachment #8506021 - Flags: review?(andreea.matei)

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 26

•

10 years ago

Attached patch patch v1.0 (obsolete) — Details — Splinter Review

Sorry, wrong patch

Attachment #8506021 - Attachment is obsolete: true

Attachment #8506021 - Flags: review?(andrei.eftimie)

Attachment #8506021 - Flags: review?(andreea.matei)

Attachment #8506022 - Flags: review?(andrei.eftimie)

Attachment #8506022 - Flags: review?(andreea.matei)

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 27

•

10 years ago

Comment on attachment 8506022 [details] [diff] [review]
patch v1.0

Review of attachment 8506022 [details] [diff] [review]:
-----------------------------------------------------------------

I've got a failure with this patch applied, but at least I know that my assumption was right, the notification has been closed prior of clicking moreInfoButton button.

::: firefox/tests/remote/testSecurity/testSSLStatusAfterRestart.js
@@ +118,5 @@
> +  locationBar.waitForNotificationPanel(() => {
> +    moreInfoButton.click();
> +  }, {type: "identity", open: false});
> +
> +  windows.handleWindow("type", "Browser:page-info", aPage.callback);

We know for sure that the notification has been opend because it didn't failed to open, and that by the time we click the more info button it's closed.
http://hg.mozilla.org/qa/mozmill-tests/file/3c0b8bae9dff/firefox/tests/remote/testSecurity/testSSLStatusAfterRestart.js#l104
http://mozmill-crowd.blargon7.com/#/remote/report/ad8dad905f3281cb521a23ddf3118d25

Attachment #8506022 - Flags: review?(andrei.eftimie)

Attachment #8506022 - Flags: review?(andreea.matei)

Andreea Matei [:AndreeaMatei]

Reporter

Updated

•

10 years ago

Whiteboard: [mozmill-test-failure] → [mozmill-test-failure][sprint]

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 28

•

10 years ago

I used setTimeout to delay the closing of page-info window, and the issue was 100% reproducible. It might be the case that if the window closes after the notification has been opened, it will blur the notification which will cause it to close. That would be a reace condition, that's why we fail intermittently and we never fail on the first certificate. What I will try now to see if it fixes the issue is to retrieve the controller when handling the window, and close it explicitly with waitForWindowState.

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 29

•

10 years ago

Attached patch patch v2.0 — Details — Splinter Review

So the we hit this problem because tabBrowser.selectedIndex++ doesn't correctly waits for the tab to change, and some event it's closing the notification after it opens.
Whatsoever if we open the tab, navigate to the page and check the certificate in the same loop, we don't hit that race-condition, which closes our notification so it's clearly not a bug but a test implementation weakness.

I ran over 130 rans and it didn't failed, so this fixes the issue.
http://mozmill-sandbox.blargon7.com/#/remote/reports?app=All&branch=All&platform=All&from=2014-10-17&to=2014-10-17

Attachment #8506022 - Attachment is obsolete: true

Attachment #8506875 - Flags: review?(andrei.eftimie)

Attachment #8506875 - Flags: review?(andreea.matei)

Andreea Matei [:AndreeaMatei]

Reporter

Comment 30

•

10 years ago

Comment on attachment 8506875 [details] [diff] [review]
patch v2.0

Review of attachment 8506875 [details] [diff] [review]:
-----------------------------------------------------------------

This looks way better. I'll wait until Monday morning to land this, it's a bit late now, with the weekend coming.

Attachment #8506875 - Flags: review?(andrei.eftimie)

Attachment #8506875 - Flags: review?(andreea.matei)

Attachment #8506875 - Flags: review+

Attachment #8506875 - Flags: checkin?(andreea.matei)

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 31

•

10 years ago

So this patch will delay the test a bit given that we now are waiting for the SSL pages to be loaded before continuing to the next tab. Why don't we try go get bug 1071566 fixed, so we wait for the other tab to be selected?

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 32

•

10 years ago

We wait anyway for the page load, we just wait in the same loop we open the tab, by this way we are sure the test works. Anyway per comment 22 it looks to be clear that bug 1071566 won't fix this issue.

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 33

•

10 years ago

Ah, missed that part. So lets get this in even we have a slightly longer overhead. At least the test is passing. Thanks Cosmin.

Andreea Matei [:AndreeaMatei]

Reporter

Comment 34

•

10 years ago

Comment on attachment 8506875 [details] [diff] [review]
patch v2.0

Review of attachment 8506875 [details] [diff] [review]:
-----------------------------------------------------------------

http://hg.mozilla.org/qa/mozmill-tests/rev/22fd445ad105 (default)

Attachment #8506875 - Flags: checkin?(andreea.matei) → checkin+

Andreea Matei [:AndreeaMatei]

Reporter

Updated

•

10 years ago

status-firefox36: --- → fixed

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 35

•

10 years ago

The test passed on Nightly it can be back-ported to aurora and beta, the patch applies nicely on both branches.

http://mozmill-daily.blargon7.com/#/remote/failure?app=Firefox&branch=All&platform=All&from=2014-10-20&to=&test=%2FtestSecurity%2FtestSSLStatusAfterRestart.js&func=testDisplayCertificateStatus

Andreea Matei [:AndreeaMatei]

Reporter

Comment 36

•

10 years ago

Transplanted:
http://hg.mozilla.org/qa/mozmill-tests/rev/805fe0b2dd29 (aurora)
http://hg.mozilla.org/qa/mozmill-tests/rev/a023470ed0c9 (beta)

Thanks!

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

status-firefox34: disabled → fixed

status-firefox35: disabled → fixed

Resolution: --- → FIXED

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 37

•

10 years ago

Has this really been tested for Beta? All of the remote testruns on Linux are failing with this specific test! It's a different problem but it should have not been landed on that branch.

http://mozmill-release.blargon7.com/#/remote/top?app=All&branch=34.0&platform=All&from=2014-10-23&to=2014-10-24

Cosmin Malutan, [:cosmin-malutan]

Assignee

Comment 38

•

10 years ago

Yes I've tested this even now is not failing only if I interferer with run, I'll check it on a CI node.
Maybe it's best to checkout the re-enabling patch, or to check-in the skip again as it might be something else.

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 39

•

10 years ago

Please file a new bug for this case, and yes lets skip it if possible.

BMO Automation

Updated

•

5 years ago

Product: Mozilla QA → Mozilla QA Graveyard

You need to log in before you can comment on or make changes to this bug.