[mozmill] calendar recurrence tests regularly failing on linux

RESOLVED FIXED in 5.7

Status

Calendar
Internal Components
P2
normal
RESOLVED FIXED
8 months ago
4 months ago

People

(Reporter: Taraman, Assigned: Taraman)

Tracking

(Blocks: 1 bug)

Dependency tree / graph

Details

(Whiteboard: [Thunderbird-testfailure: Z Linux])

Attachments

(5 attachments, 1 obsolete attachment)

(Assignee)

Description

8 months ago
The Mozmill recurrence tests fail most of the time on linux32 and sometimes on Linux 64 debug.

Tests seen failing on Linux32:
testWeeklyUntilRecurrence
testWeeklyWithExceptionRecurrence
testBiweeklyRecurrence

Tests seen failing on Linux64:
testAnnualRecurrence
testLastDayOfMonthRecurrence

ince the tests always fail on the first step after closing the event-dialog, it seems it has something to do with event-saving.
(Assignee)

Comment 1

8 months ago
I set up a VM with linux32 now, to see if I can reproduce this issue, since the tests are failing all the time there.

And it seems, that the datepickers are broken there. They are not displayed correctly and always show 1/1/1970.
When running the test, the type command in these fields also everytime puts this value.
This is the reason, why the tests which use typed dates fail.

So it seems at the moment, this is not a test but, but a real bug.
(Assignee)

Comment 2

8 months ago
Created attachment 8825470 [details]
evntDatePickerError.png

This is what the eventDialog looks like.
(Assignee)

Updated

8 months ago
Hardware: Unspecified → x86

Comment 3

8 months ago
Thanks, that's an evidence I was looking for in bug 1325792.
Depends on: 1325792

Comment 4

8 months ago
Is there anything in the error console? If so, please comment in bug 1325792.
Flags: needinfo?(Mozilla)
(Assignee)

Updated

8 months ago
Flags: needinfo?(Mozilla)
(Assignee)

Comment 5

7 months ago
Aside from the Linux32 Problem, there are the Failures on Linux64, which seem to be timing related.
It looks, that my function to see if the view loaded completely is not working 100%.

So I will set every check for an element after view-change to a waitFor call instead nodeExists.

Comment 6

7 months ago
Damn, did I just report a duplicate of this in bug 1334783? The is certainly more visible since bug 1332708 landed.

Comment 7

7 months ago
Maybe the Linux32 issue is (also) related to the setup of the automation machines. There have been recently comments on other bugs that stated some obscure behaviour disappeared after switching to another desktop or widget engine. So, if modifying timeouts or making sure to have e.g. a menue list opened before trying to click one of its items like recommended in bug 1332708, it may be worth to try to reproduce the issue by setting up a VM with the same configuration that is used on automation (if you haven't that already).

Comment 8

7 months ago
On a second thought, the Linux32 may be hard to fix as long the datetime issue as documented with the screenshot exists on that platform, so we should make that a priority.

Updated

7 months ago
Duplicate of this bug: 1334783
(Assignee)

Updated

7 months ago
Depends on: 1229684
(Assignee)

Comment 10

7 months ago
Created attachment 8833518 [details] [diff] [review]
hardenTests.diff

This should fix the intermittent failures in the tests by adding "waitFor..." functions after every view change.
Attachment #8833518 - Flags: review?(makemyday)
(Assignee)

Comment 11

7 months ago
And BTW, the TryRun is here:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=3e4023c222da0c9bb62503605a031ebd2f2dcd4f

Comment 12

7 months ago
Thanks, we really need to get the trees (C-C and C-A) greener again. However, on Linux32 I still see:

TEST-UNEXPECTED-FAIL | /builds/slave/test/build/tests/mozmill/cal-recurrence/testWeeklyWithExceptionRecurrence.js | testWeeklyWithExceptionRecurrence.js::testWeeklyWithExceptionRecurrence
(Assignee)

Comment 13

7 months ago
Yes I know, but this is dependent on bug 1229684.

Comment 14

7 months ago
Well, the log says:

INFO -  SUMMARY-UNEXPECTED-FAIL | testWeeklyWithExceptionRecurrence.js | testWeeklyWithExceptionRecurrence.js::testWeeklyWithExceptionRecurrence
INFO -    EXCEPTION: Timeout exceeded for waitForElementNotPresent Lookup: /id("messengerWindow")/id("tabmail-container")/id("tabmail")/id("tabpanelcontainer")/id("calendarTabPanel")/id("calendarContent")/id("calendarDisplayDeck")/id("calendar-view-box")/id("view-deck")/id("day-view")/anon({"anonid":"mainbox"})/anon({"anonid":"scrollbox"})/anon({"anonid":"daybox"})/[0]/anon({"anonid":"boxstack"})/anon({"anonid":"topbox"})/{"flex":"1"}/{"flex":"1"}/{"flex":"1"}/{"tooltip":"itemTooltip","calendar":"mozmill"}
INFO -      at: utils.js line 447
INFO -         TimeoutError utils.js:447 13
INFO -         waitFor utils.js:485 11
INFO -         MozMillController.prototype.waitFor controller.js:687 3
INFO -         MozMillController.prototype.waitForElementNotPresent controller.js:709 3
INFO -         testWeeklyWithExceptionRecurrence testWeeklyWithExceptionRecurrence.js:107 5

So that has nothing to do with bug 1229684. Or does it?

Comment 15

7 months ago
Comment on attachment 8833518 [details] [diff] [review]
hardenTests.diff

Review of attachment 8833518 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, r=me with the below comment considered.

::: calendar/test/mozmill/cal-recurrence/testAnnualRecurrence.js
@@ +58,4 @@
>  
>          // day view
>          switchToView(controller, "day");
> +        controller.waitForElement(lookupEventBox("day", ALLDAY, null, 1, null, EVENTPATH));

Can you please make this

controller.waitForElement(
    lookupEventBox("day", ALLDAY, null, 1, null, EVENTPATH)
);

This applies also for the other three changes in cal-recurrence/testAnnualRecurrence.js.
Attachment #8833518 - Flags: review?(makemyday) → review+

Comment 16

7 months ago
(In reply to Jorg K (GMT+1) from comment #14)
> So that has nothing to do with bug 1229684. Or does it?

The current patch is intended for hardening the tests as proposed in bug 1332708 but not making the passing on Linux32 instantly. As mentioned in comment 13, fixing the failing mozmill tests for that platform also depends on resolving the datetime formatter issue by landing bug 1229684.

Setting the until date for the recurring event doesn't seem to result in the expceted recurrences which lets fail testing for the same here (which is what the quoted log message is about).
(Assignee)

Comment 17

7 months ago
Style nits fixed, pushed to:
-> https://hg.mozilla.org/comm-central/rev/56625e348834cc35cb8c8333efca97fc77ed113b

Leaving the bug open for a moment, to see if the intermittent failures on linux64 cease.

As laid out in comment #16 & comment #13, te linux32 failures are not due to faulty tests, but working tests failing due to a real bug.
Target Milestone: --- → 5.3

Comment 18

7 months ago
https://hg.mozilla.org/comm-central/rev/25eb5fc185536e83d8d54a019ba77a7427d9f0c6 (backout)
https://hg.mozilla.org/comm-central/rev/2537a3a08ebd52eaf485d95fe693a1c50bd3f3ac (relanded)
Please make sure you land patches with the correct bug number. Also please adhere to the standard format of the commit message.
Target Milestone: 5.3 → 5.6
(Assignee)

Comment 19

7 months ago
@Jorg K: sorry for that, will keep an eye on that in the future.

We have one more failure at a different spot on linux64 on tryserver:
https://treeherder.mozilla.org/logviewer.html#?job_id=75030159&repo=try-comm-central&lineNumber=5878

Here, it looks that for some reason, the recurrence is not set correctly.
(Assignee)

Comment 20

7 months ago
Created attachment 8835032 [details]
index.png

This is a screenshot taken by Mozmills screenshot function at the time of test failure.

The Date should have been 2008-03-31 and there an occurrence is expected.

It looks like the function gotToDate of the test-calendar-utils has a timing problem here.

If you assume that the month march in the minimonth is clicked and the code to click on the month is executed before the display updates, you will land on 2008-03-02.
(Assignee)

Comment 21

7 months ago
Now we have two more failures on linux32debug:

06:37:16     INFO -    EXCEPTION: waitFor: Timeout exceeded for '() => {
06:37:16     INFO -          return iframe.contentWindow.onLoad &&
06:37:16     INFO -                 iframe.contentWindow.onLoad.hasLoaded == true;
06:37:16     INFO -      }'

06:37:16     INFO -    EXCEPTION: waitFor: Timeout exceeded for '() => mozmill.utils.getWindows("Calendar:EventDialog").length == 0'

The first is while waiting for the event-dialog to open, the second is while witing for it to close- which is mainly because the preceding test stopped and left the dialog open.

It seems, that the linux debug builds are the the ones having least performance. So maybe we should raise the timeout - although it seems that the standard 5 secs should be well enough...
(Assignee)

Comment 22

7 months ago
Created attachment 8835360 [details] [diff] [review]
More fixes

This patch addresses the above 2 issues.

The minimonth has an event if the month changed and we can check for this to ensure the day-matrix is updated.

For the opening of the event-dialog I increased the timeout.

Successful Test-run is here:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=679a8fcd3a3ce8c23c36c90fde3a98b66493d63e

And to be sure I started another one:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=873c200a502ed469d4262a4e97f679135e3dfc56
Attachment #8835360 - Flags: review?(makemyday)

Updated

7 months ago
Whiteboard: [Thunderbird-testfailure: Z Linux]

Comment 23

7 months ago
Comment on attachment 8835360 [details] [diff] [review]
More fixes

Review of attachment 8835360 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks for keeping traction.
Attachment #8835360 - Flags: review?(makemyday) → review+
(Assignee)

Comment 24

7 months ago
pushed to:
-> https://hg.mozilla.org/comm-central/rev/8887137884e779ea23f3f5cb711c9d27b8949c4d

Comment 25

7 months ago
May I ask of this is done now?

You may have noticed the whiteboard entry and the query on the treeherder status (my new toys):
https://mzl.la/2lrQTHC

So this bug is one of the ones that will make the tree greener, I hope.
More tests are still coming in bug 983791 where I saw a huge patch on try.
(Assignee)

Comment 26

7 months ago
I left this open because I want to watch Treeherder if there are any more failures not covered by these two patches.
But we can also close it and open a new bug if need be.

Bug 983791 could land soon, if we really want all the recurrences to be tested in rotated view - I asked Philipp if we keep it this way.
(Assignee)

Comment 27

6 months ago
We have now 1 failure in the last 4 days:
https://treeherder.mozilla.org/logviewer.html#?job_id=76808242&repo=comm-central&lineNumber=2714

With this rate of failure, it is hard to track this behaviour down.

Is there any way to put some permanent debug code in the test - preferrably only on tryserver.
This way I could put some things in the log to help understand what is happening.
(Assignee)

Comment 28

6 months ago
12:34:58     INFO -  SUMMARY-UNEXPECTED-FAIL | testWeeklyNRecurrence.js | testWeeklyNRecurrence.js::testWeeklyNRecurrence
12:34:58     INFO -    EXCEPTION: Expression "{"value":"2009"}" returned null. Anonymous == false
12:34:58     INFO -      at: elementslib.js line 486
12:34:58     INFO -         reduceLookup elementslib.js:486 15
12:34:58     INFO -         Lookup.prototype.getNode elementslib.js:501 10
12:34:58     INFO -         MozMillController.prototype.click controller.js:527 17
12:34:58     INFO -         goToDate test-calendar-utils.js:257 9
12:34:58     INFO -         testWeeklyNRecurrence testWeeklyNRecurrence.js:52 5
12:34:58     INFO -         Runner.prototype.wrapper frame.js:585 9
12:34:58     INFO -         Runner.prototype._runTestModule frame.js:655 9
12:34:58     INFO -         Runner.prototype.runTestModule frame.js:701 3
12:34:58     INFO -         Runner.prototype.runTestDirectory frame.js:525 7
12:34:58     INFO -         runTestDirectory frame.js:707 3
12:34:58     INFO -         Bridge.prototype._execFunction server.js:179 10
12:34:58     INFO -         Bridge.prototype.execFunction server.js:183 16
12:34:58     INFO -         Session.prototype.receive server.js:283 3
12:34:58     INFO -         AsyncRead.prototype.onDataAvailable server.js:88 3

This is the error we still get on the linux debug builds from time to time.

It looks like the flaw is in the goToDate function of the calendar-utils.
I will have a look how to fix this when I have time.
Priority: -- → P2
(Assignee)

Comment 29

4 months ago
Created attachment 8860034 [details] [diff] [review]
fixGoToDate V1

This is another patch addressing the
    EXCEPTION: Expression "{"value":"2009"}" returned null. Anonymous == false
issue by replacing the generic timeout by a test that the value is in view.

The positive side-effect is that this speeds up the function on fast systems/builds.

Also the timeout for the monthchange-event to fire is raised, since we had timouts for this on the debug-builds.

I will start a try run later.
Attachment #8860034 - Flags: review?(makemyday)
(Assignee)

Comment 30

4 months ago
Here ist the try run:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=ade81ade417736af6320d6a9038644cc06b9e00b

unfortunately, there is still a timeout failure in the monthchange event.
It looks, like there is a case, when the event does not fire. I will investigate further.
(Assignee)

Comment 31

4 months ago
It looks, like the monthchange really takes a long time on some occasions, as some debug code revealed:
>07:21:04     INFO -  console.log: Change to 5-1-2009
>07:21:13     INFO -  console.log: Debugstring: select year monthchange took 4.924s
>07:21:14     INFO -  console.log: Debugstring: select month monthchange took 0.616s

on another run even:
>07:52:13     INFO -  console.log: Change to 1-1-1970
>07:52:18     INFO -  --DOCSHELL 0x7f3095371800 == 14 [pid = 2320] [id = {77c12a9c-16de-43a0-a6f6-d37ec947d10c}]
>07:52:31     INFO -  console.log: Debugstring: select year monthchange took 12.652s

Most interestingly the test didn't stop although the timeout was set to 5s only.
The log-times also don't reflect the measured times - so maybe this is not too accurate...

I will set the Timeout to 10s now to be sure.

I thought about doing this only for debug builds, but I don't see any bad side-effects in raising the timeout for the other builds as well, so I will skip this code...
(Assignee)

Comment 32

4 months ago
Created attachment 8862575 [details] [diff] [review]
fixGoToDate V2

The patch as per previous comment
Attachment #8860034 - Attachment is obsolete: true
Attachment #8860034 - Flags: review?(makemyday)
Attachment #8862575 - Flags: review?(makemyday)
(Assignee)

Comment 33

4 months ago
And a tryrun:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=c83a8e03fe49f066ef60466420de4d473cb78c13

Comment 34

4 months ago
Comment on attachment 8862575 [details] [diff] [review]
fixGoToDate V2

Review of attachment 8862575 [details] [diff] [review]:
-----------------------------------------------------------------

I leave the length of the timeout to your discretion, r+ with the below comments considered.

::: calendar/test/mozmill/shared-modules/test-calendar-utils.js
@@ +246,4 @@
>              anon({"anonid":"yearcell"})
>          `));
>  
> +        function getYearListitem (aYear) {

Make this
 
let getYearListitem = function(aYear) {
 
please or move this function to the top of goToDate before you define any variable to avoid strict warnings.

@@ +251,5 @@
> +                ${miniMonth}/anon({"anonid":"minimonth-header"})/
> +                anon({"anonid":"minmonth-popupset"})/anon({"anonid":"years-popup"})/
> +                {"label":"${aYear}"}
> +            `)
> +        }

And this };
Attachment #8862575 - Flags: review?(makemyday) → review+
(Assignee)

Comment 35

4 months ago
With the above fixed pushed to comm-central:
-> https://hg.mozilla.org/comm-central/rev/9c284993401409fa2c6af5a0430197980b0e69bf

This should fix this bug now.
Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
Resolution: --- → FIXED
Target Milestone: 5.6 → 5.7
You need to log in before you can comment on or make changes to this bug.