Closed Bug 451287 Opened 16 years ago Closed 2 years ago

[meta] Mochitest Counts are off and erratic - all platforms, both on failing and successful builds

Categories

(Testing :: Mochitest, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED INACTIVE

People

(Reporter: lsblakk, Unassigned)

References

(Depends on 2 open bugs)

Details

(Keywords: meta, Whiteboard: [Waiting for dependencies])

Attachments

(2 files)

The mochitest numbers are all over the place. When a single changeset is built across all the slaves for a given platform, the counts can vary by anywhere from 4 to 30+ tests.

I will follow up in comments with several examples across all platforms. Mac has only one buildslave on Mozilla-Central at the moment, so there is less data for it, but I consider this an issue for all platforms because the Mac numbers certainly fluctuate too.
Example #1:

Linux Slaves:

first run of changeset a56849e7b07a 
followed by a second run of the _same_ changeset

slave 1        |  slave 2

63826/0/1644   |  63849/0/1644
63826/0/1644   |  63826/0/1644

Builds started at 1:32 am on 8/19 and end at 4:12 am on 8/19
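
For anyone who wants to quantify the spread rather than eyeball it, here is a minimal sketch. It assumes the triples are pass/fail/todo counts (the comments here do not spell that out), and the helper names `parse_counts` and `pass_spread` are made up for illustration.

```python
def parse_counts(triple):
    """Split a 'pass/fail/todo' string into a tuple of three integers."""
    passed, failed, todo = (int(part) for part in triple.split("/"))
    return passed, failed, todo


def pass_spread(triples):
    """Return the difference between the largest and smallest pass count."""
    passes = [parse_counts(t)[0] for t in triples]
    return max(passes) - min(passes)


# The four Linux runs of changeset a56849e7b07a quoted above.
linux_runs = ["63826/0/1644", "63849/0/1644", "63826/0/1644", "63826/0/1644"]
print(pass_spread(linux_runs))
```

A spread of zero across runs of the same changeset is what one would expect if the counts were stable.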
On the Mac slave, for the same changeset as in comment #3 (a56849e7b07a), the mochitest count dipped by 3653:

60237/54/1648

where it is usually around 63950.
On Windows slaves:

An example of a green run:

changeset: cb6eede58eea

slave 1        |  slave 2        |  slave 3

64183/0/1646   |  64216/0/1646   |  64174/0/1646

On Windows slaves:

An example of an orange run:

changeset: 300311085278

slave 1        |  slave 2        |  slave 3

64218/8/1644   |  64209/8/1644   |  64222/12/1644
An example where there was no change apart from removing an indent in a line of code:

(I think this is on Mac, but tinderbox is not responding right now so I can't double-check)

All builds before and after this build had a count of 63826; then the indent change was checked in and the results were:

63849/0/1644

changeset: 3d7ff51f7a4a

build on 8/18 at 16:38
Do you think you could use your test-log-parsing tool to figure out a diff of what tests were run in these sets of builds?
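
To make the request concrete, here is a rough sketch of the kind of diff meant here. It is not the actual test-log-parsing tool; it assumes each log has already been reduced to one "test path | message" line per check, and `diff_test_lines` is a hypothetical helper. It compares the two runs as multisets, so a check that is logged twice in one run and once in the other still shows up.

```python
from collections import Counter


def diff_test_lines(lines_a, lines_b):
    """Return (only_in_a, only_in_b) as Counters of surplus lines."""
    count_a = Counter(line.strip() for line in lines_a if line.strip())
    count_b = Counter(line.strip() for line in lines_b if line.strip())
    # Counter subtraction keeps only positive surpluses.
    return count_a - count_b, count_b - count_a


# Made-up log lines, for illustration only.
run_a = ["/tests/a.html | check one", "/tests/a.html | check two"]
run_b = [
    "/tests/a.html | check one",
    "/tests/a.html | check two",
    "/tests/a.html | check two",
    "/tests/b.html | check three",
]
only_a, only_b = diff_test_lines(run_a, run_b)
for line, extra in sorted(only_b.items()):
    print(f"+{extra} {line}")
```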
Using the build for changeset 30da9cae7cf2

http://avnerd.tv/sharedFiles/Mozilla/diff.txt is a diff between: 

build no. 444 on qm-centos5-moz2-01 which started at 8:37am on 8/20 and was a successful run - with mochitest count of 63861/0/1642

and

build no. 546 on qm-centos5-03 which started at 9:10am on 8/20 and was a successful run - with mochitest count of 63883/0/1642
Using the build for changeset 30da9cae7cf2

http://avnerd.tv/sharedFiles/Mozilla/diff545.txt is a diff between: 

build no. 444 on qm-centos5-moz2-01 which started at 8:37am on 8/20 and was a
successful run - with mochitest count of 63861/0/1642

and

build no. 545 on qm-centos5-03 which started at 8:15am on 8/20 and was a
successful run - with mochitest count of 63883/0/1642
This example is a Linux run that went orange - changeset 7b75ed52358c

(544) qm-centos5-02 started at 7:19am and counted 63867/0/1642

(443) qm-centos5-moz2-01 started at 6:49 am and counted 63861/1/1642

(the build failed because of 2 browserchrome tests)

http://avnerd.tv/sharedFiles/Mozilla/diff_544_nojunk.txt 
So, on the Windows side of things, 3 machines were building the same changeset over and over (at least 3 times apiece) from 7:49 am to 13:28 on 8/20.

The number of unit tests run fluctuated on each individual machine as well as across the three machines.

qm-win2k3-03          qm-win2i3-moz2-01   qm-win2k3-unittest-hw
(574-576)             (489-492)           (533-536 - no 534 because it failed out)

64239/0/1644          64229/0/1644        64249/0/1644
64221/0/1644          64212/0/1644        64227/0/1644
64221/0/1644          64251/0/1644        64231/0/1644        

I diffed each run's results against build 489 as a yardstick; the diffs are as follows:

http://avnerd.tv/sharedFiles/Mozilla/diff_490.txt 
http://avnerd.tv/sharedFiles/Mozilla/diff_491.txt 
http://avnerd.tv/sharedFiles/Mozilla/diff_492.txt 
http://avnerd.tv/sharedFiles/Mozilla/diff_533.txt 
http://avnerd.tv/sharedFiles/Mozilla/diff_535.txt 
http://avnerd.tv/sharedFiles/Mozilla/diff_536.txt 
http://avnerd.tv/sharedFiles/Mozilla/diff_574.txt
http://avnerd.tv/sharedFiles/Mozilla/diff_575.txt
http://avnerd.tv/sharedFiles/Mozilla/diff_576.txt   

Adding to comment #13: qm-win2k3-unittest-hw had two runs that were orange, one with 4 mochitest failures and one with none:

64231/4/1644
64227/0/1644


and then one green run:

64249/0/1644

sorry for missing that in the previous comment
Anyone looking into this?  Still an issue - two builds with the same rev, different number of mochitests being run.
I think something is really confused here. For example, from your diff_490 above:
  /tests/content/base/test/test_NodeIterator_basics_filters.xhtml | basics backward - index 16:
+ /tests/content/base/test/test_NodeIterator_basics_filters.xhtml | basics backward - index 17:
  /tests/content/base/test/test_NodeIterator_basics_filters.xhtml | basics backward - index 18:

This test is run in a loop. This diff indicates that in build 489, the test didn't get run for index 17, but it did in build 490. That isn't actually possible if you look at the test:
http://mxr.mozilla.org/mozilla-central/source/content/base/test/test_NodeIterator_basics_filters.xhtml#26

So maybe some output is getting lost along the way, or something weird like that, but I don't believe for a second that that test didn't actually get run.
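
One way to check for dropped output is to look for holes in the loop indices that show up in a single log. This is only a sketch: it assumes the messages end in "index N:" as in the diff excerpt above, and `missing_indices` is a hypothetical helper name.

```python
import re

# Matches the "index N:" suffix seen in the diff excerpt above.
INDEX_RE = re.compile(r"index (\d+):")


def missing_indices(lines):
    """Return indices absent from the range spanned by the logged indices."""
    seen = set()
    for line in lines:
        match = INDEX_RE.search(line)
        if match:
            seen.add(int(match.group(1)))
    if not seen:
        return []
    return [i for i in range(min(seen), max(seen) + 1) if i not in seen]


# The two lines that build 489 apparently did log, per the diff above.
prefix = "/tests/content/base/test/test_NodeIterator_basics_filters.xhtml"
build_489 = [
    prefix + " | basics backward - index 16:",
    prefix + " | basics backward - index 18:",
]
print(missing_indices(build_489))
```

A hole in the middle of a loop that cannot skip iterations would point at lost log lines rather than at tests that did not run.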
Priority: -- → P1
I can reproduce locally: the success count done by the harness is off by +/- +650.
Assignee: nobody → sgautherie.bz
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.9.2a1
Version: unspecified → Trunk
Depends on: 483555
Depends on: 486247
Depends on: 486253
Depends on: 486256
(In reply to comment #17)
> I can reproduce locally: the success count done by the harness is off by +/-
> +650.

Tracking the cause(s) of this difference was a little painful, but not as much as I was worried it would be for a moment :-|

***

When the logs and counts (are fixed to) match, we'll see whether this bug remains or not.
Priority: P1 → --
Hardware: x86 → All
Whiteboard: [Waiting for dependecies]
Whiteboard: [Waiting for dependecies] → [Waiting for dependencies]
(In reply to comment #18)
> > +650.
> 
> Tracking the cause(s) of this difference was a little painful, but not as much
> I was worried for a moment :-|

PS:
That was for a Firefox (3.6a1pre) build.
Maybe SeaMonkey (2.0b1pre) has other cases too: to be checked in due time...
Depends on: 486781
Depends on: 492956
Depends on: 494397
Depends on: 505755
Depends on: 502646
Depends on: 558610
Depends on: 484994
Serge, what do you think the fix is here? I can't tell from the above comments.
(In reply to comment #20)
> Serge, what do you think the fix is here? I can't tell from the above comments.

Nothing "here": this bug is only a meta, as it turned out there are various causes of these misbehaviors :-/

NB: I used to try and push hard on these subjects a year ago, then I mostly dropped the matter for lack of (further) interest from involved people :-/
Keywords: meta
Target Milestone: mozilla1.9.2a1 → ---
Blocks: 558610
No longer depends on: 558610
Depends on: 621390
Depends on: 622070
Depends on: 652494
Depends on: 718239
Depends on: 759789
No longer depends on: 486247
Depends on: 677964
Depends on: 1032878
No longer depends on: 677964
Depends on: 1048775

The bug assignee hasn't logged in to Bugzilla in the last 7 months.
:ahal, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: bugzillamozillaorg_serge_20140323 → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(ahal)
Summary: Mochitest Counts are off and erratic - all platforms, both on failing and successful builds → [meta] Mochitest Counts are off and erratic - all platforms, both on failing and successful builds
Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(ahal)
Resolution: --- → INACTIVE