Closed Bug 1342828 Opened 7 years ago Closed 7 years ago

linux64 debug mozmill test always hit timeout

Categories

(Thunderbird :: Testing Infrastructure, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arai, Assigned: tomprince)

References

Details

Attachments

(5 files, 1 obsolete file)

> command timed out: 7200 seconds elapsed running ['/tools/buildbot/bin/python',
> 'scripts/scripts/desktop_unittest.py', '--cfg', 'unittests/linux_unittest.py',
> '--mozmill-suite', 'mozmill', '--cfg', 'unittests/thunderbird_extra.py',
> '--blob-upload-branch', 'try-comm-central', '--download-symbols', 'true'],
> attempting to kill

it need to be split up into 2 or more chunks, like bug 1248692 for xpcshell test.
Aleth, would you know how to do this?
Flags: needinfo?(aleth)
Blocks: 1340947
can it be separated into many smaller parts?
in that case it's easy to test try-only intermittent.
I think currently mozmill test result is not reliable for finding regression, because of this issue.
even if we compare the result between previous run, the set of executed tests can be different, and we'll overlook regression, or get confused by not-new-regression.
Severity: normal → major
Assignee: nobody → aleth
Flags: needinfo?(aleth)
I'm not sure if mozmill supports totalChunks, but let's try it and see. If it doesn't work we might have to increase the script_maxtime instead.
Attachment #8843727 - Flags: review?(jlund)
Sorry about the following unrelated comment:
On Linux debug the Xpcshell tests run as X1 and X2 on the treeherder. X1 seem to be TB tests whereas X2 are M-C tests. Could I have that on all platforms and also on opt builds.
(In reply to Jorg K (GMT+1) from comment #5)
> Sorry about the following unrelated comment:
> On Linux debug the Xpcshell tests run as X1 and X2 on the treeherder. X1
> seem to be TB tests whereas X2 are M-C tests. Could I have that on all
> platforms and also on opt builds.

That pattern is a coincidence, as you can see from the fact it's only approximately true (e.g. devtools/ in X1). The test order is more or less alphabetical.

I'm not sure what the resource costs would be of using two xpcshell chunks everywhere.
Comment on attachment 8843727 [details] [diff] [review]
Use two chunks for TB Linux debug mozmill tests

Review of attachment 8843727 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla-tests/thunderbird_config.py
@@ +166,5 @@
>  ]
> +MOZMILL_TWO_CHUNKS = [
> +    ('mozmill', {
> +        'use_mozharness': True,
> +        'script_path': 'scripts/desktop_unittest.py',

hm, I suppose if you wanted to test this on try you hack the desktop_unittest.py script in-tree to pretend as if totalChunks:2 was passed in the script args.
Attachment #8843727 - Flags: review?(jlund) → review+
What is the status of this bug?
It has a positive review but no chekin yet...
Flags: needinfo?(jlund)
Flags: needinfo?(aleth)
Sadly I haven't seen Aleth in weeks :-(
Or it just needs to land on *central? Can it break anything outside of c-c if it isn't right? In which repo does it go when the file is in mozilla-tests/ ?
(In reply to Markus Adrario [:Taraman] from comment #8)
> What is the status of this bug?
> It has a positive review but no chekin yet...

I'm not sure. It has been reviewed so it can land on buildbot-configs (outside gecko/central). Someone will have to land and be on point for any fallout.
Flags: needinfo?(jlund)
Philipp, could you please take a look at this and land it if you think it's right.
Flags: needinfo?(philipp)
Looks right to me. I'll take care to back out if there is fallout, just let me know if I miss it.

https://hg.mozilla.org/build/buildbot-configs/rev/555764ee0a26cda0164cb1d54fdb014b2afc6606
Flags: needinfo?(philipp)
Flags: needinfo?(aleth)
Blocks: 1342734
When is this change expected to take effect?
until now the problem remains...
Philipp, can you please back this out again, it seems to have destroyed Mozmill on Linux altogether, see treeherder C-C after Mon May 1, 2017, 22:15:53: ? ? B X1 X2.
Flags: needinfo?(philipp)
Backed out in https://hg.mozilla.org/build/buildbot-configs/rev/f3cb566ee39e4a42af2f7713053d73d98f11f7f6
Pending a merge to production and a reconfig
Flags: needinfo?(philipp)
Nick, could you please take care of the merge to production, etc.
Flags: needinfo?(nthomas)
Looks like it's in production now.
Flags: needinfo?(nthomas)
Tom, could you take a look at this one. We put this into production and had to back it out.
Flags: needinfo?(tom.prince)
It looks like the mozmill test runner (https://dxr.allizom.org/comm-central/source/mail/test/mozmill/runtestlist.py) doesn't have support for running in multiple chunks currently, unlike the xpcshell runner (https://dxr.allizom.org/comm-central/source/mozilla/testing/xpcshell/runxpcshelltests.py).

It looks like it would make sense to port runtestlist to use manifestparser, which can then take advantage manifestparser.chunk_by_slice. A quicker solution would be to just duplicate the slicing logic (https://dxr.allizom.org/comm-central/source/mozilla/testing/mozbase/manifestparser/manifestparser/filters.py#170-194) from there.
I looked into using manifestparser, but it has more expectations on test layout and test running, so I just implemented chunking in the mozmill runner.

It does look like mozmill has some support for manifestparser (although it doesn't expose filters). It might be worthwhile exploring migrating to using that, but this unblocks chunking the tests pending that work.
Flags: needinfo?(tom.prince)
Comment on attachment 8876459 [details] [diff] [review]
Support chunks in mozmill tests.

Looks good to me, thanks for figuring it out! I agree that this is the simpler solution, especially given mozmill is only still used in Thunderbird.

Let's get this patch pushed then retry the buildbot changes. I wonder if the question mark that showed up on treeherder was just a side-effect of the chunk logic missing, or if more work is needed.
Attachment #8876459 - Flags: review+
Assignee: aleth → tom.prince
Keywords: checkin-needed
(In reply to Philipp Kewisch [:Fallen] from comment #22)
> Let's get this patch pushed then retry the buildbot changes.
I can certainly land this in the next push.

Can someone please enlighten me what the Buildbot changes are for. Are they needed as well? Philipp talks about a "simpler solution" presented in the second patch.
Oh, looks like the first patch requests chunks and the second patch makes them possible for C-C.
To see the treeherder question marks Philipp mentioned, please click here:
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=34153f64820674523909b789c68f58e8f0c1af01
> I wonder if the question mark that showed up on treeherder was just a side-effect of the chunk logic missing, or if more work is needed.

I had been going to guess that it was just a side effect, but looking at https://github.com/mozilla/treeherder/blob/5e7a78cbbd129b21164609a3d96353dc114ecde8/treeherder/etl/buildbot.py#L642 the trailing '$' in the regex might need to be removed.
Does this change mozmill itself? May the change get overwritten if we update mozmill? If so, the hunk should be somehow marked so it doesn't get lost as it will not be upstream.
No. The vendored copy of mozmill is at https://dxr.allizom.org/comm-central/source/mail/test/resources/mozmill . This is just the code that invokes mozmill. https://dxr.allizom.org/comm-central/source/mail/test/mozmill/runtest.py is a wrapper around mozmill which is invoked by https://dxr.allizom.org/comm-central/source/mail/test/mozmill/runtestlist.py which is what is being changed.
(In reply to Jorg K (GMT+2) from comment #23)
> (In reply to Philipp Kewisch [:Fallen] from comment #22)
> > Let's get this patch pushed then retry the buildbot changes.
> I can certainly land this in the next push.
> 
> Can someone please enlighten me what the Buildbot changes are for. Are they
> needed as well? Philipp talks about a "simpler solution" presented in the
> second patch.

So for checkin, please push the c-c patch. I will take care of the buildbot patch once it is in. With "simpler solution", I just agreed with Tom from comment 20 and 21 that just porting the chunking logic makese sense instead of porting mozmill to use the manifestparser.

Tom, if you are certain that the $ will fix it I think you can go ahead and send a PR for treeherder. Otherwise we can fix this when the builds are in as it seems to just be a cosmetic issue.
I know I make myself very unpopular by re-reviewing things before landing, but so be it.

Let's assume we have 100 tests and want two chunks. Let's also assume this_chunk is set to 1 and 2, although I can't see where this happens.

this_chunk=1:
+    tests_per_chunk = float(len(tests)) / options.total_chunks
+    start = int(round((options.this_chunk - 1) * tests_per_chunk))
+    end = int(round(options.this_chunk * tests_per_chunk))
We get: tests_per_chunk = 50.
start=0
end=50.

this_chunk=2:
We get: tests_per_chunk = 50.
start=50
end=100.

I think there is a -1 missing on the end calculation. Or am I missing something? Then I apologise in advance ;-)
Keywords: checkin-needed
> I know I make myself very unpopular by re-reviewing things before landing, but so be it.

Easier to fix things before they land than after.

> I think there is a -1 missing on the end calculation. Or am I missing something?

Python (as well as many other languages) uses half-open intervals to represent ranges. That is, a range includes the left endpoint but not the right endpoint. So tests[0:50] selects everything from 0-49 and tests[50:100] selects everything 50-99. They do this to make this kind of code easier to reason about; you know that you've covered a large interval by smaller ones if all the endpoints match, rather than if the endpoints differ by one, for example.

> Then I apologise in advance ;-)

If something is confusing for one person, it is liable to be confusing to others, so better to catch it and improve it, if necessary. I don't *think* that any explanation is needed in the code here, but it seems better to ask than to let something confusing get through.
(In reply to Tom Prince from comment #31)
> That is, a range includes the left endpoint but not the
> right endpoint. So tests[0:50] selects everything from 0-49 and
> tests[50:100] selects everything 50-99.
I thought you would answer that, I was going to look it up once I returned to my office, but you beat me to it.
This clearly shows my ignorance and the fact that we need a build/release engineer ;-)

> I don't *think* that any explanation is needed in the code here, ...
Well, if it's a language feature, it doesn't need a comment. Code is to be read by people capable of understanding it ;-)

I'll get it landed after the next M-C merge. As I explained elsewhere, M-C land so many changes that in order to see bustage early, I do a push after every M-C merge.
Keywords: checkin-needed
https://hg.mozilla.org/comm-central/rev/aa78e5d5dc26d9cda6e983589ccb98a7a94054be

Philipp, please take care of the buildbot part. Can we also please fix the treeherder, cosmetic or not ;-)
Flags: needinfo?(philipp)
Mozmill still working ;-)
Just as a matter of interest, where do you increment options.this_chunk?
this_chunk isn't every incremented. The buildbot configuration arranges to call the script twice (in the two different builds), once with --this-chunk 1 and once with --this-chunk 2. (See https://dxr.allizom.org/build-central/source/buildbotcustom/misc.py#665-686 for details)
Good to know, thanks. Any reason to use allizom?
https://dxr.mozilla.org/build-central/source/buildbotcustom/misc.py#665-686 works, too.
https://hg.mozilla.org/build/buildbot-configs/rev/4f9b97ebf944bb3fb0aff829e8450d41ffdc5821

I think buildbot-configs is merged to production and reconfigured at midnight, I always forget if this is automated or requires someone to manually intervene.
Flags: needinfo?(philipp)
The question marks are back, can we get them fixed, please.
Flags: needinfo?(tom.prince)
https://github.com/mozilla/treeherder/pull/2573 should fix the `?`s.
Flags: needinfo?(tom.prince)
It looks like one of the tests is still failing due to timeout. It looks like the number of test per folder is unevenly distributed causing the first chunk to take wildly longer. (The first chunk has over 600 tests and the second has ~250).
(In reply to Tom Prince from comment #39)
> https://github.com/mozilla/treeherder/pull/2573 should fix the `?`s.
It does, you can see it working on comm-beta right now. Thanks!
It looks like https://bugzilla.mozilla.org/attachment.cgi?id=8876459 should be landed comm-beta too, since the buildbot changes are causing the tests to not run there. (This might mean that this also needs to be applied to ESR releases?)
Comment on attachment 8876459 [details] [diff] [review]
Support chunks in mozmill tests.

I'll take care of it, thanks!
Attachment #8876459 - Flags: approval-comm-esr52+
Attachment #8876459 - Flags: approval-comm-beta+
Attachment #8880478 - Flags: review?(jlund)
Comment on attachment 8880478 [details] [diff] [review]
Support chunked mozmill tests in try syntax.

Review of attachment 8880478 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm. do we need to update trychooser front end to add this option?
Attachment #8880478 - Flags: review?(jlund) → review+
https://hg.mozilla.org/build/buildbotcustom/rev/e32f42e43b34782fde995aff4477c5cbc5d2d967

(In reply to Jordan Lund (:jlund) from comment #48)
> do we need to update trychooser front end to add this option?
Maybe not. Mozmill is not used in Firefox and our TB folks just put |-u mozmill|. That just stopped working due to the chunking so now it will work again.

Are we done here or will we still address comment #42? Tom?
Looking at the latest treeherder push on comm-central, the first mozmill test run is still timing out, so we'll need to something to address that.

An easy workaround would be to up the chunks, but given that the first chunk takes >120m and the second just 45, there should be a better way of balancing the tests.
I'd go to three chunks and be done with it ;-)
Nick, sorry for the hassle, did the push from comment #49 get merged to production?
Flags: needinfo?(nthomas)
Sorry, I didn't know what to look for, now I understand.
Tom, can we do a bit more here, since both Xpcshell and Mozmill are timing out, for example here:
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=0170000376576d23e86bd8faa3e51c4e659f85c2
Flags: needinfo?(mozilla)
Attachment #8887142 - Attachment is obsolete: true
Comment on attachment 8887143 [details]
Bug 1342828 - Use many chunks in thunderbird debug tests.

https://reviewboard.mozilla.org/r/157892/#review162994

::: mozilla-tests/thunderbird_config.py:171
(Diff revision 1)
>      ('mozmill', {
>          'use_mozharness': True,
>          'script_path': 'scripts/desktop_unittest.py',
>          'extra_args': ['--mozmill-suite', 'mozmill',
>                         '--cfg', 'unittests/thunderbird_extra.py',
>                         '--opt-cfg', 'unittests/thunderbird_buildbot.py'],

This depends on https://reviewboard.mozilla.org/r/155746/ for the trivial reason that they overlap adding this line.
Comment on attachment 8887143 [details]
Bug 1342828 - Use many chunks in thunderbird debug tests.

https://reviewboard.mozilla.org/r/157894/#review162998

This is probably overkill, but it seems easier to split things up more, and increase the timeouts to figure out what is going on, and then make the bounds tighter later.
Great!
Comment on attachment 8887143 [details]
Bug 1342828 - Use many chunks in thunderbird debug tests.

https://reviewboard.mozilla.org/r/157894/#review163254
Attachment #8887143 - Flags: review+
If you would like to dig into the timeouts and running them locally is an issue then I suggest getting a loaner, see https://wiki.mozilla.org/ReleaseEngineering/How_To/Request_a_loaner.
I think it's time to close this bug and open a new one for follow-up work.

In
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=25b49e872ac53975737e857be31f5c676c586007
there are no more timeouts, the slowest tests are X4 and Z1:

Linux 32bit: X4: 110 min, Z1: 113 min
Linux 64bit: X4:  85 min, Z1:  88 min.

With 'script_maxtime': 3600*3 which is 180 min we should be good for a while, right?
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(mozilla)
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: