Closed Bug 1011292 Opened 10 years ago Closed 10 years ago

change how we chunk in desktop_unittest.py

Categories

(Release Engineering :: General, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mozilla, Assigned: mozilla)

References

Details

(Whiteboard: [mozharness])

Attachments

(6 files, 3 obsolete files)

Currently our suite name, e.g. plain5, browser-chrome-3, etc. determines our chunking.

I'd like to separate that out into plain and browser-chrome (or, for browser-chrome, we might need a separate browser-chrome-chunked since we have a 'browser-chrome' that's unchunked for sanity or something) and also pass a --this-chunk X --total-chunks Y that gets passed through to the test harness.

This doesn't do anything by itself, but if we then get a buildbot step to translate properties to --this-chunk X --total-chunks Y command line args, I think we can reduce our test builder count by 5043.

If we do that without touching how we call desktop_unittest.py, I think we reduce our builder count by 543.  Since this is a factor of almost 10, this bug seems worth fixing.
This might have far reaching issues in terms of queue collapsing and such, since we'll effectively have the same builder triggered N times per job, and we actually want -all- N to run, but be able to queue collapse certain ones with job Y+1.

I don't forsee how we can make that happen as this is proposed without eliminating all collapsing at present.
(In reply to Justin Wood (:Callek) from comment #1)
> This might have far reaching issues in terms of queue collapsing and such,
> since we'll effectively have the same builder triggered N times per job, and
> we actually want -all- N to run, but be able to queue collapse certain ones
> with job Y+1.
> 
> I don't forsee how we can make that happen as this is proposed without
> eliminating all collapsing at present.

Already resolved in #releng, and not actually part of this bug, since this bug doesn't change the number of builders, just changes where the chunking information is set.
Assignee: nobody → aki
This actually makes mozilla-tests/config.py a lot prettier.  misc.py, however, ... we'll see.
Attached patch buildbot-configs changes (obsolete) — Splinter Review
s,total_chunks,totalChunks,g due to generateTestBuilderNames()
http://hg.mozilla.org/build/buildbotcustom/file/9b87707fc148/misc.py#l228

This passes checkconfig with my buildbotcustom patch!
Next, massive dump_masters diff, then some spot testing to make sure the chunking goes through to the test harness with the mozharness patch.
Attachment #8423529 - Attachment is obsolete: true
I *think* this is resulting in a good dump_masters.sh diff.
I'm going to eyeball it some more and try to get a character diff rather than a line diff.  Tried using github but it gacked on the size of the files.
Attachment #8423582 - Attachment is obsolete: true
(In reply to Justin Wood (:Callek) from comment #1)
> This might have far reaching issues in terms of queue collapsing and such,
> since we'll effectively have the same builder triggered N times per job, and
> we actually want -all- N to run, but be able to queue collapse certain ones
> with job Y+1.
> 
> I don't forsee how we can make that happen as this is proposed without
> eliminating all collapsing at present.

I think we can handle this by changing mergeRequests to look at these properties and to refuse to merge things together that are for different jobs.
Attached patch dump-mastersSplinter Review
Switch dump_masters.sh to, by default, only dump the limited set used in ./test-masters.sh.

Asking for review on a braindump script because I'm not sure if there are reasons not do this.
Attachment #8424201 - Flags: review?(catlee)
Comment on attachment 8423643 [details] [diff] [review]
buildbotcustom changes

I just had a snow leopard test box go through a couple non-chunked suites green, and launched browser-chrome-3 with the right chunking args (though the chunk args now happen earlier in the command line).

http://dev-master1.build.mozilla.org:8053/buildslaves/talos-r4-snow-003
Attachment #8423643 - Attachment description: [needs testing] buildbotcustom changes → buildbotcustom changes
Attachment #8423643 - Flags: review?(catlee)
Comment on attachment 8423580 [details] [diff] [review]
buildbot-configs changes

On second thought, maybe Jordan.

I'm going to do a full Cypress run when these land.
Attachment #8423580 - Attachment description: [needs testing] buildbot-configs changes → buildbot-configs changes
Attachment #8423580 - Flags: review?(jlund)
Attachment #8423643 - Flags: review?(catlee) → review?(jlund)
Attachment #8423535 - Attachment description: [needs testing] mozharness changes → mozharness changes
Attachment #8423535 - Flags: review?(jlund)
I will try to look at this tomorrow so I don't block but, as it's a Canadian holiday, I will be away for much of it and can't promise to have review before tues.
(In reply to Jordan Lund (:jlund) from comment #12)
> I will try to look at this tomorrow so I don't block but, as it's a Canadian
> holiday, I will be away for much of it and can't promise to have review
> before tues.

Nope, I realized you were in Canada when I requested review.  Take your holiday.
Comment on attachment 8423643 [details] [diff] [review]
buildbotcustom changes

ahh, misc.py ... it's always a pleasure reviewing you.

I think this lgtm. as you mentioned, we should do a full cypress run-through. FWIW, I dump_master'd this without the buildbot-config patch and it yielded a no-op which is what I expected. The dump_master with the bbot-cfg patch *looks* like it should be changing what it should change.
Attachment #8423643 - Flags: review?(jlund) → review+
Comment on attachment 8423580 [details] [diff] [review]
buildbot-configs changes

Review of attachment 8423580 [details] [diff] [review]:
-----------------------------------------------------------------

Overall, lgtm. <3 how it cleans up this file dramatically. 1 worry explained inline below

::: mozilla-tests/config.py
@@ -2,5 @@
>  
>  import config_common
>  reload(config_common)
>  from config_common import loadDefaultValues, loadCustomTalosSuites, \
> -    nested_haskey, get_talos_slave_platforms, delete_slave_platform

nice

@@ +354,5 @@
>      }),
>  ]
>  
>  MOCHITEST_DT_3 = [
> +    ('mochitest-devtools-chrome', {

So, IIUC, this may end up as a slight confusion or worse, biting us later down the road.

possible confusion:
1) having this value the same as here[1]. By purely looking at the buildername, it might make it hard to tell if the build we are doing is a chunked build or not. 'mochitest-browser-chrome[2] is a similar case.
2) I am not sure how we will go about differentiating all of this in TBPL. For the same reason mentioned above about identifying when we are doing a chunked builder or a non-chunked one. But also more than that, how do we tell one chunked builder against another chunked builder?

possible bite:
1) I think the reason this doesn't yield duplicate buildernames is because we, for now, have proportioned correctly where we use vars like MOCHITEST_DT and MOCHITEST_DT_3 and they never overlap on the same branch/gecko version. But we could end up in a situation where we add 'mochitest-devtools-chrome' twice. You could argue that would be a mistake doing so and should be caught by checkconfig.
2) IIUC the chunk and the non-chunk equivalent would share the same suite_config[3] because again, they share the same name. The chances of this hurting us down the line is hard to imagine but it is worth noting since it breaks the convention of explicitly setting a test suite with a test suite config.


maybe we need to do things like 'mochitest-devtools-chrome-chunked'. Though that will not give us quite the same decrease in builders as 5043 :). It also won't help us differentiate chunks in TBPL.

I realize this might have all been priorly discussed so apologies in advance if that is the case. r+ if this can be explained to me.

[1] http://hg.mozilla.org/build/buildbot-configs/file/b9e31e3ee685/mozilla-tests/config.py#l400
[2] http://hg.mozilla.org/build/buildbot-configs/file/b9e31e3ee685/mozilla-tests/config.py#l342
[3] http://hg.mozilla.org/build/buildbot-configs/file/b9e31e3ee685/mozilla-tests/config.py#l638
Comment on attachment 8423535 [details] [diff] [review]
mozharness changes

Review of attachment 8423535 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm,

1 nit inline. I'll r+ if I'm wrong.

::: configs/unittests/linux_unittest.py
@@ +101,5 @@
>          "cppunittest": ['tests/cppunittests']
>      },
>      "all_jittest_suites": {
>          "jittest": [],
> +        "jittest-chunked": [],

I don't see where in buildbot-configs where we consolidate jittest. Are we wanting to do that for these too?
Attachment #8423535 - Flags: review?(jlund) → review-
Attachment #8423580 - Attachment is obsolete: true
Attachment #8423580 - Flags: review?(jlund)
Attachment #8425643 - Flags: review?(jlund)
Comment on attachment 8423535 [details] [diff] [review]
mozharness changes

Back to r? after fixing the configs patch.
Attachment #8423535 - Flags: review- → review?(jlund)
(In reply to Jordan Lund (:jlund) from comment #15)
> Comment on attachment 8423580 [details] [diff] [review]
> buildbot-configs changes
> 
> Review of attachment 8423580 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Overall, lgtm. <3 how it cleans up this file dramatically. 1 worry explained
> inline below
> 
> ::: mozilla-tests/config.py
> @@ -2,5 @@
> >  
> >  import config_common
> >  reload(config_common)
> >  from config_common import loadDefaultValues, loadCustomTalosSuites, \
> > -    nested_haskey, get_talos_slave_platforms, delete_slave_platform
> 
> nice
> 
> @@ +354,5 @@
> >      }),
> >  ]
> >  
> >  MOCHITEST_DT_3 = [
> > +    ('mochitest-devtools-chrome', {
> 
> So, IIUC, this may end up as a slight confusion or worse, biting us later
> down the road.
> 
> possible confusion:
> 1) having this value the same as here[1]. By purely looking at the
> buildername, it might make it hard to tell if the build we are doing is a
> chunked build or not. 'mochitest-browser-chrome[2] is a similar case.

Buildernames are different, per my misc.py patch:
+        kwargs_copy['suites_name'] = '%s-%d' % (kwargs_copy['suites_name'], i)

> 2) I am not sure how we will go about differentiating all of this in TBPL.
> For the same reason mentioned above about identifying when we are doing a
> chunked builder or a non-chunked one. But also more than that, how do we
> tell one chunked builder against another chunked builder?

That's not a problem as of this bug.  Chunked buildernames continue to have the -CHUNKNUM appended in misc.py.  It'll be a problem if/when we start combining these into the same builder.

> possible bite:
> 1) I think the reason this doesn't yield duplicate buildernames is because
> we, for now, have proportioned correctly where we use vars like MOCHITEST_DT
> and MOCHITEST_DT_3 and they never overlap on the same branch/gecko version.
> But we could end up in a situation where we add 'mochitest-devtools-chrome'
> twice. You could argue that would be a mistake doing so and should be caught
> by checkconfig.

See above, we can still do this since the buildernames don't overlap atm.

> 2) IIUC the chunk and the non-chunk equivalent would share the same
> suite_config[3] because again, they share the same name. The chances of this
> hurting us down the line is hard to imagine but it is worth noting since it
> breaks the convention of explicitly setting a test suite with a test suite
> config.

I don't see this being a problem right now; the problem right now is we have way too many duplicate suite_config definitions with no exceptions, which means this was a YAGNI optimization so far.

But yes, if we do run into wanting to have a different suite_config for a chunked vs non-chunked suite, I think the solution is to rename the suite at that point.

Thanks for catching this, but I *think* we're ok here.  Do you agree?

> maybe we need to do things like 'mochitest-devtools-chrome-chunked'. Though
> that will not give us quite the same decrease in builders as 5043 :). It
> also won't help us differentiate chunks in TBPL.
> 
> I realize this might have all been priorly discussed so apologies in advance
> if that is the case. r+ if this can be explained to me.
> 
> [1]
> http://hg.mozilla.org/build/buildbot-configs/file/b9e31e3ee685/mozilla-tests/
> config.py#l400
> [2]
> http://hg.mozilla.org/build/buildbot-configs/file/b9e31e3ee685/mozilla-tests/
> config.py#l342
> [3]
> http://hg.mozilla.org/build/buildbot-configs/file/b9e31e3ee685/mozilla-tests/
> config.py#l638
Comment on attachment 8425643 [details] [diff] [review]
chunk-configs3 - add jittest

for reasons explained in https://bugzilla.mozilla.org/show_bug.cgi?id=1011292#c20, my concerns were invalid.

this looks great! So neat to see such a clean up achieve the same thing.
Attachment #8425643 - Flags: review?(jlund) → review+
Attachment #8423535 - Flags: review?(jlund) → review+
This is so we can pre-land the mozharness change before the reconfig, and have it work until the final test master finishes reconfiging.
Comment on attachment 8425832 [details] [diff] [review]
chunk-mozharness2 - keep existing suite configs

Review of attachment 8425832 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with "browser-chromechunked3" fix
Attachment #8425832 - Flags: review+
Comment on attachment 8425832 [details] [diff] [review]
chunk-mozharness2 - keep existing suite configs

https://hg.mozilla.org/build/mozharness/rev/9d56c8ec1f5d
Attachment #8425832 - Flags: checked-in+
Cypress push: https://tbpl.mozilla.org/?tree=Cypress&rev=9d8d16695f6a
I'd like to see this go green.

I think

_ cypress green
_ merge+reconfig
_ verify nothing's busted
_ remove the suite# lines in the mozharness unittest configs, like https://bugzilla.mozilla.org/attachment.cgi?id=8423535
_ reso fixed!
Attachment #8424201 - Flags: review?(catlee) → review+
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Something here went live today
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: