Closed Bug 1142050 Opened 10 years ago Closed 10 years ago

Add ability to chunk by runtime to mochitest

Categories

(Testing :: Mochitest, defect)

defect
Not set
normal

Tracking

(firefox40 fixed)

RESOLVED FIXED
mozilla40
Tracking Status
firefox40 --- fixed

People

(Reporter: ahal, Assigned: ahal)

References

(Blocks 1 open bug)

Details

Attachments

(5 files, 1 obsolete file)

No description provided.
Assignee: nobody → ahalberstadt
Status: NEW → ASSIGNED
Attached file MozReview Request: bz://1142050/ahal (obsolete) —
/r/5223 - Bug 1142050 - Add initial test runtimes files for mochitest, r=jmaher Pull down this commit: hg pull review -r 03e8eba63a9dc518cdf4e2ef7714bc1075dbf26d
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal /r/5223 - Bug 1142050 - Add initial test runtimes files for mochitest, r=jmaher Pull down this commit: hg pull review -r 03e8eba63a9dc518cdf4e2ef7714bc1075dbf26d
Attachment #8576256 - Flags: review?(jmaher)
Attachment #8576256 - Flags: review?(jmaher)
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal https://reviewboard.mozilla.org/r/5221/#review4229 overall this looks great. I was under the impression that we had a common multiplier for all platforms for a given test between opt/debug? also how does pgo, e10s, ASAN fit into this?
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal Oh oops, I uploaded this to the wrong bug.. I'll paste your comment over to bug 1139904 and reply there.
Attachment #8576256 - Attachment is obsolete: true
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal /r/5223 - Bug 1142050 - Add buildprops platform and buildtype guess to mozinfo.py, r=ted /r/5387 - Bug 1142050 - Add --chunk-by-runtime option to mochitest, r=jmaher Pull down these commits: hg pull review -r bad6ac0c7c797e41b128f270ef8f4223a9d97cfe
Attachment #8576256 - Attachment is obsolete: false
Attachment #8576256 - Flags: review?(ted)
Attachment #8576256 - Flags: review?(jmaher)
This patch is really ugly (especially the mozinfo.py part that ted is reviewing), but I don't know of any other way. The only alternative I can see is to not store data on a per-platform basis. The underlying issue is that we are relying on data that comes from varying jobs in automation. So we need to be able to correlate a local build to a buildbot job. It would be awesome if there were a better way to do this.
Btw, here's a try run with everything in action: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1526f3bd2c89 m-bc before: (32, 11, 27) m-bc after: (32, 22, 23) m-e10s-bc before: (29, 7, 20) m-e10s-bc after: (22, 15, 13) The results are better, but not perfect. Likely because it is only counting tests above the 90th percentile. One big improvement would be to calculate the average of all tests below the 90th percentile and use that as the default value for tests that don't show up in the runtimes files.
I am most concerned about bc1 (non-e10s), lets look at the total runtime: non-e10s: before: 32+11+27 = 70 minutes after: 32+22+23 = 77 minutes e10s: before: 29+7+20 = 56 minutes after: 22+15+13 = 50 minutes I would like to know if we are running the same number of tests. I suspect in non-e10s mode we are running more.
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal https://reviewboard.mozilla.org/r/5221/#review4353 ::: testing/mochitest/runtests.py (Diff revision 2) > + data_dir = os.path.join(SCRIPT_DIR, 'runtimes', '{}-{}'.format( can we validate that data_dir exists? ::: testing/mochitest/runtests.py (Diff revision 2) > + flavor = 'plain' we will need to be aware of gl, this will be a subsuite soon. ::: testing/mochitest/runtests.py (Diff revision 2) > + with open(runtime_file, 'r') as f: we assume this file exists- not a good idea- lets validate the file and print a good error message. I can see us standing up a new platform or something and having an odd error or timeout that we can't figure out. ::: python/mozbuild/mozbuild/mozinfo.py (Diff revision 2) > + if d['buildapp'] == 'mobile/android': can the buildapp checks be elif conditions? I would like to ensure that this 'guess' work is done in one place if possible. Do we have other instances of this in mochitest/reftest/xpcshell? If so, we should find a way to ensure the format matches and we can reduce our hacking/guessing duplication.
Attachment #8576256 - Flags: review?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #8) > I would like to know if we are running the same number of tests. I suspect > in non-e10s mode we are running more. For non-e10s I found the treeherder results for the parent changeset and tallied the total number of assertions in both the parent and my try push, and they added up to the same number. The numbers I posted here are what treeherder prints, which includes all of the buildbot/mozharness setup and downloading. It would be more accurate to compare the timestamps of the SUITE-START and SUITE-END log messages. The e10s case is what I'm more worried about, the assertion counts were off by about 4000 there. I'll investigate that in a bit, though I don't think that should block this patch from landing.
https://reviewboard.mozilla.org/r/5221/#review4355 I agree that this should really only live in one place. I think mozinfo.py is the correct place to put this because it will be available both when building locally and when running from a tests.zip. Putting it there is also test suite agnostic, so reftest/xpcshell/etc can make use of it as well.
yes, I agree mozinfo is the logical location for this. If you could file bugs for any other locations, that would be very beneficial!
https://reviewboard.mozilla.org/r/5221/#review4499 > we assume this file exists- not a good idea- lets validate the file and print a good error message. I can see us standing up a new platform or something and having an odd error or timeout that we can't figure out. See comment above. > can we validate that data_dir exists? I'm tempted to just let open do the validation. If you try to open a non-existent file, python raises: IOError: [Errno 2] No such file or directory: 'path/to/file' What message would you prefer? I'm not convinced any extra validation we did would be much clearer than that. > we will need to be aware of gl, this will be a subsuite soon. Thanks for the reminder, but I'll need to wait and see what gets passed into pulse before trying to guess what it's suite name will look like.
all of that sounds good.
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal /r/5223 - Bug 1142050 - Add buildprops platform and buildtype guess to mozinfo.py, r=ted /r/5387 - Bug 1142050 - Add --chunk-by-runtime option to mochitest, r=jmaher /r/5715 - Bug 1142050 - Use average runtime of excluded tests as the default runtime value Pull down these commits: hg pull review -r dc865c0cd97c22cbc5b23d4a582e80644e8efec1
Attachment #8576256 - Flags: review?(jmaher)
The latest commit in the review tweaks the algorithm to take the average runtime of the excluded tests into account. This results in much better results. Attached is a spreadsheet showing deltas between slowest and fastest chunks for mochitest-browser-chrome on various platforms both with and without --chunk-by-runtime. The delta decreases dramatically with --chunk-by-runtime, showing an average improvement of 78%.
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal https://reviewboard.mozilla.org/r/5221/#review4671 Ship It!
Attachment #8576256 - Flags: review?(jmaher) → review+
https://reviewboard.mozilla.org/r/5223/#review5159 ::: python/mozbuild/mozbuild/mozinfo.py (Diff revision 3) > + } My only quibble here is that everything else in mozinfo is a flat structure and you're adding a nested dict here. Maybe flatten this out to 'platform_guess' 'buildtype_guess' or something? This seems as good a place as any for this to live. Obviously the ideal would be for the RelEng code and mozinfo code to be shared so we're not defining this in multiple places and trying to keep them in sync. Maybe Armen would have an idea about how to do that? Definitely followup fodder.
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal /r/5223 - Bug 1142050 - Add buildprops platform and buildtype guess to mozinfo.py, r=ted /r/5387 - Bug 1142050 - Add --chunk-by-runtime option to mochitest, r=jmaher Pull down these commits: hg pull review -r 521b3a37c11478794e1de13f9c11e7b9730301a4
Attachment #8576256 - Flags: review?(ted)
Attachment #8576256 - Flags: review+
Comment on attachment 8576256 [details] MozReview Request: bz://1142050/ahal The latest version flattens the mozinfo dict like ted suggested. Carrying r+ forward.
Attachment #8576256 - Flags: review+
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Here's a patch to get this running on try. So far only mochitest browser-chrome and devtools runtime data is checked into the tree, so only those suites will work.
Depends on: 1151370
No longer depends on: 1151370
Depends on: 1151370
Attachment #8576256 - Attachment is obsolete: true
Attachment #8619726 - Flags: review+
Attachment #8619727 - Flags: review+
Attachment #8619728 - Flags: review+
Blocks: 1181793
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: