Closed
Bug 1173039
Opened 9 years ago
Closed 9 years ago
Write a script to generate runtimes files via ActiveData
Categories
(Testing :: General, defect)
Testing
General
Tracking
(firefox42 fixed)
RESOLVED
FIXED
mozilla42
Tracking | Status | |
---|---|---|
firefox42 | --- | fixed |
People
(Reporter: jgriffin, Assigned: jgriffin)
References
Details
Attachments
(1 file)
343.15 KB,
patch
|
ahal
:
review+
|
Details | Diff | Splinter Review |
We should create a script to make it easy to generate runtimes.json files. I have such a script locally, and am testing out the resulting chunk times on this try job: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e4615a5d9ce
Assignee | ||
Comment 1•9 years ago
|
||
Try run with a complete set of mochitest-bc files generated by it: https://treeherder.mozilla.org/#/jobs?repo=try&revision=99e5b945559c
Attachment #8620416 -
Flags: review?(ahalberstadt)
Assignee | ||
Comment 2•9 years ago
|
||
btw Kyle, the data ActiveData generates is very easy to consume for this use case.
Comment 3•9 years ago
|
||
Comment on attachment 8620416 [details] [diff] [review] Write script to generate runtimes data, Review of attachment 8620416 [details] [diff] [review]: ----------------------------------------------------------------- Any idea why some files seemed to shrink so much and others grow? Afaict, it looks like the algorithm you used is more or less the same one I used in runtime-accumulator. Or are these all replacing the ones that you generated manually? Anyway, looks good with some nits below. ::: testing/runtimes/writeruntimes.py @@ +8,5 @@ > + > +here = os.path.abspath(os.path.dirname(__file__)) > + > +ACTIVE_DATA_URL = "http://activedata.allizom.org/query" > +THRESHOLD = 0.9 # ignore the bottom THRESHOLD*100% of numbers nit: in stats parlance, I think this is a percentile. Can we call it PERCENTILE? @@ +38,5 @@ > + testdata = defaultdict(lambda: defaultdict(list)) > + for result in data: > + platform = result[0] > + buildtype = result[2] > + if not buildtype in testdata[platform]: nit: buildtype not in @@ +39,5 @@ > + for result in data: > + platform = result[0] > + buildtype = result[2] > + if not buildtype in testdata[platform]: > + testdata[platform][buildtype] = [] You could avoid this with `defaultdict(lambda: defaultdict(lambda: defaultdict(list)))` (just kidding :p) @@ +40,5 @@ > + platform = result[0] > + buildtype = result[2] > + if not buildtype in testdata[platform]: > + testdata[platform][buildtype] = [] > + testdata[platform][buildtype].append([result[1], result[3]]) It would be nice to have a comment that explains what we expect each column in the result to be @@ +58,5 @@ > + > + # split the durations into two groups; excluded and specified > + ommitted = [] > + specified = {} > + for result in testdata[platform][buildtype]: nit: for test, duration in testdata...
Attachment #8620416 -
Flags: review?(ahalberstadt) → review+
Assignee | ||
Comment 4•9 years ago
|
||
I noticed some of the original mochitest-browser-chrome files included runtimes for devtools tests. These shouldn't, so that would explain cases of shrinkage. Some of the files I manually created earlier didn't use a threshold of "top 10%", they used a threshold of > 10s, so that may also explain some of the size changes.
Assignee | ||
Comment 5•9 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #4) > I noticed some of the original mochitest-browser-chrome files included > runtimes for devtools tests. These shouldn't, so that would explain cases > of shrinkage. I confirmed this as the primary cause of shrinkage. The original file for linux-opt contained 262 entries, but only 150 of those were actually for mbc...the rest were for devtools. The new file contains just 125 entries, all for mbc. > Some of the files I manually created earlier didn't use a threshold of "top 10%", they used a > threshold of > 10s, so that may also explain some of the size changes. I confirmed this as the primary cause of file growth. The linux64-pgo file I manually created using a treshold of 10s only contained 29 runtimes; the new files which has a 10% threshold contains 125.
Assignee | ||
Comment 6•9 years ago
|
||
Nits addressed. For kicks, here's a try run which uses an 80% percentile, instead of 90%. I want to compare the chunk normalizations against the try run in comment #1. https://treeherder.mozilla.org/#/jobs?repo=try&revision=258af41b7b58
Assignee | ||
Comment 7•9 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #6) > Nits addressed. For kicks, here's a try run which uses an 80% percentile, > instead of 90%. I want to compare the chunk normalizations against the try > run in comment #1. > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=258af41b7b58 This does, not surprisingly, improve the chunking. It's noticeable mostly on the very slow linux-debug runs, but otherwise not particularly worthwhile. Perhaps I can add an option for percentile, which will default to .9, but which can be overriden for specific platforms.
https://hg.mozilla.org/mozilla-central/rev/a16fce3c2d59
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox42:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Comment 10•9 years ago
|
||
I wonder if there's any value in making the script pretty-print the JSON file and sort the keys so that diffs are usable? Are we just going to wind up changing the entire file with every commit anyway?
Assignee | ||
Comment 11•9 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #10) > I wonder if there's any value in making the script pretty-print the JSON > file and sort the keys so that diffs are usable? Are we just going to wind > up changing the entire file with every commit anyway? Almost certainly, since the chances of the average duration of a test being exactly the same week after week are next to zero. I guess we could modify this script to read the original file as input, and only write a new value if the new value is sufficiently different from the old. This would reduce a lot of churn, and in this case, using pretty print would make sense. I'll file this as a follow-up.
You need to log in
before you can comment on or make changes to this bug.
Description
•