Closed
Bug 1173039
Opened 10 years ago
Closed 10 years ago
Write a script to generate runtimes files via ActiveData
Categories
(Testing :: General, defect)
Testing
General
Tracking
(firefox42 fixed)
RESOLVED
FIXED
mozilla42
Tracking | Status | |
---|---|---|
firefox42 | --- | fixed |
People
(Reporter: jgriffin, Assigned: jgriffin)
References
Details
Attachments
(1 file)
343.15 KB,
patch
|
ahal
:
review+
|
Details | Diff | Splinter Review |
We should create a script to make it easy to generate runtimes.json files.
I have such a script locally, and am testing out the resulting chunk times on this try job:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e4615a5d9ce
Assignee | ||
Comment 1•10 years ago
|
||
Try run with a complete set of mochitest-bc files generated by it: https://treeherder.mozilla.org/#/jobs?repo=try&revision=99e5b945559c
Attachment #8620416 -
Flags: review?(ahalberstadt)
Assignee | ||
Comment 2•10 years ago
|
||
btw Kyle, the data ActiveData generates is very easy to consume for this use case.
Comment 3•10 years ago
|
||
Comment on attachment 8620416 [details] [diff] [review]
Write script to generate runtimes data,
Review of attachment 8620416 [details] [diff] [review]:
-----------------------------------------------------------------
Any idea why some files seemed to shrink so much and others grow? Afaict, it looks like the algorithm you used is more or less the same one I used in runtime-accumulator. Or are these all replacing the ones that you generated manually? Anyway, looks good with some nits below.
::: testing/runtimes/writeruntimes.py
@@ +8,5 @@
> +
> +here = os.path.abspath(os.path.dirname(__file__))
> +
> +ACTIVE_DATA_URL = "http://activedata.allizom.org/query"
> +THRESHOLD = 0.9 # ignore the bottom THRESHOLD*100% of numbers
nit: in stats parlance, I think this is a percentile. Can we call it PERCENTILE?
@@ +38,5 @@
> + testdata = defaultdict(lambda: defaultdict(list))
> + for result in data:
> + platform = result[0]
> + buildtype = result[2]
> + if not buildtype in testdata[platform]:
nit: buildtype not in
@@ +39,5 @@
> + for result in data:
> + platform = result[0]
> + buildtype = result[2]
> + if not buildtype in testdata[platform]:
> + testdata[platform][buildtype] = []
You could avoid this with `defaultdict(lambda: defaultdict(lambda: defaultdict(list)))` (just kidding :p)
@@ +40,5 @@
> + platform = result[0]
> + buildtype = result[2]
> + if not buildtype in testdata[platform]:
> + testdata[platform][buildtype] = []
> + testdata[platform][buildtype].append([result[1], result[3]])
It would be nice to have a comment that explains what we expect each column in the result to be
@@ +58,5 @@
> +
> + # split the durations into two groups; excluded and specified
> + ommitted = []
> + specified = {}
> + for result in testdata[platform][buildtype]:
nit: for test, duration in testdata...
Attachment #8620416 -
Flags: review?(ahalberstadt) → review+
Assignee | ||
Comment 4•10 years ago
|
||
I noticed some of the original mochitest-browser-chrome files included runtimes for devtools tests. These shouldn't, so that would explain cases of shrinkage.
Some of the files I manually created earlier didn't use a threshold of "top 10%", they used a threshold of > 10s, so that may also explain some of the size changes.
Assignee | ||
Comment 5•10 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #4)
> I noticed some of the original mochitest-browser-chrome files included
> runtimes for devtools tests. These shouldn't, so that would explain cases
> of shrinkage.
I confirmed this as the primary cause of shrinkage. The original file for linux-opt contained 262 entries, but only 150 of those were actually for mbc...the rest were for devtools. The new file contains just 125 entries, all for mbc.
> Some of the files I manually created earlier didn't use a threshold of "top 10%", they used a
> threshold of > 10s, so that may also explain some of the size changes.
I confirmed this as the primary cause of file growth. The linux64-pgo file I manually created using a treshold of 10s only contained 29 runtimes; the new files which has a 10% threshold contains 125.
Assignee | ||
Comment 6•10 years ago
|
||
Nits addressed. For kicks, here's a try run which uses an 80% percentile, instead of 90%. I want to compare the chunk normalizations against the try run in comment #1.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=258af41b7b58
Assignee | ||
Comment 7•10 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #6)
> Nits addressed. For kicks, here's a try run which uses an 80% percentile,
> instead of 90%. I want to compare the chunk normalizations against the try
> run in comment #1.
>
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=258af41b7b58
This does, not surprisingly, improve the chunking. It's noticeable mostly on the very slow linux-debug runs, but otherwise not particularly worthwhile. Perhaps I can add an option for percentile, which will default to .9, but which can be overriden for specific platforms.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox42:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Comment 10•10 years ago
|
||
I wonder if there's any value in making the script pretty-print the JSON file and sort the keys so that diffs are usable? Are we just going to wind up changing the entire file with every commit anyway?
Assignee | ||
Comment 11•10 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #10)
> I wonder if there's any value in making the script pretty-print the JSON
> file and sort the keys so that diffs are usable? Are we just going to wind
> up changing the entire file with every commit anyway?
Almost certainly, since the chances of the average duration of a test being exactly the same week after week are next to zero.
I guess we could modify this script to read the original file as input, and only write a new value if the new value is sufficiently different from the old. This would reduce a lot of churn, and in this case, using pretty print would make sense. I'll file this as a follow-up.
You need to log in
before you can comment on or make changes to this bug.
Description
•