Bimodal results are causing occasional false results from the TResize test, for example:
https://groups.google.com/forum/#!searchin/mozilla.dev.tree-management/tresize|sort:date/mozilla.dev.tree-management/odm0eNeYpAA/Aln8c2mBGcwJ

Looking at the graph, it looks like this bimodal behavior has been around forever:
http://graphs.mozilla.org/graph.html#tests=[[254,131,15]]&sel=none&displayrange=365&datatype=running
It looks like the result is usually exactly 11 or exactly 12, which makes me suspect this has the same root cause as bug 755633 -- we are inappropriately rounding these numbers to integers, adding aliasing noise to the data.
Depends on: 755633
Summary: Bimodal data in TResize on Fedora 12 x64 → Bimodal data (caused by rounding?) in TResize on Fedora 12 x64
Whiteboard: [regression-detection] → [SfN]
TResize compares timestamps using (new Date()).getTime():
http://hg.mozilla.org/build/talos/file/c10f4a861b3d/talos/startup_test/tresize-test.html#l39

I think these millisecond timestamps do not have high enough resolution for this test, and we should find a more precise way to take this measurement. We could use higher-resolution timers, or we could measure the cumulative time over the whole test instead of timing each run separately and averaging the results.
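A minimal sketch of the second option (cumulative timing), to make the idea concrete. This is not the TResize code: meanTimeCumulative, coarseClock and fakeRun are hypothetical names, the loop is synchronous while the real test is driven by resize events, and the simulated clock simply truncates to whole milliseconds.

```javascript
// Hypothetical helper: time all iterations as one block and divide,
// instead of timing each run separately with a 1 ms clock.
function meanTimeCumulative(runOnce, iterations, now = Date.now) {
  const start = now();
  for (let i = 0; i < iterations; i++) {
    runOnce();
  }
  const total = now() - start;
  // The clock's quantization error is paid once, then spread over all runs.
  return total / iterations;
}

// Demo with a simulated clock (an assumption for illustration only):
// each "run" takes 13.45 ms, but the clock reports whole milliseconds.
let simulatedTime = 0;
const coarseClock = () => Math.floor(simulatedTime);
const fakeRun = () => { simulatedTime += 13.45; };

const mean = meanTimeCumulative(fakeRun, 100, coarseClock);
console.log(mean.toFixed(2));
```

With 100 runs, the one-millisecond uncertainty in the total contributes only about 0.01 ms to the mean, so the coarse clock matters much less than it does when every run is timed on its own.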
Assignee: nobody → mbrubeck
I looked at the raw data collected (found in the tinderbox logs) for this and it looks like we are treating it correctly. I suspect this is more of an issue of a possible machine configuration or something specific to a build. I am pretty confident this isn't a rounding issue.
(In reply to Joel Maher (:jmaher) from comment #3)
> I looked at the raw data collected (found in the tinderbox logs) for this
> and it looks like we are treating it correctly. I suspect this is more of
> an issue of a possible machine configuration or something specific to a
> build. I am pretty confident this isn't a rounding issue.

Well, each individual test result is definitely rounded to the nearest millisecond -- not because we're explicitly rounding it, but because we are using a timer that offers only one-millisecond resolution. This means small changes are either magnified or lost.

For example, if the individual times are tightly clustered around 13.45ms, then the results will all (or almost all) be recorded as 13ms. A tiny 0.5% regression could push the average time above 13.50ms, change the recorded measurements to 14ms, and give the appearance of a 7.7% regression. But a major decrease of 0.90ms (6.7%) might not show up at all, since we don't have the resolution to distinguish the new time (12.55ms) from the old time.
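The arithmetic in this example can be checked with a short sketch. It models the one-millisecond timer as simple rounding of the true duration, which is the simplification used in the comment above and not the exact behavior of subtracting two Date timestamps; recordedMs and percentChange are hypothetical helper names.

```javascript
// Model a 1 ms timer as rounding the true duration to the nearest integer.
function recordedMs(trueMs) {
  return Math.round(trueMs);
}

// Relative change from `before` to `after`, in percent.
function percentChange(before, after) {
  return ((after - before) / before) * 100;
}

const baseline = 13.45;               // true time before any change
const slightlySlower = 13.45 * 1.005; // 0.5% regression, about 13.52 ms
const muchFaster = 13.45 - 0.90;      // 6.7% improvement, 12.55 ms

// Small regression: recorded value jumps from 13 to 14.
console.log(recordedMs(baseline), recordedMs(slightlySlower));
console.log(percentChange(recordedMs(baseline), recordedMs(slightlySlower)).toFixed(1));

// Large improvement: recorded value stays at 13.
console.log(recordedMs(muchFaster));
console.log(percentChange(baseline, muchFaster).toFixed(1));
```

The same quantization step that turns a 0.5% change into an apparent 7.7% change also hides the 6.7% change entirely, because both 13.45 and 12.55 are recorded as 13.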
Created attachment 735947 [details] [diff] [review]
untested patch: Use high-resolution timers in tresize

This uses performance.now (part of the Web Performance API, supported in Firefox 15 and higher -- see bug 539095) to get higher-resolution times in tresize. If this works, it would probably be useful in other Talos tests as well.

For documentation on performance.now() see:
http://www.w3.org/TR/hr-time/
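For readers unfamiliar with the API, here is a rough sketch of the kind of change involved. It is not the attached patch: timeOnce and the fallback to Date.now() are assumptions made for illustration. performance.now() returns a millisecond value that may include a fractional part, measured from a time origin and not from the Unix epoch, so only differences between two readings are meaningful.

```javascript
// Prefer performance.now() when available; fall back to Date.now().
// The fallback is an assumption for illustration, not part of the patch.
const hasPerformanceNow =
  typeof performance !== "undefined" && typeof performance.now === "function";
const now = hasPerformanceNow ? () => performance.now() : () => Date.now();

// Hypothetical helper: time a single synchronous call in milliseconds.
function timeOnce(fn) {
  const start = now();
  fn();
  return now() - start;
}

// Example workload standing in for one resize step.
const elapsed = timeOnce(() => {
  let sum = 0;
  for (let i = 0; i < 100000; i++) {
    sum += i;
  }
  return sum;
});
console.log(elapsed);
```

Because the value is no longer limited to whole milliseconds, a per-run time such as 13.45ms can be recorded as measured, without collapsing to 13 or 14.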
Attachment #735947 - Flags: review?(jmaher)
Comment on attachment 735947 [details] [diff] [review]
untested patch: Use high-resolution timers in tresize

Review of attachment 735947 [details] [diff] [review]:
-----------------------------------------------------------------

Great. Let me push this to try and crank up a few dozen retriggers of this test.
Attachment #735947 - Flags: review?(jmaher) → review+
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED