Switch motionmark to use 'ramp' mode and report complexity score
Categories
(Testing :: Raptor, task, P2)
Tracking
(firefox120 fixed)
Tracking | Status | |
---|---|---|
firefox120 | --- | fixed |
People
(Reporter: jrmuizel, Assigned: aglavic)
References
(Blocks 1 open bug)
Details
(Whiteboard: [fxp])
Attachments
(2 files, 2 obsolete files)
To get stable numbers for comparing Firefox with and without WebRender, we chose the current configuration. (See bug 1423267 comment 3).
However, I don't think this configuration is working well:
- The units reported here are (ms) but maybe they're fps? https://treeherder.mozilla.org/jobs?repo=mozilla-central&revision=c14f7934269f333be9e65958c7a012899b3123bd&group_state=expanded&selectedTaskRun=bYAN6l1qTH63dIoAgKRD0w.0
- The values seem to cap at around 60 (which suggests that they are fps)
- This configuration is not representative of the way that people actually run MotionMark
- Chrome appears to do worse than Firefox in CI but that doesn't match the results when running it manually.
In bug 1778575 we're looking to fix frame scheduling which should make the measurements we get in ramp mode much more stable. I don't think we need to wait for that to land before changing the mode though. For now, I'd rather have numbers that are closer to what MotionMark reports than stability.
Comment 2•2 years ago
In general this seems reasonable: if the numbers most people get are not represented by our CI tests, then we should change our CI. Keep in mind that we can also change the labels we use (the default is 'ms'; we can add 'fps') and make sure each result is marked lower_is_better or higher_is_better.
Keep in mind that we also have older hardware that runs these tests; maybe it is representative. There are plans in place to upgrade the CPU (and keep the Intel GPU), with some prototypes being ordered this month.
I would leave this up to the perf tooling team to prioritize/change/review as needed. :kimberlythegeek, can you chime in here if there are other things to consider?
Comment 3•2 years ago
:jrmuizel, could you provide more information on how the configuration is not representative, and on using ramp mode?
Reporter | ||
Comment 4•2 years ago
When you run https://browserbench.org/MotionMark/ in its default configuration it uses ramp mode. The constant complexity mode that we run it in is only accessible through https://browserbench.org/MotionMark/developer.html.
Comment 5•2 years ago
:jrmuizel, regarding point (4), have you seen this on multiple machines and platforms, or only your own so far?
Also, can you elaborate on why you want the complexity to be reported? We could add this to our extra-options, but it's unclear if we'll ever have more than 1 complexity variation of motionmark running at once.
Reporter | ||
Comment 6•2 years ago
(In reply to Greg Mierzwinski [:sparky] from comment #5)
> :jrmuizel, regarding point (4), have you seen this on multiple machines and platforms, or only your own so far?
I've run it on a couple of other machines now and the results are mixed.
> Also, can you elaborate on why you want the complexity to be reported? We could add this to our extra-options, but it's unclear if we'll ever have more than 1 complexity variation of motionmark running at once.
Complexity is the score reported by MotionMark when you run it in its default configuration. I just want that. That will prevent tests from getting capped at 60fps like they currently do.
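To illustrate the capping argument, here is a toy model in Python. It is only a sketch and not MotionMark's actual controller or scoring: the `capacity` figure (items rendered per second), the 60 fps display cap, and both function names are assumptions made up for this example.

```python
# Toy model only: this is NOT MotionMark's real controller or scoring.
REFRESH_CAP_FPS = 60.0  # assumed display refresh cap


def fixed_complexity_fps(capacity, complexity):
    """Frame rate seen when drawing a fixed number of items.

    capacity is a hypothetical "items per second" rendering budget.
    """
    return min(REFRESH_CAP_FPS, capacity / complexity)


def ramp_complexity(capacity, target_fps):
    """Largest item count that still sustains target_fps (idealized)."""
    return capacity / target_fps


slow, fast = 12_000.0, 48_000.0  # two hypothetical browsers

# Fixed complexity of 100 items: both browsers saturate at the cap.
fixed = [fixed_complexity_fps(c, 100) for c in (slow, fast)]

# Ramp-style score at a 50 fps target: the faster browser scores higher.
ramp = [ramp_complexity(c, 50.0) for c in (slow, fast)]

print(fixed)
print(ramp)
```

In this model the fixed-complexity run cannot tell the two browsers apart once both reach the cap, while the complexity score keeps scaling with rendering capacity.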
Comment 7•2 years ago
Ah ok, perfect, thanks for the additional info!
Reporter | ||
Comment 8•2 years ago
Who should do this work?
Comment 9•2 years ago
The Jira task wasn't set up properly, so it evaded our grooming filter; sorry about that. We'll find someone to look into this at the next grooming session (on Monday Dec 19).
Assignee | ||
Comment 10•2 years ago
:jrmuizel a few questions about the switch:
- Would you prefer mean or median for the complexity scores?
- What are the units for complexity score? Should we use a unit of 'score'?
- Do you want this to be changed for both motionmark-html and motionmark-animometer?
Assignee | ||
Comment 11•2 years ago
As well if we are tracking score, is lower still better?
(In reply to Andrej Glavic (:andrej) from comment #10)
> :jrmuizel a few questions about the switch:
> - Would you prefer mean or median for the complexity scores?
> - What are the units for complexity score? Should we use a unit of 'score'?
> - Do you want this to be changed for both motionmark-html and motionmark-animometer?
Assignee | ||
Comment 12•2 years ago
Assignee | ||
Comment 13•2 years ago
Reporter | ||
Comment 14•2 years ago
(In reply to Andrej Glavic (:andrej) from comment #10)
> :jrmuizel a few questions about the switch:
> - Would you prefer mean or median for the complexity scores?
Probably the median.
> - What are the units for complexity score? Should we use a unit of 'score'?
Yep, score seems best.
> - Do you want this to be changed for both motionmark-html and motionmark-animometer?
Yes.
> As well if we are tracking score, is lower still better?
No, higher is better.
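The answers above (median, a unit of 'score', higher is better) could be captured roughly as in the sketch below. The function name, the dictionary layout, and the sample values are hypothetical and are not taken from Raptor's actual result format; only the `lower_is_better` label name comes from comment 2.

```python
from statistics import median


def summarize_complexity(replicates):
    """Collapse replicate complexity scores into one reported value.

    The returned layout is illustrative only, not Raptor's real schema.
    """
    return {
        "value": median(replicates),  # median, as preferred above
        "unit": "score",              # complexity has no physical unit
        "lower_is_better": False,     # a higher complexity score is better
    }


# Made-up replicate values; the 1.0 stands in for one bad run.
summary = summarize_complexity([410.2, 398.7, 402.5, 1.0, 405.9])
print(summary)
```

One reason the median can be a sensible choice here is that a single failed replicate, such as the 1.0 in the made-up data, does not drag the reported value down the way it would with the mean.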
Assignee | ||
Comment 15•2 years ago
Since we are already changing the parameters for the controller, would you like to keep all other existing preferences listed below?
- test-interval=15
- display=minimal
- tiles=big
- frame-rate=30
- kalman-process-error=1
- kalman-measurement-error=4
- time-measurement=performance
Reporter | ||
Comment 16•2 years ago
I think defaults look more like:
- frame-rate=50
- test-interval=30
I think everything else can stay the same.
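As a rough sketch of what that change amounts to, the snippet below merges the two overrides into the preferences listed in comment 15 and builds a developer-page URL. It assumes the options are passed as URL query parameters to the developer page mentioned in comment 4, and that ramp mode is selected with a `controller=ramp` key; the variable names are made up for the example.

```python
from urllib.parse import urlencode

# Developer page from comment 4; passing options as query parameters
# is an assumption of this sketch.
BASE_URL = "https://browserbench.org/MotionMark/developer.html"

# Preferences listed in comment 15.
existing = {
    "test-interval": "15",
    "display": "minimal",
    "tiles": "big",
    "frame-rate": "30",
    "kalman-process-error": "1",
    "kalman-measurement-error": "4",
    "time-measurement": "performance",
}

# Values suggested above, plus the assumed ramp controller key.
overrides = {
    "controller": "ramp",
    "frame-rate": "50",
    "test-interval": "30",
}

# Later keys win, so the overrides replace the old values.
params = {**existing, **overrides}
url = f"{BASE_URL}?{urlencode(params)}"
print(url)
```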
Comment 17•1 year ago
FWIW, this is the set of default options I get on the developer menu on https://browserbench.org/MotionMark1.2/developer.html. E.g. https://browserbench.org/MotionMark1.2/developer.html?warmup-length=2000&warmup-frame-count=30&first-frame-minimum-length=0&test-interval=30&display=minimal&tiles=big&controller=ramp&frame-rate=50&time-measurement=performance&suite-name=MotionMark
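For anyone who wants to compare those defaults against another configuration, a small sketch like the following pulls the options out of the URL using only the Python standard library. The variable names are arbitrary.

```python
from urllib.parse import parse_qsl, urlsplit

# URL copied from the comment above.
url = (
    "https://browserbench.org/MotionMark1.2/developer.html"
    "?warmup-length=2000&warmup-frame-count=30"
    "&first-frame-minimum-length=0&test-interval=30"
    "&display=minimal&tiles=big&controller=ramp&frame-rate=50"
    "&time-measurement=performance&suite-name=MotionMark"
)

# Split off the query string and turn it into an option -> value mapping.
options = dict(parse_qsl(urlsplit(url).query))

for name, value in sorted(options.items()):
    print(f"{name}={value}")
```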
Assignee | ||
Comment 18•1 year ago
We are working on switching MotionMark to ramp mode, but when we switch Chrome and Chromium to ramp mode on Macs, we get a return value of 1 for all tests and subtests:
https://treeherder.mozilla.org/jobs?repo=try&revision=344b651c2a66fb39b8e4b65fe033d0a7117fc8ed
This is for the 1300 M2s, but it was similar for the 1015.
Comment 19•1 year ago
We've been seeing a number of scoring issues with MotionMark in general, including a reported 0 score on Chrome on the Multiply test on very fast devices. But not on all tests, so I suspect something else is going wrong here. We're hoping to fix some of the structural scoring problems in MotionMark 2. In the meantime, how difficult would it be to make a brand new taskcluster job so we can at least track ramp results for Firefox?
Assignee | ||
Comment 20•1 year ago
We can definitely do that :) I can look into that and get to it sometime soon after all-hands!
Assignee | ||
Comment 22•1 year ago
What we are doing in this bug is adding the ability to get MotionMark ramp scores for just Firefox.
MotionMark has issues with other browsers, which is why we are starting with just Firefox for now.
Comment 23•1 year ago
Pushed by aglavic@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9a8976307af6 Get motionmark ramp scores tracked for Firefox. r=perftest-reviewers,afinder
Comment 24•1 year ago
bugherder |