Closed Bug 1917594 Opened 1 year ago Closed 5 months ago

Verify screen refresh rates for motionmark 1.3.x+

Categories

(Testing :: Performance, task, P3)

task

Tracking

(firefox140 fixed)

RESOLVED FIXED
140 Branch
Tracking Status
firefox140 --- fixed

People

(Reporter: kshampur, Assigned: kshampur)

References

(Blocks 2 open bugs, Regressed 1 open bug)

Details

(Whiteboard: [fxp])

Attachments

(3 files)

From the motionmark about

The MotionMark benchmark relies on the requestAnimationFrame() JavaScript API, which provides callbacks at a consistent frequency related to screen refresh rate. However, browsers have made different choices about whether requestAnimationFrame() should strictly follow screen refresh rate. Safari currently fires requestAnimationFrame() callbacks at 60Hz on 120Hz screens, while other browsers fire it at 120Hz. This affects the benchmark score, so to compare browser scores across browsers, be sure to set the screen refresh rate to 60Hz (for example on macOS, this can be done in the Displays panel in System Settings).

I am not sure whether we enforce 60hz across machines. This bug should track confirming that and implementing it where required.

I believe we do that on windows already https://searchfox.org/mozilla-central/rev/04a47c08504e6357a3164163dd19a47754521204/testing/mozharness/configs/raptor/windows_config.py#78

but not linux/mac

(not sure if we want/need it on mobile yet)

clarifying this is for 1.3.x+

Summary: Verify screen refresh rates for motionmark → Verify screen refresh rates for motionmark 1.3.x+
See Also: → 1883718

:jmaher (please redirect as needed)

any chance you'd know if there is a way to check/enforce a screen refresh rate on linux/mac, similar to how it is done on Windows? e.g. https://searchfox.org/mozilla-central/rev/7a85a111b5f42cdc07f438e36f9597c4c6dc1d48/testing/mozharness/configs/raptor/windows_config.py#82

Flags: needinfo?(jmaher)

It looks like there are solutions for mac/linux:
https://stackoverflow.com/questions/1225057/how-can-i-determine-the-monitor-refresh-rate

we have unittests on mac that use nsscreen:
https://searchfox.org/mozilla-central/source/browser/components/shell/test/mac_desktop_image.py#29

so that dependency is fixed for osx 10.15 currently (not sure about the m2 14.40 machines).

Flags: needinfo?(jmaher)

In this Try I ended up using system_profiler and xrandr for mac/linux respectively
https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&author=kshampur%40mozilla.com&selectedTaskRun=bUsgVdnqS9uLtMEpzXGdBw.0

it seems the M2 machines are 60hz, and the linux machines are 60hz (actually ~59.87, but that should satisfy the 60hz criteria)
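For reference, treating ~59.87 as satisfying the 60hz criteria implies comparing with a tolerance rather than string equality. A minimal sketch of that idea follows; the function name and the 0.5 Hz tolerance are assumptions for illustration, not values agreed on in this bug.

```python
import math

TARGET_HZ = 60.0
TOLERANCE_HZ = 0.5  # assumed tolerance; not a value agreed on in this bug


def is_target_rate(reported, target=TARGET_HZ, tolerance=TOLERANCE_HZ):
    """Return True if the reported rate string is within tolerance of target."""
    try:
        value = float(reported)
    except (TypeError, ValueError):
        # Empty or non-numeric output (e.g. '' or '59.87*') is a mismatch.
        return False
    return math.isfinite(value) and abs(value - target) <= tolerance
```

With this check, "59.87" and "60" both pass, while "24" and an empty string do not.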

just waiting on the 10.15 machines to start running. If that is also 60hz I will close this. Potentially file a follow up bug of having a unit test verify the refresh rates in raptor tests, for mac/linux. But seems like a nice to have at this point

Assignee: nobody → kshampur

in this other try run of just mac
https://treeherder.mozilla.org/jobs?repo=try&selectedTaskRun=UQzIklWTT_G7cwsORUuKvA.0&tier=1%2C2%2C3&revision=cb1799b380a845909a3f520c4b217254abab5809

the jobs that have a motionmark score of low integers (like, 1) have a refresh rate of <60hz (24hz) in this case. I did initially suspect the weird scores we've seen in the past (eg bug 1871603, bug 1883718) might be related to refresh rate (i mentioned here https://bugzilla.mozilla.org/show_bug.cgi?id=1883718#c2)

going to submit another Try run with several retriggers on mac+linux, as I recall seeing weird low integer scores before on various platforms/browsers, and confirm if it is tied to low refresh rates

Here is another Try from last week with several more retriggers as mentioned above

mac+linux : https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=71e0fd0ce6f6c37862f1bb536e2b134a48406947&selectedTaskRun=VBLP6pc8T9qoxBdYifCKng.0

it does indeed seem that the only cases where the weird scoring occurs are when the refresh rate is found to be 24hz (in some cases it comes up empty, ''). AND it only happens on the 10.15 machines. All the 1400 machines seem ok at 60hz

windows is also fine, (as expected)
This used to occur on more than just mac iirc, but i think the upgrade from 1.3.0 -> 1.3.1 fixed some stuff (i recall changelog mentioned some fixes/changes to frame rate detection)

:rcurran, not sure if you'd be the right person to ask... so to summarize the above^ it seems weird motionmark scores occur on osx 10.15 machines that have a 24hz refresh rate (and in some cases, no refresh rate is output?). I couldn't find a clear pattern or a full list of the affected machine IDs.
(but I can go back and try doing that later more in depth if you think that would help)

The tests running on 60 hz monitors seem to be fine so far.

Would you happen to know how the osx 10.15 machines are physically set up? Perhaps some monitors are just manually set to 24hz (and could those be switched up to 60hz?) and for the case where system profiler outputs no hz... either no monitor is hooked up to it or it's a really old monitor?

Flags: needinfo?(rcurran)
Blocks: 1858668

if the macosx refresh rate is randomly 24hz, this is probably the KVM switch connected to the machine, or some prior task that leaves the machine in a specific state.

10.15 machines are hosted in our own datacenter.
14.xx machines are hosted at macstadium.

is this refresh rate printed out on live 10.15 machines, or with a patch on try?

ideally we can test for this in task bootstrapping...I would like to know where the code for the printing of "refresh rate Xhz" is coming from and how to calculate that.

the 10.15 machines are all connected via hdmi to a KVM switch. Ryan is on PTO today and tomorrow, maybe :aerickson can get specific details of the 10.15 hdmi dongle/kvm/connection?

Flags: needinfo?(aerickson)
Attached file refresh_rate.patch

(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #9)

if the macosx refresh rate is randomly 24hz, this is probably the KVM switch connected to the machine, or some prior task that leaves the machine in a specific state.

10.15 machines are hosted in our own datacenter.
14.xx machines are hosted at macstadium.

is this refresh rate printed out on live 10.15 machines, or with a patch on try?

ideally we can test for this in task bootstrapping...I would like to know where the code for the printing of "refresh rate Xhz" is coming from and how to calculate that.

thanks for the insight Joel. That explains why the 1400's are consistent.

Yes, it is only being printed out on Try right now for this investigation.
I've attached the diff of what I was using in our raptor test

oh, trying to terminate early- a lot of failures:
https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=0fe694539ca3ed4f5c9de89a2d8a9550e9e91d34

I will harvest the machines, I bet it is a small subset that frequently has 24hz

I think we should look into why some of these machines listed below have 24hz instead of 60hz. Maybe a different resolution, a bad dongle, etc.?

Machine number frequency
6 1
8 2
47 2
71 2
63 1
79 1
89 3
93 4
99 2
103 2
105 1
114 4
117 2
144 2
149 2
151 1
154 2
163 5
164 3
169 1
170 2
175 3
187 2
188 2
190 1
210 6
215 1
221 2
223 2
224 1
248 1
270 2
278 1
281 1

Nice, thanks for taking a look!

I know in your try you weren't looking at linux, but I noticed an issue previously when I was looking at linux before... for some reason on my own linux machine that xrandr command works, but through Python's subprocess it seems to not properly remove the asterisk in that sed command, e.g. the refresh rate in CI would get returned as 59.87* instead of 59.87

https://firefox-ci-tc.services.mozilla.com/tasks/bPVH9K24QG6rCSIVP9zbfA/runs/0/logs/public/logs/live.log#L975

not a big deal - just wanted to point that out since I noticed in your diff that it looks for a comparison for 60 in if str(refresh_rate) != "60": which wouldn't be satisfied for the current linux
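A plausible explanation for the leftover asterisk (an assumption, not confirmed in this bug): xrandr marks the current mode with `*` and the preferred mode with `+`, and `sed 's/*+//'` only strips the literal sequence `*+`, so a current mode that is not also the preferred one stays as `59.87*`. A sketch of parsing the output in Python instead, where the sample text is illustrative rather than copied from a CI log:

```python
import re

# A rate such as "59.87" immediately followed by "*" (current mode),
# optionally followed by "+" (preferred mode).
_CURRENT_RATE = re.compile(r"(\d+(?:\.\d+)?)\*\+?")


def current_refresh_rate(xrandr_output):
    """Return the current refresh rate as a float, or None if none is marked."""
    match = _CURRENT_RATE.search(xrandr_output)
    return float(match.group(1)) if match else None


# Illustrative sample only; real xrandr output varies by driver and display.
SAMPLE = """\
Screen 0: minimum 8 x 8, current 1920 x 1080, maximum 32767 x 32767
default connected primary 1920x1080+0+0 0mm x 0mm
   1920x1080     60.00    59.87*
"""
```

Unlike `awk '{print $2}'`, this does not depend on the current rate being the first rate listed on the line.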

by the way, did you encounter any tasks that outputted an empty field instead of 24hz or 60hz? that was another case I got. Like this https://firefox-ci-tc.services.mozilla.com/tasks/JpGVERVIRDKmJh8Fd-18Kw/runs/0/logs/public/logs/live.log#L880

I saw a couple with '' instead of 24; and I am not sure what to do about linux, it will need to be something on the new 24.04 image. Thanks for calling this out. We can focus on making macosx solid.

:kshampur

Most of the MacStadium Macs have a fit-Headless 4K dongle attached (see: https://fit-iot.com/web/product/fit-headless-4k/).

In mdc1, however, some Macs use an alternative display emulator (example: https://www.amazon.com/Headless-Display-Emulator-Headless-1920x1080-Generation/dp/B06XT1Z9TF?th=1). Specifically, Macs such as macmini-r8-103, macmini-r8-188, and macmini-r8-281 (randomly selected from your list) have this type of dongle, with their resolution set to 3840x2160.

One other machine is reporting connections to a VE228H monitor, which appears to be routed through a patch panel to an Asus display, though the exact configuration is uncertain to me at the moment. To make it more confusing, macmini-r8-93 (from list) is setup like this (res 4096x2160) and is still having the issue.

Is there a machine I can look at that has passed this test recently to try to compare?

Flags: needinfo?(rcurran)
Flags: needinfo?(aerickson)

(In reply to Ryan Curran from comment #16)

:kshampur

Most of the MacStadium Macs have a fit-Headless 4K dongle attached (see: https://fit-iot.com/web/product/fit-headless-4k/).

In mdc1, however, some Macs use an alternative display emulator (example: https://www.amazon.com/Headless-Display-Emulator-Headless-1920x1080-Generation/dp/B06XT1Z9TF?th=1). Specifically, Macs such as macmini-r8-103, macmini-r8-188, and macmini-r8-281 (randomly selected from your list) have this type of dongle, with their resolution set to 3840x2160.

Other machines are reporting connections to a VE228H monitor, which appears to be routed through a patch panel to an Asus display, though the exact configuration is uncertain to me at the moment. To make it more confusing, macmini-r8-93 (from list) is setup like this (res 4096x2160) and is still having the issue.

Is there a machine I can look at that has passed this test recently to try to compare?

thanks for the details :rcurran

Is there a machine I can look at that has passed this test recently to try to compare?

sorry what do you mean by this? like, a test that passed that successfully outputted 60hz?
if so, in this try, any osx 10.15 task where the scores are not just low integers qualifies. The task highlighted in that link is one that looks normal; if you search its log file you'll see Refresh Rate: 60 Hz

somewhat related... i just noticed that this task outputs an empty refreshrate but has normal looking scores... so that makes this more confusing, as previously i thought any reported refreshrate of 24hz and empty hz had weird scores... but it seems maybe only 24hz leads to weird scores, and not always empty refresh rate...

Ok from that try link it sounds like you're saying macmini-r8-134 recently passed the test

That machine is configured with this dongle https://www.amazon.com/Headless-Display-Emulator-Headless-1920x1080-Generation/dp/B06XT1Z9TF?th=1

And its screen resolution is set to 1920x1080...which is much lower than the resolutions that failed the test (i think)

(In reply to Ryan Curran from comment #18)

Ok from that try link it sounds like you're saying macmini-r8-134 recently passed the test

That machine is configured with this dongle https://www.amazon.com/Headless-Display-Emulator-Headless-1920x1080-Generation/dp/B06XT1Z9TF?th=1

And its screen resolution is set to 1920x1080...which is much lower than the resolutions that failed the test (i think)

interesting thanks for looking into it.. also 134 doesn't show up in Joel's list (expectedly)
Though, resolution =/= refresh rate, but I don't know enough about these headless display emulators

Also curious about mdc1.macmini-r8-16 which displayed no refresh rate, what kind of (headless?) monitor is that set up with?

(In reply to Kash Shampur [:kshampur] ⌚EST from comment #19)

(In reply to Ryan Curran from comment #18)

Ok from that try link it sounds like you're saying macmini-r8-134 recently passed the test

That machine is configured with this dongle https://www.amazon.com/Headless-Display-Emulator-Headless-1920x1080-Generation/dp/B06XT1Z9TF?th=1

And its screen resolution is set to 1920x1080...which is much lower than the resolutions that failed the test (i think)

interesting thanks for looking into it.. also 134 doesn't show up in Joel's list (expectedly)
Though, resolution =/= refresh rate, but I don't know enough about these headless display emulators

Also curious about mdc1.macmini-r8-16 which displayed no refresh rate, what kind of (headless?) monitor is that set up with?

I'm not yet familiar with the emulators, as I haven't had a chance to visit the data center. However, I'm more than happy to answer any questions I can, and I'll gladly pass along any others to our infrastructure team.

macmini-r8-16.test.releng.mdc1.mozilla.com has this plugged in at a resolution of 4096x2160

hey :jmaher and :rcurran! just wanted to follow up on this

so what would be a good path forward here now? Is there a way to remedy some of the culprit machines here? (or if it's already being worked on, any link I could follow?)
and, is there anything I (or my team) could help with regarding this?

Flags: needinfo?(rcurran)
Flags: needinfo?(jmaher)

I will leave the NI? for rcurran. My take is we need to get the same dongles for all the machines before we can enforce anything here. If we have the same type of dongle and connection everywhere, we will significantly increase the chance of making this work consistently.

Another thought is if we split the pool, we could have some workers with a uniform dongle/connection which would then allow us to not buy a bunch of new equipment.

I don't know the full picture of the hardware or the discussions/plans around that.

Flags: needinfo?(jmaher)

Just want to validate that all nodes except one gecko-t-osx-1015-r8

have (and have always had) the 28E850 emulator dongle

ha, and machine #186 wasn't in my original list. So given that we have identical dongles, I have a couple of thoughts:

  1. dongles are going bad? I have seen this in the past
  2. dongles need to be re-seated (unplugged and replugged in)
  3. OS is using wrong driver/monitor/settings

My thoughts would be that we focus on #3 first, and if we detect something is wrong, adjust the settings. If the desired settings are not available or do not stick around, then look at #1 and #2.

Ryan, how can we focus on #3? is there a way to query across the pool what driver, monitor type, and resolution/refresh rate we have for each machine?

:jmaher

how are you validating the refresh rate on the macs? none of the commands I have tried related to system_profiler seem to return anything. thank you

Flags: needinfo?(jmaher)

(In reply to Ryan Curran from comment #25)

:jmaher

how are you validating the refresh rate on the macs? none of the commands I have tried related to system_profiler seem to return anything. thank you

Not sure if Joel has changed the approach, but I was using system_profiler SPDisplaysDataType | grep 'UI' | cut -d '@' -f 2 | cut -d ' ' -f 2 | sed 's/Hz//'
However that doesn't work for you? I do recall there were some other system_profiler commands suggested when I was looking online, but those didn't work... but SPDisplaysDataType seemed to work.

Worth noting, some machines were returning empty output though (and others 24, others 60). Are you trying on one specific machine or different machines? If the former... maybe you've come across a machine that gives empty output
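To make the empty-output case easier to tell apart from a wrong rate, the same extraction could be done in Python instead of a shell pipeline. This is only a sketch: it assumes the relevant system_profiler line looks roughly like `UI Looks like: 1920 x 1080 @ 60.00Hz`, which is inferred from the pipeline above, and the function names are hypothetical.

```python
import re
import subprocess

# A number after "@" and before "Hz", e.g. "@ 60.00Hz" or "@ 24 Hz".
_UI_RATE = re.compile(r"@\s*(\d+(?:\.\d+)?)\s*Hz")


def parse_refresh_rates(profiler_text):
    """Return one float per line containing 'UI' that carries a rate."""
    rates = []
    for line in profiler_text.splitlines():
        if "UI" not in line:  # mirrors the grep 'UI' step
            continue
        match = _UI_RATE.search(line)
        if match:
            rates.append(float(match.group(1)))
    return rates


def read_refresh_rates():
    """Run system_profiler (macOS only) and parse its output."""
    result = subprocess.run(
        ["system_profiler", "SPDisplaysDataType"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_refresh_rates(result.stdout)
```

An empty list then means no rate was reported at all, which can be logged separately from a 24hz result.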

here is what I did:

diff --git a/testing/mozharness/external_tools/refresh_rate.py b/testing/mozharness/external_tools/refresh_rate.py
new file mode 100644
--- /dev/null
+++ b/testing/mozharness/external_tools/refresh_rate.py
@@ -0,0 +1,25 @@
+import platform
+import subprocess
+import sys
+
+def get_refresh_rate():
+    if platform.system() == "Darwin":
+        cmd = "system_profiler SPDisplaysDataType | grep 'UI' | cut -d '@' -f 2 | cut -d ' ' -f 2 | sed 's/Hz//'"
+        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
+        refresh_rate = result.stdout.strip()
+        print(f"Refresh Rate: {refresh_rate} Hz")
+        if str(refresh_rate) != "60":
+            print(f"ERROR: expected refresh rate = 60, instead got {refresh_rate}.")
+            return 1
+    elif platform.system() == "Linux":
+        cmd = "xrandr | grep '*' | awk '{print $2}' | sed 's/*+//'"
+        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
+        refresh_rate = result.stdout.strip()
+        print(f"Refresh Rate: {refresh_rate} Hz")
+        if str(refresh_rate) != "60":
+            print(f"ERROR: expected refresh rate = 60, instead got {refresh_rate}.")
+            return 1
+    else:
+        return 0
+
+sys.exit(get_refresh_rate())
Flags: needinfo?(jmaher)
Flags: needinfo?(rcurran)

Didn't mean to clear the NI but attached a document with findings

Cut a ticket to Van Le to re seat 5 of the dongles on units that register 24hz and 5 of the dongles on units that register no hz

See Also: → 1932377

Reinstalling macOS seems to have resolved this issue. Since we're migrating to macOS 14 this week anyways, we'll migrate the hosts reporting 24hz or no hz first to the new worker pool

playing with my patch, I find these refresh rates:
osx/aarch64 (11.20 @macstadium): 53.00
osx/x86_64 (14.70 @mdc): 24.00 || 60.00

linux wayland (vm @gcp): 59.96

so it appears that at least 1 of the freshly installed 14.70 machines has an invalid refresh rate- maybe it gets into a state where it fails and falls back?

We are planning to switch to osx/aarch64 from macstadium -> mdc and also run 15.x, I wonder if we should wait for that before enforcing anything. Or I could build this to ensure a refresh rate of X where X is defined based on os::version
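The "refresh rate of X based on os::version" idea could look something like the sketch below. The table keys, the default, and the tolerance are all hypothetical, and the values are only seeded from the rates listed in this comment to illustrate the shape of the lookup; they are not an agreed configuration.

```python
# Hypothetical lookup table: keys and values are illustrative, seeded from
# the rates observed in this comment, not an agreed configuration.
EXPECTED_HZ = {
    ("osx", "11.20"): 53.00,
    ("osx", "14.70"): 60.00,
    ("linux", "wayland"): 59.96,
}
DEFAULT_HZ = 60.00
TOLERANCE_HZ = 0.5  # assumed tolerance


def expected_rate(os_name, version):
    """Look up the expected rate for a platform, falling back to 60 Hz."""
    return EXPECTED_HZ.get((os_name, version), DEFAULT_HZ)


def rate_is_expected(os_name, version, reported_hz):
    """Return True if reported_hz is within tolerance of the expected rate."""
    return abs(reported_hz - expected_rate(os_name, version)) <= TOLERANCE_HZ
```

With such a table, a 14.70 machine reporting 24.00 would be flagged, while 59.87 on linux wayland would pass.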

(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #32)

playing with my patch, I find these refresh rates:
osx/aarch64 (11.20 @macstadium): 53.00
osx/x86_64 (14.70 @mdc): 24.00 || 60.00

linux wayland (vm @gcp): 59.96

so it appears that at least 1 of the freshly installed 14.70 machines has an invalid refresh rate- maybe it gets into a state where it fails and falls back?

fwiw (at least in my teams case) I don't think we run any perftests on the 11.20
just the 10.15 intel (which is soon becoming 14.7) and the aarch64 (which are on 14.5 iirc)

it seems the aarch64 machines were consistently 60 hz, hosted at macstadium.
and at least there is consistency in seeing 24 mixed in with 60 on the 14.7, which is in mdc

We are planning to switch to osx/aarch64 from macstadium -> mdc and also run 15.x, I wonder if we should wait for that before enforcing anything. Or I could build this to ensure a refresh rate of X where X is defined based on os::version

i am now wondering if switching to mdc will introduce 24hz on the aarch64 machines

(In reply to Kash Shampur [:kshampur] ⌚EST from comment #33)

i am now wondering if switching to mdc will introduce 24hz on the aarch64 machines

though I forget, were dongles the culprit in the end? so if we use the right dongle at mdc then should be fine? as mentioned by Ryan before:

Most of the MacStadium Macs have a fit-Headless 4K dongle attached (see: https://fit-iot.com/web/product/fit-headless-4k/).

In mdc1, however, some Macs use an alternative display emulator (example: https://www.amazon.com/Headless-Display-Emulator-Headless-1920x1080-Generation/dp/B06XT1Z9TF?th=1).

dongles are not the only problem, we switched dongles a couple of weeks ago with no change, a reinstall seemed to fix it all.

most likely when we leave macstadium and host in house this will result in the same problem with invalid refresh rates.

I suspect (with minimal data to back this up - some 14.70 machines have bad refresh rates already) there is something going on with the unittests that is causing the machines to get into a bad state. If that is the case, then the perf tests at macstadium are not affected, but when sharing a pool in MDC next year with unittests we could run into similar problems.

I ordered the same dongles that MacStadium is using and sent them to mdc1 for QTS to install to test

See Also: → 1943655

just making note, after Bug 1943655 landed, I still see the occasional incorrect refresh rate and as a result, incorrect score values (low integers)

e.g. https://treeherder.mozilla.org/jobs?repo=mozilla-central&selectedTaskRun=Cc5e7gD7Rj6uJzZTKiQmQQ.0&tier=1%2C2%2C3&searchStr=macos%2C14.70%2Cx64%2Cshippable%2Copt%2Cbrowsertime%2Cperformance%2Ctests%2Con%2Cfirefox%2Ctest-macosx1470-64-shippable%2Fopt-browsertime-benchmark-firefox-motionmark-1%2Cmm&revision=690b763720b4e1de5a454714ad58b3b05f6503f4

in the logs I see

[task 2025-03-08T12:07:27.444Z] 12:07:27     INFO -  Refresh Rate:  Hz
[task 2025-03-08T12:07:27.444Z] 12:07:27     INFO -  ERROR: expected refresh rate = 60.00, instead got .
[task 2025-03-08T12:07:27.450Z] 12:07:27     INFO - Return code: 1

I am thinking we should fail the entire task if this happens?

we didn't force failure, there were too many machines without the desired resolution.

:rcurran, can you do a new query and determine how many 14.70 machines are not our desired 60hz refresh rate?

Flags: needinfo?(rcurran)

:jmaher

1 @ 24.00
140 @ 60.00
29 @ unknown

Flags: needinfo?(rcurran)

:kshampur, with 30/170 machines in a non ideal state, auto failing jobs would increase our backlog queues significantly. If the perf numbers are critical to have 60hz, then we need to drop 10-15% of our load on the 14.70 machine pool and we could then enforce this.
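For reference, the arithmetic behind the 30/170 figure, using the counts Ryan reported above (the function name is made up for illustration):

```python
# Counts taken from the query result earlier in this bug.
counts = {"24.00": 1, "60.00": 140, "unknown": 29}


def non_ideal_share(counts, target="60.00"):
    """Return (non-ideal machines, total machines, non-ideal fraction)."""
    total = sum(counts.values())
    bad = total - counts.get(target, 0)
    return bad, total, bad / total


bad, total, share = non_ideal_share(counts)
print(f"{bad}/{total} machines not at 60hz ({share:.1%})")
```

That works out to roughly 17.6% of the pool, slightly above the 10-15% load reduction estimated above.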

Ryan, correct me if I am wrong, but are we waiting for new dongles or kvm cables?

:jmaher

as you know we are planning to rollout the m4s and replace some of the existing r8s. during this period we also hope to have the new kvm installed on all the hosts

:rcurran, what is the general timeline for this? I want this bug to reflect what the next step is and when (rough idea) that will happen.

Flags: needinfo?(rcurran)

:jmaher The current plan is for Van and me to start working on the racks during the week of April 22. Our goal for that first week is to complete 1–2 racks (each with a 78-host capacity), including shifting, KVM installation, decommissioning hosts, and adding the new M4s.

The full timeline is still TBD, but by the end of that week, we should have a better sense of our pace and will be able to provide a more accurate estimate. In total, there are six racks to complete.

Flags: needinfo?(rcurran)
Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/987bb2e68422 ensure resolution and refresh rate for macosx. r=bhearsum
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 140 Branch
Regressions: 1967410