Closed Bug 1894695 Opened 5 months ago Closed 4 months ago

Large FCP on MacOS in CI when using webdriver since v124.0.2

Categories

(Core :: Performance, defect)

Firefox 127
defect

Tracking

()

RESOLVED DUPLICATE of bug 1893921
Performance Impact ?

People

(Reporter: barry, Unassigned)

Details

Steps to reproduce:

We run automated tests on Core Web Vitals on all three major browsers in https://github.com/GoogleChrome/web-vitals/

This has worked well for v124.0.2 and under but after the webdriver started using v127.0a1 we have noticed frequent very high First Contentful Paint (FCP) times in our CI tests causing the tests to fail every time.

The FCP times are sometimes 5 seconds or above. We don't see the same with Chrome nor Safari, and didn't see this in the past when on v124.0.2

I can't reproduce this locally on my Mac and only see it on the GitHub Action machines but it is consistently there.

Actual results:

I've created a minimal repo here: https://github.com/tunetheweb/github-action-test/. You can clone that repo and run the "Run tests" repo.

Example output from three runs shows FCP time of: 5860ms, 3481ms, 55ms. The last one is what I'd expect for this very simple page.

[0-0] RUNNING in firefox - file:///test/test.mjs
[0-0] Test 1 FCP entry: {
[0-0] name: 'first-contentful-paint',
[0-0] entryType: 'paint',
[0-0] startTime: 5860,
[0-0] duration: 0
[0-0] }
[0-0] AssertionError in "tests.test 1"
AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:

assert(fcpEntry.startTime < 1000)

at Context.<anonymous> (file:///Users/runner/work/github-action-test/github-action-test/test/test.mjs:31:7)

[0-0] Test 2 FCP entry: {
[0-0] name: 'first-contentful-paint',
[0-0] entryType: 'paint',
[0-0] startTime: 3481,
[0-0] duration: 0
[0-0] }
[0-0] AssertionError in "tests.test 2"
AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:

assert(fcpEntry.startTime < 1000)

at Context.<anonymous> (file:///Users/runner/work/github-action-test/github-action-test/test/test.mjs:31:7)

[0-0] Test 3 FCP entry: {
[0-0] name: 'first-contentful-paint',
[0-0] entryType: 'paint',
[0-0] startTime: 55,
[0-0] duration: 0
[0-0] }
[0-0] FAILED in firefox - file:///test/test.mjs

Expected results:

I would expect an FCP of under a second for this simple page.

Moving this over to a potential component as I am not sure where this would fit best. If this is not the right, please change to a more suitable one.
Thanks!

Component: Untriaged → Performance
Product: Firefox → Core

This bug was moved into the Performance component.

:barry, could you make sure the following information is on this bug?

  • For slowness or high CPU usage, capture a profile with http://profiler.firefox.com/, upload it and share the link here.
  • For memory usage issues, capture a memory dump from about:memory and attach it to this bug.
  • Troubleshooting information: Go to about:support, click "Copy raw data to clipboard", paste it into a file, save it, and attach the file here.

If the requested information is already in the bug, please confirm it is recent.

Thank you.

Flags: needinfo?(barry)

I've been unable to replicate this locally, where I'd be able to perform those tasks to get extra info. It is very repeatable in CI though.

Flags: needinfo?(barry)
Performance Impact: --- → ?

I'm seeing the same problem as Barry.

Via the github action test, very slow first-contentful-paint times are reported, in my case [5339, 2540, 1081]

And locally it's fine:

[1-0]   name: 'first-contentful-paint',
[1-0]   entryType: 'paint',
[1-0]   startTime: 84,
[1-0]   duration: 0
[1-0] }
[1-0] Test 2 FCP entry: {
[1-0]   name: 'first-contentful-paint',
[1-0]   entryType: 'paint',
[1-0]   startTime: 20,
[1-0]   duration: 0
[1-0] }
[1-0] Test 3 FCP entry: {
[1-0]   name: 'first-contentful-paint',
[1-0]   entryType: 'paint',
[1-0]   startTime: 16,
[1-0]   duration: 0
[1-0] }

I think we'll need some way to inspect what's happening in the CI environment.

Perhaps pass in environment variables to capture moz_logs and retrieve the file.
Something like these additions to the ENV

MOZ_LOG=timestamp,nsHttp:5,nsHostResolver:5,cache2:5,nsSocketTransport:5,socket,NetworkPredictor:5,nsIOService:5
MOZ_LOG_FILE=log.txt

Henrik, Sparky - wondering if you two have encountered any recent issues in automation that could explain this behaviour?

In updating Firefox from v124.0.2 to v127.0a1, Chrome's WebVitals tests are now recording multi-second first-contentful-paint times from Firefox, MacOs. (Expecting < 100ms).

Flags: needinfo?(hskupin)
Flags: needinfo?(gmierz2)

I can see that these are ARM64 machines. So I wonder if that could be the same as what has been reported on bug 1893921. Could you check with a non-ARM machine if you see the same? Also when did you upgrade your workers to ARM64? Maybe the regression started at that point? If not and you were running ARM workers already before, would you be able to do some regression testing? It would be good to know in which version of Firefox that slowness started. The range from 124.0a1 to 127.0a1 is quite large. Thanks.

Flags: needinfo?(hskupin)
Flags: needinfo?(gmierz2)

I think you're right!

It looks like GitHub updated the definition of macos-latest to be the ARM-based macos-14 machines around the time this started breaking. Checking the logs without an issue they were all on macos-12 (which apparently is intel-based).

I also downgraded to macos-12 and it doesn't show the issue.

So looks like nothing to do with v127 (which happened around the same time) and just a change in the GitHub macos-latest environment which switched to ARM.

Lets close this and will follow https://bugzilla.mozilla.org/show_bug.cgi?id=1893921

Status: UNCONFIRMED → RESOLVED
Closed: 4 months ago
Duplicate of bug: 1893921
Resolution: --- → DUPLICATE

Thanks for the feedback and good to hear that this works with the Intel based workers!

See Also: 1893921
You need to log in before you can comment on or make changes to this bug.