Closed Bug 1902979 Opened 1 year ago Closed 1 year ago

svg/animations/end-of-time-*-crash.html wpt tests run out of memory on linux CI

Categories

(Testing :: CI Configuration, defect)

Tracking

(firefox128 fixed, firefox129 fixed)

RESOLVED FIXED
129 Branch
Tracking Status
firefox128 --- fixed
firefox129 --- fixed

People

(Reporter: jcristau, Assigned: jcristau)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Attachments

(2 files, 1 obsolete file)

The web-platform-tests-crashtest task on linux CI appears to run out of memory frequently. On the old X11 / ubuntu 18.04 workers, the task survives after a 45s timeout. On the newer wayland / ubuntu 22.04 workers, especially with a debug build, the worker itself seems to run out of memory more often than not.

https://profiler.firefox.com/from-url/https%3A%2F%2Ffirefox-ci-tc.services.mozilla.com%2Fapi%2Fqueue%2Fv1%2Ftask%2FCvcJevkOSbif9Fc73EUfGg%2Fruns%2F2%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_resource-usage.json/marker-chart/?globalTrackOrder=0&thread=0&timelineType=stack&v=10 is from one of the rare runs on ubuntu 22.04 debug that did not die.
https://profiler.firefox.com/from-url/https%3A%2F%2Ffirefox-ci-tc.services.mozilla.com%2Fapi%2Fqueue%2Fv1%2Ftask%2FFg6CnG_zSDaYFtp_sErs8A%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_resource-usage.json/marker-chart/?globalTrackOrder=0&thread=0&timelineType=stack&v=10 is from a run on a ubuntu 18.04 worker.

Should we use workers with more RAM, skip this test, change it so it's not so memory-hungry, or something else?

Is more RAM == a larger instance? Right now the wayland machines are running on n2-standard-2, which is the same instance type that t-linux-large* runs on. So we are running on the same instance size. I am not sure whether VM vs. docker makes a difference in resource consumption, nor whether 2204/wayland consumes more than 1804/x11.

They're both running out of memory, but the old worker seems to recover better than the new one.

If we are going to make 2204 a real tier1, we need 2204-xlarge, so we should create that and give it a try.

Depends on: 1903073
Assignee: nobody → longsonr
Status: NEW → ASSIGNED

Comment on attachment 9407981 [details]
Bug 1902979 - Don't dispatch SMIL events unless there are listeners r=smaug

Revision D214010 was moved to bug 1903214. Setting attachment 9407981 [details] to obsolete.

Attachment #9407981 - Attachment is obsolete: true

bug 1903214 should help since we'll no longer send any events in these testcases.
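
The idea behind that change can be sketched roughly as follows. This is an illustrative Python model only, not Gecko code: `EventTarget`, `has_listeners`, and `maybe_dispatch` are hypothetical names chosen for the example, and the actual patch in bug 1903214 may be structured quite differently.

```python
from collections import defaultdict


class EventTarget:
    """Hypothetical stand-in for an animation element that can receive events."""

    def __init__(self):
        self._listeners = defaultdict(list)
        self.dispatched = 0  # number of events actually dispatched

    def add_listener(self, event_type, callback):
        self._listeners[event_type].append(callback)

    def has_listeners(self, event_type):
        # .get() avoids creating an empty entry for unknown event types.
        return bool(self._listeners.get(event_type))

    def dispatch(self, event_type):
        self.dispatched += 1
        for callback in list(self._listeners[event_type]):
            callback(event_type)


def maybe_dispatch(target, event_type):
    """Skip all dispatch work when nobody is listening (assumed behaviour)."""
    if not target.has_listeners(event_type):
        return False
    target.dispatch(event_type)
    return True
```

In this model, a test case that registers no listeners never pays for event creation or dispatch, which is the effect described above.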

We could also reduce the number of runnables we create by keeping a queue of (target, event) pairs and scheduling a single runnable per animation frame tick, or some such. That runnable would then dispatch many events. Or something along those lines.
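
A minimal sketch of that batching idea, written in Python purely for illustration. `EventBatcher`, `queue_event`, and `on_tick` are hypothetical names, and each target is modelled as a plain callable; none of this reflects the real Gecko runnable or SMIL APIs.

```python
from collections import deque


class EventBatcher:
    """Hypothetical queue that collects (target, event) pairs between ticks."""

    def __init__(self):
        self._pending = deque()
        self.runnables_created = 0

    def queue_event(self, target, event):
        # Queuing is cheap: no runnable is allocated per event.
        self._pending.append((target, event))

    def on_tick(self):
        """Called once per animation frame tick; returns one runnable or None."""
        if not self._pending:
            return None
        # Take ownership of the current batch so new events go to a fresh queue.
        batch, self._pending = self._pending, deque()
        self.runnables_created += 1

        def run():
            for target, event in batch:
                target(event)  # assumption: a target is just a callable here

        return run
```

With this shape, queuing a thousand events within one tick produces a single runnable rather than a thousand, at the cost of keeping the pending pairs alive until the tick fires.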

Depends on: 1903214

Looks like I misread the profiles yesterday. The ubuntu 18.04 workers for wpt have 16GB of RAM, while the wayland / 22.04 ones have only 8GB, so that explains why they behave differently. The test still times out, but that's separate from killing the worker.

Assignee: longsonr → jcristau
Component: SVG → CI Configuration
Product: Core → Testing

This switches the wayland wpt tasks to run on xlarge workers, like the
corresponding x11 tasks.

Pushed by jcristau@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f917602f4e57 add t-linux-xlarge-wayland worker type. r=jmaher,taskgraph-reviewers,bhearsum

This combined with bug 1903214 is looking great on autoland. Please request Beta approval when you get a chance.

Flags: needinfo?(jcristau)
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 129 Branch

This switches the wayland wpt tasks to run on xlarge workers, like the
corresponding x11 tasks.

Original Revision: https://phabricator.services.mozilla.com/D214127

Attachment #9408366 - Flags: approval-mozilla-beta?

beta Uplift Approval Request

  • User impact if declined: none
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: n/a
  • Risk associated with taking this patch: low
  • Explanation of risk level: test-only
  • String changes made/needed: n/a
  • Is Android affected?: no
Flags: needinfo?(jcristau)
Attachment #9408366 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9408366 - Flags: approval-mozilla-beta+ → approval-mozilla-beta-

Is this not uplifting to beta? I saw the + and now the -.

See my comment in Phabricator.

Depends on: 1900673
Attachment #9408366 - Flags: approval-mozilla-beta- → approval-mozilla-beta?
Attachment #9408366 - Flags: approval-mozilla-beta? → approval-mozilla-beta+