Closed Bug 1922641 Opened 1 month ago Closed 27 days ago

Frequently failing jobs ending in claim expired / worker shutdown / intermittent task

Categories

(Tree Management :: Treeherder, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: tszentpeteri, Assigned: ahal)

Details

(Whiteboard: [stockwell disable-recommended])

Attachments

(1 file)

Update:
@relsre was pinged to check the workers. It sounds like many jobs are failing due to a potential OOM issue when the workers run the tests. One of the logs is this one: https://firefox-ci-tc.services.mozilla.com/tasks/YeWJdYO0St-EOco2LYldEg/runs/0/logs/public/logs/live.log. You should be able to see the worker node info at the beginning of the log and the error towards the bottom of the log.

I’m seeing many claim expired exceptions for this worker pool, well before the services release.
Seeing tons of errors at https://firefox-ci-tc.services.mozilla.com/worker-manager/gecko-t%2Ft-linux-xlarge-noscratch-gcp/errors:
The zone 'projects/fxci-production-level1-workers/zones/us-central1-b' does not have enough resources available to fulfill the request. '(resource type:compute)'.

Flags: needinfo?(aerickson)

This is more likely to be spot preemption than OOM; sometimes we detect it and retry the task automatically, but not always.

:yarik and :mboris,

Yes, this does seem to be due to spot instance shutdowns. More details about this in https://mozilla.slack.com/archives/CKBFXRD1T/p1727982186900009.

This doesn't seem to be a new thing, but it seems to be hitting autoland jobs hard and the sheriffs are noticing.

Could worker-manager take preemption shutdown events into account (I think right now it only considers whether instances couldn't be launched due to lack of capacity) and try different zones, or track which zones are having issues and use others?

I guess we could attack this at the taskgraph layer by having autoland jobs go to a non-spot worker pool.

We can add more zones to the configs, but I'm not sure it will help.

Thoughts?

Thanks!

Flags: needinfo?(ykurmyza)
Flags: needinfo?(mboris)
Flags: needinfo?(aerickson)

Adding zones would be beneficial, but so would adding additional instance types that are sufficient to run these tasks on (in the hope that the new instance types aren't in as high demand, resulting in fewer spot terminations). The other option, as you mentioned Andy, is going with an on-demand instance type.

The worker manager specific question is good for Yarik.

Flags: needinfo?(mboris)

Yeah, I think we need to include more regions/zones to avoid this.
At the moment workers do not communicate preemption events back to worker manager, so it doesn't know what is happening on that side.
We could monitor task exceptions with the worker-shutdown reason, but I'm not sure how we would use that number if we started tracking it.
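
(For reference, a rough sketch of what counting those exceptions from the outside could look like, assuming the standard taskcluster Python client against the firefox-ci deployment; where the task IDs come from, e.g. a Treeherder or index query, is left out here.)

```python
# Hypothetical sketch, not existing tooling: count runs resolved as
# "worker-shutdown" for a set of task IDs via the Taskcluster Queue API.
import taskcluster

queue = taskcluster.Queue(
    {"rootUrl": "https://firefox-ci-tc.services.mozilla.com"}
)

def count_worker_shutdowns(task_ids):
    shutdowns = 0
    for task_id in task_ids:
        # Each run in the status carries state and reasonResolved.
        runs = queue.status(task_id)["status"].get("runs", [])
        shutdowns += sum(
            1
            for run in runs
            if run.get("state") == "exception"
            and run.get("reasonResolved") == "worker-shutdown"
        )
    return shutdowns
```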

In the feature that is being developed, worker manager will be able to react to quota errors by trying to provision less in that region and preferring other regions for some time.

Flags: needinfo?(ykurmyza)

Tracking work to add more zones/regions for these workers in https://mozilla-hub.atlassian.net/browse/RELOPS-1103.

I don't think we should add more zones/regions here.

What I suspect might be happening is that, since the deploy of tc 72.0.1 (just a few hours before this bug was filed), Treeherder doesn't classify tasks as retried, so they bubble up to the sheriffs. That deploy included the fix for https://github.com/taskcluster/taskcluster/issues/7174, which changed how/when tc sends pulse messages for retried tasks, so that looks like a likely cause of the increase in tasks that show up as "exception" instead of "retry".

Component: Workers → Treeherder
Product: Taskcluster → Tree Management
Version: unspecified → ---

So currently Treeherder looks back at runId - 1 when it gets the event for reruns:
https://github.com/mozilla/treeherder/blob/fb59b00868b6d90083531891beca53a477107403/treeherder/etl/taskcluster_pulse/handler.py#L297-L300

But now it's going to need to look forward at runId + 1 when it gets task-exception events. So basically, when you get a task-exception event, inspect runId + 1: if it has reasonCreated = "retry" (or task-retry?), classify the current run as retry; otherwise classify the current run as exception.
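
A minimal sketch of that lookup, as a hypothetical helper rather than the actual Treeherder patch; it assumes the handler has the Taskcluster task status (with its full runs list) available when the task-exception message arrives:

```python
# Hypothetical helper, not the actual Treeherder code. `task_status` is the
# Taskcluster status object from the pulse message, whose "runs" entries
# carry "reasonCreated".

def classify_exception_run(task_status, run_id):
    """Classify a run that ended in exception.

    If a later run (run_id + 1) exists and was created as a retry, report
    the current run as "retry" so it doesn't surface to sheriffs; otherwise
    keep it as "exception".
    """
    runs = task_status.get("runs", [])
    next_run = runs[run_id + 1] if run_id + 1 < len(runs) else None
    if next_run and next_run.get("reasonCreated") in ("retry", "task-retry"):
        return "retry"
    return "exception"
```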

I don't have a treeherder dev environment set up, but I'll take an initial stab at a patch.

Assignee: nobody → ahal
Status: NEW → ASSIGNED