Closed Bug 1869712 Opened 2 years ago Closed 1 year ago

Intermittent Symsys-symbols-ubuntu [taskcluster:error] Task timeout after 1200 seconds. Force killing container.

Categories

(Toolkit :: Crash Reporting, defect, P5)

Tracking

RESOLVED FIXED
130 Branch
Tracking Status
firefox130 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: gsvelto)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=439902525&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/bKjWy5b8T5iOlu3rQKJitg/runs/0/artifacts/public/logs/live_backing.log


[task 2023-12-13T03:24:25.968Z] bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
[task 2023-12-13T03:24:25.968Z] + ./mach python toolkit/crashreporter/tools/upload_symbols.py https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/W6b_q0ekQ6CTAe9i_JG2-w/artifacts/public/build/target.crashreporter-symbols.zip --ignore-missing
[task 2023-12-13T03:24:27.209Z] INFO:upload-symbols:Using symbol upload token from the secrets service: "http://taskcluster/secrets/v1/secret/project/releng/gecko/build/level-3/gecko-symbol-upload"
[task 2023-12-13T03:24:27.287Z] INFO:upload-symbols:Uploading symbol file "https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/W6b_q0ekQ6CTAe9i_JG2-w/artifacts/public/build/target.crashreporter-symbols.zip" to "https://symbols.mozilla.org/upload/"
[task 2023-12-13T03:24:27.287Z] INFO:upload-symbols:Attempt 1 of 7...
[task 2023-12-13T03:29:27.559Z] ERROR:upload-symbols:Error: got HTTP response 504: GATEWAY_TIMEOUT
[task 2023-12-13T03:29:27.559Z] ERROR:upload-symbols:Response body:
[task 2023-12-13T03:29:27.559Z] ====================
[task 2023-12-13T03:29:27.559Z] 
[task 2023-12-13T03:29:27.559Z] ====================
[task 2023-12-13T03:29:27.559Z] 
[task 2023-12-13T03:29:27.559Z] INFO:upload-symbols:Retrying...
[task 2023-12-13T03:29:37.644Z] INFO:upload-symbols:Attempt 2 of 7...
[task 2023-12-13T03:34:37.014Z] ERROR:upload-symbols:Error: got HTTP response 504: GATEWAY_TIMEOUT
[task 2023-12-13T03:34:37.014Z] ERROR:upload-symbols:Response body:
[task 2023-12-13T03:34:37.014Z] ====================
[task 2023-12-13T03:34:37.014Z] 
[task 2023-12-13T03:34:37.014Z] ====================
[task 2023-12-13T03:34:37.014Z] 
[task 2023-12-13T03:34:37.014Z] INFO:upload-symbols:Retrying...
[task 2023-12-13T03:34:53.361Z] INFO:upload-symbols:Attempt 3 of 7...
[task 2023-12-13T03:39:53.013Z] ERROR:upload-symbols:Error: got HTTP response 504: GATEWAY_TIMEOUT
[task 2023-12-13T03:39:53.013Z] ERROR:upload-symbols:Response body:
[task 2023-12-13T03:39:53.013Z] ====================
[task 2023-12-13T03:39:53.013Z] 
[task 2023-12-13T03:39:53.013Z] ====================
[task 2023-12-13T03:39:53.013Z] 
[task 2023-12-13T03:39:53.013Z] INFO:upload-symbols:Retrying...
[task 2023-12-13T03:40:15.393Z] INFO:upload-symbols:Attempt 4 of 7...
[taskcluster:error] Task timeout after 1200 seconds. Force killing container.
Flags: needinfo?(ahochheiden)
Component: General → Symbols
Flags: needinfo?(ahochheiden) → needinfo?(willkg)
Product: Firefox Build System → Socorro

Weird, it fails only on this merge. On the following merge it was green.

Summary: Perma Symsys-symbols-ubuntu [taskcluster:error] Task timeout after 1200 seconds. Force killing container. → Intermittent Symsys-symbols-ubuntu [taskcluster:error] Task timeout after 1200 seconds. Force killing container.

That file is 2.2 GB, which is definitely on the big side and more likely to hit timeouts. There was a large number of upload API requests around that time. The cluster scaled up as far as it could go, but I can see cases where the gunicorn worker timed out trying to handle uploads.

This issue is ephemeral and you'll see it when the Mozilla Symbols Server is under heavy load. Having it work fine later is entirely plausible.

Flags: needinfo?(willkg)

Moving this to the right product. Socorro doesn't do symbols upload--the Mozilla Symbols Server does.

Component: Symbols → Upload
Product: Socorro → Tecken

I don't think this is a Tecken bug. I think this is something you'll need to fix in the symsys-symbols-ubuntu task. Maybe increase the timeout because even if it's not retrying, it's still doing more work than the 1200 seconds allows for?

Flags: needinfo?(csabou)
Flags: needinfo?(csabou)

(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #12)

> I don't think this is a Tecken bug. I think this is something you'll need to fix in the symsys-symbols-ubuntu task. Maybe increase the timeout because even if it's not retrying, it's still doing more work than the 1200 seconds allows for?

Component: Upload → Crash Reporting
Flags: needinfo?(gsvelto)
Product: Tecken → Toolkit

Indeed, these tasks are uploading a huge number of symbol files. Let's give them more time to do it.
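
The log in the description supports this. A rough check, using the attempt timestamps, the 7-attempt limit and the 1200-second task limit copied from that log, and assuming (this is only an assumption) that every attempt hits the same roughly five-minute gateway timeout as the first three, with the back-off pauses between attempts ignored:

```python
from datetime import datetime

# (start, end) of the three completed upload attempts, copied from the task log.
ATTEMPTS = [
    ("2023-12-13T03:24:27.287Z", "2023-12-13T03:29:27.559Z"),
    ("2023-12-13T03:29:37.644Z", "2023-12-13T03:34:37.014Z"),
    ("2023-12-13T03:34:53.361Z", "2023-12-13T03:39:53.013Z"),
]
MAX_ATTEMPTS = 7           # "Attempt 1 of 7..." in the log
TASK_LIMIT_SECONDS = 1200  # "Task timeout after 1200 seconds" in the log


def parse(ts: str) -> datetime:
    """Parse a log timestamp; the trailing 'Z' (UTC) is dropped."""
    return datetime.strptime(ts.rstrip("Z"), "%Y-%m-%dT%H:%M:%S.%f")


# Seconds each attempt ran before the server answered with a 504.
durations = [(parse(end) - parse(start)).total_seconds() for start, end in ATTEMPTS]
mean_attempt_seconds = sum(durations) / len(durations)

# Assumption: every remaining attempt would time out the same way.
# Back-off pauses are ignored, so this is a lower bound.
worst_case_seconds = MAX_ATTEMPTS * mean_attempt_seconds

print(f"attempt durations (s): {[round(d, 1) for d in durations]}")
print(f"lower bound for {MAX_ATTEMPTS} failing attempts: {worst_case_seconds:.0f} s")
print(f"task limit: {TASK_LIMIT_SECONDS} s")
```

Under that assumption the full retry sequence needs well over the 1200 seconds the task was allowed, so the container is killed before the script can exhaust its retries.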

Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
Flags: needinfo?(gsvelto)

There is an r+ patch which didn't land and no activity in this bug for 2 weeks.
:gsvelto, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit BugBot documentation.

Flags: needinfo?(lissyx+mozillians)
Flags: needinfo?(gsvelto)
Flags: needinfo?(lissyx+mozillians)

Duh, I forgot to land this. Landing it now.

Flags: needinfo?(gsvelto)
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3eac2f1ec013
Increase the maximum duration of system symbol upload tasks to avoid timeouts when encountering 5000+ new symbols at once r=gerard-majax
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 130 Branch