Closed Bug 1958524 Opened 23 days ago Closed 17 days ago

Frequent fenix AABs build aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>

Categories

(Cloud Services :: General, defect)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: amarc, Unassigned)

Details

Attachments

(1 file)

Parsed log: https://treeherder.mozilla.org/logviewer?job_id=502419714&repo=mozilla-central
Full log: https://firefoxci.taskcluster-artifacts.net/Kvv2v7LrSQCzI42yyVEmPg/1/public/logs/live_backing.log

2025-04-04 12:04:24,346 - scriptworker.client - DEBUG - Task is validated against this schema: {'title': 'Taskcluster signing task minimal schema', 'type': 'object', 'properties': {'dependencies': {'type': 'array', 'minItems': 1, 'uniqueItems': True, 'items': {'type': 'string'}}, 'scopes': {'type': 'array', 'minItems': 1, 'uniqueItems': True, 'items': {'type': 'string'}}, 'payload': {'type': 'object', 'properties': {'upstreamArtifacts': {'type': 'array', 'items': {'type': 'object', 'properties': {'taskType': {'type': 'string'}, 'taskId': {'type': 'string'}, 'formats': {'type': 'array', 'uniqueItems': True, 'items': {'type': 'string'}}, 'paths': {'type': 'array', 'minItems': 1, 'uniqueItems': True, 'items': {'type': 'string'}}, 'authenticode_comment': {'type': 'string'}}, 'required': ['taskId', 'taskType', 'paths', 'formats']}, 'minItems': 1, 'uniqueItems': True}}, 'required': ['upstreamArtifacts']}}, 'required': ['scopes', 'payload']}
2025-04-04 12:04:24,352 - asyncio - DEBUG - Using selector: EpollSelector
2025-04-04 12:04:24,353 - signingscript.utils - INFO - Loading Autograph config from /app/configs/passwords.json
2025-04-04 12:04:24,354 - signingscript.utils - INFO - Autograph config loaded from /app/configs/passwords.json
2025-04-04 12:04:24,354 - signingscript.utils - INFO - mkdir /app/workdir/public/build
2025-04-04 12:04:24,354 - signingscript.utils - INFO - Copying /app/workdir/cot/aq67EDc0RVKyb0s6qso1Pw/public/build/target.aab to /app/workdir/public/build/target.aab
2025-04-04 12:04:24,648 - signingscript.script - INFO - signing public/build/target.aab
2025-04-04 12:04:24,648 - signingscript.task - INFO - sign(): Signing 357500730 bytes in /app/workdir/public/build/target.aab with gcp_prod_autograph_apk...
2025-04-04 12:04:24,648 - signingscript.sign - INFO - sign_file(): signing /app/workdir/public/build/target.aab with gcp_prod_autograph_apk... using autograph /sign/file
2025-04-04 12:04:24,648 - signingscript.sign - DEBUG - sign_file_with_autograph: cert_type: project:releng:signing:cert:production-signing, fmt: gcp_prod_autograph_apk
2025-04-04 12:04:24,648 - signingscript.sign - DEBUG - got autograph config: url: https://prod.autograph.prod.webservices.mozgcp.net, id: releng_signingscript_fenix_release_2023, formats: ['gcp_prod_autograph_apk'], key_id: fenix_release_apk
2025-04-04 12:04:24,649 - signingscript.sign - DEBUG - make_signing_req took 0.00s; RSS:324780 (+0)
2025-04-04 12:04:24,649 - signingscript.sign - DEBUG - sign_with_autograph: url: https://prod.autograph.prod.webservices.mozgcp.net/sign/file, keyid: fenix_release_apk, client_id: releng_signingscript_fenix_release_2023
2025-04-04 12:04:29,828 - signingscript.sign - DEBUG - req_size: 476667682
2025-04-04 12:05:01,828 - signingscript.sign - ERROR - Autograph response: 504, stream timeout
2025-04-04 12:05:01,828 - signingscript.sign - DEBUG - call_autograph took 37.18s; RSS:324780 (+0)
2025-04-04 12:05:01,828 - scriptworker.utils - DEBUG - retry_async: call_autograph: sleeping 4.0458637630008765 seconds before retry
2025-04-04 12:05:10,962 - signingscript.sign - DEBUG - req_size: 476667682
2025-04-04 12:05:41,927 - signingscript.sign - DEBUG - Autograph response: 201
2025-04-04 12:05:42,925 - signingscript.sign - DEBUG - call_autograph took 37.05s; RSS:335496 (+10716)
2025-04-04 12:05:42,925 - scriptworker.utils - DEBUG - retry_async: call_autograph: sleeping 8.95886773601595 seconds before retry
2025-04-04 12:05:57,090 - signingscript.sign - DEBUG - req_size: 476667682
2025-04-04 12:06:28,770 - signingscript.sign - DEBUG - Autograph response: 201
2025-04-04 12:06:29,819 - signingscript.sign - DEBUG - call_autograph took 37.93s; RSS:357144 (+21648)
2025-04-04 12:06:29,819 - scriptworker.utils - WARNING - retry_async: call_autograph: too many retries!
2025-04-04 12:06:29,819 - signingscript.sign - DEBUG - sign_with_autograph took 125.17s; RSS:357144 (+32364)
2025-04-04 12:06:29,819 - signingscript.sign - DEBUG - sign_file_with_autograph took 125.17s; RSS:357144 (+32364)
Traceback (most recent call last):
  File "/app/lib/python3.11/site-packages/aiohttp/client_proto.py", line 93, in connection_lost
    uncompleted = self._parser.feed_eof()
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "aiohttp/_http_parser.pyx", line 508, in aiohttp._http_parser.HttpParser.feed_eof
aiohttp.http_exceptions.TransferEncodingError: 400, message:
  Not enough data for satisfy transfer length header.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/bin/signingscript", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/app/lib/python3.11/site-packages/signingscript/script.py", line 139, in main
    return scriptworker.client.sync_main(async_main, default_config=get_default_config())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/scriptworker/client.py", line 166, in sync_main
    loop.run_until_complete(_handle_asyncio_loop(async_main, context))
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/scriptworker/client.py", line 205, in _handle_asyncio_loop
    await async_main(context)
  File "/app/lib/python3.11/site-packages/signingscript/script.py", line 62, in async_main
    output_files = await sign(context, os.path.join(work_dir, path), path_dict["formats"], authenticode_comment=path_dict.get("comment"))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/signingscript/task.py", line 163, in sign
    output = await signing_func(context, output, fmt, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/signingscript/sign.py", line 188, in sign_file
    await sign_file_with_autograph(context, from_, fmt, to=to)
  File "/app/lib/python3.11/site-packages/signingscript/sign.py", line 117, in wrapped
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/signingscript/sign.py", line 1009, in sign_file_with_autograph
    signed_bytes = base64.b64decode(await sign_with_autograph(context.session, a, input_file, fmt, "file", extension_id=extension_id))
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/signingscript/sign.py", line 117, in wrapped
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/signingscript/sign.py", line 973, in sign_with_autograph
    sign_resp = await retry_async(
                ^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/scriptworker/utils.py", line 280, in retry_async
    _check_number_of_attempts(attempt, attempts, func, "retry_async")
  File "/app/lib/python3.11/site-packages/scriptworker/utils.py", line 275, in retry_async
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/signingscript/sign.py", line 117, in wrapped
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/signingscript/sign.py", line 873, in call_autograph
    return await resp.json()
           ^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1276, in json
    await self.read()
  File "/app/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1218, in read
    self._body = await self.content.read()
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/aiohttp/streams.py", line 418, in read
    block = await self.readany()
            ^^^^^^^^^^^^^^^^^^^^
  File "/app/lib/python3.11/site-packages/aiohttp/streams.py", line 440, in readany
    await self._wait("readany")
  File "/app/lib/python3.11/site-packages/aiohttp/streams.py", line 347, in _wait
    await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
exit code: 1


Hi Polly, could you please take a look at what's going on here? Seems like an infra issue, it fails 2/3 times

Flags: needinfo?(polly)

(In reply to amarc from comment #1)

Hi Polly, could you please take a look at what's going on here? Seems like an infra issue, it fails 2/3 times

This is signing server flakiness. Redirecting to releng -- acottner, can you get this where it needs to go or redirect further? Thanks!

Flags: needinfo?(polly) → needinfo?(acottner)

For some reason, I can't put this in Cloud Services :: Operations - Autograph, even though that BZ component appears to exist.

Component: Android Studio and Gradle Integration → General
Product: Firefox Build System → Cloud Services

The autograph work is tracked in Jira now. I'm walking into a Dr appt now but will let my team know and check when I'm out.

Flags: needinfo?(acottner)
Attached image error response metric

I see some 504 responses earlier today, but not for the last 5+ hours. Will investigate further.

PR to increase load balancer request timeout to 60s: https://github.com/mozilla-it/webservices-infra/pull/4708

Timeout increase has been deployed. Will monitor.

Looks like this issue is resolved. I have also adjusted our alerting to the threshold needed to trigger is much lower, so we'll know if this happens going forward.

Status: NEW → RESOLVED
Closed: 17 days ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: