busted partials Intermittent asyncio.exceptions.TimeoutError
Categories
(Release Engineering :: Release Automation: Other, defect)
Tracking
(firefox79 fixed, firefox80 fixed)
People
(Reporter: CosminS, Assigned: sfraser)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file)
Log: https://treeherder.mozilla.org/logviewer.html#?job_id=307608509&repo=mozilla-central
2020-06-26 00:48:02,662 - DEBUG - Bytes downloaded for https://archive.mozilla.org/pub/firefox/nightly/2020/06/2020-06-25-09-44-52-mozilla-central-l10n/firefox-79.0a1.ca-valencia.win64-aarch64.complete.mar: 58738222
2020-06-26 00:48:25,484 - DEBUG - Bytes downloaded for https://archive.mozilla.org/pub/firefox/nightly/2020/06/2020-06-25-09-44-52-mozilla-central-l10n/firefox-79.0a1.ca-valencia.win64-aarch64.complete.mar: 62932526
2020-06-26 00:48:50,914 - WARNING - retry_async: download: too many retries!
Traceback (most recent call last):
  File "/home/worker/bin/funsize.py", line 466, in <module>
    main()
  File "/home/worker/bin/funsize.py", line 455, in main
    manifest = loop.run_until_complete(async_main(args, signing_cert))
  File "/usr/lib/python3.8/asyncio/base_events.py", line 608, in run_until_complete
    return future.result()
  File "/home/worker/bin/funsize.py", line 400, in async_main
    downloads = await download_and_verify_mars(
  File "/home/worker/bin/funsize.py", line 249, in download_and_verify_mars
    await asyncio.gather(*tasks)
  File "/home/worker/bin/funsize.py", line 118, in retry_download
    await retry_async(
  File "/usr/local/lib/python3.8/dist-packages/scriptworker/utils.py", line 262, in retry_async
    _check_number_of_attempts(attempt, attempts, func, "retry_async")
  File "/usr/local/lib/python3.8/dist-packages/scriptworker/utils.py", line 259, in retry_async
    return await func(*args, **kwargs)
  File "/home/worker/bin/funsize.py", line 155, in download
    chunk = await resp.content.read(chunk_size)
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/streams.py", line 368, in read
    await self._wait('read')
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/streams.py", line 296, in _wait
    await waiter
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/helpers.py", line 596, in __exit__
    raise asyncio.TimeoutError from None
asyncio.exceptions.TimeoutError
[taskcluster 2020-06-26 00:48:51.266Z] === Task Finished ===
[taskcluster 2020-06-26 00:48:51.341Z] Artifact "public/build/ca-valencia/target.partial-1.mar" not found at "/home/worker/artifacts/target.partial-1.mar"
[taskcluster 2020-06-26 00:48:51.397Z] Artifact "public/build/ca-valencia/manifest.json" not found at "/home/worker/artifacts/manifest.json"
[taskcluster 2020-06-26 00:48:51.461Z] Artifact "public/build/ca-valencia/target.partial-4.mar" not found at "/home/worker/artifacts/target.partial-4.mar"
[taskcluster 2020-06-26 00:48:51.521Z] Artifact "public/build/ca-valencia/target.partial-2.mar" not found at "/home/worker/artifacts/target.partial-2.mar"
[taskcluster 2020-06-26 00:48:51.577Z] Artifact "public/build/ca-valencia/target.partial-3.mar" not found at "/home/worker/artifacts/target.partial-3.mar"
[taskcluster 2020-06-26 00:48:51.793Z] Unsuccessful task run with exit code: 1 completed in 811.258 seconds
Comment 1 (Reporter)•4 years ago
There were also these bustages overnight: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception&searchStr=partials&revision=ca79c56a0a1b6fcbf823ee46973951b133a73a0c&selectedTaskRun=QWRvv7ncQB2e2E5thSTa8w.0
Aki, could you have a look over these failures? Thank you.
Comment 3•4 years ago
a) In the busted run, we were downloading, just slowly, and
b) the errors went away in run 1,
so I'm fairly certain this is infra-related. However, looking at funsize.py, I see that we use a semaphore to limit how many MARs we generate concurrently. I'm wondering if we should also use a semaphore to limit concurrent downloads.
Simon, do you think a download semaphore might improve things here?
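For illustration, the download-semaphore idea could look roughly like the sketch below. This is not the funsize.py code: `fake_download`, `download_all`, `max_concurrent` and the `stats` dict are hypothetical names, and the network transfer is simulated with `asyncio.sleep` so the example runs on the standard library alone.

```python
import asyncio


async def fake_download(url, semaphore, stats):
    # Hypothetical stand-in for a real HTTP download.
    # Only `max_concurrent` coroutines can be inside this block at once.
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        try:
            await asyncio.sleep(0.01)  # simulated network I/O
            return f"downloaded {url}"
        finally:
            stats["active"] -= 1


async def download_all(urls, max_concurrent=2):
    # Create the semaphore inside the running loop so it binds to that loop.
    semaphore = asyncio.Semaphore(max_concurrent)
    stats = {"active": 0, "peak": 0}
    # All tasks are still scheduled together; the semaphore throttles them.
    results = await asyncio.gather(
        *(fake_download(url, semaphore, stats) for url in urls)
    )
    return results, stats["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(download_all([f"mar-{i}" for i in range(6)]))
    print(len(results), "downloads finished; peak concurrency:", peak)
```

The tasks are still launched together with `asyncio.gather`, so the calling code does not change shape; the semaphore only caps how many transfers share the network link at any moment.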
Comment 4 (Assignee)•4 years ago
Worth a try. I'm unsure whether the instance type these containers run on has changed recently and given them a new network profile, so I'm happy to try it out. The semaphore for the MAR generation is not likely to be limiting things right now if the instance isn't overloaded; it's a safety net.
I wonder if the timeout is too short on the download as well.
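As a rough illustration of how a per-read timeout interacts with a slow but still progressing download, here is a minimal standard-library sketch. It is an assumption-laden stand-in, not the funsize.py or aiohttp code path: `slow_read`, `read_with_timeout` and `demo` are hypothetical names, and the slow read is simulated with `asyncio.sleep`.

```python
import asyncio


async def slow_read():
    # Hypothetical stand-in for reading one chunk over a slow connection.
    await asyncio.sleep(0.05)
    return b"chunk"


async def read_with_timeout(read_chunk, timeout_s):
    # asyncio.wait_for cancels the read and raises asyncio.TimeoutError
    # if it does not finish within timeout_s seconds.
    return await asyncio.wait_for(read_chunk(), timeout=timeout_s)


async def demo():
    outcomes = []
    # The same slow read fails with a tight timeout and succeeds with a
    # more generous one.
    for timeout_s in (0.01, 1.0):
        try:
            data = await read_with_timeout(slow_read, timeout_s)
            outcomes.append(("ok", data))
        except asyncio.TimeoutError:
            outcomes.append(("timeout", None))
    return outcomes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A download that is slow rather than stalled can trip a short timeout repeatedly and exhaust its retries, which is one reason to revisit the timeout value alongside the concurrency limit.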
Comment 5 (Assignee)•4 years ago
Updated•4 years ago
Pushed by sfraser@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4a832f0c89db Reduce download concurrency in partials r=aki
Comment 7•4 years ago
bugherder
Comment 11•4 years ago
bugherder uplift