Intermittent periodic file update [taskcluster:error] Task timeout after 18000 seconds. Force killing container.
Categories
(Release Engineering :: Release Automation: Other, defect)
Tracking
(firefox-esr7887+ fixed, firefox86 fixed, firefox87+ fixed, firefox88+ fixed)
People
(Reporter: intermittent-bug-filer, Assigned: jcristau)
References
Details
(Keywords: intermittent-failure)
Attachments
(3 files, 1 obsolete file)
Filed by: malexandru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=330707720&repo=mozilla-beta
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/YpyjYCLZSQe97EEa-5_Nng/runs/0/artifacts/public/logs/live_backing.log
@@ -11777,7 +11781,6 @@
// Fastly Inc. : http://www.fastly.com/
// Submitted by Fastly Security <security@fastly.com>
-edgecompute.app
fastly-terrarium.com
fastlylb.net
map.fastlylb.net
@@ -12614,10 +12617,6 @@
*.northflank.app
*.code.run
-// Noticeable : https://noticeable.io
-// Submitted by Laurent Pellegrino <security@noticeable.io>
-noticeable.news
-
// Now-DNS : https://now-dns.com
// Submitted by Steve Russell <steve@now-dns.com>
dnsking.ch
+ '[' -s /home/worker/artifacts/effective_tld_names.diff ']'
+ return 0
+ SUFFIX_LIST_UPDATED=true
+ '[' true == false ']'
+ '[' '' == true ']'
+ clone_repo
+ cd /home/worker
+ '[' '!' -d mozilla-beta ']'
+ /usr/bin/hg robustcheckout --sharebase /tmp/hg-store -b default https://hg.mozilla.org/releases/mozilla-beta mozilla-beta
(using Mercurial 4.8.1)
ensuring https://hg.mozilla.org/releases/mozilla-beta@default is available at mozilla-beta
(sharing from new pooled repository 8ba995b74e18334ab3707f27e9eb8f4e37ba3d29)
applying clone bundle from https://s3-external-1.amazonaws.com/moz-hg-bundles-us-east-1/releases/mozilla-beta/55fe52412f1546f6e4550dc1ee64fefce542b527.stream-v2.hg
611975 files to transfer, 3.49 GB of data
transferred 3.49 GB in 127.6 seconds (28.0 MB/sec)
finished applying clone bundle
searching for changes
adding changesets
adding manifests
adding file changes
added 3771 changesets with 23004 changes to 10456 files
new changesets d44181196337:a8a1f0369b83
searching for changes
no changes found
(pulling to obtain default)
(remote resolved default to a8a1f0369b834c74b1445833b2460617bac947f5; result is not deterministic)
(revision already present locally; not pulling)
(warning: large working directory being used without fsmonitor enabled; enable fsmonitor to improve performance; see "hg help -e fsmonitor")
287704 files updated, 0 files merged, 0 files removed, 0 files unresolved
updated to a8a1f0369b834c74b1445833b2460617bac947f5
PERFHERDER_DATA: {"framework": {"name": "vcs"}, "suites": [{"extraOptions": ["m5.4xlarge"], "hgVersion": "4.8.1", "lowerIsBetter": true, "name": "clone", "serverUrl": "hg.mozilla.org", "shouldAlert": false, "subtests": [], "value": 156.22249293327332}, {"extraOptions": ["m5.4xlarge"], "hgVersion": "4.8.1", "lowerIsBetter": true, "name": "update", "serverUrl": "hg.mozilla.org", "shouldAlert": false, "subtests": [], "value": 58.85510206222534}, {"extraOptions": ["m5.4xlarge"], "hgVersion": "4.8.1", "lowerIsBetter": true, "name": "overall", "serverUrl": "hg.mozilla.org", "shouldAlert": false, "subtests": [], "value": 217.2575089931488}, {"extraOptions": ["m5.4xlarge"], "hgVersion": "4.8.1", "lowerIsBetter": true, "name": "overall_clone", "serverUrl": "hg.mozilla.org", "shouldAlert": false, "subtests": [], "value": 217.2575089931488}, {"extraOptions": ["m5.4xlarge"], "hgVersion": "4.8.1", "lowerIsBetter": true, "name": "overall_clone_fullcheckout", "serverUrl": "hg.mozilla.org", "shouldAlert": false, "subtests": [], "value": 217.2575089931488}]}
+ /usr/bin/hg -R mozilla-beta pull
pulling from https://hg.mozilla.org/releases/mozilla-beta
searching for changes
no changes found
+ /usr/bin/hg -R mozilla-beta update -C default
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+ COMMIT_MESSAGE='No Bug, mozilla-beta repo-update'
+ '[' true == true ']'
+ stage_hsts_files
+ cd /home/worker
++ basename /home/worker/data/nsSTSPreloadList.inc
+ cp -f /home/worker/firefox/nsSTSPreloadList.inc mozilla-beta/security/manager/ssl/
+ COMMIT_MESSAGE='No Bug, mozilla-beta repo-update HSTS'
+ '[' true == true ']'
+ stage_hpkp_files
+ cd /home/worker
+ cp -f /home/worker/data/StaticHPKPins.h.out mozilla-beta/security/manager/ssl/StaticHPKPins.h
+ COMMIT_MESSAGE='No Bug, mozilla-beta repo-update HSTS HPKP'
+ '[' true == true ']'
+ stage_remote_settings_files
+ cd /home/worker
+ cp -a /home/worker/data/remote-settings.out/blocklists /home/worker/data/remote-settings.out/main /home/worker/data/remote-settings.out/pinning /home/worker/data/remote-settings.out/security-state mozilla-beta/services/settings/dumps
+ COMMIT_MESSAGE='No Bug, mozilla-beta repo-update HSTS HPKP remote-settings'
+ '[' true == true ']'
+ stage_tld_suffix_files
+ cd /home/worker
+ cp -a public_suffix_list.dat mozilla-beta//netwerk/dns/effective_tld_names.dat
+ COMMIT_MESSAGE='No Bug, mozilla-beta repo-update HSTS HPKP remote-settings tld-suffixes'
+ '[' false == true ']'
+ '[' false == true ']'
+ '[' true == true ']'
+ COMMIT_MESSAGE='No Bug, mozilla-beta repo-update HSTS HPKP remote-settings tld-suffixes - a=repo-update'
+ /usr/bin/hg -R mozilla-beta commit -u 'ffxbld <ffxbld@mozilla.com>' -m 'No Bug, mozilla-beta repo-update HSTS HPKP remote-settings tld-suffixes - a=repo-update'
+ push_repo
+ cd mozilla-beta
+ '[' '!' -r /home/worker/.arcrc ']'
++ command -v arc
+ ARC=/usr/bin/arc
+ '[' -z ryanvm ']'
++ /usr/bin/arc list
++ grep 'Needs Review'
++ grep -E 'mozilla-beta repo-update'
++ awk 'match($0, /D[0-9]+[^: ]/) { print substr($0, RSTART, RLENGTH) }'
+ /usr/bin/arc diff --verbatim --reviewers ryanvm
Linting...
No lint engine configured for this project.
Running unit tests...
No unit test engine is configured for this project.
Diff for 'services/settings/dumps/security-state/intermediates.json' with
context is 4,198,657 bytes in length. Generally, source changes should
not be this large. If this file is a huge text file, try using the
'--less-context' flag. If the file is not a text file, you can mark it
'binary'. Mark this file as 'binary' and continue? [y/N]
[taskcluster:error] Task timeout after 18000 seconds. Force killing container.
[taskcluster 2021-02-22 15:06:42.575Z] === Task Finished ===
[taskcluster 2021-02-22 15:06:42.575Z] Unsuccessful task run with exit code: -1 completed in 18009.359 seconds```
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Pushed by archaeopteryx@coole-files.de: https://hg.mozilla.org/integration/autoland/rev/89d3fb47753d use --less-context option on submission of periodic file update patches to work around Phabricator limit to changes in file. r=mtabara DONTBUILD
Comment 3•3 years ago
|
||
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 4•3 years ago
|
||
We need to be able to update this data, so tracking this as a release blocker.
Assignee | ||
Updated•3 years ago
|
Comment 5•3 years ago
|
||
bugherder |
Comment 6•3 years ago
|
||
The recent spike seems to be caused by:
https://hg.mozilla.org/releases/mozilla-release/rev/755246fd3b355aaa9885c5159523de1a75abed65
https://hg.mozilla.org/releases/mozilla-beta/rev/17c99982b6a9898cd5a69f5d75a0d290f121f428
Failure log: https://treeherder.mozilla.org/logviewer?job_id=331234368&repo=mozilla-beta&lineNumber=77840
...
No unit test engine is configured for this project.
Diff for 'services/settings/dumps/security-state/intermediates.json' with
context is 4,198,656 bytes in length. Generally, source changes should
not be this large. If this file is a huge text file, try using the
'--less-context' flag. If the file is not a text file, you can mark it
'binary'. Mark this file as 'binary' and continue? [y/N]
[taskcluster:error] Task timeout after 18000 seconds. Force killing container.
[taskcluster 2021-02-26 01:35:49.669Z] === Task Finished ===
[taskcluster 2021-02-26 01:35:49.669Z] Unsuccessful task run with exit code: -1 completed in 18009.488 seconds
Mihai, could you, please, take a look?
Comment 7•3 years ago
|
||
(In reply to Bogdan Tara[:bogdan_tara | bogdant] from comment #6)
The recent spike seems to be caused by:
https://hg.mozilla.org/releases/mozilla-release/rev/755246fd3b355aaa9885c5159523de1a75abed65
https://hg.mozilla.org/releases/mozilla-beta/rev/17c99982b6a9898cd5a69f5d75a0d290f121f428Failure log: https://treeherder.mozilla.org/logviewer?job_id=331234368&repo=mozilla-beta&lineNumber=77840
... No unit test engine is configured for this project. Diff for 'services/settings/dumps/security-state/intermediates.json' with context is 4,198,656 bytes in length. Generally, source changes should not be this large. If this file is a huge text file, try using the '--less-context' flag. If the file is not a text file, you can mark it 'binary'. Mark this file as 'binary' and continue? [y/N] [taskcluster:error] Task timeout after 18000 seconds. Force killing container. [taskcluster 2021-02-26 01:35:49.669Z] === Task Finished === [taskcluster 2021-02-26 01:35:49.669Z] Unsuccessful task run with exit code: -1 completed in 18009.488 seconds
Mihai, could you, please, take a look?
Aware of this, thanks for raising it. We no longer have an owner on this in RelEng, trying to dig to find a suitable solution here.
Assigning this to me for now.
Comment 8•3 years ago
|
||
Dana, something changed recently causing our periodic updates of intermediates.json to cause hugely bloated diffs (>4MB), breaking the jobs due to Phabricator limits in diff size. Is there anything you can think of that we could do here to lesson the size of this? I assume they're all on one line for reasons.
Comment 9•3 years ago
|
||
I've pushed a manual update of intermediates.json to mozilla-release. Mihai is going to manually trigger a new run of the pfu cron job to confirm that it runs green. If so, I'll do the same for the other trees.
https://hg.mozilla.org/releases/mozilla-release/rev/1492ed8d716e5987847b2b8c56861c0ac2d551ce
Comment 10•3 years ago
|
||
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/mozilla-central/rev/96cb6ca6f536 Manually update intermediates.json to work around arc limitations. r+a=me DONTBUILD
Comment 11•3 years ago
|
||
Appears to have worked!
https://phabricator.services.mozilla.com/D106649
Updates to intermediates.json pushed to the other repos too:
https://hg.mozilla.org/mozilla-central/rev/96cb6ca6f536f5fd18c33994f891cabe01b4ccf8
https://hg.mozilla.org/releases/mozilla-beta/rev/6e386621dfa026e54fdffda0dde6e4f5dfe9d948
https://hg.mozilla.org/releases/mozilla-esr78/rev/d3a399f9e5869f8393d4a6ce970dca459ec00d96
Comment 12•3 years ago
|
||
Also, while it may become obsolete if we switch from arc to moz-phab here, it's probably worth noting that we're timing out at the moment because we're not properly noticing the fatal arc error and terminating the job, so instead we just sit there after the error until eventually the harness times out the job instead.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #8)
Dana, something changed recently causing our periodic updates of intermediates.json to cause hugely bloated diffs (>4MB), breaking the jobs due to Phabricator limits in diff size. Is there anything you can think of that we could do here to lesson the size of this? I assume they're all on one line for reasons.
I think they're all on line because of how wget saves the json data: https://searchfox.org/mozilla-central/rev/362676fcadac37f9f585141a244a9a640948794a/taskcluster/docker/periodic-updates/scripts/periodic_file_updates.sh#307 - my (optimistic) guess is that isn't necessary. Again, optimistically, maybe pretty-printing it would make diffs more manageable?
Comment 14•3 years ago
|
||
The workaround from comment 11 got the jobs running green again for now, so I'm downgrading the severity as we're no longer blocked. That said, the overall situation is still fragile and we still need prioritize making this more robust before the next change to intermediates.json puts us back here again.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 16•3 years ago
|
||
Should we try something like this then:
diff --git a/taskcluster/docker/periodic-updates/scripts/periodic_file_updates.sh b/taskcluster/docker/periodic-updates/scripts/periodic_file_updates.sh
--- a/taskcluster/docker/periodic-updates/scripts/periodic_file_updates.sh
+++ b/taskcluster/docker/periodic-updates/scripts/periodic_file_updates.sh
@@ -304,7 +304,7 @@ function compare_remote_settings_files {
remote_records_url="$REMOTE_SETTINGS_SERVER/buckets/${bucket}/collections/${collection}/records"
local_location_output="$REMOTE_SETTINGS_OUTPUT/${bucket}/${collection}.json"
mkdir -p "$REMOTE_SETTINGS_OUTPUT/${bucket}"
- ${WGET} -qO "$local_location_output" "$remote_records_url"
+ ${WGET} -qO- "$remote_records_url" | ${JQ} . > "${local_location_output}"
# 5. Download attachments if needed.
if [ "${bucket}" = "blocklists" ] && [ "${collection}" = "addons-bloomfilters" ]; then
Mathieu, would remote settings be OK with dumps getting reformatted this way?
Comment 17•3 years ago
|
||
Mathieu, would remote settings be OK with dumps getting reformatted this way?
Yes, sure, no problem. Making the diffs easier to read would probably benefit everyone :)
The loading code is straightforward and since it's fetching from resource://
the overhead should be negligible.
Assignee | ||
Comment 18•3 years ago
|
||
One other question I guess: is the ordering of entries kept mostly the same when the "intermediates" collection is updated? If not, the diff will probably still be quite big when pretty-printed.
Assignee | ||
Comment 19•3 years ago
|
||
Answering my own question, it looks like entries are sorted by last_modified so we should be good.
Assignee | ||
Comment 20•3 years ago
|
||
This will make the diffs both easier to review, and smaller; the latter
should avoid bumping into arc's hardcoded 4MB size limit for diffs.
Comment 21•3 years ago
|
||
Pushed by jcristau@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0a68e4b60e1c pretty-print remote settings dumps. r=releng-reviewers,bhearsum DONTBUILD
Assignee | ||
Comment 22•3 years ago
|
||
Stealing this bug from Mihai.
Assignee | ||
Comment 23•3 years ago
|
||
This commit was created by running:
for f in services/settings/dumps//.json; do jq . < $f | sponge $f; done
Future updates will automatically be pretty-printed, so this is a
one-time manual update to get us to the transitioned state.
Comment 24•3 years ago
|
||
Pushed by rvandermeulen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4770ba6932ee manually run all remote-settings dumps through jq. r=RyanVM
Assignee | ||
Updated•3 years ago
|
Comment 25•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/0a68e4b60e1c
https://hg.mozilla.org/mozilla-central/rev/4770ba6932ee
Comment 26•3 years ago
|
||
Comment 27•3 years ago
|
||
Pushed by asasaki@mozilla.com: https://hg.mozilla.org/ci/ci-configuration/rev/b64dae93e3b6 enable periodic-update manual cron. r=mtabara,RyanVM
Comment 28•3 years ago
|
||
https://phabricator.services.mozilla.com/D106867
The remote settings changes are a 39k diff now :-)
Assignee | ||
Comment 29•3 years ago
|
||
bugherder uplift |
Comment 30•3 years ago
|
||
(In reply to Dorel Luca [:dluca] from comment #5)
Backed out because we don't need it after all.
https://hg.mozilla.org/integration/autoland/rev/d3f2575c07e19a20bd71374670d41825b7f92387
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 31•3 years ago
|
||
bugherder uplift |
Comment 32•3 years ago
|
||
bugherder uplift |
Comment hidden (Intermittent Failures Robot) |
Description
•