Perma HSTS preload list generation failed
Categories
(Core :: Security Block-lists, Allow-lists, and other State, defect)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox-esr102 | --- | unaffected |
| firefox111 | --- | unaffected |
| firefox112 | + | fixed |
| firefox113 | + | fixed |
People
(Reporter: noriszfay, Assigned: RyanVM)
Details
(Keywords: intermittent-failure)
Attachments
(2 files)
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=409138019&repo=mozilla-beta&lineNumber=84712
Full log: https://firefoxci.taskcluster-artifacts.net/Waz0cdjBQSmWydXEgMd9Ug/2/public/logs/live_backing.log
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
ERROR: exception making request to erenimrek.com.tr
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
ERROR: exception making request to geoactivism.org
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 164: NS_ERROR_ENTITY_CHANGED:
/home/worker/scripts/periodic_file_updates.sh: line 171: 45 Segmentation fault (core dumped) LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:. ./xpcshell "${HSTS_PRELOAD_SCRIPT}" "${HSTS_PRELOAD_INC_OLD}"
+ echo 'HSTS preload list generation failed'
HSTS preload list generation failed
+ exit 43
[taskcluster 2023-03-16 16:12:58.531Z] === Task Finished ===
[taskcluster 2023-03-16 16:12:58.536Z] Artifact "public/build/StaticHPKPins.h.diff" not found at "/home/worker/artifacts/StaticHPKPins.h.diff": (HTTP code 404) no such container - Could not find the file /home/worker/artifacts/StaticHPKPins.h.diff in container 14658f5f02a2cedf0cdd542199a3908fa44f5075977281388c2d1b07ec082adb
[taskcluster 2023-03-16 16:12:58.538Z] Artifact "public/build/remote-settings.diff" not found at "/home/worker/artifacts/remote-settings.diff": (HTTP code 404) no such container - Could not find the file /home/worker/artifacts/remote-settings.diff in container 14658f5f02a2cedf0cdd542199a3908fa44f5075977281388c2d1b07ec082adb
[taskcluster 2023-03-16 16:12:58.540Z] Artifact "public/build/nsSTSPreloadList.diff" not found at "/home/worker/artifacts/nsSTSPreloadList.diff": (HTTP code 404) no such container - Could not find the file /home/worker/artifacts/nsSTSPreloadList.diff in container 14658f5f02a2cedf0cdd542199a3908fa44f5075977281388c2d1b07ec082adb
[taskcluster 2023-03-16 16:13:00.197Z] Unsuccessful task run with exit code: 43 completed in 9690.188 seconds
Assignee
Comment 1•1 year ago
I've been poking at this a bit. Unfortunately, the last green run on Beta was prior to the Gecko 112 uplift on Monday. However, we did have a green run on mozilla-central on the last revision prior to the merge, which makes me feel reasonably good that this could have been caused by something which landed on 113 and got uplifted to Beta.
Ignoring the 112 merge commit, here's a rough regression range:
https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=4597c864a7b163089de157fc6eefcaa67afdd42d&tochange=853c78c9c586a11ffbd5279667cf514f0dba2542
From that, only bug 1569405 really seems to stand out to me. The ORB change doesn't seem likely because it would presumably have been causing issues on m-c already if it were the problem.
Comment 2•1 year ago
Hmm, that sounds unlikely to me. That patch really just adds a null check, so if it were relevant here I'd expect to see a crash without it.
Assignee
Comment 3•1 year ago
Assignee
Comment 4•1 year ago
I'm disabling the HSTS updates for now to at least unblock the rest of the job.
Pushed by rvandermeulen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/26eb1010f766 Temporarily disable HSTS pinning updates due to crashes during the run. r=jcristau
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/mozilla-central/rev/20511d3af52f Temporarily disable HSTS pinning updates due to crashes during the run. r=jcristau, a=NPOTB DONTBUILD
Assignee
Comment 7•1 year ago
Comment 8•1 year ago
bugherder
Comment hidden (Intermittent Failures Robot)
Assignee
Comment 10•1 year ago
Right now, my best guess is that a latent bug is being triggered by a remote change. After doing a bit of Try hackery to force periodic_file_updates.sh to pull a specific build rather than the latest from the index, I can reproduce the crashes even off a rev that previously ran green.
It is interesting that ESR102 isn't crashing. Suggests that it was something recent(ish).
Assignee
Comment 11•1 year ago
One more wrinkle - of the 8 Try pushes I initially triggered, 3 actually managed to run without crashing.
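With 5 of the 8 pushes crashing, one green run says little about whether a given revision is actually good, which is part of what makes bisecting an intermittent failure slow. The following is a minimal sketch of that arithmetic, assuming each run crashes independently at the observed rate (an assumption; nothing in this bug establishes it). Both function names are hypothetical and exist only for illustration.

```javascript
// Hypothetical helpers for reasoning about intermittent failures.
// Assumption: every run fails independently with the same probability.

// Probability that `runs` consecutive runs all pass on a revision that
// really does carry the bug.
function chanceAllGreen(failureRate, runs) {
  return Math.pow(1 - failureRate, runs);
}

// Smallest number of consecutive green runs needed before the chance of a
// misleading "all green" result drops to `maxFalseGreen` or below.
function runsNeeded(failureRate, maxFalseGreen) {
  if (!(failureRate > 0 && failureRate < 1)) {
    throw new RangeError("failureRate must be strictly between 0 and 1");
  }
  if (!(maxFalseGreen > 0 && maxFalseGreen < 1)) {
    throw new RangeError("maxFalseGreen must be strictly between 0 and 1");
  }
  return Math.ceil(Math.log(maxFalseGreen) / Math.log(1 - failureRate));
}

// Observed above: 5 crashes out of 8 Try pushes.
const observedFailureRate = 5 / 8;
const retriggers = runsNeeded(observedFailureRate, 0.05);
```

Under those assumptions, each bisection step would need several retriggers per revision rather than one before a green result could be trusted.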
Assignee
Comment 12•1 year ago
Dana, I've been striking out trying to bisect this any further on Try. Do you have any ideas for how we might reproduce these crashes locally so they can be caught in a debugger?
You could try taking subsets of the preload list so it doesn't take so long to process. You might even be able to narrow it down to a specific domain. I think modifying hostsToContact would be the easiest way to do this.
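The subsetting idea could be sketched roughly as follows. This is not code from getHSTSPreloadList.js: `takeSlice` is a hypothetical helper, and it assumes the hosts to contact are available as a plain array, which may not match how `hostsToContact` is actually built in the script. The host names are made up.

```javascript
// Hypothetical helper, not part of getHSTSPreloadList.js.
// Assumption: the hosts to probe are held in a plain array.
// Returns the `sliceIndex`-th of `sliceCount` contiguous slices.
function takeSlice(hosts, sliceIndex, sliceCount) {
  if (!Number.isInteger(sliceCount) || sliceCount < 1) {
    throw new RangeError("sliceCount must be a positive integer");
  }
  if (!Number.isInteger(sliceIndex) || sliceIndex < 0 || sliceIndex >= sliceCount) {
    throw new RangeError("sliceIndex must be in [0, sliceCount)");
  }
  // Round up so every host lands in exactly one slice.
  const size = Math.ceil(hosts.length / sliceCount);
  return hosts.slice(sliceIndex * size, (sliceIndex + 1) * size);
}

// Example with made-up hosts: probe only the second quarter of the list.
const exampleHosts = [
  "a.example", "b.example", "c.example", "d.example",
  "e.example", "f.example", "g.example", "h.example",
];
const subset = takeSlice(exampleHosts, 1, 4);
```

Running the script once per slice, then halving whichever slice still crashes, would narrow things down quickly if a single domain were responsible.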
Assignee
Comment 14•1 year ago
So far, I've been able to confirm that the crashes happen intermittently, but frequently. They also happen at different points during the run, suggesting that they aren't tied to one specific domain. My guess is that something changed during the 112 cycle that made us more likely to OOM or something similar, and that this is how it manifests. Stronger evidence for that is that if I change the instance size to xlarge, I can't get any failures on Try. I'm not in a position to narrow down further why we seem to be more prone to these issues than we were previously, so I'm going to just submit the patch to bump the instance size.
After talking to Dana more, I think we might want to rethink our general approach to these jobs, however. We already have bug 1810856 on file for splitting the repo-update job into two different ones so we can more easily trigger remote-settings updates without having to wait hours for the HSTS updates as well. We could potentially take that one step further and split the HSTS part into multiple phases, where we probe the sites across multiple jobs running in parallel and then combine the results to produce a final patch. Doing so would probably put us in a much better place, given how heavyweight this whole process is at the moment.
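The probe-in-parallel-then-combine idea might look something like the sketch below. It is purely illustrative: `shardHosts`, `mergeResults`, `fakeProbe`, and the `{ name, includeSubdomains }` entry shape are all hypothetical, and nothing here reflects how the real scripts or Taskcluster jobs are structured.

```javascript
// Hypothetical sketch; none of these names exist in the real scripts.

// Deal hosts out round-robin so each of the `shardCount` jobs gets a
// near-equal share of the list.
function shardHosts(hosts, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  const shards = Array.from({ length: shardCount }, () => []);
  hosts.forEach((host, i) => shards[i % shardCount].push(host));
  return shards;
}

// Combine per-shard probe results into one list, de-duplicated by host
// name and sorted so the final output does not depend on job order.
function mergeResults(shardResults) {
  const merged = new Map();
  for (const results of shardResults) {
    for (const entry of results) {
      merged.set(entry.name, entry);
    }
  }
  return [...merged.values()].sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0
  );
}

// Stand-in for the real network probe (assumed entry shape).
const fakeProbe = host => ({ name: host, includeSubdomains: false });

const shards = shardHosts(["c.example", "a.example", "b.example"], 2);
const combined = mergeResults(shards.map(shard => shard.map(fakeProbe)));
```

Sorting in the merge step keeps the generated list stable regardless of which job finishes first, so the diff in the final patch stays small.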
Assignee
Comment 15•1 year ago
Comment 16•1 year ago
Pushed by rvandermeulen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b769d68836a6 Switch the repo-update job to xlarge instances and re-enable HSTS updates. r=jcristau DONTBUILD
Comment 17•1 year ago
bugherder
Assignee
Comment 18•1 year ago
bugherder uplift