Closed
Bug 1479918
Opened 6 years ago
Closed 6 years ago
HSTS preload list not updating?
Categories
(Core :: Security Block-lists, Allow-lists, and other State, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla63
People
(Reporter: mismith, Assigned: keeler)
References
Details
Attachments
(3 files)
According to <https://wiki.mozilla.org/SecurityEngineering/HTTP_Strict_Transport_Security_(HSTS)_Preload_List>, the HSTS preload list is supposed to be updated daily.
But looking through <https://hg.mozilla.org/mozilla-central/log/tip/security/manager/ssl/nsSTSPreloadList.inc>, I can't find any new domains added to the list in the past month.
There are domains in Chromium's list <https://chromium.googlesource.com/chromium/src/net/+/master/http/transport_security_state_static.json> which are missing and don't appear on the errors list <https://hg.mozilla.org/mozilla-central/raw-file/tip/security/manager/ssl/nsSTSPreloadList.errors>.
Picking an arbitrary example, 'alonetone.com' was added to Chromium's list in May <https://chromium.googlesource.com/chromium/src/net/+/c7fa5723f942f9d4be6c52b5b8bc0574ebca275e%5E%21/> and returns an eligible 'Strict-Transport-Security' header, yet is not in Firefox's list.
Comment 1•6 years ago
|
||
I just made a really quick comparison with reduced versions of
https://chromium.googlesource.com/chromium/src/net/+/master/http/transport_security_state_static.json?format=TEXT (base64), nsSTSPreloadList.inc and nsSTSPreloadList.errors using https://stackoverflow.com/a/4546712:
It looks like 5643 domains are unknown to Firefox.
Is
> "policy": "bulk-1-year"
making problems?
Flags: needinfo?(dkeeler)
Assignee | ||
Comment 2•6 years ago
|
||
Simon, now that we've moved the jobs that produce these files to... docker? taskcluster? I have no idea where to find the logs. Can you point me in the right direction? Thanks! (Also it looks like we no longer update the .errors file - was that intended?)
Flags: needinfo?(dkeeler) → needinfo?(sfraser)
Comment 3•6 years ago
|
||
There's some related comments on the Wikimedia bug tracker task to implement the HSTS preload list for all outgoing links from Wikipedia. See https://phabricator.wikimedia.org/T200745 (I'm Cyde over there.)
Assignee | ||
Comment 4•6 years ago
|
||
When the HSTS preload script was reworked to use async/await in bug 1436369,
`fetchstatus` would create an asynchronous xml http request and then attempt to
access a response header from it. However, there was nothing to ensure that the
request had completed before this code ran. This patch ensures that the request
has completed before the response header is used.
This patch also replaces a lingering instance of `Ci.nsISSLStatusProvider` that
should have been changed to `Ci.nsITransportSecurityInfo` in bug 1475647.
Finally, this patch removes the old, redundant getHSTSPreloadList.js in
security/manager/tools as well as the unused nsSTSPreloadList.errors file in
security/manager/ssl.
Assignee | ||
Updated•6 years ago
|
Updated•6 years ago
|
status-firefox61:
--- → wontfix
status-firefox62:
--- → affected
status-firefox63:
--- → affected
status-firefox-esr52:
--- → unaffected
status-firefox-esr60:
--- → affected
tracking-firefox62:
--- → +
tracking-firefox63:
--- → +
tracking-firefox-esr60:
--- → 62+
Comment 5•6 years ago
|
||
Following this fix, how often will the HSTS preload list in central be updated?
Assignee | ||
Comment 6•6 years ago
|
||
(In reply to Ben McIlwain from comment #5)
> Following this fix, how often will the HSTS preload list in central be
> updated?
At least once a week. Maybe daily.
Comment 7•6 years ago
|
||
Currently twice a week.
Comment 8•6 years ago
|
||
Comment on attachment 8998048 [details]
bug 1479918 - fix HSTS update script to only access XHR headers after each request has completed r?sfraser
Simon Fraser [:sfraser] ⌚️GMT has approved the revision.
Attachment #8998048 -
Flags: review+
Comment 9•6 years ago
|
||
(In reply to [:keeler] (use needinfo) from comment #2)
> Simon, now that we've moved the jobs that produce these files to... docker?
> taskcluster? I have no idea where to find the logs. Can you point me in the
> right direction? Thanks! (Also it looks like we no longer update the .errors
> file - was that intended?)
Yes, that was intended, as that sort of log should not be in-tree.
Tasks can be found in treeherder, such as https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=pfu&fromchange=3cb90f16402bc5e1203e2771dc93553b8377fa40&selectedJob=192208505
Logs are available if you inspect the task.
Flags: needinfo?(sfraser)
Reporter | ||
Comment 10•6 years ago
|
||
When can we expect the first update to the HSTS preload list? Those ~5k URLs :darksider mentioned still appear to be missing from the version in mozilla-central.
Reporter | ||
Comment 11•6 years ago
|
||
(Oh - I see now that the patch was just approved hours ago, so it makes sense that there hasn't been an update yet.)
Comment 12•6 years ago
|
||
Pushed by dkeeler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8015bcc55dae
fix HSTS update script to only access XHR headers after each request has completed r=sfraser
Comment 13•6 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Comment 14•6 years ago
|
||
FYI, I forced a run on m-c tip today after this fix was merged, but I'm not seeing any changes to the HSTS list other than a date bump:
https://phabricator.services.mozilla.com/D3333
Flags: needinfo?(dkeeler)
Comment 15•6 years ago
|
||
Sorry, should have linked to the log for the run too:
https://taskcluster-artifacts.net/VCGhdwBnR2-iQBt64g5HmA/0/public/logs/live_backing.log
Reporter | ||
Comment 16•6 years ago
|
||
Looks like every request to verify the presence of the HSTS header is failing, with messages like "ERROR: exception making request to kiwiirc.com" in the log.
Comment 17•6 years ago
|
||
Confirmed (perfektesgewicht.de). It would be good if the migration to Taskcluster also fixes bug 1309201.
Assignee | ||
Comment 18•6 years ago
|
||
Huh - maybe there's something in the docker environment that's preventing us from connecting to sites (other than to download the source list)? Any ideas, Simon?
Flags: needinfo?(dkeeler) → needinfo?(sfraser)
Comment 19•6 years ago
|
||
(In reply to [:keeler] (use needinfo) from comment #18)
> Huh - maybe there's something in the docker environment that's preventing us
> from connecting to sites (other than to download the source list)? Any
> ideas, Simon?
I've re-run the task mentioned above as an interactive task to try and recreate the conditions as much as possible.
kiwiirc.com appears to be ipv4 only.
# openssl s_client -connect kiwiirc.com:443 -crlf
[... openssl certs elided... ]
GET https://kiwiirc.com HTTP/1.0
HTTP/1.1 400 Bad Request
Date: Wed, 15 Aug 2018 07:34:02 GMT
Content-Type: text/html
Content-Length: 171
Connection: close
Server: cloudflare-nginx
CF-RAY: 44a9f9f669dcbfc7-MAN
perfektesgewicht.de is responding correctly, though:
Strict-Transport-Security: max-age=315360000; includeSubDomains; preload
It was included in the output, presumably due to the check that says 'if we already added it, but couldn't test this time, add it anyway'. Better logging there would be helpful, though.
When we moved it to taskcluster the two variants were producing identical diffs, so I'd need more support that it's due to the migration or worker it's running on.
Flags: needinfo?(sfraser)
Reporter | ||
Comment 20•6 years ago
|
||
(In reply to Simon Fraser [:sfraser] ⌚️GMT from comment #19)
> kiwiirc.com appears to be ipv4 only.
> # openssl s_client -connect kiwiirc.com:443 -crlf
> [... openssl certs elided... ]
> GET https://kiwiirc.com HTTP/1.0
You're missing a Host header:
> $ openssl s_client -connect kiwiirc.com:443 -crlf > [...snip...]
> GET / HTTP/1.1
> Host: kiwiirc.com
>
> HTTP/1.1 200 OK
> Date: Wed, 15 Aug 2018 11:55:05 GMT
> [...snip...]
> Strict-Transport-Security: max-age=5256000
> [...snip...]
Can we re-open this, since the HSTS preload list still isn't updating?
Updated•6 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla63 → ---
Comment 21•6 years ago
|
||
> > $ openssl s_client -connect kiwiirc.com:443 -crlf > [...snip...]
> > GET / HTTP/1.1
> > Host: kiwiirc.com
> >
> > HTTP/1.1 200 OK
> > Date: Wed, 15 Aug 2018 11:55:05 GMT
> > [...snip...]
> > Strict-Transport-Security: max-age=5256000
> > [...snip...]
>
In which case it's likely due to the max age being too low. It's set to 10886400 here:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/docker/periodic-updates/scripts/getHSTSPreloadList.js#28
Reporter | ||
Comment 22•6 years ago
|
||
(In reply to Simon Fraser [:sfraser] ⌚️GMT from comment #21)
> In which case it's likely due to the max age being too low. It's set to
> 10886400 here:
>
> https://dxr.mozilla.org/mozilla-central/source/taskcluster/docker/periodic-
> updates/scripts/getHSTSPreloadList.js#28
I don't think we're even reaching that test, though. The "ERROR: exception making request" message is printed here:
https://dxr.mozilla.org/mozilla-central/rev/4e56a2f51ad739ca52046723448f3129a58f1666/taskcluster/docker/periodic-updates/scripts/getHSTSPreloadList.js#181
signifying that the HTTP request itself is failing.
Anyway, I didn't mean to single out kiwiirc.com, here. In the log :RyanVM linked, I'm seeing that same exception message for *every* domain the script checks for HSTS preloading eligibility - *all* of the HTTP requests seem to be failing.
Comment 23•6 years ago
|
||
(In reply to Michael Smith [:mismith] (non-bmo mail: mds009@eng.ucsd.edu) from comment #22)
> I don't think we're even reaching that test, though. The "ERROR: exception
> making request" message is printed here:
>
> https://dxr.mozilla.org/mozilla-central/rev/
> 4e56a2f51ad739ca52046723448f3129a58f1666/taskcluster/docker/periodic-updates/
> scripts/getHSTSPreloadList.js#181
>
> signifying that the HTTP request itself is failing.
Curiously, that should also have a colon after the hostname at a minimum, even if the exception is empty. What's going on there?
Comment 24•6 years ago
|
||
(In reply to Simon Fraser [:sfraser] ⌚️GMT from comment #23)
> Curiously, that should also have a colon after the hostname at a minimum,
> even if the exception is empty. What's going on there?
Ah, no, ignore that, the current tip doesn't have that part.
Comment 25•6 years ago
|
||
Testing with a smaller number of hosts, I get some errors from inside RedirectAndAuthStopper instead. It seems to be sites like siraweb.org that have a 'moved permanently' response.
JavaScript error: /home/worker/scripts/getHSTSPreloadList.js, line 151: Error: 2152398880
Adding error handlers has indicated that the errors on the full set of hosts are all timeouts. Perhaps the callbacks can't cope with the rate of host queries?
Comment 26•6 years ago
|
||
Instead of having the one script to do everything, it might to be better to split the script into two.
The first script runs weekly or fortnightly to download the HSTS preload list and commit it into code.
The second script runs daily or weekly to check if the domains have valid HSTS header, etc.
Using this method makes sure that the preload list is always current with the Chromium preload list and if the second script breaks for any reason in the future it wouldn't affect the updating of the list.
Comment 27•6 years ago
|
||
(In reply to Burton from comment #26)
> Instead of having the one script to do everything, it might to be better to
> split the script into two.
>
> The first script runs weekly or fortnightly to download the HSTS preload
> list and commit it into code.
> The second script runs daily or weekly to check if the domains have valid
> HSTS header, etc.
>
> Using this method makes sure that the preload list is always current with
> the Chromium preload list and if the second script breaks for any reason in
> the future it wouldn't affect the updating of the list.
I like the idea of splitting up the task, but I'm unsure if it's a good idea to include the contents of the preload list without checking it first. Would we not lose the value of checking it at all?
If we split the task up, I think it'd probably be good to divide up the task of checking the hosts, to avoid overloading a particular worker or hitting dns query rate limits.
Comment 28•6 years ago
|
||
I'm out of my javascript knowledge, at this point. Any advice?
Flags: needinfo?(dkeeler)
Assignee | ||
Comment 29•6 years ago
|
||
I think we're basically causing the platform (i.e. gecko/necko) to livelock by having 50000 simultaneous XHRs. I wrote a patch to break up the requests in batches of 100.
Flags: needinfo?(dkeeler)
Assignee | ||
Comment 30•6 years ago
|
||
Prior to this patch, getHSTSPreloadList.js would queue an XHR for every preload
list candidate site. This meant that there would be ~50,000 requests in flight
simultaneously. Simply processing these requests caused them to all time out,
and no useful work was done. This patch resolves them in batches of 100 to
avoid this issue.
Comment 31•6 years ago
|
||
(In reply to [:keeler] (use needinfo) from comment #30)
> Created attachment 9002019 [details]
> bug 1479918 - batch requests in getHSTSPreloadList.js to avoid livelock
> r?sfraser
>
> Prior to this patch, getHSTSPreloadList.js would queue an XHR for every
> preload
> list candidate site. This meant that there would be ~50,000 requests in
> flight
> simultaneously. Simply processing these requests caused them to all time out,
> and no useful work was done. This patch resolves them in batches of 100 to
> avoid this issue.
How long does the run take, now? The time limit on the task is set to 30 minutes, if we need to adjust that.
Assignee | ||
Comment 32•6 years ago
|
||
Somewhere around 4-6 hours? I forgot to time it accurately. We could potentially increase the batch size until we start seeing too many failures. In theory each batch could take up to 30 seconds, so some quick math makes it t=417/b, where b is the batch size and t is in hours (assuming 50000 domains to probe).
Comment 33•6 years ago
|
||
Ah, shame, I was hoping to not go back to the thing taking such a long time, but it can't be helped. I may adjust the task start times so that they finish around the time they have been, so far. It'll be less surprising for the people landing the changes.
Comment 34•6 years ago
|
||
We'll also *really* need to get to the bottom of the random timeout issues too then.
Comment 35•6 years ago
|
||
I'm running the patched version locally, I started 6 hours ago, it finished the HSTS probing about an hour ago; so it's still got the error in it which causes the event loop to keep spinning once all the tasks have returned.
Comment 36•6 years ago
|
||
Too late for 62 with today's merge to release.
Comment 37•6 years ago
|
||
Comment on attachment 9002019 [details]
bug 1479918 - batch requests in getHSTSPreloadList.js to avoid livelock r?sfraser
Simon Fraser [:sfraser] ⌚️GMT has approved the revision.
Attachment #9002019 -
Flags: review+
Comment 38•6 years ago
|
||
Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e0f0d5824dd8
batch requests in getHSTSPreloadList.js to avoid livelock r=sfraser
Comment 39•6 years ago
|
||
Backed out for failing eslint at builds/worker/checkouts/gecko/taskcluster/docker/periodic-updates/scripts/getHSTSPreloadList.js:260:43
Push that started the failure: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=e0f0d5824dd8aaaaf1395e569cec1806b028b12e
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=196434638&repo=autoland&lineNumber=273
Backout: https://hg.mozilla.org/integration/autoland/rev/2d1a1f2205c5875bac3a97031b205abf4af8b610
Flags: needinfo?(dkeeler)
Comment 40•6 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/9c7178279949
Batch requests in getHSTSPreloadList.js to avoid livelock. r=sfraser
Comment 41•6 years ago
|
||
Trivial issue, I went ahead and fixed and re-pushed.
Flags: needinfo?(dkeeler)
Comment 42•6 years ago
|
||
bugherder |
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Comment 43•6 years ago
|
||
I'd say it worked!
https://hg.mozilla.org/integration/autoland/rev/7adbbec1286f
Comment 44•6 years ago
|
||
bugherder uplift |
Uplifted to ESR60 also:
https://hg.mozilla.org/releases/mozilla-esr60/rev/6554500ae477
https://hg.mozilla.org/releases/mozilla-esr60/rev/825800d57043
And I manually ran the update and got the following changeset:
https://hg.mozilla.org/releases/mozilla-esr60/rev/921a910927efe52404531a5967a0bf84c19ec119
Comment 45•6 years ago
|
||
Dana, it looks like the ESR60 run mostly just removed a bunch of entries and I see differences between the changeset that landed on autoland in comment 43 vs. the one for esr60 in comment 44. Any ideas why that might be?
Flags: needinfo?(dkeeler)
Assignee | ||
Comment 46•6 years ago
|
||
Bug 1475647 removed nsISSLStatusProvider in 63, hence changing the QueryInterface to nsITransportSecurityInfo. we need to keep using nsISSLStatusProvider in the uplifted version, though.
Flags: needinfo?(dkeeler)
Attachment #9005261 -
Flags: review+
Comment 47•6 years ago
|
||
bugherder uplift |
You need to log in
before you can comment on or make changes to this bug.
Description
•