compatibility issues with domain labels beginning or ending with hyphens
Categories
(NSS :: Libraries, defect)
Tracking
(Not tracked)
People
(Reporter: annevk, Assigned: keeler)
References
(Blocks 1 open bug)
Details
(Whiteboard: [psm-backlog])
Attachments
(2 files)
Comment 26•6 years ago
Just encountered this bug myself in the wild (student web page; they use Chrome). Here is a test page that should be good to test against:
https://subdomain-ending-with-a-dash-.glitch.me/
In Chrome it shows no warning. IMO, this seems like one of the places where the browser should just accept the malformed URL, since all other parts of the system accept it. Alternatively, change the message so it includes a line saying that the hostname fails domain name syntax verification, since the certificate is still valid, even if it may be unintentional.
Comment 28•5 years ago
Perhaps an approach here is that we don't validate wildcarded labels? So we remain strict for registrable domains, but any subdomains of those that use a wildcarded certificate are reachable, as long as the networking stack allows.
So you cannot use a certificate for test--.example.com, but if you have a certificate for *.example.com, test--.example.com uses that, and the browser generally allows for navigating to test--.example.com, mozilla::pkix won't complain.
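The proposal above can be sketched roughly as follows. This is an illustrative Python sketch only, not mozilla::pkix code: the names `is_strict_label` and `name_matches` are hypothetical, and other wildcard rules a real verifier applies (such as a minimum number of labels or IDN handling) are left out.

```python
import re

# Strict LDH label: starts and ends with a letter or digit, at most 63 characters.
STRICT_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_strict_label(label: str) -> bool:
    """Return True if `label` follows the strict letter-digit-hyphen syntax."""
    return STRICT_LABEL.fullmatch(label) is not None


def name_matches(presented: str, reference: str) -> bool:
    """Return True if the certificate name `presented` covers the host `reference`."""
    p_labels = presented.lower().split(".")
    r_labels = reference.lower().split(".")
    if len(p_labels) != len(r_labels):
        return False

    if p_labels[0] == "*":
        # Hypothetical relaxation: the label matched by the wildcard is not
        # syntax-checked; it only has to be non-empty.
        if not r_labels[0]:
            return False
        p_rest, r_rest = p_labels[1:], r_labels[1:]
    else:
        p_rest, r_rest = p_labels, r_labels

    # Every non-wildcard label must be strict and must match exactly.
    return all(is_strict_label(r) and p == r for p, r in zip(p_rest, r_rest))
```

Under these assumptions, test--.example.com is accepted against *.example.com but rejected against a certificate that names test--.example.com exactly.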
Ryan, I realized we discussed this many years ago, but it keeps coming back. Any thoughts on what browsers can align on here?
Comment 29•5 years ago
Anne: That only punts the problem to a different layer, the QUIC/TLS layer, for which the specs then point to the DNS RFCs, and things get weird again.
For example, TLS 1.3 has Server Name Indication support as mandatory. Punting the problem from the DNS layer ("We'll accept anything") and from the certificate layer ("We'll allow any label to match a wildcard") still has to deal with the TLS layer, on both the client and the server, in validating the protocol invariants. That is, the following language:
"HostName" contains the fully qualified DNS hostname of the server,
as understood by the client. The hostname is represented as a byte
string using ASCII encoding without a trailing dot. This allows the
support of internationalized domain names through the use of A-labels
defined in [RFC5890]. DNS hostnames are case-insensitive. The
algorithm to compare hostnames is described in [RFC5890], Section
2.3.2.4.
For example, Chrome's QUIC implementation does Yet Another Thing different from its TLS, certificate, and HTTP implementations, using this method to make sure that the SNI is valid. Why? Because it's Yet A Different Team, and that team doesn't have access to Chrome's URL parser in non-Chrome use cases, nor does it want to have to carry Chrome's (not-quite-)WHATWG URL parser. In that scenario, it would allow test--.example.com but wouldn't allow -test.example.com, even though the browser would allow navigation. It only allows underscores because of an unaddressed TODO while implementing.
I'm not sure I have much useful to add beyond what :abr and :mt mentioned, in Comment #14 / Comment #15, and which I understood the statement in Comment #24 to be about.
Regrettably, it seems like this is another place where Chrome had a bug / didn't correctly implement the spec, and things went downhill. The relevant code checks to make sure it starts with a letter-digit, but doesn't make sure it ends on a letter-digit. I filed a bug for that.
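To make the difference concrete, here is a minimal Python sketch of the two checks. It is not Chrome's code: both function names are hypothetical, and the underscore allowance mentioned above is deliberately left out.

```python
import string

LETTER_DIGIT = set(string.ascii_letters + string.digits)
LDH = LETTER_DIGIT | {"-"}


def starts_with_letter_digit_only(label: str) -> bool:
    """Incomplete check: validates the first character but not the last."""
    return (
        bool(label)
        and label[0] in LETTER_DIGIT
        and all(c in LDH for c in label)
    )


def is_rfc1123_label(label: str) -> bool:
    """Stricter check: first and last characters must both be letter-digits."""
    return (
        0 < len(label) <= 63
        and label[0] in LETTER_DIGIT
        and label[-1] in LETTER_DIGIT
        and all(c in LDH for c in label)
    )
```

With these definitions, the label test-- passes the first check but fails the second, while -test fails both, which mirrors the asymmetry described in this comment.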
URL parsing continues to be a slightly-mitigated-but-overall-security-disaster. Bugs like https://crbug.com/449829 (Fixed in https://crbug.com/456391) or https://crbug.com/695474 show where the misalignment between DNS and URLs gets... messy. Similarly, reconstituting hostnames for URLs, as done by HTTP/2, also leads to weirdness (Fix). While it seems like more folks are starting to align on a WHATWG spec (such as in response to security issues), it seems they're doing it by forking Chrome's parser, bugs and all, which is discouraging.
I know Comment #24 captured some of my past concerns about the WHATWG URL spec being prescriptive about hostnames, but I do wonder if that's the only path to get out of this mess. While I wouldn't be comfortable with the lax parsing as is (which completely disregards RFC 1123/1034), perhaps the answer is to fold that text in, highlight some of the exceptions (e.g. underscores), figure out the interop issues (e.g. trailing hyphens), and effectively hard fork 1034/1123. If we're not willing to do that, it seems like we need to figure out a plan to break web compat to get alignment back on 1123/1034, and while that may not be the best thing to do during a global pandemic, it might be the best thing to do for the long term ecosystem.
Comment 30•5 years ago
(In reply to Anne (:annevk) from comment #28)
Perhaps an approach here is that we don't validate wildcarded labels? So we remain strict for registrable domains, but any subdomains of those that use a wildcarded certificate are reachable, as long as the networking stack allows.
Please, no. It shouldn't be possible to use a wildcard certificate for any domain for which one couldn't get a normal single-domain certificate.
If we're not willing to do that, it seems like we need to figure out a plan to break web compat to get alignment back on 1123/1034, and while that may not be the best thing to do during a global pandemic, it might be the best thing to do for the long term ecosystem.
Let's see some statistics on how often users make requests to domains that have labels with leading and/or trailing underscores. I bet the numbers would be well within the tolerance for deprecation & removal and maybe even lower than those for underscores in HTTPS domain names, which were already deprecated (and removed?).
Comment 32•5 years ago
I bet the numbers would be well within the tolerance for deprecation & removal and maybe even lower than those for underscores in HTTPS domain names, which were already deprecated (and removed?).
For clarification, does this imply that you would formally block domains with labels with leading/trailing underscores/hyphens on both HTTP and HTTPS? Or that you would simply formalize this error for HTTPS only? Or that you would remove the validation rule and allow domains with labels with leading/trailing underscores/hyphens?
To add my own 2c: blocking domains with labels with leading or trailing underscores/hyphens doesn't just change the behavior of domains; it has a second-order effect of pushing application developers to change the behavior of usernames/slugs in their software and data models. It's extremely common for services to allow subdomains with usernames; adding constraints stricter than [-a-z0-9]+ would have a ripple effect requiring frameworks and app developers to update their validation rules to cope with this rule (and deal with ongoing customer support issues from existing users with usernames/slugs that are invalid as a subdomain).
Django, for instance, validates slugs (SlugField, validate_slug, etc.) with a simple [-\w]+. Standardizing on the current Firefox behavior would require Django to either deprecate slugs as they exist today for use with subdomains, or provide another field type for this purpose.
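As a rough illustration of the gap between the two rules, the sketch below compares the slug pattern quoted above with a stricter, label-safe pattern. It uses only the standard library and is not Django's API: `is_slug` and `is_subdomain_safe_slug` are hypothetical helper names, and matching `\w` in ASCII-only mode is an assumption made for the example.

```python
import re

# The slug pattern quoted above; re.ASCII restricts \w to [a-zA-Z0-9_].
SLUG_RE = re.compile(r"[-\w]+", re.ASCII)

# A label-safe pattern: letters, digits, and hyphens, with no hyphen at
# either end and at most 63 characters.
LABEL_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_slug(value: str) -> bool:
    """Return True if `value` matches the permissive slug pattern."""
    return SLUG_RE.fullmatch(value) is not None


def is_subdomain_safe_slug(value: str) -> bool:
    """Return True if `value` could also serve as a strict DNS label."""
    return LABEL_RE.fullmatch(value.lower()) is not None
```

Values such as qix- or my_name are accepted as slugs but rejected by the stricter check, which is the kind of existing data a service would have to migrate.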
GitHub, as mentioned, has started disallowing usernames with trailing hyphens, but many already exist, and all of the GitHub Pages for these users are broken in Firefox. For example: https://qix-.github.io
This is, unfortunately, a bug that's extremely easy for developers to overlook until after it's become a problem. My own service suffers from this issue, and I only discovered this ticket as a result of a customer support ticket.
Comment 33•5 years ago
To be clear: The current Django behaviour is inconsistent with longstanding specs regarding DNS labels, so that's ostensibly a bug. RFC 1034 was written before I could even read, for example ;)
Now, I agree, the situation is messy because software inconsistently follows this, because different layers are more or less liberal (unfortunately), but from a spec perspective, it's not guaranteed these will work. I understand that's not perfect, but updating the hostname allocation to actually adhere to the preferred name syntax (and as modified by 1123) is a good thing. Unfortunately, bad advice often causes issues like this to be discovered later than ideal for servers.
In short, moving now is a good thing to do anyways, regardless of decisions made here.
Comment 36•3 years ago
The severity field for this bug is relatively low, S3. However, the bug has 8 duplicates.
:keeler, could you consider increasing the bug severity?
For more information, please visit auto_nag documentation.
Comment 37•3 years ago
The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.
Comment 39•1 year ago
A Firefox Android user reported bug 1912653, which I think is an instance of this bug:
https://_-.pages.debian.net/gsoc2024-parsons-ballarin
- Firefox Android fails to load the page and shows no error message.
- Firefox desktop shows an SSL_ERROR_BAD_CERT_DOMAIN error, warning that the site's certificate is valid for *.pages.debian.net and pages.debian.net, not for _-.pages.debian.net.
- Chrome redirects to https://gsoc2024-parsons-ballarin----b19696eb446388731f769f725c7642be22.pages.debian.net/ without error.
Comment 41•9 months ago
Hyphens shouldn't appear at the beginning or end of labels. However, in
practice, sometimes a website will use a wildcard certificate and attempt to
match it with a reference ID with such a label. For compatibility, this change
allows reference ID labels to begin and/or end with hyphens.