Closed Bug 1136616 Opened 9 years ago Closed 9 years ago

mozilla::pkix name matching disallows some characters in reference IDs that are valid in DNS labels

Categories

(Core :: Security: PSM, defect)

36 Branch
x86_64
Windows 7
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla39
Tracking Status
firefox36 + verified
firefox37 + verified
firefox38 + verified
firefox39 + verified
relnote-firefox --- 36+

People

(Reporter: none11given, Assigned: keeler)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(2 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0
Build ID: 20150222232811

Steps to reproduce:

Attempt to access a website via SSL where the subdomain contains an underscore


Actual results:

"Untrusted Connection" page appears, stating that Firefox cannot connect securely to under_score.example.com, reason given is "The certificate is only valid for the following names: *.example.com" (ssl_error_bad_cert_domain)

(Note that example.com is here just that -- an example. The website in question has a certificate valid for several domain names including "*.domain.tld" while I try to access "under_score.domain.tld")


Expected results:

The connection should be trusted and the website should load. Obviously "*.example.com" should match "under_score.example.com".

This bug is a regression, and is new in Firefox 36. It appeared a while ago in Firefox Developer Edition (which I often use in parallel), and now it has become part of Firefox stable.
Component: Untriaged → Security
We recently updated the name matching code to be more strict. RFC 952 and RFC 1123 essentially stipulate that host names only consist of letters, digits, and hyphens (separated by '.'). Since "_" isn't a valid character for a host name, the name matching code says it doesn't match the presented identifier in the certificate. I think the bug here is that the networking code didn't validate the host name.

https://tools.ietf.org/html/rfc1123#section-2
https://tools.ietf.org/html/rfc952#page-5
Component: Security → Networking
Product: Firefox → Core
Summary: ssl_error_bad_cert_domain when underscore in subdomain → necko should validate host names (invalid characters like "_" aren't rejected)
this is probably going to get wontfixed.. one of the reasons is that we do operate on non internet hosts using name resolution other than dns (netbios for example).
(I'm not saying that pkix should do anything different btw)
One source of the _ hostnames that may be reasonably large. Both AWS and Google Storage (and any other implementations of those APIs) do NOT restrict the names of their buckets to not include _s. Both AWS and Google Storage then turn those bucket names into CNAME records for SSL resolution end points. The effect of this is that the storage systems using buckets with _'s in their names work perfectly fine in Firefox >=35 and don't work now. They also work just fine in Chrome, IE, Safari, CURL, etc.
URLs have these requirements on hosts: https://url.spec.whatwg.org/#concept-host-parser

I think it would be better if TLS used that same definition... Having subtly different definitions of what a host means is sad. (And having distinct parsers for them is too.)
(In reply to Anne (:annevk) from comment #7)
> URLs have these requirements on hosts:
> https://url.spec.whatwg.org/#concept-host-parser

I'm having a hard time understanding that parser. It appears to suggest that all ASCII values are permissable in hostname? This seems wrong (e.g. 0x7F, punctuation (other than the separating '.'), control characters, etc. all shouldn't be present).
[Tracking Requested - why for this release]: Web compat regression.
Status: UNCONFIRMED → NEW
Ever confirmed: true
> We recently updated the name matching code to be more strict.

What was the bug for that?

I think it's pretty clear just based on the volume of bug reports in the time this has been shipping that disallowing '_' in hostnames is not web-compatible and we need to stop doing that...
Flags: needinfo?(dkeeler)
David, what is unclear? In step 6 a set of problematic ASCII code points is rejected, but in general we should not assume that we can subset what names subdomains can allocate.
We should indeed fix that in 36.0.1. Can someone take this and propose quickly a patch?
David, could you prepare a patch to backout this restriction?
This was changed in bug 1063281. Due to the size of the changes, a backout would be higher risk than just fixing it, which involves conditionally allowing "_" (and maybe other characters?) around here: https://hg.mozilla.org/releases/mozilla-release/annotate/63286f849ae3/security/pkix/lib/pkixnames.cpp#l905

(In reply to Anne (:annevk) from comment #12)
> David, what is unclear? In step 6 a set of problematic ASCII code points is
> rejected, but in general we should not assume that we can subset what names
> subdomains can allocate.

Oh, I see. To me, that list allows the following ASCII: control codes that are not NUL, TAB, LF, or CR; alphanumeric (no problems there); and !, ", $, &, ', (, ), *, +, ',', -, ., ;, <, =, >, ^, _, `, {, |, }, and ~.

Supporting all of that punctuation seems strange to me, but I can't think of a security concern if they're used. The control codes are much more problematic. For instance, how do we display DEL? Or VT (vertical tab)? If any of those could change how the hostname displays in the urlbar, that's definitely a concern.
Assignee: nobody → dkeeler
Blocks: 1063281
Component: Networking → Security: PSM
Flags: needinfo?(dkeeler)
Summary: necko should validate host names (invalid characters like "_" aren't rejected) → mozilla::pkix name matching disallows some characters in hostnames that are considered valid by the whatwg (like underscore)
(In reply to David Keeler [:keeler] (use needinfo?) from comment #14)
> Oh, I see. To me, that list allows the following ASCII: control codes that
> are not NUL, TAB, LF, or CR; alphanumeric (no problems there); and !, ", $,
> &, ', (, ), *, +, ',', -, ., ;, <, =, >, ^, _, `, {, |, }, and ~.
> 
> Supporting all of that punctuation seems strange to me, but I can't think of
> a security concern if they're used.

We should not be supporting such ridiculous characters in hostnames. Good luck getting most hostname parsers out there to actually understand a hostname that has '<' or ')' in it. As for security, '*' would obviously have a major security impact if permitted in hostnames, considering that acts as a wildcard in TLS certificates. Underscores are used in DNS for a variety of purposes, though not generally for actual hostnames (see http://domainkeys.sourceforge.net/underscore.html or http://www.c3.hu/docs/oreilly/tcpip/dnsbind/ch04_05.htm, which states 'Underscores are not allowed in host names.' based off of RFC 952). I don't see a direct security issue in supporting underscores, but I don't think we should be adding all those other random ASCII characters, especially considering some are unsafe shell characters.
(In reply to Not doing reviews right now from comment #11)
> I think it's pretty clear just based on the volume of bug reports in the
> time this has been shipping that disallowing '_' in hostnames is not
> web-compatible and we need to stop doing that...

FYI, I don't have any problem with adding '_' to the list of accepted characters in hostnames in certificates. Obviously adding control codes doesn't make sense. If there is no real-world motivation for adding the punctuation characters then I'd prefer not to add them. '*' has special meaning in names in certificates so it can't be added as a normal hostname character.
> To me, that list allows the following ASCII: control codes that are not NUL, TAB, LF, or
> CR; alphanumeric (no problems there); and !, ", $, &, ', (, ), *, +, ',', -, ., ;, <, =,
> >, ^, _, `, {, |, }, and ~.

Yeah, I agree.  I thought the application of the IDNA table at http://www.unicode.org/Public/idna/7.0.0/IdnaMappingTable.txt would filter out something too, but we invoke that algorithm with UseSTD3ASCIIRules=false, and everything in the ASCII range is either "valid" or "disallowed_STD3_valid"...

Why is UseSTD3ASCIIRules=false here?  If it really needs to be, it does seem like we should filter out more stuff from hostnames in the URL spec.
> Why is UseSTD3ASCIIRules=false here?

It might be just "_", but it might also be any of the other code points allowed by the URL Standard. The reason the URL does not forbid more is because there is not really any technical restriction at the DNS level. The code points the URL Standard does forbid is because they cause attack problems in the context of URLs.

That can be changed, but as evident by this very bug we should not do that based on what people find reasonable but rather based on concrete data that it would cause harm or can be done without problem.

As for displaying control code points, we have the square Unicode glyphs for those I thought.
(In reply to Anne (:annevk) from comment #18)
> The reason the URL does not forbid more is
> because there is not really any technical restriction at the DNS level.

https://tools.ietf.org/html/rfc2181#section-11
>   Note however, that the various applications that make use of DNS data
>   can have restrictions imposed on what particular values are
>   acceptable in their environment.  For example, that any binary label
>   can have an MX record does not imply that any binary name can be used
>   as the host part of an e-mail address.

By the way, how about adding characters what had been allowed before and then bikeshedding all you like? Don't forget we consider to include the fix in 36.0.1.
David, could you land a fix ASAP? We are going to do a 36.0.1 and we need this patch too as this is a recent regression. Thanks
Flags: needinfo?(dkeeler)
Summary: mozilla::pkix name matching disallows some characters in hostnames that are considered valid by the whatwg (like underscore) → mozilla::pkix name matching disallows some characters in reference IDs that are valid in DNS labels
Attached patch patch for 36 (obsolete) — Splinter Review
This patch allows underscores in reference DNS-IDs. This is for 36 (I'll have patches for the other branches soon).
Flags: needinfo?(dkeeler)
Attachment #8571479 - Flags: review?(brian)
Comment on attachment 8571479 [details] [diff] [review]
patch for 36

Review of attachment 8571479 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments addressed.

Let's discuss the degree of relevance of the WHATWG URL standard on certificate name matching via email, not in this bug, and definitely not in the source code.

::: security/pkix/lib/pkixnames.cpp
@@ +904,5 @@
> +        // hostname (a reference DNS-ID in our terminology). The only concrete
> +        // incompatibility we've encountered so far is underscore, so we allow
> +        // that in reference DNS-IDs (but not presented DNS-IDs).
> +        // See bug 1136616 and https://url.spec.whatwg.org/#concept-host-parser
> +        // (in particular step 6).

Let's not inject the WHATWG URL spec debate into the source code here.

Instead, let's add a reference to what RFC 5280 actually says to the top of this function: 

RFC 5280 Section 4.2.1.6 says that a dNSName "MUST be in the 'preferred name syntax', as specified by Section 3.5 of [RFC1034] and as modified by Section 2.1 of [RFC1123]" except "a dNSName of ' ' MUST NOT be used." Additionally, we allow underscores for compatibility with existing practice.

Then here, we can say:

We allow underscores for compatibility with existing practice. See bug 1136616.

@@ +907,5 @@
> +        // See bug 1136616 and https://url.spec.whatwg.org/#concept-host-parser
> +        // (in particular step 6).
> +        if (matchType != ValidDNSIDMatchType::ReferenceID) {
> +          return false;
> +        }

At a minimum, we need a comment here about why we're not allowing underscores in name constraints and presented IDs. But, why not just remove this restriction completely?

Is it the case that all of these certificates are using wildcard DNS-IDs in the certs?

Unless I'm missing something, I think we should allow underscores in name constraints and presented IDs too. It should never be *necessary* to use a wildcard DNS-ID just to match a specific name form.

Please file a follow-up bug about whether we should allow names that consist only of a single underscore and/or only of underscores. There's no time to investigate/debate that now.

::: security/pkix/test/gtest/pkixnames_tests.cpp
@@ +79,5 @@
> +  // 6).
> +  // For compatibility, allow underscores in reference DNS-IDs but not presented
> +  // DNS-IDs.
> +  DNS_ID_MATCH("*.example.com", "uses_underscore.example.com"),
> +  DNS_ID_MISMATCH("*.uses_underscore.example.com", "a.uses_underscore.example.com"),

As I said above, this test case should be made to pass.

Also, please include the following test cases:

// See bug XXXXXXX (where XXXXXX is the new follow-up bug).
DNS_ID_MATCH("_.example.com", "_.example.com"),
DNS_ID_MATCH("*.example.com", "_.example.com"),
DNS_ID_MATCH("_", "_"),
DNS_ID_MATCH("___", "___"),
DNS_ID_MATCH("example_", "example_"),
DNS_ID_MATCH("_example", "_example"),
DNS_ID_MATCH("*._", "x._"),

// See bug XXXXXXX (where XXXXXX is the new follow-up bug).
// A DNS-ID must not end in an all-numeric label. we don't
// consider underscores to be numeric.
DNS_ID_MATCH("_1", "_1"),
DNS_ID_MISMATCH("example.1", "example.1"),
DNS_ID_MATCH("example._1", "example._1"),
DNS_ID_MATCH("example.1_", "example.1_"),

Similarly, update the name constraint tests, both for dNSNames and RFC822Names.
Comment on attachment 8571479 [details] [diff] [review]
patch for 36

Review of attachment 8571479 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments addressed.

Let's discuss the degree of relevance of the WHATWG URL standard on certificate name matching via email, not in this bug, and definitely not in the source code.

::: security/pkix/lib/pkixnames.cpp
@@ +904,5 @@
> +        // hostname (a reference DNS-ID in our terminology). The only concrete
> +        // incompatibility we've encountered so far is underscore, so we allow
> +        // that in reference DNS-IDs (but not presented DNS-IDs).
> +        // See bug 1136616 and https://url.spec.whatwg.org/#concept-host-parser
> +        // (in particular step 6).

Let's not inject the WHATWG URL spec debate into the source code here.

Instead, let's add a reference to what RFC 5280 actually says to the top of this function: 

RFC 5280 Section 4.2.1.6 says that a dNSName "MUST be in the 'preferred name syntax', as specified by Section 3.5 of [RFC1034] and as modified by Section 2.1 of [RFC1123]" except "a dNSName of ' ' MUST NOT be used." Additionally, we allow underscores for compatibility with existing practice.

Then here, we can say:

We allow underscores for compatibility with existing practice. See bug 1136616.

@@ +907,5 @@
> +        // See bug 1136616 and https://url.spec.whatwg.org/#concept-host-parser
> +        // (in particular step 6).
> +        if (matchType != ValidDNSIDMatchType::ReferenceID) {
> +          return false;
> +        }

At a minimum, we need a comment here about why we're not allowing underscores in name constraints and presented IDs. But, why not just remove this restriction completely?

Is it the case that all of these certificates are using wildcard DNS-IDs in the certs?

Unless I'm missing something, I think we should allow underscores in name constraints and presented IDs too. It should never be *necessary* to use a wildcard DNS-ID just to match a specific name form.

Please file a follow-up bug about whether we should allow names that consist only of a single underscore and/or only of underscores. There's no time to investigate/debate that now.

::: security/pkix/test/gtest/pkixnames_tests.cpp
@@ +79,5 @@
> +  // 6).
> +  // For compatibility, allow underscores in reference DNS-IDs but not presented
> +  // DNS-IDs.
> +  DNS_ID_MATCH("*.example.com", "uses_underscore.example.com"),
> +  DNS_ID_MISMATCH("*.uses_underscore.example.com", "a.uses_underscore.example.com"),

As I said above, this test case should be made to pass.

Also, please include the following test cases:

// See bug XXXXXXX (where XXXXXX is the new follow-up bug).
DNS_ID_MATCH("_.example.com", "_.example.com"),
DNS_ID_MATCH("*.example.com", "_.example.com"),
DNS_ID_MATCH("_", "_"),
DNS_ID_MATCH("___", "___"),
DNS_ID_MATCH("example_", "example_"),
DNS_ID_MATCH("_example", "_example"),
DNS_ID_MATCH("*._", "x._"),

// See bug XXXXXXX (where XXXXXX is the new follow-up bug).
// A DNS-ID must not end in an all-numeric label. we don't
// consider underscores to be numeric.
DNS_ID_MATCH("_1", "_1"),
DNS_ID_MISMATCH("example.1", "example.1"),
DNS_ID_MATCH("example._1", "example._1"),
DNS_ID_MATCH("example.1_", "example.1_"),

Similarly, update the name constraint tests, both for dNSNames and RFC822Names.
Attachment #8571479 - Flags: review?(brian) → review+
David, I am sure you are on top of this but to be clear, I am waiting for this patch to build 36.0.1. Thanks!
Flags: needinfo?(dkeeler)
(In reply to Sylvestre Ledru [:sylvestre] from comment #25)
> David, I am sure you are on top of this but to be clear, I am waiting for
> this patch to build 36.0.1. Thanks!

Sorry for the delay - the way we were doing some tests wasn't matching my expectation, which led to some confusion. I figured it out, though. I'll have the patch up shortly.

(In reply to Brian Smith (:briansmith, :bsmith, use NEEDINFO?) from comment #23)

Thanks for the review, Brian. I believe I've addressed all of your comments.

> Let's discuss the degree of relevance of the WHATWG URL standard on
> certificate name matching via email, not in this bug, and definitely not in
> the source code.

Sounds good.

> ::: security/pkix/lib/pkixnames.cpp
> @@ +907,5 @@
> > +        // See bug 1136616 and https://url.spec.whatwg.org/#concept-host-parser
> > +        // (in particular step 6).
> > +        if (matchType != ValidDNSIDMatchType::ReferenceID) {
> > +          return false;
> > +        }
> 
> At a minimum, we need a comment here about why we're not allowing
> underscores in name constraints and presented IDs. But, why not just remove
> this restriction completely?
> 
> Is it the case that all of these certificates are using wildcard DNS-IDs in
> the certs?
> 
> Unless I'm missing something, I think we should allow underscores in name
> constraints and presented IDs too. It should never be *necessary* to use a
> wildcard DNS-ID just to match a specific name form.

That makes sense.

> Please file a follow-up bug about whether we should allow names that consist
> only of a single underscore and/or only of underscores. There's no time to
> investigate/debate that now.

Filed bug 1139039.

> ::: security/pkix/test/gtest/pkixnames_tests.cpp
> Also, please include the following test cases:
...
> DNS_ID_MISMATCH("example.1", "example.1"),

Turns out, in the 36 branch, we can't test this case since we require that reference IDs be valid. It should be possible in other branches, however (we did some refactoring).

> Similarly, update the name constraint tests, both for dNSNames and
> RFC822Names.

The mozilla::pkix name constraints implementation didn't land in 36, so we can't test that. I'll add them in the other branches.
Flags: needinfo?(dkeeler)
Attached patch patch for 36 v2Splinter Review
Approval Request Comment
[Feature/regressing bug #]: mozilla::pkix name matching (bug 1063281)
[User impact if declined]: users will see a name mismatch warning on some https sites
[Describe test coverage new/current, TreeHerder]: has tests
[Risks and why]: low - the functional changes are extremely small. Most of this patch is tests.
[String/UUID change made/needed]: none
Attachment #8571479 - Attachment is obsolete: true
Attachment #8572093 - Flags: review+
Attachment #8572093 - Flags: approval-mozilla-release?
Attachment #8572093 - Flags: approval-mozilla-release? → approval-mozilla-release+
Given the comments in https://code.google.com/p/chromium/issues/detail?id=463410#c2 it seems really inappropriate towards our users to only whitelist "_" without data that allowing more is actually problematic from a security perspective or data showing that allowing less is fine.
Added to the release notes with "36.0.1 - Hostnames with an underscore ("_") were rejected (1136616)" as wording.
I just found this bug yesterday, and it is fixed in 36.0.1 now.
https://support.mozilla.org/zh-CN/questions/1049026
Awesome.
Hope it can merge to Aurora branch also.
It is not OK on 38.0a2 yet.
Test URL:
https://coursera_assets.s3.amazonaws.com/about/overview/about_discover_a_course_youre_interested_in.jpg
keeler - Are we going to take this fix on all branches? If so, can you please get this landed on m-c and ready for Aurora and Beta?
Flags: needinfo?(dkeeler)
Yes, this fix will land on all branches. I'm still preparing branch-specific patches, but they should be done soon.
Flags: needinfo?(dkeeler)
Turns out this applies on mozilla-central, -aurora, and -beta just fine.

Approval Request Comment
[Feature/regressing bug #]: mozilla::pkix name matching (bug 1063281)
[User impact if declined]: https sites with underscores in the host name won't work
[Describe test coverage new/current, TreeHerder]: has tests
[Risks and why]: low
[String/UUID change made/needed]: none
Attachment #8574116 - Flags: review+
Attachment #8574116 - Flags: approval-mozilla-beta?
Attachment #8574116 - Flags: approval-mozilla-aurora?
Comment on attachment 8574116 [details] [diff] [review]
patch for central, aurora, and beta

This fix has already been verified and  shipped in 36.0.1. Beta+ Aurora+
Attachment #8574116 - Flags: approval-mozilla-beta?
Attachment #8574116 - Flags: approval-mozilla-beta+
Attachment #8574116 - Flags: approval-mozilla-aurora?
Attachment #8574116 - Flags: approval-mozilla-aurora+
See Also: → 89945
https://hg.mozilla.org/mozilla-central/rev/e6be4f63ad3b
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
I've verified this fix across all affected branches, with a site the presents a cert using an underscore in its reference DNS-ID: https://employment.jjc.edu.
You need to log in before you can comment on or make changes to this bug.