Closed Bug 1288049 Opened 8 years ago Closed 8 years ago

Canonicalize IPv4 for nsStandardURL

Categories

(Core :: Networking, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: CuveeHsu, Assigned: CuveeHsu)

References

Details

(Whiteboard: [necko-active])

Attachments

(1 file, 4 obsolete files)

Lots of discussion talks about anonicalizing IPv4/IPv6 equivalent form (bug 67730, bug 1063010, bug 1276144). For example, http://2130706433/ should be transformed to http://127.0.0.1/

Plan to start from |nsStandardURL::SetHost| and |nsStandardURL::BuildNormalizedSpec| for IPv4 part.

Test case should include those in Bug 1063010 Comment 0 (though some should be discussed)
Hello Valentin,
I start to investigate and need your opinion.

1) I guess those case in Bug 1063010 Comment 0 should be considered. But how about http://0xdeadbeefcafeBADF00D7f000001 ?
Spec doesn't include this one but we kind of support to navigate it somehow now.
Should we also canonicalize this form or just make it with different behavior from navigating to http://127.0.0.1

btw, Rust-url and other browser do not ignore all but the last 32-bits.

2) I'm not sure do we need a security review or process. How do you think?
Flags: needinfo?(valentin.gosu)
Hello David,
I'm worried about the two cases.
1) I'm not sure I'll hit the telemetry here. Do you suggest to modify |net_IsValidIPv4Addr| also?
http://searchfox.org/mozilla-central/source/security/manager/ssl/SSLServerCertVerification.cpp#976

2) Also here. But we'll change the form to caonicalized form before DNS, I guess I need not to do anything here.
http://searchfox.org/mozilla-central/source/security/pkix/test/gtest/pkixnames_tests.cpp#596

And anything else I should have considered?
Flags: needinfo?(dkeeler)
I don't think net_IsValidIPv4Addr needs modification - as far as I can tell it already only accepts canonical IPv4 strings (which I think is the correct behavior).
The work you're doing here shouldn't affect the mozilla::pkix gtests, so that shouldn't be a concern either.
Flags: needinfo?(dkeeler)
(In reply to Junior [:junior] from comment #1)
> Hello Valentin,
> I start to investigate and need your opinion.
> 
> 1) I guess those case in Bug 1063010 Comment 0 should be considered. But how
> about http://0xdeadbeefcafeBADF00D7f000001 ?
> Spec doesn't include this one but we kind of support to navigate it somehow
> now.
> Should we also canonicalize this form or just make it with different
> behavior from navigating to http://127.0.0.1
> 
> btw, Rust-url and other browser do not ignore all but the last 32-bits.

Chrome seems to reject the URL as invalid. Maybe we should do the same?
Flags: needinfo?(valentin.gosu) → needinfo?(annevk)
https://url.spec.whatwg.org/#concept-ipv4-parser takes care of range checks, no? (And I think those are modeled after Chrome, but it's been a while.) Might be a bug in rust-url or is there something I'm missing?
Flags: needinfo?(annevk)
(In reply to Anne (:annevk) from comment #5)
> https://url.spec.whatwg.org/#concept-ipv4-parser takes care of range checks,
> no? (And I think those are modeled after Chrome, but it's been a while.)
> Might be a bug in rust-url or is there something I'm missing?

Let me rephrase.
http://0xdeadbeefcafeBADF00D7f000001 works for firefox since we ignore all but the last 32-bits.
This behavior is out of spec. Therefore, that url doesn't work for Chrome and rust-url.

The question is: should we canonicalize this kind of url in the address_bar/host_header?
If not, should we block navigating to this kind of url in the future?
Flags: needinfo?(annevk)
I think we should treat that URL exactly like we treat http://test:test/ (i.e., fails to parse as a URL). Currently that results in a search for the address bar and an exception in the JavaScript URL API. Does that help?
Flags: needinfo?(annevk)
(In reply to Anne (:annevk) from comment #7)
> I think we should treat that URL exactly like we treat http://test:test/
> (i.e., fails to parse as a URL). Currently that results in a search for the
> address bar and an exception in the JavaScript URL API. Does that help?
Yes, thanks for your clear answer
Attached patch bug1288049-Canonicalize-IPv4 (obsolete) — Splinter Review
Hello Valentin,
I have a WIP patch which can solve IP v4 with decimal integer.
Could I have your feedback first? Thanks.

For comment 7, I file bug 1289711 to track
Attachment #8775086 - Flags: feedback?(valentin.gosu)
Comment on attachment 8775086 [details] [diff] [review]
bug1288049-Canonicalize-IPv4

Review of attachment 8775086 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/nsStandardURL.cpp
@@ +392,5 @@
> +    nsTArray<nsCString> parts;
> +    ParseString(host, '.', parts);
> +
> +    if (parts.Length() > 4) {
> +        return NS_ERROR_NOT_AVAILABLE;

NS_ERROR_FAILURE or NS_ERROR_UNEXPECTED would be more appropriate for this method.

@@ +410,5 @@
> +        numbers.AppendElement(n);
> +    }
> +    uint32_t ipv4 = numbers.LastElement();
> +    if (ipv4 >
> +        std::numeric_limits<uint32_t>::max() >> (8 * (numbers.Length() - 1))) {

This formulation is a bit hard to read.
I think you could skip the calculation, and use an array of constants.

@@ +425,5 @@
> +
> +    result.Truncate();
> +
> +    // reuse this array to store IP segments
> +    numbers.SetLength(4, mozilla::fallible);

return value must be used - otherwise you get a compile warning

@@ +437,5 @@
> +        result.AppendInt(numbers[i]);
> +        if (i != 3) {
> +          result.Append('.');
> +        }
> +    }

We can just convert ipv4 to network-endian, append each byte to the result.
We don't even need these 2 for loops.

::: netwerk/test/gtest/TestStandardURL.cpp
@@ +67,5 @@
>          url->GetRef(out);
>      }
>  });
> +
> +TEST(TestStandardURL, IPv4) {

gtests are awkward to write. You're better off adding a test to test_standardurl.js or a new JS file.

Also, we need tests for all the corner cases.
More than 4 parts, less than 4 parts.
Also, we need to properly handle hex and octal cases.

@@ +92,5 @@
> +
> +    // Doesn't work now
> +    ASSERT_EQ(url->SetSpec(NS_LITERAL_CSTRING("http://0177.00.00.01")), NS_OK);
> +    ASSERT_EQ(url->GetSpec(out), NS_OK);
> +    ASSERT_TRUE(out == NS_LITERAL_CSTRING("http://127.0.0.1/"));

This actually fails, because we don't handle octal.
Attachment #8775086 - Flags: feedback?(valentin.gosu) → feedback+
Comment on attachment 8775086 [details] [diff] [review]
bug1288049-Canonicalize-IPv4

Review of attachment 8775086 [details] [diff] [review]:
-----------------------------------------------------------------

Don't forget to do a try run. I expect some tests to fail, and a few web-platform-tests to UNEXPECTED-PASS.

::: netwerk/base/nsStandardURL.cpp
@@ +402,5 @@
> +            return NS_ERROR_NOT_AVAILABLE;
> +        }
> +        //TODO need to implement |ToInteger| for OCT
> +        nsresult rv = NS_OK;
> +        n = parts[i].ToInteger(&rv, 10);

Also, apart from hex & octal, make sure that n is not negative.
http://10.-20.30 is still a valid URL, but this normalization should fail for it.
> @@ +437,5 @@
> > +        result.AppendInt(numbers[i]);
> > +        if (i != 3) {
> > +          result.Append('.');
> > +        }
> > +    }
> 
> We can just convert ipv4 to network-endian, append each byte to the result.
> We don't even need these 2 for loops.
> 

I can't catch it. Could you elaborate more on how to approach this?
Do you mean not to transport to a string?
Flags: needinfo?(valentin.gosu)
(In reply to Junior [:junior] from comment #12)
> > @@ +437,5 @@
> > > +        result.AppendInt(numbers[i]);
> > > +        if (i != 3) {
> > > +          result.Append('.');
> > > +        }
> > > +    }
> > 
> > We can just convert ipv4 to network-endian, append each byte to the result.
> > We don't even need these 2 for loops.
> > 
> 
> I can't catch it. Could you elaborate more on how to approach this?
> Do you mean not to transport to a string?

Or you meant like 
nsPrintfCString("%d.%d.%d.$d", (ipv4 >> 24), (ipv4 && 0xff << 16), (ipv4 && 0xff << 8),  ipv4 && 0xff)?
(In reply to Junior [:junior] from comment #13)
> Or you meant like 
> nsPrintfCString("%d.%d.%d.$d", (ipv4 >> 24), (ipv4 && 0xff << 16), (ipv4 &&
> 0xff << 8),  ipv4 && 0xff)?

If you're to do this, use binary & instead of &&.

Also, you need to make sure ipv4 has the correct endianness.
So something like:
uint32 littleEndianIPv4;
LittleEndian::writeUint32(&littleEndianIPv4, ipv4);
then use littleEndianIPv4 to print the bytes to the string.
You can also cast littleEndianIPv4 to a uint8_t* and use it as an array of bytes, instead of shifting - the code is cleaner this way.
Flags: needinfo?(valentin.gosu)
Lots of red are owing to |ToInteger| allowing tolerance. I'd cook another one which I originally plan to do.
This patch breaks bug 1053245.
But I think a number which can be transformed to IPv4 deserves a notification.
If I'm reading the bug correctly I think the test for bug 1053245 can be changed. So instead of 1234 we could try 1234567890123456789.
(In reply to Valentin Gosu [:valentin] from comment #21)
> If I'm reading the bug correctly I think the test for bug 1053245 can be
> changed. So instead of 1234 we could try 1234567890123456789.

Yes, and a big number breaks my implementation (facepalm
I'll provide a patch for parsing number and request a feedback today.
Attached patch Canonicalize-IPv4, v2 (obsolete) — Splinter Review
Add number parser and fix test-break

TODO:
1. Fix break of test_nsDefaultURIFixup_info.js
   (nothing to do with setSpec but I have no idea to fix it now. Comment them out in the patch first)
2. Add xpcshell test

and maybe 3.
http://searchfox.org/mozilla-central/source/testing/web-platform/tests/url/urltestdata.json#3539

"http://%30%78%63%30%2e%30%32%35%30.01%2e" is escape of http://192.168.0.1. (Yes with trailing dot)
I think this should not be treated as a valid IPv4, but it seems not in this test. The worse thing is that my code fixed this, which is out of my expect.

So the fundamental problem is: Is "192.168.0.1." a valid IPv4?


By the way, |ParseString| in nsReadableUtils.cpp could not tell if the delimiter is in the end of string. A little weird thus I need to walk around.
Attachment #8775086 - Attachment is obsolete: true
Attachment #8777775 - Flags: feedback?(valentin.gosu)
Comment on attachment 8777775 [details] [diff] [review]
Canonicalize-IPv4, v2

Review of attachment 8777775 [details] [diff] [review]:
-----------------------------------------------------------------

Overall it looks really good.
Maybe ask :annevk about the trailing dot behaviour.
And we need a bunch of unit tests.

::: netwerk/base/nsStandardURL.cpp
@@ +405,5 @@
> +{
> +    uint32_t base;
> +    uint32_t prefixLength = 0;
> +
> +    if (input[0] == '0') {

move the if (input.Length() == 0) check here in case we end up using ParseIPv4Number in other places.

@@ +458,5 @@
> +    return NS_ERROR_FAILURE;
> +}
> +
> +nsresult
> +nsStandardURL::NormalizeIPv4(const nsCSubstring &host, nsCString &result)

Add a comment linking to the IPv4 parser in the url spec.
And comments to the code where it's not obvious why we do things that way.

@@ +462,5 @@
> +nsStandardURL::NormalizeIPv4(const nsCSubstring &host, nsCString &result)
> +{
> +    if (host.Last() == '.') {
> +        return NS_ERROR_FAILURE;
> +    }

> So the fundamental problem is: Is "192.168.0.1." a valid IPv4?

A domain name that ends in a dot is valid, so maybe the same should apply for IPs?

Chrome seems to accept it, and so does the ping utility on linux.

~$ ping 127.0.0.1.
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.026 ms

Maybe if the last character is '.' we should just remove it?
Attachment #8777775 - Flags: feedback?(valentin.gosu) → feedback+
Attached patch Canonicalize-IPv4, v3 (obsolete) — Splinter Review
I think the patch is ready except the dots problem (and a bug that 1..2 should have been invalid I just found)

In terms of IPv4,
1) Chrome supports leading dots and trailing dots,
that is, 127.0.0.1.... | ..127.0.0.1 | ..127.0.0.1... are all valid.
2) ping supports one or two trailing dots
3) Edge doesn't support

Hello :annevk,
Any idea of the correct dot behaviour?
Attachment #8777775 - Attachment is obsolete: true
Flags: needinfo?(annevk)
Per https://url.spec.whatwg.org/#concept-ipv4-parser the standard supports a trailing dot although it's considered a syntax violation (not a fatal error for parsers though). More than four dots are not supported and will cause the input to be treated as a domain.
Flags: needinfo?(annevk)
Attached patch Canonicalize-IPv4, v4 (obsolete) — Splinter Review
Here's my implementation about dot-behaviour:
For an IPv4,
1. leading dots are _not_ allowed, e.g., ".1.2.3.4"
2. one trailing dot is allowed, e.g., "1.2.3.4.", "1.2.3."
3. greater than or equal to two trailing dots are not allowed, no matter how many dot occurred, e.g., "1.2.3..", "1.2.3.4.."

This behaviour almost follows spec.
The exception is the second bullet which web-platform test wants to tolerant.

I'd like to sync with :annevk also, thus set a needinfo to see whether it makes sense.
Attachment #8778200 - Attachment is obsolete: true
Flags: needinfo?(annevk)
Attachment #8778819 - Flags: review?(valentin.gosu)
Per step 3 in https://url.spec.whatwg.org/#concept-ipv4-parser the standard requires parsers to allow a trailing dot as per 2 in your list. (Syntax violation just means that it's non-conforming to write such an address.)
Flags: needinfo?(annevk)
(In reply to Anne (:annevk) from comment #30)
> Per step 3 in https://url.spec.whatwg.org/#concept-ipv4-parser the standard
> requires parsers to allow a trailing dot as per 2 in your list. (Syntax
> violation just means that it's non-conforming to write such an address.)

Thanks. Sorry for misunderstanding syntax violation.
Comment on attachment 8778819 [details] [diff] [review]
Canonicalize-IPv4, v4

Review of attachment 8778819 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great. Fantastic work here.
r+ with the following minor issues addressed.
Thanks!

::: netwerk/base/nsStandardURL.cpp
@@ +388,5 @@
>  }
>  
> +// |base| should be 8, 10 or 16. Not check the precondition for performance.
> +inline bool
> +nsStandardURL::IsValidOfBase(unsigned char c, uint32_t base) {

I think you can use static_assert to check if the base is 8, 10, 16.

@@ +474,5 @@
> +        return NS_ERROR_FAILURE;
> +    }
> +    nsTArray<nsCString> parts;
> +    if (!ParseString(host, '.', parts) ||
> +        parts.Length() == 0 || parts.Length() > 4 || host[0] == '.') {

please check that host.Length() > 0 - this could happen as NormalizeIDN may remove characters.

::: netwerk/base/nsStandardURL.h
@@ +186,5 @@
>  
>      bool     ValidIPv6orHostname(const char *host, uint32_t aLen);
> +    bool     IsValidOfBase(unsigned char c, uint32_t base);
> +    nsresult ParseIPv4Number(nsCString &input, uint32_t &number);
> +    nsresult NormalizeIPv4(const nsCSubstring &host, nsCString &result);

You could either make these static, as they don't use any member variables, or just put them in an anonymous namespace in the .cpp.
Attachment #8778819 - Flags: review?(valentin.gosu) → review+
> ::: netwerk/base/nsStandardURL.cpp
> @@ +388,5 @@
> >  }
> >  
> > +// |base| should be 8, 10 or 16. Not check the precondition for performance.
> > +inline bool
> > +nsStandardURL::IsValidOfBase(unsigned char c, uint32_t base) {
> 
> I think you can use static_assert to check if the base is 8, 10, 16.
Good idea! 
> 
> @@ +474,5 @@
> > +        return NS_ERROR_FAILURE;
> > +    }
> > +    nsTArray<nsCString> parts;
> > +    if (!ParseString(host, '.', parts) ||
> > +        parts.Length() == 0 || parts.Length() > 4 || host[0] == '.') {
> 
> please check that host.Length() > 0 - this could happen as NormalizeIDN may
> remove characters.
> 
parts.Length() == 0 implies host.Length() == 0
Or should I still explicitly annotate it or check it?
> ::: netwerk/base/nsStandardURL.h
> @@ +186,5 @@
> >  
> >      bool     ValidIPv6orHostname(const char *host, uint32_t aLen);
> > +    bool     IsValidOfBase(unsigned char c, uint32_t base);
> > +    nsresult ParseIPv4Number(nsCString &input, uint32_t &number);
> > +    nsresult NormalizeIPv4(const nsCSubstring &host, nsCString &result);
> 
> You could either make these static, as they don't use any member variables,
> or just put them in an anonymous namespace in the .cpp.

Okay let's just make it static.
Flags: needinfo?(valentin.gosu)
(In reply to Junior [:junior] from comment #33)
> > > +        parts.Length() == 0 || parts.Length() > 4 || host[0] == '.') {
> > 
> > please check that host.Length() > 0 - this could happen as NormalizeIDN may
> > remove characters.
> > 
> parts.Length() == 0 implies host.Length() == 0
> Or should I still explicitly annotate it or check it?

I guess that's OK then. You could add a comment mentioning this, so it's obvious when someone later reads the code.
Thanks!
Flags: needinfo?(valentin.gosu)
> > ::: netwerk/base/nsStandardURL.cpp
> > @@ +388,5 @@
> > >  }
> > >  
> > > +// |base| should be 8, 10 or 16. Not check the precondition for performance.
> > > +inline bool
> > > +nsStandardURL::IsValidOfBase(unsigned char c, uint32_t base) {
> > 
> > I think you can use static_assert to check if the base is 8, 10, 16.
> Good idea! 
Since base could not be determined in compile-time, I'd like to use MOZ_ASSERT
modify by reviewer's comment, carry r+
Attachment #8778819 - Attachment is obsolete: true
Attachment #8779283 - Flags: review+
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5cb616a7d19e
Canonicalize IPv4 for nsStandardURL, r=valentin
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/5cb616a7d19e
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Depends on: 1299391
Depends on: 1302004
Depends on: 1338154
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: