Closed
Bug 67730
Opened 24 years ago
Closed 5 years ago
Obfuscated "dotless" IP (single large decimal or hexed) addresses shouldn't work
Categories
(Core :: Networking, enhancement, P3)
RESOLVED
DUPLICATE
of bug 1381139
People
(Reporter: mozilla, Unassigned)
References
Details
(Keywords: sec-low, testcase, Whiteboard: [sg:low] bypass external filters that are unfamiliar with these formats [necko-backlog])
We have all seen the spam that comes with URLs obfuscated by changing an IP
address from its 4-octet style to a simple decimal number (like
http://3486011863 instead of http://www.mozilla.org), or in octal (like
http://00000000317.00000000310.00000000121.00000000327/), or in base 256
notation (like http://4294967503.4294967496.4294967377.4294967511/).
Why Netscape and Internet Explorer follow these types of URLs is beyond me. It
would go a long way in the battle against spammers to simply have Mozilla check
that the host in a URL is either a dotted quad of decimal numbers or a fully
qualified domain name before it follows it.
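For readers who want to reproduce the three encodings in the report, here is a minimal Python sketch. The function names are hypothetical, and 207.200.81.215 is simply the address that the decimal example above decodes to. The last form adds 2^32 to every octet, which is how the 4294967503-style values arise when a lenient parser truncates each part to 8 bits.

```python
def dotted_quad_to_int(addr: str) -> int:
    """Pack a dotted quad such as '207.200.81.215' into one 32-bit integer."""
    value = 0
    for part in addr.split("."):
        octet = int(part, 10)
        if not 0 <= octet <= 255:
            raise ValueError(f"octet out of range: {part}")
        value = (value << 8) | octet
    return value


def obfuscated_forms(addr: str) -> dict:
    """Return the obfuscated spellings discussed in the report (illustrative only)."""
    octets = [int(part, 10) for part in addr.split(".")]
    packed = dotted_quad_to_int(addr)
    return {
        # One big decimal number, no dots.
        "dotless_decimal": str(packed),
        # The same number written in hexadecimal.
        "dotless_hex": hex(packed),
        # Each octet in octal, zero-padded to 11 characters.
        "padded_octal": ".".join(f"{o:011o}" for o in octets),
        # Each octet plus 2**32, relying on overflow in lenient parsers.
        "overflowed": ".".join(str(o + 2**32) for o in octets),
    }


forms = obfuscated_forms("207.200.81.215")
for name, text in forms.items():
    print(f"{name:16} http://{text}/")
```

Whether a given browser or resolver accepts each of these forms depends on the platform, as later comments in this bug show.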
Comment 1•24 years ago
I don't think we should block these urls. It's much easier for me to remember
2259499800 than it is for me to remember 134.173.59.24, and it isn't much more
obfuscated.
Converting anything that looks like an IP address to the canonical
xxx.xxx.xxx.xxx when following a link might make sense, but it could break
things, and I don't think there would be a great benefit because the
xxx.xxx.xxx.xxx form doesn't mean anything to most users either.
Comment 2•24 years ago
Remembering a 10-digit number more easily than a set of 4 numbers seems to fly
in the face of all my knowledge about human memory, and in terms of incidence it
is probably less common than odd number bases being used to obfuscate an IP address.
That said, not supporting them would do what? Ignore them silently, put up a
message box?
S
Comment 3•24 years ago
This must break compliance with at least one RFC, surely? :-) I'm against.
Gerv
Comment 4•24 years ago
Mozilla's current behaviour is clearly in violation of RFC2396, which specifies
that a host name MUST be either a domain name or an IPv4 dotted quad.
RFC 2396 says in section 3.2.2:
The host is a domain name of a network host, or its IPv4 address as a
set of four decimal digit groups separated by ".". Literal IPv6
addresses are not supported.
hostport = host [ ":" port ]
host = hostname | IPv4address
hostname = *( domainlabel "." ) toplabel [ "." ]
domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
toplabel = alpha | alpha *( alphanum | "-" ) alphanum
and somewhat earlier:
alphanum = alpha | digit
I believe all digits are legal.
Reporter | ||
Comment 6•24 years ago
Digits are legal in a host name, so just check whether the host portion of a URI
is a valid dotted quad, and if it's not, send it through DNS like any other
host name. That would cause all of the above URIs to fail just as
http://987.474.264.712 would (unless one of those numbers happens to be a
hostname on a LAN).
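The check proposed in this comment could be sketched roughly as follows. This is an illustration in Python, not Necko code: the function names are hypothetical, and rejecting leading zeros is an added assumption (the comment only asks for a "valid dotted quad"), made so that octal-looking octets are never treated as addresses.

```python
import re

# Four groups of one to three ASCII digits separated by dots.
_DOTTED_QUAD = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")


def is_strict_dotted_quad(host: str) -> bool:
    """True only for plain decimal a.b.c.d with every octet in 0..255."""
    match = _DOTTED_QUAD.fullmatch(host)
    if match is None:
        return False
    for group in match.groups():
        # Assumption: reject leading zeros, which lenient parsers read as octal.
        if len(group) > 1 and group[0] == "0":
            return False
        if int(group, 10) > 255:
            return False
    return True


def route_host(host: str) -> str:
    """Decide whether a host is used as an address or handed to name lookup."""
    return "ipv4" if is_strict_dotted_quad(host) else "dns"


for candidate in ("207.200.81.215", "3486011863", "987.474.264.712"):
    print(candidate, "->", route_host(candidate))
```

Note the caveat raised in comment 10: on many systems the name lookup itself goes through gethostbyname(), which may still interpret all-digit names as addresses, so a check like this only helps if the fallback path does not undo it.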
Comment 7•24 years ago
Simple decimal URLs also can allow people to bypass blocked (blacklisted by
filtering software) addresses. That might be considered a feature, not a bug.
Reporter | ||
Comment 8•24 years ago
That might be a beneficial side effect of Mozilla following obfuscated URLs,
but I think that by and large it [following obfuscated URIs] is not helpful. It
shouldn't be the job of Mozilla to ensure that it gives people a way to bypass
filtering software. Its common use prevents many users from
determining which domain, or IP subnet, a spam-vertized host is on. A lot of
users are now clueful enough to send LARTs to abuse@domain.com, and a few to do
a whois on the IP. I don't think there are any legitimate reasons to follow
these types of URIs (although I disagree with the principle of censorware, it is
technically not legit to bypass it on a system where the admin has installed
it).
Comment 9•24 years ago
All digits are legal but a hostname made up only of digits isn't. Look at that
grammar more closely.
host = hostname | IPv4address
hostname = *( domainlabel "." ) toplabel [ "." ]
domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
toplabel = alpha | alpha *( alphanum | "-" ) alphanum
hostname must contain a toplabel, and toplabel must begin with an alpha. If the
URI is all numbers and it's not decimal IPv4 dotted quad, it's illegal.
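The grammar argument can be checked mechanically. The sketch below transcribes the quoted productions into Python regular expressions. It assumes the IPv4address production is four groups of one or more digits separated by ".", as the prose quoted in comment 4 describes. The names are hypothetical.

```python
import re

ALPHANUM = "[A-Za-z0-9]"
# domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
DOMAINLABEL = f"{ALPHANUM}(?:[A-Za-z0-9-]*{ALPHANUM})?"
# toplabel = alpha | alpha *( alphanum | "-" ) alphanum
TOPLABEL = f"[A-Za-z](?:[A-Za-z0-9-]*{ALPHANUM})?"
# hostname = *( domainlabel "." ) toplabel [ "." ]
HOSTNAME = re.compile(rf"(?:{DOMAINLABEL}\.)*{TOPLABEL}\.?")
# Assumed: IPv4address is four digit groups separated by ".".
IPV4ADDRESS = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+")


def classify_rfc2396_host(host: str) -> str:
    """Classify a host string against the RFC 2396 productions quoted above."""
    if IPV4ADDRESS.fullmatch(host):
        return "IPv4address"
    if HOSTNAME.fullmatch(host):
        return "hostname"
    return "invalid"


for candidate in ("www.mozilla.org", "207.200.81.215", "3486011863", "0x42.0x660d63"):
    print(candidate, "->", classify_rfc2396_host(candidate))
```

Under these assumptions the dotless decimal form is rejected because its only label starts with a digit. The grammar by itself does not bound the size of each digit group, so range checks on the octets would still have to be applied separately.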
Comment 10•24 years ago
OK, I'm usually wrong with respect to grammar.
Jerry: what do you mean by "just send it through DNS?" You really have
to use the OS equivalent of gethostbyname. On Linux, gethostbyname
resolves all-digit names.
It concerns me that the current behavior is pervasive. Perhaps there is
some obscure RFC that demands it. This needs investigation.
Comment 11•24 years ago
Parsing octal-obfuscated IP address literals is a "happy" accident of the way
inet_addr() is implemented on most systems -- it commonly uses strtol(),
which has an implicit rule that a leading '0' means octal and a leading '0x'
means hexadecimal...
Having a stricter parser inside inet_addr() (or its equivalent) would certainly
block those obfuscations. In many implementations gethostbyname() also
calls inet_addr() if it can't resolve the input string via DNS lookup.
As to IPv6 -- See RFC 2732: Format for Literal IPv6 Addresses in URL's
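To make the strtol() prefix rule concrete, here is a small Python emulation of how a single numeric component is interpreted. It is an approximation written for illustration (the function name is hypothetical), not a port of any particular libc. Python's own int(text, 0) is not used because it rejects a bare leading zero such as "0317".

```python
import re

_HEX = re.compile(r"0[xX]([0-9a-fA-F]+)")
_OCT = re.compile(r"0([0-7]*)")
_DEC = re.compile(r"[1-9][0-9]*")


def parse_c_style_number(text: str) -> int:
    """Interpret text the way strtol(text, NULL, 0) treats a whole numeric string."""
    match = _HEX.fullmatch(text)
    if match:
        return int(match.group(1), 16)   # leading 0x means hexadecimal
    match = _OCT.fullmatch(text)
    if match:
        return int(match.group(1) or "0", 8)   # leading 0 means octal
    if _DEC.fullmatch(text):
        return int(text, 10)             # anything else is decimal
    raise ValueError(f"not a C-style number: {text!r}")


# Three spellings of the same octet value.
print(parse_c_style_number("207"), parse_c_style_number("0317"), parse_c_style_number("0xcf"))
```

A stricter parser of the kind suggested above would simply drop the hexadecimal and octal branches and accept only the decimal one.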
Comment 13•24 years ago
+qawanted, mozilla1.0
This should happen only in some versions of Windows, per Sean's comments in bug
12748. If someone has time, can they verify this?
I would like this fixed. If someone wants to map 10 digit decimal numbers to IP
addresses, they need to register a domain and use that.
Keywords: mozilla1.0,
qawanted
Summary: [RFE] Mozilla should not follow obfuscated URL's → [RFE] IP addresses should not work if they are in decimal
Comment 14•23 years ago
*** Bug 73597 has been marked as a duplicate of this bug. ***
Comment 15•23 years ago
RFC 1738 seems to explicitly disallow this format for valid HTTP URLs. RFC 1945
and RFC 2068 defer to this, although 2068 does note that "HTTP proxies may
receive requests for URIs not defined by RFC 1738."
Comment 16•23 years ago
CONFIRMED:
Linux, Mozilla 0.9.4
This only fails on Mac.
If you literalize the URL by putting a "." after the number, DNS does error.
Comment 17•23 years ago
+pp, -qawanted, ALL/ALL.
Reporter | ||
Comment 18•23 years ago
Change summary to something more descriptive.
Summary: [RFE] IP addresses should not work if they are in decimal → [RFE] Obfuscated IP addresses shouldn't work
Reporter | ||
Comment 19•22 years ago
Mass removing self from CC list.
Reporter | ||
Comment 20•22 years ago
Now I feel dumb because I have to add myself back. Sorry for the spam.
Comment 21•22 years ago
Chimera accepts this (Chimera 0.3).
Summary: [RFE] Obfuscated IP addresses shouldn't work → [RFE] Obfuscated IP (single large decimal or hexed) addresses shouldn't work
Comment 22•22 years ago
I see decimal IP addresses in links every once in a while, most recently on
http://www.berrypatch.org/pictures.html. Fixing this bug in order to comply
with RFCs would break sites. Does the RFC say that user agents should/must
reject addresses that don't match the RFC definition? Is there any benefit to
fixing this bug other than RFC compliance? Would canonicalizing the decimal
address to xxx.xxx.xxx.xxx form (for the benefit of filtering software) be a
reasonable compromise?
Reporter | ||
Comment 23•22 years ago
The benefit is disallowing the obfuscation of URLs for nefarious purposes.
There is no reason to hide a host's identity whatsoever. The spec states that
the name SHOULD be checked before being sent to DNS.
Comment 24•22 years ago
I don't see how a simple decimal number is any more obfuscated than a 4-octet
style number. And filing bugs with the specific intent of breaking links
because spammers use them is a futile way to fight spam.
Reporter | ||
Comment 25•22 years ago
It's not a way to fight spam. It's a way to remove one more little trick in the
spammer's toolbox, AND get Mozilla to adhere to the standards that it should be
complying with anyway.
Comment 26•22 years ago
Joe:
Most humans use dotted-quad addressing for a reason: it is relatively
human-readable. Almost all interfaces accept this format: OS configs, web
sites, even ARIN. Why should vendors and web sites start bolting on more code to
support decimal-to-dotted-quad conversion and 32-bit unsigned ints just because
a couple of system APIs are too liberal?
Comment 27•22 years ago
Test case links do not work on MacOS 10.2. They do work on Windows 2000 using
Mozilla 1.2a. Could someone test other platforms?
Comment 28•22 years ago
This is a testcase that is regularly checked.
http://www.mozilla.org/quality/networking/testing/coretests.html
Basically, Chimera allows this addressing format as well. I think only Mozilla
on Mac OS X ignores it (this bug blocks bug 150966 for chimera).
Comment 29•22 years ago
moving neeti's futured bugs for triaging.
Assignee: neeti → new-network-bugs
Comment 30•22 years ago
[RFE] is deprecated in favor of severity: enhancement. They have the same meaning.
Severity: normal → enhancement
Summary: [RFE] Obfuscated IP (single large decimal or hexed) addresses shouldn't work → Obfuscated IP (single large decimal or hexed) addresses shouldn't work
Comment 31•22 years ago
Mozilla 1.3b for Mac OS X accepts this, so it is a characteristic of mach-o
Comment 32•22 years ago
-mozilla 1.0: long gone
-pp: now that mac cfm is gone, all plats do this.
Keywords: mozilla1.0,
pp
Comment 33•22 years ago
I added the word "dotless" because MS describes it using that term.
Here's something interesting to think about.
http://www.microsoft.com/technet/treeview/default.asp?url=/technet/security/bulletin/MS01-055.asp
I'm new to cookies, so I'm trying to figure out if this matters to us. If anyone
can think of a reason this bug would intersect badly w/ cookies, please open a
bug in cookies:
* The third vulnerability is a new variant of a vulnerability discussed in
Microsoft Security Bulletin MS01-051 affecting how IE handles URLs that include
dotless IP addresses. If a web site were specified using a dotless IP format
(e.g., http://031713501415 rather than http://207.46.131.13), and the request
were malformed in a particular way, IE would not recognize that the site was an
Internet site. Instead, it would treat the site as an intranet site, and open
pages on the site in the Intranet Zone rather than the correct zone. This would
allow the site to run with fewer security restrictions than appropriate. This
vulnerability does not affect IE 6.
Summary: Obfuscated IP (single large decimal or hexed) addresses shouldn't work → Obfuscated "dotless" IP (single large decimal or hexed) addresses shouldn't work
Comment 34•22 years ago
*** Bug 150966 has been marked as a duplicate of this bug. ***
Comment 35•22 years ago
Why has this bug languished so long?
Comment 36•20 years ago
I've written and submitted JavaScript functions that do strict IPv4 and DNS FQDN
validation. If you hooked them into the URL parser, you could reject a lot of
this stuff. You can see the work in bug 273097 and bug 268893.
This would be pretty controversial, and would need to be a pref. It would also
need to be modernized to include IPv6 and IDN; so far I'm focused on a certain
level of base functionality.
Reporter | ||
Comment 37•20 years ago
Controversial to spammers and virus writers. Can you attach the code, or email
it? I imagine it shouldn't be too hard to use it to validate the URL prior to
submitting it from the location bar to Necko, but I don't know without tinkering.
Comment 38•20 years ago
Jerry: see bug 268619 and bug 268893.
I think I posted the test harness so there is a file where you can try out any
values you want.
Comment 39•18 years ago
*** Bug 358447 has been marked as a duplicate of this bug. ***
Comment 40•17 years ago
This bug reflects a fundamental misunderstanding of what an IP address is. An IP address is a long int. That's it. One big number. The dot-quad version of the long int improves readability, but the long int is in fact a valid form of a valid IP address.
One might argue that a large number is harder to remember than a dot-quad version of an IP. To someone making such an argument I would inquire if they know their phone number. Because my system's IP address (1079075330) is no harder to remember than a phone number.
"Fixing" this "bug" would cause the browser to behave differently than every other TCP/IP-using utility on the system, including ones that fetch web pages (Curl et al.). I would consider Firefox not retrieving an address in this format to be a bug.
Speaking of which, my OS X version of Firefox does not retrieve an address in this format, and I consider that a bug.
Reporter | ||
Comment 41•17 years ago
(In reply to comment #40)
> This bug reflects a fundamental misunderstanding of what an IP address is.
No, it doesn't. There is a format to IP addresses called a dotted quad. Your argument is that since the dotted quad is just a representation of a hexadecimal number, that any representation of the IP address which can be converted to that hexadecimal number should be allowed at the user interface level. Should a word processor allow you to type in hexadecimal, or octal, or even binary? They don't. Why not? It's the same thing. Why don't telephones allow you to dial in octal? The fact is that there are both formal definitions, and conventions. By following numerical addresses not in dotted quad form, Firefox is violating the convention of representing IP addresses as a dotted quad. The brokenness of other products is not a convincing argument for maintaining the brokenness of Firefox.
What functionality are you losing by not being able to follow decimal representations of hexadecimal addresses? Is there some reason you cannot use the dotted quad format?
Comment 42•17 years ago
I'm saying that the number IS a valid address and deliberately breaking that functionality breaks an ad-hoc convention that is used by every other TCP/IP client program that I've tested it on, with several different flavors of UNIX. You are intercepting and subverting an underlying capability of the system standard library.
I mostly use the capability to demonstrate to programmers new to TCP/IP that an address IS just a number. Being able to browse to a number in that format drives that point home quite effectively.
Comment 43•17 years ago
Pedagogy is a weak justification. (I was taught C using gets(), and it was one of the most insecure functions ever devised.)
The RFCs are fairly clear on the correct behavior. If there's a reason to ignore them, it's what Jesse said above -- that some people actually use this stuff for legitimate things.
Reporter | ||
Comment 44•17 years ago
(In reply to comment #43)
> The RFCs are fairly clear on the correct behavior. If there's a reason to
> ignore them, it's what Jesse said above -- that some people actually use this
> stuff for legitimate things.
My position is that you need to do a cost/benefit analysis. What are the costs and benefits associated with each course of action?
1. The cost of leaving it as-is: Spammers and phishers are able to pile on another layer of obfuscation to their sites, making it more difficult for features like Google's anti-phishing or Thunderbird's phishing detection.
2. The cost of fixing it: Some may not be able to use Firefox to demonstrate that an IP address is really just a number that can be represented as a decimal number, or a decimal representation of a DWORD.
The judgment of which cost is the greater evil depends on your opinion of the severity of each.
Comment 45•17 years ago
Another cost of leaving it as-is: it makes Firefox appear to differ between operating systems (iirc)
Another cost of fixing it: some sites will break. (Perhaps markp could tell us how many.)
Comment 48•15 years ago
Since these IP address formats already don't work in Firefox on some operating systems, I would not expect many sites to break if we were to drop support for them entirely.
Whiteboard: [sg:low] bypass external filters that are unfamiliar with these formats
Comment 49•15 years ago
I agree with Jesse's comment above. These address formats serve no useful purpose any more, and they interact nastily with external security measures.
I added this comment to bug 554596, which might help clarify some of the historical issues here:
http://tools.ietf.org/html/draft-main-ipaddr-text-rep-00 -- see section 2.1.1,
"Early Practice", which explains how the 4.2BSD inet_aton() became the de-facto
standard for IPv4 address interpretation, and that compatibility with this
lingers to this day. It concludes:
The 4.2BSD inet_aton() has been widely copied and imitated, and so is
a de facto standard for the textual representation of IPv4 addresses.
Nevertheless, these alternative syntaxes have now fallen out of use
(if they ever had significant use). The only practical use that they
now see is for deliberate obfuscation of addresses: giving an IPv4
address as a single 32-bit decimal number is favoured among people
wishing to conceal the true location that is encoded in a URL. All
the forms except for decimal octets are seen as non-standard (despite
being quite widely interoperable) and undesirable.
http://www.pc-help.org/obscure.htm contains a number of different examples of
IP address obfuscation techniques, including uses of the numeric overflows
described above.
Comment 50•15 years ago
Also rescued from the comments there, here's another little-known format:
Various implementations of inet_aton() have exciting semi-documented features such as two- and three-part dotted numerical addresses, for example:
a.b -- 8.24 bits -- example: http://0x42.0x660d63
a.b.c -- 8.8.16 bits -- example: http://0x42.0x66.0x0d63
See http://www.securelist.com/en/blog/148/New_Brazilian_banking_Trojans_recycle_old_URL_obfuscation_tricks for the original test cases.
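The a.b and a.b.c layouts can be illustrated with a short Python sketch of an inet_aton()-style parser, in which every part except the last is one octet and the last part fills the remaining bytes. It is an approximation for illustration only: the function names are hypothetical, and out-of-range parts are rejected here, whereas some real implementations wrap on overflow.

```python
import ipaddress


def parse_number(text: str) -> int:
    """Parse one component with the C prefix rules: 0x is hex, leading 0 is octal."""
    if text[:2].lower() == "0x":
        digits, base = text[2:], 16
    elif len(text) > 1 and text[0] == "0":
        digits, base = text[1:], 8
    else:
        digits, base = text, 10
    allowed = "0123456789abcdef"[:base]
    if not digits or any(ch not in allowed for ch in digits.lower()):
        raise ValueError(f"bad numeric component: {text!r}")
    return int(digits, base)


def lenient_ipv4(text: str) -> str:
    """Return the dotted quad that a lenient one- to four-part address denotes."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"expected 1 to 4 parts: {text!r}")
    *head, tail = [parse_number(part) for part in parts]
    if any(number > 0xFF for number in head):
        raise ValueError("leading parts must each fit in one octet")
    remaining_bits = 8 * (4 - len(head))   # the last part fills what is left
    if tail >= 1 << remaining_bits:
        raise ValueError("last part does not fit in the remaining bytes")
    value = 0
    for number in head:
        value = (value << 8) | number
    value = (value << remaining_bits) | tail
    return str(ipaddress.IPv4Address(value))


for spelling in ("0x42.0x660d63", "0x42.0x66.0x0d63", "3486011863"):
    print(spelling, "->", lenient_ipv4(spelling))
```

Both hexadecimal test cases above denote the same address, which is what makes these forms awkward for filters that compare host strings literally.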
Updated•15 years ago
Assignee: general → nobody
QA Contact: benc → networking
Target Milestone: Future → ---
Comment 51•13 years ago
Ooh looks like Vint Cerf agrees with me! He's the father of the Internet, you know? ;-P
http://interviews.slashdot.org/story/11/10/25/1532213/vint-cerf-answers-your-questions-about-ipv6-and-more
VC: LOL! actually, most of us assumed that any way to generate the 32 number should be acceptable since the connection process doesn't actually use the text representation of the IP address. I think any value in the range 0 to 2^32-1 should be acceptable as an IP reference. As to stateless operation, I know what you mean; you have to get used to figuring out how to stash intermediate state (cookies usually)...
Comment 52•13 years ago
Bruce, you're conflating IPs and URIs here. The browser's location bar takes a URI, not an IP. The RFC for the URI specifies that the host part may be specified by name or by IP, but prescribes a certain format for the IP. Your question to Vint Cerf conveniently neglected to mention this distinction, and you can't infer from his answer that he actually read this bug report to find out what the issue really was.
Comment 53•11 years ago
13 years later, I rescind comment #3.
See http://blogs.msdn.com/b/ieinternals/archive/2014/03/06/browser-arcana-ipv4-ipv6-literal-urls-dotted-va-dotless.aspx . We are sending these unusual formats over the wire, in a header ("Host") which can be used by some for security-related decisions. This is a hostage to fortune; we should certainly stop doing that.
Gerv
Comment 54•11 years ago
To be clear, the suggestion from the blog in comment 53 is that we translate these addresses into dotted decimal notation, not that we block them. I like that.
"OS; one of the first steps that class undertakes when constructing a URL object from a string is to convert any IP literal hostname into its canonical dotted-quad form. Chrome and Opera appear to match Internet Explorer’s behavior here, while Firefox 27 leaves the undotted decimal in the address bar and in the request sent to the network:"
Comment 56•10 years ago
(In reply to Patrick McManus [:mcmanus] from comment #54)
> to be clear the suggestion from the blog in comment 53 is that we translate
> these addresses into dotted decimal notation, not that we block them. I like
> that.
And this suggestion was copied by bug 1063010.
At what level do we want to fix this? Note that other browsers even display the corrected address in e.g. tooltips (for in-page links), the URL bar, etc. etc.
It's 6pm on a Monday, so if having a default-to-on pref for this is what appeases the people who insist on having http://2130706433/ work, I can live with doing that, too.
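As a rough sketch of the canonicalization approach discussed in comments 54 and 56, the following Python function rewrites a dotless decimal host into dotted-quad form before the URL is displayed or sent. It is an illustration under stated assumptions, not how any browser implements it: the function name is hypothetical, only plain decimal hosts are handled, hosts with a leading zero are left untouched rather than guessed at, and any userinfo in the URL is dropped.

```python
import ipaddress
from urllib.parse import urlsplit, urlunsplit


def canonicalize_dotless_host(url: str) -> str:
    """Rewrite http://2130706433/ style URLs to use a dotted-quad host."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Only plain ASCII decimal hosts are treated as dotless addresses here.
    if not (host.isascii() and host.isdigit()):
        return url
    # Assumption: a leading zero could mean octal, so leave such hosts alone.
    if len(host) > 1 and host[0] == "0":
        return url
    value = int(host, 10)
    if value > 0xFFFFFFFF:
        return url
    dotted = str(ipaddress.IPv4Address(value))
    netloc = dotted if parts.port is None else f"{dotted}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(canonicalize_dotless_host("http://2130706433/"))
```

Doing the rewrite at URL-parse time means the address bar, tooltips, and the Host header would all see the same canonical form, which is the behavior the blog post attributes to other browsers.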
Comment 57•10 years ago
Who's gonna drive this?
Updated•9 years ago
Whiteboard: [sg:low] bypass external filters that are unfamiliar with these formats → [sg:low] bypass external filters that are unfamiliar with these formats [necko-backlog]
Comment 60•7 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Comment 61•7 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Comment 62•5 years ago
this is also an issue with the IPv4 concept parser
bug 1381139
"feature" - https://bugzilla.mozilla.org/show_bug.cgi?id=1288049
http://10.0.514
resolves to
http://10.0.2.2
Updated•5 years ago
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE
Comment 64•5 years ago
@annevk this should not be a duplicate
bug 1381139 is a subset of this bug, should be marked duplicate the other way.
This bug should be kept open.