67730 - Obfuscated "dotless" IP (single large decimal or hexed) addresses shouldn't work

Reporter

Description

•

25 years ago

We have all seen the spam that comes with URLs obfuscated by changing an IP address from its 4-octet style to a simple decimal number (like http://3486011863 instead of http://www.mozilla.org), or to octal (like http://00000000317.00000000310.00000000121.00000000327/), or to base 256 notation (like http://4294967503.4294967496.4294967377.4294967511/). Why Netscape and Internet Explorer follow these types of URLs is beyond me. It would go a long way in the battle against spammers to simply have Mozilla check whether the host in a URL is either a decimal dotted quad or a fully qualified domain name before it follows it.
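As a quick sanity check, the three example hosts in the description all encode the same address. The Python sketch below is illustrative only: it uses the standard `ipaddress` module, and masking the oversized components to their low 8 bits is an assumption about how an overflowing parser would truncate them, not something tested against Netscape or IE.

```python
import ipaddress

# Single 32-bit decimal ("dotless") form from the description.
dotless = str(ipaddress.IPv4Address(3486011863))

# Zero-padded octal components: read each one in base 8.
octal_parts = ["00000000317", "00000000310", "00000000121", "00000000327"]
from_octal = ".".join(str(int(p, 8)) for p in octal_parts)

# Oversized components: each is 2**32 plus the real octet, so keeping
# only the low 8 bits (an assumed truncation) recovers the octet.
big_parts = [4294967503, 4294967496, 4294967377, 4294967511]
from_overflow = ".".join(str(n & 0xFF) for n in big_parts)

print(dotless, from_octal, from_overflow)
```

All three strings come out as the same dotted quad, which is the point of the report: the reader cannot tell that from the URL alone.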

Jesse Ruderman

Comment 1

•

25 years ago

I don't think we should block these urls. It's much easier for me to remember 2259499800 than it is for me to remember 134.173.59.24, and it isn't much more obfuscated. Converting anything that looks like an IP address to the canonical xxx.xxx.xxx.xxx when following a link might make sense, but it could break things, and I don't think there would be a great benefit because the xxx.xxx.xxx.xxx form doesn't mean anything to most users either.

Simon Lucy

Comment 2

•

25 years ago

Remembering a 10-digit number more easily than a set of 4 numbers seems to fly in the face of all my knowledge about human memory, and in incidence terms it is probably less common than odd number bases being used to obfuscate an IP address. That said, not supporting them would do what? Ignore them silently? Put up a message box? S

Gervase Markham [:gerv]

Comment 3

•

25 years ago

This must break compliance with at least one RFC, surely? :-) I'm against. Gerv

Doug Sheppard

Comment 4

•

25 years ago

Mozilla's current behaviour is clearly in violation of RFC2396, which specifies that a host name MUST be either a domain name or an IPv4 dotted quad.

tenthumbs

Comment 5

•

25 years ago

RFC 2396 says in section 3.2.2:

The host is a domain name of a network host, or its IPv4 address as a set of four decimal digit groups separated by ".". Literal IPv6 addresses are not supported.

hostport = host [ ":" port ]
host = hostname | IPv4address
hostname = *( domainlabel "." ) toplabel [ "." ]
domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
toplabel = alpha | alpha *( alphanum | "-" ) alphanum

and somewhat earlier:

alphanum = alpha | digit

I believe all digits are legal.

Jerry Baker

Reporter

Comment 6

•

25 years ago

Digits are legal in a host name, so just see if the host portion of a URI is a valid dotted quad, and if it's not then just send it through DNS like any other host name. That would cause all of the above URIs to fail just as http://987.474.264.712 would (unless one of those numbers happens to be a hostname on a LAN).
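A minimal Python sketch of that check, for illustration only (this is not Mozilla code, and the function names and return labels are hypothetical): a host counts as an IPv4 literal only if it is four decimal groups of at most three digits, each no larger than 255; everything else is handed to name resolution.

```python
import re

# Four groups of 1-3 decimal digits; the 0-255 range is checked separately.
# Note: this sketch reads a leading zero as decimal ("010" is 10), which is
# itself a policy choice that a real implementation would have to make.
_QUAD = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")


def is_strict_dotted_quad(host: str) -> bool:
    """True only for a.b.c.d with every component in 0..255."""
    m = _QUAD.fullmatch(host)
    return bool(m) and all(int(g) <= 255 for g in m.groups())


def route_host(host: str) -> str:
    """Return 'ipv4-literal' or 'dns' (hypothetical labels)."""
    return "ipv4-literal" if is_strict_dotted_quad(host) else "dns"


for h in ["134.173.59.24", "3486011863", "987.474.264.712"]:
    print(h, "->", route_host(h))
```

Under this rule the dotless and out-of-range forms are never interpreted numerically; whether they then fail depends on what the resolver does with them, which is the concern raised in comment 10.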

Patrick Lam

Comment 7

•

25 years ago

Simple decimal URLs also can allow people to bypass blocked (blacklisted by filtering software) addresses. That might be considered a feature, not a bug.

Jerry Baker

Reporter

Comment 8

•

25 years ago

That might be a beneficial side effect of Mozilla following obfuscated URLs, but I think that by and large it [following obfuscated URIs] is not helpful. It shouldn't be the job of Mozilla to ensure that it gives people a way to bypass filtering software. What it is commonly used for prevents many users from determining which domain, or IP subnet, a spam-vertized host is on. A lot of users are now clueful enough to send LARTs to abuse@domain.com, and a few to do a whois on the IP. I don't think there are any legitimate reasons to follow these types of URIs (although I disagree with the principle of censorware, it is technically not legit to bypass it on a system where the admin has installed it).

Doug Sheppard

Comment 9

•

25 years ago

All digits are legal but a hostname made up only of digits isn't. Look at that grammar more closely.

host = hostname | IPv4address
hostname = *( domainlabel "." ) toplabel [ "." ]
domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
toplabel = alpha | alpha *( alphanum | "-" ) alphanum

hostname must contain a toplabel, and toplabel must begin with an alpha. If the URI is all numbers and it's not decimal IPv4 dotted quad, it's illegal.
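The grammar argument can be checked mechanically. Below is a rough Python transcription of the productions quoted above into regular expressions; the `IPV4ADDRESS` pattern follows RFC 2396's "four decimal digit groups" wording and is my own addition, and `is_rfc2396_host` is a hypothetical helper name. It is a sketch of the grammar, not of any browser's parser.

```python
import re

# domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
DOMAINLABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# toplabel = alpha | alpha *( alphanum | "-" ) alphanum
TOPLABEL = r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# hostname = *( domainlabel "." ) toplabel [ "." ]
HOSTNAME = rf"(?:{DOMAINLABEL}\.)*{TOPLABEL}\.?"
# Four decimal digit groups separated by "." (the grammar itself does
# not limit each group to 0..255; that is a separate semantic check).
IPV4ADDRESS = r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"

HOST_RE = re.compile(rf"(?:{HOSTNAME}|{IPV4ADDRESS})")


def is_rfc2396_host(host: str) -> bool:
    """True if the whole string matches host = hostname | IPv4address."""
    return HOST_RE.fullmatch(host) is not None


for h in ["www.mozilla.org", "134.173.59.24", "3486011863", "0x42.0x660d63"]:
    print(h, is_rfc2396_host(h))
```

The all-digit and hex forms fail because no alternative lets the last label start with a digit unless the host is exactly four digit groups.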

tenthumbs

Comment 10

•

25 years ago

OK, I'm usually wrong with respect to grammar. Jerry: what do you mean by "just send it through DNS?" You really have to use the OS equivalent of gethostbyname. On Linux, gethostbyname resolves all-digit names. It concerns me that the current behavior is pervasive. Perhaps there is some obscure RFC that demands it. This needs investigation.

neeti

Updated

•

24 years ago

Target Milestone: --- → Future

Matti Aarnio

Comment 11

•

24 years ago

Parsing octal-obfuscated IP address literals is a "happy" accident of the way inet_addr() is implemented on most systems: it commonly uses strtol(), which has an implicit rule that a leading '0' means octal, unless it is '0x', which means hexadecimal. A stricter parser inside inet_addr() (or its equivalent) would certainly block those obfuscations. In many implementations, gethostbyname() also calls inet_addr() if it can't resolve the input string via DNS lookup. As to IPv6, see RFC 2732: Format for Literal IPv6 Addresses in URL's
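To make the strtol() rule concrete, here is a small Python imitation of its base detection. It is a sketch under assumptions: `strtol_base0` is a hypothetical name, it handles only non-negative input, and unlike the C function it raises `ValueError` on a bad digit instead of stopping at the first invalid character.

```python
def strtol_base0(text: str) -> int:
    """Imitate the base-detection rule of C's strtol(s, NULL, 0)."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        # "0x" prefix: hexadecimal.
        return int(lowered[2:], 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        # Any other leading zero: octal.
        return int(lowered, 8)
    # Otherwise plain decimal.
    return int(lowered, 10)


# Three spellings of the same octet value.
print(strtol_base0("207"), strtol_base0("0317"), strtol_base0("0xCF"))
```

A parser built on this rule accepts the zero-padded octal components from the description without any special effort, which is why the behavior looks accidental rather than designed.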

benc

Comment 12

•

24 years ago

mass move, v2. qa to me.

QA Contact: tever → benc

benc

Comment 13

•

24 years ago

+qawanted, mozilla1.0 This should happen only in some versions of Windows, per Sean's comments in bug 12748. If someone has time, can they verify this? I would like this fixed. If someone wants to map 10 digit decimal numbers to IP addresses, they need to register a domain and use that.

Keywords: mozilla1.0, qawanted

benc

Updated

•

24 years ago

Summary: [RFE] Mozilla should not follow obfuscated URL's → [RFE] IP addresses should not work if they are in decimal

benc

Comment 14

•

24 years ago

*** Bug 73597 has been marked as a duplicate of this bug. ***

Chase Tingley

Comment 15

•

24 years ago

RFC 1738 seems to explicitly disallow this format for valid HTTP URLs. RFC 1945 and RFC 2068 defer to this, although 2068 does note that "HTTP proxies may receive requests for URIs not defined by RFC 1738."

benc

Comment 16

•

24 years ago

CONFIRMED: Linux, Mozilla 0.9.4 This only fails on Mac. If you literalize the URL by putting a "." after the number, DNS does error.

benc

Comment 17

•

23 years ago

+pp, -qawanted, ALL/ALL.

Keywords: qawanted → pp

OS: Windows 2000 → All

Hardware: PC → All

Jerry Baker

Reporter

Comment 18

•

23 years ago

Change summary to something more descriptive.

Summary: [RFE] IP addresses should not work if they are in decimal → [RFE] Obfuscated IP addresses shouldn't work

Jerry Baker

Reporter

Comment 19

•

23 years ago

Mass removing self from CC list.

Jerry Baker

Reporter

Comment 20

•

23 years ago

Now I feel dumb because I have to add myself back. Sorry for the spam.

benc

Comment 21

•

23 years ago

Chimera accepts this (Chimera 0.3).

benc

Updated

•

23 years ago

Summary: [RFE] Obfuscated IP addresses shouldn't work → [RFE] Obfuscated IP (single large decimal or hexed) addresses shouldn't work

benc

Updated

•

23 years ago

Blocks: 150966

Jesse Ruderman

Comment 22

•

23 years ago

I see decimal IP addresses in links every once in a while, most recently on http://www.berrypatch.org/pictures.html. Fixing this bug in order to comply with RFCs would break sites. Does the RFC say that user agents should/must reject addresses that don't match the RFC definition? Is there any benefit to fixing this bug other than RFC compliance? Would canonicalizing the decimal address to xxx.xxx.xxx.xxx form (for the benefit of filtering software) be a reasonable compromise?

Jerry Baker

Reporter

Comment 23

•

23 years ago

The benefit is disallowing the obfuscation of URLs for nefarious purposes. There is no reason to hide a host's identity whatsoever. The spec states that the name SHOULD be checked before being sent to DNS.

Joseph Elwell

Comment 24

•

23 years ago

I don't see how a simple decimal number is any more obfuscated than a 4-octet style number. And filing bugs with the specific intent of breaking links because spammers use them is a futile way to fight spam.

Jerry Baker

Reporter

Comment 25

•

23 years ago

It's not a way to fight spam. It's a way to remove one more little trick in the spammer's toolbox, AND get Mozilla to adhere to the standards that it should be complying with anyway.

benc

Comment 26

•

23 years ago

Joe: Most humans use dotted quad addressing for a reason: it is relatively human readable. And almost all interfaces accept this format: OS configs, web sites, even ARIN. Why should vendors and web sites start bolting on more code to convert between decimal, dotted quad, and 32-bit unsigned int just because a couple of system APIs are too liberal?

John G. Myers

Comment 27

•

23 years ago

Test case links do not work on MacOS 10.2. They do work on Windows 2000 using Mozilla 1.2a. Could someone test other platforms?

benc

Comment 28

•

23 years ago

This is a testcase that is regularly checked. http://www.mozilla.org/quality/networking/testing/coretests.html Basically, Chimera allows this addressing format as well. I think only Mozilla on Mac OS X ignores it (this bug blocks bug 150966 for chimera).

Doug Turner (:dougt)

Comment 29

•

23 years ago

moving neeti's futured bugs for triaging.

Assignee: neeti → new-network-bugs

Brant Gurganus

Comment 30

•

23 years ago

[RFE] is deprecated in favor of severity: enhancement. They have the same meaning.

Severity: normal → enhancement

timeless

Updated

•

23 years ago

Summary: [RFE] Obfuscated IP (single large decimal or hexed) addresses shouldn't work → Obfuscated IP (single large decimal or hexed) addresses shouldn't work

benc

Comment 31

•

23 years ago

Mozilla 1.3b for Mac OS X accepts this, so it is a characteristic of mach-o

Andrew Hagen

Updated

•

23 years ago

No longer blocks: 150966

benc

Comment 32

•

23 years ago

-mozilla 1.0: long gone -pp: now that mac cfm is gone, all plats do this.

Keywords: mozilla1.0, pp

benc

Updated

•

23 years ago

Keywords: testcase

benc

Comment 33

•

22 years ago

I added the word "dotless" because MS describes it using that term.

Here's something interesting to think about. http://www.microsoft.com/technet/treeview/default.asp?url=/technet/security/bulletin/MS01-055.asp

I'm new to cookies, so I'm trying to figure out if this matters to us. If anyone can think of a reason this bug would intersect badly w/ cookies, please open a bug in cookies:

* The third vulnerability is a new variant of a vulnerability discussed in Microsoft Security Bulletin MS01-051 affecting how IE handles URLs that include dotless IP addresses. If a web site were specified using a dotless IP format (e.g., http://031713501415 rather than http://207.46.131.13), and the request were malformed in a particular way, IE would not recognize that the site was an Internet site. Instead, it would treat the site as an intranet site, and open pages on the site in the Intranet Zone rather than the correct zone. This would allow the site to run with fewer security restrictions than appropriate. This vulnerability does not affect IE 6.

Summary: Obfuscated IP (single large decimal or hexed) addresses shouldn't work → Obfuscated "dotless" IP (single large decimal or hexed) addresses shouldn't work

benc

Comment 34

•

22 years ago

*** Bug 150966 has been marked as a duplicate of this bug. ***

Simon Fraser [no longer active]

Comment 35

•

22 years ago

Why has this bug languished so long?

benc

Comment 36

•

20 years ago

I've written and submitted JavaScript functions that do strict IPv4 and DNS FQDN validation. If you hooked them into the URL parser, you could reject a lot of this stuff. You can see the work in bug 273097 and bug 268893. This would be pretty controversial, and would need to be a pref. It would also need to be modernized to include IPv6 and IDN; I'm focused on a certain level of base functionality.

Jerry Baker

Reporter

Comment 37

•

20 years ago

Controversial to spammers and virus writers. Can you attach the code, or email it? I imagine it shouldn't be too hard to use it to validate the URL prior to submitting it from the location bar to Necko, but I don't know without tinkering.

benc

Comment 38

•

20 years ago

Jerry: see bug 268619 and bug 268893. I think I posted the test harness so there is a file where you can try out any values you want.

Jesse Ruderman

Comment 39

•

19 years ago

*** Bug 358447 has been marked as a duplicate of this bug. ***

Bruce Ide

Comment 40

•

18 years ago

This bug reflects a fundamental misunderstanding of what an IP address is. An IP address is a long int. That's it. One big number. The dot-quad version of the long int improves readability, but the long int is in fact a valid form of a valid IP address. One might argue that a large number is harder to remember than a dot-quad version of an IP. To someone making such an argument I would inquire if they know their phone number. Because my system's IP address (1079075330) is no harder to remember than a phone number. "Fixing" this "bug" would cause the browser to behave differently than every other TCP/IP using utility on the system, including ones that fetch web pages (Curl et al.) I would consider firefox not retrieving an address in this format to be a bug. Speaking of which, my OSX version of Firefox does not retrieve an address in this format, and I consider that a bug.

Jerry Baker

Reporter

Comment 41

•

18 years ago

(In reply to comment #40)
> This bug reflects a fundamental misunderstanding of what an IP address is.

No, it doesn't. There is a format to IP addresses called a dotted quad. Your argument is that since the dotted quad is just a representation of a hexadecimal number, that any representation of the IP address which can be converted to that hexadecimal number should be allowed at the user interface level. Should a word processor allow you to type in hexadecimal, or octal, or even binary? They don't. Why not? It's the same thing. Why don't telephones allow you to dial in octal?

The fact is that there are both formal definitions, and conventions. By following numerical addresses not in dotted quad form, Firefox is violating the convention of representing IP addresses as a dotted quad. The brokenness of other products is not a convincing argument for maintaining the brokenness of Firefox.

What functionality are you losing by not being able to follow decimal representations of hexadecimal addresses? Is there some reason you cannot use the dotted quad format?

Bruce Ide

Comment 42

•

17 years ago

I'm saying that the number IS a valid address and deliberately breaking that functionality breaks an ad-hoc convention that is used by every other TCP/IP client program that I've tested it on, with several different flavors of UNIX. You are intercepting and subverting an underlying capability of the system standard library. I mostly use the capability to demonstrate to programmers new to TCP/IP that an address IS just a number. Being able to browse to a number in that format drives that point home quite effectively.

Chase Tingley

Comment 43

•

17 years ago

Pedagogy is a weak justification. (I was taught C using gets(), and it was one of the most insecure functions ever devised.) The RFCs are fairly clear on the correct behavior. If there's a reason to ignore them, it's what Jesse said above -- that some people actually use this stuff for legitimate things.

Jerry Baker

Reporter

Comment 44

•

17 years ago

(In reply to comment #43)
> The RFCs are fairly clear on the correct behavior. If there's a reason to
> ignore them, it's what Jesse said above -- that some people actually use this
> stuff for legitimate things.

My position is that you need to do a cost/benefit analysis. What are the costs and benefits associated with each course of action?

1. The cost of leaving it as-is: Spammers and phishers are able to pile on another layer of obfuscation to their sites, making it more difficult for features like Google's anti-phishing or Thunderbird's phishing detection.
2. The cost of fixing it: Some may not be able to use Firefox to demonstrate that an IP address is really just a number that can be represented as a decimal number, or a decimal representation of a DWORD.

The judgment of which cost is the greater evil depends on your opinion of the severity of each.

Jesse Ruderman

Comment 45

•

17 years ago

Another cost of leaving it as-is: it makes Firefox appear to differ between operating systems (iirc). Another cost of fixing it: some sites will break. (Perhaps markp could tell us how many.)

Jesse Ruderman

Updated

•

17 years ago

Depends on: 430273

Jesse Ruderman

Comment 48

•

15 years ago

Since these IP address formats already don't work in Firefox on some operating systems, I would not expect many sites to break if we were to drop support for them entirely.

Whiteboard: [sg:low] bypass external filters that are unfamiliar with these formats

Neil Harris

Comment 49

•

15 years ago

I agree with Jesse's comment above. These address formats serve no useful purpose any more, and they interact nastily with external security measures.

I added this comment to bug 554596, which might help clarify some of the historical issues here: http://tools.ietf.org/html/draft-main-ipaddr-text-rep-00 -- see section 2.1.1, "Early Practice", which explains how the 4.2BSD inet_aton() became the de-facto standard for IPv4 address interpretation, and that compatibility with this lingers to this day. It concludes:

The 4.2BSD inet_aton() has been widely copied and imitated, and so is a de facto standard for the textual representation of IPv4 addresses. Nevertheless, these alternative syntaxes have now fallen out of use (if they ever had significant use). The only practical use that they now see is for deliberate obfuscation of addresses: giving an IPv4 address as a single 32-bit decimal number is favoured among people wishing to conceal the true location that is encoded in a URL. All the forms except for decimal octets are seen as non-standard (despite being quite widely interoperable) and undesirable.

http://www.pc-help.org/obscure.htm contains a number of different examples of IP address obfuscation techniques, including uses of the numeric overflows described above.

Neil Harris

Comment 50

•

15 years ago

Also rescued from the comments there, here's another little-known format: Various implementations of inet_aton() have exciting semi-documented features such as two- and three-part dotted numerical addresses, for example:

a.b -- 8.24 bits -- example: http://0x42.0x660d63
a.b.c -- 8.8.16 bits -- example: http://0x42.0x66.0x0d63

See http://www.securelist.com/en/blog/148/New_Brazilian_banking_Trojans_recycle_old_URL_obfuscation_tricks for the original test cases.
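For anyone who wants to experiment with these part-count rules, here is a Python sketch of a permissive parser in the spirit of inet_aton(). It is an approximation written for this comment, not a port of any libc: `loose_inet_aton` and `_component` are hypothetical names, and real implementations differ in details such as overflow handling.

```python
import ipaddress


def _component(text: str) -> int:
    # strtol-style base detection: "0x.." is hex, a leading "0" is octal.
    t = text.lower()
    if not t.isalnum():
        raise ValueError(f"bad component: {text!r}")
    if t.startswith("0x"):
        return int(t[2:], 16)
    if len(t) > 1 and t.startswith("0"):
        return int(t, 8)
    return int(t, 10)


def loose_inet_aton(text: str) -> str:
    """Interpret 1- to 4-part numeric addresses; return the dotted quad."""
    parts = [_component(p) for p in text.split(".")]
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected 1 to 4 components")
    *head, last = parts
    # Every leading component fills exactly one octet.
    if any(p > 0xFF for p in head):
        raise ValueError("leading component exceeds 8 bits")
    # The final component fills whatever remains: 32, 24, 16, or 8 bits.
    remaining_bits = 32 - 8 * len(head)
    if last >= 1 << remaining_bits:
        raise ValueError("final component too large")
    value = last
    for index, octet in enumerate(head):
        value |= octet << (24 - 8 * index)
    return str(ipaddress.IPv4Address(value))


print(loose_inet_aton("0x42.0x660d63"))
print(loose_inet_aton("0x42.0x66.0x0d63"))
```

Both example hosts above reduce to the same dotted quad, so a filter that only compares strings would treat them as two unrelated hosts.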

Reed Loden [:reed]

Updated

•

15 years ago

Assignee: general → nobody

QA Contact: benc → networking

Target Milestone: Future → ---

Bruce Ide

Comment 51

•

14 years ago

Ooh looks like Vint Cerf agrees with me! He's the father of the Internet, you know? ;-P

http://interviews.slashdot.org/story/11/10/25/1532213/vint-cerf-answers-your-questions-about-ipv6-and-more

VC: LOL! actually, most of us assumed that any way to generate the 32 number should be acceptable since the connection process doesn't actually use the text representation of the IP address. I think any value in the range 0 to 2^32-1 should be acceptable as an IP reference. As to stateless operation, I know what you mean; you have to get used to figuring out how to stash intermediate state (cookies usually)...

Tristan Miller

Comment 52

•

14 years ago

Bruce, you're conflating IPs and URIs here. The browser's location bar takes a URI, not an IP. The RFC for the URI specifies that the host part may be specified by name or by IP, but prescribes a certain format for the IP. Your question to Vint Cerf conveniently neglected to mention this distinction, and you can't infer from his answer that he actually read this bug report to find out what the issue really was.

Curtis Koenig [:curtisk-use curtis.koenig+bzATgmail.com]]

Updated

•

13 years ago

Keywords: sec-low

Gervase Markham [:gerv]

Comment 53

•

11 years ago

13 years later, I rescind comment #3. See http://blogs.msdn.com/b/ieinternals/archive/2014/03/06/browser-arcana-ipv4-ipv6-literal-urls-dotted-va-dotless.aspx . We are sending these unusual formats over the wire, in a header ("Host") which can be used by some for security-related decisions. This is a hostage to fortune; we should certainly stop doing that. Gerv

Patrick McManus [:mcmanus]

Comment 54

•

11 years ago

to be clear the suggestion from the blog in comment 53 is that we translate these addresses into dotted decimal notation, not that we block them. I like that. "OS; one of the first steps that class undertakes when constructing a URL object from a string is to convert any IP literal hostname into its canonical dotted-quad form. Chrome and Opera appear to match Internet Explorer’s behavior here, while Firefox 27 leaves the undotted decimal in the address bar and in the request sent to the network2:"

:Gijs (he/him)

Comment 56

•

11 years ago

(In reply to Patrick McManus [:mcmanus] from comment #54)
> to be clear the suggestion from the blog in comment 53 is that we translate
> these addresses into dotted decimal notation, not that we block them. I like
> that.

And this suggestion was copied by bug 1063010. At what level do we want to fix this? Note that other browsers even display the corrected address in e.g. tooltips (for in-page links), the URL bar, etc. etc.

It's 6pm on a Monday and so if having a default-to-on pref for this is what appeases the people who insist on having http://2130706433/ work, I can live with doing that, too.
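For discussion purposes, the translate-rather-than-block idea can be sketched in a few lines of Python with the standard `urllib.parse` and `ipaddress` modules. This is only an illustration of the behavior, not a proposal for where it belongs in Necko; `canonicalize_dotless` is a hypothetical name, it handles only the all-decimal dotless form, and it drops any userinfo in the URL.

```python
import ipaddress
from urllib.parse import urlsplit, urlunsplit


def canonicalize_dotless(url: str) -> str:
    """Rewrite an all-decimal dotless host into dotted-quad form."""
    parts = urlsplit(url)
    host = parts.hostname
    if host and host.isascii() and host.isdigit() and int(host) < 2**32:
        dotted = str(ipaddress.IPv4Address(int(host)))
        # Rebuild the authority, keeping an explicit port if present
        # (userinfo, if any, is dropped in this sketch).
        netloc = dotted if parts.port is None else f"{dotted}:{parts.port}"
        parts = parts._replace(netloc=netloc)
    return urlunsplit(parts)


print(canonicalize_dotless("http://2130706433/"))
```

With this approach the address shown to the user and sent in the Host header is the dotted quad, so the link keeps working while the obfuscation is removed.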

Florian Bender

Comment 57

•

11 years ago

Who's gonna drive this?

Patrick McManus [:mcmanus]

Updated

•

10 years ago

Whiteboard: [sg:low] bypass external filters that are unfamiliar with these formats → [sg:low] bypass external filters that are unfamiliar with these formats [necko-backlog]

Firefox Bug Husbandry Bot

Comment 60

•

8 years ago

Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258

Priority: -- → P1

Firefox Bug Husbandry Bot

Comment 61

•

8 years ago

Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258

Priority: P1 → P3

13hu

Comment 62

•

6 years ago

this is also an issue with the IPv4 concept parser
bug 1381139
"feature" - https://bugzilla.mozilla.org/show_bug.cgi?id=1288049

http://10.0.514
resolves to
http://10.0.2.2
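The arithmetic behind that result, as a small Python check. It assumes the usual three-part rule in which the last component covers the low 16 bits of the address:

```python
# In a three-part address a.b.c, the final component spans two octets.
third, fourth = divmod(514, 256)
print(f"10.0.{third}.{fourth}")
```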

Anne (:annevk)

Updated

•

6 years ago

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → DUPLICATE

13hu

Comment 64

•

6 years ago

@annevk this should not be a duplicate
bug 1381139 is a subset of this bug, should be marked duplicate the other way.

This bug should be kept open.