439616 - Redirect of IDN - Domains

Reporter

Description

•

17 years ago

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14

<meta http-equiv="refresh" content="0; url=http://www.fübar.de" /> -> alert with "url is invalid"
PHP: header("Location: http://www.fübar.de"); -> opens: http://www.f%c3%bcbar.de/

Reproducible: Always

Steps to Reproduce: self-explanatory

Expected Results: call the right url? =) a "document.location.href" works fine

Matthias Versen [:Matti]

Comment 1

•

16 years ago

Reporter: Is this still an issue with FF3.0.3 ?

Dominic Blass

Reporter

Comment 2

•

16 years ago

(no longer active)

Comment 3

•

15 years ago

Confirming. This is a problem with websites which return UTF-8 encoded URIs in their Location redirects, such as the bit.ly link in the URL field. With this bug present, links from bit.ly (or other URL shorteners) to such domains cannot be opened, even if the user presses Reload after seeing the failed page, because Firefox tries to query the wrong DNS name. The only workaround is to go to the location bar, which contains the expanded URL, and hit Enter, which is not intuitive. The same URLs work in Chrome. I'm confirming this bug with major severity, since it effectively prevents users from visiting certain websites.

URL: http://bit.ly/d9M1aQ

Severity: normal → major

Status: UNCONFIRMED → NEW

blocking2.0: --- → ?

Component: General → Networking: HTTP

Ever confirmed: true

OS: Windows XP → All

Product: Firefox → Core

QA Contact: general → networking.http

Hardware: x86 → All

Version: unspecified → Trunk

Boris Zbarsky [:bzbarsky]

Comment 4

•

15 years ago

> This is a problem with websites which return UTF-8 encoded URIs in
> their Location redirects

Which is a violation of the HTTP RFC, no? The value of Location is an absoluteURI (RFC 2616 section 14.30), which is defined in RFC 2396 as:

    absoluteURI = scheme ":" ( hier_part | opaque_part )
    hier_part   = ( net_path | abs_path ) [ "?" query ]
    net_path    = "//" authority [ abs_path ]
    authority   = server | reg_name
    reg_name    = 1*( unreserved | escaped | "$" | "," |
                      ";" | ":" | "@" | "&" | "=" | "+" )

So sending unescaped non-ASCII bytes is actually not allowed in the Location header. Of course error handling if that's done is not defined either.

It sounds like Chrome is treating Location values as IRIs instead of URIs.

What we do is to take the URI given (as bytes), escape any non-ASCII bytes using URI-escaping so that we don't violate the HTTP spec ourselves, and use the resulting URI.

Note that comment 2 indicates that PHP in that case is sending the non-ASCII char encoded as ISO-8859-1, not as UTF-8. Also note that it's not consistent with comment 0 in terms of the PHP behavior.

The upshot of all of which is that httpbis needs to define what's supposed to happen here. Right now (as of http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-08 ) it seems to match RFC 2616 on the matter.
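As a rough illustration of the escaping step described in this comment, the sketch below percent-escapes every non-ASCII byte and checks the host against the quoted reg_name production. It is not Necko code; the names `REG_NAME_RE`, `is_reg_name` and `escape_non_ascii` are made up for this example, and the character class is an assumption based on the RFC 2396 definitions of `unreserved` and `escaped`.

```python
import re

# Hypothetical check for the reg_name production quoted above:
# unreserved (alphanumerics and marks), escaped (%XX), and the listed punctuation.
REG_NAME_RE = re.compile(r"(?:[A-Za-z0-9\-_.!~*'()$,;:@&=+]|%[0-9A-Fa-f]{2})+")


def is_reg_name(host: str) -> bool:
    """Return True if host matches the reg_name grammar in full."""
    return REG_NAME_RE.fullmatch(host) is not None


def escape_non_ascii(raw: bytes) -> str:
    """Percent-escape every byte >= 0x80 and leave ASCII bytes untouched."""
    return "".join(chr(b) if b < 0x80 else f"%{b:02x}" for b in raw)


host = "www.fübar.de"
utf8_escaped = escape_non_ascii(host.encode("utf-8"))
latin1_escaped = escape_non_ascii(host.encode("iso-8859-1"))

print(is_reg_name(host))          # False: raw non-ASCII is outside the grammar
print(utf8_escaped)               # www.f%c3%bcbar.de
print(latin1_escaped)             # www.f%fcbar.de
print(is_reg_name(utf8_escaped))  # True: the escaped form is syntactically valid
```

The escaped host is a valid reg_name, but it is a different name from the IDN the server meant, and the result depends on which charset the server happened to send.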

(no longer active)

Comment 5

•

15 years ago

(In reply to comment #4)
> > This is a problem with websites which return UTF-8 encoded URIs in
> > their Location redirects
>
> Which is a violation of the HTTP RFC, no?

AFAICT, yes.

> The value of Location is an
> absoluteURI (RFC 2616 section 14.30), which is defined in RFC 2396 as:
>
> absoluteURI = scheme ":" ( hier_part | opaque_part )
> hier_part = ( net_path | abs_path ) [ "?" query ]
> net_path = "//" authority [ abs_path ]
> authority = server | reg_name
> reg_name = 1*( unreserved | escaped | "$" | "," |
> ";" | ":" | "@" | "&" | "=" | "+" )
>
> So sending unescaped non-ASCII bytes is actually not allowed in the Location
> header. Of course error handling if that's done is not defined either.

Yes, this is also true to the best of my knowledge.

> It sounds like Chrome is treating Location values as IRIs instead of URIs.

Are IRIs a full superset of URIs? If yes, then is it a good decision for us to treat the Location value as an IRI as well?

> What we do is to take the URI given (as bytes), escape any non-ASCII bytes
> using URI-escaping so that we don't violate the HTTP spec ourselves, and use
> the resulting URI.
>
> Note that comment 2 indicates that PHP in that case is sending the non-ASCII
> char encoded as ISO-8859-1, not as UTF-8. Also note that it's not consistent
> with comment 0 in terms of the PHP behavior.

I'm not 100% sure, but AFAIK, PHP (at least up to PHP5) doesn't have any special notion of UTF-8 for the most part, so if the source code file is encoded in UTF-8, and contains something like:

<?php header('Location: http://دامنه.کام/path'); ?>

then PHP will end up sending the raw bytes between the quote characters as a header, that is, the UTF-8 encoding of those characters.

> The upshot of all of which is that httpbis needs to define what's supposed to
> happen here. Right now (as of
> http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-08 ) it seems to
> match RFC 2616 on the matter.

Can we consider this as a webcompat issue, and deviate from the spec for that reason? I know that most browsers treat relative URIs in the Location field as pointing to resources on the same domain, relative to the current request's location, but I can't find anywhere in the HTTP RFC which defines what should happen in this case. Can this be considered a similar issue?

Boris Zbarsky [:bzbarsky]

Comment 6

•

15 years ago

> Are IRIs a full superset of URIs?

I believe that any valid URI is a valid IRI, yes. But I'm not exactly an expert on that stuff.

> is it a good decision to us to treat the Location value as an IRI as well

_That_ I don't know.

> then PHP will end up sending the raw bytes between the quote characters as a
> header

That would explain the comment 0 vs comment 2 mess, yes.

> Can we consider this as a webcompat issue

We could, but it should still get specced in httpbis. Is UTF-8 the common case for these headers when they're non-ascii? Does it depend on the part of the URI (certainly that's the case in other places in IE)? Data needed... Data on how IE and Safari and Opera handle this would be good too. Also of interest is how the "actual" charset of the Location header lines up with the origin charset of the current request's URI... In general, they're uncorrelated, but what happens in practice?

In general, I have no problem aligning with other browsers on UTF-8 here, as long as it doesn't break existing consumers.

> but I can't find anywhere in the HTTP RFC which defines what should happen in
> this case

There isn't anything; relative URIs in Location are not valid HTTP 1.1.
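For reference, the de facto handling of a relative Location value that comment 5 describes (resolve it against the current request's URL) can be sketched with Python's standard `urljoin`. This is only an illustration of that webcompat behaviour, not browser code; `example.com` and the paths are placeholder values.

```python
from urllib.parse import urljoin

# URL of the request that produced the redirect (placeholder value).
request_url = "http://example.com/a/b/page"

# A Location value that is not an absoluteURI is resolved against the request URL.
print(urljoin(request_url, "/login"))   # http://example.com/login
print(urljoin(request_url, "other"))    # http://example.com/a/b/other

# An absolute Location value is used as given.
print(urljoin(request_url, "http://example.org/x"))  # http://example.org/x
```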

(no longer active)

Comment 7

•

15 years ago

What if we look for escaped characters inside the domain name, and convert the domain to the ACE notation if it has any? Does that make any sense? I don't think that in that case we will be breaking any existing consumers.
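A minimal sketch of this proposal, assuming the escaped bytes in the host are UTF-8 (which the thread has not established) and using Python's built-in `idna` codec as a stand-in for whatever IDN library the browser would use. The function name `host_to_ace` is hypothetical, and userinfo in the authority is ignored for brevity.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def host_to_ace(url: str) -> str:
    """If the host contains %XX escapes, unescape it and convert it to ACE form."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if "%" not in host:
        return url  # nothing escaped in the host, leave the URL alone

    # Assumption: the escaped bytes are UTF-8; invalid sequences raise an error.
    decoded = unquote(host, encoding="utf-8", errors="strict")
    ace = decoded.encode("idna").decode("ascii")

    # Rebuild the authority without userinfo (simplification for this sketch).
    netloc = ace if parts.port is None else f"{ace}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(host_to_ace("http://www.f%c3%bcbar.de/"))  # host becomes an xn-- label
print(host_to_ace("http://example.com/a%20b"))   # unchanged: no escapes in the host
```

Escapes in the path or query are left untouched, so only URLs whose host carries escaped bytes are affected.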

Boris Zbarsky [:bzbarsky]

Comment 8

•

15 years ago

You mean like bug 412457 or bug 309671?

(no longer active)

Comment 9

•

15 years ago

(In reply to comment #8)
> You mean like bug 412457 or bug 309671?

Yes!

Johnny Stenback (:jst)

Comment 10

•

15 years ago

Not blocking 1.9.3 on this but I marked bug 412457 as wanted for 1.9.3. Am I correct in assuming that fixing that bug alone would fix this problem as well? If so, should we dupe this?

blocking2.0: ? → -

Valentin Gosu [:valentin] (he/him)

Updated

•

9 years ago

Updated

•

9 years ago

Whiteboard: [necko-backlog]

Firefox Bug Husbandry Bot

Comment 12

•

7 years ago

Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258

Priority: -- → P1

Firefox Bug Husbandry Bot

Comment 13

•

7 years ago

Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258

Priority: P1 → P3

Jens Stutte [:jstutte]

Comment 14

•

4 years ago

Bulk-downgrade of unassigned, >=3 years untouched DOM/Storage bug's priority.

If you have reason to believe this is wrong, please write a comment and ni :jstutte.

Severity: major → S4

Priority: P3 → P5


Categories

(Core :: Networking: HTTP, defect, P5)


People

(Reporter: db, Unassigned)


Details

(Whiteboard: [necko-backlog])


Security

(public)
