Bug 535845 - URL-encoded UTF-8 doesn't work on the commandline
Status: NEW
Product: Firefox
Classification: Client Software
Component: Shell Integration
Version: Trunk
Platform: x86 Mac OS X
Importance: -- normal
Assigned To: Nobody; OK to take it and work on it
URL: http://www.%e2%9e%a1.ws
Reported: 2009-12-18 11:51 PST by Zandr Milewski [:zandr]
Modified: 2010-09-24 16:42 PDT


Description Zandr Milewski [:zandr] 2009-12-18 11:51:29 PST
URL shorteners are starting to use bizarre glyphs to get really short URLs. In particular, tinyarro.ws seems to be gaining popularity.

Clicking on a link like this in Tweetie doesn't successfully open the URL. In fiddling with it a bit, it looks like Tweetie tries to open what I think is URL-encoded UTF-8. Firefox throws "Problem Loading Page" with the URL-encoded version in the error box, but the properly decoded URL in the address bar. Indeed, clicking in the address bar and hitting return opens the site successfully.

I tested this on the commandline with the same result.

For reference, Safari works, and clicking these links in Thunderbird works with Firefox (uses punycode instead?)
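
For anyone reproducing this, here is a minimal Python sketch (not anything Firefox or Tweetie actually runs) suggesting that the escaped host label is the UTF-8 percent-encoding of the arrow glyph. The helper name `decode_host` is made up for illustration.

```python
from urllib.parse import unquote

def decode_host(encoded_host: str) -> str:
    """Percent-decode a hostname, treating the escaped bytes as UTF-8."""
    return unquote(encoded_host, encoding="utf-8", errors="strict")

decoded = decode_host("www.%e2%9e%a1.ws")
arrow = decoded.split(".")[1]          # the middle label, a single character
print(f"U+{ord(arrow):04X}")           # prints U+27A1
print(arrow.encode("utf-8").hex())     # prints e29ea1
```

The three escaped bytes round-trip to a single code point, which is consistent with the decoded URL that shows up in the address bar.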
Comment 1 zooko 2010-01-12 14:55:01 PST
Here's an example tinyurl provided by http://➡.ws that I like to give to people, e.g. on Twitter: http://➡.ws/zooko . If following that link works then you should end up at my blog.
Comment 2 Zandr Milewski [:zandr] 2010-01-12 15:01:02 PST
NB: The links Zooko provides work fine if you click on them from within the browser. It's clicking on these links in other apps and expecting them to open in Firefox where things break down.

@gjmf also reported the problem coming from Echofon: http://twitter.com/gjmf/status/7685896641
Comment 3 Zandr Milewski [:zandr] 2010-04-28 07:52:08 PDT
If you leave a tab in the "Problem Loading Page" state and restart the browser, session restore doesn't decode the URL in the address bar.
Comment 4 Zandr Milewski [:zandr] 2010-07-15 22:53:08 PDT
This appears to have regressed in 4.0b1: the address bar now shows the URL-encoded version as well.
Comment 5 Zandr Milewski [:zandr] 2010-08-09 16:43:43 PDT
To make the regression even more magical... In 4.0, you see the URL-encoded version in the awesome bar. If you then type in a different address, visit that page, and hit 'Back', you get the 3.6 behavior. At that point you'll have a properly decoded address in the awesome bar, but still the error page. You can then hit return in the awesome bar to load the page.
Comment 6 Zandr Milewski [:zandr] 2010-09-24 16:34:20 PDT
We're called out by name twice here: http://daringfireball.net/2010/09/starstruck
Comment 7 Philipp von Weitershausen [:philikon] 2010-09-24 16:42:43 PDT
I don't think http://www.%e2%9e%a1.ws is a valid way to encode an IDN. This method of URL encoding is only valid for the path part, not the hostname. The hostname should be in punycode when you need to sling URLs around in byte form.

That said, we can't undo what those Mac apps are doing, and quite a few web apps out there also seem to get it wrong. Firefox should accept these.
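
As a rough illustration of what "accepting these" could look like, here is a Python sketch that percent-decodes the hostname as UTF-8 and then converts it to punycode. This is not Firefox code; the helper name `to_punycode_url` is hypothetical, and it relies on Python's built-in `idna` codec, which implements IDNA 2003 (other IDNA versions may reject symbols like the arrow).

```python
from urllib.parse import unquote, urlsplit, urlunsplit

def to_punycode_url(url: str) -> str:
    """Hypothetical helper: percent-decode the host as UTF-8, then IDNA-encode it."""
    parts = urlsplit(url)
    # Decode %XX escapes in the hostname only; path and query are left alone.
    host = unquote(parts.hostname or "", encoding="utf-8", errors="strict")
    # Python's built-in "idna" codec (IDNA 2003) yields the ASCII "xn--" form.
    ascii_host = host.encode("idna").decode("ascii")
    # Assumption for this sketch: userinfo (user:password@) is dropped,
    # only host and port are kept.
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

print(to_punycode_url("http://www.%e2%9e%a1.ws"))  # prints http://www.xn--hgi.ws
```

Plain ASCII hostnames pass through unchanged, so a step like this would only affect URLs whose host contains escaped or non-ASCII characters.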
