If I perform a search using a search engine that requires its parameters to be passed in UTF-8, and later try to load the same URL using AwesomeBar, Firefox loads a URL encoded in local charset instead, even though the URL is stored correctly in History. Steps to reproduce: 1 open http://search.live.com/ 2 paste text 'mėšlas' in a search box and press enter 3 You'll see the results with this word, note that the search box still contains a word 'mėšlas' 4 Focus the AwesomeBar, enter words 'live' and 'mėšlas' in it, then choose the URL from a drop-down list and press enter 5 You'll see results for a word 'mlas' instead Open the History window. You will see two different URLs in the history: http://search.live.com/results.aspx?q=m%EB%F0las&go=&form=QBRE (incorrect) http://search.live.com/results.aspx?q=m%C4%97%C5%A1las&go=&form=QBRE (correct) This happens with Firefox 3.0.5 lt, the following configuration values are at their defaults: network.standard-url.encode-query-utf8 network.standard-url.encode-utf8 network.standard-url.escape-utf8
This problem still occurs, not for 'mėšlas' but for 'čest' word. I have followed the steps that Rimas Kudelis provided and the bug was repaired for word 'mėšlas'. But I can still get this bug with word 'čest'. It is strange that for one language/encoding the bug does not occur, but for Czech language/encoding it does. Steps to reproduce: 1 open http://search.live.com/ 2 paste text 'čest' in a search box and press enter 3 You'll see the results with this word, note that the search box still contains a word 'čest' 4 Focus the AwesomeBar, enter word 'čest' in it, then choose the URL from a drop-down list and press enter 5 You'll see results for a word '�est' instead You can find two different URLs in history: http://www.bing.com/search?q=%E8est&go=&form=QBLH&qs=n&sk= (incorrect �est) http://www.bing.com/search?q=%C4%8Dest&go=&form=QBLH&qs=n&sk= (correct čest) This happens with Firefox 4, the following configuration values are at their defaults: network.standard-url.encode-query-utf8 = false network.standard-url.encode-utf8 = true network.standard-url.escape-utf8 = true
Wow, that's an old bug you found. :) I just took steps to reproduce the problem and it's still there for my language too. I didn't stress enough that the URL is being encoded in a LOCAL 8 bit charset, which is windows-1257 in my case. I think you should use words that fit your 8-bit charset for testing. That is, the word "mėšlas" will only work for Lithuanians and Latvians. :)
Well my local charset is windows-1250. If I understand this right, when the URL parameters can be encoded in local charset, firefox will use the local charset instead of UTF-8. Otherwise it will use UTF-8 and no problem will occur - and that's why I have no trouble with 'mėšlas' but you have. The 'ė' is not in my local charset. I have this bug only when I choose URL from awesomebar with mouse or keyboard. The bug does not occur when: - opening from bookmark - opening from awesomebar in new tab = using Ctrl+left-mouse-click or middle-mouse-click - opening from history side panel (Ctrl+h)
I think v.ovcacik is right - this happens because we use the system character set for URLbar-input, and when you select an item from the dropdown it's treated as URLbar-input. Maybe we can change that? That also has the potential to break people, though. Flipping encode-query-utf8 fixes it, presumably (bug 552273 is related).
We decode the URL with platform charset on input, but encode it with UTF-8 on adding to the history. At least, we should decode URLs from awesomebar popup with UTF-8, because we know they are encoded with UTF-8. I'm not sure about URLs typed by the user.
Maybe Awesomebar should store encoded links instead, and only decode them for display? This would allow having links in any charset. OTOH, it would likely display links in foreign charsets incorrectly, so how about decoding only UTF-8 links for display, and showing all others with %xx characters?
Also it should be possible to store unicode links encoded in unicode, and non-unicode links with the appropriate characters escaped. This would require absolutely no additional processing when the user selects a link from the drop-down, and would ensure that the exact link is preserved.
(In reply to comment #6) > so how about decoding only UTF-8 > links for display, and showing all others with %xx characters? Firefox 4 already decodes only UTF-8 links for display. The problem is that Firefox doesn't encode URLs with UTF-8 on enter due to legacy compat.
Fixed by bug 461304.